Questions about the input vector #39

Open
Otis-Dylan opened this issue Sep 6, 2021 · 2 comments
Comments

@Otis-Dylan

Hi, after reading the paper and the code, I have some questions about the input vector.
1. I noticed that the number of executors assigned to the node's DAG is included in the input vector. Why does Decima use this as one of a node's features?
2. The action consists of two parts: selecting a node and selecting the number of executors. I noticed that Decima calculates an upper limit for each job. Would it be feasible to change the second part of the action so that it calculates an upper limit for each node instead? Why does Decima choose to calculate the limit per job?
I really look forward to your reply. Thank you very much :)

@hongzimao
Owner

Thanks for your questions.

  1. The scheduler needs to know how many executors are available when it makes a scheduling decision. In other words, when the number of available executors is different, the optimal scheduling decision might be different. For example, when there is only 1 executor available, Decima has to give that executor to the most important node. But if there are 50 free executors, Decima might schedule executors to two parent stages in a job (i.e., when executors are too few, only running one parent stage can be suboptimal).

  2. We associate the parallelism limit with the DAG (the job) mainly because we want to reduce the problem complexity. Section 5.2 of our paper, in the part on the parallelism limit, explains this point in more detail; see the paragraph beginning "Decima’s action specifies job-level parallelism, as opposed fine-grained stage-level parallelism....".
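
To make the two points above concrete, here is a minimal sketch. It is not taken from the Decima code base: `Stage`, `Job`, `node_features`, and `num_limit_outputs` are hypothetical names, and the chosen feature fields are an assumption for illustration only. It shows that the same stage produces a different input vector when the number of free executors changes, and that a job-level limit needs one output per job rather than one per stage.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    name: str
    tasks_left: int
    avg_task_duration: float  # seconds


@dataclass
class Job:
    name: str
    stages: List[Stage]
    executors_assigned: int  # executors currently working on this job


def node_features(stage: Stage, job: Job, free_executors: int) -> List[float]:
    """Hypothetical per-node input vector (illustrative field choice)."""
    return [
        float(stage.tasks_left),
        stage.avg_task_duration,
        float(job.executors_assigned),
        float(free_executors),  # cluster-wide count of idle executors
    ]


def num_limit_outputs(jobs: List[Job], per_stage: bool) -> int:
    """How many parallelism limits a policy would have to output."""
    if per_stage:
        return sum(len(job.stages) for job in jobs)
    return len(jobs)


jobs = [
    Job("job_a", [Stage("a0", 10, 2.0), Stage("a1", 4, 1.5), Stage("a2", 8, 3.0)], 2),
    Job("job_b", [Stage("b0", 6, 1.0), Stage("b1", 6, 1.0)], 0),
]

# Same stage, different number of free executors: only the last entry differs.
few = node_features(jobs[0].stages[0], jobs[0], free_executors=1)
many = node_features(jobs[0].stages[0], jobs[0], free_executors=50)
print(few)
print(many)

# One limit per job versus one limit per stage.
job_level = num_limit_outputs(jobs, per_stage=False)
stage_level = num_limit_outputs(jobs, per_stage=True)
print(job_level, stage_level)
```

In this toy setup the per-stage variant already needs more outputs than the per-job variant, and the gap grows with the number of stages per job, which is one way to read the "reduce the problem complexity" argument.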

Hope these help!

@jahidhasanlinix

Well explained.

I just have a question regarding point 1. You said Decima has to give the executor to the most important node. How does it actually prioritize the executor for that node? For example, suppose I have multiple jobs in the DAG, and some are free or have no dependencies. How can the model learn the node patterns needed to assign executors and complete the jobs in an optimized way (I guess this is where the ML concept comes in)? Is there any particular parameter you could point to so I can understand this better? Also, which part of the code passes the DAG job state to the GNN, and how is the GNN actually triggered from that state?

Another thing I would like to share and get your opinion on, based on what I have found and understood so far: is it possible to create a DAG without an RDD in the Spark environment? I tried to figure this out but could not find a good explanation. Could you add some pointers here?

Again, thank you so much. I look forward to getting some valuable information from you.

3 participants