Questions about the project's structure #7
Hi Xiecheng, my final implementation lives entirely inside YARN and uses only Java; you can check my commit logs to see which parts of the YARN code I changed. Most of the code I implemented is in the NodeManager and the Capacity Scheduler. Also, if you have just started working on the Hadoop ecosystem, note that the newly released Hadoop 3.0 has a completely new implementation of Docker management that is not compatible with Hadoop 2.7. Wei Chen |
Hi @yncxcw, sorry to disturb you again. I have another question, this time about the application scenario of the big-c system. As many papers mention, long jobs consume most of a cluster's resources, and if short jobs are scheduled after long jobs, head-of-line blocking occurs. But I wonder whether head-of-line blocking actually exists in real production clusters. According to the wiki for Alibaba's public cluster data, resource utilization in a real production cluster is consistently below 50%. So, in my opinion, more than half of the servers in such a cluster have enough idle resources to run the short jobs, and we would not see head-of-line blocking there. What is your opinion on head-of-line blocking in real production clusters? |
Hi Xiecheng, that's fine. I am happy to discuss research questions.
1 is a special case of 2: to guarantee the SLA of short jobs and avoid head-of-line blocking, the cluster needs to reserve part of its resources so that a burst of short jobs is not blocked by long jobs. |
@yncxcw Thanks again for your detailed explanation.
Thanks in advance:smile:~ |
Hi Xiecheng, that's OK. For 1, yes: the motivation of these projects is to minimize queueing delay, whether jobs queue at the master node, as in YARN, or at the slave nodes, as in Sparrow. For 2, yes: our design goal is a mechanism that implements "preemption without killing", as in a traditional OS. One thing to note is that we preempt before the jobs are scheduled to their target nodes, since the resource manager has the full picture of cluster utilization and can make the best decisions.
Wei |
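To make "preemption without killing" concrete, here is a minimal Java sketch. It is not the big-c or YARN code: the classes `SketchContainer` and `PreemptionSketch`, the `minCores` floor, and the idea of modelling preemption as shrinking a container's core count are all assumptions made for illustration. It only shows the principle that resources for a short job can be reclaimed from long-job containers while every container stays alive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not a YARN class: a container with a resizable CPU share.
class SketchContainer {
    final String name;
    final boolean longJob;
    int cores;           // cores currently granted
    final int minCores;  // assumed floor that keeps the task alive

    SketchContainer(String name, boolean longJob, int cores, int minCores) {
        this.name = name;
        this.longJob = longJob;
        this.cores = cores;
        this.minCores = minCores;
    }
}

class PreemptionSketch {
    // Frees up to `needed` cores by shrinking long-job containers down to
    // their floor. Returns the number of cores actually reclaimed.
    // No container is ever removed from the list, i.e. nothing is killed.
    static int reclaim(List<SketchContainer> containers, int needed) {
        int reclaimed = 0;
        for (SketchContainer c : containers) {
            if (reclaimed >= needed) {
                break;
            }
            if (!c.longJob) {
                continue; // short jobs are never preempted in this sketch
            }
            int take = Math.min(c.cores - c.minCores, needed - reclaimed);
            if (take > 0) {
                c.cores -= take;
                reclaimed += take;
            }
        }
        return reclaimed;
    }
}
```

In this sketch the decision is made in one place over the whole container list, which mirrors the point above that a central resource manager sees the full picture before placing the short job.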
@yncxcw That's great. Thanks for your patient explanation. So the big-c system doesn't need extra dedicated servers to run short jobs, and the final goal of preemptive scheduling is to make full use of the cluster's resources and reduce cost. Your explanation cleared up my misunderstandings about this system. |
That's OK~ |
Hi Chen Wei, I have read your ATC paper and looked through this repository, and I have some questions about the project's structure.
Thanks in advance:smile:~