add elastic scheduler design doc #1887
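A small detail about priority: the pods of a machine-learning training job usually fall into two roles, master (only one, which can also be regarded as the worker with index=0) and worker. Once the elastic mechanism is introduced, we would like to create the master first and, when deleting elastic pods, delete workers first. There are two broad approaches:
One troublesome point about priority: pod.Priority can only be set through a PriorityClass, which is not very friendly to upstream operators. Should the PriorityClass be created by the user or by the operator? The former is inconvenient for users, and the latter is too much of a burden for the operator. For this reason, there are several ways to influence task.Priority: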
LGTM overall :)
/lgtm overall.
Perhaps we can submit this design to the Kubeflow community.
All of these designs assume users who go through an operator. What about users who use vcjob directly?
Hi @qiankunli, thanks for this great work. I think this is a great contribution that might benefit Volcano users globally. We would really appreciate it if we could communicate in English.
rest lgtm )
docs/design/elastic-scheduler.md
Outdated
![](images/elastic-scheduler-job1-3.png)

2. Allocate action(already implemented)
   - Create minAvailable pods(all job) first and then create elastic pods if there are free resources.
The description here is slightly inaccurate. All pods are created initially, but the minAvailable pods are scheduled first, and the elastic pods are scheduled afterwards if there are free resources.
That is more accurate; I have changed the doc.
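The ordering agreed on in this thread can be illustrated with a minimal sketch. It is not Volcano's allocate action: the `Job` type, the single CPU dimension, the uniform per-pod request, and the all-or-nothing treatment of minAvailable are all assumptions made for illustration. It only shows the two passes described above: minAvailable pods of every job first, elastic pods afterwards if resources remain.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical, simplified job model; not a Volcano type.
    name: str
    min_available: int  # pods that must be scheduled for the job to run
    replicas: int       # total pods created (min_available + elastic)
    cpu_per_pod: int    # assumed uniform, positive request per pod


def allocate(jobs, free_cpu):
    """Return {job name: number of pods scheduled} for the given free CPU."""
    scheduled = {job.name: 0 for job in jobs}

    # Pass 1: minAvailable pods of all jobs.
    # Assumption: a job gets all of its minAvailable pods or none of them.
    for job in jobs:
        need = job.min_available * job.cpu_per_pod
        if need <= free_cpu:
            free_cpu -= need
            scheduled[job.name] = job.min_available

    # Pass 2: elastic pods, only for jobs whose minAvailable was satisfied
    # and only while free resources remain.
    for job in jobs:
        if scheduled[job.name] < job.min_available:
            continue
        elastic = job.replicas - job.min_available
        fit = min(elastic, free_cpu // job.cpu_per_pod)
        free_cpu -= fit * job.cpu_per_pod
        scheduled[job.name] += fit

    return scheduled
```

For example, with two jobs that each have `min_available=2` and `replicas=4` at one CPU per pod, five free CPUs are enough for both jobs' minAvailable pods plus a single elastic pod, which goes to the first job in the list.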
Signed-off-by: bert.li <qiankun.li@qq.com>
update design doc of elastic-scheduler
Signed-off-by: Chenxi Jiang <chenxi.jiang.seu@gmail.com>
fix some words
Signed-off-by: bert.li <qiankun.li@qq.com>
fix word
Signed-off-by: bert.li <qiankun.li@qq.com>
f474c25 to 39ceee6
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: k82cn. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/lgtm
design doc for #1884