Implement job scale up and down #787

Merged: 12 commits into volcano-sh:master from elastic-2 on May 8, 2020

Conversation

hzxuzhonghu
Collaborator

@hzxuzhonghu hzxuzhonghu commented Apr 30, 2020

Address #782

Note:

  1. You are responsible for watching the hostfile changes under /etc/volcano if you plan to scale up/down later.
  2. The envs VC_{task name}_HOSTS and VC_{task name}_NUM cannot be used when you are using scale up/down.
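A minimal sketch of what "watching the hostfile" could look like inside a workload container. It polls rather than using file notifications, on the assumption that the directory is backed by a ConfigMap volume whose files are swapped in via symlinks. The file layout (one host per line), the polling interval, and the names parseHosts, diffHosts, and watchHostfile are all hypothetical and not part of this PR; the hostfile path is passed on the command line because the exact file name is not stated here.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"time"
)

// parseHosts splits hostfile content into one host per non-empty line.
// Assumption: the hostfile lists one host per line.
func parseHosts(content string) []string {
	var hosts []string
	for _, line := range strings.Split(content, "\n") {
		if h := strings.TrimSpace(line); h != "" {
			hosts = append(hosts, h)
		}
	}
	return hosts
}

// diffHosts reports hosts present only in cur (added) and only in old (removed).
func diffHosts(old, cur []string) (added, removed []string) {
	oldSet := make(map[string]bool, len(old))
	for _, h := range old {
		oldSet[h] = true
	}
	curSet := make(map[string]bool, len(cur))
	for _, h := range cur {
		curSet[h] = true
	}
	for _, h := range cur {
		if !oldSet[h] {
			added = append(added, h)
		}
	}
	for _, h := range old {
		if !curSet[h] {
			removed = append(removed, h)
		}
	}
	return added, removed
}

// watchHostfile polls path forever and calls onChange whenever the host list changes.
// Read errors are skipped so a briefly missing file does not stop the loop.
func watchHostfile(path string, interval time.Duration, onChange func(added, removed []string)) {
	var last []string
	for {
		if data, err := os.ReadFile(path); err == nil {
			cur := parseHosts(string(data))
			if added, removed := diffHosts(last, cur); len(added)+len(removed) > 0 {
				onChange(added, removed)
				last = cur
			}
		}
		time.Sleep(interval)
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: watcher <path to hostfile under /etc/volcano>")
		return
	}
	watchHostfile(os.Args[1], 5*time.Second, func(added, removed []string) {
		fmt.Printf("hosts added: %v, removed: %v\n", added, removed)
	})
}
```

The onChange callback is where an application would re-read its peer list, for example to rebuild a worker pool after a scale up or scale down.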

@volcano-sh-bot volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 30, 2020
@@ -215,6 +215,7 @@ func NewJobController(
// Register actions
state.SyncJob = cc.syncJob
state.KillJob = cc.killJob
state.UpdateJob = cc.updateJobFn
Member

can we handle this in SyncJob?

Collaborator Author

Good catch. I have tried, and it is tricky. Maybe a separate PR in the future; there can be lots of flags to control different behavior.

Member

Similar to the StatefulSet controller: syncJob just makes sure the pods and replicas stay in sync.
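As an illustration of the suggestion above, here is a rough sketch of a StatefulSet-style reconciliation step: given a desired replica count and the pods that currently exist, compute what to create and what to delete. This is not the controller code from this PR. The function names and the "{job}-{task}-{index}" pod naming scheme are assumptions made for the example.

```go
package main

import "fmt"

// podName builds a pod name from job, task, and ordinal index.
// Assumption: pods are named "{job}-{task}-{index}".
func podName(job, task string, index int) string {
	return fmt.Sprintf("%s-%s-%d", job, task, index)
}

// reconcile returns the pods to create and delete so that exactly the
// ordinals [0, replicas) exist for the given task.
func reconcile(job, task string, replicas int, existing []string) (toCreate, toDelete []string) {
	desired := make(map[string]bool, replicas)
	for i := 0; i < replicas; i++ {
		desired[podName(job, task, i)] = true
	}
	have := make(map[string]bool, len(existing))
	for _, p := range existing {
		have[p] = true
		if !desired[p] {
			// Scale down: the pod's ordinal is outside the desired range.
			toDelete = append(toDelete, p)
		}
	}
	for i := 0; i < replicas; i++ {
		if n := podName(job, task, i); !have[n] {
			// Scale up, or replace a missing pod.
			toCreate = append(toCreate, n)
		}
	}
	return toCreate, toDelete
}

func main() {
	create, del := reconcile("job", "worker", 3,
		[]string{"job-worker-0", "job-worker-1", "job-worker-4"})
	fmt.Println("create:", create, "delete:", del)
	// prints: create: [job-worker-2] delete: [job-worker-4]
}
```

Because the function only compares desired and observed state, running it repeatedly is safe, which is the property that makes folding scale handling into a sync loop attractive.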

Member

any update?

@hzxuzhonghu
Collaborator Author

Note: there is one leftover issue I cannot tackle:
In the svc plugin we inject the envs "VC_%s_HOSTS" and "VC_%s_NUM". They cannot be changed without restarting the pod.
Luckily, I have not heard of any user adopting them.

@k82cn
Member

k82cn commented Apr 30, 2020

Note: there is one leftover issue I cannot tackle:
In the svc plugin we inject the envs "VC_%s_HOSTS" and "VC_%s_NUM". They cannot be changed without restarting the pod.
Luckily, I have not heard of any user adopting them.

Please add this limitation to the design doc :)

@hzxuzhonghu
Collaborator Author

Please add this limitation to the design doc :)

How about deleting these two envs? IMO, it is a little hacky to do so and error-prone.

@k82cn
Member

k82cn commented Apr 30, 2020

Please add this limitation to the design doc :)

How about deleting these two envs? IMO, it is a little hacky to do so and error-prone.

That's OK with me; please also highlight it in the document :)

@TravisBuddy

Travis tests have failed

Hey @hzxuzhonghu,
Please read the following log in order to understand the failure reason.
It'll be awesome if you fix what's wrong and commit the changes.

TravisBuddy Request Identifier: 5ade14e0-8f79-11ea-b91d-fdb092ef3eb4

@hzxuzhonghu
Collaborator Author

@k82cn take another round of review

@k82cn
Member

k82cn commented May 8, 2020

/lgtm
/approve

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label May 8, 2020
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu, k82cn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 8, 2020
@volcano-sh-bot volcano-sh-bot merged commit 2a495c5 into volcano-sh:master May 8, 2020
@hzxuzhonghu hzxuzhonghu deleted the elastic-2 branch May 8, 2020 08:30
volcano-sh-bot added a commit that referenced this pull request May 8, 2020
…7-origin-release-0.4

Automated cherry pick of #787: Implement Job scale up down