New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 698 - Implement nomad restart JOBID
and nomad run -restart jobfile
#3949
Conversation
I haven't heard anything about this PR. This functionality seems to be desired by many people...are the deficiencies that need to be addressed before this can be merged? |
This feature is useful because there are instances where you wish to restart the existing job definition, triggering any update/canary logic that exists. There are two ways to restart a job: `nomad run -restart jobfile.nomad` If the job already exists, a restart/update will be triggered even if the job definition is identical to the previous version. If the job does not exist it will be registered and executed. `nomad restart jobid` This will restart an existing job. This creates a new job version and triggers the update/canary logic.
@groggemans thanks for the feedback! I've updated the PR to resolve the tests that were failing due to this change. The remaining failed tests seem to also be failing on |
This is quite a useful feature indeed, and I hope it gets merged soon. |
nomad restart JOBID
and nomad run -restart jobfile
@shantanugadgil I updated the title, thanks for the feedback and support! |
Any comments on this PR from the hashicorp folks? @dadgar |
@dadgar is there anything I need to do to help get this merged into the mainline? Thanks! |
Hey @maihde, Awesome work on this and sorry for the long delay giving you a response. This is definitely a feature the team wants and we actually planned on implementing in 0.9.X. Unfortunately the PR as it is doesn't quite address all the points we would want. The issues I see:
a) Force b) Rolling Restart * Both CLI syntaxes are just spitball ideas
From a project perspective, the job struct should define the users intention on what should be run. The restart field is a transition but is not the desired state, and thus shouldn't be in the job spec. From a technical perspective, this works currently because any time the user submits a new job the restart field is cleared. In future releases of Nomad, there will likely be the job and a set of job variables that are interpolated in to define the final job. Thus we can imagine that the job doesn't change every time the user makes a change (changes a variable). Thus we want to capture the restart outside of the job spec. This is part of the reason getting you a review has taken a while since we didn't quite have the mechanism to do this until more recently: https://github.com/hashicorp/nomad/blob/master/nomad/structs/structs.go#L5669 As for next steps, unfortunately the PR as it is can't be merged and the delta between what we would like is quite large. If this is something you are interested in taking on let me know and we can potentially collaborate on an RFC before more code is written. Alternatively, given the amount of work, the team is also happy to implement this in 0.9.X. Thanks for the patience, |
Alex, thanks for the detailed explanation and suggestions. I'm happy to
work on this to meet your teams specification if doing so will get the
feature into the baseline sooner! Let me know how you would like to
collaborate on the RFC.
Regards,
Michael
…On Tue, Jul 17, 2018 at 4:04 PM, Alex Dadgar ***@***.***> wrote:
Hey @maihde <https://github.com/maihde>,
Awesome work on this and sorry for the long delay giving you a response.
This is definitely a feature the team wants and we actually planned on
implementing in 0.9.X. Unfortunately the PR as it is doesn't quite address
all the points we would want. The issues I see:
1. Extensibility
a) Force
It would be nice to allow the operator to be able to do $ nomad job
restart -force <job> in which the update stanza is ignored.
b) Rolling Restart
For jobs that do not define a update stanza, ability to do a rolling
restart: `$ nomad job restart -rolling-interval=30s -rolling-parallel=2
1. Shadow Restart Field
From a project perspective, the job struct should define the users
intention on what should be run. The restart field is a transition but is
not the desired state, and thus shouldn't be in the job spec.
From a technical perspective, this works currently because any time the
user submits a new job the restart field is cleared. In future releases of
Nomad, there will likely be the job and a set of job variables that are
interpolated in to define the final job. Thus we can imagine that the job
doesn't change every time the user makes a change (changes a variable).
Thus we want to capture the restart outside of the job spec.
This is part of the reason getting you a review has taken a while since we
didn't quite have the mechanism to do this until more recently:
https://github.com/hashicorp/nomad/blob/master/nomad/
structs/structs.go#L5669
------------------------------
As for next steps, unfortunately the PR as it is can't be merged and the
delta between what we would like is quite large. If this is something you
are interested in taking on let me know and we can potentially collaborate
on an RFC before more code is written. Alternatively, given the amount of
work, the team is also happy to implement this in 0.9.X.
Thanks for the patience,
Alex
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#3949 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAOCIiE1QcyyxB7OadGszG6sxx1FQ6WNks5uHkNlgaJpZM4Sf1H9>
.
|
Any news on if this made it into 0.9? Prometheus managed by nomad is a great example of something that needs a periodic kick to read new config files, which nomad can't currently do. |
@maihde @the-maldridge We definitely appreciate your work. I am closing this PR since its outdated, and also based on @dadgar's comment above. Job restart is slated for 0.9.2 or 0.9.3. We are also implementing allocation restart/task restart before that in the next point release after 0.9 is out. |
Thanks for the update, I'll keep an eye out for 0.9.x. |
@preetapan Do you have any updates on a release date for job restart? |
@jpclarks |
Thank you. I noticed the recent addition of support for alloc restart, but we really need the ability to restart jobs as described in #698. |
It's still on the roadmap - but in the meantime you should be able to use the api to write a rolling job restarter until we get it into nomad core with the alloc restart api |
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
Implements
nomad restart JOBID
andnomad run -restart jobfile.nomad
. Also includes one commit for a minor comment error.