Feature/rolling update #66
Conversation
Would it add too much complexity to allow maxSurge and maxUnavailable in the same config, like Deployments do?
I'd rather not change any behaviour and/or scheduler config at this moment @andrehp. Do you think this is a must-have?
I agree @cscatolini. Maestro already has the maxsurge query string, which covers many cases.
controller/utils.go (Outdated)
db, clientset, configYAML, scheduler)
if err != nil {
if err == nil || !strings.Contains(err.Error(), "not found") {
	logger.WithField("pod", pod.GetName()).Debugf("pod still exists, deleting again")
Log the error here if it's not nil.
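For reference, a minimal sketch of how that could look, assuming a logrus-style logger and the older client-go Get signature (using scheduler.Name as the namespace is an assumption; the error message is illustrative):

    _, err := clientset.CoreV1().Pods(scheduler.Name).Get(pod.GetName(), metav1.GetOptions{})
    if err == nil || !strings.Contains(err.Error(), "not found") {
        if err != nil {
            // Surface unexpected errors instead of silently retrying the delete.
            logger.WithError(err).WithField("pod", pod.GetName()).Error("error getting pod")
        }
        logger.WithField("pod", pod.GetName()).Debugf("pod still exists, deleting again")
    }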
Updated
I wouldn't say it's a must-have, but it might be nice for speeding up deploys. I do agree, though, that it might not be the best idea to change this behavior right now.
I did a quick search on the docs about maxsurge but didn't find anything. But I think Maestro's maxsurge is different from a k8s Deployment's, isn't it?
A few comments. I'll review this again once you've made the changes we've discussed.
controller/utils.go (Outdated)
@@ -276,51 +271,83 @@ func waitTerminatingPods(
	exit := true
	select {
	case <-ticker.C:
		for _, pod := range deletedPods {
			if operationManager.WasCanceled() {
shouldn't you have the same check here?
if operationManager != nil && operationManager.WasCanceled() {
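In context, that guard might sit in the wait loop like this (a sketch; the early return is illustrative, not the PR's actual control flow):

    case <-ticker.C:
        for _, pod := range deletedPods {
            // operationManager can be nil when the wait is not tied to an
            // operation, so guard before calling WasCanceled().
            if operationManager != nil && operationManager.WasCanceled() {
                return nil // sketch: stop waiting once the operation is canceled
            }
            // per-pod termination check continues here...
        }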
Yes, it should.
Updated
controller/utils.go (Outdated)
}

if len(createdPod.Status.Phase) == 0 {
	//HACK! Trying to detect if we are running unit tests
Can't this have any unintended side effects? Maybe add a logger here?
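A sketch of what adding a logger there could look like (the condition comes from the snippet above; the message and Warn level are illustrative):

    if len(createdPod.Status.Phase) == 0 {
        // HACK! Trying to detect if we are running unit tests.
        // Logging makes the shortcut visible if it ever fires outside tests.
        logger.WithField("pod", createdPod.GetName()).
            Warn("pod has empty status.phase, assuming test environment")
    }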
You can talk to @lftakakura. We don't want major changes, but we believe that creating 25% new pods before deleting the previous versions may be a big burden on our infrastructure. He is thinking about a way to speed up the process and will update this PR soon.
Yes. It is a global Maestro config rather than a per-scheduler one. The default we're using in prod is 25%.
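For comparison, a Kubernetes Deployment expresses this per-object with the strategy below (a sketch using the client-go apps/v1 types; Maestro's maxsurge, by contrast, is a global percentage that bounds each chunk of its own rolling update):

    import (
        appsv1 "k8s.io/api/apps/v1"
        "k8s.io/apimachinery/pkg/util/intstr"
    )

    maxSurge := intstr.FromString("25%")
    maxUnavailable := intstr.FromString("0%")
    strategy := appsv1.DeploymentStrategy{
        Type: appsv1.RollingUpdateDeploymentStrategyType,
        RollingUpdate: &appsv1.RollingUpdateDeployment{
            MaxSurge:       &maxSurge,       // up to 25% extra pods during the rollout
            MaxUnavailable: &maxUnavailable, // never drop below the desired count
        },
    }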
LGTM
* Create rooms before deleting old ones
* Remove unnecessary comments
* Fix rolling-update flow
* Create pod and delete old one after in each chunk
* Remove waitCreatingAndDeleteOldPods
* Log error if not able to get pod in waitTerminatingPods and waitCreatingPods
* WIP: Use goroutines to perform rolling update
* Use goroutines to perform rolling update
Deploy pods before deleting them in a rolling update, respecting maxSurge.
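A minimal sketch of the chunked flow this describes (chunkPods, createPod, deletePod, and waitRunning are hypothetical stand-ins, not Maestro's actual helpers): pods are processed in maxSurge-sized chunks, and within each chunk a goroutine creates the replacement pod before deleting the old one.

    // Assumes sync and k8s.io/api/core/v1 (as v1) are imported.
    for _, chunk := range chunkPods(oldPods, maxSurgeCount) {
        var wg sync.WaitGroup
        for _, old := range chunk {
            wg.Add(1)
            go func(old *v1.Pod) {
                defer wg.Done()
                newPod, err := createPod(clientset, configYAML) // new version first
                if err != nil {
                    logger.WithError(err).Error("error creating pod")
                    return
                }
                waitRunning(newPod) // only delete the old pod once the new one is up
                if err := deletePod(clientset, old); err != nil {
                    logger.WithError(err).Error("error deleting pod")
                }
            }(old)
        }
        wg.Wait() // finish the whole chunk before starting the next
    }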