
Trigger deployment as success when new replicaSet has reached min you need #2143

Conversation

@kiich (Contributor) commented Mar 20, 2017

The current check for deployment success is to get the related ReplicaSets and then wait until all pods are ready.

This PR fine-tunes that logic: we get the new ReplicaSet associated with the deployment and wait until it has reached the minimum we require, calculated as the number of replicas needed in the deployment minus the value of maxUnavailable.

Once that minimum is reached, we treat the deployment as ready.
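The readiness rule described above can be sketched roughly as follows. This is a minimal illustration only; the function and parameter names are mine, not the actual Helm source, which reads these values from the Kubernetes Deployment and ReplicaSet objects:

```go
package main

import "fmt"

// minAvailable is the readiness threshold described in this PR: the
// deployment's desired replica count minus its rolling-update
// maxUnavailable value. (Illustrative names, not the merged Helm code.)
func minAvailable(desiredReplicas, maxUnavailable int) int {
	return desiredReplicas - maxUnavailable
}

// deploymentReady reports whether the new ReplicaSet has enough
// ready pods to satisfy the threshold.
func deploymentReady(readyPods, desiredReplicas, maxUnavailable int) bool {
	return readyPods >= minAvailable(desiredReplicas, maxUnavailable)
}

func main() {
	// 3 desired replicas with maxUnavailable 1: ready once 2 pods are up.
	fmt.Println(deploymentReady(2, 3, 1)) // true
	fmt.Println(deploymentReady(1, 3, 1)) // false
}
```

Note this is the same calculation the reviewers later debate for the replicas=1, maxUnavailable=1 edge case, where the threshold evaluates to 0.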

@k8s-ci-robot commented:

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA.

Once you've signed, please reply here (e.g. "I signed it!") and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Mar 20, 2017
@kiich (Contributor Author) commented Mar 20, 2017

I signed it!

@thomastaylor312 thomastaylor312 self-requested a review March 20, 2017 18:08
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Mar 21, 2017
@thomastaylor312 (Contributor) left a comment

This is a smart logic update. Just a few changes and then I'll do some more testing with it

for _, r := range rs.Items {
	list, err := getPods(client, value.Namespace, r.Spec.Selector.MatchLabels)
	currentReplicaSet, err := client.ReplicaSets(value.Namespace).Get(r.Name)
Contributor:

Does this Get have to happen? They should be up to date from the Replica sets query above. You should just be able to just use rs.Items

Contributor Author:

Good catch. Amended.

if err != nil {
	return false, err
}
newDeployment := struct {
Contributor:

Because you use this struct in 3 places, it might just be cleaner/easier to create it as a type

Contributor Author:

Yep, good point. Created it as a type in the same file, as I wasn't 100% sure where types used only in this file should go. Let me know if it needs to go somewhere else.

@kiich (Contributor Author) commented Mar 24, 2017

Not entirely sure why CircleCI failed with a memory-limit error after my latest commit. @thomastaylor312 is there anything extra I can do from my side?

@thomastaylor312 (Contributor) commented:

@kiich I am guessing that isn't a problem with your code. We have been hitting the limit with some CI things. I'll test this afternoon or over the weekend. Thanks for the quick response!

@thomastaylor312 thomastaylor312 added this to the 2.3.0 milestone Mar 24, 2017
@@ -59,6 +61,12 @@ type Client struct {
SchemaCacheDir string
}

// Deployments hold associated replicaSets for deployments
type Deployments struct {
Contributor:

This should probably be unexported because other packages won't need to use it

Contributor Author:

Good point - fixing.

@thomastaylor312 (Contributor) commented:

@kiich I am done testing and things look good! I had one comment about this. It looks like it works as intended, but it does change the current behavior in one way. Right now as the --wait flag functions, it doesn't matter how many replicas you have, it just matters that all of the pods are ready. With this, if someone has replicas set to 1 and hasn't set maxUnavailable to 0 in their Update Strategy (1 is the default) or to type: Recreate, it will return as ready when it actually isn't. I know that is perfectly logical with how Deployments work, but I worry about other users being confused when the behavior changes on them. We can make it clear in the release notes, but if you have any other ideas on what we could do, I'd love to hear them.

Could you please update the documentation here to reflect the change in behavior in 2.3 and also the --help documentation for the rollback, upgrade, and install commands? Once that is in, we should be good to merge. Sorry for the back and forth on this, I am really excited to see this land!

@kiich (Contributor Author) commented Mar 27, 2017

@thomastaylor312 Great to hear that! Good point on the case where replicas is set to 1 and maxUnavailable is set to (or defaults to) 1. The only thing I can think of right now is to catch when the "minAvailable" (basically deployment replicas minus maxUnavailable) is 0 and set it to 1 in deploymentsReady, to ensure the behaviour is consistent with other scenarios?

Do you want me to wait updating the docs until we figure out the above?
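The clamp floated here could look something like the sketch below. This is purely a hypothetical illustration of the suggestion being discussed, with names of my choosing; it is not code from the PR:

```go
package main

import "fmt"

// minAvailable computes the readiness threshold but clamps it to at
// least 1, so a deployment with replicas=1 and maxUnavailable=1 is not
// treated as ready with zero pods running. This sketches the option
// discussed above; it is not the actual Helm implementation.
func minAvailable(desiredReplicas, maxUnavailable int) int {
	min := desiredReplicas - maxUnavailable
	if min < 1 {
		min = 1
	}
	return min
}

func main() {
	fmt.Println(minAvailable(1, 1)) // 1, not 0: at least one pod must be ready
	fmt.Println(minAvailable(3, 1)) // 2: unchanged for the common case
}
```

The trade-off, as kiich notes next, is that this silently enforces a minimum the user never asked for.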

@kiich (Contributor Author) commented Mar 27, 2017

@thomastaylor312 Although, thinking about it a bit more, setting "minAvailable" to 1 in deploymentsReady when the user didn't explicitly ask for it feels wrong to me, whereas highlighting the change in behaviour in the documentation/release notes might suffice (i.e. leaving the behaviour as per the current PR).

Let me know your thoughts.

@kiich (Contributor Author) commented Mar 27, 2017

Also @thomastaylor312 when using type: Recreate I was able to observe that the new wait logic still waited until the new replicaset was ready instead of returning as ready immediately?

@thomastaylor312 (Contributor) commented:

@kiich Sorry about that. I had written that and then tested again and forgot to remove it. type: Recreate works as it should.

I think we'll just go with the documentation route on this one. Once that is pushed up, I'll read through it and then merge if it looks good

@thomastaylor312 (Contributor) commented:

@kiich Just circling back on this one. We want to get this in to 2.3 and so are just waiting on that documentation before merging.

@kiich (Contributor Author) commented Apr 1, 2017 via email

@kiich (Contributor Author) commented Apr 3, 2017

@thomastaylor312 Apologies for the delay - Docs updated and hopefully what you were looking for.

I had one question: I wasn't sure how much detail to go into regarding the scenario where replicas is set to 1 and maxUnavailable is not set to 0. I assumed we could put most of the explanation in the release notes rather than in the --help and md doc?

@thomastaylor312 (Contributor) commented Apr 3, 2017

Yes, that is correct. I would at least mention it in the md doc, though. Also, I don't think you need to worry about updating all the man pages right now; I think we do that right before the release. Thank you for thinking of that, though! Otherwise, mentioning the maxUnavailable gotcha would be the last small change.

@kiich (Contributor Author) commented Apr 3, 2017

Ah, got it. Further update to the md doc done. Let me know if that is enough or needs further clarification.

Sorry about the extra changes in the man pages and cobra generated md docs - pretty new to this so was too keen to make sure everywhere was updated!

@thomastaylor312 (Contributor) commented:

@kiich That documentation change looks great. I can give this a merge as soon as you rebase and if you can pop off the cobra and man doc changes. Thanks for all of the hard work!

@thomastaylor312 (Contributor) commented:

Oh, and I'll figure out tests. It looks like the memory issue/spurious failure we have been running into lately

… you need which is number of replicas minus maxUnavailable
@kiich kiich force-pushed the deploymentsReady-when-newRS-has-minimumReplicas branch from 427d018 to e3655bb Compare April 4, 2017 13:25
@kiich (Contributor Author) commented Apr 4, 2017

That's great! Commits squashed and doc changes removed. Let me know if anything needs doing!

@thomastaylor312 (Contributor) commented:

@kiich This still needs a rebase against master to solve the merge conflict.

@kiich (Contributor Author) commented Apr 4, 2017

Ah, apologies, I didn't get around to working on the conflict. I've used the GitHub feature to resolve it, as I was interested in trying that out; hope this is still ok.

@thomastaylor312 thomastaylor312 merged commit 7cbed3f into helm:master Apr 4, 2017
Labels: cncf-cla: yes, feature, in progress