Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[controller] Ensure k8s statefulset statuses are fresh #271

Merged
merged 5 commits into from
Feb 11, 2021

Conversation

schallert
Copy link
Collaborator

In debugging #268, I discovered that in all cases where we updated more
than 1 statefulset, it was because the observed generation on the set
was 1 less than the set's actual generation. This means that the
Kubernetes statefulset controller had not yet processed the update, and
that the information on Status that we check (ready replicas, current
replicas, etc.) was not up-to-date.

Also out of paranoia, I changed the update behavior to ensure that we
keep the old object meta fields around when we send the Update to
Kubernetes. I think if this isn't the case, we potentially overwrite a
set even if it had changed since we last process it.

This PR ensures that we stop processing the cluster if any of the set
statuses are not fresh, and that we use familiar conflict methods on set
updated.

Closes #268.

In debugging #268, I discovered that in all cases where we updated more
than 1 statefulset, it was because the observed generation on the set
was 1 less than the set's actual generation. This means that the
Kubernetes statefulset controller had not yet processed the update, and
that the information on `Status` that we check (ready replicas, current
replicas, etc.) was not up-to-date.

Also out of paranoia, I changed the update behavior to ensure that we
keep the old object meta fields around when we send the `Update` to
Kubernetes. I think if this isn't the case, we potentially overwrite a
set even if it had changed since we last process it.

This PR ensures that we stop processing the cluster if any of the set
statuses are not fresh, and that we use familiar conflict methods on set
updated.

Closes #268.
@@ -73,6 +73,7 @@ const (

var (
errOrphanedPod = errors.New("pod does not belong to an m3db cluster")
errNoSetLabel = errors.New("pod does not have a parent statefulset error")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused. Will remove.

// If any of the statefulsets aren't ready, wait until they are as we'll get
// another event (ready == bootstrapped)
for _, sts := range childrenSets {
c.logger.Info("processing set",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need to decide whether these logs are too noisy. If we had them, however, debugging this issue would have been much easier.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you simply make these debug logs and then make it easy to turn on debug logging? (might already be easy by just changing the config file?)

@codecov
Copy link

codecov bot commented Feb 10, 2021

Codecov Report

Merging #271 (e111e5b) into master (81aaae1) will increase coverage by 0.47%.
The diff coverage is 91.83%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #271      +/-   ##
==========================================
+ Coverage   75.62%   76.10%   +0.47%     
==========================================
  Files          32       32              
  Lines        2343     2381      +38     
==========================================
+ Hits         1772     1812      +40     
+ Misses        427      426       -1     
+ Partials      144      143       -1     

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 81aaae1...e111e5b. Read the comment docs.

Copy link
Collaborator

@jeromefroe jeromefroe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏

@schallert schallert merged commit 81b6f3a into master Feb 11, 2021
@schallert schallert deleted the schallert/check_pod_revisions branch February 11, 2021 04:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

m3db-operator updates multiple statefulsets at the same time
4 participants