Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization of pod batch restart in cluster upgrade #667

Closed
wants to merge 2 commits into from

Conversation

xiedeyantu
Copy link
Contributor

Optimize the statefulset update, patching labels first will not trigger forbidden error when updating statefulset.
codes will not enter the subsequent process of deleting and reconstructing statefulset.
Pods are no longer restarted when new shards/replicas are added, modified and deleted.

…er freezing forbidden error when updating statefulset.

codes will not enter the subsequent process of deleting and reconstructing statefulset.
Pods are no longer restarted when new shards/replicas are added, modified and deleted.
@xiedeyantu
Copy link
Contributor Author

@Slach

@xiedeyantu
Copy link
Contributor Author

@sunsingerus

@xiedeyantu
Copy link
Contributor Author

@alex-zaitsev could you take a look at this pr please?

@xiedeyantu
Copy link
Contributor Author

@sunsingerus Excuse me, is there any problem with this pr, or is there any better solution to this problem?

@@ -253,7 +253,7 @@ func (w *worker) updateCHI(old, new *chop.ClickHouseInstallation) error {
WithStatusAction(new).
M(new).F().
Info("reconcile started")
w.a.V(2).M(new).F().Info("action plan\n%s\n", actionPlan.String())
//w.a.V(2).M(new).F().Info("action plan\n%s\n", actionPlan.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it intentional to disable this log line? (If so: why not remove it completely?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When debugging, this will report null pointer exception. I wonder if it is an environment problem, so do this.

@aksdb
Copy link
Contributor

aksdb commented Apr 1, 2021

I tried this patchset / PR in our cluster, since I had a similar problems (pods being restarted completely when labels change), but now they don't update at all. If I change the clickhouse version in a podtemplate and apply the new custom resource definition, no new changes are rolled out. The labels are updated, but that's it :-/
Without this PR those changes are immediately applied.

@alex-zaitsev
Copy link
Member

I tried this patchset / PR in our cluster, since I had a similar problems (pods being restarted completely when labels change), but now they don't update at all. If I change the clickhouse version in a podtemplate and apply the new custom resource definition, no new changes are rolled out. The labels are updated, but that's it :-/
Without this PR those changes are immediately applied.

PR is incorrect. In order to solve the original issue we will remove those confusing labels at all. They were more for debugging purposes.

@xiedeyantu
Copy link
Contributor Author

I tried this patchset / PR in our cluster, since I had a similar problems (pods being restarted completely when labels change), but now they don't update at all. If I change the clickhouse version in a podtemplate and apply the new custom resource definition, no new changes are rolled out. The labels are updated, but that's it :-/
Without this PR those changes are immediately applied.

This PR is to solve the problem of pod restart. The pod tag has forgotten to update in the code. You can be updated without my PR, which will inevitably lead to the restart of pod.

@xiedeyantu
Copy link
Contributor Author

I tried this patchset / PR in our cluster, since I had a similar problems (pods being restarted completely when labels change), but now they don't update at all. If I change the clickhouse version in a podtemplate and apply the new custom resource definition, no new changes are rolled out. The labels are updated, but that's it :-/
Without this PR those changes are immediately applied.

PR is incorrect. In order to solve the original issue we will remove those confusing labels at all. They were more for debugging purposes.

Now, each sts only controls one pod. I don't think there is a need to configure tags on pods. Just configure tags on sts. I don't know what's wrong with my PR.

@xiedeyantu xiedeyantu closed this Apr 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants