Do not exit reconciliation if scaling errors #90

Merged
jhalterman merged 2 commits into grafana:main from do-not-return-on-scaling-error on Oct 12, 2023


@jhalterman jhalterman commented Oct 5, 2023

Following #82, downscaling attempts are prevented in a zone if some other zone has non-ready pods. One situation where this can happen is when zone C is rolling at around the same time that zone A starts experiencing evictions. In this case, zone C may also want to downscale some pods, but it cannot because zone A has non-ready pods. Since reconciliation exits when downscaling fails, zone C can't finish rolling its pods either.

This change allows reconciliation to continue even if scaling fails, so that updates may proceed.

@jhalterman jhalterman requested a review from a team as a code owner October 5, 2023 01:01
@jhalterman jhalterman marked this pull request as draft October 5, 2023 01:03
This change continues reconciliation if scaling fails, so that updates may proceed if possible.
@jhalterman jhalterman force-pushed the do-not-return-on-scaling-error branch from 9529901 to f267b32 Compare October 5, 2023 01:09

@56quarters 56quarters left a comment

Seems reasonable to me. Let's try it out.

@jhalterman jhalterman marked this pull request as ready for review October 12, 2023 19:59
@jhalterman jhalterman merged commit 2a9f728 into grafana:main Oct 12, 2023
6 checks passed
@jhalterman jhalterman deleted the do-not-return-on-scaling-error branch October 12, 2023 20:07
56quarters added a commit to grafana/mimir that referenced this pull request Nov 3, 2023
Pulls in fixes to live-lock issues triggered by the combination of
HPA scaling changes and rollouts happening at the same time.

See grafana/rollout-operator#90
See grafana/rollout-operator#92

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>