
Upgrade engines on MSR hosts individually to avoid downtime #265

Merged
kke merged 3 commits into master from finetune-msr-engine-upgrade
Dec 8, 2020

Conversation

@kke
Collaborator

@kke kke commented Dec 7, 2020

Currently it is quite easy to end up in a situation where all the MSR hosts are down simultaneously during the upgrade.

(For example, with 18 worker nodes and 2 MSR nodes, performing the upgrades in 10% chunks gives batches of two hosts, so the two MSR nodes will be upgraded simultaneously.)
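The arithmetic behind that example can be sketched as follows. This is only an illustration: `batchSize` is a hypothetical helper, and the rounding (integer division with a minimum of one host per batch) is an assumption, not necessarily how the tool computes its chunks.

```go
package main

import "fmt"

// batchSize is a hypothetical helper: it returns how many hosts are
// upgraded concurrently when the host list is processed in chunks of
// the given percentage. Rounding down with a floor of one host per
// batch is an assumption made for this sketch.
func batchSize(hostCount, percent int) int {
	n := hostCount * percent / 100
	if n < 1 {
		return 1
	}
	return n
}

func main() {
	workers, msrs := 18, 2
	// 20 hosts in 10% chunks: each batch holds two hosts, which is
	// enough room for both MSR nodes to land in the same batch.
	fmt.Println(batchSize(workers+msrs, 10))
}
```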

This PR changes the engine upgrades on MSR hosts to happen one host at a time, as is already done for the managers. After each upgrade a health check is performed; if the MSR does not resume, the upgrade process is halted.
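A minimal sketch of that flow, assuming nothing about the actual code in `upgrade_engine.go`: the `Host` type and the `upgradeEngine`, `checkHealth`, and `upgradeSequentially` functions are hypothetical stand-ins, and the health check is simulated with a boolean field.

```go
package main

import (
	"errors"
	"fmt"
)

// Host is a hypothetical stand-in for an MSR host.
type Host struct {
	Name    string
	Healthy bool // simulated result of the post-upgrade health check
}

// upgradeEngine is a placeholder for the real engine upgrade step.
func upgradeEngine(h *Host) error { return nil }

// checkHealth simulates the health check performed after the upgrade.
func checkHealth(h *Host) error {
	if !h.Healthy {
		return errors.New("MSR did not resume")
	}
	return nil
}

// upgradeSequentially upgrades one host at a time and halts at the first
// failure, so the remaining hosts are left untouched and still serving.
// It returns the names of the hosts that were upgraded successfully.
func upgradeSequentially(hosts []*Host) ([]string, error) {
	var done []string
	for _, h := range hosts {
		if err := upgradeEngine(h); err != nil {
			return done, fmt.Errorf("%s: engine upgrade failed: %w", h.Name, err)
		}
		if err := checkHealth(h); err != nil {
			return done, fmt.Errorf("%s: health check failed: %w", h.Name, err)
		}
		done = append(done, h.Name)
	}
	return done, nil
}

func main() {
	hosts := []*Host{
		{Name: "msr1", Healthy: true},
		{Name: "msr2", Healthy: false},
		{Name: "msr3", Healthy: true},
	}
	done, err := upgradeSequentially(hosts)
	fmt.Println("upgraded:", done)
	fmt.Println("error:", err)
}
```

In this sketch the loop stops at `msr2`, so `msr3` is never touched and keeps running on the old engine version.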

I doubt anyone has so many MSR hosts that running the upgrades in 10% batches, as is done for the MKE workers, would make much difference to the duration of the process.

(Reported by @mgueye01 on Slack.)

@kke kke added the "non-breaking change" label (Does not change functionality or require user actions) Dec 7, 2020
@kke kke requested review from jasmingacic and jnummelin December 7, 2020 09:15
@kke kke changed the title from "Upgrade engines on MSR hosts individually" to "Upgrade engines on MSR hosts individually to avoid downtime" Dec 7, 2020
Comment thread pkg/product/mke/phase/upgrade_engine.go Outdated
@kke kke merged commit afcd08e into master Dec 8, 2020
@kke kke deleted the finetune-msr-engine-upgrade branch December 8, 2020 12:08

Labels

non-breaking change: Does not change functionality or require user actions


1 participant