Electing a 0.8.x leader during an upgrade can cause a panic in older servers #2889
Labels: type/bug (feature does not function as expected), type/crash (the issue description contains a golang panic and stack trace)
Haven't seen this in the wild, but this code could cause older Consul servers to panic if a 0.8.x server gains leadership during an upgrade:
https://github.com/hashicorp/consul/blob/v0.8.0/consul/leader.go#L156-L162
The panic would occur when an older server receives a Raft log entry for the autopilot config, which it won't understand. To avoid this until it's fixed, make sure to upgrade the followers before upgrading the current leader.
This is probably pretty rare since most folks upgrade the leader last to avoid unnecessary elections, but the consequences are high enough to make it worth avoiding. We could have the autopilot loop skip out unless all servers are at least at the right version, and have that same loop create the config once they are. That way the config is created quickly once all the servers are upgraded, even if there's no leader transition.
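A rough sketch of the version gate proposed above, just to illustrate the idea. This is not Consul's actual code: `parseVersion`, `atLeast`, and `allServersAtLeast` are hypothetical helpers, the version strings are assumed to be plain `major.minor.patch`, and how the leader would obtain each server's version is left out.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns "0.8.1" into [3]int{0, 8, 1}.
// Hypothetical helper: it only accepts plain major.minor.patch strings.
func parseVersion(s string) ([3]int, error) {
	var v [3]int
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("bad version %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, fmt.Errorf("bad version %q", s)
		}
		v[i] = n
	}
	return v, nil
}

// atLeast reports whether v >= min, comparing major, then minor, then patch.
func atLeast(v, min [3]int) bool {
	for i := range v {
		if v[i] != min[i] {
			return v[i] > min[i]
		}
	}
	return true
}

// allServersAtLeast reports whether every known server version is >= min.
// It is conservative: an empty list or an unparseable version counts as
// "not ready", so the caller would skip writing the autopilot config.
func allServersAtLeast(versions []string, min string) bool {
	minV, err := parseVersion(min)
	if err != nil || len(versions) == 0 {
		return false
	}
	for _, s := range versions {
		v, err := parseVersion(s)
		if err != nil || !atLeast(v, minV) {
			return false
		}
	}
	return true
}

func main() {
	mixed := []string{"0.7.5", "0.8.0", "0.8.0"}    // mid-upgrade cluster
	upgraded := []string{"0.8.0", "0.8.0", "0.8.1"} // fully upgraded cluster

	fmt.Println(allServersAtLeast(mixed, "0.8.0"))    // false: skip this pass
	fmt.Println(allServersAtLeast(upgraded, "0.8.0")) // true: safe to create the config
}
```

The leader would run a check like this on each pass of the autopilot loop and only write the config entry to Raft once it returns true, so an older follower never sees a log entry it can't decode.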