Upgrading from v2.3 through v3.0 and v3.1 to v3.2 results in panic #9480
The docs recommend upgrading from v2.3 to v3.2 by first upgrading to each minor version along the way. However, there seems to be an issue if you perform this transition too quickly: if there are no writes to the v3 backend, or no snapshots are produced, while running v3.0 or v3.1, then v3.2 panics on startup.
To reproduce this, start with an etcd v2.3 server that already has a snapshot (the bug does not occur if no snapshot has been taken yet). Stop the server and replace it with a v3.0 server; everything seems fine. Next, stop the server and replace it with a v3.1 server; again, everything is fine. Finally, stop the server and replace it with a v3.2 server, and you will see this panic when the server starts up:
The bug seems to have been introduced in this patch from last year.
In this case, the new
Do you have any v3 data?
My minimal repro steps only require using
The very low
You can also check the contents of the
At this point, remove the server container with
You'll see from the logs of that container that it's up and has migrated from 2.3 to 3.0 and enabled v3 features:
Remove this container again to upgrade it to v3.1:
You'll see from the logs of that container that it's up and has migrated from 3.0 to 3.1 and enabled v3.1 features:
Remove this container again to upgrade it to v3.2:
This container will exit soon after it starts. Use
@gyuho I see that you added comments to the upgrade checklists saying not to upgrade the server unless you migrate v3 data. The comments were added for v3.0, v3.1, v3.2, v3.3, and v3.4.
I think you can remove v3.0 and v3.1 from the list, as the bug was introduced in v3.2+.
I'm running 3.1.x in prod with v2 data and I think @jlhawn also mentioned it works.
Yes. Only an upgrade to 3.2 with no v3 keys will panic (no consistent index has been set). It's not expected, but we've decided to keep it as it is (sorry, it's too late to backport a fix to all 3.x branches), because bypassing it requires too many manual, unsafe operations. The safest workaround is to write some dummy v3 keys.
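For anyone landing here, a minimal sketch of that workaround, assuming the member is still running v3.0 or v3.1 and is reachable at `http://127.0.0.1:2379`. The helper name `write_dummy_v3_key`, the `ETCDCTL` and `ENDPOINTS` variables, and the key name in the usage note are all made up for illustration; adjust them to your deployment.

```shell
#!/bin/sh
# Hypothetical helper: write a throwaway key through the v3 API so the
# v3 backend sees at least one write before the member is upgraded to v3.2.

# Assumptions: etcdctl is on PATH and the member listens on this endpoint.
ETCDCTL="${ETCDCTL:-etcdctl}"
ENDPOINTS="${ENDPOINTS:-http://127.0.0.1:2379}"

write_dummy_v3_key() {
    # etcdctl from the v3.0/v3.1 releases defaults to the v2 API,
    # so the v3 API has to be selected explicitly with ETCDCTL_API=3.
    ETCDCTL_API=3 "$ETCDCTL" --endpoints="$ENDPOINTS" put "$1" "$2"
}
```

Sourcing this and running `write_dummy_v3_key upgrade-dummy 1` against the v3.1 member before swapping in v3.2 should be enough, going by the explanation above. I have not checked whether the key can be safely deleted again before the upgrade, so it is probably simplest to leave it in place until the member is on v3.2.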