3.3.21 crashing on every node with "etcdmain: walpb: crc mismatch" #11918

Closed
Roguelazer opened this issue May 19, 2020 · 10 comments · Fixed by #11924

Comments

@Roguelazer

I am attempting to do a rolling upgrade from 3.3.20 to 3.3.21, and every node has failed with "etcdmain: walpb: crc mismatch". For the first node, I removed it and re-bootstrapped it successfully, and it can stop and start fine on 3.3.21 after re-bootstrapping, but now that the second node in the cluster is crashing on startup with the same error, I'm a little suspicious that this is actually a bug in 3.3.21.

None of these machines have ever suffered any hardware failure or unexpected shutdown, and I have been upgrading them to every 3.3.x patch release without ever seeing this issue before.

These nodes are mostly used with V2-protocol-speaking applications, so almost all the data is in the V2 store. Not sure if that makes a difference.

@tangcong
Contributor

PR #11888 introduced this issue. I have submitted PR #11924 to fix the bug. @gyuho @jpbetz

@tangcong
Contributor

tangcong commented May 20, 2020

Once a WAL file has been purged, restarting etcd (v3.4.8/v3.3.21) will hit this issue, whether the cluster is an existing one or a new one. Please wait for the next release, which will fix this problem. Thanks. @Roguelazer
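To illustrate why a purge can matter for a chained checksum, here is a minimal Go sketch. It is not etcd's WAL code and makes no claim about what PR #11888 actually changed. The `record` type, the helpers `appendData`, `buildSegments`, and `replay`, and the `honorSeed` flag are all hypothetical names invented for this example. The assumption is that each record's checksum chains on the previous one and that every segment starts with a header record carrying the running checksum, so a reader that starts at a later segment must seed its state from that header.

```go
package main

import (
	"errors"
	"fmt"
	"hash/crc32"
)

// record is a simplified, hypothetical stand-in for a WAL record.
// It is NOT etcd's walpb.Record.
type record struct {
	isCRC bool   // true for a segment-header record carrying the running CRC
	crc   uint32 // header: CRC to seed from; data: expected running CRC after data
	data  []byte
}

var table = crc32.MakeTable(crc32.Castagnoli)

var errCRCMismatch = errors.New("crc mismatch")

// appendData returns a data record whose checksum chains on prev.
func appendData(prev uint32, data []byte) record {
	return record{crc: crc32.Update(prev, table, data), data: data}
}

// buildSegments writes two segments; the second header carries the
// running CRC reached at the end of the first segment.
func buildSegments() (seg1, seg2 []record) {
	var running uint32
	seg1 = append(seg1, record{isCRC: true, crc: running})
	for _, d := range []string{"put a", "put b"} {
		r := appendData(running, []byte(d))
		running = r.crc
		seg1 = append(seg1, r)
	}
	seg2 = append(seg2, record{isCRC: true, crc: running})
	for _, d := range []string{"put c", "put d"} {
		r := appendData(running, []byte(d))
		running = r.crc
		seg2 = append(seg2, r)
	}
	return seg1, seg2
}

// replay validates records in order. With honorSeed false, header records
// are skipped without seeding the running CRC (the hypothetical faulty reader).
func replay(recs []record, honorSeed bool) error {
	var running uint32
	for _, r := range recs {
		if r.isCRC {
			if honorSeed {
				running = r.crc
			}
			continue
		}
		running = crc32.Update(running, table, r.data)
		if running != r.crc {
			return errCRCMismatch
		}
	}
	return nil
}

func main() {
	seg1, seg2 := buildSegments()
	full := append(append([]record{}, seg1...), seg2...)

	fmt.Println("full log, seed ignored:      ", replay(full, false))
	fmt.Println("purged log, seed honored:    ", replay(seg2, true))
	fmt.Println("purged log, seed ignored:    ", replay(seg2, false))
}
```

In this sketch the faulty reader still passes while the first segment is present, because its running checksum happens to carry over. It only fails once the first segment is gone, which matches the observation that the crash appears after a purge.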

@wcollin

wcollin commented May 21, 2020

Upgrading from 3.3.19 to 3.3.21, I solved this issue with the following steps, though I don't know why they work:

  1. remove etcd01 (version 3.3.19)
  2. add etcd01 to cluster (version 3.3.19)
  3. upgrade 3.3.19 to 3.3.21

@tangcong
Contributor

@wcollin The WAL file may not have been purged yet; if you restart etcd now, it will crash as well. v3.3.22 and v3.4.9 are not available yet. @gyuho

@wcollin

wcollin commented May 21, 2020

@tangcong Thanks a lot. I will fall back to 3.3.19 and wait for the new version.

@Roguelazer
Author

I have verified that 3.3.22 appears to fix this issue.

It does now print a bunch of other unexpected errors when restarting a node:

2020-05-21 20:49:38.996422 W | etcdserver: failed to apply request,took 36.66µs,request header:<ID:9858223652514886896 > lease_revoke:<id:08cf722e4c80f4b7>,resp size:31,err is lease not found
2020-05-21 20:49:38.996545 W | etcdserver: failed to apply request,took 6.129µs,request header:<ID:9858223652514886898 > lease_revoke:<id:08cf722e4c80f4c8>,resp size:31,err is lease not found
2020-05-21 20:49:38.996607 W | etcdserver: failed to apply request,took 3.356µs,request header:<ID:9858223652514886899 > lease_revoke:<id:08cf722e4c80f4bc>,resp size:31,err is lease not found
2020-05-21 20:49:38.996662 W | etcdserver: failed to apply request,took 3.7µs,request header:<ID:9858223652514886900 > lease_revoke:<id:08cf722e4c80f4c0>,resp size:31,err is lease not found
2020-05-21 20:49:38.996748 W | etcdserver: failed to apply request,took 4.63µs,request header:<ID:9858223652514886901 > lease_revoke:<id:08cf722e4c80f4c4>,resp size:31,err is lease not found

The node comes up successfully, though, and appears to function normally.
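When triaging warnings like the ones above, it can help to pull out the affected lease IDs and compare them against what the cluster currently reports. The following Go sketch does only the extraction step. The function name `leaseNotFoundIDs` and the regular expression are assumptions written against the log format quoted in this comment, not an etcd API, and the format may differ in other etcd versions.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// leaseNotFound matches the warning format quoted in the comment above and
// captures the hexadecimal lease ID. The pattern is an assumption based on
// those sample lines only.
var leaseNotFound = regexp.MustCompile(`lease_revoke:<id:([0-9a-f]+)>.*err is lease not found`)

// leaseNotFoundIDs returns the lease IDs from every matching log line,
// in the order they appear.
func leaseNotFoundIDs(log string) []string {
	var ids []string
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		if m := leaseNotFound.FindStringSubmatch(sc.Text()); m != nil {
			ids = append(ids, m[1])
		}
	}
	return ids
}

func main() {
	sample := `2020-05-21 20:49:38.996422 W | etcdserver: failed to apply request,took 36.66µs,request header:<ID:9858223652514886896 > lease_revoke:<id:08cf722e4c80f4b7>,resp size:31,err is lease not found
2020-05-21 20:49:38.996545 W | etcdserver: failed to apply request,took 6.129µs,request header:<ID:9858223652514886898 > lease_revoke:<id:08cf722e4c80f4c8>,resp size:31,err is lease not found`

	for _, id := range leaseNotFoundIDs(sample) {
		fmt.Println(id)
	}
}
```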

@tangcong
Contributor

tangcong commented May 21, 2020

@Roguelazer This is normal behavior. etcd prints a warning log whenever the server fails to apply a request, and these logs play a key role in troubleshooting.

@karolsteve

I'm getting the same issue. Any news?

@jmhbnz
Member

jmhbnz commented Nov 10, 2023

> I'm getting the same issue. any news ?

Hi @karolsteve - etcd 3.3.21 is a three and a half year old release that is no longer supported by the etcd project. Countless bugs and security concerns have been addressed in later releases. Please upgrade to a modern etcd release as soon as possible.

@karolsteve

okay thanks @jmhbnz
