Possible cluster unavailability
1 rename(source="/data/etcd/infra0.etcd/member/wal/0000000000000001-000000000000001b.wal.tmp", dest="/data/etcd/infra0.etcd/member/wal/0000000000000001-000000000000001b.wal")
2 append("/data/etcd/infra0.etcd/member/wal/0000000000000001-000000000000001b.wal", offset=88, count=4096)
3 append("/data/etcd/infra0.etcd/member/wal/0000000000000001-000000000000001b.wal", offset=4184, count=4242)
4 fdatasync("/data/etcd/infra0.etcd/member/wal/0000000000000001-000000000000001b.wal")
I see the above sequence of system calls when etcd appends a user data item to its WAL file. If a crash happens just before the 4th operation (fdatasync), and the file system persists the 2nd and 3rd appends out of order (such reordering is possible on commonly used file systems, including ext4 in ordered mode), then during recovery the server crashes with the following error in its debug log:
...timestamp... I | etcdmain: etcd Version: 2.3.0
...timestamp... I | etcdmain: Git SHA: 3719912
...timestamp... I | etcdmain: Go Version: go1.6
...timestamp... I | etcdmain: Go OS/Arch: linux/amd64
...timestamp... I | etcdmain: setting maximum number of CPUs to 40, total number of available CPUs is 40
...timestamp... N | etcdmain: the server is already initialized as member before, starting as etcd member...
...timestamp... I | etcdmain: listening for peers on http://172.17.0.2:2380
...timestamp... I | etcdmain: listening for client requests on http://172.17.0.2:2379
...timestamp... I | etcdserver: name = infra0
...timestamp... I | etcdserver: data dir = /data/etcd/infra0.etcd/
...timestamp... I | etcdserver: member dir = /data/etcd/infra0.etcd//member
...timestamp... I | etcdserver: heartbeat = 100ms
...timestamp... I | etcdserver: election = 1000ms
...timestamp... I | etcdserver: snapshot count = 10000
...timestamp... I | etcdserver: advertise client URLs = http://172.17.0.2:2379
...timestamp... C | etcdserver: read wal error (walpb: crc mismatch) and cannot be repaired
Two nodes in a three-node cluster can easily get into this state, leaving a majority of the servers unusable. The remaining node cannot make progress alone, since there is no quorum, rendering the cluster unavailable.
Although the window of vulnerability is small, this is a real problem, and it can be fixed in etcd's crash-recovery code.
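One way recovery could be hardened is sketched below, under the assumption of a simple length-plus-CRC record framing (hypothetical; this is not etcd's walpb encoding): scan the log and accept the longest prefix of records whose checksums verify, truncating a torn or reordered tail instead of aborting. This is only safe for bytes past the last known-synced offset; a mismatch before that point is genuine corruption and should still be fatal.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// validPrefix scans length-prefixed, CRC-protected records and returns
// the byte offset just past the last record whose checksum verifies.
// On a mismatch in the unsynced tail, recovery could truncate the WAL
// to this offset and continue instead of failing with "crc mismatch".
func validPrefix(buf []byte) int {
	off := 0
	for off+8 <= len(buf) {
		n := int(binary.LittleEndian.Uint32(buf[off:]))
		sum := binary.LittleEndian.Uint32(buf[off+4:])
		if off+8+n > len(buf) {
			break // torn record: length header outruns the file
		}
		if crc32.ChecksumIEEE(buf[off+8:off+8+n]) != sum {
			break // CRC mismatch: stop at the last good record
		}
		off += 8 + n
	}
	return off
}

// record frames a payload as [len][crc][payload] (hypothetical format).
func record(data []byte) []byte {
	rec := make([]byte, 8+len(data))
	binary.LittleEndian.PutUint32(rec, uint32(len(data)))
	binary.LittleEndian.PutUint32(rec[4:], crc32.ChecksumIEEE(data))
	copy(rec[8:], data)
	return rec
}

func main() {
	wal := append(record([]byte("entry-1")), record([]byte("entry-2"))...)
	good := len(wal)
	wal = append(wal, record([]byte("entry-3"))...)
	wal[good+9] ^= 0xff // simulate a reordered/torn tail record
	// Recovery keeps the two intact records and drops the bad tail.
	fmt.Println(validPrefix(wal) == good)
}
```

With this approach, the reordered-append crash above would cost at most the entries that were never fdatasync'd, which Raft can safely re-replicate, instead of taking the whole member down.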