Panic on startup: log snapshot index mismatch #43

Closed
aphyr opened this issue May 5, 2020 · 3 comments
Comments

aphyr (Collaborator) commented May 5, 2020

With Jepsen 6c063da, Redis f88f866, and Redis-Raft b9ee410, testing with network partitions, process crashes, pauses, and membership changes caused nodes to panic on restart, due to a mismatch between the log's initial index and the snapshot's last index.

lein run test-all --raft-version b9ee410 --time-limit 600 --nemesis partition,kill,pause,member --follower-proxy --test-count 16 --concurrency 4n

20200505T133553.000-0400.zip

In this test run, several keys exhibited split-brain anomalies: multiple timelines of a single key were concurrently observed by distinct subsets of nodes, resulting in the apparent loss of some, or all, committed writes. This issue (possibly a separate problem) is described in #44.

Minutes later, multiple nodes observed a gap between their snapshot and initial log index when restarted by Jepsen. Their Raft logs also contained 0 entries. Immediately prior to the panics, Jepsen killed n2, n3, and n5, recovered from a network partition, isolated n2, n3, and n4 away from n1 and n5, then started all nodes.

...
2020-05-05 13:39:18,798{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:kill	:majority
2020-05-05 13:39:21,005{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:kill	{"n2" "", "n3" "", "n5" ""}
2020-05-05 13:39:24,005{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:resume	:all
2020-05-05 13:39:24,109{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:resume	{"n1" "", "n2" "", "n3" "", "n4" "", "n5" ""}
2020-05-05 13:39:27,215{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:join	"n5"
2020-05-05 13:39:27,224{GMT}	WARN	[jepsen nemesis] jepsen.core: Process :nemesis crashed
2020-05-05 13:39:27,224{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:join	"n5"	indeterminate: Command exited with non-zero status 124 on node n4:
2020-05-05 13:39:30,329{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:leave	"n1"
2020-05-05 13:39:30,337{GMT}	WARN	[jepsen nemesis] jepsen.core: Process :nemesis crashed
2020-05-05 13:39:30,337{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:leave	"n1"	indeterminate: Command exited with non-zero status 124 on node n4:
2020-05-05 13:39:33,337{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:stop-partition	nil
2020-05-05 13:39:33,544{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:stop-partition	:network-healed
2020-05-05 13:39:36,544{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:start-partition	:majority
2020-05-05 13:39:36,649{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:start-partition	[:isolated {"n5" #{"n2" "n4" "n3"}, "n1" #{"n2" "n4" "n3"}, "n2" #{"n5" "n1"}, "n4" #{"n5" "n1"}, "n3" #{"n5" "n1"}}]
2020-05-05 13:39:39,649{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:start	:all
2020-05-05 13:39:39,856{GMT}	INFO	[jepsen nemesis] jepsen.util: :nemesis	:info	:start	{"n1" "", "n2" "", "n3" "", "n4" "", "n5" ""}

At 10:39:39, Jepsen started n2, and it immediately panicked with "Log initial index (1482) does not match snapshot last index (1400), aborting."

2020-05-05 10:39:39 Jepsen starting redis-server --bind 0.0.0.0 --dbfilename redis.rdb --loadmodule /opt/redis/redisraft.so loglevel=debug raft-log-filename=raftlog.db raft-log-max-file-size=32000 raft-log-max-cache-size=1000000 follower-proxy=yes
30457:C 05 May 2020 10:39:39.772 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
30457:C 05 May 2020 10:39:39.772 # Redis version=999.999.999, bits=64, commit=f88f8661, modified=0, pid=30457, just started
30457:C 05 May 2020 10:39:39.772 # Configuration loaded
30457:M 05 May 2020 10:39:39.773 * Increased maximum number of open files to 10032 (it was originally set to 1024).
30457:M 05 May 2020 10:39:39.773 * Running mode=standalone, port=6379.
30457:M 05 May 2020 10:39:39.773 # Server initialized
30457:M 05 May 2020 10:39:39.773 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
30457:M 05 May 2020 10:39:39.773 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
30457:M 05 May 2020 10:39:39.774 * <redisraft> RedisRaft starting, arguments: loglevel=debug raft-log-filename=raftlog.db raft-log-max-file-size=32000 raft-log-max-cache-size=1000000 follower-proxy=yes
30457:M 05 May 2020 10:39:39.774 * Module 'redisraft' loaded from /opt/redis/redisraft.so
30457:M 05 May 2020 10:39:39.775 * Loading RDB produced by version 999.999.999
30457:M 05 May 2020 10:39:39.775 * RDB age 115 seconds
30457:M 05 May 2020 10:39:39.775 * RDB memory usage when created 1.58 Mb
30457:M 05 May 2020 10:39:39.775 * DB loaded from disk: 0.000 seconds
30457:M 05 May 2020 10:39:39.775 * Ready to accept connections
30457:05 May 10:39:40.276 Loading: Redis loading complete, snapshot LOADED
30457:05 May 10:39:40.276 Loading: Snapshot: applied term=4 index=1400
30457:05 May 10:39:40.276 Snapshot configuration loaded. Raft state:
30457:05 May 10:39:40.276   node <unknown?>
30457:05 May 10:39:40.276 Loading: Log loaded, 0 entries, snapshot last term=4, index=1482
30457:05 May 10:39:40.276 Loading: Log starts from snapshot term=4, index=1482
30457:05 May 10:39:40.276 

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
REDIS RAFT PANIC
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Log initial index (1482) does not match snapshot last index (1400), aborting.

Subsequent restarts of n2 panicked as well.

n3 was also started at 10:39:39, and panicked too:

30592:05 May 10:40:16.888 Loading: Snapshot: applied term=4 index=1402
30592:05 May 10:40:16.888 Snapshot configuration loaded. Raft state:
30592:05 May 10:40:16.888   node <unknown?>
30592:05 May 10:40:16.888 Loading: Log loaded, 0 entries, snapshot last term=4, index=1478
30592:05 May 10:40:16.888 Loading: Log starts from snapshot term=4, index=1478
30592:05 May 10:40:16.888 

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
REDIS RAFT PANIC
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Log initial index (1478) does not match snapshot last index (1402), aborting.

Two minutes later, at 10:41:35, Jepsen started n4, and it too panicked:

30876:05 May 10:41:35.684 Loading: Snapshot: applied term=4 index=1173
30876:05 May 10:41:35.684 Snapshot configuration loaded. Raft state:
30876:05 May 10:41:35.684   node <unknown?>
30876:05 May 10:41:35.684 Loading: Log loaded, 0 entries, snapshot last term=4, index=1390
30876:05 May 10:41:35.684 Loading: Log starts from snapshot term=4, index=1390
30876:05 May 10:41:35.684 

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
REDIS RAFT PANIC
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Log initial index (1390) does not match snapshot last index (1173), aborting.

n1 and n5 did not panic.
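For context, the panic on each of these nodes appears to come from a startup consistency check between the snapshot loaded from the RDB file and the Raft log on disk: the index the log records as its starting point should equal the last applied index of the loaded snapshot. The minimal sketch below (in C, with hypothetical names; not RedisRaft's actual code) shows roughly what such a check looks like, using n2's values.

/* A minimal sketch of the startup consistency check implied by the panic
 * message; names and structure are hypothetical, not RedisRaft's actual code.
 * On a healthy node, the index the log records as its starting point (the
 * snapshot it was rewritten against) should equal the last applied index of
 * the snapshot loaded from the RDB file. On n2 the log claims to start from
 * index 1482 while the loaded snapshot ends at 1400, so the check aborts. */
#include <stdio.h>
#include <stdlib.h>

typedef unsigned long long raft_index_t;

static void check_log_matches_snapshot(raft_index_t log_initial_idx,
                                       raft_index_t snapshot_last_idx)
{
    if (log_initial_idx != snapshot_last_idx) {
        fprintf(stderr,
                "Log initial index (%llu) does not match snapshot last "
                "index (%llu), aborting.\n",
                log_initial_idx, snapshot_last_idx);
        abort();
    }
}

int main(void)
{
    /* Values taken from n2's startup log above. */
    check_log_matches_snapshot(1482, 1400);
    return 0;
}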

aphyr (Collaborator, Author) commented May 5, 2020

Since the split-brain happened minutes before the restart panics, the panics affected nodes on both sides of the split, and because it looks like we can have nodes panic without exhibiting split-brain, I'm tentatively gonna split this into two separate issues. Might turn out to be the same underlying cause, of course. It also looks like panics are pretty common, whereas split brain is less frequent.

Here are a couple of cases with just panics:

20200505T132533.000-0400.zip

20200505T143531.000-0400.zip

aphyr changed the title from "Split brain & write loss, possibly involving log snapshot index mismatch" to "Panic on startup: log snapshot index mismatch" on May 5, 2020
yossigo added a commit that referenced this issue May 8, 2020
When a snapshot is taken, the log is rewritten to begin right after the
last applied entry. In the process of switching to the new log, its index
got truncated, making all existing entries inaccessible: an attempt to
access them would assert.

Note: truncation *is* generally desired, because the log and the index
cannot be atomically updated in the filesystem. For this reason, we
generally truncate and re-build the index when opening the log on
process start. In this case, however, the index should have been reused
(or at least re-built, although there's really no reason to do that).

This seems to be the root cause for #42 and possibly #43 as well.
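To illustrate the failure mode this commit describes, here is a hedged sketch (hypothetical data structures, not RedisRaft's actual code) of switching to a rewritten log after a snapshot: if the in-memory offset index is truncated during the switch and never reused or rebuilt, entries that still exist in the new log file become unreachable, and reads trip an assertion.

/* Hypothetical illustration of the bug described above. After a snapshot the
 * log is rewritten to begin right after the last applied entry; if the
 * in-memory offset index is emptied during the switch instead of being
 * reused or rebuilt, entries still present in the new log file cannot be
 * located, and any attempt to read them asserts. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    long *offsets;           /* offsets[i] = file offset of entry first_idx + i */
    size_t num_entries;      /* number of entries currently indexed */
    unsigned long first_idx; /* Raft index of the first indexed entry */
} log_index_t;

/* Buggy switch: the index is truncated, so even entries carried over into
 * the rewritten log can no longer be found. */
static void switch_log_buggy(log_index_t *idx, unsigned long new_first_idx)
{
    idx->first_idx = new_first_idx;
    idx->num_entries = 0;
}

/* Fixed switch (per the commit): reuse or rebuild the index for the entries
 * that survive in the rewritten log. */
static void switch_log_fixed(log_index_t *idx, unsigned long new_first_idx,
                             const long *new_offsets, size_t new_count)
{
    free(idx->offsets);
    idx->offsets = malloc(new_count * sizeof(*new_offsets));
    memcpy(idx->offsets, new_offsets, new_count * sizeof(*new_offsets));
    idx->first_idx = new_first_idx;
    idx->num_entries = new_count;
}

/* Lookup asserts when asked for an entry the index no longer covers: the
 * "attempt to access them would assert" from the commit message. */
static long entry_offset(const log_index_t *idx, unsigned long raft_idx)
{
    assert(raft_idx >= idx->first_idx);
    assert(raft_idx < idx->first_idx + idx->num_entries);
    return idx->offsets[raft_idx - idx->first_idx];
}

int main(void)
{
    log_index_t idx = { .offsets = NULL, .num_entries = 0, .first_idx = 1 };
    long surviving_offsets[] = { 0, 64, 128 };   /* entries 1401..1403 */

    switch_log_fixed(&idx, 1401, surviving_offsets, 3);
    (void)entry_offset(&idx, 1402);              /* fine: index rebuilt */

    switch_log_buggy(&idx, 1401);
    (void)entry_offset(&idx, 1402);              /* asserts: index truncated */

    free(idx.offsets);                           /* not reached after the assert */
    return 0;
}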
aphyr (Collaborator, Author) commented May 9, 2020

Looks like this is still present in 2d1cf30, unfortunately. :-(

20200509T111927.000-0400.zip

aphyr (Collaborator, Author) commented Jun 15, 2020

Looks resolved as of e0123a9.

aphyr closed this as completed on Jun 15, 2020