apply_snapshot(server_id, install_snapshot): Assertion `! _snapshot_application_done.contains(from)' failed #15222

Closed
kbr-scylla opened this issue Aug 30, 2023 · 2 comments
@kbr-scylla

A follower may receive an apply_snapshot request from a leader while it is still processing a previous apply_snapshot request from that same leader. This can happen after the connection between the leader and the follower is severed and a new connection is created.

However, we have an assertion that this can never happen.

ERROR 2023-08-29 11:05:13,273 [shard 0] raft - apply_snapshot[a15c78d8-ad4e-40d1-ba6e-3e8d948a5496]: ignore outdated snapshot 19995803-df74-435f-b36e-346c8a527033/45 current one is 19995803-df74-435f-b36e-346c8a527033/45, commit_idx=45
scylla: raft/server.cc:1154: virtual future<snapshot_reply> raft::server_impl::apply_snapshot(server_id, install_snapshot): Assertion `! _snapshot_application_done.contains(from)' failed.
Aborting on shard 0.
Backtrace:
  /home/bhalevy/dev/scylla/build/dev/seastar/libseastar.so+0x3b57fc
  /home/bhalevy/dev/scylla/build/dev/seastar/libseastar.so+0x3e0852
  /lib64/libc.so.6+0x3db6f
  /lib64/libc.so.6+0x8e843
  /lib64/libc.so.6+0x3dabd
  /lib64/libc.so.6+0x2687e
  /lib64/libc.so.6+0x2679a
  /lib64/libc.so.6+0x36146
  0x2773f04
  0x26ca3b5
  0x26cd93a
  /home/bhalevy/dev/scylla/build/dev/seastar/libseastar.so+0x3c402f
  /home/bhalevy/dev/scylla/build/dev/seastar/libseastar.so+0x3c5117
  /home/bhalevy/dev/scylla/build/dev/seastar/libseastar.so+0x3c45e9
  /home/bhalevy/dev/scylla/build/dev/seastar/libseastar.so+0x33f108
  /home/bhalevy/dev/scylla/build/dev/seastar/libseastar.so+0x33e342
  0xfbe229
  0xff5020
  0xfbd31c
  /lib64/libc.so.6+0x27b49
  /lib64/libc.so.6+0x27c0a
  0xfbbe24
------ Ending test dev/topology_experimental_raft.test_tablets.1::test_tablet_metadata_propagates_with_schema_changes_in_snapshot_mode ------

logs: https://github.com/scylladb/scylladb/files/12465785/testlog.tar.gz (the above is from scylla-91.log)

cc @gleb-cloudius

kbr-scylla pushed a commit that referenced this issue Aug 31, 2023
… may happen

server_impl::apply_snapshot() assumes that it cannot receive a snapshot
from the same host until the previous one is handled, and usually this is
true since a leader will not send another snapshot until it gets a
response to the previous one. But it may happen that the snapshot-sending
RPC fails after the snapshot was sent, but before the reply is received,
because of a connection disconnect. In this case the leader may send
another snapshot and there is no guarantee that the previous one was
already handled, so the assumption may break.

Drop the assert that verifies the assumption and return an error in this
case instead.

Fixes: #15222

Message-ID: <ZO9JoEiHg+nIdavS@scylladb.com>
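
The change described in the commit message can be illustrated with a minimal sketch. This is not the actual Scylla code: `follower_sketch`, `begin_apply`, `finish_apply`, `busy_with`, and the string-based `server_id` are hypothetical stand-ins, and the real `server_impl::apply_snapshot()` is asynchronous and tracks per-sender state in `_snapshot_application_done`. The sketch only shows the idea of rejecting an overlapping request from the same sender instead of asserting that it cannot happen.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical stand-in for raft::server_id (an assumption for this sketch).
using server_id = std::string;

// Hypothetical, simplified reply: success == false means the request was rejected.
struct snapshot_reply {
    bool success;
};

class follower_sketch {
    // Senders whose snapshot application is still in progress.
    std::unordered_set<server_id> _in_progress;

public:
    // Start handling a snapshot from `from`.
    // If a previous snapshot from the same sender is still being applied
    // (for example, the leader retried after a reconnect), reject the new
    // request instead of aborting the process with an assertion.
    snapshot_reply begin_apply(const server_id& from) {
        auto [it, inserted] = _in_progress.insert(from);
        if (!inserted) {
            return snapshot_reply{false};
        }
        return snapshot_reply{true};
    }

    // Mark the snapshot from `from` as fully handled.
    void finish_apply(const server_id& from) {
        _in_progress.erase(from);
    }

    // True while a snapshot from `from` is still being applied.
    bool busy_with(const server_id& from) const {
        return _in_progress.count(from) != 0;
    }
};
```

In this sketch, a second `begin_apply("leader")` issued before `finish_apply("leader")` returns a reply with `success == false`, and the leader is expected to retry later; once the first application finishes, a new request from the same leader is accepted again.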
@mykaul mykaul added this to the 5.4 milestone Oct 19, 2023
@denesb

denesb commented Dec 18, 2023

@kbr-scylla please evaluate for backport.

kbr-scylla pushed a commit that referenced this issue Dec 19, 2023
(cherry picked from commit 55f047f)
@kbr-scylla

Backported to 5.2
