From ae5e68239fb90214873d00a616e8cf509f7a840d Mon Sep 17 00:00:00 2001 From: Michael Davis Date: Wed, 23 Aug 2023 16:30:35 -0500 Subject: [PATCH] rabbit_stream_queue: Only update leader PID on stream_leader_change event `rabbit_stream_coordinator` is the only authority on a stream queue's leader PID, so we should only trust that process for notifications of leader changes. The call to `update_leader_pid/2` from the `update/2` callback may trigger the effects of a leader change for the stream queue's state without interacting with the stream coordinator. This is noticeable on the Khepri branch where the leader failover test case flakes in cases where the channel process reads a stale record from the metadata store with an old leader PID and mistakenly triggers the leader change to the stale leader PID. Removing this block stops the flake without removing the leader change mechanism to resend unconfirmed messages: the handler of the `stream_leader_change` event from the stream coordinator still triggers leader information changes to the queue's state. This bug is also possible on main but is not reproducible in practice because mnesia updates fast enough that we are very unlikely to read a stale leader PID when looking up the queue record. (cherry picked from commit ec0ffde20cad6d4b672e70a5bf150f7bed14806a) --- deps/rabbit/src/rabbit_stream_queue.erl | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/deps/rabbit/src/rabbit_stream_queue.erl b/deps/rabbit/src/rabbit_stream_queue.erl index 043d0e0e1a1..f898f868092 100644 --- a/deps/rabbit/src/rabbit_stream_queue.erl +++ b/deps/rabbit/src/rabbit_stream_queue.erl @@ -777,8 +777,7 @@ close(#stream_client{readers = Readers}) -> update(Q, State) when ?is_amqqueue(Q) -> - Pid = amqqueue:get_pid(Q), - update_leader_pid(Pid, State). + State. update_leader_pid(Pid, #stream_client{leader = Pid} = State) -> State;