From 93e4e888723eb5104440eccbbac1750ffb681e8a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Lo=C3=AFc=20Hoguin?=
Date: Tue, 21 May 2024 15:37:02 +0200
Subject: [PATCH] CQ: Fix entry missing from cache leading to crash on read

The issue comes from a mechanic that allows us to avoid writing
to disk when a message has already been consumed. It works fine
in normal circumstances, but fan-out makes things trickier.

When multiple queues write and read the same message, we could
get a crash. Let's say queues A and B both handle message Msg.

* Queue A asks store to write Msg
* Queue B asks store to write Msg
* Queue B asks store to delete Msg (message was immediately consumed)
* Store processes Msg write from queue A
  * Store writes Msg to current file
* Store processes Msg write from queue B
  * Store notices queue B doesn't need Msg anymore; doesn't write
  * Store clears Msg from the cache
* Queue A tries to read Msg
  * Msg is missing from the cache
  * Queue A tries to read from disk
  * Msg is in the current write file and may not be on disk yet
  * Crash

The problem is that the store clears Msg from the cache. We need
all messages written to the current file to remain in the cache
as we can't guarantee the data is on disk when the time comes to
read. That is, until we roll over to the next file.

The issue was that a match was wrong: instead of matching a single
location from the index, the code was matching against a list. The
error was present in the code for almost 13 years since commit
2ef30dc95eaa7e04a0371fef0de8154f40a36685.
---
 deps/rabbit/src/rabbit_msg_store.erl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/deps/rabbit/src/rabbit_msg_store.erl b/deps/rabbit/src/rabbit_msg_store.erl
index ddc53963bec4..b5be0acce203 100644
--- a/deps/rabbit/src/rabbit_msg_store.erl
+++ b/deps/rabbit/src/rabbit_msg_store.erl
@@ -907,7 +907,7 @@ handle_cast({write, CRef, MsgRef, MsgId, Flow},
             %% the normal logic for that in write_message/4 and
             %% maybe_roll_to_new_file/2.
             case index_lookup(MsgId, State) of
-                [#msg_location { file = File }]
+                #msg_location { file = File }
                   when File == State #msstate.current_file ->
                     ok;
                 _ ->