Queue process termination and restart causes consumer to stop receiving messages #13369
-
Describe the bug

We are noticing some odd behavior when our queue processes crash. The reason for the crashes is hard to identify; we are assuming some Windows security software is interacting poorly with RabbitMQ. When the queues crash, the consumers tied to those queues stop receiving messages, and they still do not receive messages after the queues automatically restart. The channels used by these consumers do not encounter a ChannelShutdown event, so our software does not detect that anything has gone wrong with the RabbitMQ process.

Snippet of Queue crash stack:
Snippet of Queue Restart

The below link seems to be a related issue, and the guidance there was to check on security software. We have done this to the best of our ability by disabling the security software that was exposed to us, but we don't fundamentally control the host, so it's possible that other security software is interacting with RabbitMQ.

Any guidance on:
Reproduction steps

1. Have a consumer with a channel that is bound to Queue A

Expected behavior
Additional context
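Since the channels never see a ChannelShutdown event, one client-side way to notice the condition described above is an idle watchdog: record the time of each delivery and treat the consumer as stalled if nothing has arrived for too long while the queue still reports ready messages. The following is only a minimal sketch under stated assumptions. `ConsumerWatchdog`, `should_resubscribe`, and the 30-second threshold are hypothetical names and values, not part of any RabbitMQ client library, and how the ready-message count is obtained (for example, from a passive queue declaration) is left to the application.

```python
import time
from typing import Callable


class ConsumerWatchdog:
    """Tracks how long a consumer has gone without receiving a delivery.

    Hypothetical helper: it does not talk to RabbitMQ itself. The application
    calls record_delivery() from its message handler.
    """

    def __init__(self, max_idle_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._last_delivery = clock()

    def record_delivery(self) -> None:
        # Call this every time the consumer callback receives a message.
        self._last_delivery = self._clock()

    def idle_seconds(self) -> float:
        return self._clock() - self._last_delivery

    def is_stalled(self) -> bool:
        # True when no delivery has been seen for longer than the threshold.
        return self.idle_seconds() > self._max_idle


def should_resubscribe(watchdog: ConsumerWatchdog, ready_messages: int) -> bool:
    """Decide whether to cancel and re-register the consumer.

    An idle consumer on an empty queue is normal, so only act when the
    queue reports messages waiting and nothing is being delivered.
    """
    return watchdog.is_stalled() and ready_messages > 0


if __name__ == "__main__":
    watchdog = ConsumerWatchdog(max_idle_seconds=30.0)
    watchdog.record_delivery()
    print(should_resubscribe(watchdog, ready_messages=10))
```

The check on `ready_messages` is a deliberate design choice: without it, a quiet queue would be indistinguishable from a consumer that silently lost its subscription.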
Replies: 3 comments 8 replies
-
Versions:
-
is the line you are looking for. A filesystem operation has failed with

Queue replicas and channels are completely independent from each other, although channels can monitor queues. Channels do retry some operations with a delay when a queue does not have an elected leader (in the case of quorum queues and streams). So for non-transient data, use one of those two. For transient queues (specifically exclusive ones, since non-exclusive non-durable queues are going away later in the 4.x series), there already is #12949, which is not as trivial to address as it may sound.
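As an illustration of the suggestion to use a quorum queue for non-transient data, here is a minimal sketch of declaring one from a Python client. It assumes a pika-style channel object whose `queue_declare` accepts `queue`, `durable`, `exclusive`, `auto_delete`, and `arguments`; the helper names and the queue name `orders` are hypothetical. The queue type is selected with the `x-queue-type` argument, and quorum queues have to be durable and non-exclusive.

```python
from typing import Any, Dict


def quorum_queue_kwargs(queue_name: str) -> Dict[str, Any]:
    """Build keyword arguments for declaring a quorum queue.

    Hypothetical helper: quorum queues must be durable and cannot be
    exclusive or auto-delete, so those flags are fixed here.
    """
    return {
        "queue": queue_name,
        "durable": True,
        "exclusive": False,
        "auto_delete": False,
        "arguments": {"x-queue-type": "quorum"},
    }


def declare_quorum_queue(channel: Any, queue_name: str) -> Any:
    """Declare the queue on an already-open, pika-style channel (assumed)."""
    return channel.queue_declare(**quorum_queue_kwargs(queue_name))


if __name__ == "__main__":
    # Print the arguments rather than connecting to a broker.
    print(quorum_queue_kwargs("orders"))
```

With a real connection, `declare_quorum_queue(channel, "orders")` would be called once at startup, before the consumer is registered.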
-
Hi guys,

This issue appears to be related to what we previously encountered here. We still experience this problem occasionally, but only on Windows. It seems to be caused by an unknown antivirus blocking our process, leading to consumer crashes with no recovery options. Does anyone know of a reference page listing similar cases and recommended AV exclusions to mitigate this issue? Such a resource could also be useful for troubleshooting similar problems.

Thanks in advance!
Personally, I'd recommend removing any and all "security" or antivirus software from Windows servers running RabbitMQ.
Otherwise, create exclusions for RabbitMQ's data in %APPDATA%\RabbitMQ, as well as for the installation directories of both RabbitMQ and Erlang.