
Unknown acks (e.g. after network partition heals) should be handled gracefully #255

Closed
michaelklishin opened this issue Aug 6, 2015 · 2 comments

@michaelklishin (Member)

Moving this from a legacy tracker (https://bugzilla.rabbitmq.com/show_bug.cgi?id=26308) as it is still relevant:

Logs from [redacted] show lots of:

=ERROR REPORT==== 28-Jul-2014::16:27:50 ===
** Generic server <0.271.0> terminating
** Last message in was {'$gen_cast',{ack,[842683],<7670.3568.247>}}
** When Server state == {q,
                         {amqqueue, <snip>,
                         undefined,undefined,undefined,7,running}
** Reason for termination == 
** {{case_clause,{empty,{[],[]}}},
    [{rabbit_queue_consumers,subtract_acks,4,[]},
     {rabbit_queue_consumers,subtract_acks,3,[]},
     {rabbit_amqqueue_process,subtract_acks,4,[]},
     {rabbit_amqqueue_process,handle_cast,2,[]},
     {gen_server2,handle_msg,2,[]},
     {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}

In all cases:

* It was immediately after running_partitioned_network
* The ack was from the other side of the partition
* The queue had not failed over

More investigation details:

OK, it's actually a bit harder to hit than suggested above, but it's doable. The queue throws away any acktags for the channel when it sees it go down, but is also happy to re-establish a channel record for the same channel. The channel is blissfully unaware.

This means that, as well as the crash in this bug, wrong behaviour can result: if the channel happens to send an ack with acktag 1 from the first time round, the queue will consider it to be acktag 1 from the second time round - which might not be the same message.
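The acktag reuse described above can be sketched in a few lines of Python. All names here are hypothetical and for illustration only - this is not RabbitMQ's actual bookkeeping, just the shape of the bug:

```python
# Sketch (hypothetical names) of the behaviour described above: the queue
# drops a channel's unacked tags when it sees the channel go down, then
# happily creates a fresh record for the "same" channel, so acktags restart
# from 1 and an old ack lands on the wrong message.

class QueueSketch:
    def __init__(self):
        self.channels = {}  # channel id -> {acktag: message}
        self.next_tag = {}  # channel id -> last acktag handed out

    def deliver(self, ch, message):
        tags = self.channels.setdefault(ch, {})
        tag = self.next_tag.get(ch, 0) + 1
        self.next_tag[ch] = tag
        tags[tag] = message
        return tag

    def channel_down(self, ch):
        # Queue forgets everything about the channel, including acktags.
        self.channels.pop(ch, None)
        self.next_tag.pop(ch, None)

    def ack(self, ch, tag):
        # Returns the message the queue *thinks* is being acked.
        return self.channels.get(ch, {}).pop(tag, None)

q = QueueSketch()
q.deliver("ch1", "msg-A")   # acktag 1, first time round
q.channel_down("ch1")       # partition: queue sees the channel go down
q.deliver("ch1", "msg-B")   # fresh record: acktag 1 again
assert q.ack("ch1", 1) == "msg-B"  # but the channel meant msg-A
```

The final assertion is the misattribution: the ack the channel sends for msg-A is silently applied to msg-B.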

It's somewhat racy to hit, as we need to get a channel talking to a remote queue again after being partitioned from it - but Mnesia will have removed the queue record by then. Therefore I added the following patch to reproduce:

diff -r b649bbf7d910 src/rabbit_channel.erl
--- a/src/rabbit_channel.erl    Wed Jul 30 12:51:15 2014 +0100
+++ b/src/rabbit_channel.erl    Wed Jul 30 14:08:49 2014 +0100
@@ -730,7 +730,18 @@
     check_read_permitted(QueueName, State),
     case rabbit_amqqueue:with_exclusive_access_or_die(
            QueueName, ConnPid,
-           fun (Q) -> rabbit_amqqueue:basic_get(
+           fun (Q) ->
+                   C = case get(slow) of
+                           undefined -> 1;
+                           N         -> N + 1
+                       end,
+                   put(slow, C),
+                   case C of
+                       3 -> io:format("SLEEPING~n"),
+                            timer:sleep(20000);
+                       _ -> ok
+                   end,
+                   rabbit_amqqueue:basic_get(
                         Q, self(), NoAck, rabbit_limiter:pid(Limiter))
            end) of
         {ok, MessageCount,

This will ensure the 3rd basic.get on a channel is very slow.

Then create a two node cluster {rabbit, hare} with a short net ticktime (say 5s). Create a queue "test" on hare with some messages. Then on rabbit:

rr(amqp_connection).
{ok, Conn} = amqp_connection:start(#amqp_params_network{}).
{ok, Ch} = amqp_connection:open_channel(Conn).
amqp_channel:call(Ch, #'basic.get'{queue = <<"test">>, no_ack=false}).
amqp_channel:call(Ch, #'basic.get'{queue = <<"test">>, no_ack=false}).
amqp_channel:call(Ch, #'basic.get'{queue = <<"test">>, no_ack=false}).
amqp_channel:call(Ch, #'basic.ack'{delivery_tag = 2}).

This will pause between the 2nd and 3rd basic.get. Meanwhile:

# for port in 25672 25673 ; do iptables -A INPUT -p tcp --dport ${port} -j DROP ; done

(wait for partition)

# for port in 25672 25673 ; do iptables -D INPUT -p tcp --dport ${port} -j DROP ; done

The partition should heal; then the channel does a 3rd basic.get. The queue thinks it's the first, and creates a new channel record with acktag 1. Then the channel sends acktag 2. Result: boom.
@michaelklishin (Member, Author)

Discussed solutions:

1) Establish some UUID for the channel-queue pair, generated by the queue the first time it hears of a channel and returned by the queue when acking. If a queue gets acks with a UUID it has never seen before, it ignores them.

Downside: complicated.

2) Make each acktag into a UUID and have the queue ignore unrecognised ones.

Downside: expensive (probably).

3) Have the channel forget acktags when it sees a queue go down.

Downside: not obvious it completely eliminates the problem if the queue sees the channel down before the channel sees the queue down.

4) Just ignore unrecognised acktags.

Downside: still possible to have incorrect behaviour even if no crash.

Option 2 may be worth it for the next feature release. Options 3 and 4 may work as temporary workarounds that do not change the inter-node communication protocol. The work on switching to Raft will open up additional options.
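Two of the options above can be sketched together in Python - option 1's per-record token and option 4's tolerance of unrecognised acktags. All names are hypothetical; this is an illustration of the idea, not RabbitMQ's actual implementation:

```python
import uuid

# Sketch (hypothetical names) combining two of the fixes discussed above:
# option 1's per channel-record token, and option 4's tolerance of
# unrecognised acktags.

class ChannelRecord:
    def __init__(self):
        # Option 1: a fresh token each time the queue (re)creates a record
        # for a channel. Acks carrying a token from an earlier incarnation
        # of the record are ignored wholesale.
        self.token = uuid.uuid4()
        self.unacked = {}  # acktag -> message

class QueueState:
    def __init__(self):
        self.records = {}  # channel id -> ChannelRecord

    def register(self, ch):
        """Create (or fetch) the record for a channel; return its token."""
        return self.records.setdefault(ch, ChannelRecord()).token

    def deliver(self, ch, tag, msg):
        self.records.setdefault(ch, ChannelRecord()).unacked[tag] = msg

    def channel_down(self, ch):
        self.records.pop(ch, None)

    def subtract_acks(self, ch, token, tags):
        rec = self.records.get(ch)
        if rec is None or rec.token != token:
            return []  # option 1: stale or unknown token - ignore the acks
        # Option 4: silently skip tags the queue has no record of, instead
        # of crashing with a case_clause.
        return [rec.unacked.pop(t) for t in tags if t in rec.unacked]
```

With this, an ack arriving after a partition either carries a stale token (dropped wholesale) or references a tag the new record never issued (skipped), so neither the crash nor the misattribution occurs.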

@michaelklishin (Member, Author)

So #496 implements option 4. We may try a variation on 1 in the future, or even make internal ack transfer a part of the Raft log (once that is integrated).
