
Transient mirrored classic queues are not deleted when there are no replicas available for promotion #2045

Closed
velimir opened this issue Jun 27, 2019 · 6 comments

Comments

@velimir
Contributor

velimir commented Jun 27, 2019

The source of all mentioned files is available here: https://github.com/velimir/rmq-transient-queue-recovery

Reproduction steps

  1. Clone the repo and make it the current working directory

  2. Start the cluster and wait until all nodes are up and running:

    $ ./start.sh
    Creating network "rmq-transient-queue-recovery_default" with the default driver
    Creating rmq1 ...
    Creating rmq2 ...
    Creating rmq3 ...
    $ docker ps
    CONTAINER ID        IMAGE                                                 COMMAND                  CREATED              STATUS              PORTS                                                                                                                NAMES
    a591dcd47350        rabbitmq:3.7.15-management                            "docker-entrypoint.s…"   34 seconds ago       Up 32 seconds       4369/tcp, 5671/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:5472->5672/tcp, 0.0.0.0:5473->5673/tcp, 0.0.0.0:15472->15672/tcp   rmq3
    b4a8f4e8c54e        rabbitmq:3.7.15-management                            "docker-entrypoint.s…"   56 seconds ago       Up 55 seconds       4369/tcp, 5671/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:5572->5672/tcp, 0.0.0.0:5573->5673/tcp, 0.0.0.0:15572->15672/tcp   rmq2
    22b6a66badf2        rabbitmq:3.7.15-management                            "docker-entrypoint.s…"   About a minute ago   Up About a minute   4369/tcp, 5671/tcp, 0.0.0.0:5672-5673->5672-5673/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:15672->15672/tcp                 rmq1

    The command creates a cluster of 3 nodes with the following config:

    [
     {rabbit,
      [
       {cluster_nodes, {['rabbit@rmq1',
                         'rabbit@rmq2',
                         'rabbit@rmq3'], disc}},
       {cluster_partition_handling, autoheal}
      ]
     },
    
     {rabbitmq_management,
      [
       {load_definitions, "/var/lib/rabbitmq/definitions.json"}
      ]}
    ].

    And the following policy:

    {
        "vhost": "/",
        "name": "rmq-two",
        "pattern": "^rmq-two-.*$",
        "apply-to": "queues",
        "definition": {
            "ha-mode": "nodes",
            "ha-params": [
                "rabbit@rmq2",
                "rabbit@rmq3"
            ],
            "ha-sync-mode": "automatic"
        },
        "priority": 0
    }
  3. Create a transient queue rmq-two-queue and send some messages (a minimal sketch of what this amounts to is shown after these steps)

    $ ./setup-queue.py -q rmq-two-queue -d -t
    queue
  4. Gracefully stop rmq2

    $ docker stop rmq2
  5. Note that the mirror on node rmq3 was promoted to master

    rmq3    | 2019-06-27 09:18:31.068 [info] <0.943.0> Mirrored queue 'rmq-two-queue' in vhost '/': Promoting slave <rabbit@rmq3.3.943.0> to master
    
  6. Gracefully stop rmq3

  7. Note master shutdown on rmq3

    rmq3    | 2019-06-27 09:20:59.926 [warning] <0.943.0> Mirrored queue 'rmq-two-queue' in vhost '/': Stopping all nodes on master shutdown since no synchronised slave is available
    
  8. Start rmq3

  9. Note that the pid reference did not change on nodes rmq1 and rmq3

    See the Queue states section below for details.

  10. Try to list queues on node rmq1 (or rmq3)

    $ docker exec -it rmq1 rabbitmqctl list_queues name pid slave_pids synchronised_slave_pids
    Timeout: 60.0 seconds ...
    Listing queues for vhost / ...
    
    09:26:25.793 [error] Discarding message {'$gen_call',{<0.753.0>,#Ref<0.989368845.173015041.56949>},{info,[name,pid,slave_pids,synchronised_slave_pids]}} from <0.753.0> to <0.943.0> in an old incarnation (3) of this node (1)
  11. Start rmq2

  12. Note that the queue has not been recovered; the pid on all nodes refers to the old incarnation of the queue (from before the rmq3 shutdown).

  13. Try to list queues again

    $ docker exec -it rmq1 rabbitmqctl list_queues name pid slave_pids synchronised_slave_pids
    Timeout: 60.0 seconds ...
    Listing queues for vhost / ...
    
    09:31:08.102 [error] Discarding message {'$gen_call',{<0.928.0>,#Ref<0.989368845.173015041.62682>},{info,[name,pid,slave_pids,synchronised_slave_pids]}} from <0.928.0> to <0.943.0> in an old incarnation (3) of this node (1)
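
For reference, here is a minimal sketch of what the queue setup in step 3 amounts to: declaring a non-durable (transient) queue whose name matches the rmq-two policy and publishing a few messages. This is an assumption based on the flags shown above rather than the actual setup-queue.py, and it uses the pika client against the port mapping from step 2:

    #!/usr/bin/env python
    # Hypothetical sketch of the queue setup step; the real setup-queue.py is in the linked repo.
    import pika

    # rmq1's AMQP port is published on localhost:5672 (see the docker ps output above).
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost", port=5672))
    channel = connection.channel()

    # durable=False makes the queue transient; the "rmq-two" policy (ha-mode: nodes)
    # then mirrors it on rabbit@rmq2 and rabbit@rmq3.
    channel.queue_declare(queue="rmq-two-queue", durable=False)

    for i in range(10):
        channel.basic_publish(exchange="", routing_key="rmq-two-queue",
                              body="message {}".format(i))

    connection.close()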
    

Queue states

All nodes are up

rmq1

[#amqqueue{name = #resource{virtual_host = <<"/">>,
                           kind = queue,name = <<"rmq-two-queue">>},
          durable = false,auto_delete = false,exclusive_owner = none,
          arguments = [],pid = <5879.1110.0>,
          slave_pids = [<5880.943.0>],
          sync_slave_pids = [<5880.943.0>],
          recoverable_slaves = [rabbit@rmq3],
          policy = [{vhost,<<"/">>},
                    {name,<<"rmq-two">>},
                    {pattern,<<"^rmq-two-.*$">>},
                    {'apply-to',<<"queues">>},
                    {definition,[{<<"ha-mode">>,<<"nodes">>},
                                 {<<"ha-params">>,
                                  [<<"rabbit@rmq2">>,<<"rabbit@rmq3">>]},
                                 {<<"ha-sync-mode">>,<<"automatic">>}]},
                    {priority,0}],
          operator_policy = undefined,
          gm_pids = [{<5880.944.0>,<5880.943.0>},
                     {<5879.1111.0>,<5879.1110.0>}],
          decorators = [],state = live,policy_version = 0,
          slave_pids_pending_shutdown = [],vhost = <<"/">>,
          options = #{user => <<"guest">>}}]

rmq2

[#amqqueue{name = #resource{virtual_host = <<"/">>,
                           kind = queue,name = <<"rmq-two-queue">>},
          durable = false,auto_delete = false,exclusive_owner = none,
          arguments = [],pid = <0.1110.0>,
          slave_pids = [<5880.943.0>],
          sync_slave_pids = [<5880.943.0>],
          recoverable_slaves = [rabbit@rmq3],
          policy = [{vhost,<<"/">>},
                    {name,<<"rmq-two">>},
                    {pattern,<<"^rmq-two-.*$">>},
                    {'apply-to',<<"queues">>},
                    {definition,[{<<"ha-mode">>,<<"nodes">>},
                                 {<<"ha-params">>,
                                  [<<"rabbit@rmq2">>,<<"rabbit@rmq3">>]},
                                 {<<"ha-sync-mode">>,<<"automatic">>}]},
                    {priority,0}],
          operator_policy = undefined,
          gm_pids = [{<5880.944.0>,<5880.943.0>},
                     {<0.1111.0>,<0.1110.0>}],
          decorators = [],state = live,policy_version = 0,
          slave_pids_pending_shutdown = [],vhost = <<"/">>,
          options = #{user => <<"guest">>}}]

rmq3

[#amqqueue{name = #resource{virtual_host = <<"/">>,
                           kind = queue,name = <<"rmq-two-queue">>},
          durable = false,auto_delete = false,exclusive_owner = none,
          arguments = [],pid = <5879.1110.0>,
          slave_pids = [<0.943.0>],
          sync_slave_pids = [<0.943.0>],
          recoverable_slaves = [rabbit@rmq3],
          policy = [{vhost,<<"/">>},
                    {name,<<"rmq-two">>},
                    {pattern,<<"^rmq-two-.*$">>},
                    {'apply-to',<<"queues">>},
                    {definition,[{<<"ha-mode">>,<<"nodes">>},
                                 {<<"ha-params">>,
                                  [<<"rabbit@rmq2">>,<<"rabbit@rmq3">>]},
                                 {<<"ha-sync-mode">>,<<"automatic">>}]},
                    {priority,0}],
          operator_policy = undefined,
          gm_pids = [{<0.944.0>,<0.943.0>},
                     {<5879.1111.0>,<5879.1110.0>}],
          decorators = [],state = live,policy_version = 0,
          slave_pids_pending_shutdown = [],vhost = <<"/">>,
          options = #{user => <<"guest">>}}]

rmq2 is down

rmq1

[#amqqueue{name = #resource{virtual_host = <<"/">>,
                           kind = queue,name = <<"rmq-two-queue">>},
          durable = false,auto_delete = false,exclusive_owner = none,
          arguments = [],pid = <5880.943.0>,slave_pids = [],
          sync_slave_pids = [],recoverable_slaves = [],
          policy = [{vhost,<<"/">>},
                    {name,<<"rmq-two">>},
                    {pattern,<<"^rmq-two-.*$">>},
                    {'apply-to',<<"queues">>},
                    {definition,[{<<"ha-mode">>,<<"nodes">>},
                                 {<<"ha-params">>,
                                  [<<"rabbit@rmq2">>,<<"rabbit@rmq3">>]},
                                 {<<"ha-sync-mode">>,<<"automatic">>}]},
                    {priority,0}],
          operator_policy = undefined,
          gm_pids = [{<5880.944.0>,<5880.943.0>}],
          decorators = [],state = live,policy_version = 0,
          slave_pids_pending_shutdown = [],vhost = <<"/">>,
          options = #{user => <<"guest">>}}]

rmq3

[#amqqueue{name = #resource{virtual_host = <<"/">>,
                           kind = queue,name = <<"rmq-two-queue">>},
          durable = false,auto_delete = false,exclusive_owner = none,
          arguments = [],pid = <0.943.0>,slave_pids = [],
          sync_slave_pids = [],recoverable_slaves = [],
          policy = [{vhost,<<"/">>},
                    {name,<<"rmq-two">>},
                    {pattern,<<"^rmq-two-.*$">>},
                    {'apply-to',<<"queues">>},
                    {definition,[{<<"ha-mode">>,<<"nodes">>},
                                 {<<"ha-params">>,
                                  [<<"rabbit@rmq2">>,<<"rabbit@rmq3">>]},
                                 {<<"ha-sync-mode">>,<<"automatic">>}]},
                    {priority,0}],
          operator_policy = undefined,
          gm_pids = [{<0.944.0>,<0.943.0>}],
          decorators = [],state = live,policy_version = 0,
          slave_pids_pending_shutdown = [],vhost = <<"/">>,
          options = #{user => <<"guest">>}}]

rmq2 and rmq3 are down

rmq1

[#amqqueue{name = #resource{virtual_host = <<"/">>,
                           kind = queue,name = <<"rmq-two-queue">>},
          durable = false,auto_delete = false,exclusive_owner = none,
          arguments = [],pid = <5880.943.0>,slave_pids = [],
          sync_slave_pids = [],recoverable_slaves = [],
          policy = [{vhost,<<"/">>},
                    {name,<<"rmq-two">>},
                    {pattern,<<"^rmq-two-.*$">>},
                    {'apply-to',<<"queues">>},
                    {definition,[{<<"ha-mode">>,<<"nodes">>},
                                 {<<"ha-params">>,
                                  [<<"rabbit@rmq2">>,<<"rabbit@rmq3">>]},
                                 {<<"ha-sync-mode">>,<<"automatic">>}]},
                    {priority,0}],
          operator_policy = undefined,gm_pids = [],decorators = [],
          state = live,policy_version = 0,
          slave_pids_pending_shutdown = [],vhost = <<"/">>,
          options = #{user => <<"guest">>}}]

rmq3 is up

rmq1

[#amqqueue{name = #resource{virtual_host = <<"/">>,
                           kind = queue,name = <<"rmq-two-queue">>},
          durable = false,auto_delete = false,exclusive_owner = none,
          arguments = [],pid = <5880.943.0>,slave_pids = [],
          sync_slave_pids = [],recoverable_slaves = [],
          policy = [{vhost,<<"/">>},
                    {name,<<"rmq-two">>},
                    {pattern,<<"^rmq-two-.*$">>},
                    {'apply-to',<<"queues">>},
                    {definition,[{<<"ha-mode">>,<<"nodes">>},
                                 {<<"ha-params">>,
                                  [<<"rabbit@rmq2">>,<<"rabbit@rmq3">>]},
                                 {<<"ha-sync-mode">>,<<"automatic">>}]},
                    {priority,0}],
          operator_policy = undefined,gm_pids = [],decorators = [],
          state = live,policy_version = 0,
          slave_pids_pending_shutdown = [],vhost = <<"/">>,
          options = #{user => <<"guest">>}}]

rmq3

[#amqqueue{name = #resource{virtual_host = <<"/">>,
                            kind = queue,name = <<"rmq-two-queue">>},
           durable = false,auto_delete = false,exclusive_owner = none,
           arguments = [],pid = <0.943.0>,slave_pids = [],
           sync_slave_pids = [],recoverable_slaves = [],
           policy = [{vhost,<<"/">>},
                     {name,<<"rmq-two">>},
                     {pattern,<<"^rmq-two-.*$">>},
                     {'apply-to',<<"queues">>},
                     {definition,[{<<"ha-mode">>,<<"nodes">>},
                                  {<<"ha-params">>,
                                   [<<"rabbit@rmq2">>,<<"rabbit@rmq3">>]},
                                  {<<"ha-sync-mode">>,<<"automatic">>}]},
                     {priority,0}],
           operator_policy = undefined,gm_pids = [],decorators = [],
           state = live,policy_version = 0,
           slave_pids_pending_shutdown = [],vhost = <<"/">>,
           options = #{user => <<"guest">>}}]

All nodes are up (rmq1, rmq2 and rmq3)

rmq1

[#amqqueue{name = #resource{virtual_host = <<"/">>,
                            kind = queue,name = <<"rmq-two-queue">>},
           durable = false,auto_delete = false,exclusive_owner = none,
           arguments = [],pid = <5880.943.0>,
           slave_pids = [<5879.453.0>],
           sync_slave_pids = [],recoverable_slaves = [],
           policy = [{vhost,<<"/">>},
                     {name,<<"rmq-two">>},
                     {pattern,<<"^rmq-two-.*$">>},
                     {'apply-to',<<"queues">>},
                     {definition,[{<<"ha-mode">>,<<"nodes">>},
                                  {<<"ha-params">>,
                                   [<<"rabbit@rmq2">>,<<"rabbit@rmq3">>]},
                                  {<<"ha-sync-mode">>,<<"automatic">>}]},
                     {priority,0}],
           operator_policy = undefined,
           gm_pids = [{<5879.454.0>,<5879.453.0>}],
           decorators = [],state = live,policy_version = 0,
           slave_pids_pending_shutdown = [],vhost = <<"/">>,
           options = #{user => <<"guest">>}}]

rmq2

[#amqqueue{name = #resource{virtual_host = <<"/">>,
                            kind = queue,name = <<"rmq-two-queue">>},
           durable = false,auto_delete = false,exclusive_owner = none,
           arguments = [],pid = <5880.943.0>,
           slave_pids = [<0.453.0>],
           sync_slave_pids = [],recoverable_slaves = [],
           policy = [{vhost,<<"/">>},
                     {name,<<"rmq-two">>},
                     {pattern,<<"^rmq-two-.*$">>},
                     {'apply-to',<<"queues">>},
                     {definition,[{<<"ha-mode">>,<<"nodes">>},
                                  {<<"ha-params">>,
                                   [<<"rabbit@rmq2">>,<<"rabbit@rmq3">>]},
                                  {<<"ha-sync-mode">>,<<"automatic">>}]},
                     {priority,0}],
           operator_policy = undefined,
           gm_pids = [{<0.454.0>,<0.453.0>}],
           decorators = [],state = live,policy_version = 0,
           slave_pids_pending_shutdown = [],vhost = <<"/">>,
           options = #{user => <<"guest">>}}]

rmq3

[#amqqueue{name = #resource{virtual_host = <<"/">>,
                            kind = queue,name = <<"rmq-two-queue">>},
           durable = false,auto_delete = false,exclusive_owner = none,
           arguments = [],pid = <0.943.0>,
           slave_pids = [<5879.453.0>],
           sync_slave_pids = [],recoverable_slaves = [],
           policy = [{vhost,<<"/">>},
                     {name,<<"rmq-two">>},
                     {pattern,<<"^rmq-two-.*$">>},
                     {'apply-to',<<"queues">>},
                     {definition,[{<<"ha-mode">>,<<"nodes">>},
                                  {<<"ha-params">>,
                                   [<<"rabbit@rmq2">>,<<"rabbit@rmq3">>]},
                                  {<<"ha-sync-mode">>,<<"automatic">>}]},
                     {priority,0}],
           operator_policy = undefined,
           gm_pids = [{<5879.454.0>,<5879.453.0>}],
           decorators = [],state = live,policy_version = 0,
           slave_pids_pending_shutdown = [],vhost = <<"/">>,
           options = #{user => <<"guest">>}}]
@michaelklishin
Member

Our team does not consider a couple of things at play here fixable:

  • All nodes are stopped at the same time. There is no way they can coordinate while being shut down at the same time. Hopefully with feature flags, this would not be a common scenario even for feature version upgrades. 4.0 will require the majority of nodes to be available for schema operations.
  • Transient mirrored queues. How much sense does this combination make? Why would a user want things to be both transient and replicated for availability?

@velimir
Contributor Author

velimir commented Jun 27, 2019

All nodes are stopped at the same time.

They are not stopped at the same time. 2 out of 3 nodes are stopped one after another.

Why would a user want things to be both transient and replicated for availability?

For example, to strike a compromise between availability and disk usage when necessary?

Consider the following scenario: a cluster of rmq{1..3} with pause_minority, and an ha-mode: nodes policy that pins queues to [rmq1]. In case of a network hiccup between rmq1 and the rest of the cluster, once rmq1 recovers after pausing as a minority, all transient queues with a master on rmq1 will be unavailable.
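
To illustrate, the scenario above corresponds to roughly the following configuration (the same shape as the config and policy from the reproduction steps, only with pause_minority and a single pinned node; the policy name and pattern are just examples):

    {cluster_partition_handling, pause_minority}

    {
        "vhost": "/",
        "name": "rmq-one",
        "pattern": "^rmq-one-.*$",
        "apply-to": "queues",
        "definition": {
            "ha-mode": "nodes",
            "ha-params": ["rabbit@rmq1"],
            "ha-sync-mode": "automatic"
        },
        "priority": 0
    }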

I didn't expect such a simple scenario to cause a major hiccup on the client side (all operations fail with a timeout error). If some configurations are not valid/acceptable from RabbitMQ's perspective, I would expect them to be rejected by RabbitMQ at a validation step to protect users from a disaster.

@michaelklishin
Member

michaelklishin commented Jun 27, 2019

Queue durability is orthogonal to how much disk activity it performs.

The message in the log says

Stopping all nodes on master shutdown since no synchronised slave is available

You can choose to elect an out-of-sync mirror. I don't know how that should work for transient queues, however; the spec leaves all the distributed aspects entirely out of scope (as do most messaging protocols, unfortunately).
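
For context, electing an out-of-sync mirror on controlled shutdown is governed by the ha-promote-on-shutdown policy key (default when-synced). A sketch of the reproduction policy with it set to always — whether that is meaningful for a transient queue is exactly the open question here:

    {
        "vhost": "/",
        "name": "rmq-two",
        "pattern": "^rmq-two-.*$",
        "apply-to": "queues",
        "definition": {
            "ha-mode": "nodes",
            "ha-params": ["rabbit@rmq2", "rabbit@rmq3"],
            "ha-sync-mode": "automatic",
            "ha-promote-on-shutdown": "always"
        },
        "priority": 0
    }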

Semantically using transient entities (as in AMQP 0-9-1) and asking the system to replicate them makes no sense. The above example only confirms this: if you pin a transient queue to two nodes and they are both down, what should happen? Should this queue exist or not when the nodes come back up and promotion of out-of-sync mirrors is allowed? It is transient and all of its designated nodes were offline at some point; but it is also mirrored and promotion is allowed. I don't have an answer, as this combination of settings is too unlikely to be set intentionally.

For 4.0 we are considering removing mirroring for transient queues, or maybe even removing classic mirroring entirely in favour of quorum and other queue types which are a lot more purpose-built.
Not every combination of queue properties makes sense, and sometimes it is impossible to come up with a sensible behaviour. Our team's time is better spent elsewhere.

@kjnilsson

@michaelklishin michaelklishin changed the title from "Transient HA queues are not deleted when all host nodes are down/up" to "Transient mirrored classic queues are not deleted when there are no replicas available for promotion" Jun 27, 2019
@velimir
Contributor Author

velimir commented Jun 27, 2019

The above example only confirms this

The above example confirms that the queue could end up in an unresponsive state using a set of configurations that is accepted by RabbitMQ.

if you pin a transient queue to two nodes and they are both down, what should happen?

I would be happy with any consistent decision here, but regardless of the end state, it should not break the clients; they should be allowed either to re-declare the queue or to re-use it.

@michaelklishin
Member

michaelklishin commented Jun 27, 2019

@velimir the decision is that this combination of settings is conflicting, which is why we plan on either making it invalid, making it quietly ineffective (exclusive queues are already never mirrored), or removing mirrored queues entirely. If you have the time to look into what the current implementation could do, you are welcome to contribute an improvement. I don't know how you'd resolve the contradiction between the transient property ("please remove this when all nodes that host this queue shut down") and mirroring with promotion of out-of-sync replicas enabled ("please keep a replica around even if consistency cannot be guaranteed"). But when a queue becomes unpromotable, perhaps it can be removed if it is transient.

@gerhard
Contributor

gerhard commented Jun 27, 2019

This feels related to #1501

A potential quick fix would be for clients to try deleting a queue if it cannot be re-declared.
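
A rough sketch of that client-side workaround, using pika (the exact exception depends on the client library and on how the broker fails the declare; a channel-level error is assumed here):

    # Hypothetical "delete, then re-declare" workaround; names and error handling are illustrative.
    import pika
    import pika.exceptions

    def ensure_queue(connection, queue_name):
        channel = connection.channel()
        try:
            # Normal path: (re-)declare the transient queue.
            channel.queue_declare(queue=queue_name, durable=False)
            return channel
        except pika.exceptions.ChannelClosedByBroker:
            # Declaration failed, e.g. the queue record still points at a dead master.
            # Delete the stale queue on a fresh channel, then declare it again.
            channel = connection.channel()
            channel.queue_delete(queue=queue_name)
            channel.queue_declare(queue=queue_name, durable=False)
            return channel

    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = ensure_queue(connection, "rmq-two-queue")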

A better fix would be to extend the code introduced to fix #1501. We would be happy to receive your proposed code changes.

@hairyhum might want to weigh in on this; he did most (maybe all?) of the work for #1501

nicolasbock added a commit to nicolasbock/hotsos that referenced this issue Oct 9, 2021
See upstream issue [1] for more details.

[1] rabbitmq/rabbitmq-server#2045

Signed-off-by: Nicolas Bock <nicolas.bock@canonical.com>
dosaboy pushed a commit to canonical/hotsos that referenced this issue Oct 15, 2021
See upstream issue [1] for more details.

[1] rabbitmq/rabbitmq-server#2045

Closes: #104
Signed-off-by: Nicolas Bock <nicolas.bock@canonical.com>
dougszumski added a commit to stackhpc/kolla-ansible that referenced this issue Jan 24, 2022
When OpenStack is deployed with Kolla-Ansible, by default there
are no durable queues or exchanges created by the OpenStack
services in RabbitMQ. In Rabbit terminology, not being durable
is referred to as `transient`, and this means that the queue
is generally held in memory.

Whether OpenStack services create durable or transient queues is
controlled by the Oslo Notification config option:
`amqp_durable_queues`. In Kolla-Ansible, this remains set to
the default of `False` in all services. The only `durable`
objects are the `amq*` exchanges which are internal to RabbitMQ.

For clustered RabbitMQ deployments, Kolla-Ansible configures all
queues as `replicated` [1]. Replication occurs over all nodes
in the cluster. RabbitMQ refers to this as 'mirroring of classic
queues'.

In summary, this means that a multi-node Kolla-Ansible deployment
will end up with a large number of transient, mirrored queues
and exchanges. However, the RabbitMQ documentation warns against
this, stating that 'For replicated queues, the only reasonable
option is to use durable queues' [2]. This is discussed
further in the following bug report: [3].

Whilst we could try enabling the `amqp_durable_queues` option
for each service (this is suggested in [4]), there are
a number of complexities with this approach, not limited to:

1) RabbitMQ is planning to remove classic queue mirroring in
   favor of 'Quorum queues' in a forthcoming release [5].
2) Durable queues will be written to disk, which may cause
   performance problems at scale.
3) Potential for race conditions and other complexity
   discussed recently on the mailing list under:
   `[ops] [kolla] RabbitMQ High Availability`

The remaining option, proposed here, is to use classic
non-mirrored queues everywhere, and rely on services to recover
if the node hosting a queue or exchange they are using fails.
There is some discussion of this approach in [6]. The downside
of potential message loss needs to be weighed against the real
upsides of increasing the performance of RabbitMQ, and moving
to a configuration which is officially supported and hopefully
more stable.

[1] https://www.rabbitmq.com/ha.html
[2] https://www.rabbitmq.com/queues.html
[3] rabbitmq/rabbitmq-server#2045
[4] https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit
[5] https://blog.rabbitmq.com/posts/2021/08/4.0-deprecation-announcements/
[6] https://fuel-ccp.readthedocs.io/en/latest/design/ref_arch_1000_nodes.html#replication

Change-Id: I91d0e23b22319cf3fdb7603f5401d24e3b76a56e
cityofships pushed a commit to stackhpc/kolla-ansible that referenced this issue Feb 22, 2022
When OpenStack is deployed with Kolla-Ansible, by default there
are no durable queues or exchanges created by the OpenStack
services in RabbitMQ. In Rabbit terminology, not being durable
is referred to as `transient`, and this means that the queue
is generally held in memory.

Whether OpenStack services create durable or transient queues is
traditionally controlled by the Oslo Notification config option:
`amqp_durable_queues`. In Kolla-Ansible, this remains set to
the default of `False` in all services. The only `durable`
objects are the `amq*` exchanges which are internal to RabbitMQ.

More recently, Oslo Notification has introduced support for
Quorum queues [7]. These are a successor to durable classic
queues, however it isn't yet clear if they are a good fit for
OpenStack in general [8].

For clustered RabbitMQ deployments, Kolla-Ansible configures all
queues as `replicated` [1]. Replication occurs over all nodes
in the cluster. RabbitMQ refers to this as 'mirroring of classic
queues'.

In summary, this means that a multi-node Kolla-Ansible deployment
will end up with a large number of transient, mirrored queues
and exchanges. However, the RabbitMQ documentation warns against
this, stating that 'For replicated queues, the only reasonable
option is to use durable queues' [2]. This is discussed
further in the following bug report: [3].

Whilst we could try enabling the `amqp_durable_queues` option
for each service (this is suggested in [4]), there are
a number of complexities with this approach, not limited to:

1) RabbitMQ is planning to remove classic queue mirroring in
   favor of 'Quorum queues' in a forthcoming release [5].
2) Durable queues will be written to disk, which may cause
   performance problems at scale. Note that this includes
   Quorum queues which are always durable.
3) Potential for race conditions and other complexity
   discussed recently on the mailing list under:
   `[ops] [kolla] RabbitMQ High Availability`

The remaining option, proposed here, is to use classic
non-mirrored queues everywhere, and rely on services to recover
if the node hosting a queue or exchange they are using fails.
There is some discussion of this approach in [6]. The downside
of potential message loss needs to be weighed against the real
upsides of increasing the performance of RabbitMQ, and moving
to a configuration which is officially supported and hopefully
more stable. In the future, we can then consider promoting
specific queues to quorum queues, in cases where message loss
can result in failure states which are hard to recover from.

[1] https://www.rabbitmq.com/ha.html
[2] https://www.rabbitmq.com/queues.html
[3] rabbitmq/rabbitmq-server#2045
[4] https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit
[5] https://blog.rabbitmq.com/posts/2021/08/4.0-deprecation-announcements/
[6] https://fuel-ccp.readthedocs.io/en/latest/design/ref_arch_1000_nodes.html#replication
[7] https://bugs.launchpad.net/oslo.messaging/+bug/1942933
[8] https://www.rabbitmq.com/quorum-queues.html#use-cases

Partial-Bug: #1954925
Change-Id: I91d0e23b22319cf3fdb7603f5401d24e3b76a56e
dougszumski added a commit to stackhpc/kolla-ansible that referenced this issue Mar 14, 2022
Backport note: This patch has been updated to retain the existing
behaviour by default. A temporary variable,
rabbitmq_remove_ha_all_policy, has been added which may be set to true
in order to remove the ha-all policy. In order to support changing the
policy without upgrading, the ha-all policy is removed on deploys,
in addition to upgrades.

When OpenStack is deployed with Kolla-Ansible, by default there
are no durable queues or exchanges created by the OpenStack
services in RabbitMQ. In Rabbit terminology, not being durable
is referred to as `transient`, and this means that the queue
is generally held in memory.

Whether OpenStack services create durable or transient queues is
traditionally controlled by the Oslo Notification config option:
`amqp_durable_queues`. In Kolla-Ansible, this remains set to
the default of `False` in all services. The only `durable`
objects are the `amq*` exchanges which are internal to RabbitMQ.

More recently, Oslo Notification has introduced support for
Quorum queues [7]. These are a successor to durable classic
queues, however it isn't yet clear if they are a good fit for
OpenStack in general [8].

For clustered RabbitMQ deployments, Kolla-Ansible configures all
queues as `replicated` [1]. Replication occurs over all nodes
in the cluster. RabbitMQ refers to this as 'mirroring of classic
queues'.

In summary, this means that a multi-node Kolla-Ansible deployment
will end up with a large number of transient, mirrored queues
and exchanges. However, the RabbitMQ documentation warns against
this, stating that 'For replicated queues, the only reasonable
option is to use durable queues' [2]. This is discussed
further in the following bug report: [3].

Whilst we could try enabling the `amqp_durable_queues` option
for each service (this is suggested in [4]), there are
a number of complexities with this approach, not limited to:

1) RabbitMQ is planning to remove classic queue mirroring in
   favor of 'Quorum queues' in a forthcoming release [5].
2) Durable queues will be written to disk, which may cause
   performance problems at scale. Note that this includes
   Quorum queues which are always durable.
3) Potential for race conditions and other complexity
   discussed recently on the mailing list under:
   `[ops] [kolla] RabbitMQ High Availability`

The remaining option, proposed here, is to use classic
non-mirrored queues everywhere, and rely on services to recover
if the node hosting a queue or exchange they are using fails.
There is some discussion of this approach in [6]. The downside
of potential message loss needs to be weighed against the real
upsides of increasing the performance of RabbitMQ, and moving
to a configuration which is officially supported and hopefully
more stable. In the future, we can then consider promoting
specific queues to quorum queues, in cases where message loss
can result in failure states which are hard to recover from.

[1] https://www.rabbitmq.com/ha.html
[2] https://www.rabbitmq.com/queues.html
[3] rabbitmq/rabbitmq-server#2045
[4] https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit
[5] https://blog.rabbitmq.com/posts/2021/08/4.0-deprecation-announcements/
[6] https://fuel-ccp.readthedocs.io/en/latest/design/ref_arch_1000_nodes.html#replication
[7] https://bugs.launchpad.net/oslo.messaging/+bug/1942933
[8] https://www.rabbitmq.com/quorum-queues.html#use-cases

Partial-Bug: #1954925
Change-Id: I91d0e23b22319cf3fdb7603f5401d24e3b76a56e
(cherry picked from commit 6bfe192)
dougszumski added a commit to stackhpc/kolla-ansible that referenced this issue Mar 14, 2022
Backport note: This patch has been updated to retain the existing
behaviour by default. A temporary variable,
rabbitmq_remove_ha_all_policy, has been added which may be set to true
in order to remove the ha-all policy. In order to support changing the
policy without upgrading, the the ha-all policy is removed on deploys,
in addition to upgrades.

When OpenStack is deployed with Kolla-Ansible, by default there
are no durable queues or exchanges created by the OpenStack
services in RabbitMQ. In Rabbit terminology, not being durable
is referred to as `transient`, and this means that the queue
is generally held in memory.

Whether OpenStack services create durable or transient queues is
traditionally controlled by the Oslo Notification config option:
`amqp_durable_queues`. In Kolla-Ansible, this remains set to
the default of `False` in all services. The only `durable`
objects are the `amq*` exchanges which are internal to RabbitMQ.

More recently, Oslo Notification has introduced support for
Quorum queues [7]. These are a successor to durable classic
queues, however it isn't yet clear if they are a good fit for
OpenStack in general [8].

For clustered RabbitMQ deployments, Kolla-Ansible configures all
queues as `replicated` [1]. Replication occurs over all nodes
in the cluster. RabbitMQ refers to this as 'mirroring of classic
queues'.

In summary, this means that a multi-node Kolla-Ansible deployment
will end up with a large number of transient, mirrored queues
and exchanges. However, the RabbitMQ documentation warns against
this, stating that 'For replicated queues, the only reasonable
option is to use durable queues: [2]`. This is discussed
further in the following bug report: [3].

Whilst we could try enabling the `amqp_durable_queues` option
for each service (this is suggested in [4]), there are
a number of complexities with this approach, not limited to:

1) RabbitMQ is planning to remove classic queue mirroring in
   favor of 'Quorum queues' in a forthcoming release [5].
2) Durable queues will be written to disk, which may cause
   performance problems at scale. Note that this includes
   Quorum queues which are always durable.
3) Potential for race conditions and other complexity
   discussed recently on the mailing list under:
   `[ops] [kolla] RabbitMQ High Availability`

The remaining option, proposed here, is to use classic
non-mirrored queues everywhere, and rely on services to recover
if the node hosting a queue or exchange they are using fails.
There is some discussion of this approach in [6]. The downside
of potential message loss needs to be weighed against the real
upsides of increasing the performance of RabbitMQ, and moving
to a configuration which is officially supported and hopefully
more stable. In the future, we can then consider promoting
specific queues to quorum queues, in cases where message loss
can result in failure states which are hard to recover from.

[1] https://www.rabbitmq.com/ha.html
[2] https://www.rabbitmq.com/queues.html
[3] rabbitmq/rabbitmq-server#2045
[4] https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit
[5] https://blog.rabbitmq.com/posts/2021/08/4.0-deprecation-announcements/
[6] https://fuel-ccp.readthedocs.io/en/latest/design/ref_arch_1000_nodes.html#replication
[7] https://bugs.launchpad.net/oslo.messaging/+bug/1942933
[8] https://www.rabbitmq.com/quorum-queues.html#use-cases

Partial-Bug: #1954925
Change-Id: I91d0e23b22319cf3fdb7603f5401d24e3b76a56e
(cherry picked from commit 6bfe192)
dougszumski added a commit to stackhpc/kolla-ansible that referenced this issue Mar 14, 2022
Backport note: This patch has been updated to retain the existing
behaviour by default. A temporary variable,
rabbitmq_remove_ha_all_policy, has been added which may be set to true
in order to remove the ha-all policy. In order to support changing the
policy without upgrading, the the ha-all policy is removed on deploys,
in addition to upgrades.

When OpenStack is deployed with Kolla-Ansible, by default there
are no durable queues or exchanges created by the OpenStack
services in RabbitMQ. In Rabbit terminology, not being durable
is referred to as `transient`, and this means that the queue
is generally held in memory.

Whether OpenStack services create durable or transient queues is
traditionally controlled by the Oslo Notification config option:
`amqp_durable_queues`. In Kolla-Ansible, this remains set to
the default of `False` in all services. The only `durable`
objects are the `amq*` exchanges which are internal to RabbitMQ.

More recently, Oslo Notification has introduced support for
Quorum queues [7]. These are a successor to durable classic
queues, however it isn't yet clear if they are a good fit for
OpenStack in general [8].

For clustered RabbitMQ deployments, Kolla-Ansible configures all
queues as `replicated` [1]. Replication occurs over all nodes
in the cluster. RabbitMQ refers to this as 'mirroring of classic
queues'.

In summary, this means that a multi-node Kolla-Ansible deployment
will end up with a large number of transient, mirrored queues
and exchanges. However, the RabbitMQ documentation warns against
this, stating that 'For replicated queues, the only reasonable
option is to use durable queues: [2]`. This is discussed
further in the following bug report: [3].

Whilst we could try enabling the `amqp_durable_queues` option
for each service (this is suggested in [4]), there are
a number of complexities with this approach, not limited to:

1) RabbitMQ is planning to remove classic queue mirroring in
   favor of 'Quorum queues' in a forthcoming release [5].
2) Durable queues will be written to disk, which may cause
   performance problems at scale. Note that this includes
   Quorum queues which are always durable.
3) Potential for race conditions and other complexity
   discussed recently on the mailing list under:
   `[ops] [kolla] RabbitMQ High Availability`

The remaining option, proposed here, is to use classic
non-mirrored queues everywhere, and rely on services to recover
if the node hosting a queue or exchange they are using fails.
There is some discussion of this approach in [6]. The downside
of potential message loss needs to be weighed against the real
upsides of increasing the performance of RabbitMQ, and moving
to a configuration which is officially supported and hopefully
more stable. In the future, we can then consider promoting
specific queues to quorum queues, in cases where message loss
can result in failure states which are hard to recover from.

[1] https://www.rabbitmq.com/ha.html
[2] https://www.rabbitmq.com/queues.html
[3] rabbitmq/rabbitmq-server#2045
[4] https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit
[5] https://blog.rabbitmq.com/posts/2021/08/4.0-deprecation-announcements/
[6] https://fuel-ccp.readthedocs.io/en/latest/design/ref_arch_1000_nodes.html#replication
[7] https://bugs.launchpad.net/oslo.messaging/+bug/1942933
[8] https://www.rabbitmq.com/quorum-queues.html#use-cases

Partial-Bug: #1954925
Change-Id: I91d0e23b22319cf3fdb7603f5401d24e3b76a56e
(cherry picked from commit 6bfe192)
openstack-mirroring pushed a commit to openstack/kolla-ansible that referenced this issue Mar 29, 2022
openstack-mirroring pushed a commit to openstack/kolla-ansible that referenced this issue Mar 30, 2022
openstack-mirroring pushed a commit to openstack/kolla-ansible that referenced this issue Mar 30, 2022
jovial pushed a commit to stackhpc/kolla-ansible that referenced this issue May 16, 2022
jovial pushed a commit to stackhpc/kolla-ansible that referenced this issue Aug 4, 2022
YKonovalov pushed a commit to YKonovalov/tf-kolla-ansible that referenced this issue Nov 12, 2023