
Management API queue view change from version upgrade #3080

Closed

b-borden opened this issue Jun 2, 2021 · 2 comments

Comments


b-borden commented Jun 2, 2021

Hi there,

I was doing a rolling upgrade of my test cluster from RabbitMQ version 3.8.14 to 3.8.16 and found a behaviour change in the management API. For an incoming 3.8.16 node, its /api/queues response is missing several fields, such as state (examples at the bottom). These missing fields, however, can still be seen with:

  • rabbitmqctl list_queues name state ...
  • rabbitmqctl eval '{ok, Q} = rabbit_amqqueue:lookup(rabbit_misc:r(<<"/">>, queue, <<"test">>)), rabbit_amqqueue:info(Q).'
  • by querying /api/queues on a remaining 3.8.14 node
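
The /api/queues comparisons below were made with requests along these lines; the host, port, and guest:guest credentials are placeholders for this test environment:

curl -s -u guest:guest 'http://<node-ip>:15672/api/queues/%2F/test'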

As a note, I did not encounter this when upgrading my broker from 3.8.13 to 3.8.14.

Diving into the code, I’ve traced the source of the difference to pg2:get_members/1 vs. pg:get_members/2 (the latter adopted for OTP 24 support).

It looks like pg2 would previously have returned a remote PID. Now a local PID is returned for a process that does not seem to have this data.

rabbitmqctl -n node_3_8_14 eval '[Pid] = [P || P <- pg2:get_members(management_db)], Pid.'
<11599.708.0>  # <--- local
rabbitmqctl -n node_3_8_16 eval '[Pid] = [P || P <- pg:get_members(rabbitmq_management, management_db)], Pid.'
<11599.1650.0> # <--- remote
rabbitmqctl -n node_3_8_16 eval '[Pid] = [P || P <- pg2:get_members(management_db)], Pid.'
<11932.708.0>  # <--- local
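
Mapping the group members through node/1 confirms where each returned pid actually lives (node names as in the commands above):

rabbitmqctl -n node_3_8_16 eval '[node(P) || P <- pg:get_members(rabbitmq_management, management_db)].'
rabbitmqctl -n node_3_8_16 eval '[node(P) || P <- pg2:get_members(management_db)].'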

Once the 3.8.14 node is removed, the management API returns the expected result.

Reproduction

  1. Start a single 3.8.14 node
  2. Apply the following HA queue policy (an example command follows this list):
     ha-mode: all
     ha-sync-mode: automatic
  3. Create a durable classic mirrored queue
  4. Start a 3.8.16 node and have it join the cluster
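
One way to apply the policy from step 2; the policy name and queue pattern here are illustrative:

rabbitmqctl set_policy --apply-to queues ha-test '^test$' '{"ha-mode":"all","ha-sync-mode":"automatic"}'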

Example /api/queues responses

New 3.8.16 node:

{
  "garbage_collection": {
    "max_heap_size": -1,
    "min_bin_vheap_size": -1,
    "min_heap_size": -1,
    "fullsweep_after": -1,
    "minor_gcs": -1
  },
  "consumer_details": [],
  "arguments": {},
  "auto_delete": false,
  "deliveries": [],
  "durable": true,
  "exclusive": false,
  "incoming": [],
  "name": "test",
  "node": "rabbit@ip5",
  "slave_nodes": [
    "rabbit@ip1",
    "rabbit@ip2"
  ],
  "synchronised_slave_nodes": [
    "rabbit@ip2",
    "rabbit@ip1"
  ],
  "type": "classic",
  "vhost": "/"
}

Existing 3.8.14 node:

{
  "consumer_details": [],
  "arguments": {},
  "auto_delete": false,
  "backing_queue_status": {
    "avg_ack_egress_rate": 0,
    "avg_ack_ingress_rate": 0,
    "avg_egress_rate": 0,
    "avg_ingress_rate": 0,
    "delta": [
      "delta",
      "undefined",
      0,
      0,
      "undefined"
    ],
    "len": 0,
    "mirror_seen": 0,
    "mirror_senders": 0,
    "mode": "lazy",
    "next_seq_id": 0,
    "q1": 0,
    "q2": 0,
    "q3": 0,
    "q4": 0,
    "target_ram_count": "infinity"
  },
  "consumer_utilisation": null,
  "consumers": 0,
  "deliveries": [],
  "durable": true,
  "effective_policy_definition": {
    "ha-mode": "all",
    "ha-sync-mode": "automatic",
    "max-length": 8000000,
    "overflow": "reject-publish",
    "queue-mode": "lazy"
  },
  "exclusive": false,
  "exclusive_consumer_tag": null,
  "garbage_collection": {
    "fullsweep_after": 65535,
    "max_heap_size": 0,
    "min_bin_vheap_size": 46422,
    "min_heap_size": 233,
    "minor_gcs": 69
  },
  "head_message_timestamp": null,
  "idle_since": "2021-05-31 19:17:39",
  "incoming": [],
  "memory": 13912,
  "message_bytes": 0,
  "message_bytes_paged_out": 0,
  "message_bytes_persistent": 0,
  "message_bytes_ram": 0,
  "message_bytes_ready": 0,
  "message_bytes_unacknowledged": 0,
  "messages": 0,
  "messages_details": {
    "rate": 0
  },
  "messages_paged_out": 0,
  "messages_persistent": 0,
  "messages_ram": 0,
  "messages_ready": 0,
  "messages_ready_details": {
    "rate": 0
  },
  "messages_ready_ram": 0,
  "messages_unacknowledged": 0,
  "messages_unacknowledged_details": {
    "rate": 0
  },
  "messages_unacknowledged_ram": 0,
  "name": "test",
  "node": "rabbit@ip5",
  "operator_policy": null,
  "policy": "policy",
  "recoverable_slaves": [
    "rabbit@ip1",
    "rabbit@ip4",
    "rabbit@ip3",
    "rabbit@ip2"
  ],
  "reductions": 74803,
  "reductions_details": {
    "rate": 0
  },
  "single_active_consumer_tag": null,
  "slave_nodes": [
    "rabbit@ip1",
    "rabbit@ip2"
  ],
  "state": "running",
  "synchronised_slave_nodes": [
    "rabbit@ip2",
    "rabbit@ip1"
  ],
  "type": "classic",
  "vhost": "/"
}
michaelklishin (Member) commented Jun 2, 2021

Versions prior to 3.8.16 do not have pg, and versions starting with 3.8.16 are not guaranteed to have pg2. I'm afraid there isn't much we can do. We very intentionally did not try to develop a bridge module: our team has reinvented enough things that modern OTP now provides, and we try to avoid doing that as much as possible. We are willing to accept this behavior and focus on other things.

The two versions are interoperable in other ways (e.g. a client would not know what version it is connected to). As soon as all nodes are upgraded, the behavior is consistent again: all nodes use the same group membership module, and all queue stats are available from any node.
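
A quick sanity check, assuming (as the pg:get_members(rabbitmq_management, management_db) call above suggests) that the plugin's pg scope process is registered locally under the name rabbitmq_management: whereis/1 shows whether that scope is running on a node, while pg2:which_groups/0 lists the pg2 groups an older node still maintains.

rabbitmqctl -n node_3_8_16 eval 'whereis(rabbitmq_management).'
rabbitmqctl -n node_3_8_14 eval 'pg2:which_groups().'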

michaelklishin (Member):

This issue is specific to metric collection that uses the management plugin. If metrics are collected using Prometheus, every node will only serve its own stats. Therefore it should not matter what process group membership module is used on its cluster peers.
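
For example, each node's own stats can be scraped directly from its rabbitmq_prometheus endpoint (default port 15692; the host below is a placeholder):

curl -s http://<node-ip>:15692/metrics | grep rabbitmq_queue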
