Load a backlog of messages on the cluster before upgrade #64
Comments
I'm having a bit of a 🤯 here. The initial backlog is set to 100k, the message size is set to 16 bytes, and we have 4 queues: 2 mirrored with ha-all and 2 quorum, as of here: Therefore each node should hold 100k messages per queue (leader or mirror) times 16 bytes, i.e. 100,000 x 4 x 16 = 6,400,000 bytes. However, I observe the node memory going up to ~600 MB 🤯 Moreover, the memory report from a node shows ~89 MB for a quorum queue and ~100ish MB for the mirrors. These figures change over time as the backlog is drained or consumed. Still, what the 🤯
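For reference, a minimal sketch of the naive payload-only maths above (Python, numbers as quoted in the comment):

```python
# Naive payload-only estimate from the numbers above (ignores all per-message
# metadata and Erlang runtime overhead, which is why it disagrees with reality).
messages_per_queue = 100_000
queues = 4
payload_bytes = 16

naive_total = messages_per_queue * queues * payload_bytes
print(f"naive estimate: {naive_total / 1e6:.1f} MB")  # ~6.4 MB
print("observed node memory: ~600 MB")                # what was actually seen
```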
We could look at the Erlang Grafana dashboard made by the core team to see where the memory is being used up. Admittedly, the maths is an oversimplification since it does not account for how Erlang uses memory. See also Gerhard's TGIR episode: RMQ ate my RAM.
Definitely reach out to our friends - there are known rough edges, especially with quorum queues, so this could be one of them (known or not yet known).
Context
We reached out to the Core team with our analysis and expectations. We deployed Prometheus-Grafana in dev2-bunny and could not observe anything outstanding or massively obvious explaining the behaviour. We are waiting for the Core team to provide some insights regarding the memory utilisation.
We did a rolling restart of a 3-node RMQ cluster with 1.5M ready messages of size 16 bytes on each node, using 3 classic mirrored queues. We observed that nodes 1 and 2 rolled fairly quickly and pushed the queue masters to node 0. Subsequently, node 0 became mirror sync critical and stayed in that state. The problem was made worse by the memory usage being close to the high memory watermark (one node got OOM killed), and the synchronisation took a relatively long time (> 5 minutes). Even though RabbitMQ was not unavailable per se, since we were able to connect to it, the queues were "unavailable" because they were synchronising for a very long time.
Conclusions
And the answer to the mystery is in the RabbitMQ docs:
If we consider minimum values for metadata and attributes, we get a message size of 736 bytes. If we estimate 1024 bytes of metadata, the message size would be 1040 bytes. Multiplied by 1.5M messages, that is roughly 1 GB and 1.4 GB respectively.
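A small sketch of the revised maths, assuming the minimum-metadata and 1024-byte-metadata figures quoted above:

```python
# Revised estimate with per-message metadata included (figures from the comment
# above; the exact overhead depends on queue type and RabbitMQ version).
payload_bytes = 16
metadata_minimum = 720      # minimum metadata/attributes -> 736 bytes per message
metadata_estimate = 1024    # conservative estimate       -> 1040 bytes per message
ready_messages = 1_500_000  # per node during the rolling-restart test

for metadata in (metadata_minimum, metadata_estimate):
    per_message = payload_bytes + metadata
    total = per_message * ready_messages
    print(f"{per_message} B/msg x 1.5M -> {total / 2**30:.2f} GiB")  # ~1.0 and ~1.45 GiB
```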
Context
Tweaked the default values in the tool configuration. Using an initial backlog of 120,000 messages per queue of size 16 bytes, we are able to generate a load of ~70-80% of the high water mark (~800 MB). This link sheds light on how to calculate the total message size. We are using four queues, two of each type: quorum and mirrored. The publisher rate is set at 100 messages per second and the consumer processing time is 1 millisecond. With these constraints, we are able to keep ready messages in the queues at all times during the test duration (120 seconds).
The unavailability period is still set to 30 seconds. Lower values feel too aggressive and may report false positives. We could consider testing with a 20-second threshold, although we should first explore how long a leader election or master relocation takes in our setup, to make sure we do not set too tight a value.
The following screenshots show the memory available before hitting the high memory watermark, the number of ready messages, and the number of incoming/outgoing messages.
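As a sanity check on the publish/consume rates quoted above, here is a rough sketch (assuming the consumer is limited only by its 1 ms processing time) of why the backlog cannot be fully drained within the 120-second test window:

```python
# Why ready messages never drop to zero during the 120 s window (per queue).
# Assumes the consumer is limited only by its 1 ms processing time; broker-side
# delivery overhead would only slow the drain further.
initial_backlog = 120_000    # messages per queue
publish_rate = 100           # messages per second
consume_rate = 1 / 0.001     # 1 ms per message -> at most 1000 msg/s
test_duration = 120          # seconds

net_drain = (consume_rate - publish_rate) * test_duration   # 108,000 messages
remaining = initial_backlog - net_drain
print(f"worst-case ready messages left per queue: {remaining:,.0f}")  # ~12,000
```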
Verified today that the pipeline is running well with an initial backlog of messages, according to the tool configuration. We have to let it run and generate some data so we can analyse whether there are any signs of data loss or unavailability.
Is your feature request related to a problem? Please describe.
At the moment, we do not load a backlog of messages in our cluster before we upgrade and test the results. This can be seen on these lines:
Loading a backlog of messages is useful as it tests our logic that certain nodes that are critical to synchronization should wait for the sync to complete before being rolled. Otherwise, messages may be lost.
Describe the solution you'd like
The solution has two parts -
The size of the backlog
From the RabbitMQ memory docs, we know that paging starts happening at 50% of the memory high watermark. The idea here is that paging will add further time for the synchronisation to occur. This in turn increases the likelihood that nodes need to wait for sync before they can be rolled. So, let's set the backlog to be above 50% of the memory high watermark to create this situation.
RabbitMQ internals and Maths 🤓
vm_memory_high_watermark.absolute
vm_memory_high_watermark_paging_ratio
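A minimal sketch of how these two settings combine, assuming the 50% paging ratio mentioned above and using the ~800 MB watermark from the comments as an example value:

```python
# How the two settings interact: classic queues start paging message bodies to
# disk once memory use reaches watermark * paging ratio (50% per the docs above).
# The 800 MB watermark is the figure mentioned in the comments, not a default.
vm_memory_high_watermark_absolute = 800 * 1024**2   # bytes
vm_memory_high_watermark_paging_ratio = 0.5

paging_threshold = vm_memory_high_watermark_absolute * vm_memory_high_watermark_paging_ratio
print(f"paging starts at ~{paging_threshold / 1024**2:.0f} MiB")  # ~400 MiB
```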
Setting the value
The RabbitTestTool has a flag to set the initial backlog (initialPublish perhaps), and the topology file also includes the size of each message. Set this combination such that (number of messages) * (size of a message) is about 70% of the high watermark, the value calculated in the previous section. At this point, we have a backlog, and messages are being paged to disk.
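A rough sketch of that sizing calculation, using the metadata-inclusive message size from the comments rather than the raw payload (the watermark value and the initialPublish flag name are assumptions, not verified against the tool):

```python
# Back out an initial backlog from a target fraction of the high watermark.
# Per the comments above, the effective in-memory size of a message (~736-1040 B
# with metadata) is far larger than the 16 B payload, so we size with that figure.
high_watermark_bytes = 800 * 1024**2   # example watermark, taken from the comments
target_fraction = 0.70                 # aim comfortably above the 50% paging threshold
per_message_bytes = 736                # 16 B payload + minimum metadata
queues = 4

messages_total = int(high_watermark_bytes * target_fraction / per_message_bytes)
print(f"initialPublish per queue: ~{messages_total // queues:,}")  # ~199,000
```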