CQ shared store write optimisations #8507

Merged
9 commits merged into main from lh-msg-store-bis on Jun 20, 2023

Conversation

@lhoguin (Contributor) commented Jun 9, 2023

Another round of CQ shared message store optimisations and refactors, this time around writes.

The first commit is more of a refactor than anything else, although no longer calling fsync does help in some scenarios. The second commit should provide a noticeable performance boost when there are no consumers.
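
As a rough illustration of the fsync point only (this is not the actual rabbit_msg_store write path; the module name, the `SyncOnWrite` flag, and the `delayed_write` parameters below are made up), skipping a per-write `file:sync/1` means a write only has to reach the file driver, not the disk:

```erlang
%% Hedged sketch: append a message blob to a store file, optionally forcing
%% an fsync per write. Only the fsync-vs-no-fsync shape matters here.
-module(store_write_sketch).
-export([open/1, append/3]).

open(Path) ->
    %% raw + binary keeps the write path cheap; delayed_write batches small
    %% writes in the driver before they reach the file system.
    file:open(Path, [append, raw, binary, {delayed_write, 512 * 1024, 2000}]).

append(Fd, MsgBin, SyncOnWrite) ->
    ok = file:write(Fd, MsgBin),
    case SyncOnWrite of
        true  -> ok = file:sync(Fd);  %% force the data to disk on every write
        false -> ok                   %% rely on a later, batched sync instead
    end.
```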

@mkuratczyk I have renamed the branch.


Before, confirms would only be sent periodically or when rolling over to a new file.

We know the messages are on disk or were already acked, so there is no need to do set intersections/subtractions in this scenario.
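
Not part of the PR, just a hedged sketch of what skipping that set arithmetic looks like; the module, function, and variable names below are hypothetical and only show the shape of the shortcut.

```erlang
%% Hypothetical sketch (not rabbit_msg_store code): confirming a batch of
%% message ids when every id is already known to be on disk or acked.
-module(confirm_sketch).
-export([confirm_fast/2, confirm_general/3]).

%% Fast path described in the commit note: the whole set can be confirmed
%% as-is, no set arithmetic needed.
confirm_fast(MsgIdSet, ConfirmFun) ->
    ConfirmFun(MsgIdSet).

%% General path: intersect with the on-disk set to find what can be
%% confirmed now, and subtract to find what must stay pending.
confirm_general(MsgIdSet, OnDiskSet, ConfirmFun) ->
    Confirmed = sets:intersection(MsgIdSet, OnDiskSet),
    Pending   = sets:subtract(MsgIdSet, OnDiskSet),
    ConfirmFun(Confirmed),
    Pending.
```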
Instead of having the message store send a message to the queue
with the confirms for messages ignored due to the flying
optimisation, we have the queue handle the confirms directly
when removing the messages.

This avoids sending potentially 1 Erlang message per 1 AMQP
message to the queue.
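
The sketch below is not the real message store or queue API; the module and message names are made up. It only illustrates the difference between casting one confirm per ignored message and returning the ignored ids from the remove call so the queue confirms them itself.

```erlang
%% Hypothetical sketch (made-up module and message names).
-module(flying_confirm_sketch).
-export([remove_old/2, remove_new/1]).

%% Old shape: one gen_server cast per message whose write was skipped by
%% the flying optimisation -- potentially 1 Erlang message per AMQP message.
remove_old(QueuePid, MsgIds) ->
    Ignored = flying_ignored(MsgIds),
    [gen_server:cast(QueuePid, {msg_store_confirm, Id}) || Id <- Ignored],
    ok.

%% New shape: return the ignored ids from the remove call so the calling
%% queue process confirms them directly, with no extra message passing.
remove_new(MsgIds) ->
    {ok, flying_ignored(MsgIds)}.

%% Placeholder for the flying-optimisation bookkeeping.
flying_ignored(MsgIds) -> MsgIds.
```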
Also, make use of the already-open file for multi-reads instead of opening, reading, and closing it for each message.
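
A minimal sketch of the multi-read idea, assuming messages are addressed by a list of `{Offset, Size}` locations; this is not the real message store read path.

```erlang
%% Sketch only: fetch several messages from one already-open file descriptor
%% instead of an open/read/close cycle per message.
-module(multi_read_sketch).
-export([read_many/2]).

%% Fd is a raw binary file descriptor obtained once via
%% file:open(Path, [read, raw, binary]) and kept open across reads.
%% Locations is a list of {Offset, Bytes} pairs for the wanted messages.
read_many(Fd, Locations) ->
    %% file:pread/2 performs all the positioned reads in a single call and
    %% returns the results in the same order as Locations.
    {ok, Bins} = file:pread(Fd, Locations),
    Bins.
```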
The way I initially did this, maybe_gc would be triggered based on candidates from 15s ago but run against candidates from just now. This is sub-optimal because, when messages are consumed rapidly, the just-now candidates are likely to be in a file that is about to be deleted, and we don't want to run compaction on those.

Instead, when sending the maybe_gc message we also send the candidates we had at the time. Then, 15s later, we check whether the file still exists. If it's gone, great! No compaction to do.
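
For illustration only (the real implementation lives in the message store modules and uses different names), a minimal sketch of that deferred check, assuming candidates are identified by file name:

```erlang
%% Hypothetical sketch of the deferred compaction check described above.
-module(gc_defer_sketch).
-export([schedule_maybe_gc/1, on_maybe_gc/1]).

%% Capture the candidates we have *now* and deliver them together with the
%% timer message 15 seconds later.
schedule_maybe_gc(CandidateFiles) ->
    erlang:send_after(15000, self(), {maybe_gc, CandidateFiles}).

%% When the timer fires, keep only candidate files that still exist; a file
%% deleted in the meantime needs no compaction at all.
on_maybe_gc(CandidateFiles) ->
    [File || File <- CandidateFiles, filelib:is_file(File)].
```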
@lhoguin marked this pull request as ready for review on June 20, 2023 at 12:54
@lhoguin (Contributor, Author) commented Jun 20, 2023

This is ready for review/merge.

@lhoguin requested a review from mkuratczyk on June 20, 2023 at 12:54
@michaelklishin added this to the 3.13.0 milestone on Jun 20, 2023
@mkuratczyk (Contributor) commented:
Full test results:
https://snapshots.raintank.io/dashboard/snapshot/oo6eAYwUI2exFVFxdt6mTQIsb7AT2hrt?orgId=2

The biggest difference is in confirm latency when using -c 1 (shown here for 5 kB messages, but results were similar for all tested sizes):

[Screenshot: publisher confirm latency comparison, 2023-06-20 19:58]

And when publishing to a long queue (no consumers):
[Screenshot: publish throughput to a long queue with no consumers, 2023-06-20 20:02]

(The sudden drop in throughput is when the queues reach their max-length; dropping messages is currently quite expensive.)

@mkuratczyk merged commit 985f4e8 into main on Jun 20, 2023
16 checks passed
@mkuratczyk deleted the lh-msg-store-bis branch on June 20, 2023 at 18:04
@michaelklishin (Member) commented Jun 20, 2023

So, a 40% drop in publisher confirm latency on one workload, and a 60% drop on another (both have publishers and consumers online). Not bad.
