4.0: use Ra checkpoints in rabbit_fifo for sub-linear time recovery of QQs on boot #10487

Draft
wants to merge 9 commits into
base: qq-v4

@the-mikedavis the-mikedavis commented Feb 5, 2024

This is the companion PR for rabbitmq/ra#415

rabbit_fifo currently has an ad-hoc checkpointing system: it periodically saves {release_cursor, RaftIdx, State} effects in memory and emits them as the release cursor moves up. By building checkpointing into ra we can save the checkpoints on disk instead, reducing the quorum queue (QQ) memory footprint and enabling us to recover even very long queues in logarithmic time (w.r.t. the length of the queue). See rabbitmq/ra#415 for in-depth details about checkpoints.
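
To illustrate why recovering from a checkpoint is cheaper than replaying the whole log, here is a minimal Python sketch of the general idea. It is not the ra or rabbit_fifo implementation: the `Checkpoint` class, the `apply_command` and `recover` functions, and the toy "running total" state machine are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    index: int  # log index the state was captured at (1-based)
    state: int  # toy machine state: a running total


def apply_command(state, command):
    # Toy state machine: every command adds to the total.
    return state + command


def recover(log, checkpoints):
    """Rebuild state from the newest checkpoint, replaying only later entries.

    Returns (state, number_of_entries_replayed).
    """
    latest = max(checkpoints, key=lambda c: c.index, default=None)
    start, state = (latest.index, latest.state) if latest else (0, 0)
    replayed = 0
    for command in log[start:]:  # entry with index i lives at log[i - 1]
        state = apply_command(state, command)
        replayed += 1
    return state, replayed


log = list(range(1, 101))
full = recover(log, [])
fast = recover(log, [Checkpoint(index=50, state=1275),
                     Checkpoint(index=90, state=4095)])
```

In this sketch both calls produce the same final state, but the second replays only the entries written after index 90. How often checkpoints are taken, and therefore how much of the log remains to replay, is a policy decision the sketch does not model.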

Connects #8261

@the-mikedavis the-mikedavis self-assigned this Feb 5, 2024
@mergify mergify bot added the bazel label Feb 5, 2024
@michaelklishin michaelklishin changed the title Use Ra checkpoints in rabbit_fifo Use Ra checkpoints in rabbit_fifo for constant time recovery of QQs on boot Feb 5, 2024
@michaelklishin michaelklishin added this to the 4.0.0 milestone Feb 5, 2024
@michaelklishin michaelklishin changed the title Use Ra checkpoints in rabbit_fifo for constant time recovery of QQs on boot 4.0: use Ra checkpoints in rabbit_fifo for constant time recovery of QQs on boot Feb 5, 2024
@the-mikedavis the-mikedavis changed the title 4.0: use Ra checkpoints in rabbit_fifo for constant time recovery of QQs on boot 4.0: use Ra checkpoints in rabbit_fifo for sub-linear time recovery of QQs on boot Feb 8, 2024
@kjnilsson kjnilsson mentioned this pull request Feb 28, 2024
12 tasks
@the-mikedavis the-mikedavis changed the base branch from main to qq-v4 February 29, 2024 15:50
@the-mikedavis the-mikedavis force-pushed the md-ra-checkpoints branch 3 times, most recently from 6ef156b to 309600f Compare February 29, 2024 20:49
@kjnilsson kjnilsson force-pushed the qq-v4 branch 5 times, most recently from 75291c7 to bb89f2d Compare March 5, 2024 14:29
@kjnilsson kjnilsson force-pushed the qq-v4 branch 3 times, most recently from a479434 to b6d9b85 Compare March 8, 2024 10:56
@kjnilsson kjnilsson mentioned this pull request Apr 2, 2024
7 tasks
@kjnilsson kjnilsson force-pushed the qq-v4 branch 3 times, most recently from e5db089 to 4bd8c1f Compare April 30, 2024 16:27
Create the new version without including any changes yet.

fix

QQ: force delete followers after leader has terminated.

Also try a longer sleep for mqtt_shared_SUITE so that the
delete operation stands a chance to time out and move on
to the forced deletion stage.

In some mixed machine version scenarios some followers will never
apply the poison pill command so we may as well force delete them
just in case.

QQ: skip test in amqp_client that cannot pass with mixed machine versions

QQ: remove dead code

Code relating to prior machine versions and state conversions.

formatting / readability

rabbit_fifo_prop_SUITE fixes
Also update rabbit_fifo_* suites to test more relevant code versions
where applicable.

add ff mock

QQ: always use the updated credit mode format

QQv4: use more compact consumer reference in settle, credit, return

This introduces a new type: consumer_key() which is either the consumer_id
or the raft index the checkout was processed at. If the consumer is
using one of the updated credit spec formats rabbit_fifo will use the
raft index as the primary key for the consumer such that the rabbit
fifo client can then use the more space efficient integer index
instead of the full consumer id in subsequent commands.

There is compatibility code to still accept the consumer id in
settle, return, discard and credit commands but this is slightly
slower and of course less space efficient.

The old form will be used in cases where the fifo client may have
already removed the local consumer state (as happens after a cancel).
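
The keying scheme described above can be sketched roughly as follows. This is an illustrative Python model only, not rabbit_fifo's code: the `Consumers` class, its method names, and the representation of a consumer id as a `(tag, pid)` tuple are assumptions made for the example.

```python
class Consumers:
    """Toy registry keyed by consumer_key: a raft index or a consumer id."""

    def __init__(self):
        self._by_key = {}  # consumer_key -> consumer id

    def checkout(self, consumer_id, raft_index, updated_credit_format):
        # Consumers using the updated credit format are keyed by the raft
        # index of their checkout command; others keep the full consumer id.
        key = raft_index if updated_credit_format else consumer_id
        self._by_key[key] = consumer_id
        return key

    def resolve(self, key_or_id):
        # Fast path: the caller already holds the consumer key.
        if key_or_id in self._by_key:
            return key_or_id
        # Compatibility path: the caller sent a full consumer id, so scan
        # for it. This is the slower lookup the commit message mentions.
        for key, consumer_id in self._by_key.items():
            if consumer_id == key_or_id:
                return key
        return None


consumers = Consumers()
new_key = consumers.checkout(("ctag-1", "pid-1"), 42, True)
old_key = consumers.checkout(("ctag-2", "pid-2"), 43, False)
```

A client that still holds its local consumer state can send the compact integer `new_key` in later commands, while a client that has dropped that state can fall back to the full id and still be resolved.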

Lots of test refactorings of the rabbit_fifo_SUITE to begin to use
the new forms.
rabbit_fifo_prop_SUITE refactoring and other fixes.

fixes

bzl

bzl

fixes
Single active consumers will be activated if they have a higher priority
than the currently active consumer. If the currently active consumer
has pending messages, no further messages will be assigned to the
consumer and the activation of the new consumer will happen once
all pending messages are settled. This is to ensure processing order.

Consumers with the same priority will internally be ordered to
favour those with credit then those that attached first.
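
The activation and ordering rules above can be modelled with a short Python sketch. It is a simplified illustration under stated assumptions, not the rabbit_fifo logic: the `Consumer` fields, `order`, and `sac_step` are hypothetical names, and credit is reduced to "has some" versus "has none".

```python
from dataclasses import dataclass


@dataclass
class Consumer:
    name: str
    priority: int
    credit: int
    attached_at: int  # lower means attached earlier
    pending: int = 0  # messages delivered but not yet settled


def order(consumers):
    # Highest priority first; within a priority, consumers with credit
    # come before those without, then earlier attachment wins.
    return sorted(consumers,
                  key=lambda c: (-c.priority, c.credit == 0, c.attached_at))


def sac_step(active, waiting):
    """Return (active_consumer, may_receive_new_messages)."""
    ranked = order(waiting)
    if not ranked or ranked[0].priority <= active.priority:
        return active, True
    if active.pending > 0:
        # A higher priority consumer is waiting: stop assigning messages
        # to the active one, but keep it until its pending ones settle.
        return active, False
    return ranked[0], True


low = Consumer("low", priority=1, credit=10, attached_at=1, pending=2)
high = Consumer("high", priority=5, credit=10, attached_at=2)
before = sac_step(low, [high])
low.pending = 0
after = sac_step(low, [high])
```

Here `before` keeps `low` active but closed to new messages, and `after` hands over to `high` once nothing is pending, which is what preserves processing order.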

QQ: add SAC consumer priority integration tests

Dialyzer fix

QQ: add check for ff in tests
This option immediately removes and returns all messages for a
consumer instead of the softer 'cancel' option which keeps the
consumer around until all pending messages have been either
settled or returned.

This involves a change to the rabbit_queue_type:cancel/5 API
to rabbit_queue_type:cancel/3.
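
The difference between the two behaviours can be shown with a toy Python model. It is a sketch under assumptions, not the RabbitMQ implementation: `ToyQueue` and its methods are hypothetical, the name `remove` is used here only as a label for the immediate option described above, and putting returned messages back at the head of the queue is a choice made for this example.

```python
from collections import deque


class ToyQueue:
    def __init__(self, messages=()):
        self.ready = deque(messages)
        self.consumers = {}  # name -> {"pending": [...], "cancelled": bool}

    def attach(self, name):
        self.consumers[name] = {"pending": [], "cancelled": False}

    def deliver(self, name):
        consumer = self.consumers[name]
        if consumer["cancelled"] or not self.ready:
            return None  # cancelled consumers receive nothing new
        msg = self.ready.popleft()
        consumer["pending"].append(msg)
        return msg

    def settle(self, name, msg):
        consumer = self.consumers[name]
        consumer["pending"].remove(msg)
        if consumer["cancelled"] and not consumer["pending"]:
            del self.consumers[name]  # soft cancel completes here

    def cancel(self, name):
        # Soft option: keep the consumer until its pending messages are done.
        consumer = self.consumers[name]
        consumer["cancelled"] = True
        if not consumer["pending"]:
            del self.consumers[name]

    def remove(self, name):
        # Immediate option: return all pending messages and drop the consumer.
        consumer = self.consumers.pop(name)
        self.ready.extendleft(reversed(consumer["pending"]))


soft = ToyQueue([1, 2, 3])
soft.attach("c1")
soft.deliver("c1")
soft.deliver("c1")
soft.cancel("c1")

hard = ToyQueue([1, 2, 3])
hard.attach("c1")
hard.deliver("c1")
hard.deliver("c1")
hard.remove("c1")
```

After these calls `soft` still tracks `c1` with two pending messages, whereas `hard` has no consumers and all three messages are ready again in their original order.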

3 participants