crdt: Add batching support #1346

Merged
merged 4 commits into from
Apr 30, 2021

hsanjuan
Collaborator

This adds batching support to crdt-consensus per #1008. The crdt component can now take
advantage of the BatchingState, which uses the batching-crdt datastore. In
batching mode, the crdt datastore groups multiple Add and Delete operations
into a single delta (instead of one operation per delta, as it does by default).
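
As a rough illustration of the difference between the two modes, here is a minimal sketch. The `Op` and `Delta` types and both functions are hypothetical and exist only for this example; the actual crdt datastore operates on datastore keys and values, not on these types.

```go
package main

// Op is a hypothetical pin-state operation used only in this sketch.
type Op struct {
	Kind string // "add" or "delete"
	Key  string
}

// Delta is a hypothetical unit of replication carrying one or more operations.
type Delta struct {
	Ops []Op
}

// PerOpDeltas models the default mode: every operation becomes its own delta.
func PerOpDeltas(ops []Op) []Delta {
	deltas := make([]Delta, 0, len(ops))
	for _, op := range ops {
		deltas = append(deltas, Delta{Ops: []Op{op}})
	}
	return deltas
}

// BatchedDelta models batching mode: all pending operations share one delta.
func BatchedDelta(ops []Op) Delta {
	batch := make([]Op, len(ops))
	copy(batch, ops) // copy so later changes to ops do not alter the delta
	return Delta{Ops: batch}
}
```

With three pending operations, `PerOpDeltas` yields three deltas to broadcast, while `BatchedDelta` yields a single delta holding all three.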

Batching is enabled in the crdt configuration section by setting MaxBatchSize
and MaxBatchAge. These two settings control when a batch is committed,
either by reaching a maximum number of pin/unpin operations, or by reaching a
maximum age.
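
The two commit triggers can be sketched as a single decision function. This is an assumption-laden illustration: `BatchPolicy` and `ShouldCommit` are hypothetical names, not the cluster's configuration type or commit logic, and the real implementation may evaluate the conditions differently.

```go
package main

import "time"

// BatchPolicy mirrors the two settings named above. The struct itself is
// hypothetical and is not the cluster's configuration type.
type BatchPolicy struct {
	MaxBatchSize int
	MaxBatchAge  time.Duration
}

// ShouldCommit reports whether a batch holding `pending` operations that has
// been open for `age` should be committed now.
func (p BatchPolicy) ShouldCommit(pending int, age time.Duration) bool {
	if pending == 0 {
		return false // nothing to commit, regardless of age
	}
	if pending >= p.MaxBatchSize {
		return true // size trigger: the batch is full
	}
	return age >= p.MaxBatchAge // age trigger: the batch has waited long enough
}
```

Under this sketch, a small `MaxBatchAge` bounds how long a pin waits before it is committed on a quiet cluster, while `MaxBatchSize` bounds how much work a single commit carries under load.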

Batching unlocks large pin-ingestion scalability for clusters, but should be
set according to expected workloads. An additional, hidden MaxQueueSize
parameter provides the ability to apply backpressure on Pin/Unpin
requests. When more than MaxQueueSize pin/unpins are waiting to be included in
a batch, the LogPin/LogUnpin operations will fail. If this happens, it
means the cluster cannot commit batches as fast as pins are arriving. Thus,
MaxQueueSize should be increased (to accommodate bursts), or the batch size
increased (to perform fewer commits and hopefully handle the requests faster).
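
The backpressure behaviour can be sketched with a bounded queue that rejects new work instead of blocking. Everything here is hypothetical (`PinQueue`, `Enqueue`, `Drain`, `ErrQueueFull`); it only illustrates the fail-when-full idea and is not how LogPin/LogUnpin are implemented.

```go
package main

import "errors"

// ErrQueueFull is a hypothetical error for this sketch; the error returned
// by the real LogPin/LogUnpin operations may differ.
var ErrQueueFull = errors.New("batch queue is full")

// PinQueue is a hypothetical bounded queue of operations waiting for a batch.
type PinQueue struct {
	pending chan string
}

// NewPinQueue creates a queue that holds at most maxQueueSize operations.
func NewPinQueue(maxQueueSize int) *PinQueue {
	return &PinQueue{pending: make(chan string, maxQueueSize)}
}

// Enqueue fails immediately instead of blocking when the queue is full.
func (q *PinQueue) Enqueue(cid string) error {
	select {
	case q.pending <- cid:
		return nil
	default:
		return ErrQueueFull
	}
}

// Drain removes up to n queued operations, as a batch commit would.
func (q *PinQueue) Drain(n int) []string {
	batch := make([]string, 0, n)
	for len(batch) < n {
		select {
		case cid := <-q.pending:
			batch = append(batch, cid)
		default:
			return batch // queue is empty before reaching n
		}
	}
	return batch
}
```

Failing fast surfaces the overload to the caller, who can retry later, rather than letting requests pile up without bound while commits lag behind.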

Note that the underlying CRDT library will auto-commit when batch deltas reach
1 MB in size.

Fixes #1008.

@hsanjuan
Collaborator Author

Tests missing.

@hsanjuan hsanjuan added this to the Release v0.13.3 milestone Apr 30, 2021
@hsanjuan hsanjuan merged commit 3e0f3f1 into master Apr 30, 2021
@hsanjuan hsanjuan deleted the feat/1008-crdt-batch branch April 30, 2021 18:15
@hsanjuan hsanjuan mentioned this pull request Jun 15, 2022
3 tasks
Development

Successfully merging this pull request may close these issues.

CRDT-Batching support
2 participants