Allow to set max operation numbers in a single rocksdb batch #4044
Conversation
---
## Motivation
In RocksDB, memory usage is related to batch size: the more operations in a single batch, the more memory is consumed. Expose a configuration option to allow controlling the batch size.
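As an illustration only (hypothetical names, not the PR's actual code), a self-contained sketch of an operation-capped write batch, with a plain `HashMap` standing in for RocksDB:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a write batch that auto-flushes once it holds
// maxBatchOperations entries, keeping memory bounded. A HashMap stands
// in for RocksDB so the example is self-contained.
class BoundedBatch {
    private final Map<String, String> db = new HashMap<>();
    private final List<String[]> pending = new ArrayList<>();
    private final int maxBatchOperations;
    private int batchCount = 0;
    int flushes = 0;

    BoundedBatch(int maxBatchOperations) {
        this.maxBatchOperations = maxBatchOperations;
    }

    void put(String key, String value) {
        pending.add(new String[]{key, value});
        countBatchAndFlushIfNeeded();
    }

    private void countBatchAndFlushIfNeeded() {
        if (++batchCount >= maxBatchOperations) {
            flush();
            clear();          // reset the count only where the batch itself is reset
        }
    }

    void flush() {            // apply pending ops; contents stay in the batch
        for (String[] op : pending) {
            db.put(op[0], op[1]);
        }
        flushes++;
    }

    void clear() {
        pending.clear();
        batchCount = 0;
    }

    String get(String key) {
        return db.get(key);
    }
}
```

With `maxBatchOperations = 2`, the second `put` triggers an automatic flush, so memory held by the batch never exceeds the configured cap.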
157d701 to 0c20fbe
@@ -548,17 +560,31 @@ public void remove(byte[] key) throws IOException {
    @Override
    public void clear() {
        writeBatch.clear();
        batchCount = 0;
If the user invokes flush() directly, we should also reset the batchCount.
I don't think so. Users can flush the same batch multiple times. Even after you flush the batch, its contents are not cleared, so if we reset the count after flushing, the count would no longer match the batch's contents. That's why I follow the WriteBatch API and reset the count only in clear().
If manual flush is completed and the batchCount reaches the batchSize again, does it mean that the operations in this batch will be applied to the database twice?
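The concern above can be made concrete with a self-contained simulation (hypothetical names; a `List` stands in for RocksDB's write path): because a WriteBatch keeps its contents after a flush, flushing again without clear() re-applies the same operations.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the reviewer's scenario: the batch is NOT
// cleared on flush, so a second flush re-applies earlier operations.
class FlushTwiceDemo {
    final List<String> batch = new ArrayList<>();
    final List<String> writeLog = new ArrayList<>();  // every op applied to the "db"

    void flush() {
        writeLog.addAll(batch);   // apply pending ops; batch keeps its contents
    }

    void clear() {
        batch.clear();
    }
}
```

After a manual flush followed by one more put and an auto-flush, the first operation has been applied twice. For puts and deletes this is idempotent (the final key-value state is the same), but it doubles the write work, which is the trade-off the discussion above is weighing.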
        batch.put(toArray(6), toArray(6));
        assertEquals(1, batch.batchCount());

        batch.flush();
Here flush() is triggered directly, so the batchCount should be 0.
    }

    @Override
    public void deleteRange(byte[] beginKey, byte[] endKey) throws IOException {
        try {
            writeBatch.deleteRange(beginKey, endKey);
            countBatchAndFlushIfNeeded();
We need to discuss whether deleteRange should also be performed within a batch. A deleteRange operation can potentially delete a large number of keys. If we include multiple deleteRange operations within a single batch, I am concerned that it may introduce issues.
The previous core dump #3734 issue could be related to this.
That should be a separate issue. This PR only aims to limit the batch size; we can handle deleteRange in another PR.
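The deleteRange concern comes down to an asymmetry worth noting: a range deletion is a single entry in the batch no matter how many keys it covers, so an operation-count cap alone does not bound the work a flush can trigger. A sketch of that asymmetry (hypothetical names, a `TreeMap` standing in for RocksDB):

```java
import java.util.TreeMap;

// Hypothetical sketch: one deleteRange "operation" can remove arbitrarily
// many keys, so counting it as a single batch entry under-estimates its cost.
class DeleteRangeDemo {
    final TreeMap<Integer, String> db = new TreeMap<>();
    int batchCount = 0;

    void put(int key, String value) {
        db.put(key, value);
        batchCount++;                       // one entry, one key touched
    }

    int deleteRange(int begin, int end) {   // deletes keys in [begin, end)
        int removed = db.subMap(begin, end).size();
        db.subMap(begin, end).clear();
        batchCount++;                       // still only one entry in the batch
        return removed;
    }
}
```

Here a single deleteRange removes 90 keys while advancing batchCount by just one, which is why a count-based cap may need a separate discussion for range deletes.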
I tested with this code to verify the batch's impact on memory:
Line 241 in c22136b
Finally, when the bookie runs garbage collection and removes the ledger, it gets OOM-killed because of the large batch. Pulsar already has a proposal for configuring compacted topic ledger retention: apache/pulsar#19665.
@@ -540,6 +551,7 @@ public void put(byte[] key, byte[] value) throws IOException {
    public void remove(byte[] key) throws IOException {
        try {
            writeBatch.delete(key);
            countBatchAndFlushIfNeeded();
When we try to remove a key from the writeBatch, but the writeBatch has already been flushed by the put method, will it impact the correctness of the key-value pairs stored in RocksDB?
It won't. The batch only takes effect when the operations are actually applied to RocksDB; we're just queuing our operations into the batch to avoid invoking the real write path multiple times.
See here: https://github.com/facebook/rocksdb/blob/9a034801cead6421bcf82b506b77e3b2251f1edb/include/rocksdb/write_batch.h#L9
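This point can be demonstrated with a self-contained simulation (hypothetical names; a `HashMap` stands in for RocksDB): operations are only queued and then applied in order, so a remove issued after an auto-flush still takes effect after the earlier put once its own batch is flushed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation: batched operations are queued and applied in
// insertion order at flush time, so splitting a put and a remove of the
// same key across two batches does not change the final state.
class OrderingDemo {
    final Map<String, String> db = new HashMap<>();
    final List<String[]> batch = new ArrayList<>();

    void put(String k, String v) { batch.add(new String[]{"put", k, v}); }
    void remove(String k)        { batch.add(new String[]{"del", k, null}); }

    void flush() {               // apply in insertion order, then clear
        for (String[] op : batch) {
            if (op[0].equals("put")) db.put(op[1], op[2]);
            else db.remove(op[1]);
        }
        batch.clear();
    }
}
```

Whether the put and the remove land in one flush or two, the remove is applied after the put, so the key ends up deleted either way.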
Nice catch!
LGTM.
(cherry picked from commit ad0ed21)