-
Notifications
You must be signed in to change notification settings - Fork 889
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add async or sync for :entry location index's rocksdb write #3103
base: master
Are you sure you want to change the base?
Conversation
@dlg99 @eolivelli @pkumar-singh @zymap @hangc0276 @lordcheng10 @merlimat |
@hangc0276 @merlimat please take a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are on your way!
I left a suggestion for the configuration entry.
We must also add comprehensive tests
bookkeeper-server/src/main/java/org/apache/bookkeeper/conf/ServerConfiguration.java
Outdated
Show resolved
Hide resolved
ok,thanks for reviewed, I will add testcases for this function |
3e33c5c
to
a345c80
Compare
yeah, I have added two testcase: EntryLocationIndexAsyncTest/EntryLocationIndexSyncTest @eolivelli |
...keeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I'm not sure I understand why we need to have 2 different modes here. Do both methods provide the same guarantees or not? |
In the case of multiple replications, when the user pursues the writing speed and does not need to guarantee the success rate of each replication, the asynchronous writing method like LedgerMetadataIndex gives the user more choices @merlimat |
I still cannot see a case where flushing the entry location index (ledgerId, entryId, offset) becomes the bottleneck, compared to:
If it is not the bottleneck, then why reduce the guarantees? (eg: if we miss the async update, we have effectively lost data in the bookie). |
yes, in the step : entrylog flush |
I'm not sure I follow here. My point is that if flushing entry logs takes 90% and RocksDB takes 10% of time (I made up these number here), then adding the risk of losing data to shave time on the 10% portion doesn't make much sense. |
ok,I understand what you mean, let me collect the time and proportion of the three parts of flush @merlimat |
74b825b
to
4ba94b7
Compare
4ba94b7
to
5cf1d53
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens in case of unclean shutdown ?
will we lose some entries in the index ?
5cf1d53
to
6fdba80
Compare
rerun failure checks |
Descriptions of the changes in this PR:
Motivation
bookie timed to flush latency mainly composed of three pieces:
1. flush-entrylog: it's the latency for flushing entrylog
2. flush-locations-index: it's the latency for flushing entrly location index, use sync mode to flush
3. flush-ledger-index: it's the latency for flushing ledger metadata index, this index(LedgerMetadataIndex) use async mode to flush
reduce entry location index flush latency to reduce bookie's flush latency,
so add async mode for entry location index's write:
1. default sync mode: not change the original logic
2. async mode : need update config to open, this mode is to speed up writing, this mode is the same as bookie's another index: LedgerMetadataIndex
sync is different from async:
sync mode:
1. create a batch;
4. add msg to the batch
5. call method(batch.flush) to flush the batch
sync mode:
1. just call method(locationsDb.sync) to write the data
2. the rocksdb server will be timed to flush the data
Changes