Single buffer for small add requests #3783

merlimat · 2023-02-17T23:47:31Z

Motivation

Similar to #3597, this is targeting AddRequest for using a single buffer when serializing them.

Modifications:

Changed DigestManager to not return a ByteBufList in all conditions
If the payload is small, allocate a buffer large enough for the headers and the payload.

codecov-commenter · 2023-02-18T00:35:44Z

Codecov Report

Merging #3783 (5b50e06) into master (1f8de8f) will decrease coverage by 7.85%.
The diff coverage is 61.66%.

@@             Coverage Diff              @@
##             master    #3783      +/-   ##
============================================
- Coverage     56.31%   48.47%   -7.85%     
+ Complexity     5524     4751     -773     
============================================
  Files           473      473              
  Lines         40952    40998      +46     
  Branches       5235     5246      +11     
============================================
- Hits          23064    19874    -3190     
- Misses        15790    19144    +3354     
+ Partials       2098     1980     -118

Flag	Coverage Δ
client	`44.18% <61.66%> (-0.11%)`	⬇️
remaining	`29.60% <16.66%> (?)`
replication	`?`
tls	`20.92% <53.33%> (-0.09%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
...java/org/apache/bookkeeper/proto/BookieClient.java	`50.00% <ø> (ø)`
.../org/apache/bookkeeper/proto/BookieClientImpl.java	`64.00% <ø> (+0.36%)`	⬆️
.../java/org/apache/bookkeeper/proto/AuthHandler.java	`52.56% <33.33%> (-28.56%)`	⬇️
...pache/bookkeeper/proto/PerChannelBookieClient.java	`78.26% <44.44%> (-2.11%)`	⬇️
...he/bookkeeper/client/LedgerFragmentReplicator.java	`76.47% <50.00%> (+1.06%)`	⬆️
...g/apache/bookkeeper/proto/BookieProtoEncoding.java	`75.29% <66.66%> (-5.03%)`	⬇️
...pache/bookkeeper/proto/checksum/DigestManager.java	`72.26% <69.44%> (-1.53%)`	⬇️
...ava/org/apache/bookkeeper/client/PendingAddOp.java	`87.92% <100.00%> (-0.92%)`	⬇️
...rg/apache/bookkeeper/metastore/MetastoreTable.java	`0.00% <0.00%> (-100.00%)`	⬇️
.../bookkeeper/metastore/MetastoreScannableTable.java	`0.00% <0.00%> (-100.00%)`	⬇️
... and 189 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

hangc0276 · 2023-02-20T09:54:40Z

bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/checksum/DigestManager.java

+
+        // Compute checksum over the headers
+        update(buf);
+        update(data);


Do we need to unwrap the data ByteBuf if it is CompositeByteBuf?

I think for v2 protocol, we shouldn't be getting any ComposityByteBuf

When Pulsar enabled KOP AppendIndexMetadataInterceptor, the payload will be CompositeByteBuf, and the update(data) will copy data to the heap. Please refer to: #2701

hangc0276 · 2023-02-27T02:44:15Z

bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/checksum/DigestManager.java

+        buf.readerIndex(0);
+
+        if (isSmallEntry) {
+            buf.writeBytes(data);


buf.writeBytes(unwrap) ?

hangc0276 · 2023-02-27T02:44:49Z

bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/checksum/DigestManager.java

+            data.release();
+            return buf;
+        } else {
+            return ByteBufList.get(buf, data);


ByteBufList.get(buf, unwrap)?

hangc0276 · 2023-03-06T14:29:23Z

@merlimat Would you please rebase the master?

merlimat · 2023-03-06T14:36:29Z

There are still a couple of tests failing

StevenLuMT · 2023-03-09T02:58:39Z

rerun failure checks

hangc0276 · 2023-03-20T15:28:02Z

@merlimat I fixed the failed test in hangc0276#32, Please help take a look. If you have no objections, I will push the fix to your branch, thanks.

merlimat · 2023-03-20T17:38:21Z

Thanks @hangc0276 !

* Fix memory leak issue of reading small entries (apache#3844) * Make read entry request recyclable (apache#3842) * Make read entry request recyclable * Move recycle to finally block * Fix test and comments * Fix test * Avoid unnecessary force write. (apache#3847) * Avoid unnecessary force write. * code clean. * fix style * Correct the running job name for the test group (apache#3851) --- ### Motivation The running tests job name doesn't match the tests. Correct the job name. * add timeout for two flaky timeout tests (apache#3855) * add V2 protocal and warmupMessages support for benchMark (apache#3856) * disable trimStackTrack for code-coverage profile (apache#3854) * Fix bkperf log directory not found (apache#3858) ### Motivation When using the bkperf command `bin/bkperf journal append -j data -n 100000000 --sync true` to test the BookKeeper journal performance, it failed with the following exception ``` [0.002s][error][logging] Error opening log file '/Users/hangc/Downloads/tmp/tc/batch/ta/bookkeeper-all-4.16.0-SNAPSHOT/logs/bkperf-gc.log': No such file or directory [0.002s][error][logging] Initialization of output 'file=/Users/hangc/Downloads/tmp/tc/batch/ta/bookkeeper-all-4.16.0-SNAPSHOT/logs/bkperf-gc.log' using options 'filecount=5,filesize=64m' failed. Invalid -Xlog option '-Xlog:gc=info:file=/Users/hangc/Downloads/tmp/tc/batch/ta/bookkeeper-all-4.16.0-SNAPSHOT/logs/bkperf-gc.log::filecount=5,filesize=64m', see error log for details. Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. ``` The root cause is that the `logs` directory was not created. ### Modifications Create the `logs` directory before bkperf started. * [improve] Fix indexDirs upgrade failed (apache#3762) * fix indexDirs upgrade failed * Bump checkstyle-plugin from 3.1.2 to 3.2.1 (apache#3850) * [Flaky] Fix flaky test in testRaceGuavaEvictAndReleaseBeforeRetain (apache#3857) * Fix flaky test in testRaceGuavaEvictAndReleaseBeforeRetain * format code * Fix NPE in BenchThroughputLatency (apache#3859) * Update website to 4.15.4 (apache#3862) --- ### Motivation Update website to 4.15.4 * change rocksDB config level_compaction_dynamic_level_bytes to CFOptions (apache#3860) ### Motivation After PR apache#3056 , Bookkeeper set `level_compaction_dynamic_level_bytes=true` as `TableOptions` in `entry_location_rocksdb.conf.default` , which will cause `level_compaction_dynamic_level_bytes` lose efficacy and will cause rocksDB .sst file compact sort chaos when update bookie release. As RocksDB conf, `level_compaction_dynamic_level_bytes` need set as `CFOptions` https://github.com/facebook/rocksdb/blob/master/examples/rocksdb_option_file_example.ini <img width="703" alt="image" src="https://user-images.githubusercontent.com/84127069/224640399-d5481fe5-7b75-4229-ac06-3d280aa9ae6d.png"> <img width="240" alt="image" src="https://user-images.githubusercontent.com/84127069/224640621-737d0a42-4e01-4f38-bd5a-862a93bc4b32.png"> ### Changes 1. Change `level_compaction_dynamic_level_bytes=true` from `TableOptions` to `CFOptions` in `entry_location_rocksdb.conf.default` ; * Correct the running job flag for the test group. (apache#3865) * Release note for 4.15.4 (apache#3831) --- ### Motivation Release note for 4.15.4 * Add trigger entry location index rocksDB compact interface. (apache#3802) ### Motivation After the bookie instance running long time, the bookie entry location index rocksDB `.sst` file size maybe expand to 20-30GB as one ledger data dir's location index in some case, which will cause the rocksDB scan operator cost more time and cause the bookie client request timeout. Add trigger entry location index rocksDB compact REST API which can trigger entry location rocksDB compaction and get the compaction status. The full range entry location index rocksDB compact will cause the entry location index dir express higher IOUtils. So we'd better trigger the entry location rocksDB compact by the api in low data flow period. **Some case before rocksDB compact:** <img width="232" alt="image" src="https://user-images.githubusercontent.com/84127069/220893469-e6fbc1a3-c767-4ffe-8ae9-f05ad1833c50.png"> <img width="288" alt="image" src="https://user-images.githubusercontent.com/84127069/220891359-dc37e139-37b0-461b-8001-dcc48517366c.png"> **After rocksDB compact:** <img width="255" alt="image" src="https://user-images.githubusercontent.com/84127069/220891419-24267fa7-348c-4fbd-8b3e-70a99840bce5.png"> ### Changes 1. Add REST API to trigger entry location index rocksDB compact. * Pick the higher leak detection level between netty and bookkeeper. (apache#3794) ### Motivation 1. Pick the higher leak detection level between netty and bookkeeper. 2. Enhance the bookkeeper leak detection value match rule, now it's case insensitive. There are detailed information about it: https://lists.apache.org/thread/d3zw8bxhlg0wxfhocyjglq0nbxrww3sg * Disable code coverage and codecov report (apache#3863) ### Motivation There're two reasons that we want to disable the code coverage. 1. The current report result is not accurate. 2. We can't get the PR's unit test's code coverage because of the apache Codecov permission. * Add small files check in garbage collection (apache#3631) ### Motivation When we use `TransactionalEntryLogCompactor` to compact the entry log files, it will generate a lot of small entry log files, and for those files, the file usage is usually greater than 90%, which can not be compacted unless the file usage decreased. ![image](https://user-images.githubusercontent.com/5436568/201135615-4d6072f5-e353-483d-9afb-48fad8134044.png) ### Changes We introduce the entry log file size check during compaction, and the checker is controlled by `gcEntryLogSizeRatio`. If the total entry log file size is less than `gcEntryLogSizeRatio * logSizeLimit`, the entry log file will be compacted even though the file usage is greater than 90%. This feature is disabled by default and the `gcEntryLogSizeRatio` default value is `0.0` * [improvement] Delay all audit task when have a already delayed bookie check task (apache#3818) ### Motivation Fixes apache#3817 For details, see: apache#3817 ### Changes When there is an `auditTask` during the `lostBookieRecoveryDelay` delay, other detection tasks should be skipped. * Change order of doGcLedgers and extractMetaFromEntryLogs (apache#3869) * [Bugfix] make metadataDriver initialization more robust (apache#3873) Co-authored-by: zengqiang.xu <zengqiang.xu@shopee.com> * Enable CI for the streamstorage python client (apache#3875) * Fix compaction threshold default value precision problem. (apache#3871) * Fix compaction threshold precision problem. * Fix compaction threshold precision problem. * Single buffer for small add requests (apache#3783) * Single buffer for small add requests * Fixed checkstyle * Fixed treating of ComposityByteBuf * Fixed merge issues * Fixed merge issues * WIP * Fixed test and removed dead code * Removed unused import * Fixed BookieJournalTest * removed unused import * fix the checkstyle * fix failed test * fix failed test --------- Co-authored-by: chenhang <chenhang@apache.org> * Add log for entry log file delete. (apache#3872) * Add log for entry log file delete. * add log info. * Address the comment. * Address the comment. * revert the code. * Improve group and flush add-responses after journal sync (apache#3848) Descriptions of the changes in this PR: This is an improvement for apache#3837 ### Motivation 1. Now if the maxPendingResponsesSize is expanded large, it will not decrease. => We should make it flexible. 2. Now after prepareSendResponseV2 to the channel, then we trigger all channels to flush pendingSendResponses, maybe there is only a few channels that need to flush, but if we trigger all channels, it's a waste. => We only flush the channel which prepareSendResponseV2. --------- Co-authored-by: Penghui Li <penghui@apache.org> Co-authored-by: Yong Zhang <zhangyong1025.zy@gmail.com> Co-authored-by: Hang Chen <chenhang@apache.org> Co-authored-by: wenbingshen <oliver.shen999@gmail.com> Co-authored-by: ZhangJian He <shoothzj@gmail.com> Co-authored-by: lixinyang <84127069+Nicklee007@users.noreply.github.com> Co-authored-by: YANGLiiN <ielin@qq.com> Co-authored-by: Lishen Yao <yaalsn@gmail.com> Co-authored-by: Andrey Yegorov <8622884+dlg99@users.noreply.github.com> Co-authored-by: ZanderXu <zanderxu@apache.org> Co-authored-by: zengqiang.xu <zengqiang.xu@shopee.com> Co-authored-by: Matteo Merli <mmerli@apache.org>

--- ### Motivation apache#3783 changed the recovery add request flags. It causes when replication happening, if the ledger was fenced, the ledger can not be recovered. Because the recovery add flag is not added then the normal add is not allowed to a fenced ledger.

* Fix the autorecovery failed replicate by add entry fenced error --- ### Motivation #3783 changed the recovery add request flags. It causes when replication happening, if the ledger was fenced, the ledger can not be recovered. Because the recovery add flag is not added then the normal add is not allowed to a fenced ledger.

…he#4163) * Fix the autorecovery failed replicate by add entry fenced error --- ### Motivation apache#3783 changed the recovery add request flags. It causes when replication happening, if the ledger was fenced, the ledger can not be recovered. Because the recovery add flag is not added then the normal add is not allowed to a fenced ledger. (cherry picked from commit eb233b0)

Single buffer for small add requests

11d5854

merlimat requested review from hangc0276 and zymap February 17, 2023 23:47

Fixed checkstyle

4fd46b0

hangc0276 reviewed Feb 20, 2023

View reviewed changes

hangc0276 assigned merlimat Feb 20, 2023

hangc0276 added the type/improvement label Feb 20, 2023

hangc0276 added this to the 4.16.0 milestone Feb 20, 2023

hangc0276 requested review from dlg99 and eolivelli February 20, 2023 10:22

zymap approved these changes Feb 22, 2023

View reviewed changes

merlimat added 2 commits February 26, 2023 09:46

Fixed treating of ComposityByteBuf

7901773

Merge remote-tracking branch 'apache/master' into single-buffer-add

572310c

hangc0276 reviewed Feb 27, 2023

View reviewed changes

codelipenghui approved these changes Feb 27, 2023

View reviewed changes

merlimat added 2 commits February 26, 2023 21:29

Merge remote-tracking branch 'apache/master' into single-buffer-add

2b6632f

Fixed merge issues

5b50e06

hangc0276 approved these changes Feb 27, 2023

View reviewed changes

merlimat added 7 commits February 27, 2023 13:11

Merge remote-tracking branch 'apache/master' into single-buffer-add

62c0bdf

Fixed merge issues

cc3c343

WIP

8791faf

Fixed test and removed dead code

ca4a555

Removed unused import

e584323

Fixed BookieJournalTest

28d3581

removed unused import

3390bc5

hangc0276 and others added 4 commits March 20, 2023 09:48

fix the checkstyle

822eb3e

fix failed test

eeedef6

fix failed test

76c4768

Merge remote-tracking branch 'apache/master' into single-buffer-add

441f522

merlimat merged commit 234b817 into apache:master Mar 20, 2023

hangc0276 mentioned this pull request Apr 2, 2023

Fix promise lost in ClientSide AuthHandler #3902

Closed

michaeljmarshall mentioned this pull request Apr 14, 2023

[fix] DigestManager should not advance readerIndex #3919

Merged

zymap mentioned this pull request Dec 24, 2023

Fix the autorecovery failed replicate by add entry fenced error #4163

Merged

lhotari mentioned this pull request Jan 31, 2024

Fix checksum calculation bug when the payload is a CompositeByteBuf with readerIndex > 0 #4196

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Single buffer for small add requests #3783

Single buffer for small add requests #3783

merlimat commented Feb 17, 2023

codecov-commenter commented Feb 18, 2023 •

edited

Loading

hangc0276 Feb 20, 2023

merlimat Feb 23, 2023

hangc0276 Feb 23, 2023

hangc0276 Feb 27, 2023

hangc0276 Feb 27, 2023

merlimat Feb 27, 2023

hangc0276 commented Mar 6, 2023

merlimat commented Mar 6, 2023

StevenLuMT commented Mar 9, 2023

hangc0276 commented Mar 20, 2023

merlimat commented Mar 20, 2023

Single buffer for small add requests #3783

Single buffer for small add requests #3783

Conversation

merlimat commented Feb 17, 2023

Motivation

codecov-commenter commented Feb 18, 2023 • edited Loading

Codecov Report

hangc0276 Feb 20, 2023

Choose a reason for hiding this comment

merlimat Feb 23, 2023

Choose a reason for hiding this comment

hangc0276 Feb 23, 2023

Choose a reason for hiding this comment

hangc0276 Feb 27, 2023

Choose a reason for hiding this comment

hangc0276 Feb 27, 2023

Choose a reason for hiding this comment

merlimat Feb 27, 2023

Choose a reason for hiding this comment

hangc0276 commented Mar 6, 2023

merlimat commented Mar 6, 2023

StevenLuMT commented Mar 9, 2023

hangc0276 commented Mar 20, 2023

merlimat commented Mar 20, 2023

codecov-commenter commented Feb 18, 2023 •

edited

Loading