Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HUDI-2028] Implement RockDbBasedMap as an alternate to DiskBasedMap in SpillableMap #3112

Closed
wants to merge 2 commits into from

Conversation

rmahindra321
Copy link

What is the purpose of the pull request

This pull request adds a new alternative based on RockDb for the Disk Based Map that is used within the ExternalSpillableMap. Our benchmark results shows that RockDb may improve performance significantly when the data set is large while available memory may be scarce. RockDb supports compression, efficient memory usage and native library, that may be more efficient in certain situations. By default, disk based map will be used, and a config change will be required to enable rocksDb.

In this PR, the rocksDB support is only enabled for HoodieMergeHandle, and a subsequent PR will extend it to all consumers of ExternalSpillableMap (tracked here HUDI-2044)

Brief change log

  • Adds a new alternative based on RockDb for the Disk Based Map that is used within the ExternalSpillableMap.
  • The support is currently added only for HoodieMergeHandle

Verify this pull request

This change added tests and can be verified as follows:

Added the unit test in TestSpillableRocksDBBasedMap
Updated the test for TestExternalSpillableMap

@rmahindra321 rmahindra321 changed the title Implement RockDbBasedMap as an alternate to DiskBasedMap in SpillableMap [HUDI-2028] Implement RockDbBasedMap as an alternate to DiskBasedMap in SpillableMap Jun 18, 2021
@rmahindra321 rmahindra321 marked this pull request as draft June 18, 2021 20:30
@hudi-bot
Copy link

hudi-bot commented Jun 18, 2021

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run travis re-run the last Travis build
  • @hudi-bot run azure re-run the last Azure build

@rmahindra321 rmahindra321 marked this pull request as ready for review June 18, 2021 21:49
@vinothchandar vinothchandar self-assigned this Jun 19, 2021
@vinothchandar vinothchandar added this to Ready for Review in PR Tracker Board Jun 19, 2021
@codecov-commenter
Copy link

codecov-commenter commented Jun 19, 2021

Codecov Report

Merging #3112 (17ccf1d) into master (cdb9b48) will increase coverage by 0.10%.
The diff coverage is 82.65%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #3112      +/-   ##
============================================
+ Coverage     45.85%   45.95%   +0.10%     
- Complexity     5269     5296      +27     
============================================
  Files           908      910       +2     
  Lines         39332    39427      +95     
  Branches       4239     4246       +7     
============================================
+ Hits          18036    18119      +83     
- Misses        19451    19459       +8     
- Partials       1845     1849       +4     
Flag Coverage Δ
hudicli 39.95% <ø> (ø)
hudiclient 30.43% <16.66%> (-0.01%) ⬇️
hudicommon 47.92% <86.95%> (+0.34%) ⬆️
hudiflink 61.33% <ø> (ø)
hudihadoopmr 51.29% <ø> (ø)
hudisparkdatasource 66.52% <ø> (ø)
hudisync 51.73% <ø> (ø)
huditimelineservice 64.36% <ø> (ø)
hudiutilities 56.67% <ø> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ain/java/org/apache/hudi/io/HoodieMergeHandle.java 0.00% <0.00%> (ø)
...ache/hudi/common/util/collection/DiskBasedMap.java 80.88% <ø> (ø)
...java/org/apache/hudi/config/HoodieWriteConfig.java 17.03% <20.00%> (+0.03%) ⬆️
...i/common/util/collection/ExternalSpillableMap.java 76.36% <77.27%> (+4.93%) ⬆️
...mmon/util/collection/SpillableRocksDBBasedMap.java 86.36% <86.36%> (ø)
...apache/hudi/common/util/collection/RocksDBDAO.java 77.11% <96.15%> (+2.25%) ⬆️
...ache/hudi/table/action/rollback/RollbackUtils.java 0.00% <0.00%> (ø)
...apache/hudi/exception/HoodieMetadataException.java 0.00% <0.00%> (ø)
...apache/hudi/utilities/deltastreamer/DeltaSync.java 71.18% <0.00%> (+0.33%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cdb9b48...17ccf1d. Read the comment docs.

PR Tracker Board automation moved this from Ready for Review to Done Jun 20, 2021
@rmahindra321
Copy link
Author

Used the incorrect github account to generate the PR, created a new PR #3117

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants