
[HUDI-1554] Introduced buffering for streams in HUDI.#2496

Closed
prashantwason wants to merge 1 commit into apache:master from prashantwason:pw_io_buffering

Conversation

@prashantwason (Member)

What is the purpose of the pull request

Input and Output streams created in HUDI through calls to HoodieWrapperFileSystem do not include any buffering unless the underlying file system implements buffering.

This patch introduces buffering at the HoodieWrapperFileSystem level so that all types of reads and writes benefit from buffering.

Brief change log

HoodieWrapperFileSystem changed to introduce BufferedStreams.
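As a rough illustration of the change (not the actual patch code; the class and method names below are hypothetical), buffering at the wrapper level amounts to wrapping whatever stream the underlying file system returns in a java.io buffered equivalent, so reads and writes hit the store in large chunks instead of per call:

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical sketch of the buffering idea: wrap the raw stream the
// underlying FileSystem hands back in a buffered equivalent.
class BufferedStreamSketch {
    // 16 MB for data/log files, matching the size discussed in this PR.
    static final int DATA_FILE_BUFFER_SIZE = 16 * 1024 * 1024;

    static InputStream wrapForRead(InputStream raw, int bufferSize) {
        return new BufferedInputStream(raw, bufferSize);
    }

    static OutputStream wrapForWrite(OutputStream raw, int bufferSize) {
        return new BufferedOutputStream(raw, bufferSize);
    }
}
```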

Verify this pull request

This pull request is already covered by existing tests.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@prashantwason (Member, Author)

@n3nash Please review as this may provide benefits for HDFS workloads.

@n3nash (Contributor) left a comment:

@prashantwason left some comments

@codecov-io commented Jan 27, 2021

Codecov Report

Merging #2496 (b03b269) into master (23f2ef3) will increase coverage by 0.48%.
The diff coverage is 38.19%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #2496      +/-   ##
============================================
+ Coverage     50.28%   50.77%   +0.48%     
- Complexity     3120     3182      +62     
============================================
  Files           430      436       +6     
  Lines         19565    19907     +342     
  Branches       2004     2041      +37     
============================================
+ Hits           9838    10107     +269     
- Misses         8924     8979      +55     
- Partials        803      821      +18     
Flag                 Coverage Δ                 Complexity Δ
hudicli              36.90% <ø> (-0.31%)        0.00 <ø> (ø)
hudiclient           100.00% <ø> (ø)            0.00 <ø> (ø)
hudicommon           51.12% <38.19%> (-0.40%)   0.00 <27.00> (ø)
hudiflink            43.21% <ø> (+10.17%)       0.00 <ø> (ø)
hudihadoopmr         33.16% <ø> (ø)             0.00 <ø> (ø)
hudisparkdatasource  69.46% <ø> (+3.60%)        0.00 <ø> (ø)
hudisync             48.61% <ø> (ø)             0.00 <ø> (ø)
huditimelineservice  66.49% <ø> (ø)             0.00 <ø> (ø)
hudiutilities        69.46% <ø> (-0.02%)        0.00 <ø> (ø)

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ Complexity Δ
...i/common/config/HoodieWrapperFileSystemConfig.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...apache/hudi/common/engine/HoodieEngineContext.java 50.00% <0.00%> (-16.67%) 1.00 <0.00> (ø)
.../hudi/common/fs/BufferedSizeAwareOutputStream.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...c/main/java/org/apache/hudi/common/fs/FSUtils.java 48.62% <0.00%> (-1.15%) 62.00 <2.00> (+1.00) ⬇️
...che/hudi/common/fs/TimedSizeAwareOutputStream.java 56.66% <33.33%> (ø) 4.00 <1.00> (?)
...apache/hudi/common/fs/HoodieWrapperFileSystem.java 25.00% <47.69%> (+2.67%) 52.00 <13.00> (+8.00)
.../org/apache/hudi/common/fs/TimedFSInputStream.java 50.00% <50.00%> (ø) 8.00 <8.00> (?)
...org/apache/hudi/common/model/HoodieFileFormat.java 100.00% <100.00%> (ø) 6.00 <3.00> (+3.00)
...che/hudi/common/table/log/HoodieLogFileReader.java 68.86% <100.00%> (+1.00%) 22.00 <0.00> (ø)
.../apache/hudi/common/fs/TimedFSDataInputStream.java 0.00% <0.00%> (-29.42%) 0.00% <0.00%> (-3.00%)
... and 32 more

@vinothchandar (Member) left a comment:

High-level question: should we always buffer, or should we make this configurable for HDFS only?

@prashantwason (Member, Author)

High-level question: should we always buffer, or should we make this configurable for HDFS only?

I don't have much insight into other file systems and their inherent buffering, so you can decide. I did not see an easy way to restrict this, as HoodieWrapperFileSystem currently does not take any properties.

@prashantwason prashantwason force-pushed the pw_io_buffering branch 2 times, most recently from f6f692d to b43ddd3 Compare January 28, 2021 19:41
@vinothchandar vinothchandar self-assigned this Jan 29, 2021
@vinothchandar (Member)

cc @umehrot2 would this additional buffering pose inefficiencies for the S3 FileSystem? TL;DR: HDFS's DistributedFileSystem does not buffer reads, and neither does the Parquet reader, so this just cuts down a ton of RPC calls to the NameNode.

@vinothchandar (Member) left a comment:

LGTM. Minor comments. I think the 16MB buffer is okay to do regardless of DFS.

On passing configs, the way I can think of is to transfer the values from writeConfig to the hadoop configuration object. We should not make this layer aware of HoodieWriteConfig etc. For now, these constants may be ok.
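The approach suggested here, carrying writer-level values through a generic key-value configuration so the file-system layer never sees HoodieWriteConfig, can be sketched generically. In this sketch, java.util.Properties stands in for Hadoop's Configuration object; the hoodie.fs.io.buffer.enabled key comes from this PR, but the helper itself is hypothetical:

```java
import java.util.Properties;

// Hypothetical sketch: lift selected writer settings into the generic
// key/value config the wrapper file system already receives, so
// HoodieWrapperFileSystem stays unaware of HoodieWriteConfig.
class ConfigTransferSketch {
    static final String BUFFER_ENABLED_KEY = "hoodie.fs.io.buffer.enabled";

    // Copy only the keys the FS layer understands; other writer
    // settings are deliberately left behind.
    static void transfer(Properties writeConfig, Properties hadoopConf) {
        hadoopConf.setProperty(BUFFER_ENABLED_KEY,
            writeConfig.getProperty(BUFFER_ENABLED_KEY, "true"));
    }
}
```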

@vinothchandar (Member)

Once we fix CI and the minor stuff, we can land

@prashantwason (Member, Author)

On passing configs, the way I can think of is to transfer the values from writeConfig to the hadoop configuration object

Implemented this. @vinothchandar PTAL

@prashantwason (Member, Author)

@n3nash I have implemented the on/off config. PTAL and approve.

@n3nash (Contributor) left a comment:

@prashantwason Thanks for adding the configurable option. This will help roll out this feature with confidence that we can turn it off if we see issues. Left one comment; the rest LGTM, can merge once addressed.

@nsivabalan (Contributor) left a comment:

Hey @prashantwason: I don't have a lot of knowledge about buffering. Can you please link some references to understand buffering in a file system, if you have some? Also, if you have come across any links that discuss when to enable buffering and when not to, that would be good as well. It will benefit folks like me who are trying to understand how buffering works, and will also help users who are deciding whether to enable it.

Contributor left a comment:

Can we please try to be uniform? For getter methods, we have suffixed "ing" for buffer, but here we haven't. Was it a conscious decision?

Member (Author) left a comment:

Buffering is a verb. So when asking "is buffering enabled" I have used the ing form. The size of the buffer is in bytes. So for getting the size of the buffer, I have used "buffer".

Buffering is enabled/disabled. Buffering does not have a size. Buffer has a size.

@nsivabalan (Contributor) left a comment:

A high-level question.

Do we need to enable the metrics file systems (time-aware, size-aware) by default whenever buffering is enabled? Why not make this configurable too? I wonder whether there will be any overhead, since we might be measuring metrics for every read/write call to the FileSystem. Please correct me if my understanding is wrong. Users may not be interested in these metrics unless they want to debug something, IMO. Let me know your thoughts.

Contributor left a comment:

Do we have unit tests for this FS?

Member (Author) left a comment:

This is being tested by all the unit tests which read data.

Contributor left a comment:

Why still name the class TimedFSInputStream, given that the written byte size is also recorded?

Member (Author) left a comment:

This class does not record the written bytes.

@prashantwason (Member, Author)

Do we need to enable the metrics file systems (time-aware, size-aware) by default whenever buffering is enabled?

There are two parts to metrics in HUDI:

  1. Metrics in-memory (explained below)
  2. Publishing the metrics out (e.g. to Grafana). This needs to be enabled explicitly (it is disabled by default) and requires external infrastructure.

Metrics within HUDI are implemented using a Registry, which simply maintains the key-value metric pairs in memory. Each metric is an AtomicLong held in an in-memory HashMap.

Therefore the overhead of incrementing a metric is:

  1. A HashMap lookup to find the counter
  2. AtomicLong.addAndGet()

So this should be negligible overhead on modern processors unless we are maintaining millions of metrics.

I feel that metrics-enabled checks everywhere (if-metrics-enabled-then-do-something) tend to make the code ugly, and they don't provide any performance benefit.
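The in-memory registry described above can be sketched as follows. This is a simplified stand-in, not Hudi's actual Registry class, but it shows why an increment costs only one concurrent-map lookup plus one atomic add:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Simplified stand-in for Hudi's metrics Registry: key-value counters
// held in memory, each counter an AtomicLong in a concurrent map.
class RegistrySketch {
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // Overhead per increment: one map lookup + one AtomicLong.addAndGet().
    void add(String name, long delta) {
        counters.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(delta);
    }

    long get(String name) {
        AtomicLong c = counters.get(name);
        return c == null ? 0L : c.get();
    }
}
```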

@prashantwason prashantwason force-pushed the pw_io_buffering branch 2 times, most recently from 1d4120a to 113d4b1 Compare February 2, 2021 06:40
@prashantwason prashantwason force-pushed the pw_io_buffering branch 2 times, most recently from 2e1a1be to b03b269 Compare February 7, 2021 18:07
@nsivabalan nsivabalan added the priority:high Significant impact; potential bugs label Feb 11, 2021
@vinothchandar (Member)

@prashantwason can we get the PR to pass tests? I can take a final pass for landing. It'd be good to get this in.

Member left a comment:

Why this change for this PR?

Input and Output streams created in HUDI through calls to HoodieWrapperFileSystem do not include any buffering unless the underlying file system implements buffering. This patch introduces buffering at the HoodieWrapperFileSystem level so that all types of reads and writes benefit from buffering.

The buffering can be controlled by the following properties:
  hoodie.fs.io.buffer.enabled (default true) enable/disable buffering
  hoodie.fs.io.buffer.data.min.size  (default 16MB) Minimum buffer size of data files and log files which are generally large in size
  hoodie.fs.io.buffer.min.size  (default 1MB) Minimum buffer size of non-data files which are generally smaller in size
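A hedged sketch of how the two minimum sizes quoted above might be applied. The property defaults are the ones listed in this thread, but the selection logic and the data-file check below are illustrative guesses, not the actual HoodieWrapperFileSystem code:

```java
// Illustrative only: pick a buffer size from the two minimums listed in
// the quoted javadoc. The data/log-file test below is a guess at the
// rule, not the actual patch logic.
class BufferSizeSketch {
    static final int DATA_MIN_SIZE = 16 * 1024 * 1024; // hoodie.fs.io.buffer.data.min.size
    static final int OTHER_MIN_SIZE = 1024 * 1024;     // hoodie.fs.io.buffer.min.size

    static int bufferSizeFor(String path) {
        boolean isDataOrLogFile = path.endsWith(".parquet") || path.contains(".log.");
        return isDataOrLogFile ? DATA_MIN_SIZE : OTHER_MIN_SIZE;
    }
}
```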
@vinothchandar (Member)

@prashantwason I rebased this against master; there are still some test failures. Could you please take a look so we can land this?

@vinothchandar (Member)

@prashantwason ping!

@prashantwason (Member, Author)

@vinothchandar With a re-test on HDFS, I have been able to verify that this patch reduces the total number of API calls, but I did not find any significant difference in performance. Were you able to test this on S3?

My main aim was to improve performance, which I cannot prove right now. Should we mark this WIP until I can get more tests done?

@vinothchandar (Member)

I have not been able to test this on S3. Let me pick it up later next week.

@vinothchandar (Member)

cc @nsivabalan @codope do either of you have cycles to test this PR on top of S3 and see if any perf improvements happen? (My guess is no.)

@hudi-bot (Collaborator) commented Nov 5, 2021

CI report:

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@yihua yihua added area:performance Performance optimizations writer-core labels Sep 7, 2022
@yihua (Contributor) commented Sep 7, 2022

@vinothchandar @nsivabalan @xushiyan if I understand correctly based on the discussion, this PR is ready to land after fixing the tests. The performance test on S3 is a plus. Or could we close this PR for now given other improvements like metadata table, record-level index, and log compaction?

@xushiyan xushiyan added the status:in-progress Work in progress label Oct 31, 2022
@github-actions github-actions bot added the size:L PR with lines of changes in (300, 1000] label Feb 26, 2024
@vinothchandar (Member)

Closing this due to inactivity. Reopen if interested.


Labels

  • area:performance Performance optimizations
  • priority:high Significant impact; potential bugs
  • release-1.0.0
  • size:L PR with lines of changes in (300, 1000]
  • status:in-progress Work in progress


9 participants