Fixed heading in blogs post page · Pull Request #5601 · apache/hudi

ghost · 2022-05-17T06:49:44Z

Tips

Thank you very much for contributing to Apache Hudi.
Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.

What is the purpose of the pull request

Fixed the heading size on the blog post page.

Brief change log

(for example:)

Modify AnnotationLocation checkstyle rule in checkstyle.xml

Verify this pull request

(Please pick either of the following options)

This pull request is a trivial rework / code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

Added integration tests for end-to-end.
Added HoodieClientWriteTest to verify the change.
Manually verified the change by running a job locally.

Committer checklist

Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

bhasudha · 2022-05-17T06:57:19Z

website/docs/compaction.md

 1. ***Compaction Scheduling***: This is done by the ingestion job. In this step, Hudi scans the partitions and selects **file
   slices** to be compacted. A compaction plan is finally written to Hudi timeline.
-1. ***Compaction Execution***: A separate process reads the compaction plan and performs compaction of file slices.
+1. ***Compaction Execution***: In this step the compaction plan is read and file slices are compacted.


These seem like already-merged changes that are not relevant to this PR. Can you ensure you have rebased from apache/hudi?

bhasudha · 2022-05-17T06:57:30Z

website/learn/faq.md


 That said, for obvious reasons of not blocking ingesting for compaction, you may want to run it asynchronously as well. This can be done either via a separate [compaction job](https://github.com/apache/hudi/blob/master/hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java) that is scheduled by your workflow scheduler/notebook independently. If you are using delta streamer, then you can run in [continuous mode](https://github.com/apache/hudi/blob/d3edac4612bde2fa9deca9536801dbc48961fb95/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java#L241) where the ingestion and compaction are both managed concurrently in a single spark run time.

+### What options do I have for asynchronous/offline compactions on MOR dataset?


Bhavani Sudha Saktheeswaran and others added 2 commits April 26, 2022 16:51

[DOCS] Add faq for async/offline compaction options

a235f8a

Fixed heading in blogPostPage

b9ee016

ghost closed this May 17, 2022

ghost reopened this May 17, 2022

bhasudha reviewed May 17, 2022


Jai Yadav and others added 4 commits May 17, 2022 12:27

Fix background size

b400797

Update compaction.md

47f4cf9

Update compaction.md

27a3897

Merge branch 'asf-site' into fixHeading

2938f00

ghost closed this May 17, 2022

This pull request was closed.

ghost wants to merge 6 commits into apache:asf-site from bhasudha:fixHeading
