[HUDI-3911][DOCS][WIP] Add async indexing doc #5449

Closed
@codope wants to merge 2 commits

@codope (Member) commented Apr 27, 2022

[Screenshot: screencapture-localhost-3000-blog-2022-04-27-async-indexing-2022-04-27-18_51_05]

@codope changed the title from "[HUDI-3911][DOCS] Add async indexing doc" to "[HUDI-3911][DOCS][WIP] Add async indexing doc" Apr 28, 2022
@yihua yihua added the docs label Apr 28, 2022
@yihua yihua added this to Under Discussion PRs in PR Tracker Board via automation Apr 28, 2022
@bhasudha (Contributor) commented Apr 28, 2022

@codope I read through the blog. Thanks for the great writeup. I have a few thoughts and questions from an external perspective.

  • Should we rename this blog to say explicitly "Asynchronous metadata indexing using Hudi"? That would avoid confusing users with the existing indexing concept described at https://hudi.apache.org/docs/indexing.
  • I also see that our docs have a Services category, with pages for clustering, compaction, etc. Since this is one of the table services, should we also add a page on metadata indexing under Services? It can have a brief description and point to this blog for detailed content.
  • Lastly, reading the description, I wanted to understand what happens if we disable hoodie.metadata.index.async. What is the behavior then? Does indexing run inline? What are the implications? It would be good to see those explained as well.

Let me know your thoughts.

@nsivabalan (Contributor)

I like the motivation section 👏

@codope (Member, Author) commented Apr 30, 2022

Good points, @bhasudha.

I have moved the setup- and usage-related sections to a separate page under the Services tab; see #5476.
I still need to do more work on this blog: I will rename it and add more design elements and a diagram. If async indexing is disabled while any of the index types is enabled, those metadata partitions are created and updated synchronously. I added this line to the caveats section in #5476.
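
For readers following the thread, here is a rough sketch of the two modes described above, written as plain writer options. Only hoodie.metadata.index.async comes from this conversation; the helper name `metadata_index_options` is hypothetical, and the other two keys are assumptions about how an index type would be enabled, so check the Services page from #5476 before relying on them.

```python
def metadata_index_options(async_indexing: bool) -> dict:
    """Build illustrative Hudi writer options (hypothetical helper, not a Hudi API)."""
    return {
        # Assumption: the metadata table must be enabled for any metadata index.
        "hoodie.metadata.enable": "true",
        # Assumption: one index type (column stats) is switched on.
        "hoodie.metadata.index.column.stats.enable": "true",
        # From the discussion: when this is false, the enabled metadata
        # partitions are created and updated synchronously with the write.
        "hoodie.metadata.index.async": "true" if async_indexing else "false",
    }


# Synchronous (inline) indexing: the writer maintains the index partitions itself.
inline_opts = metadata_index_options(async_indexing=False)

# Asynchronous indexing: index building is left to a separate indexing job.
async_opts = metadata_index_options(async_indexing=True)
```

The two dictionaries differ only in the hoodie.metadata.index.async value, which is the switch being asked about.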

@codope codope force-pushed the HUDI-3911-async-index-blog branch from 7fe14ec to dba7f58 Compare May 25, 2022 06:53

## Background

Built on top of cheap storage and open file formats, the Hudi stack is more than
@codope (Member, Author) commented May 25, 2022

Closing this PR; #5476 is already merged.

@codope codope closed this May 25, 2022
PR Tracker Board automation moved this from Under Discussion PRs to Done May 25, 2022
4 participants