[HUDI-8285] implement row writer cluster with fg reader #12400

Closed
jonvex wants to merge 3 commits into apache:master from jonvex:row_writer_cluster_fg_reader_better

jonvex (Contributor) commented on Dec 2, 2024

Change Logs

Implement row-writer clustering by adding a fixed file index to the Spark datasource.

Impact

Enables clustering with the full read feature set.

Risk level (write none, low, medium or high below)

low

Documentation Update

N/A

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

github-actions bot added the size:L (PR with lines of changes in (300, 1000]) label on Dec 2, 2024
hudi-bot (Collaborator) commented on Dec 3, 2024

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

danny0405 (Contributor) commented

What is the relationship between this PR and #12395?

jonvex (Contributor, Author) commented on Dec 3, 2024

This is a contingency in case that other PR gets stuck. Both have the same end goal of clustering with the new fg reader, but they take different approaches. In that PR, we use the fg reader directly in Spark clustering to read the file groups. This PR implements a new index so that we can use the Spark datasource to read all the file groups.
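
As a rough illustration of the second approach (not the PR's actual code), a fixed file index can be pictured as an index that is handed a predetermined set of file groups and returns exactly those, instead of listing the table. The sketch below assumes this reading of the description; `FileSlice`, `FixedFileIndex`, `plan_clustering_read`, and all paths are hypothetical names invented for the example and are not Hudi APIs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FileSlice:
    """Hypothetical stand-in for one file group's base file plus its log files."""
    file_group_id: str
    base_file: str
    log_files: tuple = ()


class FixedFileIndex:
    """Hypothetical index that serves a predetermined set of file slices.

    A regular file index would list the table and prune partitions; this
    one simply returns the slices it was constructed with.
    """

    def __init__(self, slices: List[FileSlice]):
        self._slices = list(slices)

    def list_slices(self) -> List[FileSlice]:
        # Return a copy so callers cannot mutate the fixed set.
        return list(self._slices)


def plan_clustering_read(table_slices: Dict[str, FileSlice],
                         group_ids: List[str]) -> FixedFileIndex:
    """Build an index restricted to the file groups named by a clustering plan."""
    return FixedFileIndex([table_slices[g] for g in group_ids])


# Made-up table contents: three file groups, one of which has a log file.
table = {
    "fg-1": FileSlice("fg-1", "p1/fg-1.parquet", ("p1/.fg-1.log.1",)),
    "fg-2": FileSlice("fg-2", "p1/fg-2.parquet"),
    "fg-3": FileSlice("fg-3", "p2/fg-3.parquet"),
}

# Only the groups selected for clustering are visible through the index.
index = plan_clustering_read(table, ["fg-1", "fg-3"])
selected_ids = [s.file_group_id for s in index.list_slices()]
```

Under this assumption, the datasource read path stays unchanged and only the set of files it sees is pinned, which is the contrast with calling the fg reader directly from the clustering code.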

jonvex closed this on Dec 5, 2024