
[MINOR] add integrity check of parquet file #8056

Open
XuQianJin-Stars wants to merge 1 commit into apache:master from XuQianJin-Stars:check-parquet-file


@XuQianJin-Stars XuQianJin-Stars commented Feb 27, 2023

Add an integrity check of the Parquet file for HoodieRowDataParquetWriter, so that written Parquet data files can always be read back consistently.
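For context, here is a minimal sketch of what a structural integrity check on a Parquet file can look like. It is not the `IOUtils` code from this PR: the class `ParquetMagicCheck` and the method `looksComplete` are hypothetical names, and it reads from the local filesystem rather than through a Hadoop configuration. It relies only on the Parquet file layout, in which an unencrypted file starts with the 4-byte magic `PAR1` and ends with a 4-byte little-endian footer length followed by `PAR1` again.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

/** Hypothetical helper for illustration; not the check added by this PR. */
final class ParquetMagicCheck {
    private static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);
    // Leading magic (4 bytes) + footer length (4 bytes) + trailing magic (4 bytes).
    private static final int MIN_SIZE = 12;

    /** Returns true if the file has both magic markers and a plausible footer length. */
    static boolean looksComplete(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            if (size < MIN_SIZE) {
                return false; // too short to hold even an empty footer
            }
            byte[] head = readFully(ch, 0, 4);
            byte[] tail = readFully(ch, size - 8, 8);
            if (!Arrays.equals(head, MAGIC)) {
                return false;
            }
            if (!Arrays.equals(Arrays.copyOfRange(tail, 4, 8), MAGIC)) {
                return false; // a writer that never closed leaves no trailing magic
            }
            int footerLen = ByteBuffer.wrap(tail, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
            // The footer must fit between the leading magic and the 8-byte tail.
            return footerLen >= 0 && footerLen <= size - MIN_SIZE;
        }
    }

    /** Reads exactly len bytes starting at pos, or fails if the file ends early. */
    private static byte[] readFully(FileChannel ch, long pos, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            int n = ch.read(buf, pos + buf.position());
            if (n < 0) {
                throw new IOException("Unexpected end of file");
            }
        }
        return buf.array();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: ParquetMagicCheck <file>");
            return;
        }
        System.out.println(looksComplete(Paths.get(args[0])) ? "complete" : "incomplete");
    }
}
```

A check like this only catches truncated or never-closed files. It does not prove that the footer metadata or the row groups decode correctly, so a stricter check would have to actually parse the footer.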

Change Logs

NA

Impact

NA

Risk level (write none, low, medium or high below)

low

Documentation Update

NA

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed


IOUtils.checkParquetFileVaid(hoodieTable.getHadoopConf(), newFilePath);

long oldNumWrites = 0;

@danny0405 danny0405 Feb 27, 2023


In what scenario does the handle flush successfully while the Parquet file is still incomplete? And what about HoodieCreateHandle?

@hudi-bot

CI report:

Bot commands
@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@github-actions github-actions Bot added the size:S PR with lines of changes in (10, 100] label Feb 26, 2024
