tolerate missing padding at the end of the last block of parquet files#99857

Merged
alexey-milovidov merged 2 commits into ClickHouse:master from seva-potapov:fix-parquet-v3-delta-binary-packed-no-padding on Mar 18, 2026
@seva-potapov
Contributor

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Tolerate missing padding at the end of the last block of Parquet files.

Details:

The Parquet spec requires zero-padding the last block to the full block size, but some writers (e.g. parquet-go) write bytes only for the actual values. The missing bytes correspond to padding values beyond total_values_remaining, which decodeImpl would never read, so we reduce the miniblock's readable value count to match the available data.
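
To illustrate the idea, here is a minimal Python sketch of the clamping logic described above. It is not the ClickHouse code: the function name `readable_miniblock_values` and all of its parameters are hypothetical, and it assumes a bit-packed miniblock (as in DELTA_BINARY_PACKED, which the branch name suggests) whose full size is `values_per_miniblock * bit_width / 8` bytes.

```python
def readable_miniblock_values(values_per_miniblock, bit_width,
                              bytes_available, values_remaining):
    """Return how many values can be decoded from the current miniblock.

    Hypothetical helper, not the actual ClickHouse implementation.
    """
    # With a bit width of 0 the miniblock occupies no bytes at all,
    # so nothing can be missing.
    if bit_width == 0:
        return values_per_miniblock

    # Size of a fully padded miniblock in bytes.
    full_size = values_per_miniblock * bit_width // 8
    if bytes_available >= full_size:
        return values_per_miniblock

    # The writer omitted the padding: count only the values whose bits
    # are completely present in the remaining data.
    values_that_fit = bytes_available * 8 // bit_width

    # Missing bytes are tolerable only if they cover padding values,
    # i.e. values past the ones the decoder still has to produce.
    if values_that_fit < values_remaining:
        raise ValueError("miniblock is truncated: real values are missing")

    return values_that_fit
```

Under these assumptions, a last miniblock with 8-bit values that holds only 5 bytes is accepted when 5 values remain, while the same data is rejected when 6 values remain, because a real value would then be missing.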

@clickhouse-gh
Contributor

clickhouse-gh bot commented Mar 18, 2026

Workflow [PR], commit [6c7f32e]

@alexey-milovidov alexey-milovidov self-assigned this Mar 18, 2026
@clickhouse-gh
Contributor

clickhouse-gh bot commented Mar 18, 2026

LLVM Coverage Report

| Metric | Baseline | Current | Δ |
| --- | --- | --- | --- |
| Lines | 83.70% | 83.70% | +0.00% |
| Functions | 23.90% | 23.90% | +0.00% |
| Branches | 76.30% | 76.30% | +0.00% |

PR changed-lines coverage: 92.86% (13/14, 0 noise lines excluded)

@alexey-milovidov alexey-milovidov added this pull request to the merge queue Mar 18, 2026
Merged via the queue into ClickHouse:master with commit 03c3e3e Mar 18, 2026
163 checks passed
@robot-clickhouse robot-clickhouse added the pr-synced-to-cloud The PR is synced to the cloud repo label Mar 18, 2026

Labels

pr-improvement: Pull request with some product improvements
pr-synced-to-cloud: The PR is synced to the cloud repo

3 participants