Skip to content

Fix errors and inconsistencies in Variant format documentation#574

Open
iemejia wants to merge 1 commit into
apache:masterfrom
iemejia:fix/variant-docs
Open

Fix errors and inconsistencies in Variant format documentation#574
iemejia wants to merge 1 commit into
apache:masterfrom
iemejia:fix/variant-docs

Conversation

@iemejia
Copy link
Copy Markdown
Member

@iemejia iemejia commented Jun 2, 2026

Summary

Fix bugs, terminology errors, and inconsistencies in the Variant format specification documents.

Changes

VariantEncoding.md

  • Fix BINARY -> BYTE_ARRAY (BINARY is not a Parquet physical type)
  • Add note on decimal little-endian vs big-endian difference
  • Fix decimal implied-precision formula for val <= 0
  • Label undocumented reserved bits in metadata/object/array headers
  • Make sorted_strings description consistent across three definitions
  • Use INT(N, true) notation consistent with LogicalTypes.md
  • Hyphenate compound adjectives ("3 byte" -> "3-byte", etc.)

VariantShredding.md

  • Fix Python syntax error: iterating dict yields keys only; add .items() for (name, field) unpacking
  • Replace BINARY with BYTE_ARRAY
  • Fix comma -> colon inside JSON-like literal in table cell
  • Remove trailing space inside backticks in table header
  • Use INT(N, true) notation consistent with LogicalTypes.md

Validation

No semantic/behavioral changes to the format specification. All fixes are documentation-only.

Split from #572 for easier review.

VariantEncoding.md:
- Fix BINARY -> BYTE_ARRAY (BINARY is not a Parquet physical type)
- Add note on decimal little-endian vs big-endian difference
- Fix decimal implied-precision formula for val <= 0
- Label undocumented reserved bits in metadata/object/array headers
- Make sorted_strings description consistent across three definitions
- Use INT(N, true) notation consistent with LogicalTypes.md
- Hyphenate compound adjectives ("3 byte" -> "3-byte", etc.)

VariantShredding.md:
- Fix Python syntax error: iterating dict yields keys only;
  add .items() for (name, field) unpacking
- Replace BINARY with BYTE_ARRAY
- Fix comma -> colon inside JSON-like literal in table cell
- Remove trailing space inside backticks in table header
- Use INT(N, true) notation consistent with LogicalTypes.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant