Skip to content

Conversation

@wgtmac
Copy link
Member

@wgtmac wgtmac commented Nov 4, 2025

  • fix schema definition of manifest list & file with versions
  • port manifest list test cases from java
  • fix various issues found by new cases

@wgtmac wgtmac changed the title test: add extensive manifest (file/list) test cases test: add manifest list test cases Nov 7, 2025
@wgtmac wgtmac marked this pull request as ready for review November 7, 2025 15:35
@wgtmac
Copy link
Member Author

wgtmac commented Nov 7, 2025

@zhjwpku @HeartLinked

@wgtmac wgtmac requested a review from Copilot November 7, 2025 16:14
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors manifest and manifest list schema initialization across different Iceberg format versions (V1, V2, V3). The changes replace dynamic schema construction based on field ID sets with static, version-specific schema definitions, improving code clarity and maintainability.

Key changes:

  • Introduced static helper methods (EntrySchema, WrapFileSchema, FileType/DataFileSchema) for constructing version-specific schemas
  • Added static kManifestListSchema constants for each format version
  • Changed V3's first_row_id parameter from std::optional<int64_t> to int64_t in writer constructors
  • Removed InitSchema methods from base adapter classes
  • Updated error message format strings from %s to {} for consistency with std::format

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/iceberg/v1_metadata.h/cc Added static schema construction methods and manifest list schema constant for V1
src/iceberg/v2_metadata.h/cc Added static schema construction methods and manifest list schema constant for V2
src/iceberg/v3_metadata.h/cc Added static schema construction methods, manifest list schema constant, changed first_row_id to non-optional, renamed method to WrapFirstRowId
src/iceberg/schema_field.h Added AsRequired/AsOptional helper methods for field optionality conversion
src/iceberg/manifest_writer.h/cc Changed MakeV3Writer signature to accept non-optional first_row_id, added next_row_id() accessor
src/iceberg/manifest_adapter.h/cc Removed InitSchema methods, moved partition_spec initialization to constructor, added next_row_id() virtual method
src/iceberg/manifest_list.h/cc Changed Type() return type from StructType reference to shared_ptr
src/iceberg/manifest_entry.h/cc Added partition documentation constant, updated default partition type handling
src/iceberg/manifest_reader.cc Simplified manifest list reader initialization using new Type() signature
src/iceberg/test/manifest_list_versions_test.cc New comprehensive test file for cross-version manifest list compatibility
src/iceberg/test/CMakeLists.txt Added new test file to build configuration

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- fix schema definition of manifest list & file with versions
- port manifest list test cases from java
- fix various issues found by new cases
- support manifest writer to write data and delete content
wgtmac and others added 3 commits November 8, 2025 20:10
Copy link
Collaborator

@zhjwpku zhjwpku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wgtmac
Copy link
Member Author

wgtmac commented Nov 10, 2025

Thanks @zhjwpku and @liurenjie1024 for the review!

Copy link
Contributor

@dongxiao1198 dongxiao1198 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@wgtmac wgtmac merged commit 23ed851 into apache:main Nov 10, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants