Accept Arrow null (Iceberg v3 unknown) physical types in Parquet schema validation and add test #12

Open
manuzhang wants to merge 1 commit into main from codex/support-unknown-v3-type

@manuzhang
Owner

Motivation

  • Iceberg v3 represents unknown values as nulls in physical Parquet files, so schema-evolution validation must treat Arrow null physical types as compatible with any projected Iceberg type.

Description

  • Updated ValidateParquetSchemaEvolution in parquet_schema_util.cc to allow Arrow null (::arrow::Type::NA) physical types and return success for them.
  • Adjusted anonymous namespace boundaries and added a forward declaration for ProjectNested to keep helper functions organized.
  • Exported the ValidateParquetSchemaEvolution declaration in parquet_schema_util_internal.h so it is available where needed.
  • Added unit test ValidateSchemaEvolutionAllowsNullPhysicalType in parquet_schema_test.cc that verifies a Parquet field with ::arrow::null() is accepted.
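
A minimal sketch of the rule described above, not the code in parquet_schema_util.cc. `PhysicalType`, `ProjectedType`, `Status`, and `ValidateSchemaEvolutionSketch` are hypothetical stand-ins for `::arrow::Type`, the Iceberg type IDs, and the real `ValidateParquetSchemaEvolution`, whose signature is not shown in this description. The promotions modeled are only the ones the existing tests mention (int->long, float->double).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; the real Arrow and Iceberg enums have many more members.
enum class PhysicalType { kNull, kInt32, kInt64, kFloat, kDouble, kString };
enum class ProjectedType { kInt, kLong, kFloat, kDouble, kString };

// Simplified status: an empty error message means success.
struct Status {
  std::string error;
  bool ok() const { return error.empty(); }
};

// A null physical column is accepted for any projected type. Otherwise only
// exact matches and the int->long / float->double promotions pass.
Status ValidateSchemaEvolutionSketch(PhysicalType physical, ProjectedType projected) {
  if (physical == PhysicalType::kNull) {
    return {};  // Null-only column: every value reads back as null.
  }
  switch (projected) {
    case ProjectedType::kInt:
      if (physical == PhysicalType::kInt32) return {};
      break;
    case ProjectedType::kLong:
      if (physical == PhysicalType::kInt32 || physical == PhysicalType::kInt64) return {};
      break;
    case ProjectedType::kFloat:
      if (physical == PhysicalType::kFloat) return {};
      break;
    case ProjectedType::kDouble:
      if (physical == PhysicalType::kFloat || physical == PhysicalType::kDouble) return {};
      break;
    case ProjectedType::kString:
      if (physical == PhysicalType::kString) return {};
      break;
  }
  return {"incompatible physical type for projected field"};
}

// Mirrors the shape of the tests listed under Testing, using plain asserts.
void CheckExamples() {
  assert(ValidateSchemaEvolutionSketch(PhysicalType::kNull, ProjectedType::kString).ok());
  assert(ValidateSchemaEvolutionSketch(PhysicalType::kInt32, ProjectedType::kLong).ok());
  assert(!ValidateSchemaEvolutionSketch(PhysicalType::kString, ProjectedType::kInt).ok());
}
```

The null check comes first so that it short-circuits the per-type compatibility table. That matches the stated intent that a null physical type is compatible with any projected type.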

Testing

  • Ran the ParquetSchemaProjectionTest suite including the new ValidateSchemaEvolutionAllowsNullPhysicalType test, and all tests passed.
  • Existing projection tests (e.g. int->long, float->double, incompatible types) were executed and remained successful.

Codex Task

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8676ef1e19

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/iceberg/parquet/parquet_schema_util.cc
@manuzhang
Owner Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 06a8959bd0

Comment thread src/iceberg/parquet/parquet_schema_util.cc Outdated
Comment thread src/iceberg/avro/avro_schema_util.cc Outdated
@manuzhang
Owner Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 54a5dc8f87

Comment thread src/iceberg/parquet/parquet_schema_util.cc
Comment thread src/iceberg/avro/avro_schema_util.cc
@manuzhang
Owner Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b3f9ce333b

Comment thread src/iceberg/json_serde.cc
@manuzhang
Owner Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 200922726d

Comment thread src/iceberg/schema_util.cc
@manuzhang
Owner Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 691b15b45e

Comment thread src/iceberg/schema_util.cc
Comment thread src/iceberg/schema_internal.cc
@manuzhang
Owner Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Chef's kiss.

Add an Iceberg unknown primitive type and JSON, Arrow, Avro, Parquet, projection, and data path support for null-only unknown fields. Enforce optionality invariants so required projections cannot be materialized from unknown/null-only fields.

Co-authored-by: Codex <codex@openai.com>
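
The optionality invariant in the commit message can be illustrated roughly as follows. This is a sketch under assumptions: `FieldSource` and `ResolveSource` are hypothetical names, and the projection code touched by this PR may be structured quite differently.

```cpp
#include <cassert>

// Hypothetical outcome of resolving one projected field against a data file.
enum class FieldSource {
  kFromFile,  // Read values from the physical column.
  kAllNull,   // Materialize the field as all nulls.
  kInvalid,   // Projection must be rejected.
};

// An unknown/null-only column carries no values, so it can only back an
// optional projected field. A required field has nothing to read and fails.
FieldSource ResolveSource(bool file_column_is_null_only, bool projected_is_optional) {
  if (!file_column_is_null_only) {
    return FieldSource::kFromFile;
  }
  return projected_is_optional ? FieldSource::kAllNull : FieldSource::kInvalid;
}
```

For example, `ResolveSource(true, false)` yields `FieldSource::kInvalid`, which is the case the commit message says must not be materialized.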
@manuzhang force-pushed the codex/support-unknown-v3-type branch from 71a8faa to 5d6eb36 on May 20, 2026 10:58