feat(table): gate v3-only metadata fields on sub-v3 reads (#1006) by tanmayrauth · Pull Request #1069 · apache/iceberg-go

tanmayrauth · 2026-05-11T17:54:48Z

Reject next-row-id and encryption-keys during v1/v2 metadata parsing with a descriptive error naming the field and required format version.

Closes: #1006

tanmayrauth · 2026-05-12T15:20:04Z

@laskoviymishka can you please review this PR?

laskoviymishka

The goal makes sense, catching v3-only fields leaking into v1/v2 metadata is a real correctness concern.

But I’m a bit skeptical about the strict reject here, because it seems to diverge from the other Iceberg clients. Java reads next-row-id only when formatVersion >= 3 and otherwise defaults it. encryption-keys does not seem to have a version guard. PyIceberg drops unknown fields through Pydantic, and iceberg-rust does the same through serde.

So with this change, Go becomes the only client that errors on these fields. For example, a v3 → v2 downgrade with stray encryption-keys would still be readable by Java / Python / Rust, but unreadable by Go. That may be the right call, but I think it’s worth an explicit project-level decision before we make Go stricter than the reference behavior.
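For illustration, the lenient behavior described for the other clients could look roughly like this in Go. This is only a sketch, not code from any client: rawMetadata, lenientNextRowID, and initialRowID are hypothetical names, and the default of 0 is an assumption standing in for Java's INITIAL_ROW_ID.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawMetadata is a hypothetical, trimmed-down view of the metadata JSON.
// Fields that are not declared here are silently ignored by encoding/json.
type rawMetadata struct {
	FormatVersion int    `json:"format-version"`
	NextRowID     *int64 `json:"next-row-id,omitempty"`
}

// initialRowID is an assumed default used when next-row-id is not honored.
const initialRowID int64 = 0

// lenientNextRowID honors next-row-id only for format version 3 and above,
// and falls back to the default otherwise instead of returning an error.
func lenientNextRowID(data []byte) (int64, error) {
	var m rawMetadata
	if err := json.Unmarshal(data, &m); err != nil {
		return 0, err
	}
	if m.FormatVersion >= 3 && m.NextRowID != nil {
		return *m.NextRowID, nil
	}
	return initialRowID, nil
}

func main() {
	v2 := []byte(`{"format-version": 2, "next-row-id": 42, "encryption-keys": []}`)
	v3 := []byte(`{"format-version": 3, "next-row-id": 42}`)

	a, _ := lenientNextRowID(v2)
	b, _ := lenientNextRowID(v3)
	fmt.Println(a, b) // prints "0 42"
}
```

Under this approach a v2 file with stray v3 fields stays readable, which is the behavior the strict gate in this PR departs from.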

If we do want to be stricter, I’d at least use ErrInvalidMetadataFormatVersion rather than ErrInvalidMetadata. Callers usually read ErrInvalidMetadata as “corrupt file,” while this is more “field does not match format version.”
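To make the distinction concrete, here is a rough sketch of how a caller might branch on the two sentinels with errors.Is. The error variables below are local stand-ins with assumed messages, not the definitions from the table package, and classify and newVersionMismatch are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the sentinels discussed above; the messages are assumptions.
var (
	ErrInvalidMetadata              = errors.New("invalid metadata")
	ErrInvalidMetadataFormatVersion = errors.New("invalid metadata format version")
)

// newVersionMismatch wraps the format-version sentinel with field details.
func newVersionMismatch(field string, version int) error {
	return fmt.Errorf("%w: v3-only field '%s' present in v%d metadata",
		ErrInvalidMetadataFormatVersion, field, version)
}

// classify shows how a caller can tell the two failure modes apart.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrInvalidMetadataFormatVersion):
		return "version mismatch"
	case errors.Is(err, ErrInvalidMetadata):
		return "corrupt metadata"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(classify(newVersionMismatch("encryption-keys", 2)))
	fmt.Println(classify(fmt.Errorf("%w: truncated file", ErrInvalidMetadata)))
}
```

Because the wrapped error keeps its sentinel, a caller can react to a version mismatch without string matching on the message.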

Design-wise, rejectV3OnlyFields taking each field as a typed parameter feels like it won’t scale well. V3 has more fields coming, and every new gate would require changing the helper signature, both call sites, and adding more flat tests. A variadic / slice-based helper would be easier to extend.

The tests could also be simpler as one table-driven subtest, and I think we’re missing a positive v3 case for encryption-keys.

tanmayrauth · 2026-05-13T22:29:40Z

Addressed the following:

- Switched to ErrInvalidMetadataFormatVersion; agreed that ErrInvalidMetadata implies corruption.
- Refactored to a variadic ...v3FieldCheck parameter so new fields are a one-line addition.
- Call sites now read aux.FormatVersion instead of the literals 1/2.
- Collapsed the tests into a single table-driven subtest, and added the v3 + encryption-keys positive case and the empty encryption-keys: [] boundary case.
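A minimal sketch of what a variadic gate with a table-driven check might look like follows. The error format string matches the snippet quoted later in this thread; everything else (the exact signature, the present field, the stand-in sentinel, and the cases in main) is assumed for illustration and is not taken from the merged code.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-in for the sentinel in the table package; the message is an assumption.
var ErrInvalidMetadataFormatVersion = errors.New("invalid metadata format version")

// v3FieldCheck pairs a field name with whether it appeared in the parsed JSON.
// The present field is a hypothetical way of carrying that information.
type v3FieldCheck struct {
	name    string
	present bool
}

// rejectV3OnlyFields returns an error for the first v3-only field found in
// metadata whose format version is below 3.
func rejectV3OnlyFields(version int, checks ...v3FieldCheck) error {
	if version >= 3 {
		return nil
	}
	for _, c := range checks {
		if c.present {
			return fmt.Errorf("%w: v3-only field '%s' present in v%d metadata",
				ErrInvalidMetadataFormatVersion, c.name, version)
		}
	}
	return nil
}

func main() {
	// Table-driven cases: adding a scenario is a one-line change.
	cases := []struct {
		name    string
		version int
		checks  []v3FieldCheck
		wantErr bool
	}{
		{"v2 with next-row-id", 2, []v3FieldCheck{{"next-row-id", true}}, true},
		{"v1 with encryption-keys", 1, []v3FieldCheck{{"encryption-keys", true}}, true},
		{"v3 with encryption-keys", 3, []v3FieldCheck{{"encryption-keys", true}}, false},
		{"v2 without v3 fields", 2, []v3FieldCheck{{"next-row-id", false}}, false},
	}
	for _, tc := range cases {
		err := rejectV3OnlyFields(tc.version, tc.checks...)
		fmt.Printf("%s: gotErr=%v wantErr=%v\n", tc.name, err != nil, tc.wantErr)
	}
}
```

With this shape, gating another v3-only field means appending one v3FieldCheck at the call site rather than changing the helper signature.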

On the strictness question: the spec explicitly scopes these fields to v3, so my thinking was that surfacing the mismatch early (with a clear, distinguishable error) helps consumers catch misconfigured metadata closer to the source. In the v3→v2 downgrade scenario, the downgrade tool should ideally strip v3 fields; silently accepting them downstream could mask that gap for Go consumers. What do you think?

tanmayrauth · 2026-05-16T21:12:02Z

@laskoviymishka does this PR look fine now?

zeroshade · 2026-05-16T22:20:43Z

+			return fmt.Errorf("%w: v3-only field '%s' present in v%d metadata",
+				ErrInvalidMetadataFormatVersion, c.name, version)


Should we use errors.Join so that we can report all the rejected fields at once instead of piecemeal?

tanmayrauth

Done. I changed rejectV3OnlyFields to collect all violations and report them via errors.Join instead of returning on the first one. I also added a test case ("reports all rejected fields") that verifies both next-row-id and encryption-keys are mentioned in the error when both are present.
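The collect-then-join approach can be sketched as follows. This is an approximation under assumptions, not the merged implementation: the v3FieldCheck shape, the stand-in sentinel, and the mentionsAll helper are hypothetical. errors.Join requires Go 1.20 or later and returns nil when given no errors.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Stand-in for the sentinel in the table package; the message is an assumption.
var ErrInvalidMetadataFormatVersion = errors.New("invalid metadata format version")

// v3FieldCheck pairs a field name with whether it appeared in the parsed JSON.
type v3FieldCheck struct {
	name    string
	present bool
}

// rejectV3OnlyFields collects one error per offending field and joins them,
// so a single read reports every violation.
func rejectV3OnlyFields(version int, checks ...v3FieldCheck) error {
	if version >= 3 {
		return nil
	}
	var errs []error
	for _, c := range checks {
		if c.present {
			errs = append(errs, fmt.Errorf("%w: v3-only field '%s' present in v%d metadata",
				ErrInvalidMetadataFormatVersion, c.name, version))
		}
	}
	return errors.Join(errs...) // nil when errs is empty
}

// mentionsAll reports whether the error text names every given field.
func mentionsAll(err error, names ...string) bool {
	if err == nil {
		return false
	}
	for _, n := range names {
		if !strings.Contains(err.Error(), n) {
			return false
		}
	}
	return true
}

func main() {
	err := rejectV3OnlyFields(2,
		v3FieldCheck{name: "next-row-id", present: true},
		v3FieldCheck{name: "encryption-keys", present: true},
	)
	fmt.Println(err)
	fmt.Println(errors.Is(err, ErrInvalidMetadataFormatVersion)) // true
	fmt.Println(mentionsAll(err, "next-row-id", "encryption-keys")) // true
}
```

A joined error still satisfies errors.Is for the wrapped sentinel, so callers that match on ErrInvalidMetadataFormatVersion keep working.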

laskoviymishka

Looks good to me now. I’d like to track two follow-ups, neither blocking.

The bigger one is cross-client behavior. I read through TableMetadataParser.java while reviewing this, and Java doesn’t reject these fields: next-row-id is only read when formatVersion >= 3 and otherwise defaults to INITIAL_ROW_ID; encryption-keys has no version guard at all. PyIceberg and iceberg-rust also silently drop unknown fields. So Go is the only client hard-erroring here.

That doesn’t hurt today because v3 isn’t widely deployed, but once downgrade tooling exists, a v2 file with stray encryption-keys would read everywhere except Go. It’s worth deciding before v3 GA whether we want to keep that strictness. I think it’s defensible — catching mis-versioned metadata early has real value — but if we keep it, a short code comment explaining the intentional divergence would help future readers.

Smaller one: last-sequence-number is v2+ on the shared commonMetadata, but is silently accepted on v1 reads today. If we’re gating v3-only fields on sub-v3 reads, the same logic naturally extends to v2-only fields on v1.

Worth a follow-up issue to keep the gate principled.
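One way the gate could be generalized to cover v2-only fields on v1 reads is to attach a minimum format version to each field. The sketch below is purely hypothetical: versionedField, rejectFieldsAboveVersion, and the stand-in sentinel are illustrative names, not anything proposed in this PR or the follow-up issues.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-in for the sentinel in the table package; the message is an assumption.
var ErrInvalidMetadataFormatVersion = errors.New("invalid metadata format version")

// versionedField records the lowest format version that allows a field.
type versionedField struct {
	name       string
	minVersion int
	present    bool
}

// rejectFieldsAboveVersion reports every present field whose minimum format
// version is higher than the version of the metadata being read.
func rejectFieldsAboveVersion(version int, fields ...versionedField) error {
	var errs []error
	for _, f := range fields {
		if f.present && version < f.minVersion {
			errs = append(errs, fmt.Errorf("%w: field '%s' requires v%d but metadata is v%d",
				ErrInvalidMetadataFormatVersion, f.name, f.minVersion, version))
		}
	}
	return errors.Join(errs...)
}

func main() {
	fields := []versionedField{
		{name: "last-sequence-number", minVersion: 2, present: true},
		{name: "next-row-id", minVersion: 3, present: false},
	}
	fmt.Println(rejectFieldsAboveVersion(1, fields...) != nil) // true: v2 field on a v1 read
	fmt.Println(rejectFieldsAboveVersion(2, fields...) != nil) // false: allowed on v2
}
```

The same table could then drive both the v3-only and the v2-only checks, which keeps the rule in one place.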

tanmayrauth requested a review from zeroshade as a code owner May 11, 2026 17:54

laskoviymishka requested changes May 13, 2026

feat(table): gate v3-only metadata fields on sub-v3 reads (apache#1006)

17fcd67

Reject next-row-id and encryption-keys during v1/v2 metadata parsing with a descriptive error naming the field and required format version.

refactor(table): address review feedback on v3 field gating

a3735a2

Use ErrInvalidMetadataFormatVersion, variadic v3FieldCheck helper, aux.FormatVersion at call sites, and table-driven tests.

tanmayrauth force-pushed the feat/1006-gate-v3-metadata-fields branch from 05f6365 to a3735a2 Compare May 13, 2026 22:30

ci: retrigger pipeline

b461777

tanmayrauth closed this May 14, 2026

tanmayrauth deleted the feat/1006-gate-v3-metadata-fields branch May 14, 2026 17:55

tanmayrauth restored the feat/1006-gate-v3-metadata-fields branch May 14, 2026 18:19

tanmayrauth reopened this May 14, 2026

zeroshade reviewed May 16, 2026

refactor(table): report all rejected v3 fields in a single error

72ecede

Use errors.Join to collect all v3-only field violations instead of returning on the first one encountered.

laskoviymishka approved these changes May 17, 2026

laskoviymishka mentioned this pull request May 17, 2026

discussion: align v1/v2 metadata parsing with Java/PyIceberg on v3-only fields #1086

Open

laskoviymishka merged commit 2e6b2b6 into apache:main May 17, 2026
14 checks passed

This was referenced May 17, 2026

feat(table): gate v2-only metadata fields on v1 reads #1087

Open

feat(table): gate v2-only metadata fields on v1 reads #1088

Open

laskoviymishka merged 4 commits into apache:main from tanmayrauth:feat/1006-gate-v3-metadata-fields
