feat(table): gate v3-only metadata fields on sub-v3 reads (#1006)#1069
Conversation
|
@laskoviymishka can you please review this PR? |
laskoviymishka
left a comment
There was a problem hiding this comment.
The goal makes sense, catching v3-only fields leaking into v1/v2 metadata is a real correctness concern.
But I’m a bit skeptical about the strict reject here, because it seems to diverge from the other Iceberg clients. Java reads next-row-id only when formatVersion >= 3 and otherwise defaults it. encryption-keys does not seem to have a version guard. PyIceberg drops unknown fields through Pydantic, and iceberg-rust does the same through serde.
So with this change, Go becomes the only client that errors on these fields. For example, a v3 → v2 downgrade with stray encryption-keys would still be readable by Java / Python / Rust, but unreadable by Go. That may be the right call, but I think it’s worth an explicit project-level decision before we make Go stricter than the reference behavior.
If we do want to be stricter, I’d at least use ErrInvalidMetadataFormatVersion rather than ErrInvalidMetadata. Callers usually read ErrInvalidMetadata as “corrupt file,” while this is more “field does not match format version.”
Design-wise, rejectV3OnlyFields taking each field as a typed parameter feels like it won’t scale well. V3 has more fields coming, and every new gate would require changing the helper signature, both call sites, and adding more flat tests. A variadic / slice-based helper would be easier to extend.
The tests could also be simpler as one table-driven subtest, and I think we’re missing a positive v3 case for encryption-keys.
Reject next-row-id and encryption-keys during v1/v2 metadata parsing with a descriptive error naming the field and required format version.
|
Adressed the below things:
On the strictness question : the spec explicitly scopes these fields to v3, so my thinking was that surfacing the mismatch early (with a clear, distinguishable error) helps consumers catch misconfigured metadata closer to the source. In the v3→v2 downgrade scenario, ideally the downgrade tool would strip v3 fields - silently accepting them downstream could mask that gap for Go consumers. What do you think? |
Use ErrInvalidMetadataFormatVersion, variadic v3FieldCheck helper, aux.FormatVersion at call sites, and table-driven tests.
05f6365 to
a3735a2
Compare
|
@laskoviymishka does this PR looks fine now? |
| return fmt.Errorf("%w: v3-only field '%s' present in v%d metadata", | ||
| ErrInvalidMetadataFormatVersion, c.name, version) |
There was a problem hiding this comment.
should we use errors.Join so that we can report all the fields we're rejecting at once instead of piece-meal?
There was a problem hiding this comment.
Done. I changed rejectV3OnlyFields to collect all violations and report them via errors.Join instead of returning on the first one. Added a test case ("reports all rejected fields") that verifies both next-row-id and encryption-keys are mentioned in the error when both are present simultaneously.
Use errors.Join to collect all v3-only field violations instead of returning on the first one encountered.
laskoviymishka
left a comment
There was a problem hiding this comment.
Looks good to me now. I’d like to track two follow-ups, neither blocking.
The bigger one is cross-client behavior. I read through TableMetadataParser.java while reviewing this, and Java doesn’t reject these fields: next-row-id is only read when formatVersion >= 3 and otherwise defaults to INITIAL_ROW_ID; encryption-keys has no version guard at all. PyIceberg and iceberg-rust also silently drop unknown fields. So Go is the only client hard-erroring here.
That doesn’t hurt today because v3 isn’t widely deployed, but once downgrade tooling exists, a v2 file with stray encryption-keys would read everywhere except Go. It’s worth deciding before v3 GA whether we want to keep that strictness. I think it’s defensible — catching mis-versioned metadata early has real value — but if we keep it, a short code comment explaining the intentional divergence would help future readers.
Smaller one: last-sequence-number is v2+ on the shared commonMetadata, but is silently accepted on v1 reads today. If we’re gating v3-only fields on sub-v3 reads, the same logic naturally extends to v2-only fields on v1.
Worth a follow-up issue to keep the gate principled.
Reject next-row-id and encryption-keys during v1/v2 metadata parsing with a descriptive error naming the field and required format version.
Closes: #1006