Spec: Clarify identity partition edge cases. #10835

Merged
merged 11 commits into from
Aug 6, 2024
17 changes: 13 additions & 4 deletions format/spec.md
@@ -241,7 +241,13 @@ Struct evolution requires the following rules for default values:

#### Column Projection

Columns in Iceberg data files are selected by field id. The table schema's column names and order may change after a data file is written, and projection must be done using field ids. If a field id is missing from a data file, its value for each row should be `null`.
Columns in Iceberg data files are selected by field id. The table schema's column names and order may change after a data file is written, and projection must be done using field ids.

Values for field ids which are not present in a data file must be resolved according to the following rules:

* Return the value from partition metadata if an [Identity Transform](#partition-transforms) exists for the field.
* Return the default value as defined in [Default values](#default-values) if it exists.
* Return `null` in all other cases.

For example, a file may be written with schema `1: a int, 2: b string, 3: c double` and read using projection schema `3: measurement, 2: name, 4: a`. This must select file columns `c` (renamed to `measurement`), `b` (now called `name`), and a column of `null` values called `a`; in that order.
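The resolution rules and the example above can be sketched in a few lines of Python. This is an illustrative sketch, not the reference implementation: `project_row`, `resolve_missing_field`, and the two dictionaries keyed by field id are hypothetical names, and the `default_values` mapping is a simplification of the spec's default-value rules.

```python
from typing import Any, Dict, List, Optional, Tuple


def resolve_missing_field(
    field_id: int,
    identity_partition_values: Dict[int, Any],
    default_values: Dict[int, Any],
) -> Any:
    """Resolve a value for a field id that is absent from the data file."""
    # Rule 1: value from partition metadata when an identity transform exists.
    if field_id in identity_partition_values:
        return identity_partition_values[field_id]
    # Rule 2: the field's default value, if one is defined (simplified here).
    if field_id in default_values:
        return default_values[field_id]
    # Rule 3: null in all other cases.
    return None


def project_row(
    file_row: Dict[int, Any],
    projection: List[Tuple[int, str]],
    identity_partition_values: Optional[Dict[int, Any]] = None,
    default_values: Optional[Dict[int, Any]] = None,
) -> List[Tuple[str, Any]]:
    """Project one row by field id, in the order given by the projection schema."""
    identity_partition_values = identity_partition_values or {}
    default_values = default_values or {}
    projected = []
    for field_id, name in projection:
        if field_id in file_row:
            # Present in the file: select by id, regardless of the current name.
            projected.append((name, file_row[field_id]))
        else:
            value = resolve_missing_field(
                field_id, identity_partition_values, default_values
            )
            projected.append((name, value))
    return projected


# File written with schema `1: a int, 2: b string, 3: c double`.
row = {1: 7, 2: "x", 3: 1.5}
# Read with projection schema `3: measurement, 2: name, 4: a`.
print(project_row(row, [(3, "measurement"), (2, "name"), (4, "a")]))
```

Keying every lookup by field id rather than by name is what makes renames and reordering safe: the name is only attached at output time.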

@@ -591,11 +597,10 @@ For example, an `events` table with a timestamp column named `ts` that is partit

Scan predicates are also used to filter data and delete files using column bounds and counts that are stored by field id in manifests. The same filter logic can be used for both data and delete files because both store metrics of the rows either inserted or deleted. If metrics show that a delete file has no rows that match a scan predicate, it may be ignored just as a data file would be ignored [2].

Data files that match the query filter must be read by the scan.

Note that for any snapshot, all file paths marked with "ADDED" or "EXISTING" may appear at most once across all manifest files in the snapshot. If a file path appears more than once, the results of the scan are undefined. Reader implementations may raise an error in this case, but are not required to do so.
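A reader that chooses to raise an error could detect the condition with a check along these lines. This is a minimal sketch under stated assumptions: manifest entries are modeled as hypothetical `(status, file_path)` pairs already collected from every manifest in the snapshot, and `find_duplicate_live_paths` is an illustrative name, not part of any Iceberg API.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def find_duplicate_live_paths(entries: Iterable[Tuple[str, str]]) -> List[str]:
    """Return file paths that appear more than once as ADDED or EXISTING."""
    # Only ADDED and EXISTING entries count; DELETED entries are ignored.
    counts = Counter(
        path for status, path in entries if status in ("ADDED", "EXISTING")
    )
    return sorted(path for path, n in counts.items() if n > 1)


# Hypothetical entries gathered across all manifests of one snapshot.
entries = [
    ("ADDED", "data/a.parquet"),
    ("EXISTING", "data/b.parquet"),
    ("DELETED", "data/b.parquet"),
    ("EXISTING", "data/a.parquet"),
]
duplicates = find_duplicate_live_paths(entries)
if duplicates:
    print(f"duplicate live file paths: {duplicates}")
```

Because the spec leaves the scan result undefined in this case, a reader may equally skip the check; the sketch only shows what raising an error would need to look for.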

Delete files that match the query filter must be applied to data files at read time, limited by the scope of the delete file using the following rules.

* A _position_ delete file must be applied to a data file when all of the following are true:
@@ -1393,4 +1398,8 @@ This section covers topics not required by the specification but recommendations
Iceberg supports two types of histories for tables. A history of previous "current snapshots" stored in ["snapshot-log" table metadata](#table-metadata-fields) and [parent-child lineage stored in "snapshots"](#table-metadata-fields). These two histories
might indicate different snapshot IDs for a specific timestamp. The discrepancies can be caused by a variety of table operations (e.g. updating the `current-snapshot-id` can be used to set the snapshot of a table to any arbitrary snapshot, which might have a lineage derived from a table branch or no lineage at all).

When processing point in time queries implementations should use "snapshot-log" metadata to lookup the table state at the given point in time. This ensures time-travel queries reflect the state of the table at the provided timestamp. For example a SQL query like `SELECT * FROM prod.db.table TIMESTAMP AS OF '1986-10-26 01:21:00Z';` would find the snapshot of the Iceberg table just prior to '1986-10-26 01:21:00 UTC' in the snapshot logs and use the metadata from that snapshot to perform the scan of the table. If no snapshot exists prior to the timestamp given or "snapshot-log" is not populated (it is an optional field), then systems should raise an informative error message about the missing metadata.
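The lookup described above might be sketched as follows. The sketch models "snapshot-log" as a list of hypothetical `(timestamp_ms, snapshot_id)` pairs and treats an entry at exactly the requested timestamp as a match; that boundary choice is an assumption, since the text only says "just prior to". `snapshot_id_as_of` is an illustrative name.

```python
from typing import List, Optional, Tuple


def snapshot_id_as_of(
    snapshot_log: Optional[List[Tuple[int, int]]], timestamp_ms: int
) -> int:
    """Find the snapshot that was current at timestamp_ms using the snapshot log."""
    if not snapshot_log:
        # "snapshot-log" is optional, so it may be missing or empty.
        raise ValueError("snapshot-log is not populated; cannot resolve timestamp")
    candidate = None
    for entry_ts, snapshot_id in sorted(snapshot_log):
        # Assumption: an entry at exactly timestamp_ms counts as a match.
        if entry_ts <= timestamp_ms:
            candidate = snapshot_id
        else:
            break
    if candidate is None:
        raise ValueError(f"no snapshot-log entry at or before {timestamp_ms}")
    return candidate


# Hypothetical log with three entries.
log = [(1000, 11), (2000, 22), (3000, 33)]
print(snapshot_id_as_of(log, 2500))
```

Walking the log rather than the parent-child lineage matters because the current snapshot can be set to an arbitrary snapshot, so the lineage may not reflect what was current at a given time.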

### Writing data files

All columns should be written to data files even if they introduce redundancy with metadata stored in manifest files (e.g. columns with identity partition transforms). Writing all columns provides redundancy in case of corruption or bugs in the metadata layer.