Protocol Specification for Row Commit Versions #1747

Closed

tomvanbussel
Contributor

Description

This PR adds the protocol specification changes for Row Commit Versions, as proposed in #1715.

In particular, it makes the following changes:

  • Renames the rowIds feature to rowTracking.
  • Renames the delta.enableRowIds property to delta.enableRowTracking.
  • Renames and moves the preservedRowIds flag in rowIdHighWaterMark to delta.rowTracking.preserved in the tags of commitInfo.
  • Refactors the specification of Row IDs.
  • Adds the specification for Row Commit Versions.
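
As a rough illustration of the renames listed above, the following Python sketch maps the old identifiers to the new ones and shows where the preserved flag would live after the move. The helper names `rename_identifier` and `commit_info_tags` are hypothetical, and the encoding of the flag as the string `"true"`/`"false"` is an assumption, not something stated in this PR.

```python
# Old identifier -> new identifier, taken from the list of changes above.
RENAMES = {
    "rowIds": "rowTracking",                          # table feature name
    "delta.enableRowIds": "delta.enableRowTracking",  # table property
}


def rename_identifier(name: str) -> str:
    """Return the new name for a renamed identifier, or the name unchanged."""
    return RENAMES.get(name, name)


def commit_info_tags(preserved: bool) -> dict:
    """Build the `tags` entry of a commitInfo action (hypothetical helper).

    Assumption: tag values are strings, so the flag is written as
    "true" or "false".
    """
    return {"delta.rowTracking.preserved": str(preserved).lower()}


if __name__ == "__main__":
    print(rename_identifier("rowIds"))
    print(commit_info_tags(True))
```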

How was this patch tested?

n/a

Does this PR introduce any user-facing changes?

n/a

Collaborator

bart-samwel left a comment

Nice work! Are you sure that the existing table feature name for Row IDs has not made it into any Delta Lake release?

allisonport-db pushed a commit that referenced this pull request May 30, 2023
This PR implements part of the changes proposed in #1747. It adds the `defaultRowCommitVersion` field to `AddFile` and `RemoveFile`, and it makes sure that it's populated during commits and read during log replay. It **does not** handle any transaction conflicts yet.
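
A minimal sketch of the "populated during commits" step described above, written in Python for illustration rather than in the project's own implementation language. It assumes actions are plain dicts shaped like `{"add": {...}}`, and that an `add` action without a `defaultRowCommitVersion` should receive the version of the commit being written; the function name is hypothetical, and conflict handling is left out, as in the commit itself.

```python
import copy


def populate_default_row_commit_version(actions, commit_version):
    """Return copies of `actions` with defaultRowCommitVersion filled in.

    Assumption: only `add` actions that do not already carry the field
    are assigned the version of the commit being written.
    """
    result = []
    for action in actions:
        action = copy.deepcopy(action)  # leave the caller's actions untouched
        add = action.get("add")
        if add is not None and "defaultRowCommitVersion" not in add:
            add["defaultRowCommitVersion"] = commit_version
        result.append(action)
    return result


if __name__ == "__main__":
    pending = [
        {"add": {"path": "part-0001.parquet"}},
        {"add": {"path": "part-0002.parquet", "defaultRowCommitVersion": 3}},
    ]
    for action in populate_default_row_commit_version(pending, commit_version=7):
        print(action)
```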

Closes #1781

GitOrigin-RevId: 781617fd33b3be2f39ac8ab36aa0b741ba99c97e
tdas pushed a commit that referenced this pull request Nov 14, 2023
## Description

This PR fixes a terminology issue in the Delta protocol: the term `supported` is now used to describe a table feature name being listed in the table protocol's `readerFeatures` and/or `writerFeatures`. This word was chosen to emphasize that, in such a scenario, the Delta table *may* use the listed table features but is not forced to do so.

For example, when `appendOnly` is listed in a table's protocol, the table may or may not be append-only, depending on the existence and value of the table property `delta.appendOnly`. However, writers must recognize the table feature `appendOnly` and know that the table property should be checked before writing to this table.
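
The distinction between a feature being supported and being in use can be sketched as follows. This is an illustrative Python example, not code from Delta Lake: the dict shapes for the protocol and the table configuration are simplified assumptions, the function names are hypothetical, and only the `writerFeatures` listing is considered.

```python
def is_feature_supported(protocol: dict, feature: str) -> bool:
    """A feature is 'supported' when it is listed in the table protocol."""
    return feature in protocol.get("writerFeatures", [])


def is_append_only(protocol: dict, configuration: dict) -> bool:
    """The table is append-only only if the feature is supported AND the
    table property delta.appendOnly is present and set to true."""
    if not is_feature_supported(protocol, "appendOnly"):
        return False
    return configuration.get("delta.appendOnly", "false").lower() == "true"


if __name__ == "__main__":
    protocol = {"writerFeatures": ["appendOnly"]}
    # Supported, but the property is absent: the table is not append-only.
    print(is_append_only(protocol, {}))
    # Supported and the property is set: the table is append-only.
    print(is_append_only(protocol, {"delta.appendOnly": "true"}))
```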

This PR did not touch the Row ID/Row Tracking sections, as it's handled by another PR: #1747.

Closes #1780

Co-authored-by: Lars Kroll <lars.kroll@databricks.com>
Co-authored-by: Bart Samwel <bart.samwel@databricks.com>
Signed-off-by: Paddy Xu <xupaddy@gmail.com>
GitOrigin-RevId: 8d9b86262e91a88a85388c6333c5ef7ac296931e