Skip to content

fix: reject changing the unenforced primary key once set#6810

Merged
jackye1995 merged 2 commits into
lance-format:mainfrom
touch-of-grey:ImmutablePrimaryKey
May 17, 2026
Merged

fix: reject changing the unenforced primary key once set#6810
jackye1995 merged 2 commits into
lance-format:mainfrom
touch-of-grey:ImmutablePrimaryKey

Conversation

@touch-of-grey
Copy link
Copy Markdown
Contributor

@touch-of-grey touch-of-grey commented May 17, 2026

Summary

The unenforced primary key is stored as schema field metadata, and the
UpdateConfig transaction-apply path applied field metadata updates with no
guard — so the key could be silently overridden, evolved, removed, or written
with a junk value.

This treats the lance-schema:unenforced-primary-key* metadata keys as
reserved, validated in the transaction-apply path:

  • once a primary key is set, any commit that changes it — or writes a reserved
    primary-key metadata key at all — is rejected;
  • writing a reserved key with a value that is not a valid marker is rejected
    rather than silently ignored;
  • a valid first install and updates to other field metadata are unaffected.

The check runs on every apply, including conflict-rebase, so it also rejects
the concurrent-writer race where two writers set the key on different columns
(their field-metadata updates don't conflict, so the loser would otherwise
rebase and corrupt the key).

The metadata.rs primary-key tests are updated accordingly.

Validate in the UpdateConfig transaction-apply path that a commit does
not change an already-set unenforced primary key. This runs on every
apply including conflict-rebase, so it rejects both a direct override
and the concurrent-writer race where a transaction is retried onto a
base manifest that already has a primary key.
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@github-actions github-actions Bot added the enhancement New feature or request label May 17, 2026
@jackye1995 jackye1995 changed the title feat(dataset): reject changing the unenforced primary key once set fix: reject changing the unenforced primary key once set May 17, 2026
@github-actions github-actions Bot added the bug Something isn't working label May 17, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 17, 2026

Codecov Report

❌ Patch coverage is 96.95122% with 5 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance/src/dataset/metadata.rs 97.03% 0 Missing and 4 partials ⚠️
rust/lance/src/dataset/transaction.rs 96.55% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

Comment thread rust/lance/src/dataset/metadata.rs Outdated
for falsy in ["no", "false", "0", "anything-else"] {

// Non-matching values must not be treated as a PK marker, so the
// field never becomes part of the primary key.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not only we should not treat it as a pk marker, we should error with something like the primary key is a reserved key.

Comment thread rust/lance/src/dataset/metadata.rs Outdated

// Removing the keys must clear the cached option, otherwise the
// protobuf would still encode a stale PK marker on next commit.
// Re-applying the identical primary key is a no-op and allowed; this
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is confusing, we should just not allow setting it like the rest of the behavior, instead of allowing it to succeed.

…mary key

Address review on lance-format#6810: the immutability guard now rejects any write
that touches the reserved primary key metadata keys once a key is set
(not just changes that alter the key), and rejects writing those keys
with a value that is not a valid marker.
Copy link
Copy Markdown
Contributor

@jackye1995 jackye1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the fix!

@jackye1995 jackye1995 merged commit 01dccbb into lance-format:main May 17, 2026
30 checks passed
jackye1995 pushed a commit to lancedb/lancedb that referenced this pull request May 17, 2026
## Summary

Adds `Table::set_unenforced_primary_key` — records a single column as
the
table's unenforced primary key in Lance schema field metadata.
"Unenforced"
means LanceDB does not check uniqueness on write; the key is metadata
that
`merge_insert` consumes.

- Single-column only; the column must exist and have a supported dtype
(Int32, Int64, Utf8, LargeUtf8, Binary, LargeBinary, FixedSizeBinary).
The
API accepts an iterable for binding ergonomics but requires exactly one
  column — compound keys are rejected.
- The primary key is immutable: calling this on a table that already has
an
unenforced primary key is rejected. Concurrent writers racing to set the
key
  fail at commit time rather than silently overriding it.
- `RemoteTable` returns `NotSupported`.
- Bindings: Python (`AsyncTable`, `LanceTable`, `RemoteTable`) and
TypeScript
  (`Table.setUnenforcedPrimaryKey`).

## Context

Split out from #3354 per review feedback, so the unenforced primary key
and the
`merge_insert` sharding spec land as separate reviewable PRs.

No Lance dependency bump — `main` is already on v7.0.0-beta.10, which
includes
the field-metadata round-trip fix the API relies on. Enforcing
primary-key
immutability at the Lance commit layer (so the cross-column concurrent
race is
also rejected) is a companion Lance change: lance-format/lance#6810.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants