feat: support updating the unenforced clustering key via UpdateConfig#6812
Merged
Conversation
jackye1995
reviewed
May 17, 2026
| } | ||
|
|
||
| // Single-column clustering key, installed via the position key. | ||
| { |
Contributor
There was a problem hiding this comment.
nit: all these should be individual tests, prefer to not use different sections in a single test. You might also want to update your previous primary key PR about it.
jackye1995
reviewed
May 17, 2026
| // dataset. | ||
| use lance_core::datatypes::LANCE_UNENFORCED_CLUSTERING_KEY_POSITION; | ||
|
|
||
| async fn fresh_dataset(uri: &str) -> Dataset { |
Contributor
There was a problem hiding this comment.
this is repeated. I think it is okay to just repeat this. we also have datagen module that can be used to simplify this dataset creation, check other tests. You might also want to update your previous primary key PR about it.
924a93f to
5ccb448
Compare
…ateConfig The UpdateConfig transaction-apply path recomputes unenforced_primary_key_position from field metadata but not unenforced_clustering_key_position, so installing or removing an unenforced clustering key via update_field_metadata does not round-trip — the protobuf encoder reads the stale cached option. Recompute unenforced_clustering_key_position on apply, and treat the clustering key as a reserved, immutable schema property mirroring the unenforced primary key: once set it cannot be changed, and writing its reserved metadata key with an invalid position is rejected. Also split the metadata-update tests for both keys into individual tests and simplify their dataset setup with the datagen module.
5ccb448 to
1b37856
Compare
jackye1995
approved these changes
May 17, 2026
Contributor
jackye1995
left a comment
There was a problem hiding this comment.
👍 thanks for adding this
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
#6552 added the unenforced clustering key to the schema format, but the
UpdateConfigtransaction-apply path does not recompute the cachedField::unenforced_clustering_key_positionafter applying field metadataupdates — only
unenforced_primary_key_positionis recomputed (added for theprimary key in #6706). The protobuf encoder reads the cached option, not the
metadata HashMap, so installing or removing an unenforced clustering key via
update_field_metadatadoes not round-trip.This recomputes
unenforced_clustering_key_positionon apply, and treats theclustering key as a reserved, immutable schema property — mirroring the
unenforced primary key:
reserved metadata key, is rejected;
rejected rather than silently ignored;
metadata are unaffected.
The check runs on every apply, including conflict-rebase, so it also rejects
the concurrent-writer race.
LANCE_UNENFORCED_CLUSTERING_KEY_POSITIONis also re-exported fromlance_core::datatypes, alongside the primary-key constants.The metadata-update tests for both the primary and clustering keys are
restructured into individual tests with simpler
lance_datagen-based setup.