feat: support updating the unenforced clustering key via UpdateConfig #6812

Merged
jackye1995 merged 1 commit into lance-format:main from touch-of-grey:ClusteringKey
May 17, 2026

Conversation


@touch-of-grey touch-of-grey commented May 17, 2026

Summary

#6552 added the unenforced clustering key to the schema format, but the
UpdateConfig transaction-apply path does not recompute the cached
Field::unenforced_clustering_key_position after applying field metadata
updates — only unenforced_primary_key_position is recomputed (added for the
primary key in #6706). The protobuf encoder reads the cached option, not the
metadata HashMap, so installing or removing an unenforced clustering key via
update_field_metadata does not round-trip.
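
The stale-cache problem described above can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not Lance's code: the `Field` struct, the `POSITION_KEY` string, and the `encoded_position` and `recompute_position` helpers are hypothetical stand-ins for the real field type, the `LANCE_UNENFORCED_CLUSTERING_KEY_POSITION` constant, the protobuf encoder, and the apply-path logic.

```rust
use std::collections::HashMap;

// Hypothetical placeholder; the real key is the value of
// LANCE_UNENFORCED_CLUSTERING_KEY_POSITION in lance_core::datatypes.
const POSITION_KEY: &str = "example:unenforced-clustering-key:position";

// Simplified stand-in for a schema field: free-form metadata plus a cached,
// parsed copy of the clustering key position.
#[derive(Debug, Default)]
struct Field {
    metadata: HashMap<String, String>,
    unenforced_clustering_key_position: Option<u32>,
}

impl Field {
    // Models an encoder that reads the cached option, not the metadata map.
    fn encoded_position(&self) -> Option<u32> {
        self.unenforced_clustering_key_position
    }

    // Re-derive the cache from the metadata map. Without this step after a
    // metadata update, the encoder keeps seeing the old cached value.
    fn recompute_position(&mut self) {
        self.unenforced_clustering_key_position = self
            .metadata
            .get(POSITION_KEY)
            .and_then(|v| v.parse::<u32>().ok());
    }
}
```

In this model, writing `POSITION_KEY` into `metadata` leaves `encoded_position()` unchanged until `recompute_position()` runs, which is the gap the fix closes.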

This recomputes unenforced_clustering_key_position on apply, and treats the
clustering key as a reserved, immutable schema property — mirroring the
unenforced primary key:

  • once a clustering key is set, any commit that changes it, or that writes its
    reserved metadata key, is rejected;
  • writing the reserved key with a value that is not a valid position is
    rejected rather than silently ignored;
  • a valid first install (single- or multi-column) and updates to other field
    metadata are unaffected.
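
The three rules above can be sketched as a single validation function. Everything here is illustrative: `POSITION_KEY`, `UpdateError`, `parse_position`, and `check_update` are hypothetical names, and the sketch assumes that a valid position is a base-10 integer of at least 1, which the description does not spell out.

```rust
use std::collections::HashMap;

// Hypothetical placeholder for the reserved metadata key.
const POSITION_KEY: &str = "example:unenforced-clustering-key:position";

#[derive(Debug, PartialEq)]
enum UpdateError {
    // A clustering key is already set and the update writes the reserved key.
    KeyIsImmutable,
    // The reserved key was written with a value that is not a valid position.
    InvalidPosition(String),
}

// Assumption for this sketch: a valid position is a base-10 integer >= 1.
fn parse_position(raw: &str) -> Option<u32> {
    raw.parse::<u32>().ok().filter(|p| *p >= 1)
}

// Validate one field's metadata update. `key_already_set` says whether the
// schema had a clustering key before this commit. Returns the position to
// cache for the field, or None if the update does not touch the key.
fn check_update(
    key_already_set: bool,
    update: &HashMap<String, String>,
) -> Result<Option<u32>, UpdateError> {
    match update.get(POSITION_KEY) {
        // Other field metadata is unaffected.
        None => Ok(None),
        // Once set, any write to the reserved key is rejected.
        Some(_) if key_already_set => Err(UpdateError::KeyIsImmutable),
        // First install: reject invalid values instead of ignoring them.
        Some(raw) => parse_position(raw)
            .map(Some)
            .ok_or_else(|| UpdateError::InvalidPosition(raw.clone())),
    }
}
```

A multi-column first install corresponds to calling `check_update` once per field in the same commit, each time with `key_already_set` set to `false`.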

The check runs on every apply, including conflict-rebase, so it also rejects
the concurrent-writer race.
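
A minimal sketch of why checking on every apply covers the race: two writers read the same version, both prepare an install, and the second is re-applied on top of the first writer's commit. The `SchemaState`, `InstallKey`, and `apply` names, the column names, and the error string are all hypothetical and only model the behavior described here.

```rust
// Hypothetical model: the schema state is just the installed clustering key
// columns, as (column name, position) pairs.
#[derive(Debug, Clone, Default, PartialEq)]
struct SchemaState {
    clustering_key: Vec<(String, u32)>,
}

// A pending transaction that wants to install a clustering key.
#[derive(Debug, Clone)]
struct InstallKey {
    columns: Vec<(String, u32)>,
}

// Apply the transaction to whatever state is current at commit time.
// The immutability check lives here, so a rebase re-runs it automatically.
fn apply(state: &SchemaState, txn: &InstallKey) -> Result<SchemaState, String> {
    if !state.clustering_key.is_empty() {
        return Err("unenforced clustering key is already set".to_string());
    }
    Ok(SchemaState {
        clustering_key: txn.columns.clone(),
    })
}
```

Both transactions pass `apply` against the version they read. After the first one commits, re-applying the second against the committed state returns an error instead of silently replacing the key.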

LANCE_UNENFORCED_CLUSTERING_KEY_POSITION is also re-exported from
lance_core::datatypes, alongside the primary-key constants.

The metadata-update tests for both the primary and clustering keys are
restructured into individual tests with simpler lance_datagen-based setup.


@claude claude Bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@github-actions github-actions Bot added the "enhancement" label (New feature or request) on May 17, 2026
Comment thread rust/lance/src/dataset/metadata.rs Outdated
}

// Single-column clustering key, installed via the position key.
{

@jackye1995 jackye1995 May 17, 2026


nit: these should all be individual tests; prefer not to use different sections in a single test. You might also want to update your previous primary key PR accordingly.

Comment thread rust/lance/src/dataset/metadata.rs Outdated
// dataset.
use lance_core::datatypes::LANCE_UNENFORCED_CLUSTERING_KEY_POSITION;

async fn fresh_dataset(uri: &str) -> Dataset {

This is repeated. I think it is okay to just repeat this. We also have a datagen module that can be used to simplify this dataset creation; check the other tests. You might also want to update your previous primary key PR accordingly.

…ateConfig

The UpdateConfig transaction-apply path recomputes
unenforced_primary_key_position from field metadata but not
unenforced_clustering_key_position, so installing or removing an
unenforced clustering key via update_field_metadata does not round-trip
— the protobuf encoder reads the stale cached option.

Recompute unenforced_clustering_key_position on apply, and treat the
clustering key as a reserved, immutable schema property mirroring the
unenforced primary key: once set it cannot be changed, and writing its
reserved metadata key with an invalid position is rejected.

Also split the metadata-update tests for both keys into individual
tests and simplify their dataset setup with the datagen module.
@jackye1995 jackye1995 changed the title from "feat(dataset): support updating the unenforced clustering key via UpdateConfig" to "feat: support updating the unenforced clustering key via UpdateConfig" on May 17, 2026

@jackye1995 jackye1995 left a comment


👍


@jackye1995 jackye1995 left a comment


👍 thanks for adding this

@jackye1995 jackye1995 merged commit 7fefb55 into lance-format:main May 17, 2026
30 of 31 checks passed
2 participants