feat!: make dataset object store access base-aware #6647
Merged
jackye1995 merged 1 commit into lance-format:main from May 1, 2026
Contributor
I'm concerned that every use of Dataset.object_store could be a potential bug, since that only represents the primary object store and not all object stores. Should we do a change where we turn that field into a
3a65cd0 to
c732b59
Compare
cmccabe
approved these changes
Apr 30, 2026
c732b59 to
319e9f8
Compare
319e9f8 to
4b0a195
Compare
LuQQiu
reviewed
Apr 30, 2026
Contributor
LuQQiu
left a comment
Multi-base tables are a newly introduced feature, and I don't want them to break all existing users who are using single-base tables. I also agree with what @cmccabe mentioned: we don't want it to be silently incorrect.
I would prefer to keep dataset.object_store() but have it error out if the dataset is a multi-base dataset.
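For illustration, the alternative suggested in this comment could look roughly like the sketch below. It is a toy model and not Lance code: `ObjectStore`, `StoreError`, `Dataset::new`, `with_base`, and the `extra_bases` map are hypothetical stand-ins, and the accessor is synchronous only to keep the example self-contained.

```rust
use std::collections::HashMap;
use std::fmt;
use std::sync::Arc;

/// Hypothetical stand-in for an object store handle.
#[derive(Debug, PartialEq)]
pub struct ObjectStore {
    pub name: String,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum StoreError {
    /// The zero-argument accessor was used on a dataset with extra bases.
    MultiBase { base_count: usize },
}

impl fmt::Display for StoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StoreError::MultiBase { base_count } => write!(
                f,
                "dataset has {} additional base(s); the primary store alone is ambiguous",
                base_count
            ),
        }
    }
}

/// Toy dataset: one primary store plus optional additional bases (assumed layout).
pub struct Dataset {
    primary: Arc<ObjectStore>,
    extra_bases: HashMap<u32, Arc<ObjectStore>>,
}

impl Dataset {
    pub fn new(primary: ObjectStore) -> Self {
        Dataset {
            primary: Arc::new(primary),
            extra_bases: HashMap::new(),
        }
    }

    pub fn with_base(mut self, base_id: u32, store: ObjectStore) -> Self {
        self.extra_bases.insert(base_id, Arc::new(store));
        self
    }

    /// Keeps working for single-base datasets, but returns an error instead of
    /// silently handing back the primary store when other bases exist.
    pub fn object_store(&self) -> Result<Arc<ObjectStore>, StoreError> {
        if self.extra_bases.is_empty() {
            Ok(Arc::clone(&self.primary))
        } else {
            Err(StoreError::MultiBase {
                base_count: self.extra_bases.len(),
            })
        }
    }
}
```

Under this shape, existing single-base callers keep compiling, while multi-base callers get an explicit error rather than a silently wrong store.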
LuQQiu
approved these changes
Apr 30, 2026
4b0a195 to
5dc1480
Compare
5dc1480 to
34e14cc
Compare
wombatu-kun
pushed a commit
to wombatu-kun/lance
that referenced
this pull request
May 4, 2026
After rebasing onto main, our test added in "feat: route partial-schema merge_insert through the v2 write path" no longer compiled because main lance-format#6647 turned the public `Dataset::object_store()` getter into an async, base-aware method returning `Result<Arc<ObjectStore>>`. The other call site of `read_transaction_file` in the same module already uses the `pub(crate) object_store` field directly via `.as_ref()`; mirror that pattern here. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make dataset object-store access explicit for multi-base datasets. The public `Dataset::object_store` accessor now takes an `Option<u32>` base id: pass `None` for the primary dataset store, or `Some(base_id)` for an additional base referenced by dataset metadata. This is a breaking change from the previous zero-argument accessor.

Adds `Dataset::with_object_store_wrappers(...)` to clone datasets with wrappers propagated to the primary object store, refs store, primary store params, and every base store params entry. When a fragment, deletion file, or index references a base id, Lance resolves both the correct path location and the corresponding wrapped object store, so wrappers such as caching and instrumentation apply to multi-base reads without changing page cache keys.

Also fixes base-aware read paths that resolved a `base_id` path but still used the primary dataset object store. Data-file metadata creation, deletion file reads, scalar/vector index opens, index detail inference, index stats, and index migration now resolve the correct base object store before reading.
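As a rough illustration of the base-aware lookup described above, the following self-contained sketch models the idea with toy types. It does not use the Lance crate: `ObjectStore`, `DataFile`, `Dataset::new`, `add_base`, and `locate` are hypothetical names, the error type is a plain `String`, and the accessor is synchronous here for simplicity even though the real one may differ.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for an object store rooted at some URI.
#[derive(Debug, PartialEq)]
pub struct ObjectStore {
    pub root: String,
}

/// Toy file reference: a relative path plus an optional base id.
pub struct DataFile {
    pub path: String,
    pub base_id: Option<u32>,
}

/// Toy dataset holding a primary store and additional bases by id (assumed layout).
pub struct Dataset {
    primary: Arc<ObjectStore>,
    bases: HashMap<u32, Arc<ObjectStore>>,
}

impl Dataset {
    pub fn new(primary_root: &str) -> Self {
        Dataset {
            primary: Arc::new(ObjectStore { root: primary_root.to_string() }),
            bases: HashMap::new(),
        }
    }

    pub fn add_base(&mut self, base_id: u32, root: &str) {
        self.bases
            .insert(base_id, Arc::new(ObjectStore { root: root.to_string() }));
    }

    /// `None` selects the primary store; `Some(id)` selects an additional base.
    pub fn object_store(&self, base_id: Option<u32>) -> Result<Arc<ObjectStore>, String> {
        match base_id {
            None => Ok(Arc::clone(&self.primary)),
            Some(id) => self
                .bases
                .get(&id)
                .cloned()
                .ok_or_else(|| format!("unknown base id {}", id)),
        }
    }

    /// Resolves the store and the full location from the same base id, so a
    /// base-relative path is never paired with the primary store by mistake.
    pub fn locate(&self, file: &DataFile) -> Result<(Arc<ObjectStore>, String), String> {
        let store = self.object_store(file.base_id)?;
        let location = format!("{}/{}", store.root, file.path);
        Ok((store, location))
    }
}
```

Deriving both the store and the location from one `base_id` in a single helper is one way to avoid the mismatch this PR fixes, where the path came from a base but the read went through the primary store.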