Companion to #6658 (two-phase delete).
Request
Make lance::index::build_index_metadata_from_segments (currently pub(crate) at src/index.rs:111) public, OR expose a higher-level commit_existing_index_segments(...) Rust API that mirrors the Python pattern documented at lance.org/guide/distributed_indexing.
Context
CreateIndexBuilder::execute_uncommitted (src/index/create.rs:135) is pub and returns IndexMetadata — great. But for index types that take the segment-commit path (uses_segment_commit_path returning true: Vector, IvfPq, IvfSq, IvfFlat, IvfRq, IvfHnswFlat, IvfHnswPq, IvfHnswSq), the Operation::CreateIndex transaction must be constructed via:
```rust
let segments = ds.create_index_segment_builder()
    .with_segments(vec![new_idx.clone()])
    .build_all().await?;
let new_indices = build_index_metadata_from_segments(ds, &name, field_id, segments).await?;
TransactionBuilder::new(version, Operation::CreateIndex { new_indices, removed_indices }).build()
```
build_index_metadata_from_segments is pub(crate), so external callers cannot construct the transaction. This forces vector index builds to go through CreateIndexBuilder::execute(), which commits inline. That is incompatible with patterns like our omnigraph project, which want a strict stage+commit separation across all writers.
The IndexSegmentBuilder (src/index/create.rs:568) and with_segments / build_all ARE pub, which suggests the intent was to expose the segment commit path. The pub(crate) on build_index_metadata_from_segments looks like an oversight.
Use case
omnigraph (a graph DB built on Lance) sits in front of Lance and wants to enforce by-construction at its trait boundary that no engine writer can call an inline-commit Lance API. Scalar indices (BTree, Inverted, Bitmap, NGram) work today via the simple branch of CreateIndexBuilder::execute. Vector indices are blocked.
A similar hard dependency exists on #6658 (two-phase delete). Both unblock the same architectural pattern: hoist Lance's stage+commit two-phase write to a load-bearing trait invariant in upper layers.
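To make the intended invariant concrete, here is a minimal, Lance-free sketch of the stage+commit split. Every name in it (StagedTxn, StageOnlyWriter, Committer, VectorIndexWriter) is hypothetical and exists only for illustration; none of them are Lance or omnigraph APIs. The stale-version check is also a deliberate simplification and not a description of how Lance resolves conflicts.

```rust
/// A staged, not-yet-visible change. Hypothetical stand-in for a Lance transaction.
#[derive(Debug, Clone, PartialEq)]
pub struct StagedTxn {
    pub read_version: u64,
    pub description: String,
}

/// Writers can only stage work: the trait exposes no way to commit.
pub trait StageOnlyWriter {
    fn stage(&self, read_version: u64) -> Result<StagedTxn, String>;
}

/// The single component that is allowed to advance HEAD.
#[derive(Debug, Default)]
pub struct Committer {
    head: u64,
    log: Vec<String>,
}

impl Committer {
    pub fn head(&self) -> u64 {
        self.head
    }

    /// Descriptions of the transactions committed so far, in order.
    pub fn committed(&self) -> &[String] {
        &self.log
    }

    /// Accept a staged transaction only if it was staged against the current HEAD.
    /// (Simplified assumption: a real system may rebase or retry instead.)
    pub fn commit(&mut self, txn: StagedTxn) -> Result<u64, String> {
        if txn.read_version != self.head {
            return Err(format!(
                "stale transaction: staged at {} but HEAD is {}",
                txn.read_version, self.head
            ));
        }
        self.head += 1;
        self.log.push(txn.description);
        Ok(self.head)
    }
}

/// Hypothetical writer that stages a vector index build.
pub struct VectorIndexWriter {
    pub column: String,
}

impl StageOnlyWriter for VectorIndexWriter {
    fn stage(&self, read_version: u64) -> Result<StagedTxn, String> {
        Ok(StagedTxn {
            read_version,
            description: format!("create vector index on {}", self.column),
        })
    }
}

fn main() {
    let mut committer = Committer::default();
    let writer = VectorIndexWriter { column: "embedding".to_string() };

    // Staging produces a transaction but leaves HEAD untouched.
    let txn = writer.stage(committer.head()).expect("staging should succeed");
    assert_eq!(committer.head(), 0);

    // Only the committer advances HEAD.
    let new_head = committer.commit(txn.clone()).expect("commit should succeed");
    assert_eq!(new_head, 1);
    assert_eq!(committer.committed().len(), 1);

    // Replaying the same staged transaction is rejected as stale.
    assert!(committer.commit(txn).is_err());
}
```

With an inline-committing API, the writer would have to hold the equivalent of the Committer itself, which is exactly what the trait boundary is meant to rule out.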
Suggested fix
Either:
1. Mark build_index_metadata_from_segments pub (smallest change).
2. Add a higher-level commit_existing_index_segments(...) Rust API per the Python distributed-indexing docs.
Option 1 is the surgical fix; option 2 mirrors the documented Python pattern.
Proposed signature (option 2)
```rust
impl<'a> IndexSegmentBuilder<'a> {
    /// Convert a segment plan into a fully-formed Operation::CreateIndex
    /// transaction without committing. Caller hands the returned
    /// transaction to CommitBuilder::execute to advance HEAD.
    pub async fn into_uncommitted_transaction(
        self,
        index_name: &str,
        field_id: i32,
        removed_indices: Vec<IndexMetadata>,
    ) -> Result<Transaction>;
}
```
Happy to send a PR if either approach is preferred.