Clean migration of the Java and Python vector SDK entrypoints to the segment workflow. The test simplification is a big win: removing the manual UUID/field-id/transaction boilerplate makes the distributed indexing tests much more readable.
Issues
P1: inner_commit_existing_index_segments doesn't reload the dataset after commit
In inner_commit_existing_index_segments (blocking_dataset.rs), load_indices_by_name is called on the same guard right after commit_existing_index_segments mutates the dataset. That looks correct for returning metadata. However, the Java-side test asserts dataset.listIndexes().contains(...) on the original dataset handle. Does commit_existing_index_segments update the dataset in place, or does the caller need to reload? If it does not update in place, the final assertTrue in the tests relies on stale state (though it may still pass if the Rust side mutates the inner Dataset). Worth confirming this is intentional.
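To make the concern concrete, here is a toy model of the two possible behaviors. It is only a sketch: Manifest, Store, Handle, and the commit_index_* methods are hypothetical names invented for illustration and do not correspond to the actual Dataset or JNI types in this PR.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for one committed dataset version.
#[derive(Debug, Clone, Default)]
struct Manifest {
    version: u64,
    index_names: Vec<String>,
}

/// Toy storage that always holds the latest committed manifest.
type Store = Arc<Mutex<Manifest>>;

/// A handle that caches the manifest it was opened at.
struct Handle {
    store: Store,
    cached: Manifest,
}

impl Handle {
    fn open(store: &Store) -> Self {
        let cached = store.lock().unwrap().clone();
        Handle { store: Arc::clone(store), cached }
    }

    /// Commit that only writes to storage; the cached view stays stale.
    fn commit_index_detached(&self, name: &str) {
        let mut latest = self.store.lock().unwrap();
        latest.version += 1;
        latest.index_names.push(name.to_string());
    }

    /// Commit that also refreshes the handle, i.e. an in-place update.
    fn commit_index_in_place(&mut self, name: &str) {
        self.commit_index_detached(name);
        self.reload();
    }

    /// Re-read the latest manifest from storage.
    fn reload(&mut self) {
        self.cached = self.store.lock().unwrap().clone();
    }

    /// Lists indexes as seen by this handle's cached manifest.
    fn list_indexes(&self) -> &[String] {
        &self.cached.index_names
    }
}
```

If the real commit path behaves like commit_index_detached, an assertion on the original handle only holds after an explicit reload; if it behaves like commit_index_in_place, the test is fine as written.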
P1: segment_template validation does not cover index_version
segment_template validates that all segments share the same name, fields, and dataset_version — but it doesn't validate index_version. If segments can have different index_version values, the index_segment_to_metadata function uses the per-segment value (correct), but it's worth documenting why index_version is intentionally excluded from the template consistency check.
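A minimal sketch of the consistency check as described above, with the kind of comment that would document the exclusion. SegmentMeta and its field types are assumptions made for illustration; the real segment type and the real segment_template signature in this PR may differ.

```rust
/// Hypothetical, simplified stand-in for an index segment's metadata.
#[derive(Debug, Clone, PartialEq)]
struct SegmentMeta {
    name: String,
    fields: Vec<i32>,
    dataset_version: u64,
    index_version: i32,
}

/// Returns the first segment as the template if every segment agrees on
/// name, fields, and dataset_version.
///
/// index_version is deliberately NOT compared: in this sketch each segment
/// is assumed to carry its own value, which is used per segment when
/// converting to metadata.
fn segment_template(segments: &[SegmentMeta]) -> Result<&SegmentMeta, String> {
    let first = segments
        .first()
        .ok_or_else(|| "no segments provided".to_string())?;
    for seg in &segments[1..] {
        if seg.name != first.name {
            return Err(format!(
                "segment name mismatch: {} vs {}",
                seg.name, first.name
            ));
        }
        if seg.fields != first.fields {
            return Err(format!(
                "segment fields mismatch: {:?} vs {:?}",
                seg.fields, first.fields
            ));
        }
        if seg.dataset_version != first.dataset_version {
            return Err(format!(
                "segment dataset_version mismatch: {} vs {}",
                seg.dataset_version, first.dataset_version
            ));
        }
    }
    Ok(first)
}
```

With a check shaped like this, two segments that differ only in index_version are accepted, while a mismatch in name, fields, or dataset_version is rejected. A doc comment along these lines would answer the question for future readers.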
Nit
The Python into_builder → builder rename is correct (&self doesn't consume ownership), good catch.
This follow-up PR moves the Python and Java vector distributed indexing entrypoints and tests to the segment workflow introduced in #6269.
It keeps the core staging-removal refactor in the base PR and limits this change to SDK-facing APIs, JNI/PyO3 glue, and binding-level tests.