Skip to content

feat: expose multi-base config to Python and Java write_fragments API#6855

Merged
yanghua merged 4 commits into
lance-format:mainfrom
zhangyue19921010:support-multi-base-config-write-fragments
May 21, 2026
Merged

feat: expose multi-base config to Python and Java write_fragments API#6855
yanghua merged 4 commits into
lance-format:mainfrom
zhangyue19921010:support-multi-base-config-write-fragments

Conversation

@zhangyue19921010
Copy link
Copy Markdown
Contributor

For Python

fragments = write_fragments(
    pa.table({"id": range(10, 20)}),
    dataset,
    mode="append",
    target_bases=["base2"],
    base_store_params=base_store_params,
)

For Java

  List<FragmentMetadata> fragments =
      Fragment.write()
          .allocator(allocator)
          .datasetUri(primary)
          .data(root2)
          .mode(WriteParams.WriteMode.APPEND)
          .targetBases(Arrays.asList("base2"))
          .baseStoreParams(baseStoreParams)
          .execute();

Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@github-actions github-actions Bot added enhancement New feature or request A-python Python bindings A-java Java bindings + JNI labels May 20, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 86.66667% with 8 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance/src/dataset/fragment/write.rs 86.66% 5 Missing and 3 partials ⚠️

📢 Thoughts on this report? Let us know!

@yanghua
Copy link
Copy Markdown
Collaborator

yanghua commented May 20, 2026

@claude review

Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-language feature plumbing that mirrors the existing dataset write pattern; tests cover the new path in all three languages, but worth a human eye on the Rust fragment write changes since they introduce an existing-dataset load and target-base resolution on a core write path.

Extended reasoning...

Overview

This PR exposes the multi-base configuration parameters (target_bases, initial_bases, base_store_params) on the fragment-level write API in both Python (write_fragments) and Java (Fragment.write() builder). It plumbs them through the Java JNI layer to the existing Rust extract_write_params machinery (which already supported them), and updates Rust FragmentCreateBuilder::write_fragments_v2_impl to load the existing dataset and resolve target bases when any of the three new params are supplied. The Python write_fragments also now inherits base_store_params from a passed LanceDataset instance if not explicitly provided.

Security risks

None observed. The new parameters control where fragment files are written and which object-store options are used; no auth, crypto, or permissions logic is touched. All new arguments are optional with backward-compatible defaults.

Level of scrutiny

This is a feature PR touching a core write path (rust/lance/src/dataset/fragment/write.rs) and JNI boundaries, so it warrants a human review even though the wiring is mechanical. The Rust change is small and closely mirrors the pattern already used in the dataset-level do_write_fragments call site (validate_and_resolve_target_bases + load existing dataset). The Java JNI changes are pass-through additions to existing signatures.

Other factors

  • The bug-hunting system found no issues.
  • Codecov reports 81% patch coverage, with 11 missing lines concentrated in rust/lance/src/dataset/fragment/write.rs.
  • Tests added: Rust unit test (test_write_fragments_with_target_base), two Java tests (testFragmentCreateWithMultiBaseParams, testFragmentWriteWithMultiBaseParams), and two Python tests for both the new base_store_params inheritance and the existing write_fragments + target_bases flow.
  • The PR does not touch any CODEOWNER-protected files.

Comment thread python/python/lance/fragment.py
Copy link
Copy Markdown
Collaborator

@yanghua yanghua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@yanghua yanghua merged commit 6e0e5a6 into lance-format:main May 21, 2026
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-java Java bindings + JNI A-python Python bindings enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants