Adds option to configure max batch size in readManyByPartitionKeys #48930

Merged
FabianMeiswinkel merged 5 commits into main from users/fabianm/configMaxBatchsize on Apr 24, 2026

@FabianMeiswinkel
Member

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new swagger spec, a link to the pull request containing these swagger spec changes has been included above.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

Copilot AI review requested due to automatic review settings April 24, 2026 14:44
@FabianMeiswinkel
Member Author

@sdkReviewAgent

Contributor

Copilot AI left a comment

Pull request overview

Adds a per-request option to control the maximum batch size used by readManyByPartitionKeys, and wires it through the Cosmos Java SDK internals and Spark connector configuration.

Changes:

  • Add maxBatchSize getter/setter to CosmosReadManyByPartitionKeysRequestOptions and bridge accessor plumbing.
  • Plumb maxBatchSize through CosmosAsyncContainer -> AsyncDocumentClient -> RxDocumentClientImpl and use it when building batches.
  • Add Spark connector config key parsing + unit tests, and apply the setting when constructing request options.
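
To illustrate the first two changes, a per-request option that falls back to a global default could look roughly like the sketch below. `ReadManyOptionsSketch`, `resolveMaxBatchSize`, and the default value passed in are hypothetical names chosen for illustration, not the SDK's actual types, and the `>= 1` check in the setter is an assumption rather than something this overview confirms.

```java
/** Hypothetical stand-in for a per-request options type; not the SDK's actual class. */
final class ReadManyOptionsSketch {
    // null means "not set", so the caller falls back to the global default.
    private Integer maxBatchSize;

    Integer getMaxBatchSize() {
        return maxBatchSize;
    }

    ReadManyOptionsSketch setMaxBatchSize(int maxBatchSize) {
        // Assumed validation: a batch size below 1 cannot produce a usable batch.
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException(
                "Argument 'maxBatchSize' must be greater than or equal to 1.");
        }
        this.maxBatchSize = maxBatchSize;
        return this;
    }

    /** Per-request override wins; otherwise the supplied default is used. */
    static int resolveMaxBatchSize(ReadManyOptionsSketch options, int defaultMaxBatchSize) {
        Integer override = options == null ? null : options.getMaxBatchSize();
        return override != null ? override : defaultMaxBatchSize;
    }
}
```

Keeping the field nullable lets "not configured" stay distinguishable from any concrete value, so the layer that knows the default can make the final decision.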

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/models/CosmosReadManyByPartitionKeysRequestOptions.java | Introduces public per-request maxBatchSize API and exposes it via bridge accessor. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxDocumentClientImpl.java | Threads maxBatchSize into the internal execution path and uses it for batch construction. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ImplementationBridgeHelpers.java | Extends request-options accessor interface to surface maxBatchSize. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/CosmosReadManyByPartitionKeysRequestOptionsImpl.java | Stores/clones the new maxBatchSize option in the internal options implementation. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/AsyncDocumentClient.java | Updates internal client interface to accept maxBatchSize. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosAsyncContainer.java | Resolves effective maxBatchSize (per-request override vs global default) and passes it down. |
| sdk/cosmos/azure-cosmos-spark_3/src/test/scala/com/azure/cosmos/spark/CosmosConfigSpec.scala | Adds Spark config parsing tests for readManyByPk.maxBatchSize (and updated expectations for prefetch). |
| sdk/cosmos/azure-cosmos-spark_3/src/main/scala/com/azure/cosmos/spark/ItemsPartitionReaderWithReadManyByPartitionKey.scala | Applies Spark config overrides to request options via foreach. |
| sdk/cosmos/azure-cosmos-spark_3/src/main/scala/com/azure/cosmos/spark/CosmosConfig.scala | Adds new Spark config key + parsing; changes prefetch config default handling to defer to SDK when unset. |
| sdk/cosmos/azure-cosmos-spark_3/dev/README.md | Adds build command for an additional Spark 4.1 module. |
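
The Spark-side rows describe an optional config value that is applied to the request options only when present, and otherwise left for the SDK default to decide. A minimal Java analogue of that pattern is sketched below (the connector itself is Scala and uses `Option.foreach`). The key prefix, `SparkConfigSketch`, and `RequestOptions` are all hypothetical; only the `readManyByPk.maxBatchSize` suffix appears in this PR.

```java
import java.util.Map;
import java.util.Optional;

final class SparkConfigSketch {
    // Hypothetical full key; only the "readManyByPk.maxBatchSize" part is taken from the PR.
    static final String MAX_BATCH_SIZE_KEY = "example.readManyByPk.maxBatchSize";

    /** Hypothetical mutable options holder; null means "defer to the SDK default". */
    static final class RequestOptions {
        Integer maxBatchSize;
    }

    /** Returns the parsed value if the key is present, or empty if it is unset. */
    static Optional<Integer> parseMaxBatchSize(Map<String, String> config) {
        return Optional.ofNullable(config.get(MAX_BATCH_SIZE_KEY))
            .map(String::trim)
            .map(Integer::parseInt);
    }

    /** Applies the override only when configured, mirroring Option.foreach in Scala. */
    static void applyOverrides(Map<String, String> config, RequestOptions options) {
        parseMaxBatchSize(config).ifPresent(value -> options.maxBatchSize = value);
    }
}
```

Leaving the option untouched when the key is absent avoids duplicating the SDK's default inside the connector, so the two cannot drift apart.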
Comments suppressed due to low confidence (1)

sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxDocumentClientImpl.java:4384

  • maxBatchSize is used as the step size for the batching loop downstream. If a caller passes 0, the loop for (int i = 0; i < allPks.size(); i += maxPksPerPartitionQuery) never advances and can hang; a negative value moves the index backwards and is equally invalid. Add argument validation similar to the check for maxConcurrentBatchPrefetch (>= 1) and fail fast with a clear message.
        checkNotNull(partitionKeys, "Argument 'partitionKeys' must not be null.");
        checkArgument(!partitionKeys.isEmpty(), "Argument 'partitionKeys' must not be empty.");
        checkArgument(maxConcurrentBatchPrefetch >= 1,
            "Argument 'maxConcurrentBatchPrefetch' must be greater than or equal to 1.");

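The concern in the suppressed comment can be sketched in isolation: when the batch size is also the loop step, a value below 1 has to be rejected before the loop starts. `BatchingSketch` and `toBatches` are hypothetical names for illustration and do not reproduce the code in RxDocumentClientImpl.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchingSketch {
    /** Splits items into consecutive batches of at most maxBatchSize elements. */
    static <T> List<List<T>> toBatches(List<T> items, int maxBatchSize) {
        // Fail fast: with a step of 0 the loop below would never advance.
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException(
                "Argument 'maxBatchSize' must be greater than or equal to 1.");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += maxBatchSize) {
            int end = Math.min(i + maxBatchSize, items.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }
}
```

With five items and a batch size of 2, this yields three batches, the last holding a single item.
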
FabianMeiswinkel and others added 2 commits April 24, 2026 14:51
…CosmosReadManyByPartitionKeysRequestOptions.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@FabianMeiswinkel
Member Author

/azp run java - cosmos - spark

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@xinlian12
Member

Review complete (32:04)

Posted 3 inline comment(s).

Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage

Member

@xinlian12 xinlian12 left a comment

LGTM, thanks

@FabianMeiswinkel
Member Author

/azp run java - cosmos - spark

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Member

@xinlian12 xinlian12 left a comment

LGTM, thanks

@FabianMeiswinkel FabianMeiswinkel merged commit 766c068 into main Apr 24, 2026
51 checks passed
@FabianMeiswinkel FabianMeiswinkel deleted the users/fabianm/configMaxBatchsize branch April 24, 2026 17:54

3 participants