Skip to content

Adding readManyByPartitionKey API#48801

Open
FabianMeiswinkel wants to merge 36 commits intomainfrom
users/fabianm/readManyByPK
Open

Adding readManyByPartitionKey API#48801
FabianMeiswinkel wants to merge 36 commits intomainfrom
users/fabianm/readManyByPK

Conversation

@FabianMeiswinkel
Copy link
Copy Markdown
Member

@FabianMeiswinkel FabianMeiswinkel commented Apr 13, 2026

Description

Adds a new readManyByPartitionKey API surface to the Java Cosmos SDK (sync + async) and wires it through the Spark connector to support PK-only reads (including partial HPK), with query-plan-based validation for custom queries.

Changes:

Added public readManyByPartitionKey overloads in CosmosAsyncContainer / CosmosContainer and an internal AsyncDocumentClient + RxDocumentClientImpl implementation that groups PKs by physical partition and issues per-range queries.
Introduced ReadManyByPartitionKeyQueryHelper to compose PK filters into user-provided SQL and added a new config knob for per-partition batching.
Added Spark support (UDF + PK serialization/parsing helper + reader) and unit/integration tests for query composition and end-to-end behavior.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

Copilot AI review requested due to automatic review settings April 13, 2026 23:00
@FabianMeiswinkel FabianMeiswinkel marked this pull request as draft April 13, 2026 23:01
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new readManyByPartitionKey API surface to the Java Cosmos SDK (sync + async) and wires it through the Spark connector to support PK-only reads (including partial HPK), with query-plan-based validation for custom queries.

Changes:

  • Added public readManyByPartitionKey overloads in CosmosAsyncContainer / CosmosContainer and an internal AsyncDocumentClient + RxDocumentClientImpl implementation that groups PKs by physical partition and issues per-range queries.
  • Introduced ReadManyByPartitionKeyQueryHelper to compose PK filters into user-provided SQL and added a new config knob for per-partition batching.
  • Added Spark support (UDF + PK serialization/parsing helper + reader) and unit/integration tests for query composition and end-to-end behavior.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
sdk/cosmos/docs/readManyByPartitionKey-design.md Design doc describing the new API, query validation, and Spark integration approach.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/DocumentQueryExecutionContextFactory.java Adds a helper method to fetch query plans through the gateway for validation.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxDocumentClientImpl.java Implements readManyByPartitionKey execution, validation, PK→range grouping, batching, and concurrency.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ReadManyByPartitionKeyQueryHelper.java New helper to build SqlQuerySpec by appending PK filters and extracting table aliases.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/Configs.java Adds config/env accessors for max PKs per per-partition query batch.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/AsyncDocumentClient.java Adds internal interface method for PK-only read-many.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosContainer.java Adds sync readManyByPartitionKey overloads.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/CosmosAsyncContainer.java Adds async readManyByPartitionKey overloads and wiring to internal client.
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/ReadManyByPartitionKeyQueryHelperTest.java Unit tests for SQL generation, alias extraction, and WHERE detection.
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/ReadManyByPartitionKeyTest.java Emulator integration tests for single PK + HPK, partial HPK, projections, and query validation.
sdk/cosmos/azure-cosmos-spark_3/src/test/scala/com/azure/cosmos/spark/ItemsPartitionReaderWithReadManyByPartitionKeyITest.scala Spark integration test for reading by PKs and empty result behavior.
sdk/cosmos/azure-cosmos-spark_3/src/test/scala/com/azure/cosmos/spark/CosmosPartitionKeyHelperSpec.scala Unit tests for PK string serialization/parsing helpers.
sdk/cosmos/azure-cosmos-spark_3/src/main/scala/com/azure/cosmos/spark/udf/GetCosmosPartitionKeyValue.scala Spark UDF to compute _partitionKeyIdentity values.
sdk/cosmos/azure-cosmos-spark_3/src/main/scala/com/azure/cosmos/spark/ItemsPartitionReaderWithReadManyByPartitionKey.scala Spark partition reader that calls new SDK API and converts results to rows.
sdk/cosmos/azure-cosmos-spark_3/src/main/scala/com/azure/cosmos/spark/CosmosReadManyByPartitionKeyReader.scala Spark reader that maps input rows to PKs and streams results via the partition reader.
sdk/cosmos/azure-cosmos-spark_3/src/main/scala/com/azure/cosmos/spark/CosmosPartitionKeyHelper.scala Helper for PK serialization/parsing used by the UDF and data source.
sdk/cosmos/azure-cosmos-spark_3/src/main/scala/com/azure/cosmos/spark/CosmosItemsDataSource.scala Adds Spark entry point to read-many by partition key, including PK extraction logic.
sdk/cosmos/azure-cosmos-spark_3/src/main/scala/com/azure/cosmos/spark/CosmosConstants.scala Adds _partitionKeyIdentity constant.
Comments suppressed due to low confidence (1)

sdk/cosmos/azure-cosmos-spark_3/src/main/scala/com/azure/cosmos/spark/ItemsPartitionReaderWithReadManyByPartitionKey.scala:1

  • The error message has mismatched parentheses/quoting (classOf<SparkRowItem])) which makes it harder to read and search for. Suggest correcting it to a clean, unambiguous string (e.g., classOf[SparkRowItem]) to improve diagnosability.
// Copyright (c) Microsoft Corporation. All rights reserved.

Comment thread sdk/cosmos/docs/readManyByPartitionKey-design.md
FabianMeiswinkel and others added 18 commits April 14, 2026 16:49
…s/spark/ItemsPartitionReaderWithReadManyByPartitionKey.scala

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…s/spark/ItemsPartitionReaderWithReadManyByPartitionKey.scala

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ntation/RxDocumentClientImpl.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ntation/ReadManyByPartitionKeyQueryHelper.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@FabianMeiswinkel FabianMeiswinkel marked this pull request as ready for review April 16, 2026 21:57
@FabianMeiswinkel
Copy link
Copy Markdown
Member Author

@sdkReviewAgent

Comment thread sdk/cosmos/azure-cosmos/CHANGELOG.md Outdated
@xinlian12
Copy link
Copy Markdown
Member

Review complete (41:32)

Posted 6 inline comment(s).

Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage

@FabianMeiswinkel
Copy link
Copy Markdown
Member Author

@sdkReviewAgent

@FabianMeiswinkel
Copy link
Copy Markdown
Member Author

@sdkReviewAgent

@xinlian12
Copy link
Copy Markdown
Member

Review complete (34:12)

Posted 5 inline comment(s).

Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage

@FabianMeiswinkel
Copy link
Copy Markdown
Member Author

@sdkReviewAgent

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants