fix: CustomItemSerializer not applied correctly in queries and SqlParameter [Cosmos]#48811

Open
xinlian12 wants to merge 33 commits into Azure:main from xinlian12:fix/issue-45521-custom-item-serializer-v2

Conversation

@xinlian12
Member

Summary

Fixes #45521

Two bugs with CustomItemSerializer:

Bug 1: Deserialization failures in ORDER BY/GROUP BY/aggregate/DISTINCT queries

When customItemSerializer is configured on CosmosClientBuilder, complex queries (ORDER BY, GROUP BY, VALUE, COUNT, SUM, DISTINCT, HybridSearch) fail because the custom serializer leaks into the internal query pipeline and is used to deserialize internal SDK structures (OrderByRowResult, Document, etc.).

Root cause: PipelinedDocumentQueryExecutionContext.createBaseComponentFunction() called setCustomItemSerializer(null) on cloned request options, but RxDocumentClientImpl.getEffectiveItemSerializer() falls back to the client-level defaultCustomSerializer when the request-level serializer is null.

Fix: Set CosmosItemSerializer.DEFAULT_SERIALIZER instead of null on internal request options. This causes getEffectiveItemSerializer() to return DEFAULT_SERIALIZER immediately without falling through to the client-level custom serializer.
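
The fallback behind Bug 1 can be sketched with stand-in types. This is a minimal illustration, not the SDK's code: `ItemSerializer` and `effectiveSerializer` are hypothetical names, and the final fallback to the default when both levels are null is an assumption. The sketch models only the behavior described above: a null request-level serializer falls through to the client-level one, while an explicit default does not.

```java
// Hypothetical stand-ins; these are NOT the azure-cosmos classes.
final class SerializerFallbackSketch {

    /** Minimal stand-in for a serializer; only its name matters here. */
    static final class ItemSerializer {
        final String name;

        ItemSerializer(String name) {
            this.name = name;
        }
    }

    /** Stands in for CosmosItemSerializer.DEFAULT_SERIALIZER. */
    static final ItemSerializer DEFAULT_SERIALIZER = new ItemSerializer("default");

    /**
     * Models the described resolution order: request level first, then client level.
     * Falling back to the default when both are null is an assumption of this sketch.
     */
    static ItemSerializer effectiveSerializer(ItemSerializer requestLevel, ItemSerializer clientLevel) {
        if (requestLevel != null) {
            return requestLevel;
        }
        return clientLevel != null ? clientLevel : DEFAULT_SERIALIZER;
    }

    public static void main(String[] args) {
        ItemSerializer custom = new ItemSerializer("custom");
        // Before the fix: null on the internal options lets the client-level serializer leak in.
        System.out.println(effectiveSerializer(null, custom).name);               // prints "custom"
        // After the fix: the explicit default is returned immediately.
        System.out.println(effectiveSerializer(DEFAULT_SERIALIZER, custom).name); // prints "default"
    }
}
```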

Bug 2: SqlParameter ignores customItemSerializer

SqlParameter.setValue() delegates to JsonSerializable.set() which hardcodes INTERNAL_DEFAULT_SERIALIZER. Custom serialization settings (e.g., dates as ISO strings) are never applied to query parameters.

Fix: Added a rawValue field to SqlParameter to retain the original value, plus applySerializer() to re-serialize it with a custom serializer. applySerializer() is called from PipelinedQueryExecutionContextBase.createAsync(), where the effective serializer is known.
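
The idea of retaining the raw value can be sketched as follows. `SqlParameterSketch` is a hypothetical stand-in rather than the SDK's SqlParameter, the serializer is reduced to a plain `Function`, and the default path writing an Instant as epoch milliseconds is an assumption made only for the illustration.

```java
import java.time.Instant;
import java.util.function.Function;

// Hypothetical stand-in for the change described above; this is NOT the SDK's SqlParameter.
final class SqlParameterSketch {
    private final String name;
    private final Object rawValue;   // original value, retained for re-serialization
    private Object serializedValue;  // value that would be sent with the query

    SqlParameterSketch(String name, Object value) {
        this.name = name;
        this.rawValue = value;
        this.serializedValue = defaultSerialize(value);
    }

    /** Assumption for this sketch only: the default path writes an Instant as epoch milliseconds. */
    static Object defaultSerialize(Object value) {
        if (value instanceof Instant) {
            return ((Instant) value).toEpochMilli();
        }
        return value;
    }

    /** Re-serializes the retained raw value with the effective serializer. */
    void applySerializer(Function<Object, Object> serializer) {
        this.serializedValue = serializer.apply(rawValue);
    }

    String getName() {
        return name;
    }

    Object getSerializedValue() {
        return serializedValue;
    }

    public static void main(String[] args) {
        SqlParameterSketch p =
            new SqlParameterSketch("@createdAt", Instant.parse("2026-04-14T12:31:00Z"));
        System.out.println(p.getSerializedValue()); // epoch milliseconds from the default path
        p.applySerializer(Object::toString);        // stand-in custom serializer: ISO-8601 string
        System.out.println(p.getSerializedValue()); // prints 2026-04-14T12:31:00Z
    }
}
```

Without the retained raw value, only the already-serialized number would be available and the custom serializer could not recover the original Instant.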

Changes

| File | Change |
| --- | --- |
| PipelinedDocumentQueryExecutionContext.java | Replace null with DEFAULT_SERIALIZER in 4 internal pipeline paths |
| PipelinedQueryExecutionContextBase.java | Apply serializer to SqlParameters at query execution time |
| SqlParameter.java | Added rawValue field + applySerializer() method |
| SqlQuerySpec.java | Added applySerializerToParameters() |

Testing

Build compiles cleanly. Existing tests unaffected. Additional test cases for ORDER BY/GROUP BY/aggregate queries with custom serializer and SqlParameter serialization should be added.

Annie Liang and others added 17 commits April 14, 2026 12:31
…eline structures and is applied to SqlParameter serialization

Fixes Azure#45521

Bug 1: Changed PipelinedDocumentQueryExecutionContext to set DEFAULT_SERIALIZER
instead of null on internal request options, preventing getEffectiveItemSerializer()
from falling back to the client-level custom serializer for internal pipeline processing.
This fixes ORDER BY, GROUP BY, aggregate, DISTINCT, DCount, and hybrid search queries
when a custom serializer is configured on CosmosClientBuilder.

Bug 2: Added serializer-aware parameter serialization in the query execution path
so SqlParameter values are serialized with the effective custom serializer.
SqlParameter now stores the raw value and re-serializes it when applySerializer()
is called from PipelinedQueryExecutionContextBase.createAsync().

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…structor

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ndition

When a custom serializer is configured, applySerializerToParameters() mutates
SqlParameter values in-place on the user's original SqlQuerySpec. If the same
SqlQuerySpec is reused across concurrent queries with different serializers,
the mutations race without synchronization.

Defensive fix: clone the SqlQuerySpec and its SqlParameter list before applying
the serializer, so the original object is never mutated by the SDK.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
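
The defensive copy described in this commit can be illustrated with a small sketch. `QuerySpec`, `Param`, and `copyAndApply` are hypothetical names, not the SDK types; the point is only that the serializer is applied to copies, so a spec shared between concurrent queries is never written to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-ins; not the SDK's SqlQuerySpec or SqlParameter.
final class QuerySpecCopySketch {

    static final class Param {
        final String name;
        final Object value;

        Param(String name, Object value) {
            this.name = name;
            this.value = value;
        }
    }

    static final class QuerySpec {
        final String text;
        final List<Param> params;

        QuerySpec(String text, List<Param> params) {
            this.text = text;
            this.params = params;
        }
    }

    /** Builds a new spec with new parameters; the caller's spec and list stay untouched. */
    static QuerySpec copyAndApply(QuerySpec original, UnaryOperator<Object> serializer) {
        List<Param> copies = new ArrayList<>(original.params.size());
        for (Param p : original.params) {
            copies.add(new Param(p.name, serializer.apply(p.value)));
        }
        return new QuerySpec(original.text, copies);
    }
}
```

Because each query works on its own copy, two queries using different serializers on the same original spec cannot observe each other's parameter values.
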
… method, use ImplementationBridgeHelpers accessor

- C1: Fix forceSerialization=false to true in SqlParameter.applySerializer()
- C4: Remove applySerializerToParameters from ModelBridgeInternal, use SqlQuerySpecHelper accessor pattern via ImplementationBridgeHelpers instead

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…aggregate, DISTINCT, and SqlParameter queries

Covers fix for Azure#45521

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ze()

The Spark connector and potentially other consumers implement
CosmosItemSerializer with only deserialize() and stub serialize()
as unimplemented. Guard with try-catch to fall back to default
serialization when the custom serializer's serialize() is not
available.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ializer in tests

Address PR review comments from xinlian12 - use the clientSerializer
variable (from getClientBuilder().getCustomItemSerializer()) instead of
hardcoding EnvelopWrappingItemSerializer.INSTANCE_NO_TRACKING_ID_VALIDATION
in the new query test methods.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…dataProvider

Tests added without a dataProvider cause ArrayIndexOutOfBoundsException
in resolveTestNameSuffix because TestNG passes an empty Object[] to
@BeforeMethod setTestName. Add a null/length check before accessing row[0].

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
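
A guard of the kind described here might look like the sketch below. Only the method name `resolveTestNameSuffix` and the `row[0]` access come from the commit message; the returned suffix format is an assumption for illustration.

```java
// Hypothetical sketch; the real test helper's return format is not shown in this PR.
final class TestNameSuffixSketch {

    /** Returns a suffix built from the first data-provider value, or "" when there is none. */
    static String resolveTestNameSuffix(Object[] row) {
        if (row == null || row.length == 0) {
            return ""; // test has no dataProvider, so row[0] would throw
        }
        return "[" + row[0] + "]";
    }
}
```
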
Aggregate, DISTINCT, and GROUP BY query results are projections or
scalar values — not full envelope-wrapped documents. The
EnvelopWrappingItemSerializer returns null for these results (no
wrappedContent field), causing NullPointerException in Reactor's
FluxIterable.poll().

Use DEFAULT_SERIALIZER on the query request options for these tests.
This still validates the core fix (client-level custom serializer does
not leak into the internal query pipeline) while correctly handling
non-document result deserialization.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
SELECT VALUE COUNT(1) returns a scalar integer, so use Integer.class
instead of ObjectNode.class as the result type. ObjectNode.class fails
because ValueUnwrapCosmosItemSerializer extracts the _value field and
cannot convert the resulting integer to ObjectNode.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Create a BasicCustomItemSerializer that mirrors the real-world use case
from issue Azure#45521 — a simple ObjectMapper-based serializer with custom
settings (dates as ISO strings via JavaTimeModule) without transforming
the document structure.

Changes:
- Add BasicCustomItemSerializer inner class with custom ObjectMapper
- Add it to the Factory data provider so query tests also run with it
- Update query tests to use the custom serializer directly when the
  serializer does not transform document structure (vs falling back to
  DEFAULT_SERIALIZER for envelope-wrapping)
- Use instanceof checks for EnvelopWrappingItemSerializer instead of
  identity comparison for robustness

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…back to default

For queryWithAggregate, queryWithDistinct, and queryWithGroupBy custom
serializer tests: skip the test entirely when EnvelopWrappingItemSerializer
is used, rather than falling back to DEFAULT_SERIALIZER. These query result
shapes (aggregates, projections) are not full documents and cannot be
properly deserialized by the envelope-wrapping serializer.

Also removed unused isEnvelopeWrapper variable from queryWithOrderBy.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ameter

Bug fixes:
- Use SELECT COUNT(1) instead of SELECT VALUE COUNT(1) in aggregate
  serializer test (custom serializers can't deserialize scalar _value wrapper)
- Preserve original Java type when cloning SqlParameter values (use
  getRawValue() instead of getValue(Object.class) which loses Instant→Long)
- Handle scalar values in BasicCustomItemSerializer.serialize() via
  JsonNode conversion + primitive value key pattern
- Skip SqlParameter clone/serialization when query has no parameters

Design improvements:
- Add canSerialize capability flag to CosmosItemSerializer using the same
  private field + setter pattern as setShouldWrapSerializationExceptions.
  SqlParameter.applySerializer() checks this flag instead of catching
  exceptions, so serializer bugs always propagate immediately.
- CosmosItemSerializerNoExceptionWrapping sets canSerialize=false via
  bridge accessor (both Spark Scala and test Java versions)

New test:
- queryWithSqlParameterDateTimeAndCustomSerializer validates end-to-end
  scenario: custom serializer writes Instant as ISO-8601 string,
  SqlParameter applies same serializer so query filter matches stored value

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
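
The capability-flag design can be sketched with stand-in types. `Serializer`, `deserializeOnly`, and `serializeParameter` are hypothetical names, not the SDK's API; the sketch shows only the control flow described above, where the flag is checked up front and exceptions from serialize() are not caught.

```java
// Hypothetical stand-ins; not CosmosItemSerializer or SqlParameter.
final class CanSerializeSketch {

    /** Serializer base with a capability flag that defaults to true. */
    abstract static class Serializer {
        private boolean canSerialize = true;

        void setCanSerialize(boolean canSerialize) {
            this.canSerialize = canSerialize;
        }

        boolean canSerialize() {
            return canSerialize;
        }

        abstract Object serialize(Object value);
    }

    /** Deserialize-only serializer: serialize() is a stub, so the flag is switched off. */
    static Serializer deserializeOnly() {
        Serializer s = new Serializer() {
            @Override
            Object serialize(Object value) {
                throw new IllegalStateException("serialize() is not implemented");
            }
        };
        s.setCanSerialize(false);
        return s;
    }

    /**
     * Checks the flag instead of catching exceptions, so a serializer that claims it can
     * serialize but throws will surface its failure to the caller.
     */
    static Object serializeParameter(Serializer serializer, Object rawValue, Object defaultValue) {
        if (!serializer.canSerialize()) {
            return defaultValue; // keep the default serialization
        }
        return serializer.serialize(rawValue);
    }
}
```

Compared with the earlier try-catch fallback, this keeps the deserialize-only case working without hiding genuine serializer bugs.
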
When applySerializer is called for SQL parameter values (strings, numbers),
the EnvelopWrappingItemSerializer.serialize() would NPE because
ConcurrentHashMap.put() rejects null values from missing 'id'/'mypk' keys.

Pass through non-document values (those without 'id' key) without wrapping.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
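
A sketch of the pass-through, using plain maps in place of the test serializer's JSON handling. `EnvelopeWrapSketch` and `wrap` are hypothetical, and the envelope layout is an assumption apart from the `id`, `mypk`, and `wrappedContent` names mentioned in this PR.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; the real EnvelopWrappingItemSerializer may handle fields differently.
final class EnvelopeWrapSketch {

    /** Wraps documents that carry an "id"; returns any other value unchanged. */
    static Map<String, Object> wrap(Map<String, Object> json) {
        Object id = json.get("id");
        if (id == null) {
            return json; // not a document (for example a query parameter value): pass through
        }
        Map<String, Object> envelope = new ConcurrentHashMap<>();
        envelope.put("id", id);
        Object pk = json.get("mypk");
        if (pk != null) {
            envelope.put("mypk", pk); // ConcurrentHashMap.put throws NullPointerException on null values
        }
        envelope.put("wrappedContent", json);
        return envelope;
    }
}
```
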
- Set canSerialize(false) on ValueUnwrapCosmosItemSerializer for
  consistency with CosmosItemSerializerNoExceptionWrapping, preventing
  accidental serialize() calls that throw IllegalStateException.

- Skip SqlParameter cloning when canSerialize is false, avoiding
  unnecessary allocations for deserialize-only serializers like
  the Spark connector's CosmosItemSerializerNoExceptionWrapping.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…stages

Extract cloneOptionsForInternalPipeline() helper to enforce the invariant
that internal pipeline stages use DEFAULT_SERIALIZER. This replaces 4
separate clone+setCustomItemSerializer call sites, making the invariant
self-documenting and impossible to forget for future query types.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
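
The helper's role can be sketched as below. Only the method name `cloneOptionsForInternalPipeline` comes from this commit; `QueryOptions`, its fields, and the string-valued serializer are simplified stand-ins, not the SDK's request options.

```java
// Hypothetical stand-ins; not the SDK's request-options or serializer types.
final class InternalPipelineOptionsSketch {

    /** Stands in for CosmosItemSerializer.DEFAULT_SERIALIZER. */
    static final String DEFAULT_SERIALIZER = "default";

    static final class QueryOptions {
        String serializer;   // null means "not set at request level"
        int maxItemCount;

        QueryOptions copy() {
            QueryOptions c = new QueryOptions();
            c.serializer = serializer;
            c.maxItemCount = maxItemCount;
            return c;
        }
    }

    /** Single place that enforces the invariant for every internal pipeline stage. */
    static QueryOptions cloneOptionsForInternalPipeline(QueryOptions userOptions) {
        QueryOptions internal = userOptions.copy();
        internal.serializer = DEFAULT_SERIALIZER; // never null, never the user's serializer
        return internal;
    }
}
```

Routing every internal stage through one method means a new query type inherits the invariant by calling the helper, rather than by remembering a separate setter call.
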
@xinlian12 xinlian12 marked this pull request as ready for review April 14, 2026 19:34
@xinlian12 xinlian12 requested review from a team and kirankumarkolli as code owners April 14, 2026 19:34
Copilot AI review requested due to automatic review settings April 14, 2026 19:34
@xinlian12 xinlian12 changed the title fix: [SparkConnector] Skip readContainerThroughput when targetThroughput is configured fix: CustomItemSerializer not applied correctly in queries and SqlParameter [Cosmos] Apr 14, 2026
@xinlian12
Member Author

/azp run java - cosmos - tests

@xinlian12
Member Author

/azp run java - cosmos - spark

@xinlian12
Member Author

/azp run java - cosmos - kafka

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@xinlian12
Member Author

@sdkReviewAgent

@xinlian12
Member Author

Review complete (39:33)

Posted 3 inline comment(s).

Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage

xinlian12 and others added 7 commits April 14, 2026 14:19
- Add queryWithConcurrentSqlQuerySpecReuseAndCustomSerializer test that
  validates the cloning logic prevents races when the same SqlQuerySpec
  is reused across concurrent queries with a custom serializer
- Enhance cloneOptionsForInternalPipeline Javadoc to document that new
  internal pipeline stages MUST use this method (ref PR Azure#48811)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@xinlian12
Member Author

/azp run java - cosmos - spark

@xinlian12
Member Author

/azp run java - cosmos - tests

@xinlian12
Member Author

/azp run java - cosmos - kafka

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

xinlian12 and others added 3 commits April 15, 2026 22:53
…concurrent corruption

When multiple threads concurrently execute queries with the same shared
SqlQuerySpec and query plan caching is not applicable (e.g., cross-partition
queries without a partition key), all threads call
QueryPlanRetriever.getQueryPlanThroughGatewayAsync which serializes the
shared SqlQuerySpec. The serialization path calls populatePropertyBag()
which modifies the SqlQuerySpec's internal ObjectNode (backed by a
non-thread-safe LinkedHashMap). Concurrent modification can corrupt the
ObjectNode, producing malformed JSON with duplicate parameter entries
that the server rejects with 'duplicate parameter name'.

This fix creates a defensive copy of the SqlQuerySpec before serializing
it for the query plan request. Each copy has its own property bag,
eliminating the concurrent modification race.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Added SqlQuerySpec.clone() method (overrides Object.clone()) that creates
  a shallow copy with its own property bag to prevent concurrent modification
  of the internal ObjectNode.
- Exposed clone() via SqlQuerySpecAccessor for internal use.
- Refactored QueryPlanRetriever to use the new accessor-based clone.
- Rewrote queryWithConcurrentSqlQuerySpecReuseAndCustomSerializer test to
  use CosmosAsyncClient/CosmosAsyncContainer with Flux.range().flatMap()
  instead of ExecutorService and CountDownLatch.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove SqlQuerySpec.clone() and SqlQuerySpecAccessor.clone() — the
defensive copy logic is only needed in QueryPlanRetriever, so keep it
as a simple private copyQuerySpec() method there.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@xinlian12
Member Author

@sdkReviewAgent

@xinlian12
Member Author

Review complete (36:51)

Posted 2 inline comment(s).

Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage

xinlian12 and others added 2 commits April 16, 2026 10:24
Move the clone logic from SqlQuerySpecAccessor.cloneSqlParameter() to a
public SqlParameter.clone() method, following the convention that clone
operations belong on the type itself. Update PipelinedQueryExecutionContextBase
to call p.clone() directly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
SqlParameter is a public model class — keep its public API unchanged.
The copy logic is now a private createCopy() method exposed via
SqlParameterHelper/SqlParameterAccessor in ImplementationBridgeHelpers,
following the same accessor pattern used by other model types.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
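
The accessor pattern mentioned here can be sketched in miniature. All names below (`Param`, `ParamHelper`, `Accessor`) are hypothetical and the real ImplementationBridgeHelpers wiring may differ; the sketch shows only the general shape, where the model type keeps its copy method private and registers an accessor that internal code obtains through a helper.

```java
// Hypothetical miniature of an accessor pattern; not the SDK's ImplementationBridgeHelpers.
final class AccessorPatternSketch {

    /** Model type: its copy method is private, so its visible API does not grow. */
    static final class Param {
        private final String name;
        private final Object value;

        // Registers the accessor when the class is initialized.
        static {
            ParamHelper.setAccessor(new ParamHelper.Accessor() {
                @Override
                public Param createCopy(Param p) {
                    return p.createCopy();
                }
            });
        }

        Param(String name, Object value) {
            this.name = name;
            this.value = value;
        }

        String name() {
            return name;
        }

        Object value() {
            return value;
        }

        private Param createCopy() {
            return new Param(name, value);
        }
    }

    /** Internal helper through which SDK-internal code reaches the private method. */
    static final class ParamHelper {
        interface Accessor {
            Param createCopy(Param p);
        }

        private static volatile Accessor accessor;

        static void setAccessor(Accessor a) {
            accessor = a;
        }

        static Accessor getAccessor() {
            if (accessor == null) {
                throw new IllegalStateException("Param has not been initialized yet");
            }
            return accessor;
        }
    }
}
```

Internal callers would write `ParamHelper.getAccessor().createCopy(p)`; since a `Param` instance must exist before it can be copied, the class has already been initialized and the accessor registered by then.
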
@xinlian12
Member Author

/azp run java - cosmos - kafka

@xinlian12
Member Author

/azp run java - cosmos - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).


Development

Successfully merging this pull request may close these issues.

[BUG] Cosmos – CustomItemSerializer - not working in certain queries and not applied in SqlParameter

3 participants