Adding the embedding generator interface#46902
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a public embedding-generation interface to azure-cosmos (sync + async) by introducing an EmbeddingProvider Protocol and an EmbeddingResult return type, enabling GenerateEmbeddings(...) query expressions to carry vectors plus optional usage metadata.
Changes:
- Added
EmbeddingProviderProtocols for sync and async clients. - Added
EmbeddingResultfrozen dataclass to return vectors plus optional token usage metadata. - Exported the new API surface via package
__init__files and documented it in the changelog.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| sdk/cosmos/azure-cosmos/azure/cosmos/aio/_embedding_provider.py | Adds async EmbeddingProvider Protocol contract for aio client usage |
| sdk/cosmos/azure-cosmos/azure/cosmos/aio/init.py | Exposes EmbeddingProvider / EmbeddingResult from azure.cosmos.aio |
| sdk/cosmos/azure-cosmos/azure/cosmos/_embedding_result.py | Adds EmbeddingResult dataclass used as the embedding return type |
| sdk/cosmos/azure-cosmos/azure/cosmos/_embedding_provider.py | Adds sync EmbeddingProvider Protocol contract |
| sdk/cosmos/azure-cosmos/azure/cosmos/init.py | Exposes EmbeddingProvider / EmbeddingResult from azure.cosmos |
| sdk/cosmos/azure-cosmos/CHANGELOG.md | Documents the new embedding interface feature |
|
/azp run python - cosmos - tests |
|
Azure Pipelines successfully started running 1 pipeline(s). |
kushagraThapar
left a comment
There was a problem hiding this comment.
Thanks @aayush3011 — this is a clean, well-scoped interface PR, and Python is actually ahead of .NET on several dimensions I want to call out before the asks: @runtime_checkable on both Protocols (which the .NET interface can't offer natively), a single shared EmbeddingResult dataclass between sync and async (no duplication, no drift risk), and a sync surface that .NET doesn't have. Nice work.
I see @FabianMeiswinkel already LGTM'd, so framing the below as two small things to consider for either this PR or a quick follow-up — happy to defer either if merge is imminent.
1. EmbeddingResult is reachable from azure.cosmos but not from azure.cosmos.aio — could you please re-export it?
This is the genuinely new + actionable one. azure/cosmos/aio/__init__.py re-exports EmbeddingProvider but EmbeddingResult is absent from __all__, even though the async EmbeddingProvider.generate_embeddings(...) returns it:
$ grep -n "EmbeddingResult\|EmbeddingProvider" sdk/cosmos/azure-cosmos/azure/cosmos/aio/__init__.py
25:from ._embedding_provider import EmbeddingProvider
35: "EmbeddingProvider"
The net effect is that an async-only customer writing from azure.cosmos.aio import EmbeddingProvider, EmbeddingResult hits ImportError and is forced to cross-import the return type from the sync namespace (from azure.cosmos import EmbeddingResult). Every other return type in aio (ContainerProxy, DatabaseProxy, CosmosClient) is re-exported — this is the convention this PR diverges from. The shared dataclass design is intentional and good; it just needs to be visible from both namespaces.
Could you please add the re-export?
# aio/__init__.py
from .._embedding_result import EmbeddingResult
...
__all__ = (..., "UserProxy", "EmbeddingProvider", "EmbeddingResult")Worth doing in this PR if there's room — it's a 2-line change and an interface-only PR is the easiest moment to fix it before customers start cross-importing. Totally fine to defer to a fast follow-up too if you'd rather keep this PR's diff minimal.
2. Could we consider expanding the Protocol docstring contract — extending @copilot's good concurrency-symmetry observation?
@copilot already flagged that the async Protocol docstring drops the "Implementations must be safe to call concurrently" sentence the sync one carries — that observation is on the right track, and I am curious to learn if we could extend it into a fuller contract on both docstrings while we're here.
The .NET counterpart (PR #5838) ships a comprehensive 4-paragraph <remarks> block on ICosmosEmbeddingGenerator covering:
- Lifecycle and disposal — the customer owns the generator instance; the SDK keeps a reference for the lifetime of the configured client but never disposes it; the customer disposes any underlying HTTP/embedding client resources.
- Error semantics — implementations are responsible for transient-failure handling (network, rate limiting); the SDK does not retry the provider; exceptions are wrapped into a
CosmosExceptionand surfaced to the originating caller.- Cancellation — implementations should honor the supplied cancellation cooperatively wherever feasible (e.g. forward to the underlying HTTP call).
- Idempotency and concurrency — the SDK may invoke the provider multiple times for the same inputs (internal query retry) and concurrently from multiple callers; implementations must be safe to call repeatedly and from parallel callers; implementations may cache responses to avoid duplicate billing.
(Source: ICosmosEmbeddingGenerator.cs:21-46 — we asked for this on .NET PR #5838 as M6 and the author landed it.)
Could you please consider mirroring that contract on both the sync and async Python Protocol docstrings, with Python-idiomatic phrasing? Roughly something like:
"""Protocol for classes that generate text embeddings for Azure Cosmos DB queries.
Implementations are invoked by the SDK to embed literal text in queries that
use ``GenerateEmbeddings(...)``. A provider may be attached at the client level
or overridden at the container level.
**Lifecycle.** The SDK does not own the lifetime of this object. Customers are
responsible for constructing the provider and disposing/closing any resources
it holds (e.g. ``httpx`` / ``aiohttp`` clients) when their application tears down.
**Error semantics.** Implementations should raise on transport or service failure;
the SDK does not retry calls to ``generate_embeddings``. Provider exceptions are
surfaced to the originating SDK caller wrapped in a ``CosmosClientError``.
**Cancellation.** Implementations should respect cooperative cancellation where
appropriate (e.g. ``asyncio.CancelledError`` on the async Protocol; forwarding
timeouts to the underlying HTTP call on the sync Protocol).
**Idempotency and concurrency.** Implementations must be safe to call (or
``await``) concurrently from multiple tasks. The SDK may invoke the provider
more than once for the same inputs during internal query retry, so providers
should be idempotent and may cache responses internally to avoid duplicate
billing for identical inputs.
"""(Same body on both sync and async Protocols — only the leading sentence differs.)
The reason for raising this now: an interface-only PR is the ideal moment to lock the contract before customer implementations start depending on the implicit semantics. Once even one customer ships against the current 1-sentence contract, the contract is effectively whatever they assumed — porting the explicit semantics later becomes a much harder conversation. Happy to defer to a fast follow-up if you'd rather keep this PR minimal.
Other things deferred
latencyonEmbeddingResultfor .NET parity — @ananth7592 has an open thread on this already; I don't want to duplicate, but +1 to that ask whenever it lands. .NET shipsTimeSpan? LatencyatCosmosEmbeddingResult.cs:67.- A handful of design-doc-alignment questions we noticed (multi-datatype coverage for
uint8/int8pervectorEmbeddingPolicy.dataType, anchoringdimensionsto a specificvectorEmbeddingPolicy.pathwhen multiple paths have different dims, and the broader observation that the Cosmos design-doc corpus has no chapter on vector search yet) — those belong in a separate design-docs PR, not in the SDK. Surfacing here just so they're not lost.
Thanks again — looking forward to the wiring + companion azure-cosmos-ai package PRs.
@kushagraThapar Following the repo convention, pure data types like PartitionKey and ConsistencyLevel aren't re-exported from azure.cosmos.aio. Async users import them directly from azure.cosmos. EmbeddingResult fits the same mold. It's a single data class with no sync or async variant, so it should have one canonical home (azure.cosmos) rather than being re-exported from both. This is also why the Sphinx build is failing. When autodoc processes both packages, it encounters EmbeddingResult twice under the same fully qualified name (azure.cosmos._embedding_result.EmbeddingResult), producing a duplicate object description warning that gets promoted to an error under W (warnings treated as errors). Dropping the re-export from azure.cosmos.aio.init.py resolves the warning and aligns with how the rest of the shared types are exposed. |
simorenoh
left a comment
There was a problem hiding this comment.
Only pending questions are on the use of dataclass for the EmbeddingResult class and the runtime_checkable annotation for Azure SDK guidelines, otherwise LGTM
…uration (preview) (#5838) Adds the **public surface only** for V0 query-time embedding generation in the .NET SDK (preview). No runtime behavior is wired up yet — pipeline resolver and binding land in a follow-up PR. ## Customer-facing API (preview) ### Interface and result type ```csharp public interface ICosmosEmbeddingGenerator { Task<CosmosEmbeddingResult> GenerateEmbeddingsAsync( IReadOnlyList<string> texts, string endpoint, string deploymentName, int dimensions, CancellationToken cancellationToken = default); } public sealed class CosmosEmbeddingResult { public IReadOnlyList<ReadOnlyMemory<float>> Vectors { get; } public int? TotalTokens { get; } public TimeSpan? Latency { get; } } ``` Signature mirrors [Python SDK PR #46902](Azure/azure-sdk-for-python#46902). ### Client-wide configuration ```csharp var client = new CosmosClientBuilder(endpoint, cred) .WithEmbeddingGenerator(myGenerator) .Build(); ICosmosEmbeddingGenerator current = client.EmbeddingGenerator; ``` Or via options: ```csharp var client = new CosmosClient(endpoint, key, new CosmosClientOptions { EmbeddingGenerator = myGenerator, }); ``` ## Files ### Source (new) - `Microsoft.Azure.Cosmos/src/ICosmosEmbeddingGenerator.cs` - `Microsoft.Azure.Cosmos/src/CosmosEmbeddingResult.cs` ### Source (modified) - `CosmosClientOptions.cs` — adds `EmbeddingGenerator` (`[JsonIgnore]`) - `Fluent/CosmosClientBuilder.cs` — adds `WithEmbeddingGenerator(...)` - `CosmosClient.cs` — adds `EmbeddingGenerator` read-only accessor ### Tests - `CosmosClientOptionsUnitTests.cs`: - `VerifyEmbeddingGeneratorBuilderProperties` — builder round-trip + null-arg guard - `CosmosClient_EmbeddingGenerator_ReturnsConfiguredInstance` — read-only accessor surfaces builder- and options-set instances - `CosmosClientOptions_Clone_PreservesEmbeddingGenerator` — locks down `Clone()` preservation - `Contracts/DotNetPreviewSDKAPI.net6.json` — regenerated for the new surface ## Validation - ✅ PREVIEW build (`/p:IsPreview=true`): 0 errors - ✅ Non-PREVIEW build: 0 errors - ✅ `ContractEnforcement` + `SettingsContractTests` + `CosmosClientOptionsUnitTests`: PREVIEW 220/220 ; non-PREVIEW 221/221 - ✅ Baseline regenerated with LF line endings preserved Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
Adds the EmbeddingProvider Protocol and EmbeddingResult dataclass to azure-cosmos (both sync and async), defining the contract the SDK will use to generate vector embeddings for GenerateEmbeddings(...) query expressions.
This PR is interface-only, no client wiring, no concrete implementation. A default Azure OpenAI implementation will ship in a future companion azure-cosmos-ai package.
EmbeddingResult is used as the return type instead of a plain List[List[float]] so the SDK can carry optional usage metadata (e.g. total_tokens) alongside the vectors for diagnostics later on without a breaking change to the Protocol.
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines