Region normalization fix #46937
/azp run python - cosmos - tests
Azure Pipelines successfully started running 1 pipeline(s).
Pull request overview
This PR fixes a routing correctness issue in azure-cosmos where customer-supplied region strings in preferred_locations / excluded_locations (client-level and per-request) were compared against account region names using exact string matching, causing non-canonical spellings (e.g., east-us-2, eastus2) to be silently ignored.
Changes:
- Add a single region-name normalization routine and apply it consistently across routing, refresh decisions, and bootstrap locational endpoint construction.
- Add config-time warnings (deduped across refreshes) when configured preferred/excluded regions don’t match any account regions.
- Add targeted tests validating normalization behavior and warning deduplication; update CHANGELOG with the bug fix note.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| sdk/cosmos/azure-cosmos/azure/cosmos/_location_cache.py | Implements region normalization, uses normalized lookups for endpoint selection/refresh logic, and emits deduped mismatch warnings. |
| sdk/cosmos/azure-cosmos/tests/test_location_cache.py | Adds regression tests for normalized preferred/excluded locations, non-preferred routing paths, locational endpoint construction, and warning dedupe behavior. |
| sdk/cosmos/azure-cosmos/CHANGELOG.md | Documents the bug fix in the unreleased changelog section. |
updating changelog Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@sdkReviewAgent-2
✅ Review complete (45:37) Posted 5 inline comment(s). Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage
✅ Review complete (42:30) Posted 2 inline comment(s). Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage
When a customer configures a Cosmos client, they pass region names as strings (preferred_locations, excluded_locations). The SDK previously did exact string comparisons against the canonical names returned by the account ("East US 2", "West US 3", ...).
A small spelling difference — "eastus2", "east-us-2", "east_us_2" — silently failed to match, the entry was dropped, and the client could end up routing all traffic through the global endpoint instead of the regional pool.
What this change does:
Region matching is now tolerant of case, surrounding/internal whitespace, hyphens, and underscores.
Equivalent inputs all resolve to the same region:
"East US 2"
"east us 2"
"eastus2"
"EASTUS2"
"east-us-2"
"east_us_2"
" EastUs2 "
Anything beyond that (other punctuation, digits, fuzzy matching) is intentionally left untouched: a more aggressive rule could collapse genuinely different regions like "East US" and "East US 2" into the same key and silently route to the wrong region.
All client-supplied region-name strings (client-level preferred, client-level excluded, and request-level excluded) are normalized.
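As a rough illustration of the rule described above, here is a minimal sketch. The helper name `normalize_region_name` is hypothetical and is not taken from the SDK; the actual implementation in `_location_cache.py` may differ.

```python
import re

# Hypothetical helper for illustration only; not the SDK's actual function.
# Matches runs of whitespace, hyphens, and underscores.
_SEPARATORS = re.compile(r"[\s\-_]+")


def normalize_region_name(name: str) -> str:
    """Lowercase and drop whitespace, hyphens, and underscores; keep everything else."""
    return _SEPARATORS.sub("", name).lower()


equivalent = [
    "East US 2", "east us 2", "eastus2", "EASTUS2",
    "east-us-2", "east_us_2", " EastUs2 ",
]
keys = {normalize_region_name(s) for s in equivalent}
print(keys)  # {'eastus2'}

# Digits are preserved, so genuinely different regions keep different keys.
print(normalize_region_name("East US") == normalize_region_name("East US 2"))  # False
```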
The same matching rule is applied wherever the customer's region string is consumed. Previously some paths used exact-string comparisons; now they all share one normalization rule:
Routing: which region serves a request (preferred + excluded, client-level + per-request).
Refresh decision: whether to schedule a background refresh based on the most-preferred region.
Bootstrap fallback URL: when the global endpoint can't be reached at startup and the SDK constructs a regional URL from a preferred region.
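The routing and bootstrap cases above can be sketched as follows. This is only an illustration under stated assumptions: `_norm`, `resolve_regions`, and `locational_endpoint` are hypothetical names, `contoso` is a made-up account, and the sketch assumes a regional host is formed by appending the normalized region to the account name. The SDK's real code paths may be organized differently.

```python
from typing import List, Optional
from urllib.parse import urlsplit, urlunsplit


def _norm(name: str) -> str:
    # Illustrative rule: case-insensitive, ignoring whitespace, "-" and "_".
    return "".join(ch for ch in name.lower() if not (ch.isspace() or ch in "-_"))


def resolve_regions(account_regions: List[str],
                    preferred: List[str],
                    excluded: Optional[List[str]] = None) -> List[str]:
    """Map configured names to canonical account names, honoring exclusions."""
    by_key = {_norm(r): r for r in account_regions}
    excluded_keys = {_norm(r) for r in (excluded or [])}
    resolved: List[str] = []
    for entry in preferred:
        key = _norm(entry)
        canonical = by_key.get(key)
        if canonical is not None and key not in excluded_keys and canonical not in resolved:
            resolved.append(canonical)
    return resolved


def locational_endpoint(global_endpoint: str, region: str) -> str:
    """Assumed URL shape: <account>-<normalized region>.<rest of host>."""
    parts = urlsplit(global_endpoint)
    account, _, rest = (parts.hostname or "").partition(".")
    host = f"{account}-{_norm(region)}.{rest}"
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


account = ["East US 2", "West US 3"]
print(resolve_regions(account, ["east-us-2", "westus3"]))
# ['East US 2', 'West US 3']
print(resolve_regions(account, ["eastus2", "West US 3"], excluded=["EAST_US_2"]))
# ['West US 3']
print(locational_endpoint("https://contoso.documents.azure.com:443/", "East-US 2"))
# https://contoso-eastus2.documents.azure.com:443/
```

Keying the lookup on the normalized form while returning the account's canonical spelling means downstream code keeps seeing the names the service reported.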
Misconfigured region names now produce a visible warning that names the dropped entry and the regions that were available, emitted at config time — when account metadata is processed at startup and on each background refresh.
There is no warning for per-request values, because those are evaluated thousands of times and have a lower blast radius.
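One way the config-time warning and its deduplication across refreshes might work is sketched below. The class `RegionConfigChecker`, the logger name, and the message wording are all assumptions made for illustration; the PR's actual dedup mechanism is not shown in this description.

```python
import logging
from typing import Iterable, List, Set, Tuple

logger = logging.getLogger("region_config_sketch")  # hypothetical logger name


def _norm(name: str) -> str:
    # Illustrative rule: case-insensitive, ignoring whitespace, "-" and "_".
    return "".join(ch for ch in name.lower() if not (ch.isspace() or ch in "-_"))


class RegionConfigChecker:
    """Hypothetical helper: warns once per unmatched configured region."""

    def __init__(self) -> None:
        # Remember (setting, entry) pairs already reported so refreshes stay quiet.
        self._already_warned: Set[Tuple[str, str]] = set()

    def check(self, setting: str, configured: Iterable[str],
              account_regions: List[str]) -> List[str]:
        known = {_norm(r) for r in account_regions}
        unmatched: List[str] = []
        for entry in configured:
            if _norm(entry) in known:
                continue
            unmatched.append(entry)
            marker = (setting, entry)
            if marker not in self._already_warned:
                self._already_warned.add(marker)
                logger.warning(
                    "%s entry %r does not match any account region; available regions: %s",
                    setting, entry, sorted(account_regions),
                )
        return unmatched


checker = RegionConfigChecker()
account = ["East US 2", "West US 3"]
# Startup: "Wst US 3" is reported once; "eastus2" matches after normalization.
checker.check("preferred_locations", ["eastus2", "Wst US 3"], account)
# Background refresh with the same config: no second warning is logged.
checker.check("preferred_locations", ["eastus2", "Wst US 3"], account)
```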
No public API change - preferred_locations and excluded_locations remain plain list[str].
Behavior examples
Account regions: ["East US 2", "West US 3"]
Backwards compatibility:
If the client config already uses the exact spelling Azure returns (e.g., "East US 2"), nothing changes.
New Azure regions continue to work without an SDK upgrade; nothing in this change hardcodes a region list.
What this PR does not do:
Does not add a Regions constants surface. Adding a named-constant list of well-known regions would expand the public API surface area. Normalization plus a visible warning is enough to close the failure mode without touching the public surface today. Constants are therefore deferred to a later stable release as a separate, additive change.