Skip to content

Return HTTP 409 (Conflict) for Sequential Bundle Race Condition#5592

Merged
mikaelweave merged 7 commits into
mainfrom
mikaelweave/fix-theory1-sql-transaction-conflict
May 29, 2026
Merged

Return HTTP 409 (Conflict) for Sequential Bundle Race Condition#5592
mikaelweave merged 7 commits into
mainfrom
mikaelweave/fix-theory1-sql-transaction-conflict

Conversation

@mikaelweave
Copy link
Copy Markdown
Contributor

@mikaelweave mikaelweave commented May 28, 2026

Description

This PR changes SQL Server transaction-bundle behavior for the confirmed WI 188321 root cause. When MergeAsync sees SQL conflict 50409 while enlisted in a C# transaction, it now stops retrying inside that already-invalidated transaction and throws ResourceConflictException, which maps to HTTP 409 Conflict. This prevents the request from continuing to SqlTransaction.Commit() and surfacing the later zombie transaction symptom as a generic HTTP 500.

The completed SqlTransaction fallback in BundleHandler remains a 500 for non-conflict root causes. The change is intentionally scoped to enlisted SQL concurrency conflicts; the existing non-enlisted retry path still retries and preserves its current exhausted-retry behavior.

This PR also adds targeted coverage for:

  • Enlisted SQL conflict fail-fast behavior at the SQL data store layer.
  • Transaction bundle conflict wiring to HTTP 409.
  • The existing completed/zombie transaction fallback behavior.

The initial PowerShell repro harness was removed from the branch.

AB#188321

Testing

  • dotnet test .\src\Microsoft.Health.Fhir.Api.UnitTests\Microsoft.Health.Fhir.R4.Api.UnitTests.csproj --no-restore --filter "FullyQualifiedName~BundleHandlerTests" --logger "console;verbosity=minimal"
  • dotnet test .\src\Microsoft.Health.Fhir.Api.UnitTests\Microsoft.Health.Fhir.R4.Api.UnitTests.csproj --no-restore --filter "FullyQualifiedName~GivenATransaction_WhenTransactionIsZombiedAtCommit|FullyQualifiedName~GivenATransaction_WhenInnerRequestReturnsConflictOperationOutcome" --logger "console;verbosity=minimal"
  • dotnet test .\src\Microsoft.Health.Fhir.SqlServer.UnitTests\Microsoft.Health.Fhir.SqlServer.UnitTests.csproj --no-restore --framework net8.0 --filter "Name~MergeAsync_OnSqlConflict_WithEnlistedTransaction" --logger "console;verbosity=minimal"
  • dotnet test .\src\Microsoft.Health.Fhir.SqlServer.UnitTests\Microsoft.Health.Fhir.SqlServer.UnitTests.csproj --no-restore --framework net9.0 --filter "Name~MergeAsync_OnSqlConflict_WithEnlistedTransaction" --logger "console;verbosity=minimal"

FHIR Team Checklist

  • Update the title of the PR to be succinct and less than 65 characters
  • Add a milestone to the PR for the sprint that it is merged (i.e. add S47)
  • Tag the PR with the type of update: Bug, Build, Dependencies, Enhancement, New-Feature or Documentation
  • Tag the PR with Open source, Azure API for FHIR (CosmosDB or common code) or Azure Healthcare APIs (SQL or common code) to specify where this change is intended to be released.
  • Tag the PR with Schema Version backward compatible or Schema Version backward incompatible or Schema Version unchanged if this adds or updates Sql script which is/is not backward compatible with the code.
  • When changing or adding behavior, if your code modifies the system design or changes design assumptions, please create and include an ADR.
  • CI is green before merge Build Status
  • Review squash-merge requirements

Semver Change (docs)

Patch

mikaelweave and others added 3 commits May 28, 2026 07:00
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mikaelweave mikaelweave changed the title Add Theory 1 transaction conflict repro tests Return HTTP 409 (Conflict) for Sequential Bundle Race Condition May 28, 2026
@mikaelweave mikaelweave added this to the FY26\Q4\2Wk\2Wk24 milestone May 28, 2026
@mikaelweave mikaelweave added Bug Bug bug bug. Azure Healthcare APIs Label denotes that the issue or PR is relevant to the FHIR service in the Azure Healthcare APIs No-PaaS-breaking-change No-ADR ADR not needed labels May 28, 2026
@mikaelweave mikaelweave marked this pull request as ready for review May 28, 2026 19:13
@mikaelweave mikaelweave requested a review from a team as a code owner May 28, 2026 19:13
mikaelweave and others added 2 commits May 28, 2026 14:00
The earlier fix gated the fail-fast 409 on MergeOptions.EnlistInTransaction,
but regular single-resource upserts also set enlistTransaction: true. That
caused concurrent no-ETag upserts (which should retry to a last-write-wins
outcome) to surface as ResourceConflictException (409) instead of retrying,
breaking GivenASavedResource_WhenConcurrentlyUpsertingWithNoETagHeader_ThenTheExistingResourceIsUpdated.

A SQL conflict (50409) only zombies the transaction when the command actually
enlisted in an ambient C# transaction scope (the condition used in
MergeResourcesWrapperAsync). Gate the fail-fast on
EnlistInTransaction && SqlTransactionScope != null, captured before the merge
attempt, so only sequential transaction bundles fail fast while regular and
batch-bundle upserts keep retrying.

Updated the enlisted unit test to set up a real ambient scope via
BeginTransaction(), and added a regression test asserting enlist:true without
an ambient scope still retries.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a comment noting that a SurrogateIdCollision (50424) inside an ambient C#
transaction zombies the transaction the same way a 50409 conflict does, so the
retries there are futile and still surface as a 500. Intentionally left
unhandled since 50424 is rare and C# transactions are being deprecated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
mikaelweave and others added 2 commits May 28, 2026 14:57
Document why a zombied-at-commit transaction returns 500 (root cause lost) while an inner-request conflict surfaces a precise 409 before commit. Condense data-store comments.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@codecov-commenter
Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.34%. Comparing base (ee554e9) to head (bb50e22).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #5592   +/-   ##
=======================================
  Coverage   77.33%   77.34%           
=======================================
  Files         996      996           
  Lines       36534    36550   +16     
  Branches     5538     5541    +3     
=======================================
+ Hits        28255    28268   +13     
+ Misses       6912     6910    -2     
- Partials     1367     1372    +5     

see 11 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@mikaelweave mikaelweave merged commit 4542176 into main May 29, 2026
49 checks passed
@mikaelweave mikaelweave deleted the mikaelweave/fix-theory1-sql-transaction-conflict branch May 29, 2026 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Azure Healthcare APIs Label denotes that the issue or PR is relevant to the FHIR service in the Azure Healthcare APIs Bug Bug bug bug. No-ADR ADR not needed No-PaaS-breaking-change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants