[release/9.0] [Mono] Fix c11 ARM64 atomics to issue full memory barrier. #115635
Backport of #115573 to release/9.0
/cc @steveisok @lateralusX
Customer Impact
Customers, both in #114262 and through normal support channels, have reported seemingly random crashes in their .NET 9 MAUI apps when running on iOS. All of the data provided to us pointed at an issue with Task continuations, but there was no clear way to tell why, as customers indicated they were doing nothing remarkable when the crashes happened.
A reliable repro was hard to come by, and it took some time for us to get something going. Once we had a repro in place, we found an issue with the Mono runtime's use of C11 atomics, specifically around when we call `Interlocked` functions. The callers of these functions expect a full memory barrier, whereas our .NET 9 implementation only guaranteed a half barrier. It just so happens that `Task` makes heavy use of interlocked functions.
The fix adds a full memory barrier around the interlocked functions.
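To illustrate the difference, here is a minimal sketch in C11, not the actual runtime change. The helper names `sketch_increment_c11_only` and `sketch_increment_full_barrier` are hypothetical and exist only for this example. The assumption is that on ARM64 a sequentially consistent read-modify-write may be compiled to acquire/release instructions without a standalone barrier instruction, so an explicit fence is what gives callers the full barrier they expect.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; these are not the Mono
 * runtime's actual functions. */

/* Increment relying solely on the ordering of the C11 operation itself.
 * Assumption: on ARM64 this may compile to acquire/release instructions
 * with no standalone barrier instruction. */
static inline int32_t
sketch_increment_c11_only (_Atomic int32_t *dest)
{
	return atomic_fetch_add_explicit (dest, 1, memory_order_seq_cst) + 1;
}

/* Same increment, followed by an explicit sequentially consistent fence
 * so that surrounding memory accesses are ordered by a full barrier. */
static inline int32_t
sketch_increment_full_barrier (_Atomic int32_t *dest)
{
	int32_t result = atomic_fetch_add_explicit (dest, 1, memory_order_seq_cst) + 1;
	atomic_thread_fence (memory_order_seq_cst);
	return result;
}
```

Both helpers return the incremented value, so they behave identically in single-threaded code. The difference only shows up under contention, where other threads observe the ordering of surrounding loads and stores.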
Regression
We completed C11 atomics support in 12ecfe7. This change inadvertently guaranteed only a half memory barrier instead of a full one, breaking consistency expectations.
Testing
We added tests extracted from our repro to validate the before and after. Before the fix, the crash would happen fairly quickly. After the fix was applied, not at all.
As to why we didn't hit this before, we did not have the kind of stress tests in place to simulate the contention customers would see at random.
Risk
Low - even though this is a sensitive area, the fix is targeted and has been well tested.
IMPORTANT: If this backport is for a servicing release, please verify that the PR targets `release/X.0-staging`, not `release/X.0`.
Package authoring no longer needed in .NET 9
IMPORTANT: Starting with .NET 9, you no longer need to edit a NuGet package's csproj to enable building and bump the version.
Keep in mind that we still need package authoring in .NET 8 and older versions.