ExecutionEngineException in WinForms app (Paint.NET) on .NET 8 RC2, which does not occur in .NET 7 (GCStress) #94579
Comments
Tagging subscribers to this area: @dotnet/gc
|
The fact that it repros with GCStress indicates it's some heap corruption, so possibly something the WinForms team should initially investigate. Is it easy to share a few dumps? |
Sure; just via Task Manager's right click -> Create memory dump file? |
Here is a DMP file from the paintdotnet.exe process right when it crashed: https://1drv.ms/u/s!Ak4YjacO2C9yktlxwjC_frr4DNymkA?e=5JChdi Let me know when you have the file so I can un-share/delete it |
As I had suspected the heap seems to be corrupted. Please provide access to the repo, so we can try to repro locally. Alternatively perhaps you could run with |
Also adding @AaronRobinsonMSFT since there is
|
Some frames in the native stack do not look right, perhaps due to a symbol mismatch. However, this was entered from ComAwareWeakReference, so it could be related to changes there. |
@rickbrew I have the file. |
@VSadov A possible explanation of the crash is that the syncblock was collected between the |
Yes, that is the most likely culprit since it happens under GCStress. Thanks for the hint about Precious. |
It looks like everything that puts any data in a syncblock will do The whole scheme with precious and transientPrecious seems somewhat dubious. If you need a syncblock, how likely is it that you will not need it again? Maybe there is a scenario where it is more useful than it appears at first glance. It is hard to rule out, though, that under GC stress there are opportunities for a syncblock to be created, collected by the GC, and recreated when it is next needed or before the data is actually set. Anyway, since setting interop info would set the precious bit, our way of detecting a possible COM object by the presence of a syncblock is valid. runtime/src/coreclr/vm/syncblk.cpp Lines 2915 to 2918 in 76aeefb
If there is interop info, then there must be a syncblock. I will check if such a fix is sufficient for Paint.NET. I wonder if it is possible to have a simpler test-like repro for this bug. |
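The invariant stated above (interop info implies a surviving syncblock) can be illustrated with a small toy model. This is only a sketch: `ToyBlock`, `ToyHeap`, and every member name below are hypothetical and are not the CoreCLR types or functions. It assumes, as the comment says, that storing interop info also sets the precious bit.

```cpp
#include <map>

// Hypothetical toy types for illustration; not the CoreCLR definitions.
struct ToyBlock {
    bool precious = false;        // toy GC keeps the block only if this is set
    bool hasInteropInfo = false;  // stand-in for COM interop data
};

struct ToyHeap {
    std::map<int, ToyBlock> syncBlocks;  // object id -> its syncblock

    // Assumption from the thread: storing interop info also marks the block precious.
    void SetInteropInfo(int id) {
        ToyBlock& b = syncBlocks[id];
        b.hasInteropInfo = true;
        b.precious = true;
    }

    // A block created only for locking carries no data and is not marked precious here.
    void InflateLockOnly(int id) { syncBlocks[id]; }

    // Toy GC: reclaim every block that is not precious.
    void Collect() {
        for (auto it = syncBlocks.begin(); it != syncBlocks.end();) {
            if (!it->second.precious) it = syncBlocks.erase(it);
            else ++it;
        }
    }

    // Cheap pre-check: an object without a syncblock cannot carry interop info.
    bool MayBeComObject(int id) const { return syncBlocks.count(id) != 0; }

    // Full check: the block must exist and actually hold interop info.
    bool IsComObject(int id) const {
        auto it = syncBlocks.find(id);
        return it != syncBlocks.end() && it->second.hasInteropInfo;
    }
};
```

In this model a lock-only block passes the cheap pre-check until the next collection but never passes the full check, while a block holding interop info survives collection, so the absence of a syncblock safely rules out a COM object.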
I've managed to capture a time travel trace for this failure, so now we can know the root cause. There is code in monitor lock that creates syncblocks if acquiring via the thinlock was unsuccessful, but does not make them precious, so it keeps creating these syncblocks and GC keeps collecting them. In my trace:
|
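To make the create-and-collect cycle described above concrete, here is a minimal toy simulation. It is a sketch under stated assumptions, not the runtime's code: `ToySyncBlock`, `ToySyncTable`, and `SimulateLockChurn` are hypothetical names, and the "GC" is simply a sweep that drops every block not marked precious.

```cpp
#include <unordered_map>

// Toy model only: names and layout are hypothetical, not the CoreCLR types.
struct ToySyncBlock {
    bool precious = false;  // if false, the toy GC may reclaim this block
};

class ToySyncTable {
public:
    // Returns the block for an object, creating (and counting) one if needed.
    ToySyncBlock& GetOrCreate(int objectId) {
        auto it = blocks_.find(objectId);
        if (it == blocks_.end()) {
            ++created_;
            it = blocks_.emplace(objectId, ToySyncBlock()).first;
        }
        return it->second;
    }

    // Toy GC: drop every block that is not marked precious.
    void Collect() {
        for (auto it = blocks_.begin(); it != blocks_.end();) {
            if (!it->second.precious) it = blocks_.erase(it);
            else ++it;
        }
    }

    int created() const { return created_; }

private:
    std::unordered_map<int, ToySyncBlock> blocks_;
    int created_ = 0;
};

// Simulates `rounds` lock acquisitions on one object with a collection after
// each one, roughly as under GC stress. Returns the number of blocks created.
int SimulateLockChurn(int rounds, bool markPrecious) {
    ToySyncTable table;
    for (int i = 0; i < rounds; ++i) {
        ToySyncBlock& sb = table.GetOrCreate(1);  // thin-lock path failed, so inflate
        if (markPrecious) sb.precious = true;
        table.Collect();                          // GC runs between operations
    }
    return table.created();
}
```

With `markPrecious` set to false, every round creates a fresh block because the previous one was swept; with it set to true, the first block survives and is reused.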
I have tried the fix proposed above and I no longer see the crash (without the fix it always repros for me). I've tried several times. It takes quite a while, since it is with GCStress=2, but eventually we get to the UI showing up without seeing EE exceptions. I will stage a PR for the main branch and then we can port the fix to 8.0. |
Why not make them precious? This back and forth seems less than ideal. |
Some early-2000s Java VMs experimented with lock deflation. If a fat lock becomes uncontended, you may try to reduce it back into a thin lock when GC runs. It is, however, rare that locks change their habits permanently. Once deflated, the lock will likely inflate again. I am not aware of any modern VM that does this. The code in question looks like an attempt to do something similar, but it does not go all the way. It would not deflate a fully inflated lock. It only deflates if it catches a lock in an intermediate stage, when there is a syncblock but no OS event allocated yet. That is probably why it is rare to see the effects outside of GC stress. There is also a piece that does not allow thin lock acquires if GC wants to happen, which is nearly always the case under GC stress. That leads to locks inflating even if there is no contention and creates an artificial opportunity for the above "optimization" to happen. The code could use some cleanup, but not in a servicing fix that we'd like to port to 8.0. |
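The partial deflation described above can be pictured as a three-state machine. The sketch below is a hypothetical model, not the runtime's implementation: the `LockState` names, `Acquire`, and `OnGc` are invented for illustration, and it assumes that an OS event is only allocated once a lock is actually contended.

```cpp
// Hypothetical lock states for illustration; not the CoreCLR representation.
enum class LockState {
    Thin,              // lock bits live in the object header
    SyncBlockNoEvent,  // syncblock exists, but no OS event allocated yet
    FullyInflated      // syncblock with an OS event for waiters
};

struct ToyLock {
    LockState state = LockState::Thin;
};

// Acquire in the toy model. If a GC is pending, the thin path is skipped, so
// the lock gets a syncblock even without contention. Contention is assumed to
// be what forces allocation of the OS event.
void Acquire(ToyLock& lock, bool gcPending, bool contended) {
    if (lock.state == LockState::Thin && (gcPending || contended)) {
        lock.state = LockState::SyncBlockNoEvent;
    }
    if (lock.state == LockState::SyncBlockNoEvent && contended) {
        lock.state = LockState::FullyInflated;
    }
}

// GC in the toy model deflates only the intermediate stage; a fully inflated
// lock is left alone.
void OnGc(ToyLock& lock) {
    if (lock.state == LockState::SyncBlockNoEvent) {
        lock.state = LockState::Thin;
    }
}
```

Under this model, an uncontended acquire with a GC pending lands in the intermediate state and is reset by the next collection, which is the cycle that GC stress makes frequent; a contended lock goes straight to the fully inflated state and stays there.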
Reopening for backport |
This has been backported to .NET 8.0.1. |
Description
I've had two spurious EEE's pop out at me while I was in the debugger over the last week, working on the new version of my app (Paint.NET v5.1) that is targeting .NET 8. I was worried it was due to my new code, some of which is clipboard related (always a bug farm) ...
... But I'm now able to reproduce this 100% with my latest public release (Paint.NET v5.0.11, .NET 7.0.12), when it's retargeted for .NET 8. It has no trouble with .NET 7.
I believe this is a new bug in .NET 8, it's not due to any changes I've made in my code.
Reproduction Steps
You will need access to https://github.com/dotnet/paint.net/ -- just ask and I will add you (for MSFT employees that is)
You will want this branch: https://github.com/dotnet/paint.net/tree/net8_EEE_repro . This is branched from v5.0.11, the latest public release using .NET 7, and then I've done the minimal amount of work to get it running on .NET 8 rc2 (mostly some nullable fixes that are already in my newer v5.1 branch)
DOTNET_GCStress=2 environment variable is set up in the Debug properties
ExecutionEngineException pop up before the app's main window is displayed
Expected behavior
App starts up just fine, albeit very slowly due to DOTNET_GCStress=2
Actual behavior
App is not able to initialize, it dies with an EEE before the main window is displayed. On my system it's usually somewhere in the constructor for the FileMenu class, but there's some random variance as to exactly where it dies.
It usually won't even show a callstack in the debugger, and when it does it will not load symbols for the relevant .NET framework/runtime DLLs:
Regression?
Yes. This works fine if you change ./paintdotnet/paintdotnet.csproj to target .NET 7. Just remove the <TargetFramework> element at the top and do a full rebuild (sometimes need to do that twice after changing TargetFramework, as the first time will error due to what I think is a VS or msbuild bug). It will then revert to .NET 7 based on the values in ./TargetFramework.props
Known Workarounds
No response
Configuration
Visual Studio 2022 17.8.0 Preview 7.0
x64
.NET 8 RC2
Happens on both my desktop:
Ryzen 9 7950X
Windows 11 23H2 build 22635.2700
and on my Lenovo P16 laptop:
Intel Core i7-12800HX
Windows 11 23H2 build 22631.2506
Other information
cc @EgorBo who was commenting on this a little over on Discord, and who already has access to the dotnet/paint.net repository. Also @MichalPetryka who recommended me to file the issue