Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix DAC issue with redefined standard new / delete operators #55945

Merged

Conversation

janvorli
Copy link
Member

@janvorli janvorli commented Jul 19, 2021

I have found that since .NET 6 preview 5, SOS on macOS arm64 crashes
when running basic commands like clrstack due to the fact that
its std::string allocations are done using standard new operator,
but the freeing at some places is done using our overridden delete
operator due to inlining of STL code. These new std::string usages
were added in preview 5.

This change removes our redefinitions of those operators for DAC
compilation, so this clash cannot happen.

This happens only with checked / debug builds of the runtime,
in release build, the issue doesn't fire because our implementation
of the delete operator becomes basically just a call to free, which
accidentally matches what the global delete does or at least doesn't
make it crash.

I have found that since .NET 6 preview 5, SOS on macOS arm64 crashes
when running basic commands like clrstack due to the fact that
its std::string allocations are done using standard new operator,
but the freeing at some places is done using our overriden delete
operator due to inlining of STL code.

This change removes our redefinitions of those operators for DAC
compilation, so this clash cannot happen.
@ghost
Copy link

ghost commented Jul 19, 2021

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Issue Details

I have found that since .NET 6 preview 5, SOS on macOS arm64 crashes
when running basic commands like clrstack due to the fact that
its std::string allocations are done using standard new operator,
but the freeing at some places is done using our overridden delete
operator due to inlining of STL code. These new std::string usages
were added in preview 5.

This change removes our redefinitions of those operators for DAC
compilation, so this clash cannot happen.

Author: janvorli
Assignees: janvorli
Labels:

area-Diagnostics-coreclr

Milestone: -

Copy link
Member

@mikem8361 mikem8361 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jkotas
Copy link
Member

jkotas commented Jul 19, 2021

What is the change that added STL use?

We have been avoiding STL for reasons, this is one of them.

Note that this change means that any places that catch OutOfMemory exceptions in the DAC are no longer be able to catch them. Hard to tell whether it will show up as a problem.

@janvorli
Copy link
Member Author

The offending std::string usage was added in #51425. It fails in MachOModule::TryLookupSymbol. But there already was some previous usage of STL.

@mikem8361
Copy link
Member

I'm not exactly sure why this started happening in the 6.0 preview 5 build because both the ELF and MachO readers in dbgutil use and have used std::string and std::vector even back in 5.0. These readers are used by createdump mostly except this one symbol lookup that the DAC needs to do to find the global dac table symbol.

@janvorli
Copy link
Member Author

Note that this change means that any places that catch OutOfMemory exceptions in the DAC are no longer be able to catch them.

I guess that if that turns out to be a problem, we could actually modify the code under the related ifdefs so that we would still implement the new operators ourselves, but just as thin wrappers around the actual standard new with post check for NULL results.

@mikem8361
Copy link
Member

Is this ready to be merged? Is everybody ok with this fix?

@janvorli
Copy link
Member Author

I was wondering if anyone else from the diagnostics team would want to chime in. If not, we can merge it in.

@mikem8361
Copy link
Member

@dotnet/dotnet-diag

@noahfalk
Copy link
Member

I'm fine to merge it. In terms of the OOM issue are we back to the behavior we had before preview 5 or is something worse?

@janvorli
Copy link
Member Author

@noahfalk we were always able to catch OOM exception in the DAC until this change. This change is fixing a problem that was introduced in preview 5 and that caused mixed usage of our overrides of new / delete operators and the standard new / delete operators. This mixing is something we cannot reasonably prevent as is it standard c++ library implementation specific detail. We were just lucky that it didn't happen before. So this fix removes all the implementations of global new / delete operators for DAC to prevent that problem.
The concern was - is OOM in DAC something that happens in real world scenarios and that we can reasonably handle if it happens?

@mikem8361
Copy link
Member

I really OOM in DAC is something that a debugger could recover from anyway. I think we should merge this PR.

@noahfalk
Copy link
Member

is OOM in DAC something that happens in real world scenarios and that we can reasonably handle if it happens?

To the best of my knowledge it is quite rare. I'd still happily merge the PR. In the future if we get new reports of OOM we'll have a good lead where it might be coming from and we can take additional steps to address it.

@mikem8361 mikem8361 merged commit a81fee2 into dotnet:main Jul 22, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Aug 21, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants