Hang in GC in OOM case  #6164

@janvorli

While hunting the strange memory unmapping issue over the last few weeks, one of the instrumentations I added locally was a forced failure of the n-th memory allocation in PAL. Today I ran it in a loop, iterating n from 1 up to some high number, to see whether I could hit any problems. I ran it with my single-threaded test app, which allocates GC memory like mad until it hits OOM and catches the OOM exception.
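
For readers who want to reproduce this kind of testing, here is a minimal sketch of "fail the n-th allocation" fault injection. It is not the PAL instrumentation itself, which is local and not shown in this issue; `FailNthAllocator` and `first_failure` are hypothetical names, and `malloc` stands in for the PAL allocation routine.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical fault injector: fails exactly the n-th allocation request
// (counting from 1). A value of 0 disables the injection.
struct FailNthAllocator {
    std::size_t fail_at;    // 1-based index of the allocation to fail
    std::size_t count = 0;  // allocations requested so far

    explicit FailNthAllocator(std::size_t n) : fail_at(n) {}

    void* allocate(std::size_t size) {
        ++count;
        if (count == fail_at) {
            return nullptr;  // simulated out-of-memory
        }
        return std::malloc(size);
    }
};

// Runs a toy workload of `allocs` allocations with the n-th one forced to
// fail. Returns the 1-based index of the first failed allocation, or 0 if
// every allocation succeeded.
std::size_t first_failure(std::size_t n, std::size_t allocs) {
    FailNthAllocator allocator(n);
    for (std::size_t i = 1; i <= allocs; ++i) {
        void* p = allocator.allocate(16);
        if (p == nullptr) {
            return i;
        }
        std::free(p);
    }
    return 0;
}
```

Iterating `n` from 1 upward and running the workload once per value visits each allocation site in turn, which is how a failure path that is rarely exercised can be hit deterministically.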

I found a case where the app ends up hanging in gc_heap::wait_for_gc_done forever. I have looked into it and I think I understand what’s going on. The root of the issue is that SetupUnstartedThread uses the throwing operator new to allocate a new thread. When that allocation fails and throws, the stack looks as shown below. The exception unwinds all the frames until it is caught in JIT_NewArr1 (frame #18 below), and gc_heap::gc_started stays set. We then try to allocate memory for the Throwable for the exception, so we go to the GC heap, and since there is not enough space, we end up calling gc_heap::try_allocate_more_space, which calls gc_heap::wait_for_gc_done. And that’s the end of it: gc_started is still set, so we wait forever.
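
The mechanism can be illustrated with a small standalone sketch. This is not the runtime's code: the global `gc_started`, `setup_unstarted_thread`, the two `start_gc_*` functions, `GcStartedGuard`, and the bounded wait are all hypothetical stand-ins. The RAII guard is one possible way to keep the flag from leaking across an exception, not necessarily the fix that belongs in the GC.

```cpp
#include <atomic>
#include <new>

// Hypothetical stand-in for gc_heap::gc_started.
static std::atomic<bool> gc_started{false};

// Hypothetical stand-in for the throwing allocation in SetupUnstartedThread.
static void setup_unstarted_thread(bool fail) {
    if (fail) {
        throw std::bad_alloc();
    }
}

// Leaky variant: if the allocation throws, the reset is skipped and the
// flag stays set while the exception unwinds to a distant catch.
void start_gc_leaky(bool fail) {
    gc_started = true;
    setup_unstarted_thread(fail);
    gc_started = false;
}

// RAII guard (an assumption, not the actual fix): the destructor clears
// the flag on both normal return and exception unwinding.
struct GcStartedGuard {
    GcStartedGuard() { gc_started = true; }
    ~GcStartedGuard() { gc_started = false; }
};

void start_gc_guarded(bool fail) {
    GcStartedGuard guard;
    setup_unstarted_thread(fail);
}

// Bounded stand-in for wait_for_gc_done: returns true if the flag is clear
// within max_spins checks. The wait described in this issue has no bound,
// so a false result here corresponds to the hang.
bool wait_for_gc_done_bounded(int max_spins) {
    for (int i = 0; i < max_spins; ++i) {
        if (!gc_started.load()) {
            return true;
        }
    }
    return false;
}

// Simulates an OOM during GC start, catches it far from where the flag was
// set, and reports whether the flag was left set.
bool flag_stuck_after_failure(void (*start)(bool)) {
    gc_started = false;
    try {
        start(true);
    } catch (const std::bad_alloc&) {
        // Caught at a distance, as in JIT_NewArr1; nothing here resets the flag.
    }
    return gc_started.load();
}
```

With the leaky variant, any later allocation path that waits on the flag never makes progress, because nothing is left running that would clear it.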
