-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Add heuristic to trigger GC to finalize dead threads and clean up han… #10413
Conversation
|
I investigated whether it would be possible to delete handles and some unmanaged memory associated with a thread upon the thread exiting. This does not seem to be feasible due to races that would exist without the use of a lock to read thread state, which would be necessary on almost every Thread API call. The current mechanism eliminates the possibility of the native Thread object and its handles from being deleted concurrently with a Thread API call by associating their lifetime with the managed Thread object's finalizer. So, going back to the original solution of adding a heuristic to trigger GC based on the number of dead threads accumulated since the last finalizing GC and time elapsed since the last finalizing GC. |
src/gc/gcinterface.h
Outdated
| virtual HRESULT GarbageCollect(int generation = -1, BOOL low_memory_p = FALSE, int mode = collection_blocking) = 0; | ||
|
|
||
| // Gets the smallest GC generation that runs finalizers. | ||
| virtual unsigned GetMinFinalizingGeneration() = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not understand what this means or why you need it. The finalizers run even after gen0 GC:
using System;
using System.Runtime.CompilerServices;
class Program
{
~Program()
{
Console.WriteLine("Here");
}
[MethodImpl(MethodImplOptions.NoInlining)]
static void foo()
{
GC.KeepAlive(new Program());
}
static void Main(string[] args)
{
GC.Collect();
foo();
GC.Collect(0);
for (;;);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm somehow I was under the other impression, looks like gen 0 is fine. I'll fix to specify 1 or max based on the other discussion. Doing only gen 0 GC probably would not cover enough cases.
src/vm/threads.cpp
Outdated
| m_TriggerFinalizingGC = false; | ||
|
|
||
| IGCHeap *gcHeap = GCHeapUtilities::GetGCHeap(); | ||
| gcHeap->GarbageCollect(gcHeap->GetMinFinalizingGeneration(), FALSE, collection_non_blocking); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the waking threads have done just enough work to get them promoted to Gen2, this won't cleanup anything. Should this rather be full GC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had discussed this with @Maoni0 before, the concern was that the mostly idle apps that are running into this issue may have a large GC heap and this would cause those apps to do a full GC every period. Also, in all of the cases where I saw this issue manifest, the scenario was such that neither gen 1 nor gen 2 GCs were happening frequently enough. So doing a gen 1 GC should cover most of the cases that are running into this issue. On the other hand, as you said this would not cover thread objects that are promoted to gen 2. Thoughts on the tradeoff? Maybe it could trigger a gen 2 GC between a number of gen 1 GCs. Or if there is a way to see which generation an object is in, it could walk the dead threads' exposed objects and trigger a GC for the highest generation containing a dead thread (but that would include reachable dead threads too).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or if there is a way to see which generation an object is in,
WhichGeneration
that would include reachable dead threads too
If you wanted to handle these better, you can have a flag on the thread that you will set once you have considered it once.
I would think that this needs to do Gen2 GCs at least sometimes if we do not ever want to hear about it again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought you wanted this for the idling scenario where nothing is happening aside from the timer creating threads? If there's work going on, you'd have gen1 GCs (or sometimes gen2 GCs) happening anyway so you don't need to worry about this. So if you are triggering GCs, I'd think a gen1 GC would suffice 'cause all the previously created threads would be dead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The work that's happening is not sufficient to automatically trigger a gen 2 GC (and usually not enough to trigger a gen 1 GC either), but in cases where a gen 1 GC is triggered automatically it's possible that threads that are still referenced get promoted to gen 2, but a gen 2 GC is not triggered frequently enough to clean up those threads, which can continue to increase the handle count as before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I think this is different from the scenario I understood when we talked. The scenario I understood was the threads that you created before were all dead (except the current one) - nothing holding onto them - if something is holding onto them it means they are useful in some way and if that's the case, some work is in progress and GCs will probably be triggered anyway. But it sounds like this is not the case because you worry about a little bit of work is going on that's enough to trigger gen1's. I am hesitant to have it trigger gen2 GCs because then we might be back to square one where it's triggering too many gen2 GCs so we have to be careful about triggering gen2's. You could call WhichGeneration and see if the percentage of gen2 objects is high enough, then trigger a gen2 but I would experiment with this carefully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as the observed scenarios are concerned, a gen 1 GC would suffice. Slightly different scenarios may trigger gen 1 GCs as a result of medium-level work, while referencing created threads for a short enough amount of time that they are still garbage but are alive long enough to get promoted to gen 2. I think it's beneficial to not leave any room for this being a continuing problem in the future. I plan on using WhichGeneration to trigger precisely the generation of GC that would benefit for this purpose, so this shouldn't cause excessive GCs in any scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okie that sounds good
src/vm/threads.h
Outdated
|
|
||
| private: | ||
| // For suspends: | ||
| #ifndef FEATURE_CORECLR // CoreCLR does not support user-requested thread suspension |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can delete this. No need to add FEATURE_CORECLR ifdefs. All FEATURE_CORECLR ifdefs in VM were deleted some time ago.
src/gc/gcinterface.h
Outdated
| SVAL_DECL(uint32_t, gcHeapType); | ||
| #endif | ||
|
|
||
| static const uint32_t minFinalizingGeneration; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally there won't be any static fields on IGCHeap - the two existing ones (gcHeapType and maxGeneration) will be getting removed soon, they were there because of DAC trickiness. Can we avoid putting this on the interface itself? (Provided it's necessary, per Jan's comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove this and will just specify a number relative to 0 or max
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just use (max_generation - 1).
|
Made all changes from above |
|
@dotnet-bot test this please |
…dles and memory A thread that is started allocates some handles in Thread::AllocHandles(). After it terminates, the managed thread object needs to be collected by a GC for the handles to be released. Some applications that are mostly idle but have a timer ticking periodically will cause a thread pool thread to be created on each tick, since the thread pool preserves threads for a maximum of 20 seconds currently. Over time the number of handles accumulates to a high value. Thread creation adds some memory pressure, but that does not force a GC until a sufficiently large handle count, and for some mostly idle apps, that can be very long. The same issue arises with directly starting threads as well. Fixes #6602: - Track a dead thread count separately from the current dead thread count. This is the count that will contribute to triggering a GC, once it reaches a threshold. The count is tracked separately so that it may be reset to zero when a max-generation GC occurs, preventing dead threads that survive a GC due to references from contributing to triggering a GC again in this fashion. - If the count exceeds a threshold, enumerate dead threads to see which GC generation the corresponding managed thread objects are in. If the duration of time since the last GC of the desired generation also exceeds a threshold, trigger a preferably non-blocking GC on the finalizer thread. - Removed a couple of handles and some code paths specific to user-requested thread suspension, which is not supported on CoreCLR
|
Updated commit description |
| SIZE_T newDeadThreadGenerationCounts[3] = {0}; | ||
| int countedMaxGeneration = _countof(newDeadThreadGenerationCounts) - 1; | ||
| { | ||
| GCX_COOP(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks dangerous to be switching to cooperating mode here after the thread is marked as dead. Can we just do the simple checks of the time and dead thread count thresholds here, kick the finalizer thread if they are hit, and do the more fine grained checks on the finalizer thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Also fixed to clear the count even if a GC is not triggered after scanning threads due to not enough time having elapsed since the last GC at the same generation, so that it would not scan threads every time a thread becomes dead after the count threshold is reached and before the time threshold is reached.
|
It may be nice to add a test that sets DeadThreadGCTriggerPeriodMilliseconds to very low number and then creates a ton of threads - to verify that there is nothing completely broken. |
src/vm/threads.cpp
Outdated
| IGCHeap *gcHeap = GCHeapUtilities::GetGCHeap(); | ||
| _ASSERTE(gcHeap != nullptr); | ||
| SIZE_T generationCountThreshold = static_cast<SIZE_T>(s_DeadThreadCountThresholdForGCTrigger) / 2; | ||
| SIZE_T newDeadThreadGenerationCounts[3] = {0}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3 [](start = 41, length = 1)
this should really be max_generation + 1
src/vm/threads.cpp
Outdated
| continue; | ||
| } | ||
|
|
||
| int exposedObjectGeneration = min(countedMaxGeneration, static_cast<int>(gcHeap->WhichGeneration(exposedObject))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
min [](start = 42, length = 3)
not sure why you needed to do this min - WhichGeneration is not going to return anything that's > max_generation
src/vm/threads.cpp
Outdated
| } | ||
|
|
||
| // Make sure that enough time has elapsed since the last GC of the desired generation. Otherwise, the memory pressure | ||
| // would be sufficient to trigger GCs automatically. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what to make of this comment - if not enough time has elapsed, we wouldn't want to trigger a GC anyway just because we don't want to trigger GCs too often.
| return; | ||
| } | ||
|
|
||
| // For threads whose exposed objects are in the generation of GC that will be triggered or in a lower GC generation, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
threads whose exposed objects are in the generation of GC that will be triggered or in a lower GC generation, [](start = 14, length = 110)
When are dead threads removed from the thread list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're removed in the Thread destructor, which is typically called from the finalizer thread (CallFinalizerOnThreadObject, ..., Thread::DoExtraWorkForFinalizer => Thread::CleanupDetachedThreads => Thread::DecExternalCount)
|
Made changes from above |
…dles and memory
A thread that is started allocates some handles in Thread::AllocHandles(). After it terminates, the managed thread object needs to be collected by a GC for the handles to be released. Some applications that are mostly idle but have a timer ticking periodically will cause a thread pool thread to be created on each tick, since the thread pool preserves threads for a maximum of 20 seconds currently. Over time the number of handles accumulates to a high value. Thread creation adds some memory pressure, but that does not force a GC until a sufficiently large handle count, and for some mostly idle apps, that can be very long. The same issue arises with directly starting threads as well.
Fixes #6602: