Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@kouvel
Copy link

@kouvel kouvel commented Mar 23, 2017

…dles and memory

A thread that is started allocates some handles in Thread::AllocHandles(). After it terminates, the managed thread object needs to be collected by a GC for the handles to be released. Some applications that are mostly idle but have a timer ticking periodically will cause a thread pool thread to be created on each tick, since the thread pool preserves threads for a maximum of 20 seconds currently. Over time the number of handles accumulates to a high value. Thread creation adds some memory pressure, but that does not force a GC until a sufficiently large handle count, and for some mostly idle apps, that can be very long. The same issue arises with directly starting threads as well.

Fixes #6602:

  • Track a dead thread count separately from the current dead thread count. This is the count that will contribute to triggering a GC, once it reaches a threshold. The count is tracked separately so that it may be reset to zero when a max-generation GC occurs, preventing dead threads that survive a GC due to references from contributing to triggering a GC again in this fashion.
  • If the count exceeds a threshold, enumerate dead threads to see which GC generation the corresponding managed thread objects are in. If the duration of time since the last GC of the desired generation also exceeds a threshold, trigger a preferably non-blocking GC on the finalizer thread.
  • Removed a couple of handles and some code paths specific to user-requested thread suspension, which is not supported on CoreCLR

@kouvel
Copy link
Author

kouvel commented Mar 23, 2017

I investigated whether it would be possible to delete handles and some unmanaged memory associated with a thread upon the thread exiting. This does not seem to be feasible due to races that would exist without the use of a lock to read thread state, which would be necessary on almost every Thread API call. The current mechanism eliminates the possibility of the native Thread object and its handles from being deleted concurrently with a Thread API call by associating their lifetime with the managed Thread object's finalizer.

So, going back to the original solution of adding a heuristic to trigger GC based on the number of dead threads accumulated since the last finalizing GC and time elapsed since the last finalizing GC.

@jkotas @Maoni0

virtual HRESULT GarbageCollect(int generation = -1, BOOL low_memory_p = FALSE, int mode = collection_blocking) = 0;

// Gets the smallest GC generation that runs finalizers.
virtual unsigned GetMinFinalizingGeneration() = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand what this means or why you need it. The finalizers run even after gen0 GC:

using System;
using System.Runtime.CompilerServices;

class Program
{
    ~Program()
    {
        Console.WriteLine("Here");
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void foo()
    {
        GC.KeepAlive(new Program());
    }

    static void Main(string[] args)
    {
        GC.Collect();
        foo();
        GC.Collect(0);
        for (;;);
    }
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm somehow I was under the other impression, looks like gen 0 is fine. I'll fix to specify 1 or max based on the other discussion. Doing only gen 0 GC probably would not cover enough cases.

m_TriggerFinalizingGC = false;

IGCHeap *gcHeap = GCHeapUtilities::GetGCHeap();
gcHeap->GarbageCollect(gcHeap->GetMinFinalizingGeneration(), FALSE, collection_non_blocking);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the waking threads have done just enough work to get them promoted to Gen2, this won't cleanup anything. Should this rather be full GC?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had discussed this with @Maoni0 before, the concern was that the mostly idle apps that are running into this issue may have a large GC heap and this would cause those apps to do a full GC every period. Also, in all of the cases where I saw this issue manifest, the scenario was such that neither gen 1 nor gen 2 GCs were happening frequently enough. So doing a gen 1 GC should cover most of the cases that are running into this issue. On the other hand, as you said this would not cover thread objects that are promoted to gen 2. Thoughts on the tradeoff? Maybe it could trigger a gen 2 GC between a number of gen 1 GCs. Or if there is a way to see which generation an object is in, it could walk the dead threads' exposed objects and trigger a GC for the highest generation containing a dead thread (but that would include reachable dead threads too).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or if there is a way to see which generation an object is in,

WhichGeneration

that would include reachable dead threads too

If you wanted to handle these better, you can have a flag on the thread that you will set once you have considered it once.

I would think that this needs to do Gen2 GCs at least sometimes if we do not ever want to hear about it again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you wanted this for the idling scenario where nothing is happening aside from the timer creating threads? If there's work going on, you'd have gen1 GCs (or sometimes gen2 GCs) happening anyway so you don't need to worry about this. So if you are triggering GCs, I'd think a gen1 GC would suffice 'cause all the previously created threads would be dead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The work that's happening is not sufficient to automatically trigger a gen 2 GC (and usually not enough to trigger a gen 1 GC either), but in cases where a gen 1 GC is triggered automatically it's possible that threads that are still referenced get promoted to gen 2, but a gen 2 GC is not triggered frequently enough to clean up those threads, which can continue to increase the handle count as before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think this is different from the scenario I understood when we talked. The scenario I understood was the threads that you created before were all dead (except the current one) - nothing holding onto them - if something is holding onto them it means they are useful in some way and if that's the case, some work is in progress and GCs will probably be triggered anyway. But it sounds like this is not the case because you worry about a little bit of work is going on that's enough to trigger gen1's. I am hesitant to have it trigger gen2 GCs because then we might be back to square one where it's triggering too many gen2 GCs so we have to be careful about triggering gen2's. You could call WhichGeneration and see if the percentage of gen2 objects is high enough, then trigger a gen2 but I would experiment with this carefully.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as the observed scenarios are concerned, a gen 1 GC would suffice. Slightly different scenarios may trigger gen 1 GCs as a result of medium-level work, while referencing created threads for a short enough amount of time that they are still garbage but are alive long enough to get promoted to gen 2. I think it's beneficial to not leave any room for this being a continuing problem in the future. I plan on using WhichGeneration to trigger precisely the generation of GC that would benefit for this purpose, so this shouldn't cause excessive GCs in any scenario.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okie that sounds good

src/vm/threads.h Outdated

private:
// For suspends:
#ifndef FEATURE_CORECLR // CoreCLR does not support user-requested thread suspension
Copy link
Member

@jkotas jkotas Mar 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can delete this. No need to add FEATURE_CORECLR ifdefs. All FEATURE_CORECLR ifdefs in VM were deleted some time ago.

SVAL_DECL(uint32_t, gcHeapType);
#endif

static const uint32_t minFinalizingGeneration;
Copy link

@swgillespie swgillespie Mar 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally there won't be any static fields on IGCHeap - the two existing ones (gcHeapType and maxGeneration) will be getting removed soon, they were there because of DAC trickiness. Can we avoid putting this on the interface itself? (Provided it's necessary, per Jan's comment)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove this and will just specify a number relative to 0 or max

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can just use (max_generation - 1).

@kouvel
Copy link
Author

kouvel commented Mar 24, 2017

Made all changes from above

@kouvel
Copy link
Author

kouvel commented Mar 24, 2017

@dotnet-bot test this please

Koundinya Veluri added 3 commits March 24, 2017 05:04
…dles and memory

A thread that is started allocates some handles in Thread::AllocHandles(). After it terminates, the managed thread object needs to be collected by a GC for the handles to be released. Some applications that are mostly idle but have a timer ticking periodically will cause a thread pool thread to be created on each tick, since the thread pool preserves threads for a maximum of 20 seconds currently. Over time the number of handles accumulates to a high value. Thread creation adds some memory pressure, but that does not force a GC until a sufficiently large handle count, and for some mostly idle apps, that can be very long. The same issue arises with directly starting threads as well.

Fixes #6602:
- Track a dead thread count separately from the current dead thread count. This is the count that will contribute to triggering a GC, once it reaches a threshold. The count is tracked separately so that it may be reset to zero when a max-generation GC occurs, preventing dead threads that survive a GC due to references from contributing to triggering a GC again in this fashion.
- If the count exceeds a threshold, enumerate dead threads to see which GC generation the corresponding managed thread objects are in. If the duration of time since the last GC of the desired generation also exceeds a threshold, trigger a preferably non-blocking GC on the finalizer thread.
- Removed a couple of handles and some code paths specific to user-requested thread suspension, which is not supported on CoreCLR
@kouvel
Copy link
Author

kouvel commented Mar 24, 2017

Updated commit description

SIZE_T newDeadThreadGenerationCounts[3] = {0};
int countedMaxGeneration = _countof(newDeadThreadGenerationCounts) - 1;
{
GCX_COOP();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks dangerous to be switching to cooperating mode here after the thread is marked as dead. Can we just do the simple checks of the time and dead thread count thresholds here, kick the finalizer thread if they are hit, and do the more fine grained checks on the finalizer thread?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also fixed to clear the count even if a GC is not triggered after scanning threads due to not enough time having elapsed since the last GC at the same generation, so that it would not scan threads every time a thread becomes dead after the count threshold is reached and before the time threshold is reached.

@jkotas
Copy link
Member

jkotas commented Mar 24, 2017

It may be nice to add a test that sets DeadThreadGCTriggerPeriodMilliseconds to very low number and then creates a ton of threads - to verify that there is nothing completely broken.

IGCHeap *gcHeap = GCHeapUtilities::GetGCHeap();
_ASSERTE(gcHeap != nullptr);
SIZE_T generationCountThreshold = static_cast<SIZE_T>(s_DeadThreadCountThresholdForGCTrigger) / 2;
SIZE_T newDeadThreadGenerationCounts[3] = {0};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 [](start = 41, length = 1)

this should really be max_generation + 1

continue;
}

int exposedObjectGeneration = min(countedMaxGeneration, static_cast<int>(gcHeap->WhichGeneration(exposedObject)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

min [](start = 42, length = 3)

not sure why you needed to do this min - WhichGeneration is not going to return anything that's > max_generation

}

// Make sure that enough time has elapsed since the last GC of the desired generation. Otherwise, the memory pressure
// would be sufficient to trigger GCs automatically.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure what to make of this comment - if not enough time has elapsed, we wouldn't want to trigger a GC anyway just because we don't want to trigger GCs too often.

return;
}

// For threads whose exposed objects are in the generation of GC that will be triggered or in a lower GC generation,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

threads whose exposed objects are in the generation of GC that will be triggered or in a lower GC generation, [](start = 14, length = 110)

When are dead threads removed from the thread list?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're removed in the Thread destructor, which is typically called from the finalizer thread (CallFinalizerOnThreadObject, ..., Thread::DoExtraWorkForFinalizer => Thread::CleanupDetachedThreads => Thread::DecExternalCount)

@kouvel
Copy link
Author

kouvel commented Mar 25, 2017

Made changes from above

@kouvel kouvel closed this Mar 25, 2017
@kouvel kouvel reopened this Mar 25, 2017
@kouvel kouvel merged commit 2401b6e into dotnet:master Mar 27, 2017
@kouvel kouvel deleted the DeadThreads branch March 27, 2017 18:10
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants