Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GC time to the .Net Agent #922

Closed
jpenniman opened this issue Aug 3, 2020 · 3 comments
Closed

Add GC time to the .Net Agent #922

jpenniman opened this issue Aug 3, 2020 · 3 comments
Labels
enhancement New feature or request

Comments

@jpenniman
Copy link
Contributor

The .Net Agent already records the number collections and the heap sizes, however it is missing the pause time.

Recording GC pause times is critical for identifying memory leaks. When troubleshooting high memory utilization, we want to answer: "Is CPU high too?" and "How much of that CPU time is being used by the garbage collector?" If the garbage collector is running a lot, but collecting very little, it's an indication that there is likely a memory leak. Long GC pause times can also cause application performance issues even without a memory leak. Understanding pause times is key to understanding if application slowness is possibly do to long pause times or too frequent pauses.

The Java agent records: jvm.gc.time

I'm proposing a .net equivalent: clr.gc.time

For full framework:

_traceEventSession.Source.AddCallbackOnProcessStart(process =>
{
	process.AddCallbackOnDotNetRuntimeLoad(runtime =>
	{
		runtime.GCEnd += (traceProcess, gc) =>
		{
			if (traceProcess.ProcessID == Process.GetCurrentProcess().Id)
				_gcPauseTime = gc.PauseDurationMSec;
		};
	});
});

.Net Core doens't write pause time data o the event, but it can be calculated by capturing the instant of GCStart and doing the math in GCEnd. It's not perfect, since not all collections stop the world. Labeling the metric with the generation being collected and GC settings can help the engineer determine if the gc.time is a pause or not.

sealed class GcEventListener : EventListener
{
    long _gcStartTime;

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        if (eventData.EventId == GcStartEvent)
        {
            _gcStartTime = DateTime.UtcNow.Ticks;
        }
        else if (eventData.EventId == GcEndEvent)
        {
            var gcStopTime = DateTime.UtcNow.Ticks;
            var pauseTime = (double)(gcStopTime - _gcStartTime) / 10_000.0
        }
    }
}

Our application currently uses similar code to get the GC Pause time. We've tested code in net461, net472, and netcoreapp3.1 on both Windows and Linux, and it successfully reports GC pause times in all targets. Ideally, we'd like to get this functionality into the Agent, so it's something we don't have to maintain separately.

I'm available to add this feature one we can agree on implementation approach.

@jpenniman jpenniman added the enhancement New feature or request label Aug 3, 2020
@gregkalapos
Copy link
Contributor

Hi @jpenniman

sounds reasonable. The set of metrics the agent currently collects is definitely not final, we plan to extend it and iterate on it. I'd be happy to help and review this if you open a PR.

The name clr.gc.time (in sync with the Java Agent) as you suggest sounds good to me.

And also the 2 proposed solutions are ok. You probably already found this file where you can implement it.

Before you open a PR please quickly go through our CONTRIBUTING.md.

@jpenniman
Copy link
Contributor Author

Ok great. I'll go through the doc and sign the agreement.

jpenniman added a commit to jpenniman/apm-agent-dotnet that referenced this issue Aug 6, 2020
@jpenniman jpenniman changed the title Add GC pause time to the .Net Agent Add GC time to the .Net Agent Aug 6, 2020
gregkalapos pushed a commit to jpenniman/apm-agent-dotnet that referenced this issue Mar 2, 2021
gregkalapos added a commit that referenced this issue Mar 2, 2021
* Implements #922: Add GC time to the .Net Agent

* Represent GC Time as a sum for the sample period.

* GC time: add test to assert on metric with `clr.gc.time`

* Update GcMetricsProvider.cs

GC time calculation on Full Framework: calculate GC time based on times span between Start and Stop

Co-authored-by: Greg Kalapos <gergo@kalapos.net>
@russcam
Copy link
Contributor

russcam commented Nov 9, 2021

This was implemented in #925.

@russcam russcam closed this as completed Nov 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants