Expose GC allocation statistics #6324

Closed
ayende opened this issue Jul 14, 2016 · 15 comments
Assignees
Labels
area-GC-coreclr enhancement Product code improvement that does NOT require public API changes/additions

Comments

@ayende
Contributor

ayende commented Jul 14, 2016

Consider the following code:

 var str = "{'Items':[" + 
     string.Join(",", Enumerable.Range(0, 3000).Select(i => "{}"))
     + "]}";
 var jObject = JObject.Parse(str);

This generates a string that is ~10KB in size, but parsing it with JSON.Net allocates about 400KB.
This specific example is something that we ran into, where a particular code path made us allocate so much memory that we started swapping and the machine died.

The problem is that there is really no way for us to protect against that.

It would be nice if we had a way to limit allocations at particular times, so we can say "okay, I'm allowing this thread to allocate up to 100MB in this scope", or even just to get the amount of memory that a thread allocated between two points.

@ayende
Contributor Author

ayende commented Jul 14, 2016

This can be done right now via ETW, of course, but that isn't really something that you can do from within the same process and make decisions based on it.

@ayende
Contributor Author

ayende commented Jul 14, 2016

@omariom
Contributor

omariom commented Jul 14, 2016

@ayende Ha! I managed to comment in the future )

AppDomain.CurrentDomain.MonitoringTotalAllocatedMemorySize is not thread specific but it can be consumed from within process and AFAIK is a bit more precise than GC ETW events.
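
As a rough illustration of the property mentioned above, here is a minimal sketch that reads the AppDomain-wide counter before and after a piece of work. It assumes a runtime that exposes `AppDomain.MonitoringIsEnabled` and `MonitoringTotalAllocatedMemorySize` (the desktop .NET Framework does). The helper name `MeasureAppDomainAllocations` is hypothetical, and the result covers every thread in the AppDomain, not just the caller.

```csharp
using System;

static class AppDomainAllocationDemo
{
    // Hypothetical helper: bytes allocated by the whole AppDomain while `work` ran.
    // Allocations made by other threads during the call are included as well.
    public static long MeasureAppDomainAllocations(Action work)
    {
        // Monitoring is process-wide and cannot be switched off once enabled.
        AppDomain.MonitoringIsEnabled = true;

        long before = AppDomain.CurrentDomain.MonitoringTotalAllocatedMemorySize;
        work();
        long after = AppDomain.CurrentDomain.MonitoringTotalAllocatedMemorySize;
        return after - before;
    }

    static void Main()
    {
        long bytes = MeasureAppDomainAllocations(() =>
        {
            var buffer = new byte[1000000];
            GC.KeepAlive(buffer); // keep the allocation from being optimized away
        });
        Console.WriteLine("AppDomain allocated roughly " + bytes + " bytes");
    }
}
```

The number is approximate: the counter advances in allocation-context-sized steps, so small allocations may not show up immediately.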

@ayende
Contributor Author

ayende commented Jul 14, 2016

@omariom Yes, you have the private key for the post :-)

I didn't know about that property, or the monitoring options there in general. That is pretty cool.

That said, it doesn't appear to exist in CoreCLR.

And while this would be useful, since the data is already sent on a per-thread basis, getting it thread by thread would be much more useful.

@ayende
Contributor Author

ayende commented Jul 14, 2016

Another problem with this property is that it is only updated on full GC.
Since .NET uses a bump allocator, keeping track of the amount of memory each thread is allocating is going to be much cheaper.

@swgillespie
Contributor

cc @Maoni0

@Maoni0
Member

Maoni0 commented Jul 14, 2016

I am not aware of any ETW event that would give you data at this granularity, or, if one exists, with overhead that isn't horrendous.

But there is something that gives you exactly the number of bytes allocated between 2 points on a thread with completely trivial overhead since we already keep this data anyway - it's a matter of checking if it's exposed in some reasonable way. In clrmd terms you get this from IThreadData.AllocPtr/AllocLimit - this keeps track of the current allocation context. Unfortunately I don't see it exposing the alloc_bytes on the thread's alloc_context but that can be added (or you could just read it yourself off of the thread alloc context with the offset...this doesn't really change). Years ago I wrote a VS plug in that would show you the # of bytes of allocations done between 2 breakpoints (I didn't keep it up though so it most likely doesn't work with the current VS) and this is what I used. But I dunno if clrmd works with CoreCLR.

On desktop (unfortunately this is also not exposed in CoreCLR) we have this API that allows you to get to the alloc_bytes, which is updated on every new allocation context.

@leculver, is clrmd usable from CoreCLR? If not is there anything equivalent that you can use to get to the dac data? And can we expose alloc_bytes as part of thread data?

@Maoni0
Member

Maoni0 commented Jul 14, 2016

Also AppDomain.CurrentDomain.MonitoringTotalAllocatedMemorySize is not updated with each full GC - it's not updated during a GC at all, only at allocation time - it's updated each time we get a new alloc context (like the API I mentioned above) so this is accurate to +/- a few kbytes. MonitoringSurvivedMemorySize is only guaranteed to be accurate after a full collection (it's updated more often than that, it's just not really accurate). Of course, it's also not exposed in CoreCLR.

@ayende
Contributor Author

ayende commented Jul 15, 2016

@Maoni0 What I would love to have is the AllocPtr/AllocLimit values for the thread, yes.
The reason this is important: I want to be able to decide whether to proceed with the current operation or to abort / reduce memory consumption.

Exposing this to the CoreCLR seems like it shouldn't have a high cost to it, right?

Doing that with ClrMD would be great for debugging purposes, but it would be best if this was exposed to managed code directly. This also has the great benefit of making certain benchmarks very easy to write: just as we have Stopwatch, we could have AllocationWatch and see how much memory a particular operation consumes.

This can actually be pretty hard to figure out now, and requires profiling.
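
To make the Stopwatch analogy concrete, here is a minimal sketch of what such a type might look like. `AllocationWatch` is a hypothetical name taken from the comment above, not an existing class. The sketch assumes a runtime that provides `GC.GetAllocatedBytesForCurrentThread()`, which was not available when this thread was written.

```csharp
using System;

// Hypothetical helper modeled on Stopwatch; not part of any framework.
public struct AllocationWatch
{
    private long _start;

    // Captures the calling thread's allocation counter.
    public static AllocationWatch StartNew()
    {
        return new AllocationWatch { _start = GC.GetAllocatedBytesForCurrentThread() };
    }

    // Bytes allocated since StartNew. Only meaningful when read on the
    // same thread that created the watch.
    public long AllocatedBytes
    {
        get { return GC.GetAllocatedBytesForCurrentThread() - _start; }
    }
}

static class AllocationWatchDemo
{
    static void Main()
    {
        var watch = AllocationWatch.StartNew();
        var text = new string('x', 5000);
        GC.KeepAlive(text);
        Console.WriteLine("Allocated " + watch.AllocatedBytes + " bytes");
    }
}
```

The counter is per thread, so work handed off to other threads or to the thread pool would not be counted by this sketch.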

@Maoni0
Member

Maoni0 commented Jul 15, 2016

I agree that adding a managed API for this is a good thing to do.

Instead of giving you AllocPtr/AllocLimit, which only covers the current alloc context (and you would need to know about the alloc contexts that happened in between), it's better to just expose something like GetCurrentAllocated on the Thread class, like we do with ADs.

@ayende
Contributor Author

ayende commented Jul 15, 2016

@Maoni0 That would be the simplest solution, and by just remembering the previous value, we could compare and see how much we allocated.

Question, how expensive is this likely to be? Since this is already tracked, I'm assuming this is effectively just reading a value.
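
The "remember the previous value and compare" idea can be sketched as a cooperative allocation budget. This is only an illustration under assumptions: `AllocationBudget` is a hypothetical type, the 1 MB limit and 16 KB chunk size are arbitrary, and the code relies on `GC.GetAllocatedBytesForCurrentThread()`, which did not exist when this thread was written. Nothing here stops an allocation; the code only notices, at the points where it checks, that the budget has been exceeded.

```csharp
using System;

// Hypothetical helper: tracks how much the creating thread has allocated
// since construction and compares it against a caller-supplied limit.
public sealed class AllocationBudget
{
    private readonly long _start;
    private readonly long _limitBytes;

    public AllocationBudget(long limitBytes)
    {
        _limitBytes = limitBytes;
        _start = GC.GetAllocatedBytesForCurrentThread();
    }

    // Bytes allocated on this thread since the budget was created.
    public long Used
    {
        get { return GC.GetAllocatedBytesForCurrentThread() - _start; }
    }

    public bool IsExceeded
    {
        get { return Used > _limitBytes; }
    }
}

static class AllocationBudgetDemo
{
    static void Main()
    {
        var budget = new AllocationBudget(1024 * 1024); // arbitrary 1 MB limit
        int processed = 0;

        for (int i = 0; i < 1000; i++)
        {
            var chunk = new byte[16 * 1024]; // stand-in for real work
            GC.KeepAlive(chunk);
            processed++;

            // Cooperative check: abort or degrade once the budget is spent.
            if (budget.IsExceeded)
                break;
        }

        Console.WriteLine("Stopped after " + processed + " chunks, " + budget.Used + " bytes allocated");
    }
}
```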

@ayende
Contributor Author

ayende commented Jul 15, 2016

Also, any suggestions on where to put it?
I'm assuming that Thread / GC would be a bad idea.

System.Diagnostics.AllocationStatistics.GetThreadAllocatedBytes() ?

@terrajobst
Member

Following our API review process, I've filed the request here: https://github.com/dotnet/corefx/issues/10157.

@Maoni0
Member

Maoni0 commented Jul 19, 2016

@ayende, I don't see a problem with exposing this on Thread - it's an attribute of a Thread, so it seems logical to expose it there. Do you have a strong objection? The cost of this would be a call (since we have to call into the runtime from mscorlib) and some arithmetic (add/sub) on a couple of values.

@Maoni0
Member

Maoni0 commented Aug 3, 2016

Since we already opened 10157 for the API I'm closing this issue. Thank you for bringing this up, @ayende!

@Maoni0 Maoni0 closed this as completed Aug 3, 2016
@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@dotnet dotnet locked as resolved and limited conversation to collaborators Dec 30, 2020