New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide API to understand how many allocations have happened #10157

Closed
terrajobst opened this Issue Jul 19, 2016 · 41 comments

Comments

@terrajobst
Member

terrajobst commented Jul 19, 2016

As discussed in dotnet/coreclr#6275 we would like an API to expose how many bytes were allocated since the beginning of a thread's life time.

What about an API like this:

namespace System
{
    public static class GC
    {
        public long GetAllocatedBytesForCurrentThread();
    }
}

The usage could like this:

[Fact]
public void EnsureAllocations()
{
    var allocationsBefore = GC.GetAllocatedBytesForCurrentThread();

    var str = "{'Items':[" + 
        string.Join(",", Enumerable.Range(0, 3000).Select(i => "{}")) + "]}";
    var jObject = JObject.Parse(str);

    var allocationsAfter = Thread.GetAllocatedBytesForCurrentThread();
    var allocations = allocationsAfter - allocationsBefore;

    const long MaximumBytesAllowed = 200 * 1024;
    Assert.LowerOrEqual(allocations, MaximumBytesAllowed);
}

Questions:

  • Can/should this method track bytes per native OS thread or for the logical call context?
  • Does this track memory across async boundaries? Should it?

@Maoni0

Updates

  • Based on @vancem's concerns we've decided not to add GetAllocatedBytesForThread(int managedThreadId).
@Maoni0

This comment has been minimized.

Show comment
Hide comment
@Maoni0

Maoni0 Jul 19, 2016

Member

Thanks @terrajobst.

This tracks allocations on the GC heap for the managed thread and does not track memory when the execution moves to another thread.

For each managed thread, it maintains an allocation context - allocations from this thread will be managed by this context and the number will be reported based on certain values in this context.

Member

Maoni0 commented Jul 19, 2016

Thanks @terrajobst.

This tracks allocations on the GC heap for the managed thread and does not track memory when the execution moves to another thread.

For each managed thread, it maintains an allocation context - allocations from this thread will be managed by this context and the number will be reported based on certain values in this context.

@terrajobst

This comment has been minimized.

Show comment
Hide comment
@terrajobst

terrajobst Jul 19, 2016

Member

When you say "thread" I think you mean whatever the given runtime decides to use to implement the System.Thread type. Which could be fibers but are usually threads.

But what happens for tasks and async? Some information, such as synchronization context, is passed via the call context and thus flows across thread boundaries so that it doesn't matter if the given async operation completes on the same thread or a different thread. I assume allocation information wouldn't flow?

Member

terrajobst commented Jul 19, 2016

When you say "thread" I think you mean whatever the given runtime decides to use to implement the System.Thread type. Which could be fibers but are usually threads.

But what happens for tasks and async? Some information, such as synchronization context, is passed via the call context and thus flows across thread boundaries so that it doesn't matter if the given async operation completes on the same thread or a different thread. I assume allocation information wouldn't flow?

@Maoni0

This comment has been minimized.

Show comment
Hide comment
@Maoni0

Maoni0 Jul 19, 2016

Member

When you say "thread" I think you mean whatever the given runtime decides to use to implement the System.Thread type.

Yes.

But what happens for tasks and async? Some information, such as synchronization context, is passed via the call context. I assume this information wouldn't flow?

The allocated bytes is an attribute on a thread - it has no concept of tasks/async. If you want to know how many bytes a method that involves some async stuff allocates, you can inspect the synchronization context yourself to follow how the execution switches threads and get the allocated bytes for thread it switches to and add them together.

Member

Maoni0 commented Jul 19, 2016

When you say "thread" I think you mean whatever the given runtime decides to use to implement the System.Thread type.

Yes.

But what happens for tasks and async? Some information, such as synchronization context, is passed via the call context. I assume this information wouldn't flow?

The allocated bytes is an attribute on a thread - it has no concept of tasks/async. If you want to know how many bytes a method that involves some async stuff allocates, you can inspect the synchronization context yourself to follow how the execution switches threads and get the allocated bytes for thread it switches to and add them together.

@ayende

This comment has been minimized.

Show comment
Hide comment
@ayende

ayende Jul 19, 2016

Contributor

I suggest that the name would be: GetTotalAllocatedBytes(), instead of GetCurrentAllocatedBytes(), because it represent the sum total of memory allocated by this thread (as I understand it), not the current allocated (but not GCed) memory.

Contributor

ayende commented Jul 19, 2016

I suggest that the name would be: GetTotalAllocatedBytes(), instead of GetCurrentAllocatedBytes(), because it represent the sum total of memory allocated by this thread (as I understand it), not the current allocated (but not GCed) memory.

@Maoni0

This comment has been minimized.

Show comment
Hide comment
@Maoni0

Maoni0 Jul 19, 2016

Member

I don't have strong opinions on the name. I'll leave that up to @terrajobst.

This represents the allocated bytes, not survived bytes. Allocated is not taking GC into consideration while survived is. So if you do

ulong currentAlloc = Thread.NewGetAllocAPI();
MyClass o = new MyClass(); // this allocates 100 bytes.
// do stuff with o but no other allocations
ulong allocated = Thread.NewGetAllocAPI() - currentAlloc;

allocated will be 100 regardless of whether a GC happened inbetween and possibly cleaned up o.

Member

Maoni0 commented Jul 19, 2016

I don't have strong opinions on the name. I'll leave that up to @terrajobst.

This represents the allocated bytes, not survived bytes. Allocated is not taking GC into consideration while survived is. So if you do

ulong currentAlloc = Thread.NewGetAllocAPI();
MyClass o = new MyClass(); // this allocates 100 bytes.
// do stuff with o but no other allocations
ulong allocated = Thread.NewGetAllocAPI() - currentAlloc;

allocated will be 100 regardless of whether a GC happened inbetween and possibly cleaned up o.

@svick

This comment has been minimized.

Show comment
Hide comment
@svick

svick Jul 19, 2016

Contributor

Should this method be on Thread? There is already a somewhat similar method GetTotalMemory(bool) on the GC class. Maybe the two methods should stay together? In that case, the name could be something like GC.GetCurrentThreadTotalAllocatedBytes() (though that is a fairly long name).

Contributor

svick commented Jul 19, 2016

Should this method be on Thread? There is already a somewhat similar method GetTotalMemory(bool) on the GC class. Maybe the two methods should stay together? In that case, the name could be something like GC.GetCurrentThreadTotalAllocatedBytes() (though that is a fairly long name).

@ayende

This comment has been minimized.

Show comment
Hide comment
@ayende

ayende Jul 20, 2016

Contributor

What about an overload? GC.GetTotalMemory(Thread thread) ?

Contributor

ayende commented Jul 20, 2016

What about an overload? GC.GetTotalMemory(Thread thread) ?

@ayende

This comment has been minimized.

Show comment
Hide comment
@ayende

ayende Jul 20, 2016

Contributor

Being able to ask that on another thread is interesting and give us the chance to do some interesting metrics (what is the allocation rate per thread, so we can make decisions on that).

Contributor

ayende commented Jul 20, 2016

Being able to ask that on another thread is interesting and give us the chance to do some interesting metrics (what is the allocation rate per thread, so we can make decisions on that).

@mellinoe

This comment has been minimized.

Show comment
Hide comment
@mellinoe

mellinoe Jul 20, 2016

Contributor

I'm not sure we can / want to have System.GC depend on the higher-level System.Threading.Thread concept. UWP currently doesn't actually support Thread (which we should fix), but even still I'm not sure it's the best dependency to bring in.

Contributor

mellinoe commented Jul 20, 2016

I'm not sure we can / want to have System.GC depend on the higher-level System.Threading.Thread concept. UWP currently doesn't actually support Thread (which we should fix), but even still I'm not sure it's the best dependency to bring in.

@terrajobst

This comment has been minimized.

Show comment
Hide comment
@terrajobst

terrajobst Jul 21, 2016

Member

I agree with @mellinoe, having that on the GC type doesn't seem to be the right place -- especially because as @Maoni0 outlined that this API wouldn't take GCs into account. So I think Thread is more suitable.

However, it seems @ayende raised good points on naming and versatility, so what about:

namespace System.Threading
{
    partial class Thread
    {
        public ulong GetTotalAllocatedBytes();
    }
}
Member

terrajobst commented Jul 21, 2016

I agree with @mellinoe, having that on the GC type doesn't seem to be the right place -- especially because as @Maoni0 outlined that this API wouldn't take GCs into account. So I think Thread is more suitable.

However, it seems @ayende raised good points on naming and versatility, so what about:

namespace System.Threading
{
    partial class Thread
    {
        public ulong GetTotalAllocatedBytes();
    }
}
@Maoni0

This comment has been minimized.

Show comment
Hide comment
@Maoni0

Maoni0 Jul 21, 2016

Member

Unless you've stashed the Thread object somewhere it will be pretty expensive to get to the object via CurrentThread. And in many cases we do not expose the Thread object. So I would provide both a static method for getting the current thread's alloc bytes and an instance method for people who have the Thread object stashed somewhere.

Member

Maoni0 commented Jul 21, 2016

Unless you've stashed the Thread object somewhere it will be pretty expensive to get to the object via CurrentThread. And in many cases we do not expose the Thread object. So I would provide both a static method for getting the current thread's alloc bytes and an instance method for people who have the Thread object stashed somewhere.

@clrjunkie

This comment has been minimized.

Show comment
Hide comment
@clrjunkie

clrjunkie Jul 22, 2016

Any specific reason why this should be implemented as a Method instead of a Property?

Does this take in to account native allocations as well? If not, I think having “Total” in the name can be confusing as well.

I suggest keeping it short:

Thread.AllocatedBytes

Or

Thread.ManagedHeapAllocatedBytes (in case native memory is planned to be tracked separately in the future)

I think in this context the past tense clearly conveys the notion of something being accumulated.

Any specific reason why this should be implemented as a Method instead of a Property?

Does this take in to account native allocations as well? If not, I think having “Total” in the name can be confusing as well.

I suggest keeping it short:

Thread.AllocatedBytes

Or

Thread.ManagedHeapAllocatedBytes (in case native memory is planned to be tracked separately in the future)

I think in this context the past tense clearly conveys the notion of something being accumulated.

@Maoni0

This comment has been minimized.

Show comment
Hide comment
@Maoni0

Maoni0 Jul 25, 2016

Member

this does not include native allocations.

Member

Maoni0 commented Jul 25, 2016

this does not include native allocations.

@vancem

This comment has been minimized.

Show comment
Hide comment
@vancem

vancem Jul 25, 2016

Member

Some comments.

I don't believe that this API should be on System.Thread. Thread really is a very FUNDAMENTAL concept, so you don't want it's property-space polluted with things like this (there are plenty of other metrics that you might also want to track). Conceptually performance metrics are LESS fundamental (you can live without them (as we have been doing)).

I don't think it is bad to put this in the GC class because GC does not really mean GC so much as 'managed memory'. However you certainly could make the argument that since these APIs are not fundamental, they should go in their own class (e.g. GCHeapMetrics).

On the question of whether this tracks native allocations: NO. For several reasons

  1. It is not usually intersting
  2. It is hard to implement (there is no one choke point to instrument).
  3. It is confusing (How can you explain the number to people (how do they know it is correct?). With GC memory, the allocation events give a precise breakdown that should 'add up'.

Should it track over async: BOTH

Basically we need two APIs (or a flag on one API). As Manoi indicates the easiest implementation ignores Task hops, but that makes using this for interesting scenarios like allocations per service request (which almost certainly will use async), problematic. The async-aware version is built on top of the other by using AsyncLocals to 'flow' the lower level information (which does have a cost, so you don't want to do it if don't want it)

That fact that there is overhead actually argues for a different API. Instead of this being 'always on' we really should have a object that 'holds' the current value. Thus we should really have GCAllocationTracker, and if it is alive then tracking happens, and if none are alive then you can avoid the overhead of tracking. This generalizes nicely to other performance tracking (which may also have overhead). For example, in the future you might ask it to remember the types being allocoated, or their stack traces).

Anyway the API would look like

class GCAllocationTracker() {
GCAllocationTracker(bool includeSpawnedTasks);

 long AllocatedBytes { get } 

}

Member

vancem commented Jul 25, 2016

Some comments.

I don't believe that this API should be on System.Thread. Thread really is a very FUNDAMENTAL concept, so you don't want it's property-space polluted with things like this (there are plenty of other metrics that you might also want to track). Conceptually performance metrics are LESS fundamental (you can live without them (as we have been doing)).

I don't think it is bad to put this in the GC class because GC does not really mean GC so much as 'managed memory'. However you certainly could make the argument that since these APIs are not fundamental, they should go in their own class (e.g. GCHeapMetrics).

On the question of whether this tracks native allocations: NO. For several reasons

  1. It is not usually intersting
  2. It is hard to implement (there is no one choke point to instrument).
  3. It is confusing (How can you explain the number to people (how do they know it is correct?). With GC memory, the allocation events give a precise breakdown that should 'add up'.

Should it track over async: BOTH

Basically we need two APIs (or a flag on one API). As Manoi indicates the easiest implementation ignores Task hops, but that makes using this for interesting scenarios like allocations per service request (which almost certainly will use async), problematic. The async-aware version is built on top of the other by using AsyncLocals to 'flow' the lower level information (which does have a cost, so you don't want to do it if don't want it)

That fact that there is overhead actually argues for a different API. Instead of this being 'always on' we really should have a object that 'holds' the current value. Thus we should really have GCAllocationTracker, and if it is alive then tracking happens, and if none are alive then you can avoid the overhead of tracking. This generalizes nicely to other performance tracking (which may also have overhead). For example, in the future you might ask it to remember the types being allocoated, or their stack traces).

Anyway the API would look like

class GCAllocationTracker() {
GCAllocationTracker(bool includeSpawnedTasks);

 long AllocatedBytes { get } 

}

@ayende

This comment has been minimized.

Show comment
Hide comment
@ayende

ayende Jul 25, 2016

Contributor

Given just a GCHeapMetrics or MemoryMetrics (I agree with your
reasoning that this should be separate class) and a single thread specific
property, users can implement the async version externally.

I would rather focus on the stuff that require cooperation from the runtime
and isn't possible without it, then do the rest as a separate issue to do
the async stuff once you can just write the managed code only and send a PR

*Hibernating Rhinos Ltd *

Oren Eini* l CEO l *Mobile: + 972-52-548-6969

Office: +972-4-622-7811 *l *Fax: +972-153-4-622-7811

On Mon, Jul 25, 2016 at 10:23 PM, Vance Morrison notifications@github.com
wrote:

Some comments.

I don't believe that this API should be on System.Thread. Thread really is
a very FUNDAMENTAL concept, so you don't want it's property-space polluted
with things like this (there are plenty of other metrics that you might
also want to track). Conceptually performance metrics are LESS fundamental
(you can live without them (as we have been doing)).

I don't think it is bad to put this in the GC class because GC does not
really mean GC so much as 'managed memory'. However you certainly could
make the argument that since these APIs are not fundamental, they should go
in their own class (e.g. GCHeapMetrics).

On the question of whether this tracks native allocations: NO. For several
reasons

  1. It is not usually intersting
  2. It is hard to implement (there is no one choke point to instrument).
  3. It is confusing (How can you explain the number to people (how do they
    know it is correct?). With GC memory, the allocation events give a precise
    breakdown that should 'add up'.

Should it track over async: BOTH

Basically we need two APIs (or a flag on one API). As Manoi indicates the
easiest implementation ignores Task hops, but that makes using this for
interesting scenarios like allocations per service request (which almost
certainly will use async), problematic. The async-aware version is built on
top of the other by using AsyncLocals to 'flow' the lower level information
(which does have a cost, so you don't want to do it if don't want it)

That fact that there is overhead actually argues for a different API.
Instead of this being 'always on' we really should have a object that
'holds' the current value. Thus we should really have GCAllocationTracker,
and if it is alive then tracking happens, and if none are alive then you
can avoid the overhead of tracking. This generalizes nicely to other
performance tracking (which may also have overhead). For example, in the
future you might ask it to remember the types being allocoated, or their
stack traces).

Anyway the API would look like

class GCAllocationTracker() {
GCAllocationTracker(bool includeSpawnedTasks);

long AllocatedBytes { get }

}


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#10157 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAHIs3WlzLK1m9JbOj0cQHXbsE6MjUUQks5qZQ0vgaJpZM4JQMec
.

Contributor

ayende commented Jul 25, 2016

Given just a GCHeapMetrics or MemoryMetrics (I agree with your
reasoning that this should be separate class) and a single thread specific
property, users can implement the async version externally.

I would rather focus on the stuff that require cooperation from the runtime
and isn't possible without it, then do the rest as a separate issue to do
the async stuff once you can just write the managed code only and send a PR

*Hibernating Rhinos Ltd *

Oren Eini* l CEO l *Mobile: + 972-52-548-6969

Office: +972-4-622-7811 *l *Fax: +972-153-4-622-7811

On Mon, Jul 25, 2016 at 10:23 PM, Vance Morrison notifications@github.com
wrote:

Some comments.

I don't believe that this API should be on System.Thread. Thread really is
a very FUNDAMENTAL concept, so you don't want it's property-space polluted
with things like this (there are plenty of other metrics that you might
also want to track). Conceptually performance metrics are LESS fundamental
(you can live without them (as we have been doing)).

I don't think it is bad to put this in the GC class because GC does not
really mean GC so much as 'managed memory'. However you certainly could
make the argument that since these APIs are not fundamental, they should go
in their own class (e.g. GCHeapMetrics).

On the question of whether this tracks native allocations: NO. For several
reasons

  1. It is not usually intersting
  2. It is hard to implement (there is no one choke point to instrument).
  3. It is confusing (How can you explain the number to people (how do they
    know it is correct?). With GC memory, the allocation events give a precise
    breakdown that should 'add up'.

Should it track over async: BOTH

Basically we need two APIs (or a flag on one API). As Manoi indicates the
easiest implementation ignores Task hops, but that makes using this for
interesting scenarios like allocations per service request (which almost
certainly will use async), problematic. The async-aware version is built on
top of the other by using AsyncLocals to 'flow' the lower level information
(which does have a cost, so you don't want to do it if don't want it)

That fact that there is overhead actually argues for a different API.
Instead of this being 'always on' we really should have a object that
'holds' the current value. Thus we should really have GCAllocationTracker,
and if it is alive then tracking happens, and if none are alive then you
can avoid the overhead of tracking. This generalizes nicely to other
performance tracking (which may also have overhead). For example, in the
future you might ask it to remember the types being allocoated, or their
stack traces).

Anyway the API would look like

class GCAllocationTracker() {
GCAllocationTracker(bool includeSpawnedTasks);

long AllocatedBytes { get }

}


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#10157 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAHIs3WlzLK1m9JbOj0cQHXbsE6MjUUQks5qZQ0vgaJpZM4JQMec
.

@clrjunkie

This comment has been minimized.

Show comment
Hide comment
@clrjunkie

clrjunkie Jul 26, 2016

My understanding is that the api aims to answer the question of how much was allocated during an execution path, and since there is no class modeling the allocation part of the managed heap (e.g only ‘new’ keyword), having the word ‘Thread’ in the api is as close you can get in terms of expressiveness. Personally, I don’t consider having that property on the Thread class not sensible but if you insist on separating metrics I would suggest:

class ThreadMetrics() { ThreadMetrics(); long AllocatedBytes { get } }

Q: Can this be a struct?

I’m not sure what are the exact requirements for the ‘Task’ version, might be good idea to list those.

As for tracking native memory, I wouldn’t say it’s not usually interesting, since anyone working with bitmaps may want to know what’s behind the managed proxy object, but indeed this look like a secondary concern. I agree it would be hard to implement since there’s no one choke point to instrument, otherwise the property value would provide a complete picture.

My understanding is that the api aims to answer the question of how much was allocated during an execution path, and since there is no class modeling the allocation part of the managed heap (e.g only ‘new’ keyword), having the word ‘Thread’ in the api is as close you can get in terms of expressiveness. Personally, I don’t consider having that property on the Thread class not sensible but if you insist on separating metrics I would suggest:

class ThreadMetrics() { ThreadMetrics(); long AllocatedBytes { get } }

Q: Can this be a struct?

I’m not sure what are the exact requirements for the ‘Task’ version, might be good idea to list those.

As for tracking native memory, I wouldn’t say it’s not usually interesting, since anyone working with bitmaps may want to know what’s behind the managed proxy object, but indeed this look like a secondary concern. I agree it would be hard to implement since there’s no one choke point to instrument, otherwise the property value would provide a complete picture.

@clrjunkie

This comment has been minimized.

Show comment
Hide comment
@clrjunkie

clrjunkie Jul 26, 2016

@Maoni0 Can you please explain how this information gets updated after doing 'new' and what exactly is a thread 'alloc_contex' ?

@Maoni0 Can you please explain how this information gets updated after doing 'new' and what exactly is a thread 'alloc_contex' ?

@Maoni0

This comment has been minimized.

Show comment
Hide comment
@Maoni0

Maoni0 Jul 27, 2016

Member

@clrjunkie this is described in the Design of Allocator section in GC BotR. GC gives out allocation contexts and then JIT will usually sub allocate in this alloc context. When JIT runs out an alloc context it will goto GC and get a new one.

Member

Maoni0 commented Jul 27, 2016

@clrjunkie this is described in the Design of Allocator section in GC BotR. GC gives out allocation contexts and then JIT will usually sub allocate in this alloc context. When JIT runs out an alloc context it will goto GC and get a new one.

@clrjunkie

This comment has been minimized.

Show comment
Hide comment
@clrjunkie

clrjunkie Jul 27, 2016

@Maoni0 Thanks.

  • Can you please add diagram to the document (hyphens and pipes will suffice) to help visualize how heap segments are divided into object/empirical segments and how multiple heaps come into play. I mean given the following text, it’s unclear to me if each “heap segment” is partitioned into individual “object segments” or if one “object segment” spans multiple “heap segments”. Furthermore, I don't understand how ephemeral segments relate to small object heap segments (did you mean “objects of ephemeral generation”?)

Physical representation of the managed heap:

The heap segments are partitioned into small and large object segments, given the distinction of small and large objects. On each heap the heap segments are chained together. There is at least one small object segment and one large segment.

There’s always only one ephemeral segment in each small object heap, which is where gen0 and gen1 live.

In addition to the ephemeral segment, there can be zero, one or more additional segments, which will be gen2 segments since they only contain gen2 objects

  • Currently, only steps involved in the collection process are outlined under “Functional Behavior”. Can you please detail the steps involved for memory allocation as well?
  • My understanding is that the motivation behind “Allocation Contexts” is to allocate memory in chunks to reduce allocation latency. Will the AllocatedBytes reflect the sum of all Allocation Contexts or actual "committed object memory"? I mean if it’s the sum of all “Allocation Contexts” that implies that in the worst case the error margin for reporting actual user allocated bytes would be the sum of all “Allocation Context” size (e.g quantum) minus object overhead bytes + 1 byte minimum. If so, I think it’s important to document so the AllocatedBytes figure doesn’t look erroneous.
  • I don’t quite understand the relationship between “generations” and “allocation contexts”

On a single-processor (meaning 1 logical processor) machine, a single context is used, which is the generation 0 allocation context.

@Maoni0 Thanks.

  • Can you please add diagram to the document (hyphens and pipes will suffice) to help visualize how heap segments are divided into object/empirical segments and how multiple heaps come into play. I mean given the following text, it’s unclear to me if each “heap segment” is partitioned into individual “object segments” or if one “object segment” spans multiple “heap segments”. Furthermore, I don't understand how ephemeral segments relate to small object heap segments (did you mean “objects of ephemeral generation”?)

Physical representation of the managed heap:

The heap segments are partitioned into small and large object segments, given the distinction of small and large objects. On each heap the heap segments are chained together. There is at least one small object segment and one large segment.

There’s always only one ephemeral segment in each small object heap, which is where gen0 and gen1 live.

In addition to the ephemeral segment, there can be zero, one or more additional segments, which will be gen2 segments since they only contain gen2 objects

  • Currently, only steps involved in the collection process are outlined under “Functional Behavior”. Can you please detail the steps involved for memory allocation as well?
  • My understanding is that the motivation behind “Allocation Contexts” is to allocate memory in chunks to reduce allocation latency. Will the AllocatedBytes reflect the sum of all Allocation Contexts or actual "committed object memory"? I mean if it’s the sum of all “Allocation Contexts” that implies that in the worst case the error margin for reporting actual user allocated bytes would be the sum of all “Allocation Context” size (e.g quantum) minus object overhead bytes + 1 byte minimum. If so, I think it’s important to document so the AllocatedBytes figure doesn’t look erroneous.
  • I don’t quite understand the relationship between “generations” and “allocation contexts”

On a single-processor (meaning 1 logical processor) machine, a single context is used, which is the generation 0 allocation context.

@Maoni0 Maoni0 added the 0 - Backlog label Jul 27, 2016

@Maoni0

This comment has been minimized.

Show comment
Hide comment
@Maoni0

Maoni0 Jul 27, 2016

Member

Those are good suggestions for documentation and we will add them to our backlog. For the API, we will document them appropriately on MSDN.

Member

Maoni0 commented Jul 27, 2016

Those are good suggestions for documentation and we will add them to our backlog. For the API, we will document them appropriately on MSDN.

@clrjunkie

This comment has been minimized.

Show comment
Hide comment
@clrjunkie

clrjunkie Jul 27, 2016

"we will add them to our backlog. For the API, we will document them appropriately on MSDN"

@terrajobst this isn't serious..

"we will add them to our backlog. For the API, we will document them appropriately on MSDN"

@terrajobst this isn't serious..

@terrajobst

This comment has been minimized.

Show comment
Hide comment
@terrajobst

terrajobst Jul 29, 2016

Member

We've reviewed the API today with @Maoni0, @vancem, @KrzysztofCwalina, and @mellinoe and concluded the following:

  • We want to make sure this API is low-level, and can be used everywhere. Thus, we don't want to depend on the Thread class.
  • We want, however, to be able to the allocated bytes for the current thread as well as any thread.
namespace System
{
    public static class GC
    {
        public long GetAllocatedBytesForCurrentThread();
        public long GetAllocatedBytesForThread(int managedThreadId);
    }
}

We should also have a more high-level API that allows tracking, as @vancem explained above. But this should be a separate issue. I've filed #10431.

Also, this API will not track bytes across information contexts. This should be part of a higher level API.

Member

terrajobst commented Jul 29, 2016

We've reviewed the API today with @Maoni0, @vancem, @KrzysztofCwalina, and @mellinoe and concluded the following:

  • We want to make sure this API is low-level, and can be used everywhere. Thus, we don't want to depend on the Thread class.
  • We want, however, to be able to the allocated bytes for the current thread as well as any thread.
namespace System
{
    public static class GC
    {
        public long GetAllocatedBytesForCurrentThread();
        public long GetAllocatedBytesForThread(int managedThreadId);
    }
}

We should also have a more high-level API that allows tracking, as @vancem explained above. But this should be a separate issue. I've filed #10431.

Also, this API will not track bytes across information contexts. This should be part of a higher level API.

@ayende

This comment has been minimized.

Show comment
Hide comment
@ayende

ayende Jul 29, 2016

Contributor

Are there going to be cost associated with querying allocation on a different thread, vs. checking on the local one?

Contributor

ayende commented Jul 29, 2016

Are there going to be cost associated with querying allocation on a different thread, vs. checking on the local one?

@Maoni0

This comment has been minimized.

Show comment
Hide comment
@Maoni0

Maoni0 Jul 29, 2016

Member

as far as I can see, based on a cursory reading of code in mscorlib/vm, given a managed thread ID, you'll need to enumerate through managed threads till you find one with that ID. we could sort the managed thread IDs to make the look up faster but you still gotta do the look up without a Thread object. @ericeil would know if there's a better way...

Member

Maoni0 commented Jul 29, 2016

as far as I can see, based on a cursory reading of code in mscorlib/vm, given a managed thread ID, you'll need to enumerate through managed threads till you find one with that ID. we could sort the managed thread IDs to make the look up faster but you still gotta do the look up without a Thread object. @ericeil would know if there's a better way...

@ayende

This comment has been minimized.

Show comment
Hide comment
@ayende

ayende Jul 29, 2016

Contributor

But calling it on the local thread is fast? The reason I'm asking is that if there is going to be a linear search, that is going to hurt.
Alternatively, what about an object that represent tracking on a particular thread, so we can cache that? (Or maybe cache the thread object itself on calls to the static method?)

Contributor

ayende commented Jul 29, 2016

But calling it on the local thread is fast? The reason I'm asking is that if there is going to be a linear search, that is going to hurt.
Alternatively, what about an object that represent tracking on a particular thread, so we can cache that? (Or maybe cache the thread object itself on calls to the static method?)

@ayende

This comment has been minimized.

Show comment
Hide comment
@ayende

ayende Jul 29, 2016

Contributor

Just to note, I have some scenarios where I have greater than 2000 threads running on a system.

Contributor

ayende commented Jul 29, 2016

Just to note, I have some scenarios where I have greater than 2000 threads running on a system.

@clrjunkie

This comment has been minimized.

Show comment
Hide comment
@clrjunkie

clrjunkie Jul 29, 2016

public long GetAllocatedBytesForThread(int managedThreadId);

In my opinion, this overload makes no sense.

When would be the "right time" to call this overload (asynchronously) on the managedThreadId to reason about the thread's allocations?

What would the result of such method call or any successive call even mean when the result is based on arbitrary execution points of the target thread, especially when allocated bytes is cumulative?

public long GetAllocatedBytesForThread(int managedThreadId);

In my opinion, this overload makes no sense.

When would be the "right time" to call this overload (asynchronously) on the managedThreadId to reason about the thread's allocations?

What would the result of such method call or any successive call even mean when the result is based on arbitrary execution points of the target thread, especially when allocated bytes is cumulative?

@ayende

This comment has been minimized.

Show comment
Hide comment
@ayende

ayende Jul 29, 2016

Contributor

@clrjunkie Assume that I have a bunch of threads doing work. I can use this in a watchdog thread to monitor them and see that they aren't allocating too much over time, and if they do, I can stop them.

that is a pretty rare scenario, and would probably be handled better by the thread itself, though.

Contributor

ayende commented Jul 29, 2016

@clrjunkie Assume that I have a bunch of threads doing work. I can use this in a watchdog thread to monitor them and see that they aren't allocating too much over time, and if they do, I can stop them.

that is a pretty rare scenario, and would probably be handled better by the thread itself, though.

@clrjunkie

This comment has been minimized.

Show comment
Hide comment
@clrjunkie

clrjunkie Jul 29, 2016

@ayende
would probably be handled better by the thread itself, period.

@ayende
would probably be handled better by the thread itself, period.

@vancem

This comment has been minimized.

Show comment
Hide comment
@vancem

vancem Aug 1, 2016

Member

Immo @terrajobst, I don't think we should add the API

   public long GetAllocatedBytesForThread(int managedThreadId);

At this time. It is not clear to me what the scenario is other than simply accumulating work for all threads. We should either simply drop it, and if we want, we can add this API

    public long GetAllocatedBytesForAllThreads();

However I am conflicted with this one because we really don't want to synchronize the threads it so it is not really accurate (I suppose we can document that).

I am tempted at this point to only add GetAllocatedBytesForCurrentThread() until we have a clearer idea of the scenario for any of the other candicates.

Member

vancem commented Aug 1, 2016

Immo @terrajobst, I don't think we should add the API

   public long GetAllocatedBytesForThread(int managedThreadId);

At this time. It is not clear to me what the scenario is other than simply accumulating work for all threads. We should either simply drop it, and if we want, we can add this API

    public long GetAllocatedBytesForAllThreads();

However I am conflicted with this one because we really don't want to synchronize the threads it so it is not really accurate (I suppose we can document that).

I am tempted at this point to only add GetAllocatedBytesForCurrentThread() until we have a clearer idea of the scenario for any of the other candicates.

@ayende

This comment has been minimized.

Show comment
Hide comment
@ayende

ayende Aug 1, 2016

Contributor

This one already exists:

public long GetAllocatedBytesForAllThreads();

https://msdn.microsoft.com/en-us/library/system.appdomain.monitoringtotalallocatedmemorysize(v=vs.110).aspx

*Hibernating Rhinos Ltd *

Oren Eini* l CEO l *Mobile: + 972-52-548-6969

Office: +972-4-622-7811 *l *Fax: +972-153-4-622-7811

On Mon, Aug 1, 2016 at 10:18 PM, Vance Morrison notifications@github.com
wrote:

Immo @terrajobst https://github.com/terrajobst, I don't think we should
add the API

public long GetAllocatedBytesForThread(int managedThreadId);

public long GetAllocatedBytesForCurrentThread();
public long GetAllocatedBytesForAllThreads();


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#10157 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAHIszoON1FEggCujmhTfjXIQXu8mkQgks5qbkaBgaJpZM4JQMec
.

Contributor

ayende commented Aug 1, 2016

This one already exists:

public long GetAllocatedBytesForAllThreads();

https://msdn.microsoft.com/en-us/library/system.appdomain.monitoringtotalallocatedmemorysize(v=vs.110).aspx

*Hibernating Rhinos Ltd *

Oren Eini* l CEO l *Mobile: + 972-52-548-6969

Office: +972-4-622-7811 *l *Fax: +972-153-4-622-7811

On Mon, Aug 1, 2016 at 10:18 PM, Vance Morrison notifications@github.com
wrote:

Immo @terrajobst https://github.com/terrajobst, I don't think we should
add the API

public long GetAllocatedBytesForThread(int managedThreadId);

public long GetAllocatedBytesForCurrentThread();
public long GetAllocatedBytesForAllThreads();


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#10157 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAHIszoON1FEggCujmhTfjXIQXu8mkQgks5qbkaBgaJpZM4JQMec
.

@vancem

This comment has been minimized.

Show comment
Hide comment
@vancem

vancem Aug 1, 2016

Member

The issue is that AppDomain.MonitoringTotalAllocatedMemorySize does not exist in .NET Core. In general AppDomains are concept we don't wish to promote going forward, and frankly having the monitoring APIs split like this is not good.

But I am OK with simply not creating a GC.GetAllocatedBytesForAllThreads right now.

Member

vancem commented Aug 1, 2016

The issue is that AppDomain.MonitoringTotalAllocatedMemorySize does not exist in .NET Core. In general AppDomains are concept we don't wish to promote going forward, and frankly having the monitoring APIs split like this is not good.

But I am OK with simply not creating a GC.GetAllocatedBytesForAllThreads right now.

@terrajobst

This comment has been minimized.

Show comment
Hide comment
@terrajobst

terrajobst Aug 2, 2016

Member

@vancem neither adding GetAllocatedBytesForThread(int managedThreadId) nor GetAllocatedBytesForAllThreads() seems reasonable to me. I've updated the proposal.

Member

terrajobst commented Aug 2, 2016

@vancem neither adding GetAllocatedBytesForThread(int managedThreadId) nor GetAllocatedBytesForAllThreads() seems reasonable to me. I've updated the proposal.

@Maoni0

This comment has been minimized.

Show comment
Hide comment
@Maoni0

Maoni0 Aug 2, 2016

Member

@vancem, you don't need to synchronize for getting allocated for all threads. This would be done the same way we do AD allocated bytes - we don't synchronize; we just get each heap's allocated bytes. Unless you completely stopped every thread in the process except the current one, it will not be totally accurate anyway and the level of accuracy we are providing is sufficient. However, I would not create this because no one has asked to create it and I don't see a strong usage scenario for it.

@terrajobst, I am fine with not providing the API with managedThreadID. As I mentioned in the API review, I am not thrilled with the user providing such an ID (instead of the Thread object) anyway. While I can see a reasonable usage scenario for getting another thread's allocated bytes as @ayende mentioned, you could do it on the thread you are monitoring (may not be as elegant but then again neither is the API with the managed thread ID).

Member

Maoni0 commented Aug 2, 2016

@vancem, you don't need to synchronize for getting allocated for all threads. This would be done the same way we do AD allocated bytes - we don't synchronize; we just get each heap's allocated bytes. Unless you completely stopped every thread in the process except the current one, it will not be totally accurate anyway and the level of accuracy we are providing is sufficient. However, I would not create this because no one has asked to create it and I don't see a strong usage scenario for it.

@terrajobst, I am fine with not providing the API with managedThreadID. As I mentioned in the API review, I am not thrilled with the user providing such an ID (instead of the Thread object) anyway. While I can see a reasonable usage scenario for getting another thread's allocated bytes as @ayende mentioned, you could do it on the thread you are monitoring (may not be as elegant but then again neither is the API with the managed thread ID).

@clrjunkie

This comment has been minimized.

Show comment
Hide comment
@clrjunkie

clrjunkie Aug 2, 2016

Having the allocation monitoring code execute within the same thread as the code being monitored is not only elegant (and quite innovative) but more importantly: it provides a simple programing model for implementing the corrective action (no need for any signaling protocol between the watchdog thread and the worker thread). You can also log allocations per code sections, deterministically, and reason about those at a logical level as opposed to logging an arbitrary byte count after the fact. It’s also simpler to implement and more portable because you don’t need to figure out how to get the list of threads and manage them or deal with tracking a threadpool that shrinks/grows.

Having the allocation monitoring code execute within the same thread as the code being monitored is not only elegant (and quite innovative) but more importantly: it provides a simple programing model for implementing the corrective action (no need for any signaling protocol between the watchdog thread and the worker thread). You can also log allocations per code sections, deterministically, and reason about those at a logical level as opposed to logging an arbitrary byte count after the fact. It’s also simpler to implement and more portable because you don’t need to figure out how to get the list of threads and manage them or deal with tracking a threadpool that shrinks/grows.

@clrjunkie

This comment has been minimized.

Show comment
Hide comment
@clrjunkie

clrjunkie Aug 2, 2016

Thinking about this some more, if the API would prove to show a consistent use-case, it may be appropriate to discuss implementing support at the language level as well:

try(MAX_ALLOCATED_BYTES,` MAX_EXECUTION_TIME_MILLI)
{
}
catch(MaxAllocatedBytesException e)
{
    // Corrective Action
    return;
}
catch(MaxExecutionTimeException e)
{
    // Corrective Action
    return;
}

Where both constraints are checked at the end of the try block.

Obviously there need to be multiple ‘overloads’ for ‘try’ including one without arguments as exist today.

edit:

Possibly having the constraints named, so the order in which they are specified will determine what would be thrown first in case both thresholds are reached.

try(MaxAllocatedBytes=X,` MaxExecutionTimeMilli=Y)

clrjunkie commented Aug 2, 2016

Thinking about this some more, if the API would prove to show a consistent use-case, it may be appropriate to discuss implementing support at the language level as well:

try(MAX_ALLOCATED_BYTES,` MAX_EXECUTION_TIME_MILLI)
{
}
catch(MaxAllocatedBytesException e)
{
    // Corrective Action
    return;
}
catch(MaxExecutionTimeException e)
{
    // Corrective Action
    return;
}

Where both constraints are checked at the end of the try block.

Obviously there need to be multiple ‘overloads’ for ‘try’ including one without arguments as exist today.

edit:

Possibly having the constraints named, so the order in which they are specified will determine what would be thrown first in case both thresholds are reached.

try(MaxAllocatedBytes=X,` MaxExecutionTimeMilli=Y)
@ayende

This comment has been minimized.

Show comment
Hide comment
@ayende

ayende Aug 2, 2016

Contributor

@clrjunkie While I would absolutely love that, you are tying allocations to memory usage, and that isn't the case.
The problem is that something like:

for(var i =0; i < 1000 * 1000 ; i++)
{
   Console.WriteLine(i.ToString());
}

Here you are allocating a lot of memory, but your memory utilization is going to be pretty low because the GC can efficiently collect it.

Contributor

ayende commented Aug 2, 2016

@clrjunkie While I would absolutely love that, you are tying allocations to memory usage, and that isn't the case.
The problem is that something like:

for(var i =0; i < 1000 * 1000 ; i++)
{
   Console.WriteLine(i.ToString());
}

Here you are allocating a lot of memory, but your memory utilization is going to be pretty low because the GC can efficiently collect it.

@clrjunkie

This comment has been minimized.

Show comment
Hide comment
@clrjunkie

clrjunkie Aug 2, 2016

@ayende
My suggestion is syntactic sugar for what may turn out to be a common usage pattern of the proposed api.

As always, you need to pick the right tool for the problem your trying to solve (or monitor :).

So considering your example, the interesting problem is:

for(var i =0; i < 1000 * 1000 ; i++)
{
   try(MaxAllocatedBytes=1000)
   {
    tasks[i].execute(); 
   }
   catch(MaxAllocatedBytesException e)
   {
      // Boom! executed a task that allocated 1MB
      DoSomethingWith(tasks[i], e.AllocatedBytes);
   }
 }

@ayende
My suggestion is syntactic sugar for what may turn out to be a common usage pattern of the proposed api.

As always, you need to pick the right tool for the problem your trying to solve (or monitor :).

So considering your example, the interesting problem is:

for(var i =0; i < 1000 * 1000 ; i++)
{
   try(MaxAllocatedBytes=1000)
   {
    tasks[i].execute(); 
   }
   catch(MaxAllocatedBytesException e)
   {
      // Boom! executed a task that allocated 1MB
      DoSomethingWith(tasks[i], e.AllocatedBytes);
   }
 }
@clrjunkie

This comment has been minimized.

Show comment
Hide comment
@clrjunkie

clrjunkie Aug 2, 2016

@ayende
Also keep in mind that this API definitely does not substitute a profiler, but it seems useful when processing user submitted tasks that can vary in memory usage (e.g dynamic queries) , that is assuming of course you can establish a baseline for what’s considered valid and preferably divided the problem such you can anticipate an approaching abnormal situation before it occurs.

clrjunkie commented Aug 2, 2016

@ayende
Also keep in mind that this API definitely does not substitute a profiler, but it seems useful when processing user submitted tasks that can vary in memory usage (e.g dynamic queries) , that is assuming of course you can establish a baseline for what’s considered valid and preferably divided the problem such you can anticipate an approaching abnormal situation before it occurs.

@tenor

This comment has been minimized.

Show comment
Hide comment
@tenor

tenor Aug 2, 2016

Somewhat related to dotnet/coreclr#5563

tenor commented Aug 2, 2016

Somewhat related to dotnet/coreclr#5563

@joshfree joshfree added this to the 1.2.0 milestone Aug 3, 2016

@GSPP

This comment has been minimized.

Show comment
Hide comment
@GSPP

GSPP Aug 23, 2016

The API name AllocatedBytes (and similar) could suggest that these are the currently allocated bytes. I think TotalAllocatedBytes is a better name.

GSPP commented Aug 23, 2016

The API name AllocatedBytes (and similar) could suggest that these are the currently allocated bytes. I think TotalAllocatedBytes is a better name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment