Diagnostic support for NonGC heap #4156

Open
19 of 21 tasks
jkotas opened this issue Sep 19, 2022 · 11 comments
Labels: User Story (a single user-facing feature; can be grouped under an epic)

@jkotas
Member

jkotas commented Sep 19, 2022

[Jan had different text here before, but I rewrote it as the situation changed]

Background

Although the Frozen Object Heap has existed in the runtime for a long time, it has never been used for any prominent scenario and has had no investment to integrate well with diagnostic tools. Now that we want to use it automatically for string literals, we can no longer ignore it for diagnostics.
Context: dotnet/runtime#49576 (comment)
Also, VM/JIT related work items are tracked in dotnet/runtime#76151

The design

Several possible conceptual designs were discussed and we landed on disclosing the Frozen Object Heap as a new memory region that holds managed objects, but is distinct from the GC entirely. It has also been proposed that we stop calling it the Frozen Object Heap because it does support dynamic allocations at runtime. The current suggestion is to call it the Non-GC heap. This would give us a conceptual diagram like this:

                      .NET Managed Heap
                             |
                  -------------------------
                  |                       |
              GC heap                  Non-GC heap
                  |
         -------------------
         |        |        |        
        SOH      POH      LOH

A consequence of this decision is that the terms 'GC heap', 'Managed heap', '.NET heap', and '.NET Managed heap' previously all meant the same thing, and that is no longer true. Names that reference 'GC' are intended to exclude the Non-GC heap, and all the other names that do not mention 'GC' are generic and include both. Some concrete implications:

  • All the APIs under the 'System.GC' class solely provide information about the GC portion of the heap. We have no way to prevent users from calling GC.GetGeneration() on objects that aren't in the GC heap, but we propose to return -1 to indicate that these objects do not belong to any generation.
  • All of our GC* environment variables solely control/constrain the behavior of the GC portion of the heap and have no impact on the Non-GC heap.
  • All of our existing gc-* performance counters are not intended to include Non-GC heap information.
  • The existing ETW keywords for GCHeapDump, GCSampledObjectAllocation[high|low], GCHeapCollect may need to break the naming rules and apply to both GC and Non-GC heaps or we need to add a new provider because we are running out of keyword bits. This one is TBD.
  • Various ICorProfiler event masks that reference GC (COR_PRF_HIGH_BASIC_GC, COR_PRF_HIGH_MONITOR_GC_MOVED_OBJECTS, COR_PRF_MONITOR_GC) should only produce callbacks referring to GC heap objects, but event masks referencing allocation (COR_PRF_MONITOR_OBJECT_ALLOCATED, COR_PRF_ENABLE_OBJECT_ALLOCATED) should apply to all heaps. Although COR_PRF_HIGH_MONITOR_LARGEOBJECT_ALLOCATED doesn't explicitly say 'GC', the 'LARGEOBJECT' is a reference to the Large Object Heap and it would continue to apply only to that heap even if the non-GC heap supports objects of similar size eventually.
  • These ICorDebug APIs should (but currently do not) give results for both the GC and non-GC heap:
    ICorDebugProcess5::EnumerateHeap and ICorDebugProcess5::EnumerateHeapRegions
  • The CLRMD API Runtime.Heap should (but currently does not) represent the combined GC and non-GC heap. Runtime.Heap.EnumerateObjects() should (but currently does not) enumerate all objects from both heaps.
  • Our DAC APIs are not public APIs, so we have leeway to abuse terminology to make implementation easier, but preferably we would not do so, to keep things clear for our own developers.
  • The SOS command DumpHeap should (but currently does not) represent both the GC and non-GC heap. EEHeap should (but currently does not) include the Non-GC heap. EEHeap -gc remains only the GC heap.
  • Visual Studio and other 3rd party tools that analyze memory are encouraged to apply similar naming, but that is outside of our control.
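
The naming rule and the proposed GC.GetGeneration() behavior can be sketched as a small C++ model. This is only an illustration, not runtime code: the `HeapKind` enum, `ObjectInfo`, and the helper functions are hypothetical names, and the -1 result mirrors what is proposed above rather than a confirmed implementation.

```cpp
// Hypothetical model of the heap taxonomy in the diagram above.
enum class HeapKind { SOH, POH, LOH, NonGC };

// Names that mention "GC" exclude the Non-GC heap.
inline bool is_gc_heap(HeapKind kind) {
    return kind != HeapKind::NonGC;
}

// Names that do not mention "GC" (managed heap, .NET heap) cover every kind.
inline bool is_managed_heap(HeapKind) {
    return true;
}

// Illustrative stand-in for whatever the runtime knows about an object.
struct ObjectInfo {
    HeapKind heap;
    int gc_generation;  // only meaningful when the object lives in the GC heap
};

// Mirrors the proposal: objects outside the GC heap belong to no generation.
inline int get_generation(const ObjectInfo& obj) {
    return is_gc_heap(obj.heap) ? obj.gc_generation : -1;
}
```

Under this model a string literal placed on the Non-GC heap is still a managed object (`is_managed_heap` is true), but any API named after the GC either skips it or reports the sentinel value.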

There is a long history of treating the term "GC" as the heap containing all possible .NET objects, so it is likely that users, docs, and some tools will continue to use the term that way even though it is no longer precise. We felt the potential confusion caused by that is acceptable. For the vast majority of scenarios the Non-GC heap will be comparatively small, and the size of the GC heap is still a good approximation for the total managed heap. We will encourage vendors of managed memory analysis tools to update their data reporting so that developers who do care about the details have an accurate representation.

Work needed to support the Non-GC heap:

Must have - changes to runtime APIs and tools

Must have - Notify others in this area so that they can update tools and APIs if needed. Some of these may be no-ops.

  • CLRMD
  • Visual Studio debugger
  • Visual Studio profiler
  • TraceEvent and PerfView
  • Application Insights
  • WPA
  • !analyze and windbg
  • OpenTelemetry.NET
  • 3rd party profilers (via the profiling announcement issue)

Must have - Notify customers of the breaking change in GC.GetGeneration() + conceptual changes

Nice to have - additional diagnostic features

  • SOS !eeheap command needs to report statistics for the non-GC heap
  • Add new BCL APIs that let applications query Non-GC heap statistics
  • Add new performance counters exposing Non-GC heap statistics

category:testing
theme:testing
skill-level:intermediate
cost:medium
impact:small

@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost

ghost commented Sep 19, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details
  • Call allocate object callback when a frozen string literal is allocated
  • Report frozen segments from GetGenerationBounds profiler interface

Context: dotnet/runtime#49576 (comment)

Author: jkotas
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@noahfalk
Member

noahfalk commented Nov 4, 2022

@jkotas @Maoni0 - I understand that you both talked offline and agreed on this new path that Frozen Object Heap will not be included conceptually anywhere that we have named things as 'GC'. I updated the text above to capture that and updated impact on diagnostics accordingly. If you have any issues I want to get those resolved.

@jkotas
Member Author

jkotas commented Nov 5, 2022

> GCSampledObjectAllocation
> masks referencing allocation (COR_PRF_MONITOR_OBJECT_ALLOCATED, COR_PRF_ENABLE_OBJECT_ALLOCATED) should apply to all heaps.

How is this expected to work for file-backed non-GC heap segments? I think it would be more future-proof to not produce fine-grained allocation events for non-GC heap segments and to treat these allocations like other non-GC managed memory allocations, where we do not provide fine-grained events either.

@Maoni0
Member

Maoni0 commented Nov 7, 2022

this looks great, thanks for writing this up, @noahfalk!

!eeheap -gc does include non-GC heap segments currently, just because they are threaded onto gen2 (for GC bookkeeping purposes for segments; for regions I'm thinking of making that a no op so it wouldn't show up in sos by accident and instead will be tracked on the VM side which has this knowledge anyway).

I haven't used WPA for any managed analysis so I'm unclear what will need to be updated there.

I'm fine with @jkotas's suggestion above of not producing fine-grained allocation events for non-GC heap segments.

@noahfalk
Member

noahfalk commented Nov 8, 2022

> How is this expected to work for file-backed non-GC heap segments?

I was assuming that when the time came to support that scenario (which didn't appear to be now) we would add an alternate callback function that was more appropriate, and tools that cared about having complete information would need to register for that new callback as well. For example, that callback might be 'ModuleLoadBulkAllocation', identifying a range of memory or a set of ranges. Do you think that would be problematic? My goal was to avoid regressions in the overall scenario. In .NET 7 those string allocations are landing on the gen2 GC heap and users have visibility. It would be disappointing (but not horrible) to say diagnostic tooling in .NET 8 is less capable because of runtime implementation changes.

> !eeheap -gc does include non-GC heap segments currently ... for regions I'm thinking of making that a no op so it wouldn't show up in sos by accident

Thanks! Will there be an option to run without regions in .NET 8? If so then I think we should add a work item to the list that SOS needs to explicitly filter them out.

@jkotas
Member Author

jkotas commented Nov 8, 2022

> I was assuming that when the time came to support that scenario (which didn't appear to be now)

We do not have general support for it. We support it for selected customers.

> For example that callback might be 'ModuleLoadBulkAllocation' and it identifies a range of memory or a set of ranges.

I agree that we should have APIs/events that allow you to get the regions of memory where the non-GC managed objects are. Can the same API/events handle both dynamically created and file backed non-GC managed segments?

@noahfalk
Member

noahfalk commented Nov 9, 2022

> Can the same API/events handle both dynamically created and file backed non-GC managed segments?

We could make a callback that told tools when segments were created regardless of what kind they were, but I don't view that as a substitute for the object allocation callbacks. I expect tool authors will want to help users reason about why the memory was allocated. For dynamic segments that would mean individual object allocation callbacks, and for file backed segments it would be correlating it to a particular module load. That might end with APIs like:

// This already exists
ICorProfilerCallback::ObjectAllocated(ObjectID)

// New API I am not proposing now, but we could add it in the future.
// We could decide whether this callback fires for all segments, for non-GC segments only, or
// for file-backed non-GC segments only.
ICorProfilerCallbackXX::SegmentAllocated(SegmentID, int size, ModuleID associatedModuleLoad)
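
On the tool side, consuming both kinds of callback might look roughly like the sketch below. Everything in it is an assumption for illustration: `AllocationLedger` and its methods are hypothetical, the id types are stand-ins rather than the real profiling API types, `SegmentAllocated` does not exist today, and the object size is assumed to be looked up separately by the tool since the allocation callback does not carry it.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Stand-in id types for this sketch only; not the real profiling API headers.
using ObjectId = std::uintptr_t;
using SegmentId = std::uintptr_t;
using ModuleId = std::uintptr_t;

// Hypothetical tool-side bookkeeping that explains where managed memory came from.
class AllocationLedger {
public:
    // Driven by per-object allocation callbacks (dynamic allocations).
    void on_object_allocated(ObjectId /*object*/, std::size_t size) {
        object_bytes_ += size;
        ++object_count_;
    }

    // Driven by the hypothetical segment callback (e.g. file-backed segments),
    // attributing the whole segment to the module load that mapped it.
    void on_segment_allocated(SegmentId /*segment*/, std::size_t size, ModuleId module) {
        bytes_by_module_[module] += size;
        segment_bytes_ += size;
    }

    std::size_t object_count() const { return object_count_; }
    std::size_t object_bytes() const { return object_bytes_; }

    // Bytes attributed to a module load; zero if the module mapped no segments.
    std::size_t bytes_for_module(ModuleId module) const {
        auto it = bytes_by_module_.find(module);
        return it == bytes_by_module_.end() ? 0 : it->second;
    }

    // Total memory the tool can explain through either kind of callback.
    std::size_t total_bytes() const { return object_bytes_ + segment_bytes_; }

private:
    std::size_t object_count_ = 0;
    std::size_t object_bytes_ = 0;
    std::size_t segment_bytes_ = 0;
    std::map<ModuleId, std::size_t> bytes_by_module_;
};
```

The point of keeping the two paths separate is that the tool can still answer "why was this memory allocated" in both cases: per object for dynamic segments, per module load for file-backed ones.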

@leculver
Contributor

leculver commented Apr 18, 2023

I checked the SOS and ClrMD boxes.

As of now, SOS (in the main branch of dotnet/diagnostics) should fully support frozen objects in all commands. Feel free to raise a specific issue there if a command is misbehaving. For example, !dumpheap will enumerate those objects, !eeheap reports frozen segments, !gcwhere will report frozen objects, etc.

@EgorBo
Member

EgorBo commented Apr 18, 2023

> I checked the SOS and ClrMD boxes.
>
> As of now, SOS (in the main branch of dotnet/diagnostics) should fully support frozen objects in all commands. Feel free to raise a specific issue there if a command is misbehaving. For example, !dumpheap will enumerate those objects, !eeheap reports frozen segments, !gcwhere will report frozen objects, etc.

Awesome! Thanks!

@tommcdon
Member

Transferring to the diagnostics work to track the remaining documentation and communication work which can be done out of band from runtime work
