Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

track type of native allocations when available #857

Merged
merged 3 commits into from Mar 20, 2019

Conversation

@kayle
Copy link

kayle commented Jan 17, 2019

This PR demonstrates how to retrieve the type of native allocations as described in #780

This would be useful functionality to have in the TraceEvent nuget package to allow for scenarios like dotnet/BenchmarkDotNet#457

There's a few questions in the PR on where certain functionality belongs.

src/PerfView/PerfViewData.cs Outdated Show resolved Hide resolved
if (allocationSites.TryGetValue(module, out var heapAllocationSites) &&
heapAllocationSites.FirstOrDefault(x => x.Rva == cs.CodeAddress.Address - module.ImageBase)?.TypeName is string name)
{
typeName = name;

This comment has been minimized.

Copy link
@kayle

kayle Jan 17, 2019

Author

Note that this currently resolves most allocations from STL containers as type "void" because the std::allocator implementation uses void* as the return type. I think the Visual Studio implementation of the feature may special case this to extract the T, but I wanted to keep this simple for a first pass.

src/PerfView/PerfViewData.cs Outdated Show resolved Hide resolved
src/PerfView/PerfViewData.cs Outdated Show resolved Hide resolved
src/PerfView/PerfViewData.cs Outdated Show resolved Hide resolved
src/PerfView/PerfViewData.cs Outdated Show resolved Hide resolved
src/PerfView/PerfViewData.cs Outdated Show resolved Hide resolved
src/PerfView/PerfViewData.cs Outdated Show resolved Hide resolved
src/PerfView/PerfViewData.cs Outdated Show resolved Hide resolved
Copy link
Contributor

vancem left a comment

I left some proposed changes in the line-by-line comments

@vancem

This comment has been minimized.

Copy link
Contributor

vancem commented Jan 22, 2019

Thanks for following up to make this PR. I like the general ideal, and commented on some of the details.

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Jan 23, 2019

Codecov Report

Merging #857 into master will increase coverage by 0.13%.
The diff coverage is 0%.

@@            Coverage Diff             @@
##           master     #857      +/-   ##
==========================================
+ Coverage   17.38%   17.51%   +0.13%     
==========================================
  Files         222      225       +3     
  Lines      137987   138989    +1002     
  Branches    12219    12316      +97     
==========================================
+ Hits        23984    24350     +366     
- Misses     113065   113696     +631     
- Partials      938      943       +5
Flag Coverage Δ
#2017 17.51% <0%> (+0.13%) ⬆️
#Debug 17.51% <0%> (+0.13%) ⬆️
Impacted Files Coverage Δ
src/PerfView/PerfViewData.cs 3.85% <0%> (-0.03%) ⬇️
src/TraceEvent/Symbols/NativeSymbolModule.cs 0% <0%> (ø) ⬆️
src/PerfView/StackViewer/StackWindow.xaml.cs 24.8% <0%> (-0.04%) ⬇️
src/MemoryGraph/graph.cs 0% <0%> (ø) ⬆️
src/PerfView/GuiUtilities/WindowBase.cs 83.33% <0%> (ø) ⬆️
src/PerfView/UserCommands.cs 0% <0%> (ø) ⬆️
src/MemoryGraph/MemoryGraph.cs 0% <0%> (ø) ⬆️
src/FastSerialization/StreamReaderWriter.cs 75.19% <0%> (ø) ⬆️
src/FastSerialization/GrowableArray.cs 59.74% <0%> (ø) ⬆️
...c/TraceEvent/Stacks/SpeedScopeStackSourceWriter.cs 60.73% <0%> (ø)
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update eca7c00...61bb21b. Read the comment docs.

@@ -5713,6 +5713,10 @@ protected internal override StackSource OpenStackSourceImpl(string streamName, T
var heapParser = new HeapTraceProviderTraceEventParser(eventSource);
Dictionary<Address, StackSourceSample> lastHeapAllocs = null;

Dictionary<TraceModuleFile, NativeSymbolModule> loadedModules = new Dictionary<TraceModuleFile, NativeSymbolModule>();
var allocationTypeNames = new Dictionary<CallStackIndex, string>();
var symReader = GetSymbolReader(log);

This comment has been minimized.

Copy link
@kayle

kayle Jan 23, 2019

Author

Is this the appropriate way to get a SymbolReader? The CacheOnly argument can skip pdbs, which results in not finding any typenames for allocations from those modules. Even if I later load symbols for a module, I don't see a way to refresh the entire stackSource.

if (!loadedModules.TryGetValue(module, out var symbolModule))
{
loadedModules[module] = symbolModule =
(module.PdbSignature != Guid.Empty

This comment has been minimized.

Copy link
@kayle

kayle Jan 23, 2019

Author

I expected there to be a direct call to get a SymbolModule from a TraceModuleFile, but I didn't find one. This is my best guess based on some other areas of the code.

private IDiaDataSource3 m_source;
private IDiaEnumSymbolsByAddr m_symbolsByAddr;
internal readonly IDiaSession m_session;
private readonly SymbolReader m_reader;

This comment has been minimized.

Copy link
@kayle

kayle Jan 23, 2019

Author

A few style questions for this area:

  1. Is it good to switch to constructors to allow for readonly fields?
  2. Any reason for the bool/data field combination instead of just using Lazy
  3. Should the m_heapAllocationSites initializer be embedded in constructor or extracted to static method?
@kayle

This comment has been minimized.

Copy link
Author

kayle commented Jan 23, 2019

Thanks for the feedback. I left a few more questions as comments in the updated files.

{
if (!allocationTypeNames.TryGetValue(csi, out var typeName))
{
for (var current = csi; current != CallStackIndex.Invalid; current = eventLog.CallStacks.Caller(current))

This comment has been minimized.

Copy link
@vancem

vancem Jan 23, 2019

Contributor

Is there a reasonable bound we can put on number of frames of the stack crawl? (e.g. are the number of frames below the alloc allways less than say 10?) That would be nice as it cuts down search by potentially a large number (stacks are often 50 or more, and we are searching every unique stack).

This comment has been minimized.

Copy link
@kayle

kayle Jan 23, 2019

Author

Yes, but unfortunately I think 10 is a bit too small. From my limited testing, 15 frames was the longest chain I saw, so I'm thinking of starting with a limit of 20.

Name
wchar_t

  • wow64!Wow64SystemServiceEx
  • wow64cpu!ServiceNoTurbo
  • wow64cpu!BTCpuSimulate
  • wow64!RunCpuSimulation
    + wow64!Wow64LdrpInitialize
    • ntdll!LdrpInitializeProcess
    • ntdll!??_LdrpInitialize
    • ntdll!LdrpInitialize
    • ntdll!LdrInitializeThunk
      + ntdll!NtTraceEvent
      • ntdll!RtlpLogHeapAllocateEvent
      • ntdll!??RtlpAllocateHeapInternal
      • ntdll!RtlAllocateHeap
      • combase!utInitModuleNameCache

This comment has been minimized.

Copy link
@vancem

vancem Jan 23, 2019

Contributor

20 is fine, indeed 25 is fine (that still avoids 1/2 the cost for almost nothing).

@@ -5713,6 +5713,10 @@ protected internal override StackSource OpenStackSourceImpl(string streamName, T
var heapParser = new HeapTraceProviderTraceEventParser(eventSource);
Dictionary<Address, StackSourceSample> lastHeapAllocs = null;

Dictionary<TraceModuleFile, NativeSymbolModule> loadedModules = new Dictionary<TraceModuleFile, NativeSymbolModule>();

This comment has been minimized.

Copy link
@vancem

vancem Jan 23, 2019

Contributor

I won't require this, but you can turn loadedModules into an array of NativeSymbolModule. Every TraceModuleFile has a Index property which is a small integer whose bound is given by TraceModuleFiles.Count. Thus you can turn the dictionary into an array.

This comment has been minimized.

Copy link
@kayle

kayle Jan 23, 2019

Author

It's a bit awkward to keep track of initialization with an array. Would you use something like Optional[] or (NativeSymbolModule, bool)[]?

@vancem

This comment has been minimized.

Copy link
Contributor

vancem commented Jan 23, 2019

Only a few minor things (mentioned in line comments) left.

Kayle Hinkle
@adamsitnik

This comment has been minimized.

Copy link
Member

adamsitnik commented Mar 19, 2019

@vancem what is the status of this PR? I can see that @kayle has pushed some code cleanup commit after your review.

@vancem

This comment has been minimized.

Copy link
Contributor

vancem commented Mar 19, 2019

@adamsitnik - I will take a look later today or tomorrow.

@vancem

This comment has been minimized.

Copy link
Contributor

vancem commented Mar 20, 2019

I looked this over. It still has the big non-pay-for-play issue (basically it will try to download every PDB in the trace (often 100s since it includes every OS dll from every process). Thus the window will not open for typically 10s of seconds (maybe 100s of seconds).

But I took some time to implement the work-around that I suggested in the PR. Thus I will check this in and immediately follow up with my PR to bring back pay-for-play.

@vancem vancem merged commit b5f7fd9 into microsoft:master Mar 20, 2019
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
license/cla All CLA requirements met.
Details
@vancem

This comment has been minimized.

Copy link
Contributor

vancem commented Mar 20, 2019

A big thank you to @kayle for implementing this. It is very nice to have the C++ types available and it was pretty non-trivial to track down the new information in the PDBS that enabled this and then learn enough about PerfView to incorporate it. Thanks!

vancem added a commit to vancem/perfview that referenced this pull request Mar 20, 2019
This improves microsoft#857.
In particulare it does not look up every PDB agressively.
@kayle

This comment has been minimized.

Copy link
Author

kayle commented Mar 21, 2019

Thanks @vancem for getting this in and adding the fix to allow refreshing after loading symbols.

@kayle kayle deleted the kayle:native-allocation-type-tracking branch Mar 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.