Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Writing hit counts to a shared memory mapped area instead of to a file. #276

Merged
merged 7 commits into from
Jan 16, 2019

Conversation

petli
Copy link
Collaborator

@petli petli commented Dec 21, 2018

This might speed up UnloadModule enough that it will reliably execute within the
short time that ProcessUnload allows it, even on CI servers with load.

This is heavily based on work done by @Cyberboss, that unfortunately
showed that memory mapped files were to slow to use directly in RecordHit.

Note: I've only had time to test this code on Windows, and based on #241 there might be some differences in the memory map handling on Linux. That must be tested before this can be merged.

This might speed up UnloadModule enough that it will reliably execute within the
short time that ProcessUnload allows it, even on CI servers with load.

This is heavily based on work done by @Cyberboss, that unfortunately
showed that  memory mapped files were to slow to use directly in RecordHit.
@codecov
Copy link

codecov bot commented Dec 28, 2018

Codecov Report

Merging #276 into master will increase coverage by 0.29%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #276      +/-   ##
==========================================
+ Coverage   89.06%   89.36%   +0.29%     
==========================================
  Files          16       16              
  Lines        1912     1946      +34     
==========================================
+ Hits         1703     1739      +36     
+ Misses        209      207       -2
Impacted Files Coverage Δ
...overlet.core/Instrumentation/InstrumenterResult.cs 100% <ø> (ø) ⬆️
src/coverlet.core/Instrumentation/Instrumenter.cs 99.06% <100%> (+0.01%) ⬆️
src/coverlet.core/Coverage.cs 87.01% <100%> (+1.86%) ⬆️
src/coverlet.core/Symbols/CecilSymbolHelper.cs 100% <0%> (+0.46%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 63dff09...9b001ad. Read the comment docs.

@petli
Copy link
Collaborator Author

petli commented Dec 28, 2018

Having to fall back to file-based memory maps on Linux. This means I'm not sure how much this will really help with #210.

I did first do an experiment with having a background thread in ModuleTrackerTemplate that collects batches of hits from the threads and writes them to the memory map to try to do more work while the tests are running instead of all of it when it's done. It was predictably slower than this, but still faster than directly updating the memory maps in RecordHit. Perhaps something like that will end up being necessary. (In this case the work using a memory map instead of the hit file will not be wasted, as Coverage can remain as it is and the fancy stuff can be tried in ModuleTrackerTemplate.)

In that experiment I just collected 10k hits in a dictionary before shipping them to the worker thread through a BlockingQueue. Based on that I've been thinking of this approach:

  • Instrument each method to do a stackalloc of a hit count array for that method invocation
  • Instrument the hit counting to increment this array, rather than calling RecordHit
  • Instrument a try-finally to each method that calls RecordHit passing it as a Span<> using the new fancy ref readonly struct things
  • RecordHit updates the memory map with the hits. This would need to look into tradeoffs between Interlocked.Add() and locking the memory map or a segment of it from other threads

This would add a lot of overhead on calls to small methods, but that might perhaps be offset by a compensating lower total overhead on mid- to large methods.

@codemzs
Copy link

codemzs commented Jan 11, 2019

@tonerdo why is this PR not merged?

@petli
Copy link
Collaborator Author

petli commented Jan 11, 2019

Sorry @codemzs, I think this needs more discussion before merging as I'm dubious it helps much. I'm going to try another approach, so in the meantime I'll close the PR.

@petli petli closed this Jan 11, 2019
@petli
Copy link
Collaborator Author

petli commented Jan 11, 2019

@codemzs @tonerdo Boy, was my hunch wrong (again!). I added measurements to current coverlet, this branch, and another file -writing solution I wanted to try (details below). Writing hit counts to file, either solution, takes around 2000-3000 ticks, while the memory map writing takes around 100 ticks.

Caveats here is the usual: it is difficult to compare the results on a developer laptop with the situation on a loaded build machine. If that has less memory paging could end up causing the mmap solution to be slow too, perhaps slower than the less memory-intense operation of writing to files, but I retract my close and reopens this.

(
For reference, I thought it might be quicker to write the entire hits array as a buffer using the new spans, but it didn't make any significant difference (and in some cases were slower):

var intBuffer = new ReadOnlyMemory<int>(HitsArray);
unsafe
{
    var pinnedBuffer = intBuffer.Pin();
    var byteSpan = new ReadOnlySpan<byte>(pinnedBuffer.Pointer, HitsArray.Length * sizeof(int));
    bw.Write(byteSpan);
}

)

@petli petli reopened this Jan 11, 2019
@sharwell
Copy link
Contributor

@petli What about using Marshal.Copy?

@codemzs
Copy link

codemzs commented Jan 13, 2019

@petli @tonerdo @Cyberboss Can we please get some traction on this PR?


// No need to use Interlocked here since the mutex ensures only one thread updates
// the shared memory map.
*hitLocationArrayOffset += count;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❔ Is the file guaranteed to be zero-filled when it is first created?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's either a pure memory map (on Windows) or a new and thus empty file (on Unixlike), and in both cases zero pages are allocated for the map.

@petli
Copy link
Collaborator Author

petli commented Jan 14, 2019

@petli What about using Marshal.Copy?

@sharwell Does that help getting the bytes into the file? That seems to be more about copying an array into memory, but the Spans solve that without copying.

@codemzs
Copy link

codemzs commented Jan 14, 2019

@petli @tonerdo Is there an ETA when this fix can go in and a new nuget be released? We are trying to integrate coverlet in Microsoft's ML package and currently our CI builds are failing due to bug this PR fixes. I will highly appreciate an ETA to determine if we should use an alternative solution since we are under time constraint. Also, if there is a temporary workaround you can suggest that also will be appreciated.

CC: @shauheen

# Conflicts:
#	src/coverlet.template/ModuleTrackerTemplate.cs
}

s_isTracking = false;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the no-longer threadstatic field is set to false, in the assumption that UnloadModule is not called in parallel in the same appdomain. This is necessary for the multiple-call test case, but if that assumption is not true (cf #291 which changed it from treadstatic) another solution could be found.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, @sharwell removes this field entirely in #297... So that should go in to master first, then that change needs to be merged into this PR before it is (again) ready to merge.

@petli
Copy link
Collaborator Author

petli commented Jan 15, 2019

@tonerdo @codemzs @sharwell Merged with the other template-related changes, so this PR is again ready for merging.

@sharwell
Copy link
Contributor

sharwell commented Jan 15, 2019

I think we should merge #297 first since that is a substantial correctness change for the dotnet/coreclr team, and then update this pull request to focus on the performance improvements.

@petli Sorry for all the churn I know that can be frustrating.

@petli
Copy link
Collaborator Author

petli commented Jan 16, 2019

I think we should merge #297 first since that is a substantial correctness change for the dotnet/coreclr team, and then update this pull request to focus on the performance improvements.

@petli Sorry for all the churn I know that can be frustrating.

I agree on merging #297 first, that order will be easier to sort out codewise too. cc @tonerdo

# Conflicts:
#	src/coverlet.core/Instrumentation/Instrumenter.cs
#	src/coverlet.template/ModuleTrackerTemplate.cs
@petli
Copy link
Collaborator Author

petli commented Jan 16, 2019

@tonerdo This PR is now merged with the latest PRs from @sharwell

@tonerdo
Copy link
Collaborator

tonerdo commented Jan 16, 2019

@petli quick sanity check that the merge conflict resolution didn't affect the functionality of both this PR and #297

@sharwell
Copy link
Contributor

sharwell commented Jan 16, 2019

@tonerdo I verified that the most recent form does not impact the intent of my changes that were recently merged

@codemzs
Copy link

codemzs commented Jan 16, 2019

@tonerdo Do we have an ETA for the new nuget with this change?

@tonerdo
Copy link
Collaborator

tonerdo commented Jan 16, 2019

@codemzs you'd have a new NuGet within the next 24 hours

@codemzs
Copy link

codemzs commented Jan 16, 2019

Thanks, @tonerdo ! This is good news, looking forward to getting the new nuget and integrating coverlet with dotnet/machinelearning.

CC: @shauheen

@ViktorHofer
Copy link
Contributor

Unfortunately this broke System.Private.CoreLib coverage again as MemoryMappedFile isn't available in it: dotnet/corefx#34641 (comment).

@petli petli deleted the memory-mapped-hit-counts branch January 17, 2019 14:12
@ViktorHofer
Copy link
Contributor

ViktorHofer commented Jan 24, 2019

@petli you were looped in in the discussion in corefx where Jan pointed out that the early termination is not triggered by the CLR. I created microsoft/vstest#1900 as the vstest framework is the one who terminates the process before the OnProcessExit (+ ProcessExit events) call is completed. This happens if you use dotnet test or dotnet vstest.

I think we should revert this change and follow Jan's recommendation and just do sequential IO transfer. This will unblock the currently failing System.Private.CoreLib coverage measurements.

@codemzs
Copy link

codemzs commented Jan 24, 2019

@ViktorHofer Would that break dotnet/machinelearning?

@sharwell
Copy link
Contributor

@codemzs dotnet/machinelearning is still broken due to the vstest bug. I believe the performance improvements in #291 should avoid regressions until vstest is eventually updated to ensure the problem will not occur again.

@petli
Copy link
Collaborator Author

petli commented Jan 24, 2019 via email

@ViktorHofer
Copy link
Contributor

Thanks for understanding. Meanwhile I try to get in touch with the vstest team internally to better understand why they are terminating the application too soon.

ViktorHofer added a commit to ViktorHofer/coverlet that referenced this pull request Jan 24, 2019
…pped-hit-counts"

This reverts commit 0d285b1, reversing
changes made to 15a4e62.
ViktorHofer added a commit to ViktorHofer/coverlet that referenced this pull request Jan 24, 2019
…pped-hit-counts"

This reverts commit 0d285b1, reversing
changes made to 15a4e62.
ViktorHofer added a commit to ViktorHofer/coverlet that referenced this pull request Jan 24, 2019
…pped-hit-counts"

This reverts commit 0d285b1, reversing
changes made to 15a4e62.
ViktorHofer added a commit to ViktorHofer/coverlet that referenced this pull request Jan 24, 2019
…pped-hit-counts"

This reverts commit 0d285b1, reversing
changes made to 15a4e62.
tonerdo added a commit that referenced this pull request Jan 27, 2019
Revert "Merge pull request #276 from petli/memory-mapped-hit-counts"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants