Writing hit counts to a shared memory mapped area instead of to a file. #276
This might speed up UnloadModule enough that it will reliably execute within the short time that ProcessUnload allows it, even on CI servers under load. This is heavily based on work done by @Cyberboss, which unfortunately showed that memory mapped files were too slow to use directly in RecordHit.
Codecov Report
Coverage diff, master vs. #276:
            master     #276      +/-
Coverage    89.06%   89.36%   +0.29%
Files           16       16
Lines         1912     1946      +34
Hits          1703     1739      +36
Misses         209      207       -2
I had to fall back to file-based memory maps on Linux, so I'm not sure how much this will really help with #210.
I first experimented with a background thread in ModuleTrackerTemplate that collects batches of hits from the threads and writes them to the memory map, to do more of the work while the tests are running instead of all of it at the end. It was predictably slower than this PR, but still faster than updating the memory maps directly in RecordHit. Perhaps something like that will end up being necessary. (In that case the work on using a memory map instead of the hit file will not be wasted, as Coverage can remain as it is and the fancy stuff can be tried in ModuleTrackerTemplate.)
In that experiment I just collected 10k hits in a dictionary before shipping them to the worker thread through a BlockingQueue. Based on that I've been thinking of this approach:
This would add a lot of overhead on calls to small methods, but that might be offset by a lower total overhead on mid-sized to large methods.
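For readers who want to picture the batching experiment described above, here is a minimal sketch of the idea, not the code from the experiment. It assumes .NET's `BlockingCollection<T>` as the blocking queue, and `BatchingHitRecorder`, `FlushCurrentThread` and `Complete` are hypothetical names; a plain `int[]` stands in for the memory map.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch; none of these names come from coverlet.
public sealed class BatchingHitRecorder
{
    private const int BatchSize = 10_000; // batch size mentioned in the thread

    // Each thread fills its own dictionary, so RecordHit takes no lock.
    private readonly ThreadLocal<Dictionary<int, int>> _batch =
        new ThreadLocal<Dictionary<int, int>>(() => new Dictionary<int, int>());
    private readonly ThreadLocal<int> _pending = new ThreadLocal<int>();

    private readonly BlockingCollection<Dictionary<int, int>> _queue =
        new BlockingCollection<Dictionary<int, int>>();
    private readonly int[] _totals; // assumption: stand-in for the memory map
    private readonly Task _worker;

    public BatchingHitRecorder(int hitLocationCount)
    {
        _totals = new int[hitLocationCount];
        _worker = Task.Run(Drain); // single consumer, so _totals needs no lock
    }

    public void RecordHit(int hitLocationIndex)
    {
        Dictionary<int, int> batch = _batch.Value;
        batch.TryGetValue(hitLocationIndex, out int count);
        batch[hitLocationIndex] = count + 1;

        _pending.Value = _pending.Value + 1;
        if (_pending.Value >= BatchSize)
        {
            FlushCurrentThread();
        }
    }

    // Hands the calling thread's batch to the worker and starts a fresh one.
    public void FlushCurrentThread()
    {
        if (_pending.Value == 0) return;
        _queue.Add(_batch.Value);
        _batch.Value = new Dictionary<int, int>();
        _pending.Value = 0;
    }

    // Call after every recording thread has flushed its last partial batch.
    public int[] Complete()
    {
        _queue.CompleteAdding();
        _worker.Wait();
        return _totals;
    }

    private void Drain()
    {
        foreach (Dictionary<int, int> batch in _queue.GetConsumingEnumerable())
        {
            foreach (KeyValuePair<int, int> hit in batch)
            {
                _totals[hit.Key] += hit.Value;
            }
        }
    }
}
```

The cost that matters in this sketch is the dictionary update on every hit; the queue hand-off happens only once per 10k hits, which is consistent with the experiment being slower than the PR but faster than touching the map in RecordHit.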
@tonerdo why is this PR not merged?
Sorry @codemzs, I think this needs more discussion before merging as I'm dubious it helps much. I'm going to try another approach, so in the meantime I'll close the PR.
@codemzs @tonerdo Boy, was my hunch wrong (again!). I added measurements to current coverlet, this branch, and another file-writing solution I wanted to try (details below). Writing hit counts to file, with either solution, takes around 2000-3000 ticks, while the memory map writing takes around 100 ticks. The usual caveat applies: it is difficult to compare results on a developer laptop with the situation on a loaded build machine. If the build machine has less memory, paging could end up making the mmap solution slow too, perhaps slower than the less memory-intensive operation of writing to files. Still, I retract my close and reopen this.
@petli What about using |
@petli @tonerdo @Cyberboss Can we please get some traction on this PR?
// No need to use Interlocked here since the mutex ensures only one thread updates
// the shared memory map.
*hitLocationArrayOffset += count;
❔ Is the file guaranteed to be zero-filled when it is first created?
Yes, it's either a pure memory map (on Windows) or a new and thus empty file (on Unix-like systems), and in both cases zero pages are allocated for the map.
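As an illustration of why the zero-fill matters, here is a minimal sketch of a mutex-guarded merge into a file-backed map. It is not the PR's code: the PR adds through a pointer, while this sketch uses `MemoryMappedViewAccessor` for brevity, and `HitCountMapSketch`, `AddHits`, `Demo` and the mutex name are hypothetical. It assumes a freshly created backing file, which is what makes the read-add-write start from zero.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Threading;

// Hypothetical sketch; names and layout are assumptions, not coverlet's code.
public static class HitCountMapSketch
{
    // Adds one set of per-location counts to the totals stored in the map.
    public static void AddHits(MemoryMappedViewAccessor view, Mutex mutex, int[] hits)
    {
        mutex.WaitOne();
        try
        {
            // The mutex serializes writers, so a plain read-add-write is enough.
            for (int i = 0; i < hits.Length; i++)
            {
                if (hits[i] == 0) continue;
                long offset = (long)i * sizeof(int);
                view.Write(offset, view.ReadInt32(offset) + hits[i]);
            }
        }
        finally
        {
            mutex.ReleaseMutex();
        }
    }

    public static int[] Demo()
    {
        const int locations = 3;
        string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        try
        {
            // A newly created file grown to the requested capacity reads as zeros.
            // mapName is null because this sketch uses a file-backed map only.
            using (var mutex = new Mutex(false, "hit-count-sketch-" + Guid.NewGuid().ToString("N")))
            using (var map = MemoryMappedFile.CreateFromFile(
                path, FileMode.CreateNew, null, locations * sizeof(int)))
            using (var view = map.CreateViewAccessor())
            {
                AddHits(view, mutex, new[] { 1, 0, 2 });
                AddHits(view, mutex, new[] { 4, 5, 0 });

                var totals = new int[locations];
                view.ReadArray(0, totals, 0, locations);
                return totals;
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

In `Demo` the two merges are expected to leave totals of 5, 5 and 2, since nothing was in the file before the first addition.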
test/coverlet.core.tests/Instrumentation/ModuleTrackerTemplateTests.cs
@sharwell Does that help with getting the bytes into the file? That seems to be more about copying an array into memory, but the Spans solve that without copying.
@petli @tonerdo Is there an ETA for when this fix can go in and a new NuGet package be released? We are trying to integrate coverlet in Microsoft's ML package and currently our CI builds are failing due to the bug this PR fixes. I would highly appreciate an ETA so we can determine whether we should use an alternative solution, since we are under time constraints. Also, if there is a temporary workaround you can suggest, that would be appreciated too. CC: @shauheen
# Conflicts: # coverlet.sln
# Conflicts: # src/coverlet.template/ModuleTrackerTemplate.cs
}

s_isTracking = false;
Here the no-longer-threadstatic field is set to false, on the assumption that UnloadModule is not called in parallel in the same appdomain. This is necessary for the multiple-call test case, but if that assumption does not hold (cf. #291, which changed the field from threadstatic), another solution could be found.
I agree on merging #297 first, that order will be easier to sort out codewise too. cc @tonerdo
# Conflicts: # src/coverlet.core/Instrumentation/Instrumenter.cs # src/coverlet.template/ModuleTrackerTemplate.cs
@tonerdo I verified that the most recent form does not impact the intent of my changes that were recently merged.
@tonerdo Do we have an ETA for the new NuGet package with this change?
@codemzs you'll have a new NuGet package within the next 24 hours.
Unfortunately this broke System.Private.CoreLib coverage again as MemoryMappedFile isn't available in it: dotnet/corefx#34641 (comment).
@petli you were looped in on the discussion in corefx where Jan pointed out that the early termination is not triggered by the CLR. I created microsoft/vstest#1900, as the vstest framework is the one that terminates the process before the OnProcessExit (+ ProcessExit events) call is completed. This happens if you use
I think we should revert this change, follow Jan's recommendation, and just do sequential IO transfer. This will unblock the currently failing System.Private.CoreLib coverage measurements.
@ViktorHofer Would that break dotnet/machinelearning?
I agree, no point complicating things with memory maps when they won't always help.
On 24 Jan 2019, at 19:26, Sam Harwell wrote:
@codemzs dotnet/machinelearning is still broken due to the vstest bug. I believe the performance improvements in #291 should avoid regressions until vstest is eventually updated to ensure the problem will not occur again.
Thanks for understanding. Meanwhile I'll try to get in touch with the vstest team internally to better understand why they are terminating the application too soon.
Revert "Merge pull request #276 from petli/memory-mapped-hit-counts"
This might speed up UnloadModule enough that it will reliably execute within the
short time that ProcessUnload allows it, even on CI servers under load.
This is heavily based on work done by @Cyberboss, which unfortunately
showed that memory mapped files were too slow to use directly in RecordHit.
Note: I've only had time to test this code on Windows, and based on #241 there might be some differences in the memory map handling on Linux. That must be tested before this can be merged.