Trial tracemalloc as memory tracker (replacement v2 PR) #5946
What I think we can deduce from the benchmark results
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5946   +/-   ##
=======================================
  Coverage   89.78%   89.78%
=======================================
  Files          93       93
  Lines       23007    23007
  Branches     5017     5017
=======================================
  Hits        20657    20657
  Misses       1620     1620
  Partials      730      730
☔ View full report in Codecov by Sentry.
Updated ideas
@trexfeathers pointed out that the above were run on a machine with only 4 CPUs (hyperthreaded), so the nworkers=4/9 results can be expected to be questionable. Hence the changes in d039ad0.
New results
typical run ...
Some interpretation
RunStyle checks are now more understandable.
BUT N.B. results are still not very stable from time to time
(but generally, they ARE about the same for both methods, as we would expect)
MemcheckBlocksAndWorkers
General picture
MEMORY now makes more sense: can see memory reducing with nblocks, and increasing with nworkers
compare methods
conclude
Outcome
nblocks = params["nblocks"]
nworkers = params["nworkers"]

nyfull = ysize // nblocks
It looks like you're applying this division twice. I think that's why adding blocks has such an exaggerated effect on performance.
Oh, yes I see !
Hopefully that may explain the peculiar behaviour of the timings too.
I will fix this and re-investigate ...
84ff4ad fixes, I think
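For readers following the thread: the faulty lines themselves are not quoted here, so the following is only a minimal sketch of how dividing by `nblocks` twice could exaggerate the effect of adding blocks. The function names are hypothetical and are not taken from the PR.

```python
def rows_per_block_intended(ysize, nblocks):
    # Intended behaviour: split the y dimension evenly into nblocks blocks.
    return ysize // nblocks


def rows_per_block_double_divided(ysize, nblocks):
    # Assumed shape of the bug: the block size is computed once ...
    nyfull = ysize // nblocks
    # ... and then divided by nblocks a second time elsewhere, so the
    # block size shrinks as 1/nblocks**2 instead of 1/nblocks.
    return nyfull // nblocks


if __name__ == "__main__":
    for nblocks in (1, 2, 4, 8):
        print(
            nblocks,
            rows_per_block_intended(1200, nblocks),
            rows_per_block_double_divided(1200, nblocks),
        )
```

Under this assumption, each block is `nblocks` times smaller than intended, which would be consistent with both the memory and the timing figures reacting too strongly to `nblocks`.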
Full results, from new code introduced in 84ff4ad
Conclusions:
⏱️ Performance Benchmark Report: a3a20b5
Performance shifts
Full benchmark results
Generated by GHA run
Note: the results reported in the whole-benchmark runs behave a little differently.
Added Just for reference.
Some combined results
Conclusions
The allocation pattern is a trifle hard to explain.
WIP DO NOT MERGE THIS, EVER
Some modified code for memory benchmarks, evaluating "tracemalloc" against our existing Linux process-RSS measuring technique.
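To make the comparison concrete, here is a minimal sketch of the two kinds of measurement, using only the standard library. It is not the benchmark code from this PR: the helper names are hypothetical, and using `resource.getrusage` for the process-RSS side is an assumption about how such a figure can be obtained on Linux.

```python
import resource  # Unix-only standard library module
import tracemalloc


def peak_traced_bytes(func, *args, **kwargs):
    """Run func and return the peak memory traced by tracemalloc, in bytes."""
    tracemalloc.start()
    try:
        func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) since tracing started.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def peak_rss_bytes():
    """Return the peak resident set size of this process so far.

    Assumes Linux, where ru_maxrss is reported in kibibytes
    (on macOS the same field is in bytes).
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss * 1024


def allocate(nbytes):
    # Hypothetical workload: hold one buffer of nbytes for a moment.
    buf = bytearray(nbytes)
    return len(buf)


if __name__ == "__main__":
    print("tracemalloc peak:", peak_traced_bytes(allocate, 10_000_000))
    print("process peak RSS:", peak_rss_bytes())
```

The two figures answer different questions: tracemalloc reports allocations made through Python's memory allocators while tracing is active, whereas peak RSS is a high-water mark for the whole process and never decreases, so it cannot isolate a single call.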
Additional context : sample output from a desktop run