Parallelize main loop #239
```
@@            Coverage Diff             @@
##           master     #239      +/-   ##
==========================================
+ Coverage   87.35%   88.71%   +1.36%
==========================================
  Files          12       13       +1
  Lines        1336     1471     +135
  Branches      243      267      +24
==========================================
+ Hits         1167     1305     +138
+ Misses        115      108       -7
- Partials       54       58       +4
```
Thank you for this awesome feature!
I've skimmed over the PR, and there are lots of unrelated changes regarding Docker and the README. Can we rewrite the history to exclude these changes? I'm not quite sure how to do this; possibly with `git rebase`?
My goal regarding the git history is that the logical evolution of this code is evident when debugging in the future, not necessarily that it records your actual process.
This PR will conflict with #237, which changes argument parsing to the argparse module. This will change how optional arguments are handled. I think your
After these points are addressed I'll take another look. E.g. we might be able to avoid some competing gcov processes by grouping work items for the same directory? And it might be possible to avoid spinning up more threads than needed? I don't know yet. But that doesn't have to be part of this PR.
Please let me know if there's anything I can do to help, especially regarding Git history simplification.
@latk After some experimentation I can compress this into a single commit, without the Dockerfile, .dockerignore, or changes to README.rst, using git rebase. That probably makes the most sense, as then my local figuring-stuff-out commits don't pollute anything?
I think we can certainly do better with competing gcov processes for certain build types. Maybe calculating the potential working directories in one pass, then running the threads in a second pass, selecting items that don't conflict? Another option, only applicable to later GCC versions, is for two gcov processes to share a directory and have one use gcov's `-x` (`--hash-filenames`) option to append hashes to its output file names.
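A rough sketch of the first idea, under stated assumptions: the `wd`/`f` work-item dicts, the worker shape, and the lock-protected job list are all illustrative, not gcovr's actual internals. The point is only that bundling all items sharing a working directory into one job means no two workers ever run gcov in the same directory concurrently.

```python
import collections
import threading

def group_by_directory(work_items):
    """Bundle work items that share a working directory into one job."""
    groups = collections.defaultdict(list)
    for item in work_items:
        groups[item["wd"]].append(item)
    return list(groups.values())

def worker(jobs, lock, processed):
    while True:
        with lock:
            if not jobs:
                return
            bundle = jobs.pop()
        # All items in a bundle share one directory, so this worker's gcov
        # output files cannot collide with another worker's.
        for item in bundle:
            with lock:
                processed.append(item["f"])  # a real run_gcov(item) would go here

items = [{"wd": "a", "f": 1}, {"wd": "b", "f": 2}, {"wd": "a", "f": 3}]
jobs = group_by_directory(items)      # two bundles: wd "a" (2 items), wd "b" (1)
lock = threading.Lock()
processed = []
threads = [threading.Thread(target=worker, args=(jobs, lock, processed))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(processed))  # [1, 2, 3]
```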
Ok, squashing all commits into one still results in a large commit, but the changes excluding the tests are fairly compact:
I don't know whether this or #237 will be merged first, so don't rebase prematurely. Second to merge gets to resolve the conflicts :)
I don't think
OK, done. I've merged all of the relevant commits into one.
@latk @mayeut I've submitted another revision (not rebased, though, so you can see the change) which I think addresses all the issues. I went with a sentinel object and a per-thread context that I merge in main. It solves the deadlock and crashes on Windows, and I think it gives marginally better performance on SPARC too.
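For readers following along, the sentinel plus per-thread-context pattern looks roughly like this. This is a minimal sketch, not the PR's actual code: the squaring work stands in for per-file gcov processing, and the queue/merge shape is an assumption.

```python
import queue
import threading

SENTINEL = object()  # unique stop marker; never equal to real work

def worker(work_queue, context):
    while True:
        item = work_queue.get()
        if item is SENTINEL:
            work_queue.put(SENTINEL)  # re-queue it so other workers stop too
            return
        context[item] = item * item  # stand-in for per-file coverage data

def run(items, num_threads=4):
    work_queue = queue.Queue()
    for item in items:
        work_queue.put(item)
    work_queue.put(SENTINEL)
    contexts = [dict() for _ in range(num_threads)]
    threads = [threading.Thread(target=worker, args=(work_queue, ctx))
               for ctx in contexts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    merged = {}
    for ctx in contexts:  # merge the per-thread results in the main thread
        merged.update(ctx)
    return merged

print(sorted(run([1, 2, 3]).items()))  # [(1, 1), (2, 4), (3, 9)]
```

Because each worker writes only to its own context and merging happens after all joins, no locking is needed around the result data.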
This is really great work, thank you so much for working through all these details!
Unfortunately, multithreaded code is like a hydra: solve one race condition and another pops up. Below, I have a few questions/observations about exception propagation.
I've found a few points, but am not sure how important they are.
My intention would be to merge this PR soonishly even if not all of my concerns are addressed, and then see if any problems crop up. It may be possible to refactor and simplify this code further at some point in the future.
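On exception propagation: one common pattern (again only a sketch, with hypothetical names like `propagate` and `flaky`, not this PR's implementation) is for each worker to record any exception it hits so the main thread can inspect or re-raise it after joining, rather than letting a thread die silently.

```python
import threading

def propagate(target, *args):
    """Run `target` in a thread; capture any exception for the caller."""
    box = {}
    def wrapper():
        try:
            target(*args)
        except BaseException as exc:  # record it instead of swallowing it
            box["exc"] = exc
    thread = threading.Thread(target=wrapper)
    thread.start()
    return thread, box

def flaky(n):
    if n == 2:
        raise ValueError("bad input: %d" % n)

pairs = [propagate(flaky, n) for n in range(4)]
for thread, box in pairs:
    thread.join()
errors = [box["exc"] for _, box in pairs if "exc" in box]
print([type(e).__name__ for e in errors])  # ['ValueError']
```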
Thank you, this looks great. I'll try to merge during the weekend.
One topic for future investigation is that the tests now seem to take 20% longer (judging from the Travis and Appveyor reports). This is more than I would have expected. This could be an artefact of the test runner, or an actual performance regression. If it's an actual problem, this might be fixable by bundling all jobs for one directory into a single job, as that reduces waiting on locks.
Can you re-verify the performance improvements on SPARC? I'm interested in a wall-clock (elapsed) time comparison between gcovr on master and gcovr on this PR when running in parallel mode.