parallelize main loop of gcovr #3
The maximum number of parallel jobs should depend on the number of CPU cores, among other things: http://stackoverflow.com/questions/1006289/how-to-find-out-the-number-of-cpus-using-python
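As the linked Stack Overflow question covers, the core count is available directly from the Python standard library; a minimal sketch (the function name here is illustrative, not gcovr's actual API):

```python
import os

def default_job_count():
    # os.cpu_count() may return None on platforms where the count
    # cannot be determined, so fall back to a single job.
    return os.cpu_count() or 1
```

A `-j` style option could then default to this value while still letting the user override it.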
We've hit this issue running tests on a Sparc box, so I've created a fork here: https://github.com/JamesReynolds/gcovr When I'm satisfied this is working and I've got all the tests passing I'll submit a PR. It's necessarily Python 2.7 for us as we're rather constrained by our environment.
Thank you @JamesReynolds for looking into this. A bit of advice:
I can understand your desire for a faster gcovr. When I worked with Solaris/Sparc this was so painful that we only ran gcovr on a separate Linux/x86 box.
Thank you! As far as the changes go:
We do our full build with a cross-compiler, with only the tests running on Solaris Sparc. It could well be worth us investigating pulling the results back and running gcovr on a Linux/x86 box.
If the time is actually spent parsing gcov files, then to avoid the GIL you could 1) use multiprocessing; 2) use Cython. Using Cython might require a large set of changes. I've written something similar to gcovr in Rust (https://github.com/marco-c/grcov). You might also experiment with making gcovr use grcov for parsing.
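The multiprocessing route sidesteps the GIL because each worker is a separate interpreter process. A sketch under the assumption that the bottleneck is a CPU-bound per-file parse (the `parse_gcov_line_count` stand-in below is hypothetical, not gcovr's real parser):

```python
import multiprocessing

def parse_gcov_line_count(path):
    # Hypothetical stand-in for gcovr's per-file gcov-report parsing.
    with open(path) as handle:
        return path, sum(1 for _ in handle)

def parse_all(paths, jobs=2):
    # Each worker process has its own interpreter and its own GIL, so
    # CPU-bound parsing runs in true parallel on multi-core machines.
    # The "fork" start method keeps this self-contained when run as a script.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=jobs) as pool:
        return dict(pool.map(parse_gcov_line_count, paths))
```

The trade-off versus threads is pickling overhead: arguments and results cross a process boundary, so this pays off only when the per-file work dominates.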
@latk I've got all of the tests passing and it is a lot faster - but that is for my use case (~100 *.cpp files, same number (ish) of headers, boost & cmake): Linux (M-5Y71, Fedora 27)
Solaris Sparc (T4-1, Solaris 11.3, Virtualised with 8 vcpus)
It turns out parallelizing gcov isn't easy - it dumps all its output in the current directory, so I've had to create a mutex that locks each directory. I think this limits the usefulness of the code, as we're going to be serialized on the locked directory for some build systems. I've added a temporary directory per thread, though, so for some build systems (CMake...) gcov can output to its temporary directory and all the gcov invocations can run at the same time. I've only added one threading test, which may not be good enough, and I've only my benchmarks to go on - but I could start a PR now? @marco-c I might take a look. My issue appears to be mainly gcov though (and I'm now happy with the performance), but Cython or your project might give me an additional boost.
The easiest solution to this is to run gcov in different directories. |
It would be nice to see
@JamesReynolds Those are fantastic performance improvements on the Sparc! Great work :) Please do open a Work-In-Progress PR. As there are some architectural changes that need to be discussed first, this will take some time. However:
So it may be better to start a new branch on top of master+228, open the PR, and gradually add changes while they are discussed. I'll assist in any way that I can, but mostly by asking dumb questions :)
@marco-c Thank you for building grcov, that looks interesting. Note that gcovr is currently tightly coupled to the gcov human-readable report format, so adding more backends is not immediately possible. If you're interested in doing the necessary work, we would first have to establish some kind of coverage backend interface. The current high-level flow is:
I assume only some parts of step 2 would have to change to accommodate grcov? What kind of info would be needed by all backends? Can some phases be generalized? Once you have familiarised yourself with the current code structure, you're welcome to submit a design for a stable coverage backend interface as an issue. Note that while I am open to making more backends possible, grcov will never become a required or recommended backend; the gcov tool that already ships with the compiler is preferable. TBH, it would probably be more beneficial to move more of gcov's functionality into gcovr itself, i.e. parsing the raw coverage directly. This should be faster by avoiding launching so many extra processes, and would be more easily parallelizable. As you already have some experience with parsing raw coverage data, how would one approach this? Do gcc and llvm have a stable format for their raw coverage files?
@marco-c How do you avoid the requirement for gcov to be run in the same directory as the original compilation? That is what required me to add a directory lock, as multiple gcov invocations required the same folder and produced the same files. Unfortunately, getting Rust running on Solaris Sparc is (by the looks of things) non-trivial. I do have some other projects that require tooling on Solaris/AIX, so if I hit one that also needs a Rust cross-compilation environment I'll revisit. @latk Thanks! I've got a bunch of stuff on today and tomorrow - so if I rebase from a branch with #228 on Wednesday and then submit a PR, that should work?
@JamesReynolds That would be great! By that time, you can probably just rebase on the master branch. There's no hurry, I'll be grateful whenever you may find some time.
Fixed by #239. |
grcov would replace step 1 as well.
grcov uses gcov too for GCC, and uses the LLVM API for LLVM.
I'd advise against this: the GCC format is not stable. The LLVM format has been stable so far (it's an old version of the GCC format) and hasn't changed since it was introduced, but if you intend to use the LLVM API you would need a C extension.
Are you sure about this requirement? IIRC, as long as you have the gcno and gcda files, you can run gcov anywhere (in fact, for Firefox we run gcov on an entirely different machine than the machine where the build is made).
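This observation suggests a lock-free alternative to the per-directory mutex: gcov's `--object-directory` flag points it at the `.gcno`/`.gcda` files, while a private scratch directory per invocation (via the subprocess's working directory) isolates the `*.gcov` output. A sketch with a generic command, since the exact gcov flags vary by version:

```python
import os
import subprocess
import tempfile

def run_in_scratch_dir(cmd):
    # Run a command whose outputs land in its CWD (as gcov's do) inside a
    # throwaway directory, and return the files it produced. For gcov, cmd
    # would look like ["gcov", "--object-directory", objdir, gcda_path].
    with tempfile.TemporaryDirectory() as scratch:
        subprocess.run(cmd, cwd=scratch, check=True)
        return sorted(os.listdir(scratch))
```

With every worker getting its own scratch directory, no two gcov runs can collide, so no lock is needed at all.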
It takes a long time to run gcovr on a large project, especially when used as part of an automated analysis system. It would be nice if this loop were parallel:
(from gcovr ticket 3956)
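The loop referenced above could be parallelized with the standard library alone; a sketch, where `process_datafile` is a hypothetical stand-in for gcovr's real per-file work:

```python
import concurrent.futures
import os

def process_datafile(path):
    # Hypothetical stand-in for the per-file work in gcovr's main loop
    # (launching gcov and parsing its report).
    return path.upper()

def process_all(paths):
    # Threads suffice when the work is dominated by gcov subprocesses,
    # which release the GIL for the duration of the external process.
    workers = os.cpu_count() or 1
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_datafile, paths))
```

`pool.map` preserves input order, so the results can be merged exactly as the serial loop would have produced them.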