Possible to speed up gcovr on big projects? #36

Open
patapra opened this Issue Feb 8, 2014 · 9 comments

6 participants
@patapra

patapra commented Feb 8, 2014

Hi,

I've written a tool that uses gcovr to generate coverage reports for a big project. I am selectively instrumenting files based on a diff file that users pass in (i.e., if the user changed file1, file2, and file3, the tool touches these files and runs make with gcov flags enabled). In some cases, especially when a user touches a commonly included header file, hundreds of gcno files are created (I presume wherever that header is included, possibly recursively). In these cases, gcovr takes up to 3 hours to complete its analysis. Is there a safe way (one that maintains coverage accuracy) for me to speed things up?

@whart222

Member

whart222 commented Jul 4, 2014

Scalability has been raised for several projects in the past. I need a case study to focus the performance tuning for gcovr. Is the issue simply the number of gcno files? I might be able to generate a case study that illustrates this situation.

@patapra

patapra commented Jul 4, 2014

My assumption is that it's not just the number but also the size/complexity of the gcno files.

If header1.hpp is included by file1.cpp, file2.cpp, and file3.cpp, can accurate code coverage for header1.hpp be obtained by instrumenting only header1.hpp and not file1, 2, or 3?


@whart222 whart222 added this to the Gcovr 3.x milestone Jul 4, 2014

@whart222

Member

whart222 commented Jul 4, 2014

Perhaps you could make that sort of logical deduction, but that would
require parsing the C++ files to find the header dependencies.
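To illustrate what "parsing the C++ files to find the header dependencies" could look like, here is a minimal Python sketch. It is not part of gcovr; the function names and the sample file contents are hypothetical. It only recognizes direct, quoted `#include "..."` lines and ignores include search paths, conditional compilation, and transitive includes, so treat it as a rough starting point rather than a reliable dependency scanner.

```python
import re
from collections import defaultdict

# Matches lines such as:  #include "header1.hpp"
# Angle-bracket (system) includes are deliberately ignored.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def direct_includes(source_text):
    """Return the quoted header names included directly by one source text."""
    return INCLUDE_RE.findall(source_text)


def reverse_dependencies(sources):
    """Map each header name to the sorted list of files that include it.

    `sources` is a dict of {file name: file contents}. Only direct
    includes are considered; nothing is resolved against the file system.
    """
    included_by = defaultdict(set)
    for name, text in sources.items():
        for header in direct_includes(text):
            included_by[header].add(name)
    return {header: sorted(users) for header, users in included_by.items()}


# Hypothetical in-memory sources mirroring the example in this thread.
sources = {
    "file1.cpp": '#include "header1.hpp"\nint f() { return 1; }\n',
    "file2.cpp": '#include <vector>\n#include "header1.hpp"\n',
    "file3.cpp": '  #  include "header1.hpp"\n',
    "other.cpp": "int g() { return 2; }\n",
}
deps = reverse_dependencies(sources)
print(deps)
```

A compiler-generated dependency list (for example from GCC's `-MM` option) would likely be more accurate than a regex scan, since it honors the real include paths and preprocessor state.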


@tsondergaard

tsondergaard commented Sep 22, 2014

On a project that I'm working on the following gcovr command-line takes 5 minutes:

gcovr --xml --root $MI_SOURCE_DIR --exclude=.*/tests/.* --exclude=.*/build/.* -o $MI_BUILD_DIR/coverage_report.xml $MI_SOURCE_DIR

I tried replacing it with the following:

gcov $(find . -name "*.gcda" -o -name "*.gcno") --branch-counts --branch-probabilities --preserve-paths
gcovr -g --xml --root $MI_SOURCE_DIR --exclude=.*/tests/.* --exclude=.*/build/.* -o $MI_BUILD_DIR/coverage_report.xml $MI_SOURCE_DIR

Here the gcov command takes about 25 seconds and the gcovr about 35 seconds. In other words it is a solid factor of five faster.

gcovr runs gcov once per .gcda file (falling back to .gcno) in the first case above. I checked: in the worst case it runs gcov more than once per .gcda/.gcno file, if it doesn't find the right working directory on the first try. I believe running gcov once, or at least fewer times, presents a great opportunity for improvement, better than parallelizing the main loop as suggested in #3.
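The idea of invoking gcov fewer times can be sketched in Python as follows. This is not how gcovr is implemented; the function names and the `batch_size` default are assumptions made for illustration. The gcov flags are the ones from the command line quoted above. Files are grouped into batches rather than passed all at once, on the assumption that a very large project could otherwise exceed the operating system's command-line length limit.

```python
import subprocess

# Flags taken from the gcov command line quoted in this comment.
GCOV_FLAGS = ["--branch-counts", "--branch-probabilities", "--preserve-paths"]


def batches(items, size):
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def gcov_commands(data_files, batch_size=200, gcov="gcov"):
    """Build one gcov argument list per batch of data files.

    `batch_size` is a hypothetical default, not a value from gcov or gcovr.
    """
    return [[gcov, *GCOV_FLAGS, *batch] for batch in batches(data_files, batch_size)]


def run_gcov(data_files, cwd, batch_size=200):
    """Run gcov once per batch instead of once per data file."""
    for command in gcov_commands(data_files, batch_size):
        subprocess.run(command, cwd=cwd, check=True)


# Only the command construction is shown here; nothing is executed.
commands = gcov_commands(["a.gcda", "b.gcda", "c.gcda"], batch_size=2)
for command in commands:
    print(" ".join(command))
```

With three data files and a batch size of two, this builds two gcov invocations instead of three; the saving grows with the number of files per batch.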

@tsondergaard

tsondergaard commented Sep 22, 2014

Sorry, it turns out the benefit I posted was a misrepresentation. The second option takes about 3m20s, so the benefit is not even a factor of two. Also, I get exactly the same stats for files, classes, and lines, but for conditionals the raw numbers are much higher with the second approach: 39728/286544 (13.9%) vs 87858/692094 (12.7%).

@mrx23dot

mrx23dot commented May 28, 2015

GCOV.exe could easily be called in parallel, based on the number of CPU cores.
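A minimal sketch of that suggestion in Python, assuming one gcov process per data file and a worker count taken from `os.cpu_count()`. The function names are hypothetical and this is not gcovr's code. Threads are enough here because the real work happens in the child gcov processes. Running gcov from the data file's own directory is also an assumption; whether gcov finds the sources from there depends on how the project was built.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_gcov_on(data_file):
    """Run gcov for one data file and return (data_file, exit code)."""
    directory = os.path.dirname(data_file) or "."
    result = subprocess.run(
        ["gcov", "--branch-counts", "--preserve-paths", os.path.basename(data_file)],
        cwd=directory,
        capture_output=True,
        text=True,
    )
    return data_file, result.returncode


def run_parallel(data_files, worker=run_gcov_on, max_workers=None):
    """Apply `worker` to every data file using a pool sized to the CPU count.

    Results are returned in the same order as `data_files`.
    """
    max_workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, data_files))


# Demonstration with a stand-in worker, so no gcov binary is required.
results = run_parallel(["a.gcda", "b.gcda"], worker=lambda f: (f, 0), max_workers=2)
print(results)
```

One caveat to check before relying on this: gcov writes its .gcov output files into the working directory, so concurrent runs that share a directory may overwrite each other's output for commonly included headers. Giving each worker its own output directory would be one way around that.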

@itavero

itavero commented Aug 20, 2015

@tsondergaard Did you ever manage to figure out why the numbers for the conditionals are being messed up?
We tried the same approach and our testing time went from over 9 minutes to approximately 3 minutes, which is a big difference if you ask me.
Unfortunately Gcovr now says our conditional coverage is just about half of what it used to be, even though the system under test and the tests have not been changed.

@tsondergaard

tsondergaard commented Aug 20, 2015

@itavero Unfortunately not.

@itavero

itavero commented Aug 20, 2015

I checked and, in our case, the report claims exactly twice as many branches as we actually have (that is, twice what the previous reports show).
However, the number of branches reached is not doubled.

What we are doing is running all the different tests for our entire code base and then running gcov, similar to what was mentioned before.
After that, gcovr is run, again similar to what was mentioned before.

If anyone has an idea about what might be causing this, I'd love to hear it.

@latk latk removed this from the Gcovr 3.x milestone Jan 27, 2018

latk added a commit that referenced this issue Mar 18, 2018

Merge pull request #239 from JamesReynolds
- closes issue #3 (parallelize main loop of gcovr)
- see issue #36