Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor gcov parsing code, support upcoming GCC 8 changes #228

Merged
merged 7 commits into from Mar 6, 2018

Conversation

latk
Copy link
Member

@latk latk commented Feb 25, 2018

This PR refactors the gcovr.gcov module. There are a lot of changes. In general, I tried to extract smaller, more meaningful functions, and reduce repeated code. Most functions should now fit into one or two screens.

  • Finding the source file from a .gcov file is now done by guess_source_file_name(), guess_source_file_name_via_aliases() and guess_source_file_name_heuristics(). The code is now clearer and less deeply nested.

  • The actual .gcov file parsing code was gathered into a GcovParser class. This class carries the parsing state from one line to the next. Previously, all of these members were variables around a huge loop.

    • parse_all_lines() runs the parser.
    • parse_line() parses a single line in a gcov report, and keeps track of relevant state.
    • parse_code_line() parses the normal 4: 7: foo() style gcov lines (execution count : line number : source code).
    • parse_tag_line() handles extra information, such as branch statistics and GCC 8 specialization sections.
    • parse_exclusion_marker assists parse_line() with markers like GCOV_EXCL_LINE etc.
    • check_unclosed_exclusions() and check_unrecognized_lines() emit warnings after parsing has completed
    • update_coverage() writes the data into a CoverageData object.
  • find_potential_working_directories_via_objdir() implements heuristics to suggest a gcov working directory.

  • run_gcov_and_process_files() runs the external gcov tool, collects the output, and processes the generated files. select_gcov_files_from_stdout() assists with finding the output files.

  • apply_filter_include_exclude() extracts the common filter application. It receives a (relative) filename, a list of filters that must match, and a list of exclusion filters that must not match. As filters are now processed in a single place, this will make them easier to improve via Exclude flag broken on Windows #191.

  • A gcovr.utils.Logger class has been created. This helps to have a more uniform message output format, and avoids countless if: options.verbose: ... branches. The available methods are:

    • verbose_msg() basically just calls if verbose: self.msg().
    • msg() writes info to STDOUT.
    • warn() writes a (WARNING) to STDERR.
    • error() writes an (ERROR) to STDERR.

    I suppose this can be improved/refined in the future.

  • The CoverageData class was simplified.

    • As an optimization, the constructor used to steal data structures from the gcov parsing code. Now, it maintains its own data structures that are only modified through the update() method. This drastically simplifies the code at no noticeable loss of performance.
    • smaller functions such as find_consecutive_ranges() were extracted.
    • this could be simplified even further with the collections.Counter class, but that would definitively break Python 2.6 compatibility.

The changes to support the upcoming GCC 8 were made in response to #226. The new gcov report format gets new sections that report coverage individually for each template or macro specialization. This PR does not support them, but merely strips away the section delimiters. The result is that branches are parsed for such lines, but I'm not sure how coverage from multiple sections on the same line is combined.

To prevent future problems like this, the parsing code was made more “robust”. The GcovParser.parse_all_lines() method catches any exceptions during parsing. The exception and the problematic line are stored. After parsing, the check_unrecognized_lines() method can be called to report these lines and exceptions. It will emit a warning on STDERR with all the offending lines, and also points to the Github issues. Previously, this message would only be shown in verbose mode, now it is shown always. Any exceptions are also printed as warnings (but I didn't find out how to include the stack trace). Afterwards, gcovr continues as usual and will not exit with an error. The implications of this design are:

  • If there is a parsing problem, there will always be a message on the console.
  • This will likely go unnoticed if gcovr is used automated, e.g. on a CI server, as gcovr will continue successfully.
  • There is one message per parsed file, which may spam the console for huge projects. There is currently no deduplication.
  • The error message points directly to Github rather than vaguely at “the gcovr developers”, which should make it easier for users to report problems.
  • The exceptions are reported separately from the lines that caused them, which makes it more difficult to act on a bug report.

So there's still a lot that can be improved with regards to gcov parsing and logging, but this is a start.

 - extract filter handling into apply_filter_include_exclude() function
 - split up process_datafile() function
 - remove insignificant optimizations in constructor
 - extract find_consecutive_ranges()
See gcovr#226 for background.

At this opportunity, the parsing was made a bit more robust.

  - Any exceptions during parsing are swallowed.
    Parsing will always succeed, but may be garbage.
  - After parsing, failed lines and any exceptions are summarized.
  - These errors are shown always, not just in verbose mode.
  - The error message links directly to the Github issues.
@latk latk added Type: Bug Type: Enhancement needs review Filters related to filters, include/exclude, path handling Gcov labels Feb 25, 2018
@codecov
Copy link

codecov bot commented Feb 25, 2018

Codecov Report

Merging #228 into master will increase coverage by 3.5%.
The diff coverage is 84.35%.

Impacted file tree graph

@@           Coverage Diff            @@
##           master    #228     +/-   ##
========================================
+ Coverage      84%   87.5%   +3.5%     
========================================
  Files          11      12      +1     
  Lines        1194    1337    +143     
  Branches      248     240      -8     
========================================
+ Hits         1003    1170    +167     
+ Misses        131     116     -15     
+ Partials       60      51      -9
Impacted Files Coverage Δ
gcovr/utils.py 66.42% <100%> (+4.12%) ⬆️
gcovr/tests/test_gcov_parser.py 100% <100%> (ø)
gcovr/gcov.py 81.96% <78.39%> (+12.91%) ⬆️
gcovr/__main__.py 88.54% <88.88%> (+0.08%) ⬆️
gcovr/coverage.py 98.57% <97.36%> (-1.43%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 12e0ec0...6d8c00c. Read the comment docs.

@hugovk
Copy link

hugovk commented Feb 25, 2018

Is Python 2.6 still needed? It's been EOL for over 4 years and is no longer receiving security updates.

It's also little used. Here's the pip installs for gcovr from PyPI for last month:

python_version percent download_count
2.7 67.45% 4,356
3.4 16.80% 1,085
3.5 13.16% 850
2.6 1.39% 90
3.6 1.18% 76
3.7 0.02% 1

Source: pypinfo --start-date -55 --end-date -25 --percent --pip --markdown gcovr pyversion

(Python 3.3 is also EOL not listed here, how about dropping that too?)

gnomesysadmins pushed a commit to GNOME/nautilus that referenced this pull request Feb 26, 2018
gcovr is broken with GCC 8 due to changes in Gcov report format. Revert
when gcovr/gcovr#228 is merged.
gnomesysadmins pushed a commit to GNOME/nautilus that referenced this pull request Feb 26, 2018
gcovr is broken with GCC 8 due to changes in Gcov report format. Revert
when gcovr/gcovr#228 is merged.
@@ -228,3 +228,43 @@ def build_filter(regex):
return re.compile(re.escape(os.getcwd() + "\\") + regex)
else:
return re.compile(os.path.realpath(regex))


class Logger(object):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use a custom logger rather than python logging module ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I didn't know about logging previously. I'm open to switching to a “proper” logging framework in the future. Right now a low-ceremony custom logger is the least invasive change, so I would prefer to keep this part as it is.

args, kwargs: str.format arguments
"""
pattern = pattern + "\n"
sys.stdout.write(pattern.format(*args, **kwargs))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if no output file has been specified then the coverage output will be output on stdout as well. Is that a problem ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just extracted the existing sys.stderr.write() and print() calls into this class. So each message continues to go were it went previously: warnings/errors to stderr, and --verbose messages to stdout.

Arguably that's wrong (for exactly your reason) and should be redesigned. However, this PR already has more than enough changes … (sorry about that!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are no regressions, that's fine by me. An issue shall probably be opened regarding this, and, as you said, this is another topic.

gcovr/gcov.py Outdated
for line in lines:
try:
self.parse_line(line, exclude_unreachable_branches)
except Exception as ex:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shall probably be an opt-in behavior. I'd rather have gcovr fail with an error message and reporting we can opt-in to ignore those by adding flag --ignore-parsing-errors or something like that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right. I'll add a commit to this PR. Thank you for spotting this issue!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've added a --gcov-ignore-parse-errors flag that is default-off.

If the flag is enabled, gcovr will

  • print all offending lines,
  • print all caught exceptions,
  • and continue processing.

If the flag is not enabled (the default), gcovr does not continue but

  • also prints offending lines and caught exceptions,
  • then displays an error message that the flag can be enabled,
  • then rethrows the first exception if one was caught in order to display its traceback
  • or sys.exit(1) if no exceptions were caught.

In all cases, the errors are processed only after a the complete file has been parsed.

The new `--gcov-ignore-parse-errors` flag controls whether the
`gcovr.gcov.GcovParser.parse_all_lines()` method throws on errors.

If the flag is enabled, gcovr will

 - print all offending lines,
 - print all caught exceptions,
 - and continue processing.

If the flag is not enabled (the default), gcovr does not continue but

 - displays an error message that the flag can be enabled,
 - then rethrows the first exception if one was caught
 - or `sys.exit(1)` if no exceptions were caught.

In all cases, the errors are processed only *after* a the complete file
has been parsed.

To implement this change, the `check_*()` methods of `GcovParser` were
pulled into the `parse_all_lines()` method.
nodejs-github-bot pushed a commit to nodejs/node that referenced this pull request Jul 11, 2021
Update the version of `gcovr` used for C++ coverage from 3.4 to 4.2 for
compatibility with gcc/g++ 8.

PR-URL: #39326
Refs: #39303
Refs: gcovr/gcovr#228
Refs: nodejs/build#2705
Reviewed-By: Michael Dawson <midawson@redhat.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
targos pushed a commit to nodejs/node that referenced this pull request Jul 13, 2021
Update the version of `gcovr` used for C++ coverage from 3.4 to 4.2 for
compatibility with gcc/g++ 8.

PR-URL: #39326
Refs: #39303
Refs: gcovr/gcovr#228
Refs: nodejs/build#2705
Reviewed-By: Michael Dawson <midawson@redhat.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
targos pushed a commit to nodejs/node that referenced this pull request Sep 4, 2021
Update the version of `gcovr` used for C++ coverage from 3.4 to 4.2 for
compatibility with gcc/g++ 8.

PR-URL: #39326
Refs: #39303
Refs: gcovr/gcovr#228
Refs: nodejs/build#2705
Reviewed-By: Michael Dawson <midawson@redhat.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Filters related to filters, include/exclude, path handling Gcov Type: Bug Type: Enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants