Added decision coverage to HTML output #350
Conversation
Thank you for working on this! And for investing the effort of contributing your changes upstream! Many users have been confused by gcovr's branch coverage, so this might present a useful alternative.

However, I'm somewhat uneasy with munging source code to attribute branch coverage to individual control flow constructs, as I discussed somewhere in the gcovr FAQ. It's worth pointing out that nothing* in gcovr is currently C- or C++-specific, and that gcovr should work with any language that's supported by GCC/LLVM. So while I look forward to merging this, I'll have to think about what the default behaviour should be (balancing usability, correctness, back-compat, and so on).

* --exclude-unreachable-branches is a heuristic that assumes C-like curly-brace languages

This PR has some interactions with other work:
Those are just some things off the top of my head. I'm looking forward to looking at this PR in more detail later, perhaps as early as this weekend.
I have pushed some changes to the html-rewrite branch. Could you redo your changes to the HTML reports on that branch, i.e. rebase the entire pull request? The rewritten HTML generator is much simpler, and details about branch coverage are now shown in a popup like this: You could add the detailed messages about decision coverage to this popup. You can also rename the Branch column to show Decision data, but only if the user opts in. Perhaps a

Also, I really do not understand the decision parsing code in
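An opt-in flag along the lines discussed here might look like the following sketch. This is purely illustrative: gcovr's real option handling goes through its own config layer, and the flag name and help text are assumptions, not the project's actual API.

```python
import argparse

# Illustrative sketch of an opt-in decision-coverage flag.
# The name "--decisions" and the help text are assumptions.
parser = argparse.ArgumentParser(prog="gcovr")
parser.add_argument(
    "--decisions",
    action="store_true",
    default=False,  # decision analysis stays off unless the user opts in
    help="Also report decision coverage (may rename the Branch column).",
)

options = parser.parse_args([])
```

With `default=False`, existing users see no change in behaviour unless they explicitly pass the flag.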
Codecov Report

    @@            Coverage Diff             @@
    ##           master     #350      +/-   ##
    ==========================================
    - Coverage   96.08%   95.73%   -0.35%
    ==========================================
      Files          22       23       +1
      Lines        2985     3213     +228
      Branches      559      603      +44
    ==========================================
    + Hits         2868     3076     +208
    - Misses         52       66      +14
    - Partials       65       71       +6
Thank you for rebasing the pull request! I just pushed two further commits to the html-rewrite branch which should fix some of the test failures you were seeing, but the CI failures also indicate a typo in your

I'll take a deeper look into your changes over the next couple of days. Displaying decision coverage alongside branch coverage data is fine by me.

During the last few days I found and fixed some bugs, so I am planning to cut a release within the next few weeks, and I want this decision coverage feature to be part of the release. With regard to this feature, the blockers are:
If you have time to work on this, that would be great; otherwise I'll do a bit, but I will be slower.
Yeah, I'm currently fixing some typos I made, along with some other refinements. I'm also adding the ability to enable the decision coverage analysis via an argument. Since I think this feature is quite relevant, and my company is also interested in it, I'm able to spend quite a bit of time on it, so it's no problem for me to take on as many of the tasks as possible! I'm guessing that with a new HTML output and this feature, the next release will be a new major?
Okay, so besides the documentation (working on that right now), most things are done. However, I still have a question regarding the JSON output: right now, no matter whether the --decisions argument is used or not, a dict called "decision" is created for every line; it's just empty for every line if the feature is disabled. I'm wondering whether it should stay that way, or whether the "decision" dict should be omitted entirely when the feature isn't enabled.

One more note: the HTML test should pass now, but it seems to fail because of differing timestamps between the reference and the test output. Somehow the test runner doesn't seem to change the timestamps to zeros... I didn't have time to check that in detail, though.
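The two JSON layouts under discussion could be sketched like this. The helper name `line_to_json` and the field names are hypothetical, not gcovr's actual JSON writer; the point is only the conditional presence of the "decision" key.

```python
import json

# Hypothetical sketch: emit the "decision" dict only when the feature is on.
def line_to_json(line_number, count, decision, decisions_enabled):
    """Serialize one line's coverage for the JSON report."""
    entry = {"line_number": line_number, "count": count}
    if decisions_enabled:
        # Present (possibly empty) whenever the user opted in ...
        entry["decision"] = decision if decision is not None else {}
    # ... and omitted entirely when the feature is disabled.
    return entry

print(json.dumps(line_to_json(52, 3, None, False)))
```

Omitting the key keeps the default output unchanged for existing consumers, at the cost of making the schema depend on the flag.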
Thank you so much, this is converging to a state that's ready for merging. I will merge the html-rewrite into master prior to this, so the effective “size” of this pull request will be smaller.

I'd be happy if you could add some more comments to the decision coverage analysis to make it clearer why you are doing something. This isn't necessarily obvious to me. You can add example C code snippets in these comments to show when which heuristic applies. Perhaps this could even be a doctest, like:

    def prep_decision_string(code):
        """remove comments from a line of code.

        >>> prep_decision_string(' int x = foo(); // fooify something')
        ' int x = foo(); '
        """
        ...

The HTML test uses some “scrubber” regexes to remove timestamps and specific versions, but these were broken during the HTML rewrite. I can fix them tomorrow, or you can try to update them.
@glenntws: cool that you made this. I was also looking for this feature.
I've rebased the branch to current master and created a PR to the fork of @glenntws. |
Is this PR to be abandoned? There are tons of conflicts here. |
@macetw I think a cherry pick based on the current master would be best. |
Summary: This commit introduces a completely rewritten parser for gcov's human-readable report format. The implementation pays closer attention to the upstream file format, e.g. with respect to the exact number formatting. Compared to the previous parser, the data flow has been greatly clarified, making this parser more obviously correct. The new parser will make it easier to react to future changes in the format, and to make use of improvements from newer GCC versions – such as the `Working directory` metadata.

Background
----------

The old parser had some problems that could be addressed by this rewrite.

* The old parser was very complex due to mixing low-level parsing and high-level decisions about coverage exclusion. Fixed in the new parser: parsing is split into multiple phases that clearly separate concerns. The low-level parsing phase produces a data model that is safe to use for the other phases.

* The old parser used an object-oriented design where parsing state was stored in instance fields. This created a huge "state space" that was difficult to track, potentially leading to bugs. For example, it was not immediately clear in the investigation of <gcovr#511> whether the problem was just that the parser forgot to update its line number correctly. Fixed in the new parser: using a more functional/procedural design, the data flows are very clear. State is tracked explicitly. By separating parsing phases, the state space is much smaller.

* To expand on the previous point: the old parser essentially consisted of multiple interleaved state machines. There was one state machine for processing coverage data, and an interleaved state machine for tracking active coverage exclusion markers. This interleaving made it difficult to understand whether the state transitions were correct. Fixed in the new parser: coverage exclusion patterns are collected in a separate phase, before the actual coverage data is processed.

* The old parser made use of very fragile parsing strategies, such as using `str.split()` excessively. This gave rise to fragile assumptions about the exact format. For example, the IndexError in <gcovr#226> was an example of wrong assumptions. The new parser uses regular expressions to parse tag lines, and only uses `str.split()` for the more structured source code lines. This is more self-documenting. Treatment of numerical values was aligned with the routines in the gcov source code. Should the format deviate in the future, the regexes will fail to match, making it possible to detect and fix the errors. (Until then, `--gcov-ignore-parse-errors` can be used.)

Design of the new parser
------------------------

The new parser is more complex in the sense that there is a lot more code. But there is a clearer separation of concerns, and the parser was closely informed by the gcov documentation and source code. As a result, I am confident that it handles far more edge cases correctly, in particular relating to the handling of numbers/percentages.

There are three items for external use:

**`parse_metadata(lines)`** creates a dict of values from the metadata lines. The old parser handled the very first line of the report separately to extract the filename. The new parser uses the same, more robust parsing code for this metadata.

**`ParserFlags`** is a flag-enum that describes various boolean features. A single object with flags seems simpler to handle than multiple variables like `exclude_throw_branches`.

**`parse_coverage(lines, ...)`** is the main function for parsing the coverage. It performs multiple phases:

* Each input line is parsed/tokenized into an internal data model. The data model is strongly typed. The various classes like `_SourceLine` are implemented as NamedTuples, which is both very convenient and very memory-efficient.
  Relevant items: `_parse_line()`, `_parse_tag_line()`, data model

* Exclusion markers are extracted from source code lines and arranged into a data structure for later lookup.
  Relevant items: `_find_excluded_ranges()`, `_make_is_in_any_range()`

* Parsed lines are processed to populate a `FileCoverage` model. At this stage, exclusions are applied. The state space is very small, with only four variables that have to be tracked.
  Relevant items: `_ParserState`, `_gather_coverage_from_line()`, `_line_noncode_and_count()`, `_function_can_be_excluded()`, `_branch_can_be_excluded()`, `_add_coverage_for_function()`

* Warnings are reported, and any potential errors re-raised. This is equivalent to the previous parser.
  Relevant items: `_report_lines_with_errors()`

Impact on tests
---------------

The new parser is almost completely bug-compatible with the old parser. This is e.g. visible in the potentially unintuitive handling of the `noncode` status. The new separation between low-level parsing and high-level decisions makes it more clear what is actually going on.

There was a significant change in the **Nautilus parser test**. The input file contains the following pattern:

    ------------------
    #####:   52:foo() ? bar():

Previously, this line 52 was not reported as uncovered. I consider that to be an error, and have updated the expected output correspondingly. This could indicate that the new parser is in fact more robust than the old parser when it comes to template specialization sections.

In the **excl-branch test**, gcovr will encounter gcov input such as the following when using GCC-8 or later:

    #####:    9:  virtual ~Bar()
    #####:   10:  {} // ...
    ------------------
    Bar::~Bar():
    function Bar::~Bar() called 0 returned 0% blocks executed 0%
    #####:    9:  virtual ~Bar()
    #####:   10:  {} // ...
    call    0 never executed
    call    1 never executed
    ------------------
    Bar::~Bar():
    function Bar::~Bar() called 0 returned 0% blocks executed 0%
    #####:    9:  virtual ~Bar()
    #####:   10:  {} // ...
    ------------------

The old parser associated the `function` annotations with line 11. This was clearly incorrect. The test reference was updated to associate the destructor with line 9.

Other than that, the tests were only updated to account for the different parser APIs. Internally, the new parser uses a lot of doctests.

Future directions
-----------------

The new parser extracts *all* available data, only to throw it away. It might now become feasible to make use of more of this data. In particular:

* handling template specialization sections properly
* collecting block-level coverage data
* using the `working directory` metadata field

Conflicts with other development efforts
----------------------------------------

* <gcovr#503> report of excluded coverage
  Makes a small patch to the parser. The same effect can be achieved by adding a few lines in `_gather_coverage_from_line()`.

* <gcovr#484> tests with clang-10
  Touches neighboring lines. Will be reported as a merge conflict by Git, but there's no semantic conflict.

* <gcovr#474> abstract interface for reader/writer
  Small change in the parser code regarding `sys.exit(1)` (new parser: `raise SystemExit(1)`). It's worth noting that this is effectively unreachable code. Lines will only be reported if there was an exception, and if there was an exception it will be re-thrown.

* <gcovr#361> --no-markers to ignore exclusion markers
  Touches the exclusion handling code. This is of course totally changed by the new parser. But the new parser would make it even easier to implement that functionality.

* <gcovr#350> decision coverage
  Adds significant new parsing code, but most of it is outside of the gcov-parser. These changes could be ported with moderate effort to the new parser.
Summary: This commit introduces a completely rewritten parser for gcov's human-readable report format. The implementation pays closer attention to the upstream file format, e.g. with respect to the exact number formatting. Compared to the previous parser, the data flow has been greatly clarified, making this parser more obviously correct. The new parser will make it easier to react to future changes in the format, and to make use of improvements from newer GCC versions – such as the `Working directory` metadata. Background ---------- The old parser had some problems that could be addressed by this rewrite. * The old parser was very complex due to mixing low level parsing and high-level decisions about coverage exclusion. Fixed in the new parser: now, staging is split into multiple phases that clearly separate concerns. The low-level parsing phase produces a data model that is safe to use for the other phases. * The old parser used an object-oriented design were parsing state was stored in instance fields. This created a huge "state space" that was difficult to track, potentially leading to bugs. For example, it was not immediately clear in the investigation of <gcovr#511> whether the problem was just that the parser forgot to update its line number correctly. Fixed in the new parser: using a more functional/procedural design, the data flows are very clear. State is tracked explicitly. By separating parsing phases, the state space is much smaller. * To expand on the previous point: The old parser essentially consisted of multiple interleaved state machines. There was one state machine for processing coverage data, and an interleaved state machine for tracking active coverage exclusion markers. This interleaving made it difficult to understand whether the state transitions were correct. Fixed in the new parser: coverage exclusion patterns are collected in a separate phase, before the actual coverage data is processed. 
* The old parser made use of very fragile parsing strategies, such as using `str.split()` excessively. This gave rise to fragile assumptions about the exact format. For example, the IndexError in <gcovr#226> was an example of wrong assumptions. The new parser uses regular expressions to parse tag lines, and only uses `str.split()` for the more structured source code lines. This is more self-documenting. Treatment of numerical values was aligned with the routines in the gcov source code. Should the format deviate in the future, the regexes will fail to match, making it possible to detect and fix the errors. (Until then, `--gcov-ignore-parse-errors` can be used). Design of the new parser ------------------------ The new parser is more complex in the sense that there is a lot more code. But there is a clearer separation of concerns, and the parser was closely informed by the gcov documentation and source code. As a result, I am confident that it handles far more edge cases correctly, in particular relating to the handling of numbers/percentages. There are three items for external use: **`parse_metadata(lines)`** creates a dict of values from the metadata lines. The old parser handled the very first line of the report separately to extract the filename. The new parser uses the same more robust parsing code for this metadata. **`ParserFlags`** is a flag-enum that describes various boolean features. A single object with flags seems simpler to handle than multiple variables like `exclude_throw_branches`. **`parse_coverage(lines, ...)`** is the main function for parsing the coverage. It performs multiple phases: * Each input line is parsed/tokenized into an internal data model. The data model is strongly typed. The various classes like `_SourceLine` are implemented as NamedTuples, which is both very convenient and very memory-efficient. 
Relevant items: `_parse_line()`, `_parse_tag_line()`, data model * Exclusion markers are extracted from source code lines and arranged into a data structure for later lookup. Relevant items: `_find_excluded_ranges()`, `_make_is_in_any_range()` * Parsed lines are processed to populate a `FileCoverage` model. At this stage, exclusions are applied. The state space is very small, with only four variables that have to be tracked. Relevant items: `_ParserState`, `_gather_coverage_from_line()`, `_line_noncode_and_count()`, `_function_can_be_excluded()`, `_branch_can_be_excluded()`, `_add_coverage_for_function()` * Warnings are reported, and any potential errors re-raised. This is equivalent to the previous parser. Relevant items: `_report_lines_with_errors()` Impact on tests --------------- The new parser is almost completely bug-compatible with the old parser. This is e.g. visible in the potentially unintuitive handling of the `noncode` status. The new separation between low-level parsing and high-level decisions makes it more clear what is actually going on. There was a significant change in the **Nautilus parser test**. The input file contains the following pattern: ------------------ #####: 52:foo() ? bar(): Previously, this line 52 was not reported as uncovered. I consider that to be an error, and have updated the expected output correspondingly. This could indicate that the new parser is in fact more robust than the old parser when it comes to template specialization sections. In the **excl-branch test**, gcovr will encounter gcov input as the following when using GCC-8 or later: #####: 9: virtual ~Bar() #####: 10: {} // ... ------------------ Bar::~Bar(): function Bar::~Bar() called 0 returned 0% blocks executed 0% #####: 9: virtual ~Bar() #####: 10: {} // ... call 0 never executed call 1 never executed ------------------ Bar::~Bar(): function Bar::~Bar() called 0 returned 0% blocks executed 0% #####: 9: virtual ~Bar() #####: 10: {} // ... 
------------------ The old parser associated the `function` annotations with line 11. This was clearly incorrect. The test reference was updated to associate the destructor with line 9. Other than that, the tests were only updated to account for the different parser APIs. Internally, the new parser uses a lot of doctests. Future directions ----------------- The new parser extracts *all* available data, only to throw it away. It might now become feasible to make use of more of this data. In particular: * handling template specialization sections properly * collecting block-level coverage data * using the `working directory` metadata field Conflicts with other development efforts ---------------------------------------- * <gcovr#503> report of excluded coverage Makes a small patch to the parser. The same effect can be achieved by adding a few lines in `_gather_coverage_from_line()`. * <gcovr#484> tests with clang-10 Touches neighboring lines. Will be reported as a merge conflict by Git, but there's no semantic conflict. * <gcovr#474> abstract interface for reader/writer Small change in the parser code regarding `sys.exit(1)` (new parser: `raise SystemExit(1)`). It's worth noting that this is effectively unreachable code. Lines will only be reported if there was an exception, and if there was an exception it will be re-thrown. * <gcovr#361> --no-markers to ignore exclusion markers Touches the exclusion handling code. This is of course totally changed by the new parser. But the new parser would make it even easier to implement that functionality. * <gcovr#350> decision coverage Adds significant new parsing code, but most of it is outside of the gcov-parser. These changes could be ported with moderate effort to the new parser.
Summary: This commit introduces a completely rewritten parser for gcov's human-readable report format. The implementation pays closer attention to the upstream file format, e.g. with respect to the exact number formatting. Compared to the previous parser, the data flow has been greatly clarified, making this parser more obviously correct. The new parser will make it easier to react to future changes in the format, and to make use of improvements from newer GCC versions – such as the `Working directory` metadata. Background ---------- The old parser had some problems that could be addressed by this rewrite. * The old parser was very complex due to mixing low level parsing and high-level decisions about coverage exclusion. Fixed in the new parser: now, staging is split into multiple phases that clearly separate concerns. The low-level parsing phase produces a data model that is safe to use for the other phases. * The old parser used an object-oriented design were parsing state was stored in instance fields. This created a huge "state space" that was difficult to track, potentially leading to bugs. For example, it was not immediately clear in the investigation of <gcovr#511> whether the problem was just that the parser forgot to update its line number correctly. Fixed in the new parser: using a more functional/procedural design, the data flows are very clear. State is tracked explicitly. By separating parsing phases, the state space is much smaller. * To expand on the previous point: The old parser essentially consisted of multiple interleaved state machines. There was one state machine for processing coverage data, and an interleaved state machine for tracking active coverage exclusion markers. This interleaving made it difficult to understand whether the state transitions were correct. Fixed in the new parser: coverage exclusion patterns are collected in a separate phase, before the actual coverage data is processed. 
* The old parser made use of very fragile parsing strategies, such as using `str.split()` excessively. This gave rise to fragile assumptions about the exact format. For example, the IndexError in <gcovr#226> was an example of wrong assumptions. The new parser uses regular expressions to parse tag lines, and only uses `str.split()` for the more structured source code lines. This is more self-documenting. Treatment of numerical values was aligned with the routines in the gcov source code. Should the format deviate in the future, the regexes will fail to match, making it possible to detect and fix the errors. (Until then, `--gcov-ignore-parse-errors` can be used). Design of the new parser ------------------------ The new parser is more complex in the sense that there is a lot more code. But there is a clearer separation of concerns, and the parser was closely informed by the gcov documentation and source code. As a result, I am confident that it handles far more edge cases correctly, in particular relating to the handling of numbers/percentages. There are three items for external use: **`parse_metadata(lines)`** creates a dict of values from the metadata lines. The old parser handled the very first line of the report separately to extract the filename. The new parser uses the same more robust parsing code for this metadata. **`ParserFlags`** is a flag-enum that describes various boolean features. A single object with flags seems simpler to handle than multiple variables like `exclude_throw_branches`. **`parse_coverage(lines, ...)`** is the main function for parsing the coverage. It performs multiple phases: * Each input line is parsed/tokenized into an internal data model. The data model is strongly typed. The various classes like `_SourceLine` are implemented as NamedTuples, which is both very convenient and very memory-efficient. 
Relevant items: `_parse_line()`, `_parse_tag_line()`, data model * Exclusion markers are extracted from source code lines and arranged into a data structure for later lookup. Relevant items: `_find_excluded_ranges()`, `_make_is_in_any_range()` * Parsed lines are processed to populate a `FileCoverage` model. At this stage, exclusions are applied. The state space is very small, with only four variables that have to be tracked. Relevant items: `_ParserState`, `_gather_coverage_from_line()`, `_line_noncode_and_count()`, `_function_can_be_excluded()`, `_branch_can_be_excluded()`, `_add_coverage_for_function()` * Warnings are reported, and any potential errors re-raised. This is equivalent to the previous parser. Relevant items: `_report_lines_with_errors()` Impact on tests --------------- The new parser is almost completely bug-compatible with the old parser. This is e.g. visible in the potentially unintuitive handling of the `noncode` status. The new separation between low-level parsing and high-level decisions makes it more clear what is actually going on. There was a significant change in the **Nautilus parser test**. The input file contains the following pattern: ------------------ #####: 52:foo() ? bar(): Previously, this line 52 was not reported as uncovered. I consider that to be an error, and have updated the expected output correspondingly. This could indicate that the new parser is in fact more robust than the old parser when it comes to template specialization sections. In the **excl-branch test**, gcovr will encounter gcov input as the following when using GCC-8 or later: #####: 9: virtual ~Bar() #####: 10: {} // ... ------------------ Bar::~Bar(): function Bar::~Bar() called 0 returned 0% blocks executed 0% #####: 9: virtual ~Bar() #####: 10: {} // ... call 0 never executed call 1 never executed ------------------ Bar::~Bar(): function Bar::~Bar() called 0 returned 0% blocks executed 0% #####: 9: virtual ~Bar() #####: 10: {} // ... 
    ------------------

The old parser associated the `function` annotations with line 11. This was clearly incorrect. The test reference was updated to associate the destructor with line 9.

Other than that, the tests were only updated to account for the different parser APIs. Internally, the new parser uses a lot of doctests.

Future directions
-----------------

The new parser extracts *all* available data, only to throw it away. It might now become feasible to make use of more of this data. In particular:

* handling template specialization sections properly
* collecting block-level coverage data
* using the `working directory` metadata field

Conflicts with other development efforts
----------------------------------------

* <gcovr#503> report of excluded coverage

  Makes a small patch to the parser. The same effect can be achieved by adding a few lines in `_gather_coverage_from_line()`.

* <gcovr#484> tests with clang-10

  Touches neighboring lines. Will be reported as a merge conflict by Git, but there's no semantic conflict.

* <gcovr#474> abstract interface for reader/writer

  Small change in the parser code regarding `sys.exit(1)` (new parser: `raise SystemExit(1)`). It's worth noting that this is effectively unreachable code: lines will only be reported if there was an exception, and if there was an exception it will be re-thrown.

* <gcovr#361> --no-markers to ignore exclusion markers

  Touches the exclusion handling code. This is of course totally changed by the new parser, but the new parser would make it even easier to implement that functionality.

* <gcovr#350> decision coverage

  Adds significant new parsing code, but most of it is outside of the gcov parser. These changes could be ported with moderate effort to the new parser.
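The regex-based tag-line approach described above can be sketched roughly as follows. This is a minimal illustration with made-up names, not gcovr's actual parser; only the gcov line format itself (`count:lineno:source`, with `#####`/`=====` marking uncovered lines and `-` marking noncode) is taken from the discussion.

```python
import re
from typing import NamedTuple, Optional

# Illustrative sketch: a gcov source line looks like "  #####:   52:source text".
_RE_SOURCE_LINE = re.compile(r"^\s*([0-9]+|#####|=====|-):\s*([0-9]+):(.*)$")

class SourceLine(NamedTuple):
    lineno: int
    count: Optional[int]  # None for noncode lines ("-")
    source: str

def parse_source_line(line: str) -> Optional[SourceLine]:
    match = _RE_SOURCE_LINE.match(line)
    if match is None:
        return None  # a real parser would collect this as a parse error
    count_field, lineno, source = match.groups()
    if count_field == "-":
        count = None  # noncode line
    elif count_field in ("#####", "====="):
        count = 0  # uncovered (normal / exception-only paths)
    else:
        count = int(count_field)
    return SourceLine(int(lineno), count, source)
```

The point of the regex is exactly the failure mode described above: if gcov's format ever deviates, the match fails visibly instead of silently producing wrong data.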
@latk I've squashed the branch for better handling of the rebase, because there were several changes which were undone. BTW: Do you have an idea why the pypy3 tests have been hanging since today?
Hope this can be integrated!
Updated comments on copyright. Updated documentation for added decision coverage. Reordered testcase execution. Improved switch-case detection for older GCC versions. Fixed tests to fit the GCC 5 spec. Removed verbose print; added logging. Repaired the test case. Added JSON check. Reorganized decision analysis. Additional check to improve decision detection accuracy. Changed argument to fit better. Added decision analysis to JSON output. Small cleanups. Removed unused functions. Cleaned up the code; decision coverage is in a separate module. Enabling via argument added. Added warning for unchecked decisions; cleaned up some functions. Fixed some typos. Updated test files for changed unicodes of check and cross. Added reference files for tests. Added decision coverage for HTML outputs. Co-authored-by: Lukas Atkinson <opensource@LukasAtkinson.de> Co-authored-by: Spacetown <michael.foerderer@gmx.de>
I won't pretend to understand the heuristics that parse the source code, but for reasonably formatted code they should be fine. And since this feature is opt-in, the general design is OK.
There are various points that should be addressed before this can be released, though. Not fixing these issues would produce significant technical debt down the line.
For example, the JSON format does not look like a lossless serialization of our data model. And the fact that the decision coverage analyzer opens the source code files by itself required significant changes throughout other code.
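The "lossless serialization" property asked for here can be stated concretely: every field of the internal model must survive a round trip through JSON. A tiny sketch, with an entirely hypothetical model and field names (not gcovr's actual schema):

```python
import json
from typing import NamedTuple

# Hypothetical internal model for one decision; field names are illustrative.
class DecisionCoverage(NamedTuple):
    lineno: int
    count_true: int
    count_false: int

def to_json(d: DecisionCoverage) -> str:
    # Serialize every field, so nothing from the model is lost.
    return json.dumps(d._asdict())

def from_json(text: str) -> DecisionCoverage:
    return DecisionCoverage(**json.loads(text))
```

A serialization is lossless precisely when `from_json(to_json(d)) == d` holds for every value of the model.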
More comments! My greatest concerns have been addressed, so in principle this is ready to merge.
I now understand that the JSON data model is a more accurate description of the decision coverage analysis than the internal data model, so the JSON format is OK (but could maybe be improved).
Most important remaining change: do not remove the `txt-branch` config key, as removing it would break backwards compatibility.
More comments! The new coverage data model raises questions about how totals should be calculated.
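To make the open question about totals concrete, here is one possible aggregation policy, sketched as an assumption rather than gcovr's actual behavior: each decision contributes a true and a false outcome, and decisions the analyzer could not check are left out of the totals.

```python
# Illustrative sketch of one way to aggregate decision totals across lines.
# The handling of "uncheckable" decisions is an assumed policy choice --
# that open question is exactly what is being discussed here.
def decision_summary(decisions):
    total = covered = 0
    for d in decisions:
        if d.get("uncheckable"):
            continue  # one option: exclude unparseable decisions from the totals
        total += 2  # each decision has a true outcome and a false outcome
        covered += (d["count_true"] > 0) + (d["count_false"] > 0)
    percent = 100.0 * covered / total if total else None
    return covered, total, percent
```

Other policies are conceivable, e.g. counting uncheckable decisions as uncovered, which would push the percentage down instead of leaving it unaffected.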
But I'm really happy that we're managing to tear through the backlog of pull requests! We're close to getting this one over the finish line as well.
One thing that I noticed is that there are currently no docs for this feature. I think I can write a section about it after the PR is merged. (Have been toying with the docs anyway, will submit a PR soonishly.)
Then I'll merge back after fixing the CI. |
Hello everybody,
my company is active in embedded software development for the automotive industry. To achieve certain safety levels (ISO 26262, ASIL A, B, and C), "taken at least once" is complete overkill, and "branches executed" is not useful for seeing whether certain decisions have been made by the software.
As a result, I implemented a new decision coverage feature that extends the already existing outputs.
A new test case is also included.
The strategy for analyzing the decisions is the following:
Maybe someone is interested in integrating this output into other file formats; HTML was the most important one for us for now.
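The source-based decision detection discussed in this thread boils down to spotting control-flow constructs in the source text. A purely illustrative keyword heuristic, assuming C-like curly-brace syntax (this is a sketch of the general idea, not gcovr's actual analyzer):

```python
import re

# Rough sketch of a source-based decision heuristic for C-like languages.
# Keywords and the ternary "?" mark lines that likely contain a decision.
_DECISION_RE = re.compile(r"\b(if|while|for|switch|case|default)\b|\?")

def looks_like_decision(source_line: str) -> bool:
    # Strip line comments first, so commented-out code is not flagged.
    code = source_line.split("//", 1)[0]
    return _DECISION_RE.search(code) is not None
```

As noted earlier in the thread, such a heuristic is inherently language-specific and only reliable for reasonably formatted code, which is why the feature is opt-in.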