Context-based Reporting #759

strichter · 2019-01-24T13:39:28Z

This PR implements context-based reporting in two ways:

Select the contexts to report on in the "report" and "html" reporting function. (Globs are supported)

  $ coverage report --contexts=my.pkg.sub1*,my.pkg.sub2* ...

This feature in combination with --include allows you to ensure that the tests for a particular sub-package cover the package itself as expected.

Report the contexts in which a line was covered as part of the HTML coverage report. For each covered line, a little hover target is provided that gives you a popover witht he contexts. It can be invoked using:

  $ coverage html --show-contexts ...

In order to accomplish implement the feature, we made the following extensions:

Function contexts are now fully-qualified Python paths. This allows you to select a group of contexts easily and avoids incorrect context reporting due to identical names.
The signature of CoverageData.lines(filename, context=None) and CoverageData.arcs(filename, context=None) changed to expecting multiple context specs. The context argument got renamed to contexts.
The CoverageData classes grew a method set_query_contexts(contexts=None) that allows one to set contexts for all read operations. This is not just a convenience method, but also an important optimization, since we can cache the context id lookup, which is expensive on a larger code base with lots of tests. (We produced 14k contexts and are frequently selecting a few 1k contexts for analysis.)
The CoverageData classes grew a method called contexts_by_lineno(filename) that reports the contexts for each covered line of a given file. It uses a simple join to make that lookup most efficient.
The CoverageJsonData class was adjusted to handle all new context argumetns and methods and provides decent defaults that mimic ignoring contexts completely.

@nedbat Please let me know what you need us to do in order to get this PR accepted:

Where would be a good place to document the new functionality?
The test_data.py and test_html.py tests are a bit of a mess (and you seem to agree based on the comments in them), so how much do you want us to test the new functionality? The entire project is not 100% code covered so it is hard to tell what we are missing.
I have tried to conform to your coding style as much as possible, but sometimes it was not clear especially line width. If you could review the code, then I will fix things up.

nedbat · 2019-01-24T14:27:16Z

@strichter Thanks for all this work. Generally for significant contributions, it's better to discuss them before implementing. It will take me some time to get to this, and I need to understand what you've done here.

codecov-io · 2019-01-24T14:33:56Z

Codecov Report

Merging #759 into master will increase coverage by 0.07%.
The diff coverage is 93.58%.

@@            Coverage Diff             @@
##           master     #759      +/-   ##
==========================================
+ Coverage    90.6%   90.68%   +0.07%     
==========================================
  Files          81       81              
  Lines       11170    11303     +133     
  Branches     1149     1167      +18     
==========================================
+ Hits        10121    10250     +129     
- Misses        939      940       +1     
- Partials      110      113       +3

Impacted Files	Coverage Δ
tests/test_plugins.py	`99.23% <ø> (ø)`	⬆️
coverage/control.py	`90.14% <ø> (+1.44%)`	⬆️
coverage/config.py	`98.29% <100%> (+0.01%)`	⬆️
coverage/cmdline.py	`91.09% <100%> (+0.09%)`	⬆️
coverage/summary.py	`75.92% <100%> (+0.22%)`	⬆️
tests/test_api.py	`97% <100%> (-0.2%)`	⬇️
tests/test_context.py	`100% <100%> (ø)`	⬆️
tests/test_cmdline.py	`100% <100%> (ø)`	⬆️
tests/test_html.py	`99.78% <100%> (+0.66%)`	⬆️
coverage/html.py	`99.06% <100%> (+0.01%)`	⬆️
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c4f538a...3069f50. Read the comment docs.

alga · 2019-01-25T18:34:47Z

Here are a few scripts that make use of this functionality. First, ensuring that subpackages are sufficiently covered by the tests contained inside them, rather than being accidentally covered by functional tests in other areas.

The config sits in the coveragerc in the following format:

[thresholds]
package.subpackage1 = 95.0
package.subpackage2 = 90.0

Then the script that checks if these standards are maintained is as follows:

#!/usr/bin/env python3
"""Enforce minimum coverage stadards for subpackages.

This script checks whether certain packages are covered to a given
level by only the tests contained in these packages.  The
configuration for this lives in the [thresholds] section of
coveragerc.
"""
import sys
import os
import configparser
from coverage import Coverage
from coverage.results import Numbers
from coverage.summary import SummaryReporter
from prettytable import PrettyTable


COVERAGERC = 'etc/coveragerc'


def main():

    config = configparser.ConfigParser()
    config.read(COVERAGERC)

    coverage = Coverage(config_file=COVERAGERC)
    coverage.load()

    result = 0
    tbl = PrettyTable(["Package", "Result", "Expected", "Actual", "Lines"])
    tbl.align = "l"

    packages = []
    for package in config.options('thresholds'):
        expected = config.getfloat('thresholds', package)

        contexts = [package + "*", '']
        coverage.config.from_args(
            query_contexts=contexts,
            omit=['/tests*'],
            report_include=[os.path.join('src', *package.split('.'), '*')]
        )
        total = Numbers()
        summary = SummaryReporter(coverage, coverage.config)
        reporters = summary.find_file_reporters(None)

        coverage.get_data().set_query_contexts(contexts)
        for fr in reporters:
            analysis = coverage._analyze(fr)
            total += analysis.numbers

        actual = total.pc_covered
        covered_lines, total_lines = total.ratio_covered
        row = [
            package,
            "PASS" if actual >= expected else "FAIL",
            f"{expected:5.1f}%", f"{actual:5.1f}%",
            f"{covered_lines}/{total_lines}"
        ]
        tbl.add_row(row)
        packages.append(row)
        result = result or (actual < expected)

    print(tbl)

    if result:
        print("FAIL")
    else:
        print("PASS")
    return result


if __name__ == '__main__':
    sys.exit(main())

and the example output is:

+------------------------+--------+----------+--------+-----------+
| Package                | Result | Expected | Actual | Lines     |
+------------------------+--------+----------+--------+-----------+
| shoobx.app.document    | FAIL   | 100.0%   |  97.1% | 5861/6034 |
| shoobx.app.hr          | FAIL   | 100.0%   |  24.1% | 224/928   |
| shoobx.app.process     | FAIL   |  90.0%   |  23.7% | 1378/5825 |
| shoobx.app.sbt         | FAIL   | 100.0%   |  20.6% | 328/1592  |
| shoobx.app.stakeholder | FAIL   | 100.0%   |  34.4% | 1176/3415 |
+------------------------+--------+----------+--------+-----------+
FAIL

alga · 2019-01-25T18:46:09Z

Here is a script that, given coverage data, figures out which tests cover the lines of code changed in the branch/sandbox relative to master and runs only those:

#!/usr/bin/env python3
"""Run only the tests that cover the changed lines"""
import os
import re
import sys
import subprocess

from diff_cover.diff_reporter import GitDiffReporter
from diff_cover.git_diff import GitDiffTool
from coverage.sqldata import CoverageSqliteData


COVERAGE_PATH = "var/coverage/coverage-data"
TARGET_BRANCH = "origin/master"


class OriginalDiffReporter(GitDiffReporter):
    """Diff reporter that reports the original line numbers (before changes.)"""
    # @@ -a,b +c,d @@ text
    #  a,b -- start and length before; c,d -- start and length after changes
    HUNK_LINE_RE = re.compile(r'-([0-9]*)')


def main():
    diff = OriginalDiffReporter(
        TARGET_BRANCH, git_diff=GitDiffTool(), ignore_staged=False,
        ignore_unstaged=False, exclude=None)

    data = CoverageSqliteData(COVERAGE_PATH)
    data.read()
    result = set()
    for src_path in diff.src_paths_changed():
        # Diff reporter reports relative source paths, coverage tracks
        # full paths.
        fullpath = os.path.join(os.getcwd(), src_path)

        contexts = data.contexts_by_lineno(fullpath)
        for line in diff.lines_changed(src_path):
            result |= set(contexts[line])

    if result:
        args = ['bin/test'] + sys.argv[1:]
        for test in result:
            if not test:
                continue
            cls, meth = test.rsplit('.', 1)
            args += ['-t', meth + '..' + cls]
        print("Test command:\n", " ".join(args).replace(" -t", " \\\n    -t"))
        subprocess.call(args)
    else:
        print("No changes.")

if __name__ == "__main__":
    main()

The output is like:

bin/python scripts/testchanges.py -cv
Test command:
 bin/test -cv \
    -t test_foo..shoobx.app.document.tests.test_base.BaseDocumentTests \
    -t test_bar..shoobx.app.document.tests.test_base.BaseDocumentTests
---- PID: 13938 ----
Running tests at level 1
Running zope.testrunner.layer.UnitTests tests:
  Set up zope.testrunner.layer.UnitTests in 0.000 seconds.
  Running:
..
  Ran 2 tests with 0 failures, 0 errors, 0 skipped in 0.009 seconds.
Tearing down left over layers:
  Tear down zope.testrunner.layer.UnitTests in 0.000 seconds.

nedbat · 2019-02-17T20:07:53Z

Thanks, I'm looking over this pull request, and I like lots of it.

Can you tell me more about this?

In order to support dynamically generated tests, you can specify __coverage_context__ on the class level. (I do that all the time when creating large file-comparison-based test suites.)

I'm reluctant to include this as-is for a few reasons:

I'm not a fan of using dunder names, since they should be reserved to Python.
Why can we override the context on the class level, but not the function or module level? Generally I've tried to provide hooks, and let people implement their own logic, rather than hardcode specifics.

The idea from the beginning was that should_start_context_test_function would be generalized into a plugin, and people could provide their own implementations.

How are you dynamically generating classes?

strichter · 2019-02-24T20:46:17Z

I am fine with any other method including some sort of plugin. But from the existing code I simply did not find a good hook to do so. If you do not like dunder attributes, I am not married to them.

According to the unittest module, test classes only have to implement runTest(). That turns out to be super-handy when you want to create a class that tests mulitple things. I use this, for example, to test that document conversions are done correctly. You simply create one class instance for every test file. The problem is that then the test ids are not unique since they all have the same module, class and function name. So I generate a new type for each test file to make the name unique.

A public example is here: https://github.com/zopefoundation/z3c.rml/blob/master/src/z3c/rml/tests/test_rml.py

It is undesirable to run all files with one test case since it prohibits async execution.

As I said, I am not married to the approach, jsut let me know how this can be done better.

Regards,
Stephan

strichter · 2019-02-27T12:44:51Z

@nedbat Shoobx will provide a PR in the next few days implementing a simple plugin mechanism for dynamic contexts. That should remove the need to allow customization via classes this way.

Background: We had a huge perf issue identified with coverage today. reportlab has a method called test() in its code and in dev mode it is called a lot (like 10M in one of our tests) slowing down the test from 40ms to 62,000ms. While we will turn off that call through configuration in reportlab, we are wary that other methods named test*() will create similar perf issues. For our plugin we will limit the dynamic context scope to source files within our project code.

sjustas · 2019-03-07T12:17:18Z

@nedbat, here are two PRs related to discussion above.

Dynamic context plugins - #783, and API Coverage.switch_context - #782

Plugins provide nice non-intrusive flexibility, and API access would allow improving coverage-aware tools like nose (details in #782). What are your thoughts?

…way we can report on namespaces.

… way to update the expected output.)

…(For large coverage data sets, reporting the context can be expensive.)

strichter · 2019-05-12T17:23:37Z

@nedbat Hi Ned, I have addressed your concerns above by removing the special variable. This is possible, since we can now use the new plugins. We have also merged in master and fixed all new test failures.

I hope this removes the last obstacle for this PR.

Regards,
Stephan

nedbat · 2019-05-27T15:35:00Z

Thanks, I'm starting to look at the code. I'd like --contexts to apply to the HTML report also, so I'm making some changes.

nedbat · 2019-06-10T21:49:41Z

Thanks, I've rebased this onto master due to conflicts. Some cleanup, changelog, etc, to come.

strichter · 2019-06-10T23:41:30Z

@nedbat Thanks for merging this in! Very exciting. This code has worked well for us the last 5 months. When do you plan the 5.0 release?

alga force-pushed the sbx-report-contexts branch from af31979 to 41d154f Compare January 25, 2019 19:55

strichter force-pushed the sbx-report-contexts branch from 09bb09b to dce8777 Compare May 12, 2019 15:04

strichter and others added 16 commits May 12, 2019 11:20

Support for multiple, glob-based contexts in result queries.

c111081

Initialize attributes.

7c5451e

Pass the contexts to the report command.

4e9e541

Sigh, sqlite3 does not support proper array serialization.

fad4cf9

Provide a default for query contexts.

54db11e

Make sure to report the full Python path for the test function. That …

f9bc560

…way we can report on namespaces.

Add ability to report contexts for all lines of a file.

a8210a6

Extend HTML report to to show contexts for every covered line.

d3b51db

Fix styling and tests. (The tests are slightly insane with no obvious…

d422014

… way to update the expected output.)

Handle TestCase classes with just one test, too.

61a0af2

Tricky case: Thread switching is getting in the way.

1b90d58

Fix a test.

d89c16f

Only compute contexts in HTML coverage when the option is specified. …

729d648

…(For large coverage data sets, reporting the context can be expensive.)

Add some tests for new methods.

df6f315

Support for full qnames for old-style classes.

d9f82d7

Update tests to use updated APIs.

3069f50

strichter force-pushed the sbx-report-contexts branch from dce8777 to 3069f50 Compare May 12, 2019 16:18

nedbat closed this Jun 10, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Context-based Reporting #759

Context-based Reporting #759

strichter commented Jan 24, 2019 •

edited

nedbat commented Jan 24, 2019

codecov-io commented Jan 24, 2019 •

edited

alga commented Jan 25, 2019

alga commented Jan 25, 2019

nedbat commented Feb 17, 2019

strichter commented Feb 24, 2019

strichter commented Feb 27, 2019

sjustas commented Mar 7, 2019 •

edited

strichter commented May 12, 2019

nedbat commented May 27, 2019 •

edited

nedbat commented Jun 10, 2019

strichter commented Jun 10, 2019

Context-based Reporting #759

Context-based Reporting #759

Conversation

strichter commented Jan 24, 2019 • edited

nedbat commented Jan 24, 2019

codecov-io commented Jan 24, 2019 • edited

Codecov Report

alga commented Jan 25, 2019

alga commented Jan 25, 2019

nedbat commented Feb 17, 2019

strichter commented Feb 24, 2019

strichter commented Feb 27, 2019

sjustas commented Mar 7, 2019 • edited

strichter commented May 12, 2019

nedbat commented May 27, 2019 • edited

nedbat commented Jun 10, 2019

strichter commented Jun 10, 2019

strichter commented Jan 24, 2019 •

edited

codecov-io commented Jan 24, 2019 •

edited

sjustas commented Mar 7, 2019 •

edited

nedbat commented May 27, 2019 •

edited