Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add coverage.py support to Sage's doctests #25263

Open
embray opened this issue Apr 29, 2018 · 11 comments
Open

Add coverage.py support to Sage's doctests #25263

embray opened this issue Apr 29, 2018 · 11 comments

Comments

@embray
Copy link
Contributor

embray commented Apr 29, 2018

I would like to have better line coverage reports for Sage's doctests. There are a few hurdles to this simply due to Sage's custom test runner, and the sheer size of Sage, but nothing that can't be overcome. My initial testing just with coverage.py's command-line interface has shown some reasonable results (by running tests with --serial. But we really need to integrate coverage directly into the doctest runner for parallelization support.

coverage.py does support parallel tests--it has built-in support for writing coverage results to per-process files, and then merging those files. The pytest-cov plugin, which supports this use case, may provide some valuable guidance on this.

One other issue is that the Cython.Coverage plugin to coverage.py, provided by Cython, seems to still have a few issues. In particular it makes a couple assumptions that don't currently work for Sage, including that the code is built/installed in "in-place" mode, and that C sources are output to the same directory as the Cython sources. My experiments indicate that both of these limitations can be overcome with a bit of tinkering, so we don't need to make any major changes to how Sage is currently built in order to support this.

The one thing that does need to be done is that Sage needs to be compiled with the linetrace=True directive passed to the Cython compiler, and with the CYTHON_TRACE=1 macro set when compiling the C sources. Alternatively (and perferably) we can set CYTHON_TRACE_NO_GIL=1 which in principle even allows tracing lines in nogil blocks. However, my initial experiments with that led to segfaults, which I will investigate more later. As a first pass we can do without it.

So what I would propose would be to:

  1. Add some options to setup.py to enable line tracing in Cython: Add option to enable Cython line tracing #25264

  2. Add flags to the doctest runner to optionally enable coverage testing and only if Sage was built with the necessary profiling flags (it could check the .cython_version file for this). Then actually start the coverage engine for each test process, and merge the results at the end.

  3. Subclass and/or monkeypatch the Cython.Coverage plugin, which currently doesn't quite work for how Sage is developed (though if there are any improvements worth upstreaming we should do so).

Component: doctest coverage

Issue created by migration from https://trac.sagemath.org/ticket/25263

@embray embray added this to the sage-8.3 milestone Apr 29, 2018
@jdemeyer

This comment has been minimized.

@embray

This comment has been minimized.

@embray
Copy link
Contributor Author

embray commented May 1, 2018

comment:3

Report so far: I was able to integrate coverage.py into Sage's doctest runner using its API--that aspect was reasonably easy, though not perfect. The ugliest part is outputting thousands of per-module coverage files for the parallel tests, and then merging them into a single .coverage file. coverage.py provides this ability built-in, but I'm not totally happy with how it works.

The other problem is that Cython's plugin for coverage.py is not perfect--this is no surprise, as it probably doesn't get used much, and Sage is a massive stress-test of it. For one, in order for it to work at all, I had to copy all of the c/cpp files into site-packages. We don't want to do that in normal use, so I think I'll write our own wrapper around Cython.Coverage which can handle alternate search paths for C sources.

There were also several small bugs in the plugin, especially in handling .pxi files :) I think I have temporary hacks around most of the bugs, but now I need to clean things up.

Finally, while most of the tests ran just fine under profiling, about a dozen just outright timed out. I have a few ideas for speeding up line profiling in Cython, but I'm not sure how much of a difference it will make.

@embray
Copy link
Contributor Author

embray commented May 1, 2018

comment:4

Oh, some other notes:

The coverage report I was able to produce, finally, for the full doctest suite (sans "long" tests) suggested 77% coverage! For some bizarre reason the file sphinx/pycode/pgen2/parse.py got included in the coverage report--the only file outside of the sage package that was included. The configuration was such that it was only supposed to include files in Sage. I'm not sure how that happened yet.

Also, in reality, the coverage is probably even a bit higher: Cython's line profiling itself has a few bugs. In particular it omits the def lines for all functions and methods, even those that are definitely run. This should be easy to fix. It is also omitting all module-level Python code in .pyx files (including import statements!), which of course should be covered so long as the modules import successfully. This might be slightly trickier to fix but I'm not sure.

@embray
Copy link
Contributor Author

embray commented May 1, 2018

comment:5

You can see this example coverage report here: https://embray.github.io/sagemath-coverage/

@embray
Copy link
Contributor Author

embray commented May 1, 2018

comment:6

Replying to @embray:

In particular it omits the def lines for all functions and methods, even those that are definitely run.

Looking more at the coverage report, I'm seeing this problem even on pure Python modules, so maybe it's less a problem with the tracing than with reporting. I've never seen this problem before on pure-Python code coverage (and I confirmed even with the same version of coverage.py I don't see this problem normally).

The Cython plugin really shouldn't be handling pure-Python modules at all, but it might be that the Cython plugin is still interfering somehow.

@embray
Copy link
Contributor Author

embray commented May 1, 2018

comment:7

def and class statements in some modules aren't being tracked because those line are executed at module level as well, so any module-level code not being traced is the same problem, of course.

I've also found now that it's not a problem in all pure-Python modules, just some (and AFAICT in all Cython modules).

@embray
Copy link
Contributor Author

embray commented May 1, 2018

comment:8

At least for a trivial example I tried line tracing worked for a Cython module just fine, including all module-level statements.

Given that this appears to be affecting Cython and Python modules alike, my next guess it that it has something to do with LazyImport...

@embray
Copy link
Contributor Author

embray commented May 1, 2018

comment:9

Oh, silly me. It's probably nothing to do with LazyImport, but much simpler: the init_sage() function run by the DocTestDispatcher itself probably winds up importing many if not most modules in Sage, and that is not currently run with the coverage tracer enabled. I'll fix that and see how it goes.

@embray
Copy link
Contributor Author

embray commented May 1, 2018

comment:10

Yes, that more or less did the trick. I also had to move enabling of coverage tracking to very early in the sage-runtests script, so that it can pick up some of the lines covered simply by importing sage.doctest.

@embray
Copy link
Contributor Author

embray commented Jul 18, 2018

comment:11

I have a mostly working prototype for this, but it's somewhat marred by drastic slowdown, especially in Cython modules. It might not be (as) useful for running coverage on the full doctest suite. But it definitely helps for checking coverage of doctests provided in a single module (especially useful in Sage since the tests tend to be very localized). It also might be more useful in conjunction with #25270 to get reasonable estimates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants