Differences in coverage percentage between Python 3.7 and Python 3.8 #866

sanjioh · 2019-11-04T11:47:54Z

Describe the bug
The coverage percentage appears to be sensitive to the Python version under which coverage run and coverage combine + coverage report are executed. Specifically, it varies with the combination of Python versions used for those commands, even though the version of coverage.py stays the same.

I have experienced the following patterns:

| coverage run | coverage combine + coverage report | coverage percentage |
| --- | --- | --- |
| Python 3.7 | Python 3.7 | 100% |
| Python 3.8 | Python 3.8 | 100% |
| Python 3.7 | Python 3.8 | 92% |
| Python 3.8 | Python 3.7 | 100% |

To Reproduce

Clone this repository: https://github.com/sanjioh/tox-interpreters.
Run tox -r -e py38-tox314,coverage-report. This should report 100% coverage.
Run tox -r -e py37-tox314,coverage-report. This should report 92% coverage.
If basepython for the coverage-report tox env is changed to python3.7, coverage consistently reports 100%, as per the above table.

coverage.py version: 4.5.4

Expected behavior
I would expect 100% coverage when coverage run is executed with Python 3.7 and coverage combine + coverage report are executed with Python 3.8.

Thanks for your support, please let me know if you need any further information.

nedbat · 2019-11-05T00:46:01Z

This is an unfortunate side effect of changes in details of the trace function between versions of Python. In 3.7 and earlier, a decorated function definition would invoke the trace function only for the decorator line, not for the def line. In 3.8 and later, the trace function is called for both the decorator line and the def line.

So your Python 3.7 run collected the data that the decorator lines were run, but not the def line, because that is how 3.7 behaves. When reporting the coverage on 3.8, coverage.py knows that both the decorator and def lines could have been marked as run, and sees only the decorator line in the collected data, and so marks the def line as not run.
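The difference described above can be observed directly with a small trace function. This is a minimal sketch, not coverage.py's own tracer: the sample module, the names `deco` and `decorated`, and the helper `traced_lines` are all made up for illustration. Run it under different Python versions and compare the output.

```python
import sys

# Hypothetical sample module; the names are illustrative only.
SOURCE = """\
def deco(func):
    return func

@deco
def decorated():
    return 1
"""
DECORATOR_LINE, DEF_LINE = 4, 5


def traced_lines(source, filename="<sample>"):
    """Return the line numbers reported to a trace function while running source."""
    code = compile(source, filename, "exec")
    lines = set()

    def tracer(frame, event, arg):
        # Only follow frames that belong to the sample module.
        if frame.f_code.co_filename != filename:
            return None
        if event == "line":
            lines.add(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        exec(code, {})
    finally:
        sys.settrace(None)
    return lines


lines = traced_lines(SOURCE)
print(sys.version_info[:2], sorted(lines))
print("decorator line traced:", DECORATOR_LINE in lines)
print("def line traced:", DEF_LINE in lines)
```

Per the explanation above, the last line should print False on 3.7 and earlier and True on 3.8 and later. The function body (line 6) is never reported because the sample never calls `decorated`.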

I'm not sure what coverage.py can do to improve this situation. I guess in theory the "what could have been run" logic could use the Python version noted in the data, not the Python version it's running against, but I haven't thought through how feasible that is.

Can you coordinate to use the same version for both measurement and reporting?
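The mismatch between "what could have been run" and "what was recorded" can be sketched as a set difference. This is a rough analogue only: it derives executable lines from bytecode line starts via `dis.findlinestarts`, which is an assumption for illustration and not necessarily how coverage.py computes them. The `RECORDED_ON_37` set is hand-written to follow the explanation above (decorator line recorded, def line not), not real measurement data.

```python
import dis
import sys

# Hypothetical sample module; the names are illustrative only.
SOURCE = """\
def deco(func):
    return func

@deco
def decorated():
    return 1

decorated()
"""
DEF_LINE = 5


def executable_lines(code):
    """Collect line numbers that start bytecode in code and in nested code objects."""
    # "if line" skips None entries and the synthetic line 0 some versions emit.
    lines = {line for _, line in dis.findlinestarts(code) if line}
    for const in code.co_consts:
        if hasattr(const, "co_code"):
            lines |= executable_lines(const)
    return lines


# Assumed data from a 3.7 measurement run: line 4 (decorator) is present,
# line 5 (def) is absent, as described in the comment above.
RECORDED_ON_37 = {1, 2, 4, 6, 8}

executable = executable_lines(compile(SOURCE, "<sample>", "exec"))
missing = executable - RECORDED_ON_37
print(sys.version_info[:2], "executable:", sorted(executable))
print("reported as not run:", sorted(missing))
```

When the reporting interpreter is 3.8 or later, the def line is among the executable lines but not in the recorded data, so it is flagged as missed. Reporting with the same version that did the measuring avoids the difference.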

sanjioh · 2019-11-05T09:19:55Z

Hi,

thanks for the thorough explanation, that makes sense indeed.
I suspected decorated functions were involved, but I couldn't figure out why.
Just a little weirdness I've noticed: what's special about the _split() method that makes coverage behave differently (and correctly)? Could this be a starting point for something?

While it would surely be nice for coverage.py to abstract away the details of the underlying tracing logic, I totally understand that this is not trivial to achieve.
For now, I'm going to combine the coverage results across all the Python versions I'm supporting, so the impact of this issue should be minimal.

Thanks again for your help.

nedbat · 2019-11-05T10:10:15Z

I noticed that about _split also. The default argument for sep causes the trace function to be called for the def line also, which I hadn't realized before. This gets messy... :(

ArturKlauser · 2019-12-04T15:37:14Z

Just FYI (I ran into the same issue):
I think this problem is bound to become more prevalent now that Python 3.8 is starting to become the 'default' Python version in places such as CI environments. For example, I was running coverage over a test matrix of Python versions and then combining and reporting with the default (i.e. newest stable) Python version, which brings out this decorator problem. Maybe add to FAQ?

When running `coverage combine` under python 3.8, function definition lines after function decorators are marked as not executed. This issue doesn't happen with earlier versions of python, so using v3.7 instead. See nedbat/coveragepy#866 for an explanation.

nedbat · 2019-12-04T19:47:19Z

@ArturKlauser I'm wondering what kind of thing to put in the FAQ. It could be, "If you measure and report on different versions of Python, you could get confusing results." Or it could be, "If you measure on 3.7 and report on 3.8, a decorated function will mark the 'def' line as not run."

That is, how specific a FAQ entry are you thinking of?

ArturKlauser · 2019-12-05T11:08:34Z

I was thinking more along the lines of the latter, but a bit more general, like "If you measure on < 3.8 and report on >= 3.8, a decorated function will mark the 'def' line as not run." The < 3.8 is from my experience with reporting on 2.7 and 3.7, but I assume it generally holds for < 3.8 given your description of the cause earlier in this bug report. The >= 3.8 is my assumption that the current 3.8.0 behavior is going to be kept going forward.

I think two good candidate locations for this warning would be the FAQ about "Q: Why do the bodies of functions (or classes) show as executed, but the def lines do not?" or close to it, or the end of the "Things that cause trouble" page.

Thanks for considering it.

nedbat · 2019-12-08T15:45:25Z

I've updated the FAQ in 8ed8cbe

When collecting coverage data on Python 3.7 and earlier but generating the coverage report in 3.8 or later, some decorators show incorrect coverage information (see nedbat/coveragepy#866 for more details). This commit changes the github action to use Python 3.7 when generating the coverage report, hopefully working around this issue for some of the Windows-only modules which are only run on Python 3.6 and 3.7.

mthuurne · 2023-02-25T08:03:26Z

Did the decorator coverage change again in Python 3.11? When measuring branch coverage on both 3.10 and 3.11, everything is fine if I run coverage combine under Python 3.10, but if I combine under 3.11 the branch coverage will be incomplete, saying the def line doesn't jump to the decorator's line.

sanjioh added the bug Something isn't working label Nov 4, 2019

nedbat closed this as completed Nov 5, 2019

jamadden mentioned this issue Jan 21, 2020

Remove checking of PURE_PYTHON at build time zopefoundation/zope.interface#151

Merged

nedbat mentioned this issue Mar 9, 2020

Report ignores decorators when run on Python 3.8 if testing was done on previous Python versions. #941

Closed

mogoh mentioned this issue Jul 28, 2023

Django 4.2/DjangoCMS 3.11 support django-cms/djangocms-picture#122

Open

11 tasks
