Dramatic slowdown when running coverage on PyPy (~20x slower than CPython) #2155

@pomponchik

Summary

Running coverage with branch coverage enabled on PyPy results in a ~20x slowdown compared to CPython (3.5-7 minutes vs ~15 seconds for the same test suite). This makes it impractical to enforce coverage on PyPy in CI.

You can view my experiments in the PR to my repository (commits starting at d91abf2c8fdb6ca7c89cd18c90bcb666c09b951f and ending at 9fbe0d4999b66a53a0dfdf2f0ef1bd5b7e235416).

Environment

  • coverage version: 7.6.1
  • PyPy versions tested: pypy3.9 (7.3.16), pypy3.10 (7.3.19), pypy3.11 (7.3.21)
  • CPython baseline: 3.8-3.15
  • OS: macOS, Ubuntu, Windows (all three affected equally)
  • Test runner: pytest 8.3.5 + pytest-xdist 3.8.0 (-n auto)
  • Coverage config:
    [tool.coverage.run]
    branch = true
    parallel = true
    plugins = ["coverage_pyver_pragma"]
    source = ["suby"]

The project

suby is a small subprocess wrapper library (~330 statements, ~105 branches). The test suite has 341 tests. Most tests spawn a short-lived child process via subprocess.Popen.

On CPython, the full test suite with branch coverage and xdist parallelization completes in ~15 seconds.

What happens on PyPy

The same test suite with the same coverage configuration takes 3.5-7 minutes on PyPy, depending on the approach used.

Approach 1: coverage run + .pth file (same as CPython)

COVERAGE_PROCESS_START="pyproject.toml" coverage run -m pytest -n auto
coverage combine
coverage report -m --fail-under=100

This uses a .pth file that bootstraps coverage.process_startup() in the xdist workers, plus a pytest_configure hook that removes COVERAGE_PROCESS_START from the worker environment so that child processes do not inherit it.

Result: ~3.5 minutes (vs ~15 seconds on CPython).

Approach 2: pytest-cov

pytest --cov=suby --cov-branch --cov-fail-under=100 -n auto

Result: 5-7 minutes. This is even worse: pytest-cov's .pth file (driven by the COV_CORE_* env vars) activates coverage in the xdist workers at Python startup, and the pytest-cov plugin then starts a second coverage instance in the same workers, so the same code is traced twice.

Approach 3: Single-process (no xdist)

coverage run -m pytest

We also tried running without xdist to avoid the .pth subprocess overhead entirely. This was still dramatically slower than CPython, confirming that the tracing itself (not subprocess bootstrap) is the bottleneck.
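For anyone who wants to reproduce the single-process observation without the test suite, a rough timing harness along these lines may help. It does not use coverage at all: it times a CPU-bound loop with and without a do-nothing Python trace function installed via sys.settrace. The names workload, noop_tracer and timed are invented for this sketch, and a no-op tracer only gives a lower bound on what a real pure Python tracer would cost.

```python
import sys
import time


def workload(n=200_000):
    # A CPU-bound loop with many executed lines, so tracing cost dominates.
    total = 0
    for i in range(n):
        total += i % 7
    return total


def noop_tracer(frame, event, arg):
    # Does no bookkeeping at all; returning itself keeps line events enabled.
    return noop_tracer


def timed(func, tracer=None):
    """Run func() with the given trace function installed; return seconds."""
    previous = sys.gettrace()
    sys.settrace(tracer)
    start = time.perf_counter()
    try:
        func()
    finally:
        elapsed = time.perf_counter() - start
        sys.settrace(previous)  # restore whatever was installed before
    return elapsed


if __name__ == "__main__":
    plain = timed(workload)
    traced = timed(workload, noop_tracer)
    print(f"untraced: {plain:.3f}s  traced: {traced:.3f}s  ratio: {traced / plain:.1f}x")
```

Running the same script under CPython and under PyPy gives a per-interpreter ratio for plain Python-level tracing, independent of xdist and of subprocess startup.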

Root cause analysis

We traced the slowdown to coverage's line/branch tracer running as pure Python on PyPy (no C extension). Specifically:

  1. sys.settrace overhead: On CPython, coverage uses a C-based tracer. On PyPy, it falls back to a pure Python tracer. Every line of executed code (including pytest framework code, not just the measured source) passes through this tracer.

  2. Branch coverage amplifies the cost: With branch = true, the tracer does additional work per line to track branch transitions. On CPython's C tracer this is negligible; on PyPy's pure Python tracer it's expensive.

  3. source filtering doesn't help enough: Even with source = ["suby"] limiting coverage measurement to a small package, the sys.settrace callback still fires for ALL executed code. The filtering happens inside the callback, but the call overhead remains.
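Point 3 can be illustrated with plain sys.settrace, independent of coverage. The sketch below is not coverage's tracer; it is a generic trace function that counts every callback it receives and, separately, the callbacks that belong to one "wanted" file (a stand-in for a source filter). The helper names workload and count_callbacks are hypothetical.

```python
import json
import sys


def workload():
    # Code in "our" file, plus a call into pure Python stdlib code elsewhere.
    total = sum(i for i in range(50))
    return json.dumps({"total": total})


def count_callbacks(func):
    """Trace func() and count all callbacks vs. those from func's own file."""
    wanted = func.__code__.co_filename  # stand-in for a `source` filter
    counts = {"callbacks": 0, "matched": 0}

    def tracer(frame, event, arg):
        counts["callbacks"] += 1  # paid for every event, in every file
        if frame.f_code.co_filename == wanted:  # filtering happens in here
            counts["matched"] += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    finally:
        sys.settrace(previous)
    return counts


if __name__ == "__main__":
    print(count_callbacks(workload))
```

The gap between the two counters is the work done for code that is never reported on. In a real test run that gap includes pytest and plugin code, which is far larger than a single json.dumps call.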

What we tried to mitigate

  • Removed coverage from child subprocesses (conftest hook removing COVERAGE_PROCESS_START / COV_CORE_* from xdist workers' env) - this prevented the ~300 child processes per run from loading coverage, but the xdist worker tracing overhead remained.
  • Tried pytest-cov instead of manual coverage run - made things worse due to double coverage instances.
  • Disabled pytest-cov with -p no:cov while using coverage run - eliminated double coverage but PyPy tracing is still inherently slow.
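The first mitigation can be sketched as a conftest.py fragment. This is a reconstruction under assumptions, not the exact code from the PR: strip_coverage_env is a hypothetical helper name, and the hook relies on pytest-xdist's convention that worker processes expose a workerinput attribute on the config object.

```python
import os

# Variables named in this report: coverage's own startup variable and the
# COV_CORE_* family used by pytest-cov's .pth file.
COVERAGE_ENV_NAMES = ("COVERAGE_PROCESS_START",)
COVERAGE_ENV_PREFIXES = ("COV_CORE_",)


def strip_coverage_env(environ):
    """Remove coverage bootstrap variables in place; return what was removed."""
    removed = {}
    for name in list(environ):  # copy the keys, since we mutate while iterating
        if name in COVERAGE_ENV_NAMES or name.startswith(COVERAGE_ENV_PREFIXES):
            removed[name] = environ.pop(name)
    return removed


def pytest_configure(config):
    # Only xdist workers have `workerinput`; the controller keeps its settings.
    if hasattr(config, "workerinput"):
        strip_coverage_env(os.environ)
```

By the time pytest_configure runs, the worker's own .pth bootstrap has already happened at interpreter startup, so the worker itself is still measured. Only the child processes spawned by the tests stop inheriting the variables.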

Expected behavior

Coverage on PyPy should be within a reasonable factor (2-3x) of CPython performance, not 20x+ slower. PyPy's JIT should in theory be able to optimize a hot tracing function.

Possible directions

  • A PyPy-optimized tracer (perhaps using PyPy's JIT-friendly patterns instead of sys.settrace)
  • An option to only invoke the trace callback for files matching source, rather than filtering inside the callback
  • CFFI-based tracer for PyPy instead of pure Python
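To make the second direction more concrete: plain sys.settrace already allows a per-scope opt-out, because a global trace function that returns None for a new frame gets no line events for that frame. The sketch below shows this with hypothetical helpers (workload, count_events). It makes no claim about what coverage's tracers do internally. It only shows that this mechanism removes line events for unmatched files, while the "call" callback itself still fires for every frame.

```python
import json
import sys


def workload():
    total = 0
    for i in range(50):
        total += i
    return json.dumps({"total": total})  # runs pure Python code in other files


def count_events(func, only_filename=None):
    """Count 'call' and 'line' events, optionally tracing lines in one file only."""
    events = {"call": 0, "line": 0}

    def local_tracer(frame, event, arg):
        if event == "line":
            events["line"] += 1
        return local_tracer

    def global_tracer(frame, event, arg):
        # Invoked once per new frame, whatever file the frame belongs to.
        events["call"] += 1
        if only_filename is not None and frame.f_code.co_filename != only_filename:
            return None  # opt this scope out of line events
        return local_tracer

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        func()
    finally:
        sys.settrace(previous)
    return events


if __name__ == "__main__":
    own_file = workload.__code__.co_filename
    print("all files:    ", count_events(workload))
    print("own file only:", count_events(workload, only_filename=own_file))
```

The remaining per-frame "call" callbacks are the part that an interpreter-level filter, as proposed in the list above, would have to eliminate.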
