Flaky WSL cuda.bindings benchmark smoke test: pyperf raises ValueError: benchmark function returned zero #1885

@rwgk

Description

Observed shortly after #1880 was merged.

I expect this will haunt us until we tweak the test.

Cursor-generated writeup:


Description

The WSL CI lane "Test linux-64 / py3.12, 13.2.0, wheels, rtx4090, wsl" can fail in the "Run cuda.bindings benchmarks (smoke test)" step when pyperf receives a zero-duration measurement:

python run_pyperf.py --fast --loops 1 --min-time 0
...
ValueError: benchmark function returned zero

This does not look like a functional CUDA regression. In the failing run, the job had already completed:

  • Run cuda.pathfinder tests with see_what_works
  • Run cuda.bindings tests

and then failed specifically in:

  • Run cuda.bindings benchmarks (smoke test)

The benchmark smoke step currently comes from .github/workflows/test-wheel-linux.yml:

pip install pyperf
pushd cuda_bindings/benchmarks
python run_pyperf.py --fast --loops 1 --min-time 0
popd

The benchmark implementation that was called out in the PR #1817 analysis is:

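import time

# cuda, ATTRIBUTE and PTR are module-level names defined elsewhere in the benchmark file.
def bench_pointer_get_attribute(loops: int) -> float:
    # Bind to locals so the timed loop avoids global and attribute lookups.
    _cuPointerGetAttribute = cuda.cuPointerGetAttribute
    _attr = ATTRIBUTE
    _ptr = PTR

    t0 = time.perf_counter()
    for _ in range(loops):
        _cuPointerGetAttribute(_attr, _ptr)
    # Elapsed wall time for all loops; this is 0.0 if both clock reads match.
    return time.perf_counter() - t0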

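With --loops 1 --min-time 0, each sample times a single cuPointerGetAttribute call. If that call finishes within one tick of the time.perf_counter() clock, both clock reads return the same value, the function returns 0.0, and pyperf rejects the sample with the ValueError shown above. That seems plausible on WSL with fast hardware, so this looks like a benchmark flake rather than a deterministic product failure.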
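
To illustrate the suspected mechanism, here is a minimal sketch that reproduces a zero-length sample without any CUDA involvement. It is not the benchmark code: timed_loop and make_coarse_clock are hypothetical helpers, and the coarse clock simply simulates a timer whose resolution is larger than the work being timed.

```python
import time


def timed_loop(func, loops, clock=time.perf_counter):
    """Time `loops` calls of `func`, mirroring the structure of the benchmark."""
    t0 = clock()
    for _ in range(loops):
        func()
    return clock() - t0


def make_coarse_clock(tick):
    """Hypothetical clock that only advances in steps of `tick` seconds."""
    start = time.perf_counter()

    def clock():
        elapsed = time.perf_counter() - start
        # Round down to the last whole tick, like a low-resolution timer.
        return (elapsed // tick) * tick

    return clock


# Resolution reported by the platform for the real clock.
print(time.get_clock_info("perf_counter").resolution)

# With a tick far longer than the call, both reads land in the same tick,
# so the measured duration is exactly zero.
coarse = make_coarse_clock(tick=3600.0)
print(timed_loop(lambda: None, loops=1, clock=coarse))
```

Raising loops makes it more likely that the timed region spans at least one clock tick, which is why more work per sample should reduce the flake rate.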

Failing example

Passing comparisons

Why this looks flaky

  • The failure occurs in the benchmark smoke step, not in the regular cuda.bindings test suite.
  • Nearby runs of the same WSL lane passed.
  • The exception comes from pyperf rejecting a zero-duration sample, not from a CUDA API error.

Possible follow-ups

  • Increase the amount of work in the smoke benchmark, for example by raising --loops or using a positive --min-time.
  • Retry the benchmark smoke step once before failing the whole job.
  • Consider using different benchmark-smoke settings on WSL than on non-WSL Linux runners.
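
The follow-ups above could be combined in a small wrapper script, sketched below under stated assumptions. The helper names (is_wsl, smoke_args, run_with_retry) are hypothetical, the loop count of 100 is a placeholder rather than a tuned value, and the "microsoft" substring check is a common heuristic for WSL kernels, not something taken from this repository. Only the --fast, --loops and --min-time flags come from the existing workflow step.

```python
import subprocess
import sys


def is_wsl(kernel_release):
    """Heuristic (assumption): WSL kernel release strings contain 'microsoft'."""
    return "microsoft" in kernel_release.lower()


def smoke_args(on_wsl):
    """Return flags for the smoke run; the WSL loop count is a placeholder."""
    if on_wsl:
        # More iterations per sample make a zero-duration sample less likely.
        return ["--fast", "--loops", "100", "--min-time", "0"]
    return ["--fast", "--loops", "1", "--min-time", "0"]


def run_with_retry(cmd, attempts=2):
    """Run `cmd`, retrying on a non-zero exit; return the last exit code."""
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            return 0
        print(
            f"attempt {attempt}/{attempts} failed with exit code {returncode}",
            file=sys.stderr,
        )
    return returncode
```

A caller could pass platform.uname().release to is_wsl and then run run_with_retry([sys.executable, "run_pyperf.py", *smoke_args(on_wsl)]) from cuda_bindings/benchmarks. Note that a retry only hides the zero-duration sample, while a larger loop count addresses the likely cause, so the two options are not equivalent.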

Labels

  • P1: Medium priority - Should do
  • cuda.bindings: Everything related to the cuda.bindings module
  • test: Improvements or additions to tests
  • triage: Needs the team's attention
