Flaky WSL `cuda.bindings` benchmark smoke test: `pyperf` raises `ValueError: benchmark function returned zero`

Observed shortly after #1880 was merged.

I expect this will haunt us until we tweak the test.

Cursor-generated writeup:

___

# Description

The WSL CI lane `Test linux-64 / py3.12, 13.2.0, wheels, rtx4090, wsl` can fail in the `Run cuda.bindings benchmarks (smoke test)` step with a zero-duration `pyperf` measurement:

```text
python run_pyperf.py --fast --loops 1 --min-time 0
...
ValueError: benchmark function returned zero
```

This does not look like a functional CUDA regression. In the failing run, the job had already completed:

- `Run cuda.pathfinder tests with see_what_works`
- `Run cuda.bindings tests`

and then failed specifically in:

- `Run cuda.bindings benchmarks (smoke test)`

The benchmark smoke step currently comes from `.github/workflows/test-wheel-linux.yml`:

```bash
pip install pyperf
pushd cuda_bindings/benchmarks
python run_pyperf.py --fast --loops 1 --min-time 0
popd
```

The benchmark implementation that was called out in the PR #1817 analysis is:

```python
def bench_pointer_get_attribute(loops: int) -> float:
    _cuPointerGetAttribute = cuda.cuPointerGetAttribute
    _attr = ATTRIBUTE
    _ptr = PTR

    t0 = time.perf_counter()
    for _ in range(loops):
        _cuPointerGetAttribute(_attr, _ptr)
    return time.perf_counter() - t0
```

With `--loops 1 --min-time 0`, a zero-length timing sample on WSL and fast hardware seems plausible, so this looks like a benchmark flake rather than a deterministic product failure.

## Failing example

- PR analysis comment: [PR #1817 comment](https://github.com/NVIDIA/cuda-python/pull/1817#issuecomment-4211747125)
- Overall run attempt: [run 24172970743 attempt 1](https://github.com/NVIDIA/cuda-python/actions/runs/24172970743/attempts/1?pr=1817)
- Direct failing job: [Test linux-64 / py3.12, 13.2.0, wheels, rtx4090, wsl](https://github.com/NVIDIA/cuda-python/actions/runs/24172970743/job/70549032768)

## Passing comparisons

- Later rerun of the same lane: [job 70558881026](https://github.com/NVIDIA/cuda-python/actions/runs/24176145731/job/70558881026)
- Nearby successful `main` run: [job 70467435657](https://github.com/NVIDIA/cuda-python/actions/runs/24147800390/job/70467435657)
- Another successful `main` run on the same lane: [job 70460149358](https://github.com/NVIDIA/cuda-python/actions/runs/24145709851/job/70460149358)

## Why this looks flaky

- The failure occurs in the benchmark smoke step, not in the regular `cuda.bindings` test suite.
- Nearby runs of the same WSL lane passed.
- The exception comes from `pyperf` rejecting a zero-duration sample, not from a CUDA API error.

## Possible follow-ups

- Increase the amount of work in the smoke benchmark, for example by raising `--loops` or using a positive `--min-time`.
- Retry the benchmark smoke step once before failing the whole job.
- Consider using different benchmark-smoke settings on WSL than on non-WSL Linux runners.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Flaky WSL `cuda.bindings` benchmark smoke test: `pyperf` raises `ValueError: benchmark function returned zero` #1885

Description

Failing example

Passing comparisons

Why this looks flaky

Possible follow-ups

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Flaky WSL cuda.bindings benchmark smoke test: pyperf raises ValueError: benchmark function returned zero #1885

Description

Description

Failing example

Passing comparisons

Why this looks flaky

Possible follow-ups

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Flaky WSL `cuda.bindings` benchmark smoke test: `pyperf` raises `ValueError: benchmark function returned zero` #1885