
[OPT] enable strict_xfail in pytest or remove memory leak tests #6475

@gforsyth

Description

Every Python run of conda-python-tests-singlegpu executes a set of "memory leak" pytests.
These take anywhere from 16 to 18 minutes to complete, and there is currently no useful signal derived from them.

One example from earlier today:

https://github.com/rapidsai/cuml/actions/runs/13986886321/job/39162102022#step:9:2140

RAPIDS logger » [03/21/25 08:43:13]
┌───────────────────────────┐
|    memory leak pytests    |
└───────────────────────────┘

/opt/conda/envs/test/lib/python3.12/site-packages/pytest_benchmark/logger.py:39: PytestBenchmarkWarning: Benchmarks are automatically disabled because xdist plugin is active.Benchmarks cannot be performed reliably in a parallelized environment.
  warner(PytestBenchmarkWarning(text))
============================= test session starts ==============================
platform linux -- Python 3.12.9, pytest-7.4.4, pluggy-1.5.0
benchmark: 5.1.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /__w/cuml/cuml/python/cuml
configfile: pyproject.toml
plugins: xdist-3.6.1, hypothesis-6.129.3, cases-3.8.6, benchmark-5.1.0, cov-6.0.0
created: 1/1 worker
1 worker [676 items]

xxxxxxssssssssssssssssssssssssssssssssssssssssssssssssXXXXXXXXXXXXXXXXXX [ 10%]
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX [ 21%]
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX [ 31%]
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXsssssssssssssssssssssssssssssssss [ 42%]
ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 53%]
ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 63%]
ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 74%]
ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 85%]
ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss [ 95%]
sssssssssxssssssssxssssssssx                                             [100%]
---- generated xml file: /__w/cuml/cuml/test-results/junit-cuml-memleak.xml ----

---------- coverage: platform linux, python 3.12.9-final-0 -----------
Coverage XML written to file /__w/cuml/cuml/coverage-results/cuml-memleak-coverage.xml

=========== 466 skipped, 9 xfailed, 201 xpassed in 966.75s (0:16:06) ===========

Every test in there is either skipped, xfailed, or actually XPASSing.

If half the XPASSing tests started xfailing again, there would be no indication of that regression unless someone noticed the lower-case x marks.

I don't think this (quite long) test bundle is accomplishing anything, and I would recommend one of:

  1. Ripping it all out and saving 18 minutes per worker per run.
  2. Enabling strict xfail in pytest so that XPASSing tests are reported as failures, then removing the defunct xfail markers.
  3. Starting from scratch with a well-curated set of ideas / failure modes that we want to test.
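Option 2 can be applied suite-wide rather than per marker. Since the log above shows `configfile: pyproject.toml`, a sketch of the change would be setting pytest's `xfail_strict` option there (exact section placement in cuml's file is an assumption):

```toml
[tool.pytest.ini_options]
xfail_strict = true
```

Individual markers can still opt out with `@pytest.mark.xfail(strict=False)` where a flaky failure is genuinely expected.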

Labels: Tech Debt, ci, tests
