Improvement cuStabilizer DEM sampling backend and consolidate tests by kvmto · Pull Request #32 · NVIDIA/Ising-Decoding

kvmto · 2026-03-31T19:03:46Z

Summary

Remove the torch fallback path from dem_sampling.py, making cuQuantum's cuStabilizer BitMatrixSampler the sole DEM sampling backend. This eliminates the USE_CUSTAB env-var toggle, the _use_custab() helper, and the custab_matrix_sampling() indirection — dem_sampling() now calls cuST directly.
Simplify the sampler cache: replace the id(H) address-based cache key with a proper is-identity check (_cached_H is not H) and pre-cache H.T to avoid repeated transposes.
Consolidate test files: merge test_dem_sampling_custab.py and test_dem_sampling_integration.py into test_dem_sampling.py, removing the now-unnecessary fallback-specific tests.
Add cuquantum>=26.3.0 as an explicit dependency in requirements_public_train.txt.
Fix CI to install from requirements_public_train.txt (which includes inference deps) instead of requirements_public_inference.txt alone.
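
The cache change in the second item can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in dem_sampling.py: the class name `TransposeCache` and its attributes are hypothetical, and plain nested lists stand in for the real matrix type. Holding a strong reference to `H` and comparing with `is` avoids a pitfall of `id(H)` keys: in CPython an `id` can be reused once the original object has been garbage-collected.

```python
class TransposeCache:
    """Hypothetical identity-keyed cache for a matrix and its transpose."""

    def __init__(self):
        self._cached_H = None   # strong reference keeps the key object alive
        self._cached_HT = None  # transpose computed once per distinct H
        self.rebuilds = 0       # counts cache misses, for illustration only

    def get(self, H):
        # Identity check: rebuild only when a different object is passed in.
        if self._cached_H is not H:
            self._cached_H = H
            self._cached_HT = [list(col) for col in zip(*H)]
            self.rebuilds += 1
        return self._cached_HT


cache = TransposeCache()
H = [[1, 0, 1], [0, 1, 1]]
first = cache.get(H)
second = cache.get(H)               # same object: cache hit
third = cache.get([r[:] for r in H])  # equal contents, new object: cache miss
```

An identity check is deliberately stricter than an equality check: an equal but distinct matrix triggers a rebuild, which trades an occasional redundant transpose for never comparing matrix contents.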

Test plan

CI GPU tests pass (test_dem_sampling.py covers CPU, GPU, statistical, integration, and cuST-vs-torch deterministic cross-checks)
Verify cuquantum>=26.3.0 resolves correctly in the CI environment
Confirm no regressions in training pipeline end-to-end
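
For the version check in the second item, a small script along these lines could be run in the CI environment. It is a sketch, not part of this PR: the helper names are hypothetical, the distribution name `cuquantum-python` is taken from the PyPI link later in this thread (a CUDA-specific wheel would be registered under a different name), and the parser only compares the leading numeric release components. The `packaging` library is the more robust choice when it is available.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Leading numeric release components, e.g. '26.3.0.post0' -> (26, 3, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """True if the installed release is at least the minimum release."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '26.3' compares equal to '26.3.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(dist_name):
    """Installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("cuquantum-python")  # assumed distribution name
ok = found is not None and meets_minimum(found, "26.3.0")
```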

Remove the torch fallback path from dem_sampling.py; cuquantum's BitMatrixSampler is now the only sampling backend, simplifying the module and eliminating the USE_CUSTAB toggle. The sampler cache uses identity-based comparison with a pre-cached transpose to avoid redundant reallocation. Merge test_dem_sampling_custab.py and test_dem_sampling_integration.py into test_dem_sampling.py for a single, comprehensive test suite.

Also:
- Add cuquantum>=26.3.0 to requirements_public_train.txt
- Fix CI to install train (not inference) requirements for GPU tests
- Apply yapf formatting (Google style, 100-col limit)

Signed-off-by: kvmto <kmato@nvidia.com>

…lure

The cuquantum meta-package fails to build in environments where pkg_resources is unavailable. Pin the CUDA-12 specific wheel directly to bypass the broken auto-detection setup.py.

Signed-off-by: kvmto <kmato@nvidia.com>
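
As a rough illustration of what pinning a CUDA-specific wheel involves, the sketch below maps a CUDA version string to a requirement line. The function name is hypothetical. The `cuquantum-python-cu12` name comes from the squashed commit message further down; the `cu13` variant is assumed by analogy with the cu12/cu13 split discussed in this thread and should be confirmed on PyPI before use.

```python
def cuda_specific_requirement(cuda_version, minimum="26.3.0"):
    """Build a pip requirement line for a CUDA-major-specific wheel.

    Assumption: wheels are named 'cuquantum-python-cu<major>'; only the
    cu12 name is confirmed by this PR.
    """
    major = int(cuda_version.split(".")[0])
    if major not in (12, 13):
        raise ValueError(f"no known wheel for CUDA major version {major}")
    return f"cuquantum-python-cu{major}>={minimum}"


requirement = cuda_specific_requirement("12.4")
```

Choosing the wheel explicitly sidesteps the meta-package's build-time detection, at the cost of maintaining one requirements file per CUDA major version.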

ivanbasov

CI failures root cause

Two related issues stem from the same design decision: making cuQuantum's BitMatrixSampler the sole backend without adding skip guards in the tests.

Issue 1 — Unit-test CI (CPU runners): `ModuleNotFoundError: No module named 'cuquantum'`

dem_sampling() now does a hard import of cuquantum.stabilizer unconditionally at call time (line 81 of the new dem_sampling.py):

def dem_sampling(...):
    from cuquantum.stabilizer.dem_sampling import BitMatrixSampler  # <-- always runs
    from cuquantum.stabilizer.simulator import Options

The CPU CI runner doesn't have cuquantum installed, so every test that calls dem_sampling() crashes immediately. Affected test classes in test_dem_sampling.py:

TestBitMatrixSamplerIsUsed — imports cuquantum directly, no skip guard
TestDemSamplingCPU — calls dem_sampling(), no skip guard
TestDEMSamplingIntegration — calls generate_batch() → dem_sampling(), no skip guard
TestCustVsTorchDeterministic — calls dem_sampling(), no skip guard
TestCustVsTorchStatistical.test_small_matrix_cpu — calls dem_sampling(), no skip guard

The old test_dem_sampling_custab.py handled this correctly with @unittest.skipUnless(_custab_available(), ...) at class level. That decorator was dropped when the files were consolidated.

Issue 2 — GPU CI runner: `ModuleNotFoundError: No module named 'cuquantum.stabilizer'`

The GPU runner has cuquantum installed, but in an older version that predates the stabilizer subpackage (which requires >=26.3.0). The same test classes fail for the same reason: no skip guard catches the import error.
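
The two failure modes (package missing entirely versus package present without the subpackage) can both be detected without importing the target module. The helper below is a sketch with a hypothetical name; it relies on `importlib.util.find_spec`, which imports parent packages and raises `ModuleNotFoundError` when a parent is absent, so that case is caught explicitly.

```python
import importlib.util


def submodule_available(name):
    """True if the dotted module name can be located, False otherwise."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # ImportError covers ModuleNotFoundError for a missing parent package.
        return False


has_package = submodule_available("cuquantum")
has_stabilizer = submodule_available("cuquantum.stabilizer")
```

On the CPU runner described in Issue 1 both flags would be expected to be False; on the GPU runner described in Issue 2 only the second would be.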

Suggested fix

The cleanest approach is to move the cuquantum import to module level with a try/except (restoring _CUSTAB_AVAILABLE), and add @unittest.skipUnless(_dem_mod._CUSTAB_AVAILABLE, "cuquantum>=26.3.0 not available") to all five affected test classes. The old code on main already does this correctly in _custab_available() / _CUSTAB_AVAILABLE — the pattern just needs to be carried forward.

# dem_sampling.py — module level
try:
    from cuquantum.stabilizer.dem_sampling import BitMatrixSampler
    from cuquantum.stabilizer.simulator import Options
    _CUSTAB_AVAILABLE = True
except ImportError:
    BitMatrixSampler = None
    Options = None
    _CUSTAB_AVAILABLE = False

# test_dem_sampling.py
@unittest.skipUnless(_dem_mod._CUSTAB_AVAILABLE, "cuquantum>=26.3.0 (stabilizer) not available")
class TestDemSamplingCPU(unittest.TestCase): ...

@unittest.skipUnless(_dem_mod._CUSTAB_AVAILABLE, "cuquantum>=26.3.0 (stabilizer) not available")
class TestBitMatrixSamplerIsUsed(unittest.TestCase): ...

# ... same for TestDEMSamplingIntegration, TestCustVsTorchDeterministic, TestCustVsTorchStatistical

This makes all five classes skip gracefully on CPU CI (no cuquantum) and on GPU CI until the runner is updated to cuquantum>=26.3.0, while still running correctly when the right version is present.

Otherwise, looks good!

Since custabilizer-cuXY is not a viable standalone package, there is no need to try to make CUDA major version specific files. Rather, we just rely on the auto detection logic in cuquantum-python.

bmhowe23 · 2026-04-01T01:47:54Z

Note: CI passed with 3b8015d, so we could go with that. However, I prefer one commit later (ce055a2). It is currently failing the CI, but I think the cuquantum team is publishing a change that would fix that.

This reverts commit ce055a2ddcb735f43dc7bd7a21e15e3bcc64dd09.

bmhowe23 · 2026-04-01T14:34:11Z

The cuquantum issue won't have a published fix until April 10th, so I reverted ce055a2.

bmhowe23 · 2026-04-01T22:35:00Z

OK, they overachieved, and it is ready now: https://pypi.org/project/cuquantum-python/26.3.0.post0/

I would like to re-revert ce055a2 (i.e. revert 05e92f8). I will let the CI for Kevin's changes finish before doing that, though. It will be much cleaner not to have the cu12/cu13 split.

This reverts commit 05e92f842a6b01cf1b327b4162e5881d0b951f58.

bmhowe23 · 2026-04-01T22:44:49Z

The CI is failing on cu12/cu13 auto-detection again, this time with a new error. It looks like this will have to wait for another day. I will revert the most recent commit (84c814b) and once it passes the CI, I propose we merge it.

…ts files""" This reverts commit 84c814baa4b5d9911506d74b5cf03c7f12128785.

kvmto · 2026-04-01T22:46:24Z

agree

)

* Make cuStabilizer the sole DEM sampling backend and consolidate tests

Remove the torch fallback path from dem_sampling.py; cuquantum's BitMatrixSampler is now the only sampling backend, simplifying the module and eliminating the USE_CUSTAB toggle. The sampler cache uses identity-based comparison with a pre-cached transpose to avoid redundant reallocation. Merge test_dem_sampling_custab.py and test_dem_sampling_integration.py into test_dem_sampling.py for a single, comprehensive test suite.

Also:
- Add cuquantum>=26.3.0 to requirements_public_train.txt
- Fix CI to install train (not inference) requirements for GPU tests
- Apply yapf formatting (Google style, 100-col limit)

Signed-off-by: kvmto <kmato@nvidia.com>

* fixed license

Signed-off-by: kvmto <kmato@nvidia.com>

* fix: use cuquantum-python-cu12 wheel to avoid pkg_resources build failure

The cuquantum meta-package fails to build in environments where pkg_resources is unavailable. Pin the CUDA-12 specific wheel directly to bypass the broken auto-detection setup.py.

Signed-off-by: kvmto <kmato@nvidia.com>

* lazy imports for safe separation between training and inference

Signed-off-by: kvmto <kmato@nvidia.com>

* quick fix to CI

Signed-off-by: kvmto <kmato@nvidia.com>

* route cuQuantum dem_sampling tests to GPU CI

Signed-off-by: kvmto <kmato@nvidia.com>

* left behind change

Signed-off-by: kvmto <kmato@nvidia.com>

* missing bash session

Signed-off-by: kvmto <kmato@nvidia.com>

* Make CUDA major version specific requirements files and use custabilizer-cuXY

* Revert some changes to test files that are hopefully no longer needed

* Revert REQUIRE_CUQUANTUM changes

* Change custabilizer version to 0.3.0

* Change custabilizer back to cuquantum-python

* Skip test_dem_sampling.py if required deps are not present

* Try again

* Skip a few more tests if cuquantum-python not installed

* Revert CUDA major version specific requirements files

Since custabilizer-cuXY is not a viable standalone package, there is no need to try to make CUDA major version specific files. Rather, we just rely on the auto detection logic in cuquantum-python.

* Revert "Revert CUDA major version specific requirements files"

This reverts commit ce055a2.

* small torch device object bug fix for nccl

Signed-off-by: kvmto <kmato@nvidia.com>

* overcome custab device id limitation

Signed-off-by: kvmto <kmato@nvidia.com>

* added tiny logging

Signed-off-by: kvmto <kmato@nvidia.com>

* linted

Signed-off-by: kvmto <kmato@nvidia.com>

* Revert "Revert "Revert CUDA major version specific requirements files""

This reverts commit 05e92f8.

* Revert "Revert "Revert "Revert CUDA major version specific requirements files"""

This reverts commit 84c814b.

---------

Signed-off-by: kvmto <kmato@nvidia.com>
Co-authored-by: Ben Howe <bhowe@nvidia.com>

kvmto requested review from bmhowe23 and ivanbasov March 31, 2026 19:03

bmhowe23 reviewed Mar 31, 2026

Comment thread code/qec/dem_sampling.py Outdated

kvmto and others added 8 commits March 31, 2026 19:39

fixed license

9e26338

Signed-off-by: kvmto <kmato@nvidia.com>

lazy imports for safe separation between training and inference

0815796

Signed-off-by: kvmto <kmato@nvidia.com>

quick fix to CI

5d4b98a

Signed-off-by: kvmto <kmato@nvidia.com>

route cuQuantum dem_sampling tests to GPU CI

485ea80

Signed-off-by: kvmto <kmato@nvidia.com>

left behind change

f7d7349

Signed-off-by: kvmto <kmato@nvidia.com>

missing bash session

e8b30d6

Signed-off-by: kvmto <kmato@nvidia.com>

Make CUDA major version specific requirements files and use custabilizer-cuXY

a2f559e

bmhowe23 reviewed Mar 31, 2026

Comment thread .github/workflows/ci-gpu.yml Outdated

Revert some changes to test files that are hopefully no longer needed

96d37ba

bmhowe23 reviewed Mar 31, 2026

Comment thread code/scripts/check_python_compat.sh

bmhowe23 added 2 commits March 31, 2026 15:47

Revert REQUIRE_CUQUANTUM changes

5517d84

Change custabilizer version to 0.3.0

f3d6ff3

ivanbasov reviewed Mar 31, 2026

Comment thread .github/workflows/ci-gpu.yml Outdated

ivanbasov reviewed Mar 31, 2026

bmhowe23 added 5 commits March 31, 2026 17:22

Change custabilizer back to cuquantum-python

3bd1740

Skip test_dem_sampling.py if required deps are not present

ee15114

Try again

a43279e

Skip a few more tests if cuquantum-python not installed

3b8015d

Revert CUDA major version specific requirements files

ce055a2

Since custabilizer-cuXY is not a viable standalone package, there is no need to try to make CUDA major version specific files. Rather, we just rely on the auto detection logic in cuquantum-python.

Revert "Revert CUDA major version specific requirements files"

05e92f8

This reverts commit ce055a2ddcb735f43dc7bd7a21e15e3bcc64dd09.

ivanbasov mentioned this pull request Apr 1, 2026

feat(decoder_ablation): TRT pre-decoder backend + cudaq-qec docs and tests #34

Merged

3 tasks

ivanbasov approved these changes Apr 1, 2026

small torch device object bug fix for nccl

91b0fa3

Signed-off-by: kvmto <kmato@nvidia.com>

kvmto added 3 commits April 1, 2026 22:10

overcome custab device id limitation

0e7150e

Signed-off-by: kvmto <kmato@nvidia.com>

added tiny logging

5a5da42

Signed-off-by: kvmto <kmato@nvidia.com>

linted

894e508

Signed-off-by: kvmto <kmato@nvidia.com>

Revert "Revert "Revert CUDA major version specific requirements files""

84c814b

This reverts commit 05e92f842a6b01cf1b327b4162e5881d0b951f58.

Revert "Revert "Revert "Revert CUDA major version specific requirements files"""

413d1f8

This reverts commit 84c814baa4b5d9911506d74b5cf03c7f12128785.

kvmto merged commit 5aeebdf into NVIDIA:main Apr 1, 2026
13 checks passed

bmhowe23 deleted the custab_int_clean branch April 1, 2026 23:02

kvmto merged 24 commits into NVIDIA:main from kvmto:custab_int_clean

3 participants
