[CI] Fix olddeps / opt-deps / gym smoke tests broken by #3738 and #3704 #3739
Merged
Conversation
* rnn.py: gate ``_has_torch_scan`` on ``torch.__version__ >= 2.6`` instead of
``importlib.util.find_spec("torch._higher_order_ops.scan")``. ``find_spec``
eagerly imports the (missing) ``torch._higher_order_ops`` parent on
torch < 2.4 and raises ``ModuleNotFoundError`` instead of returning ``None``,
which broke every gym smoke test at import time (torch 2.0.1 stack).
* olddeps install.sh: install tensordict from source on PR / nightly runs and
from PyPI only on ``release/*`` branches, matching every other CI job
(``linux/scripts/run_all.sh``, ``linux_distributed/scripts/install.sh``).
The previous ``tensordict>=0.12.0,<0.13.0`` pin (introduced in #3266 to
guard against tensordict main dropping Python 3.10) froze us below the
``generator`` kwarg added in tensordict #1689, which #3704 forwards from
``SafeProbabilisticModule``, causing 3660 ``TypeError`` failures.
* test-linux.yml: drop the now-unused ``TORCHRL_TENSORDICT_SPEC`` export.
* test_distributions.py: ``test_tanhdelta_inv_ones`` was flaky on CUDA
float32 -- with 4M unseeded ``randn`` samples a handful of draws had
``|x|`` past ``atanh(1 - finfo.resolution) ≈ 7.25`` and could not roundtrip
through ``SafeTanhTransform``. Seed for determinism and clamp inputs into
the non-saturated region.
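The gate described in the first bullet can be sketched as below. ``has_torch_scan`` and ``_parse`` are hypothetical standalone helpers; the actual ``rnn.py`` code compares with ``packaging.version.parse``, for which the tuple parser here is a minimal stand-in:

```python
def _parse(v: str) -> tuple:
    # Minimal stand-in for packaging.version.parse: keep only the leading
    # numeric release segment ("2.7.0.dev20250101+cu121" -> (2, 7, 0)).
    parts = []
    for p in v.split("+")[0].split("."):
        if not p.isdigit():
            break
        parts.append(int(p))
    return tuple(parts)


def has_torch_scan(torch_version: str) -> bool:
    # Gate on the version string instead of probing the submodule:
    # find_spec("torch._higher_order_ops.scan") imports the parent package
    # first, so on torch < 2.4 (where torch._higher_order_ops is absent) it
    # raises ModuleNotFoundError instead of returning None.
    return _parse(torch_version) >= (2, 6)
```

Probing a dotted name with ``find_spec`` is only safe when every parent package already exists, which is exactly what the torch 2.0.1 job violated.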
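The install.sh bullet's branch logic might look roughly like this; the function name and the release/pip arguments are illustrative, not the exact script contents:

```shell
install_tensordict() {
  # $1: "1" on release/* branches, "0" on PR / nightly runs
  # $2: pip command (pass "echo" for a dry run)
  release="$1"
  pip="${2:-pip}"
  if [ "$release" = "1" ]; then
    # release/* branches: stable tensordict wheel from PyPI
    "$pip" install tensordict
  else
    # PR / nightly runs: build tensordict from source so changes on main
    # (such as the generator kwarg) are available before a PyPI release
    "$pip" install "pybind11[global]"
    "$pip" install git+https://github.com/pytorch/tensordict.git
  fi
}

# Dry-run demo: print the commands instead of executing pip.
install_tensordict 0 echo
```

The ``pybind11[global]`` line matters only on the source-build path, which is why dropping the source build in #3266 also dropped that install.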
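The saturation bound in the last bullet checks out numerically. This sketch (with a hypothetical helper name, and 1e-6 standing in for ``torch.finfo(torch.float32).resolution``) reproduces the ≈ 7.25 figure and the clamp:

```python
import math

# float32 resolution (torch.finfo(torch.float32).resolution) is 1e-6.
FLOAT32_RESOLUTION = 1e-6


def clamp_into_roundtrip_region(x: float, resolution: float = FLOAT32_RESOLUTION) -> float:
    # Beyond atanh(1 - resolution), tanh(x) rounds to 1.0 at that resolution,
    # so atanh(tanh(x)) can no longer recover x.
    bound = math.atanh(1.0 - resolution)  # ~= 7.2543 for float32
    return max(-bound, min(bound, x))
```

The actual test fix also seeds the RNG, so the 4M-sample draw is deterministic on top of being clamped.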
Same bug pattern as the ``torch._higher_order_ops.scan`` probe: Triton 2.0
(shipped with torch 2.0.1 on the gym smoke matrix) lacks
``triton.language.extra``, so ``find_spec("triton.language.extra.libdevice")``
eagerly imports the missing parent and raises ``ModuleNotFoundError`` at
torchrl import time. Read the version from package metadata and gate on
``triton >= 2.2`` instead. Applied identically in ``rnn.py`` and
``_rnn_triton.py``.
Summary
Three currently red CI jobs on every PR / main:
* ``ModuleNotFoundError: No module named 'torch._higher_order_ops'``. Caused by
  [Feature][Performance] Triton backend for GRU / LSTM with intermediate resets #3738,
  which added ``_has_torch_scan = importlib.util.find_spec("torch._higher_order_ops.scan")``
  to ``rnn.py``. The job installs torch 2.0.1, and ``find_spec`` eagerly imports
  the missing parent and raises instead of returning ``None``.
* ``TypeError: ProbabilisticTensorDictModule.__init__() got an unexpected keyword argument 'generator'``.
  [Feature] Forward generator kwarg through ProbabilisticActor #3704 unconditionally
  forwards ``generator=`` from ``SafeProbabilisticModule`` to the tensordict parent
  class, but the olddeps job was pinned to ``tensordict>=0.12.0,<0.13.0`` (introduced
  in [BugFix] Fix old pytorch dependencies #3266 to guard against tensordict main
  dropping older Pythons), and that release line does not include the ``generator``
  kwarg added in tensordict [BugFix] MaxValueWriter cuda compatibility #1689.
* ``test/test_distributions.py::TestDelta::test_tanhdelta_inv_ones``: 4M unseeded
  float32 ``randn`` values occasionally exceed ``atanh(1 - finfo.resolution) ≈ 7.25``,
  so ``SafeTanhTransform`` saturates and the inv∘fwd roundtrip can't recover the input.
Changes
* ``torchrl/modules/tensordict_module/rnn.py`` — gate ``_has_torch_scan`` on
  ``version.parse(torch.__version__) >= version.parse("2.6.0")`` (matching the
  existing ``@implement_for("torch", "2.6.0", ...)`` decorator on ``_scan``).
* ``.github/unittest/linux_olddeps/scripts_gym_0_13/install.sh`` — install
  tensordict from source on PR/nightly runs, from PyPI only on ``release/*``,
  matching ``linux/scripts/run_all.sh`` and ``linux_distributed/scripts/install.sh``.
  Restores the ``pybind11[global]`` install needed by the source build.
* ``.github/workflows/test-linux.yml`` — drop the now-unused
  ``TORCHRL_TENSORDICT_SPEC`` export.
* ``test/test_distributions.py`` — seed and clamp inputs in ``test_tanhdelta_inv_ones``
  so the test stays inside the SafeTanh roundtrip region.
Trade-off
Going back to source builds for olddeps re-exposes the brittleness #3266 was
guarding against: the day tensordict main drops Python 3.10 the olddeps job
breaks until we bump the Python version or temporarily re-pin. tensordict main
currently still declares ``requires-python = ">=3.10"``, so this resolves today.
Test plan
* ``unittests-gym`` passes (every gym version)
* ``tests-olddeps`` passes (no more 3660 ``TypeError`` failures)
* ``tests-optdeps`` passes (no flake in ``test_tanhdelta_inv_ones``)