Run CI with PyTorch 2.0 #3192

eb8680 · 2023-03-20T13:50:06Z

This is a dummy PR to trigger a test run with the new PyTorch 2.0 release and see what breaks.

Log from running `pytest -v --stage=unit --tb=no -n auto` on my local machine (Ubuntu 22.04, Python 3.10.9, torch==2.0.0):

=========================================================================================================== warnings summary ============================================================================================================
pyro/generic.py:9: 24 warnings
  /home/eli/development/pyro/pyro/generic.py:9: DeprecationWarning: pyro.generic has moved to the pyroapi package
    warnings.warn("pyro.generic has moved to the pyroapi package", DeprecationWarning)

tests/distributions/test_stable.py:8: 24 warnings
  /home/eli/development/pyro/tests/distributions/test_stable.py:8: DeprecationWarning: Please use `IntegrationWarning` from the `scipy.integrate` namespace, the `scipy.integrate.quadpack` namespace is deprecated.
    from scipy.integrate.quadpack import IntegrationWarning

tests/ops/test_contract.py: 165 warnings
  /home/eli/development/pyro/pyro/ops/contract.py:433: DeprecationWarning: 'ubersum' is deprecated, use 'pyro.ops.contract.einsum' instead
    warnings.warn(

tests/infer/test_predictive.py::test_posterior_predictive_svi_auto_delta_guide[False]
tests/infer/test_predictive.py::test_posterior_predictive_svi_auto_delta_guide[True]
tests/infer/test_predictive.py::test_posterior_predictive_svi_auto_diag_normal_guide[False]
tests/infer/test_predictive.py::test_posterior_predictive_svi_one_hot
  /home/eli/development/pyro/pyro/infer/predictive.py:284: DeprecationWarning: The method `.get_samples` has been deprecated in favor of `.forward`.
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================================================================================================== short test summary info ========================================================================================================
FAILED tests/distributions/test_rejector.py::test_rejector[0.25-0.5] - AssertionError: bug in .rsample()
FAILED tests/distributions/test_stable.py::test_additive[1.01-0.1-0.9--0.5--0.9] - assert 0.04126178292409227 > 0.05
FAILED tests/distributions/test_stable.py::test_additive[0.99-0.1-0.9--0.5--0.5] - assert 0.03966362560959423 > 0.05
FAILED tests/distributions/test_stable.py::test_additive[1.01-0.1-0.9--0.5--0.5] - assert 0.03518876957795689 > 0.05
FAILED tests/distributions/test_stable.py::test_additive[0.99-0.1-0.9--0.5-0.5] - assert 0.03966362560959423 > 0.05
FAILED tests/distributions/test_stable.py::test_additive[1.01-0.1-0.9--0.5-0.9] - assert 0.04639733883233797 > 0.05
FAILED tests/distributions/test_stable.py::test_additive[0.99-0.1-0.9--0.5-0.9] - assert 0.04822817851247244 > 0.05
FAILED tests/distributions/test_stable.py::test_additive[0.5-0.1-0.9--0.5-0.5] - assert 0.03662861114079258 > 0.05
FAILED tests/infer/test_autoguide.py::test_exact[AutoMultivariateNormal] - UserWarning: operator() profile_node %627 : int[] = prim::profile_ivalue(%dims.20)
FAILED tests/nn/test_module.py::test_mixin_factory - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/nn/test_module.py::test_torch_serialize_decorators[True] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/nn/test_module.py::test_torch_serialize_decorators[False] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/nn/test_module.py::test_pyro_serialize - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/distributions/test_cuda.py::test_sample[LKJ] - assert False
FAILED tests/distributions/test_cuda.py::test_sample[LKJCholesky] - assert False
FAILED tests/infer/reparam/test_neutra.py::test_neals_funnel_smoke[AutoMultivariateNormal-True] - UserWarning: operator() profile_node %331 : int[] = prim::profile_ivalue(%dims.24)
FAILED tests/infer/test_autoguide.py::test_median[JitTrace_ELBO-AutoMultivariateNormal] - UserWarning: operator() profile_node %627 : int[] = prim::profile_ivalue(%dims.20)
FAILED tests/distributions/test_cuda.py::test_log_prob[LKJ] - RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
FAILED tests/distributions/test_cuda.py::test_log_prob[LKJCholesky] - RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
FAILED tests/infer/test_autoguide.py::test_median[JitTrace_ELBO-AutoStructured] - UserWarning: operator() profile_node %550 : int[] = prim::profile_ivalue(%dims.24)
FAILED tests/optim/test_optim.py::test_dynamic_lr[scheduler0] - UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will res...
FAILED tests/optim/test_optim.py::test_dynamic_lr[scheduler1] - UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will res...
FAILED tests/optim/test_optim.py::test_dynamic_lr[scheduler2] - UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will res...
FAILED tests/optim/test_optim.py::test_checkpoint[Adam-config0] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/optim/test_optim.py::test_checkpoint[ClippedAdam-config1] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/optim/test_optim.py::test_checkpoint[DCTAdam-config2] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/optim/test_optim.py::test_checkpoint[RMSprop-config3] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/optim/test_optim.py::test_checkpoint[SGD-config4] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/optim/test_optim.py::test_checkpoint[LambdaLR-config5] - UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will res...
FAILED tests/optim/test_optim.py::test_checkpoint[StepLR-config6] - UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will res...
FAILED tests/optim/test_optim.py::test_checkpoint[ExponentialLR-config7] - UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will res...
FAILED tests/optim/test_optim.py::test_checkpoint[ReduceLROnPlateau-config8] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/params/test_param.py::ParamStoreDictTests::test_save_and_load - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/reparam/test_stable.py::test_stable[LatentStableReparam-(4,)] - AssertionError: tensor([[-0.6467, -0.5359, -0.5765, -0.1425],
FAILED tests/infer/reparam/test_stable.py::test_symmetric_stable[(4,)] - AssertionError: tensor([-0.0403, -0.1768, -0.0293,  0.0101]) vs tensor([-0.0199,  0.1129, -0.0009, -0.0492])
FAILED tests/infer/reparam/test_stable.py::test_distribution[LatentStableReparam-0.1--0.5] - assert 0.04393955208216094 > 0.05
FAILED tests/infer/test_autoguide.py::test_serialization[AutoDelta-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[AutoDiagonalNormal-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[AutoMultivariateNormal-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[AutoNormal-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[AutoLowRankMultivariateNormal-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[auto_guide_list_x-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[auto_guide_module_callable-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[nested_auto_guide_callable-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[auto_class9-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[auto_class10-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[auto_class11-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[auto_class12-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[AutoStructured-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[AutoStructured_median-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[AutoGaussian-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[AutoNormalMessenger-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
FAILED tests/infer/test_autoguide.py::test_serialization[AutoHierarchicalNormalMessenger-nojit] - TypeError: cannot pickle 'weakref.ReferenceType' object
==================================================================== 53 failed, 19419 passed, 380 skipped, 376 xfailed, 2 xpassed, 217 warnings in 190.36s (0:03:10) ====================================================================

eb8680 · 2023-05-17T22:45:55Z

Seems like #3212 fixed most of the failures we saw before. From the logs, the only remaining failures are:

- A single mysterious AssertionError in Rejector.rsample()
- Several Stable-related tests are failing with what seem like small numerical errors, presumably because some PyTorch operator implementations changed. It might be fine to just tweak the tolerances on these tests.
- MNIST dataset changes in examples/air.py
- A few different TorchScript-related failures that may not be worth fixing given the effective deprecation of TorchScript in PyTorch 2.0.
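
For context on the `cannot pickle 'weakref.ReferenceType' object` failures in the original log, which #3212 addressed (its title is "Clear .unconstrained weakrefs before pickling; rebuild them more often"), here is a minimal stdlib-only sketch of the general pattern. The classes `Param`, `Holder`, and `PicklableHolder` and the attribute `backref` are hypothetical stand-ins, not Pyro's actual code, and this is not necessarily how #3212 implements the fix.

```python
import pickle
import weakref


class Param:
    """Hypothetical stand-in for an object that something else points back to."""

    def __init__(self, value):
        self.value = value


class Holder:
    """Hypothetical container that stores a weak back-reference as an attribute."""

    def __init__(self, param):
        self.param = param
        self.backref = weakref.ref(param)  # weakref objects cannot be pickled


class PicklableHolder(Holder):
    """Same container, but it clears the weakref before pickling and rebuilds it after."""

    def __getstate__(self):
        state = self.__dict__.copy()
        state["backref"] = None  # drop the unpicklable weakref from the saved state
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.backref = weakref.ref(self.param)  # rebuild against the restored object


def try_pickle(obj):
    """Round-trip obj through pickle; return None if pickling raises TypeError."""
    try:
        return pickle.loads(pickle.dumps(obj))
    except TypeError:
        return None


if __name__ == "__main__":
    print(try_pickle(Holder(Param(1.0))))           # weakref in state, pickling fails
    restored = try_pickle(PicklableHolder(Param(1.0)))
    print(restored.backref() is restored.param)     # weakref rebuilt after loading
```

The design choice in this sketch is to treat the weak reference as derived state: it is never serialized, only reconstructed from the strongly held object on load.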

martinjankowiak · 2023-05-17T22:54:54Z

@eb8680 what/where's the rejector error? i don't see anything informative in the logs...

eb8680 · 2023-05-17T22:58:58Z

In the integration_1 stage: https://github.com/pyro-ppl/pyro/actions/runs/5008196842/jobs/8975868828?pr=3192#step:5:23797

FAILED tests/distributions/test_rejector.py::test_rejector[0.25-0.5] - AssertionError: bug in .rsample()

I don't see a more detailed traceback further up in the log, so I'm not quite sure what's going on.

martinjankowiak · 2023-05-17T23:15:59Z

i see. i think we can safely put that and the stable errors down to flakiness resulting from small numerical differences in ops
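
As a rough back-of-the-envelope check on the flakiness argument, the sketch below assumes that the logged `assert 0.041... > 0.05` checks compare a goodness-of-fit p-value against 0.05 (the log suggests this but does not confirm it) and that the parametrized cases behave like independent draws. Under those assumptions a correct sampler still fails each case with probability 0.05, so any change in the random stream, for example from changed operator implementations, can flip a few cases. The function name `prob_any_failure` is made up for illustration.

```python
def prob_any_failure(alpha: float, n_cases: int) -> float:
    """Probability that at least one of n_cases independent checks fails,
    when each check fails with probability alpha even for a correct sampler."""
    return 1.0 - (1.0 - alpha) ** n_cases


if __name__ == "__main__":
    # Assumed threshold of 0.05, taken from the assertions in the log.
    for n in (1, 10, 50):
        print(n, round(prob_any_failure(0.05, n), 3))
```

With fixed seeds the tests are deterministic for a given PyTorch version, so this only describes what can happen when the underlying samples change between versions.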

martinjankowiak · 2023-05-22T01:33:18Z

@fritzo i made the rejector test pass and some of the stable tests but some of the stable tests are still failing... maybe you can take a look?

martinjankowiak · 2023-05-23T23:14:52Z

@eb8680 @fritzo should we just xfail the remaining tests and cut a release? the jit problems could be tricky to fix (?), and the stable test failures are probably just indicative of mundane numeric instabilities encountered with these distributions as opposed to a serious numerical regression

eb8680 · 2023-05-24T11:27:41Z

@martinjankowiak that would be fine with me.

fritzo · 2023-05-24T15:57:15Z

I'm fine xfailing. The Stable tests work on my machine.
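
For readers unfamiliar with xfail, a minimal sketch of the two usual ways to mark expected failures in pytest follows. The test names, the reason string, and the p-value check are hypothetical; the commits in this PR may mark the tests differently.

```python
import pytest

# Hypothetical shared marker; strict=False means an unexpected pass is reported
# as XPASS instead of failing the run.
XFAIL_TORCH2 = pytest.mark.xfail(
    reason="hypothetical: fails under PyTorch 2.0, tracked in a separate issue",
    strict=False,
)


@XFAIL_TORCH2
def test_whole_function_xfail():
    # Every invocation of this test is expected to fail.
    assert 0.0413 > 0.05


@pytest.mark.parametrize(
    "p_value",
    [
        0.20,  # this case is still expected to pass
        pytest.param(0.0413, marks=XFAIL_TORCH2),  # only this case is xfailed
    ],
)
def test_single_case_xfail(p_value):
    assert p_value > 0.05
```

Marking individual parameters keeps the passing cases as real regression checks, while whole-function marks are simpler when every case fails.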

fritzo

Happy to merge when tests pass.

I'd like to merge #3220 before releasing.

Trigger CI with PyTorch 2.0

e1e0fbc

eb8680 added testing Blocked labels Mar 20, 2023

eb8680 added 3 commits March 20, 2023 09:57

python 3.8

9aa2a38

unpin in ci.yml

87c1ba8

repin

43080f3

eb8680 mentioned this pull request May 17, 2023

Clear .unconstrained weakrefs before pickling; rebuild them more often #3212

Merged

Bump torch minor version to rerun CI

98281fe

Merge branch 'dev' into pytorch-2-ci

c2b8614

This was referenced May 18, 2023

Some distribution tests fail under PyTorch 2.0 #3214

Open

Example smoke test failures under PyTorch 2.0 #3215

Closed

TorchScript-related test failures under PyTorch 2.0 #3216

Open

Martin Jankowiak added 3 commits May 18, 2023 19:15

add sep keyword to read_csv in baseball.py

a0522db

attempt to fix multi_mnist with dtype=object kwarg

75d1918

try vectorizer.get_feature_names -> vectorizer.get_feature_names_out in prodlda.ipynb

8724b80

Martin Jankowiak added 2 commits May 21, 2023 15:07

update thresholds in test_rejector.py and test_stable.py

abdb4fb

adjust stable/rejector thresholds more

a11fe92

fritzo and others added 2 commits May 21, 2023 19:52

Slightly relax tolerance on StableReparam test

97335b8

increase steps in flaky minipyro jit test

8eb5f3e

fritzo added 2 commits May 27, 2023 13:07

Mark xfailing tests

9c4055a

Mark cevae tests xfail

ecf0523

Mark LatentStable tests xfail

ec788d4

fritzo approved these changes May 27, 2023

fritzo merged commit 89a56ea into dev May 28, 2023
9 checks passed

francois-rozet mentioned this pull request Jul 13, 2023

pyro 1.8.5 requires torch 2.0.1 #3239

Closed

hmaarrfk mentioned this pull request Jul 24, 2023

pyro-ppl v1.8.5 conda-forge/pyro-ppl-feedstock#11

Closed

3 tasks

fritzo deleted the pytorch-2-ci branch August 10, 2023 17:37
