simple v *= v_scale error #46820
Comments
Do you get a warning like the following? main:1: UserWarning: Output 0 of SplitBackward is a view and is being modified inplace. This view is an output of a function that returns multiple views. Inplace operators on such views are being deprecated and will be forbidden starting from version 1.8. Consider using `unsafe_` version of the function that produced this view or don't modify this view inplace.
Yes, the following is the warning; what could the problem be? UserWarning: Output 1 of SplitBackward is a view and is being modified inplace. This view is an output of a function that returns multiple views. Inplace operators on such views are being deprecated and will be forbidden starting from version 1.8. Consider using `unsafe_` version of the function that produced this view or don't modify this view inplace.
Thanks for the hints. I just changed the following and it works now:
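For reference, a minimal sketch of the kind of change that resolves this warning, assuming `v` is a view returned by `torch.split` as in this issue (the shapes and variable names here are illustrative): replace the in-place multiply on the view with an out-of-place one.

```python
import torch

x = torch.randn(2, 4, requires_grad=True)
# torch.split returns views of x; modifying one of them in-place
# is the pattern that triggers the SplitBackward warning/error.
u, v = torch.split(x, 2, dim=1)
v_scale = 3.2

# Problematic (deprecated in 1.7, an error from 1.8):
#   v *= v_scale
# Out-of-place replacement that autograd handles correctly:
v = v * v_scale

v.sum().backward()
# x.grad is v_scale on the columns that went into v, and 0 elsewhere.
```

The out-of-place form allocates a new tensor instead of mutating the view, so SplitBackward no longer sees an in-place modification of one of its outputs.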
related: huggingface/transformers#8022
Should we be throwing a better error message?
After discussing it on slack, the right solution here is most likely to finish the deprecation cycle for the view/inplace behavior. This will make it have the proper error message and not an internal assert error.
Hi, I found that I get very bad results compared to the original in-place operation in PyTorch 1.6.0. Original code: vs. new code: I would expect the same results from these two snippets, wouldn't I?
Unfortunately, the old version was not doing the right thing and so was silently returning wrong gradients. This is why we are making this BC-breaking change: to prevent people from doing that.
Fixes this issue: pytorch/pytorch#46820

I came across this when I was running the code with pytorch==1.7, getting this error message (and this change would fix the issue):

/home/iman/projs/NVAE/distributions.py:31: UserWarning: Output 0 of SplitBackward is a view and is being modified inplace. This view is an output of a function that returns multiple views. Inplace operators on such views are being deprecated and will be forbidden starting from version 1.8. Consider using `unsafe_` version of the function that produced this view or don't modify this view inplace. (Triggered internally at /pytorch/torch/csrc/autograd/variable.cpp:491.)
  self.mu = soft_clamp5(mu)
/home/iman/projs/NVAE/distributions.py:32: UserWarning: Output 1 of SplitBackward is a view and is being modified inplace. This view is an output of a function that returns multiple views. Inplace operators on such views are being deprecated and will be forbidden starting from version 1.8. Consider using `unsafe_` version of the function that produced this view or don't modify this view inplace. (Triggered internally at /pytorch/torch/csrc/autograd/variable.cpp:491.)
  log_sigma = soft_clamp5(log_sigma)
Traceback (most recent call last):
  File "train.py", line 415, in <module>
    init_processes(0, size, main, args)
  File "train.py", line 281, in init_processes
    fn(args)
  File "train.py", line 92, in main
    train_nelbo, global_step = train(train_queue, model, cnn_optimizer, grad_scalar, global_step, warmup_iters, writer, logging)
  File "train.py", line 164, in train
    logits, log_q, log_p, kl_all, kl_diag = model(x)
  File "/home/iman/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/iman/projs/NVAE/model.py", line 358, in forward
    dist = Normal(mu_q, log_sig_q)  # for the first approx. posterior
  File "/home/iman/projs/NVAE/distributions.py", line 32, in __init__
    log_sigma = soft_clamp5(log_sigma)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/home/iman/projs/NVAE/distributions.py", line 19, in soft_clamp5
    # xx = 5.0*torch.tanh( x / 5.0)
    # return 5.0*torch.tanh( x / 5.0)
    return x.div_(5.).tanh_().mul(5.)  # 5. * torch.tanh(x / 5.) <--> soft differentiable clamp between [-5, 5]
           ~~~~~~ <--- HERE
RuntimeError: diff_view_meta->output_nr_ == 0 INTERNAL ASSERT FAILED at "/pytorch/torch/csrc/autograd/variable.cpp":363, please report a bug to PyTorch.
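Given the `soft_clamp5` shown in the traceback, an out-of-place rewrite avoids the in-place ops on the split views while computing the same value. This is a sketch of the kind of fix involved, not necessarily the exact NVAE patch:

```python
import torch

@torch.jit.script
def soft_clamp5(x: torch.Tensor) -> torch.Tensor:
    # Original (in-place, breaks on views returned by torch.chunk/split):
    #   return x.div_(5.).tanh_().mul(5.)
    # Out-of-place equivalent: a soft, differentiable clamp to [-5, 5].
    return 5. * torch.tanh(x / 5.)
```

Since the views produced by `torch.chunk`/`torch.split` are no longer mutated, autograd can track the operation correctly and the internal assert is never reached.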
🐛 Bug
To Reproduce
Steps to reproduce the behavior:
v *= v_scale
RuntimeError: diff_view_meta->output_nr_ == 0 INTERNAL ASSERT FAILED at "../torch/csrc/autograd/variable.cpp":363, please report a bug to PyTorch.
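A minimal reproduction sketch, assuming `v` is one of the outputs of `torch.split` (consistent with the `grad_fn=SplitBackward` in the printed tensor below); shapes and the scale factor are illustrative:

```python
import torch

x = torch.randn(1, 4, requires_grad=True)
_, v = torch.split(x, 2, dim=1)  # v is a view; its grad_fn is SplitBackward
try:
    v *= 3.2  # in-place op on a split view
except RuntimeError as e:
    # On 1.7 this only warned; from 1.8 it is rejected outright
    # (on the nightly above it surfaced as the internal assert).
    print("refused:", e)
```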
I printed the v value as
tensor([[[[ 1.2014e-02, 1.2068e-02, 9.7856e-03, 8.8714e-03, 8.5734e-03,
2.9168e-03, 2.1199e-05, -2.8829e-03, -8.2607e-03, -1.5328e-02,
-2.5013e-02, -3.1222e-02],
[ 1.0266e-02, 6.2610e-03, 5.2078e-03, 5.4408e-03, 4.0872e-03,
-5.5038e-04, -4.3396e-03, -7.8755e-03, -1.2391e-02, -1.8298e-02,
-2.3656e-02, -2.3437e-02],
[ 5.6735e-03, 4.5379e-03, 3.9397e-03, 4.7426e-03, 1.8061e-03,
-2.3692e-03, -7.8620e-03, -1.2199e-02, -1.4298e-02, -1.7047e-02,
-2.0498e-02, -2.0740e-02],
[ 8.3388e-03, 5.1143e-03, 2.9209e-03, 5.5275e-03, 2.9254e-03,
-2.9371e-04, -4.7937e-03, -1.0673e-02, -1.2947e-02, -1.5372e-02,
-1.9295e-02, -2.1778e-02],
[ 8.7581e-03, 7.9502e-03, 7.1754e-03, 7.7227e-03, 6.9899e-03,
5.8939e-03, 5.0108e-03, -7.9185e-03, -9.1207e-03, -1.3086e-02,
-1.6816e-02, -2.1605e-02],
[ 7.9434e-03, 6.1096e-03, 3.8051e-03, 3.1724e-03, 2.2082e-03,
2.1375e-03, -2.9212e-04, -3.9539e-03, -7.1065e-03, -1.2883e-02,
-1.7969e-02, -2.4725e-02]]],
[[[ 6.5310e-02, 5.6975e-02, 4.5132e-02, 5.0792e-02, 5.6704e-02,
6.2313e-02, 6.5036e-02, 6.4221e-02, 5.9827e-02, 6.2116e-02,
6.5767e-02, 7.4780e-02],
[ 5.6775e-02, 5.0288e-02, 5.4376e-02, 6.2040e-02, 6.0906e-02,
5.9159e-02, 5.9787e-02, 6.1453e-02, 5.8250e-02, 5.6893e-02,
6.1770e-02, 6.7854e-02],
[ 4.7995e-02, 4.8208e-02, 5.2908e-02, 5.5330e-02, 6.1871e-02,
5.6681e-02, 5.6556e-02, 6.0526e-02, 5.0920e-02, 5.3691e-02,
5.8827e-02, 6.3531e-02],
[ 3.0758e-02, 4.0276e-02, 4.7759e-02, 4.0098e-02, 4.0556e-02,
3.1987e-02, 3.8289e-02, 4.4429e-02, 4.1669e-02, 4.7020e-02,
5.2110e-02, 5.7387e-02],
[ 1.5687e-02, 2.2496e-02, 2.2303e-02, 3.8733e-03, -7.9459e-03,
-1.0541e-02, -6.2762e-03, 1.3099e-02, 2.7646e-02, 3.7377e-02,
4.5027e-02, 4.2339e-02],
[-1.0777e-02, -1.2646e-02, -1.3509e-02, -1.3324e-02, -1.8688e-02,
-3.3734e-02, -2.8426e-02, -1.3815e-02, 6.6503e-03, 1.6921e-02,
3.4458e-02, 3.6185e-02]]]], device='cuda:0', grad_fn=<SplitBackward>)
v_scale = 3.2000000000000006
It is very strange that there is an error during the multiplication of a scalar with a tensor.
I guess the error occurs in the autograd backward part.
Expected behavior
Environment
Please copy and paste the output from our
environment collection script
(or fill out the checklist below manually).
You can get the script and run it with:
Collecting environment information...
PyTorch version: 1.8.0a0+37dbc61
Is debug build: True
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.1 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: 10.0.0-4ubuntu1
CMake version: version 3.18.2
Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce RTX 3090
Nvidia driver version: 455.32.00
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.4
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.19.1
[pip3] torch==1.8.0a0
[pip3] torchvision==0.8.0a0+cffac64
[conda] blas 1.0 mkl
[conda] magma-cuda110 2.5.2 1 pytorch
[conda] mkl 2020.2 256
[conda] mkl-include 2020.2 256
[conda] mkl-service 2.3.0 py38he904b0f_0
[conda] mkl_fft 1.2.0 py38h23d657b_0
[conda] mkl_random 1.1.1 py38h0573a6f_0
[conda] numpy 1.19.1 py38hbc911f0_0
[conda] numpy-base 1.19.1 py38hfa32c7d_0
[conda] torch 1.8.0a0 pypi_0 pypi
[conda] torchvision 0.8.0a0+cffac64 pypi_0 pypi
Additional context
cc @ezyang @albanD @zou3519 @gqchen @pearu @nikitaved