
RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead. #44023

Closed
curehabit opened this issue Sep 2, 2020 · 7 comments
Labels
hackathon module: docs Related to our documentation, both in docs/ and docblocks triaged This issue has been looked at by a team member and triaged and prioritized into an appropriate module

Comments

@curehabit

curehabit commented Sep 2, 2020

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

  1. Follow the MNIST tutorial (https://pytorch-lightning.readthedocs.io/en/stable/new-project.html)
  2. Use Trainer(tpu_cores=1)
  3. Run
Traceback (most recent call last):
  File "plmnist.py", line 80, in <module>
    trainer.fit(model, train_loader, val_loader)
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/pytorch_lightning/trainer/states.py", line 48, in wrapped_fn
    result = fn(self, *args, **kwargs)
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1078, in fit
    self.accelerator_backend.train(model)
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/pytorch_lightning/accelerators/tpu_backend.py", line 87, in train
    start_method=self.start_method
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/torch_xla/distributed/xla_multiprocessing.py", line 284, in spawn
    return _run_direct(fn, args, nprocs, join, daemon, start_method)
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/torch_xla/distributed/xla_multiprocessing.py", line 245, in _run_direct
    fn(0, *args)
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/pytorch_lightning/accelerators/tpu_backend.py", line 112, in tpu_train_in_process
    results = trainer.run_pretrain_routine(model)
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1239, in run_pretrain_routine
    self.train()
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 394, in train
    self.run_training_epoch()
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 491, in run_training_epoch
    batch_output = self.run_training_batch(batch, batch_idx)
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 844, in run_training_batch
    self.hiddens
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 1049, in optimizer_closure
    training_step_output_for_epoch_end = copy(training_step_output)
  File "/***/anaconda3/envs/turing/lib/python3.7/copy.py", line 88, in copy
    return copier(x)
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/pytorch_lightning/core/step_result.py", line 302, in __copy__
    newone[k] = copy(v)
  File "/***/anaconda3/envs/turing/lib/python3.7/copy.py", line 96, in copy
    rv = reductor(4)
  File "/***/anaconda3/envs/turing/lib/python3.7/site-packages/torch/tensor.py", line 87, in __reduce_ex__
    args = (self.cpu().numpy(),
RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.

Expected behavior

Environment

CUDA:

  • GPU:
  • available: False
  • version: None
    Packages:
  • numpy: 1.19.0
  • pyTorch_debug: False
  • pyTorch_version: 1.6.0.dev20200622
  • pytorch-lightning: 0.9.0
  • tensorboard: 2.2.0
  • tqdm: 4.48.2
    System:
  • OS: Linux
  • architecture:
  • 64bit
  • processor:
  • python: 3.7.7
  • version: #1 SMP Debian 4.14.81.bm.15 Sun Sep 8 05:02:31 UTC 2019

PyTorch Version (e.g., 1.0): 1.6
OS (e.g., Linux): Linux
How you installed PyTorch (conda, pip, source): pip
Build command you used (if compiling from source):
Python version: 3.7.8
CUDA/cuDNN version: None
GPU models and configuration: None
Any other relevant information: torch_xla: 1.6.0

Additional context

/pytorch_lightning/trainer/training_loop.py(1049)optimizer_closure()

1044
1045                # if the user decides to finally reduce things in epoch_end, save raw output without graphs
1046                if isinstance(training_step_output_for_epoch_end, torch.Tensor):
1047                    training_step_output_for_epoch_end = training_step_output_for_epoch_end.detach()
1048                elif is_result_obj:
1049 ->              training_step_output_for_epoch_end = copy(training_step_output)  ### <- there should be a detach before the copy
1050                    training_step_output_for_epoch_end.detach()
1051                else:
1052                    training_step_output_for_epoch_end = recursive_detach(training_step_output_for_epoch_end)
1053
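As a minimal illustration of why detaching first matters here (variable names are hypothetical, not from Lightning): in PyTorch 1.6, copy.copy on a grad-tracking tensor fell back to Tensor.__reduce_ex__, which called .cpu().numpy() and raised exactly the RuntimeError in the traceback above. Detaching before the copy sidesteps it:

```python
import copy

import torch

# A non-leaf tensor that tracks gradients, like a training-step loss.
loss = torch.ones(1, requires_grad=True) * 2.0

# In PyTorch 1.6, copy.copy(loss) routed through Tensor.__reduce_ex__,
# which called .cpu().numpy() and raised:
#   RuntimeError: Can't call numpy() on Tensor that requires grad.
# Detaching first yields a tensor the copy machinery can handle safely:
snapshot = copy.copy(loss.detach())
print(snapshot.requires_grad)  # False
```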

cc @brianjo @mruberry @jlin27

@smessmer smessmer added module: docs Related to our documentation, both in docs/ and docblocks triaged This issue has been looked at by a team member and triaged and prioritized into an appropriate module labels Sep 2, 2020
@mariosasko
Contributor

The error is pretty self-explanatory. You can't call .numpy() on a tensor if that tensor is part of the computation graph. You first have to detach it from the graph and this will return a new tensor that shares the same underlying storage but doesn't track gradients (requires_grad is False). Then you can call .numpy() safely. So just replace tensor.numpy() with tensor.detach().numpy().

If this doesn't work, please open an issue in the PyTorch Lightning repo.
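The pattern described above can be sketched in a few lines (the tensor here is a made-up example):

```python
import torch

# A tensor that is part of the autograd graph (requires_grad propagates).
t = torch.ones(3, requires_grad=True) * 2.0

# t.numpy() would raise the RuntimeError above. detach() returns a view that
# shares the same storage as t but has requires_grad=False, so .numpy() is safe:
arr = t.detach().numpy()
print(arr)  # [2. 2. 2.]
```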

@fe1w0

fe1w0 commented Jun 14, 2021

The error is pretty self-explanatory. You can't call .numpy() on a tensor if that tensor is part of the computation graph. You first have to detach it from the graph and this will return a new tensor that shares the same underlying storage but doesn't track gradients (requires_grad is False). Then you can call .numpy() safely. So just replace tensor.numpy() with tensor.detach().numpy().

If this doesn't work, please open an issue in the PytorchLightning repo.

My temporary workaround is to modify the source code directly, following the error message. I don't think this is a good solution, but in any case it did fix my problem, like so:
[screenshot of the modified source]

@mariosasko
Contributor

@fe1w0 Yes, modifying the source is not the best solution. Can you please provide the minimal reproducible example? I'm not even sure this issue is still relevant as I can't find the relevant code in the PyTorchLightning source. Which version of PyTorchLightning are you using?

@fe1w0

fe1w0 commented Jun 21, 2021

@fe1w0 Yes, modifying the source is not the best solution. Can you please provide the minimal reproducible example? I'm not even sure this issue is still relevant as I can't find the relevant code in the PyTorchLightning source. Which version of PyTorchLightning are you using?

I'm very sorry 😵‍💫 that I haven't responded until now. I think this is not a PyTorch problem; it seems to be related to my TensorFlow environment. After I reconfigured TensorFlow and ran conda install pytorch, I haven't encountered any problems so far. Thank you very much for your attention.

This is the reference link I used for reinstalling, suitable for Apple M1:
https://developer.apple.com/metal/tensorflow-plugin/

@mruberry
Collaborator

This appears to be an issue with PyTorch Lightning (and possibly only an older version of it). So closing it here.

You might want to open an issue at the PyTorch Lightning Github (https://github.com/PyTorchLightning/pytorch-lightning), @curehabit.

@solarshao1006

solarshao1006 commented Apr 22, 2022

When I run this:

#run style transfer
max_iter = 500
show_iter = 50
optimizer = optim.LBFGS([opt_img]);
n_iter=[0]

while n_iter[0] <= max_iter:

    def closure():
        optimizer.zero_grad()
        
        out = vgg(opt_img, loss_layers)
        layer_losses = [weights[a] * loss_fns[a](A, targets[a]) for a,A in enumerate(out)]
        
        loss = sum(layer_losses)
        loss.backward()
        n_iter[0]+=1
        #print loss
        if n_iter[0]%show_iter == (show_iter-1):
            print('Iteration: %d, loss: %f'%(n_iter[0]+1, loss.item()))
#             print([loss_layers[li] + ': ' +  str(l.data[0]) for li,l in enumerate(layer_losses)]) #loss of each layer
        return loss

    optimizer.step(closure)
    
#display result
out_img = postp(opt_img.data[0].cpu().squeeze())
imshow(out_img)
gcf().set_size_inches(10,10)

I got this error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-40-f6b88457c654> in <module>
     22         return loss
     23 
---> 24     optimizer.step(closure)
     25 
     26 #display result

~/opt/anaconda3/lib/python3.7/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
     24         def decorate_context(*args, **kwargs):
     25             with self.__class__():
---> 26                 return func(*args, **kwargs)
     27         return cast(F, decorate_context)
     28 

~/opt/anaconda3/lib/python3.7/site-packages/torch/optim/lbfgs.py in step(self, closure)
    309 
    310         # evaluate initial f(x) and df/dx
--> 311         orig_loss = closure()
    312         loss = float(orig_loss)
    313         current_evals = 1

~/opt/anaconda3/lib/python3.7/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
     24         def decorate_context(*args, **kwargs):
     25             with self.__class__():
---> 26                 return func(*args, **kwargs)
     27         return cast(F, decorate_context)
     28 

<ipython-input-40-f6b88457c654> in closure()
     13         layer_losses = [weights[a] * loss_fns[a](A, targets[a]) for a,A in enumerate(out)]
     14         print(layer_losses)
---> 15         loss = sum(layer_losses)
     16         loss.backward()
     17         n_iter[0]+=1

<__array_function__ internals> in sum(*args, **kwargs)

~/opt/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py in sum(a, axis, dtype, out, keepdims, initial, where)
   2258 
   2259     return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
-> 2260                           initial=initial, where=where)
   2261 
   2262 

~/opt/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py in _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs)
     84                 return reduction(axis=axis, out=out, **passkwargs)
     85 
---> 86     return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
     87 
     88 

~/opt/anaconda3/lib/python3.7/site-packages/torch/tensor.py in __array__(self, dtype)
    628             return handle_torch_function(Tensor.__array__, relevant_args, self, dtype=dtype)
    629         if dtype is None:
--> 630             return self.numpy()
    631         else:
    632             return self.numpy().astype(dtype, copy=False)

RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.

I believe this error is related to the source code and changing the tensor source code is not the best solution. Any recommendations on what to do?

@mruberry
Collaborator


It looks like you're calling NumPy's sum on a PyTorch tensor that requires grad, but NumPy doesn't support gradients. You probably want to use torch.sum.
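The fix suggested above can be sketched as follows (the tensors here are hypothetical stand-ins for the variables in the snippet; in the original code, sum had likely been shadowed by NumPy's sum via a star import):

```python
import torch

# Hypothetical stand-ins for opt_img / layer_losses in the snippet above.
opt_img = torch.rand(1, 3, 8, 8, requires_grad=True)
layer_losses = [w * (opt_img ** 2).mean() for w in (1.0, 0.5)]

# np.sum(layer_losses) converts each tensor to a NumPy array via __array__,
# which calls .numpy() and fails on grad-tracking tensors. Python's builtin
# sum (or an all-torch reduction) keeps the result in the autograd graph:
loss = sum(layer_losses)                    # builtin sum: tensor + tensor
loss_alt = torch.stack(layer_losses).sum()  # equivalent all-torch version

loss.backward()  # gradients flow back to opt_img
```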
