
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn #498

Closed
superctj opened this issue Jul 17, 2020 · 29 comments
Labels: question (Further information is requested)

@superctj

Describe the bug
I encountered the error "RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn" when generating adversarial examples using AutoProjectedGradientDescent. It looks like input tensors do not have 'requires_grad' set to True when the loss is backpropagated.

To Reproduce
Steps to reproduce the behavior:

  1. Create a PyTorch model and load it with weights
  2. Wrap the model with the ART PyTorch Classifier
  3. Initiate an instance of AutoProjectedGradientDescent and pass in the wrapped model
  4. Loop through the data loader and generate adversarial examples batch by batch
  5. See error

Expected behavior
Adversarial examples are generated without errors. Instead, the attack raises:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Screenshots
[Screenshot: traceback ending in "RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn"]

System information (please complete the following information):

  • OS: Linux
  • Python version: 3.7
  • ART version or commit number: 1.3.1
  • PyTorch version: 1.4.0
beat-buesser self-assigned this Jul 17, 2020
@beat-buesser (Collaborator) commented Jul 17, 2020

Hi @superctj I have not yet been able to reproduce the issue based on the traceback posted above. Could you please post a short code snippet that produces the issue?

@superctj (Author)

Hi @beat-buesser, thank you for your quick response. Here is a screenshot of my code:

[Screenshot: attack script]
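For reference, a sketch of what that screenshot likely contained, reconstructed from details that come up later in this thread (the data loader loop, the .detach().cpu().numpy() conversions including the y_baych typo, and the torch.no_grad() block that turns out to be the culprit); classifier and test_loader are placeholder names, not the actual code:

import numpy as np
import torch

from art.attacks.evasion import AutoProjectedGradientDescent

# classifier is the ART-wrapped model and test_loader the custom data
# loader discussed below; both are placeholders for the real objects,
# and the attack hyperparameters are borrowed from the scripts below.
attack = AutoProjectedGradientDescent(estimator=classifier,
                                      norm=np.inf,
                                      eps=0.3,
                                      eps_step=0.1,
                                      batch_size=50,
                                      loss_type='cross_entropy')

with torch.no_grad():  # this block later turns out to be the problem
    for x_batch, y_batch in test_loader:
        x_batch = x_batch.detach().cpu().numpy()
        y_baych = y_batch.detach().cpu().numpy()  # the typo found below
        x_batch_adv = attack.generate(x=x_batch, y=y_batch)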

@beat-buesser (Collaborator)

Hi @superctj I have run the script below, which is similar to yours but without a data loader, and it works:

import numpy as np

from art.utils import load_mnist
from art.attacks.evasion import AutoProjectedGradientDescent

from tests.utils import get_image_classifier_pt

(x_train, y_train), (x_test, y_test), min_pixel_value, max_pixel_value = load_mnist()

x_train = np.swapaxes(x_train, 1, 3).astype(np.float32)
x_test = np.swapaxes(x_test, 1, 3).astype(np.float32)

classifier = get_image_classifier_pt(load_init=True, from_logits=True)

attack = AutoProjectedGradientDescent(estimator=classifier,
                                      norm=np.inf,
                                      eps=0.3,
                                      eps_step=0.1,
                                      batch_size=50,
                                      loss_type='cross_entropy')

x_test_adv = attack.generate(x=x_test[0:110], y=y_test[0:110])

print('Max difference:', np.max(np.abs(x_test_adv - x_test[0:110])))

Could you please try running your script with .cpu() added to the conversions:

x_batch = x_batch.detach().cpu().numpy()
y_batch = y_batch.detach().cpu().numpy()

@superctj (Author) commented Jul 18, 2020

Hi @beat-buesser Thank you for your suggestion. Unfortunately, it doesn't fix the error.

Could you please try to replicate the error by using a PyTorch data loader?

@beat-buesser (Collaborator) commented Jul 20, 2020

Hi @superctj The script below seems to work with a torch.utils.data.DataLoader:

import numpy as np
import torch
import torchvision

from art.utils import load_mnist
from art.attacks.evasion import AutoProjectedGradientDescent

from tests.utils import get_image_classifier_pt

(x_train, y_train), (x_test, y_test), min_pixel_value, max_pixel_value = load_mnist()

x_train = np.swapaxes(x_train, 1, 3).astype(np.float32)
x_test = np.swapaxes(x_test, 1, 3).astype(np.float32)

classifier = get_image_classifier_pt(load_init=True, from_logits=True)

batch_size = 500

attack = AutoProjectedGradientDescent(estimator=classifier,
                                      norm=np.inf,
                                      eps=0.3,
                                      eps_step=0.1,
                                      batch_size=batch_size,
                                      loss_type='cross_entropy')

data_loader = torch.utils.data.DataLoader(torchvision.datasets.MNIST('./files/', train=False, download=True,
                                                                     transform=torchvision.transforms.Compose(
                                                                         [torchvision.transforms.ToTensor(), ])),
                                          batch_size=batch_size,
                                          shuffle=True)

for batch_idx, (x_batch, y_batch) in enumerate(data_loader):
    x_batch = x_batch.detach().cpu().numpy()
    y_batch = y_batch.detach().cpu().numpy()

    x_batch_adv = attack.generate(x=x_batch, y=y_batch)

    print('Max difference:', np.max(np.abs(x_batch_adv - x_batch)))

How do you define your data loader?

@superctj (Author) commented Jul 20, 2020

I have tried to debug this myself but haven't had any luck. I thought the issue was related to the gradient attribute, the tensor type, or the axis order, but none of them helped. Basically, I have a custom dataset instance and wrap it with the PyTorch data loader. Do you have any ideas? I am attaching the data loader code for your reference.

[Screenshots: custom dataset class and data loader definition]
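A minimal sketch of such a setup, in case the screenshots are unavailable (the class and variable names are hypothetical; the thread only establishes float32 images of shape (3, 256, 256) and integer labels stored in self.y):

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class CustomImageDataset(Dataset):
    """Hypothetical stand-in for the dataset in the screenshots."""

    def __init__(self, images, labels):
        self.x = images  # float32 array of shape (N, 3, 256, 256)
        self.y = labels  # integer class labels

    def __len__(self):
        return len(self.y)

    def __getitem__(self, idx):
        return torch.as_tensor(self.x[idx]), self.y[idx]

# Random data just to make the sketch self-contained
images = np.random.rand(100, 3, 256, 256).astype(np.float32)
labels = [0] * 100

data_loader = DataLoader(CustomImageDataset(images, labels),
                         batch_size=50, shuffle=False)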

@beat-buesser (Collaborator)

Does the error occur for the first or last batch?

@beat-buesser (Collaborator)

What is the type of the elements in self.y?

@superctj (Author)

  1. The error occurs for the first batch.
  2. The type of the elements in self.y is integer.

@beat-buesser (Collaborator)

How do you define your model? Can you also show the output of a forward pass with your model, e.g. the output of model(x)?

@superctj (Author)

I load a pre-trained model and wrap it into an ART PyTorch classifier. The output of model(x_batch) is a batch of logits.

[Screenshots: model definition, classifier wrapping, and logits output]
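A sketch of that wrapping step, assuming ART's PyTorchClassifier API (the architecture, class count, and clip values here are placeholders; the thread only establishes a pre-trained model that outputs logits for 3x256x256 inputs):

import torch
import torchvision
from art.estimators.classification import PyTorchClassifier

model = torchvision.models.resnet18(num_classes=2)  # placeholder architecture
# In practice the pre-trained weights would be loaded here, e.g. with
# model.load_state_dict(torch.load(...))
model.eval()

classifier = PyTorchClassifier(model=model,
                               loss=torch.nn.CrossEntropyLoss(),
                               input_shape=(3, 256, 256),
                               nb_classes=2,
                               clip_values=(0.0, 1.0))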

@superctj (Author)

Could you replicate the error if you set model.eval() before you wrap the model into the ART PyTorch classifier?

@beat-buesser (Collaborator)

model.eval() does not change anything for me. I noticed that my script only runs inside of the examples directory of ART. Does model.eval() change anything for you?

@superctj (Author)

Nah. I just thought it could be the difference between your code and mine. Are you saying that you can reproduce the error outside the examples directory of ART?

@beat-buesser (Collaborator)

No, unfortunately not; it's just that get_image_classifier_pt uses relative paths to load a small trained classifier. I have repeated the test outside of examples with a new model and it still runs. Are you running on GPU? If yes, can you try running on CPU only?

@beat-buesser (Collaborator)

Another debugging approach could be to test your script by running the line x_adv_batch = attack.generate(x=x_batch, y=y_batch) with two arrays for x_batch and y_batch created with numpy (e.g. random numbers) instead of getting them from the data loader, as in the sketch below. That would show whether the data loader or the attack-and-classifier combination is causing the problem.
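For example (attack refers to the attack instance in your script; the batch shapes match the ones reported later in this thread):

import numpy as np

# Synthetic batch of the same shape as the data loader output
x_batch = np.random.rand(50, 3, 256, 256).astype(np.float32)
y_batch = np.zeros(50, dtype=np.int64)

# If this runs without error, the attack/classifier combination is fine
# and the problem lies in what the data loader hands to the attack.
x_adv_batch = attack.generate(x=x_batch, y=y_batch)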

@superctj (Author)

Unfortunately, the error persists when I run on CPU only. Could it be a problem with PyTorch? I am using PyTorch 1.4.0.

@beat-buesser (Collaborator)

It should not be; I have also been using PyTorch 1.4.0.

@superctj (Author)

Cool, the error doesn't show up if I create x_batch and y_batch with numpy directly. It must be a problem with the data loader then. But it is interesting that x_batch = x_batch.detach().cpu().numpy() triggers the error while x_batch = np.ones(x_batch.shape) does not.

@beat-buesser (Collaborator) commented Jul 21, 2020

Can you print the type and content of x_batch and y_batch after the lines x_batch = x_batch.detach().cpu().numpy() and y_batch = y_batch.detach().cpu().numpy() using the data loader?

@superctj (Author)

x_batch is a <class 'numpy.ndarray'> of shape (50, 3, 256, 256).

[Screenshot: x_batch contents]

y_batch is also a <class 'numpy.ndarray'> of shape (50,). It looks like [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

@beat-buesser (Collaborator)

I think I found the bug: in your first script above, a line has a typo: y_baych = y_batch.detach().cpu().numpy() instead of y_batch = y_batch.detach().cpu().numpy() (a y instead of a t in y_batch). That way the original tensor y_batch gets passed to the attack instead of the numpy array stored in y_baych.

A second question: Is it correct that all the labels are 0 for this batch?

@superctj (Author)

That was a typo. I found that as well yesterday but it didn't fix the error lol.

Yeah, I didn't shuffle the test set so it starts with images from label 0.

@superctj (Author)

Hi @beat-buesser. I created a minimal example to reproduce the error. Could you please take a look and see if you can reproduce it?

@superctj (Author)

With the help of my labmate, we found the problem was with torch.no_grad(), which prevents the attack from backpropagating to the inputs. Do you have any suggestions other than removing the with torch.no_grad() block, or is there a better solution?

@beat-buesser (Collaborator)

Hi @superctj Thank you very much for the minimal example, that's great!

Do you mean the with torch.no_grad in https://github.com/superctj/error_demo/blob/620368e21b7ae56fbcdab28fcba1281ddfd42073/eval.py#L30 ?

@superctj (Author)

Yeah, after removing that line, I can run the PGD attack smoothly.

@beat-buesser (Collaborator)

Ok, that makes sense, great catch, I hadn't noticed it.

White-box attacks like ProjectedGradientDescent will not work inside a with torch.no_grad(): block because they calculate the loss or class gradients required to run their attack algorithm.

Black-box attacks like HopSkipJump, which don't require any gradient calculation by the framework, will very likely work inside a with torch.no_grad(): block. In fact, since ART 1.3, PyTorchClassifier.predict itself uses with torch.no_grad(): internally to take advantage of the faster model evaluation when gradients are disabled.
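A minimal, self-contained illustration of why the error from this issue appears inside such a block:

import torch

x = torch.ones(3, requires_grad=True)

with torch.no_grad():
    loss = (x * 2).sum()  # no grad_fn is recorded inside no_grad

print(loss.requires_grad)  # False
# Calling loss.backward() here raises exactly the error from this issue:
# RuntimeError: element 0 of tensors does not require grad and does not
# have a grad_fn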

@superctj (Author)

That's good to know! Thank you very much for your time and patience. I appreciate it.

beat-buesser added the question (Further information is requested) label Aug 7, 2020