
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 1, 256, 256]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True). #1

Closed
zeeshannisar opened this issue Feb 7, 2022 · 2 comments

Comments


zeeshannisar commented Feb 7, 2022

Hi,

Frankly speaking, I am new to PyTorch and more familiar with TensorFlow. While reproducing your code (using PyCharm as the IDE), I am facing the following error. Can you please help me resolve it? I would really appreciate your help. Thanks in advance.

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 1, 256, 256]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
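For context (my own generic illustration, not taken from the repository): this error fires when a tensor that autograd saved for the backward pass is mutated in place before backward() runs. ReLU is a typical trigger because it saves its output for the backward computation. A minimal reproducer and its out-of-place fix:

```python
import torch

# ReLU saves its output for backward; mutating that output in place
# bumps the tensor's version counter and breaks gradient computation.
x = torch.ones(3, requires_grad=True)
y = torch.relu(x * 2)
try:
    y.add_(1)           # in-place edit of a tensor saved for backward
    y.sum().backward()  # raises the "modified by an inplace operation" error
except RuntimeError as e:
    print("reproduced:", type(e).__name__)

# Fix: use the out-of-place version so the saved tensor is untouched.
x = torch.ones(3, requires_grad=True)
y = torch.relu(x * 2)
y = y + 1               # out-of-place; ReLU's saved output stays at version 0
y.sum().backward()
print(x.grad)           # tensor([2., 2., 2.])
```

The same pattern applies to `+=`, `.mul_()`, slice assignment, and `inplace=True` activations anywhere between the forward pass and `backward()`.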

I have only edited the dataset file (to load the dataset in an unsupervised way from the directories) and created a new train.py file to run the code in PyCharm. All of the remaining code is exactly as you shared it.

The dataset.py file is edited like this:

import os
import glob
import torch
import random
import torch.utils.data as data
from PIL import Image
import torchvision.transforms as transforms


class Images_with_Names(data.Dataset):
    """ can act both as Supervised or Un-supervised """

    def __init__(self, directory_A, directory_B, unsupervised=True, transform=None):
        self.directory_A = directory_A
        self.directory_B = directory_B
        self.unsupervised = unsupervised
        self.transform = transform

        self.imageList_A = sorted(glob.glob(f"{directory_A}/*.jpg*"))
        self.imageList_B = sorted(glob.glob(f"{directory_B}/*.jpg*"))

    def __getitem__(self, index):
        # __len__ is max(len A, len B), so wrap the index into each list
        # to avoid an IndexError when the two directories differ in size
        image_A = Image.open(self.imageList_A[index % len(self.imageList_A)])
        if self.unsupervised:
            image_B = Image.open(self.imageList_B[random.randint(0, len(self.imageList_B) - 1)])
        else:
            image_B = Image.open(self.imageList_B[index % len(self.imageList_B)])

        if self.transform is not None:
            image_A = self.transform(image_A)
            image_B = self.transform(image_B)

        return image_A, image_B

    def __len__(self):
        return max(len(self.imageList_A), len(self.imageList_B))

def preprocessing(x):
    x = (x / 127.5) - 1
    x = torch.reshape(x, (-1, x.shape[0], x.shape[1], x.shape[2]))
    return x
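As a quick sanity check (my own illustration, not part of the repository): transforms.PILToTensor returns a uint8 (C, H, W) tensor, so the preprocessing above maps it to a float tensor in [-1, 1] and prepends a batch dimension:

```python
import torch

def preprocessing(x):
    x = (x / 127.5) - 1
    x = torch.reshape(x, (-1, x.shape[0], x.shape[1], x.shape[2]))
    return x

# Simulate PILToTensor output: a uint8 image tensor of shape (C, H, W).
img = torch.randint(0, 256, (3, 256, 256), dtype=torch.uint8)
out = preprocessing(img)
print(out.shape)                            # torch.Size([1, 3, 256, 256])
print(out.min() >= -1, out.max() <= 1)      # both True
```

Note that true division promotes the uint8 input to float, so no explicit cast is needed.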

The train.py file is:

import os
import torch
import torchvision.transforms as transforms
from torchsummary import summary

from utils import train_UGAC
from dataset import Images_with_Names
from dataset import preprocessing
from Networks import CasUNet_3head, NLayerDiscriminator


# First instantiate the generators and discriminators
netG_A = CasUNet_3head(3, 3)
netD_A = NLayerDiscriminator(3, n_layers=4)
netG_B = CasUNet_3head(3, 3)
netD_B = NLayerDiscriminator(3, n_layers=4)

data_directory = "../code/UncertaintyAwareCycleConsistency/data/"
directory_A = os.path.join(data_directory, "A")
directory_B = os.path.join(data_directory, "B")

data_transformer = transforms.Compose([transforms.PILToTensor(),
                                       transforms.Lambda(lambda x: preprocessing(x))])

train_loader = Images_with_Names(directory_A=directory_A, directory_B=directory_B, unsupervised=True,
                                 transform=data_transformer)

# summary(netG_A.cuda(), input_size=(3, 256, 256))
train_UGAC(netG_A, netG_B, netD_A, netD_B, train_loader, dtype=torch.cuda.FloatTensor, device='cuda',
           num_epochs=10, init_lr=1e-5, ckpt_path='../saved_models/checkpoints/UGAC',
           list_of_hp=[1, 0.015, 0.01, 0.001, 1, 0.015, 0.01, 0.001, 0.05, 0.05, 0.01])

Attempts I have made to resolve the issue:

  1. Setting inplace=False on all ReLU and LeakyReLU activations, following this, but it failed.
  2. Getting a traceback of the forward call that caused the error with torch.autograd.set_detect_anomaly(True), which reports the following:
[W python_anomaly_mode.cpp:104] Warning: Error detected in ReluBackward0. Traceback of forward call that caused the error:

File "/home/xyz/code/UncertaintyAwareCycleConsistency/src/train.py", line 29, in <module>
    netG_A, netG_B, netD_A, netD_B = train_UGAC(netG_A, netG_B, netD_A, netD_B, train_loader, dtype=torch.cuda.FloatTensor,
  File "/home/xyz/code/UncertaintyAwareCycleConsistency/src/utils.py", line 69, in train_UGAC
    t0, t0_alpha, t0_beta = netG_B(xA)
  File "/home/xyz/.conda/envs/pytorch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xyz/code/UncertaintyAwareCycleConsistency/src/Networks.py", line 205, in forward
    y = self.unet_list[i](y + x)
  File "/home/xyz/.conda/envs/pytorch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xyz/code/UncertaintyAwareCycleConsistency/src/Networks.py", line 181, in forward
    y_mean, y_alpha, y_beta = self.out_mean(x), self.out_alpha(x), self.out_beta(x)

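The traceback points at several output heads consuming the same tensor x. One generic pattern worth checking (my speculation, not a confirmed diagnosis of Networks.py): when a shared activation feeds multiple branches and any later op mutates it in place, backward through the other branches hits the version error; cloning the shared tensor decouples the branches:

```python
import torch
import torch.nn as nn

# Hypothetical sketch of a shared ReLU feature feeding two heads, loosely
# modelled on out_mean/out_alpha/out_beta above. If any consumer mutated
# `feat` in place, backward through the other head would raise the
# "modified by an inplace operation" error.
w = torch.randn(4, 4, requires_grad=True)
feat = torch.relu(torch.randn(1, 4) @ w)

head_a = nn.Linear(4, 1)
head_b = nn.Linear(4, 1)

# .clone() gives each branch an independent copy of the shared tensor,
# so an in-place op downstream cannot touch the saved ReLU output.
out = head_a(feat.clone()) + head_b(feat.clone())
out.sum().backward()
print(w.grad is not None)  # True
```

Whether this is the actual culprit here depends on what train_UGAC and the discriminator updates do between the generator forward pass and backward().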
Looking forward to hearing from you soon. Thanks.

@kumudlakara

Facing the same issue. @zeeshannisar were you able to find a solution to this?

@zeeshannisar
Author

@kumudlakara Initially I was using PyTorch v1.10. Later I downgraded it to v1.6.0 (as specified in the repository) and it worked for me.

Additional comment: I tried it on the Facades dataset and the results were very poor compared to those reported in the paper.

@udion closed this as completed Mar 20, 2022