
RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time. #10

Closed
chang4869 opened this issue Jan 20, 2021 · 6 comments
Labels
question Further information is requested

Comments

@chang4869

Please let me know where it goes wrong.

@davda54
Owner

davda54 commented Jan 20, 2021

Hi, could you please provide a minimal working example of the code causing this exception? Are you doing two separate forward-backward passes, as illustrated in the README file? This error suggests that you run a single forward pass followed by multiple backward passes.
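For context, this exception can be reproduced in plain PyTorch without SAM at all; the sketch below (toy model and tensor names are illustrative) shows both the failure and the fix of re-running the forward pass:

```python
import torch

# Toy setup, just to build an autograd graph
model = torch.nn.Linear(3, 1)
x = torch.randn(8, 3)
y = torch.randn(8, 1)

loss = ((model(x) - y) ** 2).mean()
loss.backward()                # first backward frees the graph's saved buffers

try:
    loss.backward()            # second backward on the same graph
except RuntimeError as e:
    print("fails as expected:", e)

loss = ((model(x) - y) ** 2).mean()  # fresh forward pass builds a new graph
loss.backward()                      # succeeds
```

This is exactly why SAM's README runs the forward pass again before the second backward.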

@davda54 davda54 added the question Further information is requested label Jan 20, 2021
@jeongHwarr

jeongHwarr commented Jan 27, 2021

I had a similar problem.
Here is the code where I made the mistake:

output, logit_map = model(image)

loss = 0
for t in range(num_tasks):
    loss_t, acc_t = get_loss(output, target, t, device, cfg)
    loss += loss_t
    loss_sum[t] += loss_t.item()
    acc_sum[t] += acc_t.item()

optimizer.zero_grad()
loss.backward()
optimizer.first_step(zero_grad=True)
output, logit_map = model(image)  # second forward pass (correct)

for t in range(num_tasks):
    loss_t, acc_t = get_loss(output, target, t, device, cfg)
    loss += loss_t  # BUG: loss was never reset to 0, so it still carries the first graph
    loss_sum[t] += loss_t.item()
    acc_sum[t] += acc_t.item()
loss.backward()
optimizer.second_step(zero_grad=True)

I think it was probably due to this issue.

The loss variable used for the first backward pass and the one used for the second backward pass must be different; I forgot to reinitialize the variable before the second pass.

Also check whether you forgot to run the model's forward pass again before the second backward.
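A runnable sketch of the corrected loop is below. Note that `model`, `get_loss`, and the tensors are toy stand-ins for the names in the snippet above (the real `get_loss` also returns accuracy), and plain SGD replaces SAM's `first_step`/`second_step`, since only the graph handling is being illustrated:

```python
import torch
import torch.nn.functional as F

# Toy stand-ins for the training code above
model = torch.nn.Linear(5, 3)
image = torch.randn(4, 5)
target = torch.randint(0, 3, (4,))
num_tasks = 2

def get_loss(output, target, t):
    # hypothetical per-task loss; stands in for get_loss(output, target, t, device, cfg)
    return F.cross_entropy(output, target)

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

output = model(image)
loss = 0
for t in range(num_tasks):
    loss = loss + get_loss(output, target, t)
optimizer.zero_grad()
loss.backward()        # first backward pass
optimizer.step()       # with SAM: optimizer.first_step(zero_grad=True)
optimizer.zero_grad()

output = model(image)  # fresh forward pass -> fresh graph
loss = 0               # reset: do NOT keep accumulating into the old loss
for t in range(num_tasks):
    loss = loss + get_loss(output, target, t)
loss.backward()        # second backward pass succeeds
optimizer.step()       # with SAM: optimizer.second_step(zero_grad=True)
```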

@davda54
Owner

davda54 commented Jan 27, 2021

Yeah, exactly: you shouldn't use the same loss variable for both forward passes. In your code, loss is the sum of the loss_t terms from both cycles, so autograd fails to backpropagate "through the graph a second time" :)

@AntBlo

AntBlo commented Jan 31, 2021

I made the mistake of doing:

loss = criterion(outputs, labels)
loss.backward()
optimizer.first_step(zero_grad=True)

criterion(outputs, labels).backward()  # reuses `outputs` from the first forward pass -> RuntimeError
optimizer.second_step(zero_grad=True)

But yeah, just make sure you don't reuse model outputs. If you follow davda54's direction and use criterion(model(inputs), labels) for the second backward pass, you should be good. Just a heads-up in case anyone else stumbles on this as I did.

Thank you davda54 for implementing this, by the way! I hope I'm not being rude in asking, but are we allowed to use this code in Kaggle contests? I saw there wasn't a license for the code, so just curious.

@davda54
Owner

davda54 commented Jan 31, 2021

Of course, feel free to use it wherever you want :) Thanks for asking, I've added an MIT license to the repo to make it clear.

@davda54 davda54 closed this as completed Jan 31, 2021
@want2333

Hello, is there a connection between these two errors?

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2048, 196]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

for i, data in enumerate(tqdm(trainloader)):
    if set == 'CUB':
        images, labels, _, _ = data
    else:
        images, labels = data
    images, labels = images.cuda(), labels.cuda()

    optimizer.zero_grad()

    proposalN_windows_score, proposalN_windows_logits, indices, \
    window_scores, _, raw_logits, local_logits, _ = model(images, epoch, i, 'train')

    def closure():
        raw_loss = criterion(raw_logits, labels)
        local_loss = criterion(local_logits, labels)
        windowscls_loss = criterion(proposalN_windows_logits,
                                    labels.unsqueeze(1).repeat(1, proposalN).view(-1))
        if epoch < 2:
            total_loss = raw_loss
        else:
            total_loss = raw_loss + local_loss + windowscls_loss
        total_loss.backward()             ### problem line
        return total_loss

    raw_loss = criterion(raw_logits, labels)
    local_loss = criterion(local_logits, labels)
    windowscls_loss = criterion(proposalN_windows_logits,
                                labels.unsqueeze(1).repeat(1, proposalN).view(-1))

    if epoch < 2:
        total_loss = raw_loss
    else:
        total_loss = raw_loss + local_loss + windowscls_loss

    total_loss.backward(retain_graph=True)

    optimizer.step(closure)

scheduler.step()
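The two errors do seem related: in the snippet above, `total_loss.backward(retain_graph=True)` keeps the first graph alive, and the optimizer's first step (run inside `optimizer.step(closure)`) then updates the parameters in place before the closure backpropagates through that stale graph. The same version-counter error can be reproduced without SAM, as in this minimal sketch:

```python
import torch

w = torch.randn(3, requires_grad=True)
loss = (w ** 2).sum()            # the graph saves w for the backward pass
loss.backward(retain_graph=True)

with torch.no_grad():
    w += 1.0                     # in-place update bumps w's version counter,
                                 # like an optimizer step between the two backwards

try:
    loss.backward()              # stale graph -> "modified by an inplace operation"
except RuntimeError as e:
    print("fails as expected:", e)
```

The usual fix is to let the closure do the full forward + loss + backward itself and drop the outer `backward(retain_graph=True)` entirely, so each backward pass runs on a graph built after the most recent parameter update.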
