A bug in batchnorm double backward

I am working on adding batchnorm in the discriminator in WGAN-GP. However, I encountered a bug where **gpu memory continues to increase** when using **batchnorm double backprop**. This bug only occurs when using batchnorm. If i remove batchnorm from the model, the bug doesn't occur.

Here's the code you can experiment with.

```python
import torch
import torch.nn as nn
from torch.autograd import Variable

# Model (partial discriminator)
D = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),
    nn.BatchNorm2d(64),   # if you remove this, the bug does not occur.
    nn.LeakyReLU(0.2))

D.cuda()


for i in range(1000):
    # Input 
    x = Variable(torch.randn(10, 3, 128, 128).cuda(), requires_grad=True)
    out = D(x)
    
    grad = torch.autograd.grad(outputs=out,
                               inputs=x,
                               grad_outputs=torch.ones(out.size()).cuda(),
                               retain_graph=True,
                               create_graph=True,
                               only_inputs=True)[0]
    
    grad_norm = grad.pow(2).sum().sqrt()
    loss = torch.mean((grad_norm - 1)**2)

    # Reset grad and backprop
    D.zero_grad()
    loss.backward()
    
    if (i+1) % 10 == 0:
        print (i+1)
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

A bug in batchnorm double backward #2309

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

A bug in batchnorm double backward #2309

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions