
Deform Pool: Multiple Forward-Backward Gives Buffer / Retain_graph Related Bugs #3

Open
chengdazhi opened this issue Dec 22, 2018 · 4 comments


@chengdazhi

Hi,

When I run deformable pooling (modulated or not) two times (forward -> backward -> forward -> backward), the second backward raises the following error:

Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.

This doesn't happen with the deform convs, so it seems specific to pooling. What is your opinion?
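
Roughly, my setup looks like the sketch below. The DCNPooling arguments mirror your test.py; the import path, the rois layout (batch_index, x1, y1, x2, y2) and the tensor shapes are assumptions for illustration only:

    import torch
    from dcn_v2 import DCNPooling  # import path assumed; adjust to this repo's layout

    # assumed shapes: 2 images, 32 channels, 64x64 feature map (stride-4 features)
    input = torch.randn(2, 32, 64, 64, requires_grad=True).cuda()

    # rois assumed as (batch_index, x1, y1, x2, y2) in input-image coordinates
    rois = torch.tensor([[0., 0., 0., 60., 60.],
                         [1., 32., 32., 124., 124.]]).cuda()

    dpooling = DCNPooling(spatial_scale=1.0 / 4,
                          pooled_size=7,
                          output_dim=32,
                          no_trans=False,
                          group_size=1,
                          trans_std=0.1,
                          deform_fc_dim=1024).cuda()

    out = dpooling(input, rois)
    out.mean().backward()        # first forward -> backward: fine

    out = dpooling(input, rois)  # second forward builds a fresh graph
    out.mean().backward()        # second backward raises the retain_graph error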

@CharlesShang
Owner

Hi Dazhi,
Could you check whether you are on the latest version?
I just ran the code below, and it works fine.

    # modulated deformable pooling (V2)
    dpooling = DCNPooling(spatial_scale=1.0 / 4,
                          pooled_size=7,
                          output_dim=32,
                          no_trans=False,
                          group_size=1,
                          trans_std=0.1,
                          deform_fc_dim=1024).cuda()

    # run forward -> backward repeatedly (input and rois are defined earlier in the test script)
    for _ in range(10):
        dout = dpooling(input, rois)
        target = dout.new(*dout.size())
        target.data.uniform_(-0.1, 0.1)
        error = (target - dout).mean()
        error.backward()

@chengdazhi
Author

Hi Charles, thanks for your reply. I have to use your implementation for PyTorch 0.4, and the problem is also triggered on PyTorch 0.4.1.

I came up with an ugly but workable solution: replace Function.save_for_backward and Function.saved_tensors with storing the tensors directly as members of the Function object. I have checked that this workaround produces correct results, but I'm not sure whether it would cause other problems.
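
For concreteness, the change looks roughly like this on a generic old-style (PyTorch 0.4) autograd Function; the class names and the placeholder forward body are illustrative, not the actual pooling code:

    import torch
    from torch.autograd import Function

    class PoolWithSaveForBackward(Function):
        # original pattern: tensors go through autograd's save_for_backward
        def forward(self, input, rois):
            output = input.clone()               # placeholder for the real CUDA kernel
            self.save_for_backward(input, rois)  # buffers are freed after the first backward
            return output

        def backward(self, grad_output):
            input, rois = self.saved_tensors     # this is the line that raises for me
            return grad_output, None

    class PoolWithPlainAttributes(Function):
        # workaround: keep the tensors as ordinary attributes on the Function object
        def forward(self, input, rois):
            output = input.clone()               # placeholder for the real CUDA kernel
            self.input, self.rois = input, rois
            return output

        def backward(self, grad_output):
            input, rois = self.input, self.rois  # plain attributes are not freed by autograd
            return grad_output, None

The attribute version bypasses autograd's bookkeeping for saved buffers, which is presumably why it silences the error, but it also keeps the tensors alive for as long as the Function object lives.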

@CharlesShang
Owner

I don't think save_for_backward is the reason for the "buffers have already been freed" error.
I think you might be calling backward twice, like forward -> backward -> backward.
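
For comparison, here is a tiny example, unrelated to DCN, that produces the same message when backward is called twice on the same graph:

    import torch

    x = torch.randn(4, requires_grad=True)
    y = (x * x).sum()

    y.backward()   # first backward frees the graph buffers
    y.backward()   # RuntimeError: Trying to backward through the graph a second time, ...
                   # unless the first call was y.backward(retain_graph=True)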

@chengdazhi
Copy link
Author

I am pretty positive that I didn't call backward twice in a row. If you are not convinced, it should be very easy for you to switch to the PyTorch 0.4 implementation and add a loop to example_mdpooling() in test.py. The error is triggered when calling _, _, ..., _ = self.saved_tensors, and it seems related to your CUDA code. I'm not familiar with PyTorch internals, and I failed to pinpoint the root cause.
