
why is detach necessary #116

Closed
rogertrullo opened this Issue Mar 20, 2017 · 4 comments

@rogertrullo

rogertrullo commented Mar 20, 2017

Hi, I am wondering why detach is necessary in this line:

output = netD(fake.detach())

I understand that we want to update the gradients of netD without changing those of netG. But if the optimizer is only using the parameters of netD, then only its weights will be updated. Am I missing something here?
Thanks in advance!
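
For context, a minimal sketch of the claim in the question, using hypothetical toy stand-ins (nn.Linear modules) for the real DCGAN netG and netD: even without detach, stepping optimizerD leaves netG's weights unchanged.

import torch
import torch.nn as nn

netG = nn.Linear(10, 4)                                # toy "generator"
netD = nn.Sequential(nn.Linear(4, 1), nn.Sigmoid())    # toy "discriminator"
optimizerD = torch.optim.SGD(netD.parameters(), lr=0.1)

g_before = netG.weight.detach().clone()
output = netD(netG(torch.randn(8, 10)))                # no detach here
nn.BCELoss()(output, torch.zeros_like(output)).backward()
optimizerD.step()                                      # steps netD only

assert torch.equal(netG.weight, g_before)              # netG is untouched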

@soumith

Member

soumith commented Mar 20, 2017

You are correct; it is done for speed, not correctness. The computation of gradients w.r.t. the weights of netG can be fully avoided in the backward pass if the graph is detached where it is.
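
To illustrate (toy stand-in modules again, not the actual DCGAN code): without detach, backward() also computes gradients for netG that optimizerD never uses; with detach, that part of the backward pass is skipped entirely.

import torch
import torch.nn as nn

netG = nn.Linear(10, 4)
netD = nn.Sequential(nn.Linear(4, 1), nn.Sigmoid())
criterion = nn.BCELoss()

# Without detach: backward traverses D *and* all of G -- wasted work,
# since only netD's gradients are ever used by optimizerD.
out = netD(netG(torch.randn(8, 10)))
criterion(out, torch.zeros_like(out)).backward()
print(netG.weight.grad is not None)    # True: G's gradients were computed

# With detach: backward stops at the detached tensor and never enters G.
netG.weight.grad = None
out = netD(netG(torch.randn(8, 10)).detach())
criterion(out, torch.zeros_like(out)).backward()
print(netG.weight.grad is None)        # True: G's backward was skipped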

@soumith soumith closed this Mar 20, 2017

@rogertrullo

rogertrullo commented Mar 20, 2017

Thanks for the quick reply @soumith!

@sunshineatnoon

sunshineatnoon commented Apr 1, 2017

I missed detach when implementing DCGAN in PyTorch, and it gave me this error:

RuntimeError: Trying to backward through the graph second time, but the buffers have already been freed. Please specify retain_variables=True when calling backward for the first time.
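
For reference, a minimal sketch that reproduces this failure mode, assuming the usual loop order (D is updated on the fake batch first, then G). Newer PyTorch versions word the error in terms of retain_graph=True rather than retain_variables=True. Toy stand-in modules, not the real DCGAN code:

import torch
import torch.nn as nn

netG = nn.Linear(10, 4)
netD = nn.Sequential(nn.Linear(4, 1), nn.Sigmoid())
criterion = nn.BCELoss()

fake = netG(torch.randn(8, 10))                   # builds G's graph once

# D step WITHOUT detach: this backward frees G's graph buffers.
out = netD(fake)
criterion(out, torch.zeros_like(out)).backward()

# G step: backward must traverse G's (already freed) graph again.
out = netD(fake)
criterion(out, torch.ones_like(out)).backward()   # raises the RuntimeError above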
@plopd

plopd commented Jul 1, 2018

@soumith This is not quite true. Detaching fake from the graph is also what lets us avoid forward-passing the noise through G a second time when we actually update the generator. If we do not detach, then, although gradients w.r.t. G are not needed for the update of D, D's backward pass will still traverse G's part of the graph, and since backward() frees the graph's buffers by default (retain_graph=False), fake's graph will no longer be available when it is time to update G.
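
Putting it together, a sketch of the loop described above (toy stand-in modules; the real DCGAN code differs in the models and data): detach keeps the D step from consuming fake's graph, so the same fake can be reused, with gradients flowing back into G, when the generator is updated.

import torch
import torch.nn as nn

netG = nn.Linear(10, 4)
netD = nn.Sequential(nn.Linear(4, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizerD = torch.optim.SGD(netD.parameters(), lr=0.01)
optimizerG = torch.optim.SGD(netG.parameters(), lr=0.01)

fake = netG(torch.randn(8, 10))        # one forward pass through G

# Update D: detach, so this backward never touches (or frees) G's graph.
netD.zero_grad()
out = netD(fake.detach())
criterion(out, torch.zeros_like(out)).backward()
optimizerD.step()

# Update G: reuse the same `fake`; its graph is intact, so this works.
netG.zero_grad()
out = netD(fake)                       # fresh graph through the updated D
criterion(out, torch.ones_like(out)).backward()
optimizerG.step()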
