
Why do we have detach() in self.alphas = alphas.detach() in the Attention class in Chapter 9? #23

Closed
eisthf opened this issue Jul 26, 2021 · 2 comments

Comments

eisthf commented Jul 26, 2021

I wonder why alphas is detach()-ed before being saved to self.alphas in the Attention class. I tried self.alphas = alphas, that is, without detach, and trained the model. There was no difference in performance, so I believe the reason lies elsewhere.

Thank you for your great teaching in your great book!


dvgodoy commented Jul 26, 2021

Hi,

Thank you for supporting my work, and for your kind words :-)

Regarding the "detachment" of the alphas, the main idea is to prevent unintentional changes to the dynamic computation graph.
If you don't detach the alphas, it shouldn't change anything in the training process, as you already noticed.

But let's say you pause training and decide to take a peek at the alphas. You may end up performing an operation on them, and, since the graph keeps track of every operation performed on gradient-requiring tensors and their dependencies, it will impact the graph. That may be an issue if you resume training afterward.
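To make the difference concrete, here's a minimal sketch (the tensor names are just stand-ins, not the book's actual code): any operation on an attached tensor is tracked by autograd, while a detached copy can be manipulated freely.

```python
import torch

# a gradient-requiring tensor standing in for the alphas
alphas = torch.softmax(torch.randn(3, requires_grad=True), dim=0)

kept = alphas            # still attached to the graph
safe = alphas.detach()   # shares the same data, but is cut off from the graph

# "peeking" at the attached version creates new graph nodes
peek = kept * 2
print(peek.requires_grad)       # True - tracked by autograd

# the detached version involves no graph at all
peek_safe = safe * 2
print(peek_safe.requires_grad)  # False - safe to play with
```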

In other circumstances, like the validation loop, we wrap the operations with a no_grad context manager to prevent potential problems.
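For reference, the no_grad pattern looks like this (a generic sketch, not the book's validation loop): nothing computed inside the context manager is added to the graph.

```python
import torch

model = torch.nn.Linear(2, 1)
x = torch.randn(4, 2)

# inside no_grad, the forward pass builds no graph at all
with torch.no_grad():
    preds = model(x)

print(preds.requires_grad)  # False - no gradients will flow through preds
```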
The same goes for the detachment of the alphas: it's there as a safeguard, to make sure it's totally safe to play with the values in self.alphas. It's also convenient, because you'd need to detach them anyway if you wanted to convert the alphas to NumPy arrays.
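Putting both points together, here's a hypothetical, stripped-down attention layer (the real class in the book computes the alphas from queries and keys; this one just projects random scores) showing the pattern: the returned alphas stay in the graph for training, while the stored copy is detached and can go straight to NumPy.

```python
import torch
import torch.nn as nn

class ToyAttention(nn.Module):
    def __init__(self, seq_len=5):
        super().__init__()
        self.proj = nn.Linear(seq_len, seq_len)
        self.alphas = None  # for inspection only, never used in training

    def forward(self, scores):
        alphas = torch.softmax(self.proj(scores), dim=-1)
        # detach before storing: self.alphas is safe to inspect later
        self.alphas = alphas.detach()
        return alphas  # the attached version still flows through the graph

attn = ToyAttention()
out = attn(torch.randn(1, 5))
print(out.requires_grad)           # True - training is unaffected
print(attn.alphas.requires_grad)   # False - safe to play with
print(attn.alphas.numpy().shape)   # (1, 5) - no extra .detach() needed
```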

Hope it helps :-)


eisthf commented Jul 26, 2021

Thank you so much! :-))

@eisthf eisthf closed this as completed Jul 26, 2021