use functional interface for softmax in attention #14198
Conversation
Thank you, Tom. Let's just remove the torch. prefix for consistency; nn should already be imported.
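For illustration, the suggested consistency fix looks like this (a minimal sketch with a made-up tensor; the actual changed lines live in the diff):

```python
import torch
from torch import nn

scores = torch.randn(2, 4, 4)  # hypothetical attention scores

# before: fully qualified call
probs = torch.nn.functional.softmax(scores, dim=-1)

# after: rely on the nn alias that is already imported
probs = nn.functional.softmax(scores, dim=-1)
```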
Thanks a lot for your PR! I see a few instances left (in research_projects/bertabs and the tests); is there a reason to leave those?
Probably it's an attention thing. :)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Closing this as stale.
Out of curiosity, why didn't you want to fix the last two instances, closing your PR instead?
@sgugger To be honest, I'd have preferred not to comment on this further. But since you asked, from my side this is what happened:
Then instead of merging:
I agree with the bot that the PR wasn't moving at the expected speed, but I don't want to spend more time on it. I thought that searching for the same pattern across your codebase and fixing the other 27 places was a reasonable trade-off between "don't submit 28 identical one-liners" and "don't make it more complicated than it needs to be"; obviously, you did not agree. That is OK, but I don't want to do the extra work that would be needed to make this patch acceptable to you. You work at scale and process hundreds of patches on any given day, and I am not saying your process isn't adequate. It's just not for me.
There seems to have been a misunderstanding here, and I just wish you had told us either that the remaining instances you had found were fine in your book, or that you didn't want to work further on this PR. It was just a suggestion on my side; I never said the PR would not be merged if you didn't include those last two instances. I apologize if my comment upset you. If you want to reopen your PR, we'll be happy to merge it.
No worries, there is nothing wrong with it or with your comment; it's just that I don't want to change this patch anymore. (And maybe I wasn't in the mood for your bot comment, but hey.)
The bot is there to remind us when we forget PRs for a long time, like this one; sorry about it :-)
Thank you! You're awesome!
* use functional interface instead of instantiating module and immediately calling it
* fix torch.nn.functional to nn.functional. Thank you Stas!
There are several instances of (ab)using the PyTorch modular interface to compute softmax where it would be more natural to use the functional interface. This patch changes the occurrences I found.
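A minimal sketch of the pattern being replaced (tensor names are made up; the real diff touches the model files):

```python
import torch
from torch import nn

attention_scores = torch.randn(1, 8, 16, 16)  # hypothetical (batch, heads, seq, seq)

# modular interface: constructs a throwaway nn.Softmax module just to call it once
attention_probs = nn.Softmax(dim=-1)(attention_scores)

# functional interface: same computation, no module allocation
attention_probs = nn.functional.softmax(attention_scores, dim=-1)
```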