
use functional interface for softmax in attention #14198

Merged: 2 commits into huggingface:master on Nov 30, 2021

Conversation

@t-vi (Contributor) commented on Oct 28, 2021

There are several instances of (ab)using the PyTorch modular interface to compute softmax where it would be more natural to use the functional interface. This patch changes the occurrences I found.
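A minimal sketch of the pattern being replaced (a hedged illustration; the variable names and tensor shapes are made up, not taken from the actual diff):

```python
import torch

# Illustrative attention logits: (batch, heads, query_len, key_len).
attention_scores = torch.randn(2, 8, 16, 16)

# Before: a Softmax module is instantiated only to be called once.
attention_probs = torch.nn.Softmax(dim=-1)(attention_scores)

# After: the functional interface, with no throwaway module instance.
attention_probs = torch.nn.functional.softmax(attention_scores, dim=-1)
```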

@stas00 (Contributor) left a comment

Thank you, Tom.

Let's just remove the torch. prefix for consistency. nn should already be imported.
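For illustration, the call after that cleanup (a sketch assuming `from torch import nn` is already in scope, as is conventional in the transformers modeling files):

```python
import torch
from torch import nn

attention_scores = torch.randn(2, 8, 16, 16)  # illustrative logits

# Same functional call, without the redundant torch. prefix.
attention_probs = nn.functional.softmax(attention_scores, dim=-1)
```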

@sgugger (Collaborator) left a comment

Thanks a lot for your PR! I see a few instances left (in research_projects/bertabs and in the tests); is there a reason to leave those?

@t-vi (Contributor, Author) commented on Oct 28, 2021

Probably it's an attention thing. :)

@github-actions (bot) commented

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@t-vi (Contributor, Author) commented on Nov 28, 2021

Closing this as stale.

@t-vi closed this on Nov 28, 2021

@sgugger (Collaborator) commented on Nov 28, 2021

Out of curiosity, why didn't you want to fix the last two instances, and why did you close your PR instead?

@t-vi (Contributor, Author) commented on Nov 30, 2021

@sgugger To be honest, I'd have preferred not to comment on this further. But as you asked:

From my side, this is what happened:

  • I use transformers as a reference for some custom implementation (i.e. I don't plan on using transformers itself) and notice a dubious code pattern of instantiating Softmax and immediately using it.

  • Out of courtesy, I submit a PR. To make the review worthwhile, I look for other instances of the identical pattern and fix those, too.
    At that point, I had submitted a patch that looked "very low risk (because I never say obviously correct) and easy to review" to me.
    It probably doesn't fix all dubious patterns in transformers, but it takes out 28 instances spread across as many files.

  • The patch is promptly approved by you and stas. Thank you!

Then instead of merging:

  • You point to two other directories that you say contain more of it (I still don't know what you mean; there are uses of the modular Softmax there, but in a way that looks perfectly OK to me, and I didn't want to audit your entire codebase for dubious patterns).
  • Nothing happens for a month.
  • Now your bot says the PR is stale and places the burden of moving it along on me.

I agree with the bot that the PR isn't moving at the expected speed, but I don't want to spend more time on it. I thought that searching for the same pattern across your codebase and fixing the other 27 places would have been a reasonable trade-off between "don't submit 28 identical one-liners" and "don't make it more complicated than it needs to be"; obviously, you did not agree. That is OK, but I don't want to do the extra work that would be needed to make this patch acceptable to you.

You work at scale and process hundreds of patches on any given day, and I am not saying your process isn't adequate. It is just not for me.

@sgugger (Collaborator) commented on Nov 30, 2021

There seems to have been a misunderstanding here and I just wish you had either told us that the remaining instances you had found were fine in your book, or that you didn't want to work further on this PR. It was just a suggestion on my side and I never said the PR would not be merged if you didn't include those last two instances. I apologize if my comment upset you.

If you want to reopen your PR, we'll be happy to merge it.

@t-vi reopened this on Nov 30, 2021

@t-vi (Contributor, Author) commented on Nov 30, 2021

No worries, there is nothing wrong with it or with your comment; it's just that I don't want to change this patch anymore. (And maybe I wasn't in the mood for your bot's comment, but hey.)
If it's still useful, I'm happy to have it merged; if not, that's OK too.

@sgugger merged commit 6ed9882 into huggingface:master on Nov 30, 2021
@sgugger (Collaborator) commented on Nov 30, 2021

The bot is there to remind us when we forget PRs for a long time, like this one; sorry about that :-)
Thanks again for your contribution!

@t-vi (Contributor, Author) commented on Nov 30, 2021

Thank you! You're awesome!

Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request on Jan 27, 2022:
* use functional interface instead of instantiating module and immediately calling it

* fix torch.nn.functional to nn.functional. Thank you Stas!