Tagger: use unnormalized probabilities for inference #10197
Using unnormalized softmax avoids the relatively expensive exp function, which can significantly speed up non-transformer models (e.g. I measured a 27% speedup on a German tagging + parsing pipeline).
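For readers who want to see why skipping normalization is safe for prediction, here is a minimal sketch in plain Python. It assumes that "unnormalized" means the raw pre-softmax scores; the `softmax` and `argmax` helpers and the example scores are hypothetical and are not taken from the spaCy or Thinc code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical raw scores for one token over three candidate tags.
logits = [1.2, -0.3, 2.5]

# softmax is strictly increasing in each score, so the winning tag
# is the same with or without normalization.
assert argmax(logits) == argmax(softmax(logits))
```

Only code that reads the scores as probabilities needs the normalized values; picking the best tag does not.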
There are users who use the scores directly and (I assume) would be expecting them to be normalized. I'm not saying we necessarily shouldn't do this, but this proposal doesn't give these users any way to control this behavior, does it?
We could make this configurable. Maybe we should also only provide this functionality in a
That sounds like a reasonable proposal.
Normalization of probabilities is disabled by default to improve performance.
Agreed that it's a good idea to make this configurable in a new version of the Tagger.
I find it slightly unintuitive to go from a v1 that always normalizes to a v2 with normalization off by default. It might still catch some users off guard that the behaviour changes between the two versions when you rely on defaults. But then again, it is documented in the docs, and I can see why this would be beneficial in most cases / for most users. So I'm leaning towards keeping it like the PR has it currently :-)
Types of change
Performance improvement
Checklist
Draft since this requires explosion/thinc#583 and a new Thinc version.