Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Vectorize the softmax calculation when not along the last dim (#59195)
Summary: Currently, if we do softmax which are not along the last dim, the calculation will fall to a [scalar version](https://github.com/pytorch/pytorch/blob/d417a094f398f1c4efd7f818b14b8471a597fbcc/aten/src/ATen/native/SoftMax.cpp#L14-L64). And we find actually we have the chance to vectorize the calculation along the inner_size dim. Changes we made: - Use vectorized softmax_kernel instead of host_softmax when not along the last dim. Performance data on 28 cores' Intel 8280 CPU when the Input size is [32, 81, 15130] and do softmax along the second dim(81). - FP32 Baseline: 24.67 ms - FP32 optimized: 9.2 ms Pull Request resolved: #59195 Reviewed By: ailzhang Differential Revision: D28854796 Pulled By: cpuhrsch fbshipit-source-id: 18477acc3963754c59009b1794f080496ae16c3d
- Loading branch information
1 parent
d60d81b
commit 68d690f
Showing
3 changed files
with
123 additions
and
4 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters