[PyTorch] MHA: just use existing softmax on CPU (#72462)
Summary:
Pull Request resolved: pytorch/pytorch#72462

Eliminating one potential source of bugs.
ghstack-source-id: 149067329

Test Plan: CI

Reviewed By: zrphercule

Differential Revision: D34006432

fbshipit-source-id: 55fda186636dc457db7f3f9c8e18f1627ff33b6a
(cherry picked from commit 5d8de9a12200db236d0fedfd3b13b1209fd4bc18)
swolchok authored and cyyever committed Feb 17, 2022
1 parent b685d38 commit 76364d9
Showing 1 changed file with 4 additions and 1 deletion.
aten/src/ATen/native/attention.cpp (4 additions, 1 deletion)

@@ -131,13 +131,16 @@ Tensor bmm_nt(const Tensor& a, const Tensor& b) {
 }
 
 void masked_softmax_dropout(
-    const Tensor& attn_scores,
+    Tensor& attn_scores,
     const c10::optional<Tensor>& attn_mask) {
   auto B = attn_scores.size(0);
   auto num_heads = attn_scores.size(1);
   auto T = attn_scores.size(2);
   if (attn_mask) {
     TORCH_CHECK(attn_mask->is_contiguous());
+  } else {
+    at::_softmax_out(attn_scores, attn_scores, 3, false);
+    return;
   }
   AT_DISPATCH_FLOATING_TYPES_AND2(
       ScalarType::Half,
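The effect of the new else-branch, sketched below in isolation (a minimal illustration, not part of the commit; the function name softmax_no_mask_sketch is made up for this example): when no attention mask is supplied, the [B, num_heads, T, T] score tensor is normalized in place along its last dimension by ATen's existing softmax, and the bespoke CPU kernel path in the dispatch below is skipped. This is also why the attn_scores parameter changes from const Tensor& to Tensor&: the function now writes into it.

#include <ATen/ATen.h>

// Minimal sketch of the fast path added by this commit (assumes an ATen
// build where at::_softmax_out is available, as in the patched file).
void softmax_no_mask_sketch(at::Tensor& attn_scores) {
  // Out-of-place equivalent via the public API:
  //   attn_scores = at::softmax(attn_scores, /*dim=*/3);
  // The commit uses the in-place _softmax_out variant, so no extra
  // buffer is allocated for the normalized scores.
  at::_softmax_out(attn_scores, attn_scores, /*dim=*/3, /*half_to_float=*/false);
}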
