XLNet Bug when training with apex 16-bit precision (huggingface#6567)
* xlnet fp16 bug fix

* comment cast added

* Update modeling_xlnet.py

Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
2 people authored and Zigur committed Oct 26, 2020
1 parent f4df3a9 commit 14c1658
Showing 1 changed file with 2 additions and 1 deletion.
src/transformers/modeling_xlnet.py:

@@ -446,7 +446,8 @@ def forward(
         v_head_h = torch.einsum("ibh,hnd->ibnd", cat, self.v)

         # positional heads
-        k_head_r = torch.einsum("ibh,hnd->ibnd", r, self.r)
+        # type casting for fp16 support
+        k_head_r = torch.einsum("ibh,hnd->ibnd", r.type(self.r.dtype), self.r)

         # core attention ops
         attn_vec = self.rel_attn_core(
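A minimal standalone sketch (not part of the commit) of the failure mode and the fix: under apex 16-bit training the projection parameter self.r can be cast to fp16 while the relative positional encoding r stays fp32, and torch.einsum does not promote mixed dtypes. The tensor shapes and the name r_proj (standing in for self.r) are illustrative assumptions.

import torch

# Stand-ins for the tensors in XLNet's relative attention:
# r      -> relative positional encoding, left in fp32 by apex
# r_proj -> the self.r parameter, cast to fp16 for mixed-precision training
r = torch.randn(5, 2, 8)               # shape (seq, batch, hidden), fp32
r_proj = torch.randn(8, 4, 16).half()  # shape (hidden, heads, head_dim), fp16

try:
    # Mixed fp32/fp16 operands: einsum raises a dtype-mismatch RuntimeError
    k_head_r = torch.einsum("ibh,hnd->ibnd", r, r_proj)
except RuntimeError as err:
    print(f"without the cast: {err}")

# The committed fix: cast r to the parameter's dtype before the einsum.
# This is a no-op in full fp32 training and an fp32->fp16 cast under apex.
# (Runs on GPU, or on CPU with a PyTorch version that supports fp16 matmul.)
k_head_r = torch.einsum("ibh,hnd->ibnd", r.type(r_proj.dtype), r_proj)
print(k_head_r.dtype)  # torch.float16

Casting r rather than self.r keeps the parameter's dtype under apex's control, so the same line works unchanged in both fp32 and fp16 training.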
