Fix mixed precision in TF models #9163
Conversation
@@ -56,14 +57,20 @@ def mish(x):

 def gelu_fast(x):
     x = tf.convert_to_tensor(x)
-    coeff1 = tf.cast(7978845608, x.dtype)
+    coeff1 = tf.cast(0.7978845608, x.dtype)
wow was that wrong the whole time before?
Yep! I was as surprised as you 😄
LGTM, thanks for fixing this!
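For context, the corrected constant is sqrt(2/pi) ≈ 0.7978845608, used by the tanh approximation of GELU. A minimal sketch of the fixed function (mirroring the diff above; the exact body in `activations_tf.py` may differ slightly) shows why both constants are cast to `x.dtype`, which keeps the op safe under a float16 mixed-precision policy:

```python
import tensorflow as tf

def gelu_fast(x):
    # tanh approximation of GELU; casting the constants to x.dtype avoids
    # dtype mismatches when x is float16 under a mixed-precision policy.
    x = tf.convert_to_tensor(x)
    coeff1 = tf.cast(0.7978845608, x.dtype)  # sqrt(2 / pi), previously missing the leading "0."
    coeff2 = tf.cast(0.044715, x.dtype)
    return 0.5 * x * (1.0 + tf.tanh(x * coeff1 * (1.0 + coeff2 * x * x)))
```

With the old integer constant `7978845608`, the tanh saturates for any nonzero input and the function degenerates to ReLU-like behavior, which is why the bug was easy to miss.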
src/transformers/activations_tf.py
Outdated
if version.parse(tf.version.VERSION) >= version.parse("2.4"):
    gelu = tf.keras.activations.gelu
else:
    gelu = tf.keras.layers.Activation(_gelu)
Nice to be able to use the TF version now!
What does this PR do?
This PR aims to fix the mixed-precision issues that appear when tf.keras.mixed_precision.experimental.set_policy() is set to something other than tf.float32. Along the same lines, it also fixes some TFLite quantization issues. Before this PR can proceed further, PR #9418 has to be merged.
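The core pattern behind these fixes: under a mixed-precision policy, layer inputs arrive as float16, so any Python constant mixed into the computation must be cast to the input's dtype rather than left to default to float32. A minimal sketch (the `scaled` function is a hypothetical illustration, not code from this PR):

```python
import tensorflow as tf

def scaled(x):
    # Casting the constant to x.dtype keeps the result in the input's
    # precision, e.g. float16 under a "mixed_float16" policy.
    x = tf.convert_to_tensor(x)
    return tf.cast(0.5, x.dtype) * x

# float16 input, as a mixed-precision layer would receive it:
x16 = tf.constant([2.0], dtype=tf.float16)
y = scaled(x16)
```

Writing `0.5 * x` directly would work here because TF promotes Python scalars to the tensor's dtype, but constants created with `tf.constant` or passed through float32 intermediates do not get that treatment, hence the explicit casts throughout `activations_tf.py`.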
Fixes #7052