Layernorm one_plus support in fast_layer_norm #1557
base: master
Conversation
@@ -80,7 +80,8 @@ layer_norm::BwdFunction & get_bwd_launcher(torch::Dtype wtype, torch::Dtype ityp
 std::vector<at::Tensor> ln_fwd(const at::Tensor &x,     // BxSxhidden_size
                                const at::Tensor &gamma, // hidden_size
                                const at::Tensor &beta,  // hidden_size
-                               const float epsilon
+                               const float epsilon,
+                               const float one_plus
Since this is a float anyway, we could call it something like gamma_shift and allow values other than 0 and 1.
This is not needed, so this is just a side note.
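For context, a minimal pure-PyTorch sketch of what such a generalized parameterization could look like (this is illustrative only and not part of the PR; `gamma_shift` and the function name are hypothetical):

```python
import torch

def layer_norm_shifted_gamma(x, gamma, beta, eps=1e-5, gamma_shift=1.0):
    """Reference LayerNorm with a shifted gamma.

    gamma_shift=0.0 reproduces the standard parameterization;
    gamma_shift=1.0 corresponds to the "one_plus" variant, where gamma is
    stored zero-initialized and the +1 is added at compute time.
    """
    mean = x.mean(dim=-1, keepdim=True)
    var = x.var(dim=-1, unbiased=False, keepdim=True)
    x_hat = (x - mean) * torch.rsqrt(var + eps)
    return (gamma + gamma_shift) * x_hat + beta
```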
for is_1p in use_1p:
    for h in hidden_sizes:
        with self.subTest(f"hidden_size={h}, use_1p={is_1p}"):
            self.assertAll(_test_impl(256, 2, h, fp32, fp32, is_1p))
Is there an existing test (already in Apex) that checks that a custom LayerNorm (like FastLayerNormFN) behaves exactly the same as e.g. PyTorch LayerNorm? It would be nice to double-check that our modification of adding +1 is mathematically correct. It seems that _test_impl uses backward_ to check correctness, which uses the same +1 logic as the kernels.
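One way such a cross-check could look (a sketch only; `fast_layer_norm_1p` is a hypothetical stand-in for the kernel entry point under test, not an existing Apex function) is to compare against torch.nn.functional.layer_norm with the +1 folded into the weight:

```python
import torch
import torch.nn.functional as F

def check_one_plus_against_pytorch(fast_layer_norm_1p, hidden_size=1024, eps=1e-5):
    # fast_layer_norm_1p is a placeholder: it should take (x, gamma, beta, eps)
    # and internally apply (gamma + 1) as the scale.
    x = torch.randn(256, 2, hidden_size, device="cuda", dtype=torch.float32)
    gamma = torch.randn(hidden_size, device="cuda")  # zero-initialized in practice; random here to stress the check
    beta = torch.randn(hidden_size, device="cuda")

    out_kernel = fast_layer_norm_1p(x, gamma, beta, eps)
    # Reference: plain PyTorch LayerNorm with the +1 folded into the weight.
    out_ref = F.layer_norm(x, (hidden_size,), weight=gamma + 1.0, bias=beta, eps=eps)

    torch.testing.assert_close(out_kernel, out_ref, rtol=1e-5, atol=1e-5)
```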
In this implementation, the gamma tensor is initialized to 0 (instead of 1) and the +1 is handled numerically in the kernel. This behavior is enabled by setting use_1p=True (default: False).
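For illustration, a pure-PyTorch module mirroring that description (the name `OnePlusLayerNorm` and the `use_1p` wiring are assumptions for this sketch, not the module this PR modifies):

```python
import torch
import torch.nn as nn

class OnePlusLayerNorm(nn.Module):
    """Reference-only sketch: with use_1p=True, gamma is zero-initialized
    and (gamma + 1) is used as the scale in the forward pass."""

    def __init__(self, hidden_size, eps=1e-5, use_1p=False):
        super().__init__()
        self.eps = eps
        self.use_1p = use_1p
        # gamma starts at 0 when use_1p is enabled, at 1 otherwise.
        init = torch.zeros(hidden_size) if use_1p else torch.ones(hidden_size)
        self.gamma = nn.Parameter(init)
        self.beta = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x):
        weight = self.gamma + 1.0 if self.use_1p else self.gamma
        return nn.functional.layer_norm(
            x, self.gamma.shape, weight=weight, bias=self.beta, eps=self.eps
        )
```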