[WIP] Add Layer Normalization #7251
See also #7175
This adds 2 new ops in SameDiff: Standardize and LayerNorm.
Standardize transforms each example in the given ndarray to zero mean and unit variance, with the statistics computed along the given dimensions.
LayerNorm will then use Standardize and additionally apply the gain multiplication and optionally add a bias.
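For reviewers who want the arithmetic spelled out, here is a minimal sketch of what the two ops compute, written in plain Python rather than with the SameDiff API. It is not the code in this PR: the function names, the `eps` stabilizer, and the use of the population variance are assumptions made for illustration, and it normalizes over a single feature dimension only.

```python
import math

def standardize(row, eps=1e-5):
    """Shift and scale one example to zero mean and (approximately) unit variance.

    eps is an assumed numerical-stability term; it is not taken from the PR.
    """
    n = len(row)
    mean = sum(row) / n
    # Population variance (divide by n); this choice is an assumption.
    var = sum((x - mean) ** 2 for x in row) / n
    scale = math.sqrt(var + eps)
    return [(x - mean) / scale for x in row]

def layer_norm(row, gain, bias=None, eps=1e-5):
    """Standardize the example, multiply by the gain, and optionally add a bias."""
    z = standardize(row, eps)
    out = [g * v for g, v in zip(gain, z)]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out

# Each row is one example; statistics are computed per example over its features.
batch = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 10.0, 30.0]]
gain = [1.0, 1.0, 1.0, 1.0]
bias = [0.0, 0.0, 0.0, 0.0]
normalized = [layer_norm(row, gain, bias) for row in batch]
```

With a gain of all ones and no bias, `layer_norm` reduces to `standardize`, which matches the description above of LayerNorm being built on top of Standardize.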
As the layer normalization paper (https://arxiv.org/abs/1607.06450) says that it isn't really suitable for CNNs, this will only be available directly on DenseLayer, SimpleRNN and LSTM.
Aha! Link: https://skymindai.aha.io/features/ND4J-51
After discussing it with @AlexDBlack, I've decided to skip layer normalization support for LSTMs in this PR.
Because neither CuDNN nor MKL-DNN supports layer normalization for LSTMs, the performance hit would be large enough that no one would use it anyway. It would also add a lot of complexity to what is already a pretty complex piece of code.
We can revisit layer normalization support for LSTMs once we start moving the layer implementations over to SameDiff.