
layer normalization in layer? #44

Open
PanXiebit opened this issue Oct 4, 2018 · 3 comments

@PanXiebit commented Oct 4, 2018

https://github.com/NLPLearn/QANet/blob/8107d223897775d0c3838cb97f93b089908781d4/layers.py#L52

Excuse me, but in the paper "Layer Normalization" (Lei Jimmy Ba, Ryan Kiros, and Geoffrey E. Hinton), it says that the mean and variance are computed over all the hidden units in the same layer, and that different training cases have different normalization terms. So I think the mean should be computed like this:

axes = list(range(1, x.shape.ndims))
mean = tf.reduce_mean(x, axes)

So the shape of mean is [batch,], and the variance is [batch,] as well; these are then used to compute the normalized x.
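To make the shapes concrete, here is a minimal runnable sketch of what I mean (TF 1.x API; the placeholder shape is just a hypothetical example):

import tensorflow as tf

x = tf.placeholder(tf.float32, [32, 40, 1, 128])  # hypothetical [batch, length, 1, channels] input
axes = list(range(1, x.shape.ndims))              # [1, 2, 3]: every non-batch axis
mean = tf.reduce_mean(x, axes)                    # shape [32]: one mean per training case
variance = tf.reduce_mean(tf.square(x), axes) - tf.square(mean)  # shape [32]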

In the TensorFlow API for layer normalization, the source code is as below, and I think it does the same thing as mine:
norm_axes = list(range(begin_norm_axis, inputs_rank))
https://github.com/tensorflow/tensorflow/blob/c19e29306ce1777456b2dbb3a14f511edf7883a8/tensorflow/contrib/layers/python/layers/layers.py#L2311
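For reference, a usage sketch of that contrib op (assuming TF 1.x, where tf.contrib is still available); with the default begin_norm_axis=1 it normalizes over every axis after the batch axis:

normalized = tf.contrib.layers.layer_norm(x, begin_norm_axis=1, begin_params_axis=-1)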

@alznn commented Oct 8, 2018

I tried this, but got the error message below:
ValueError: Dimensions must be equal, but are 128 and 32 for 'Embedding_Encoder_Layer/Encoder_Residual_Block/encoder_block_0/layer_norm_0/sub' (op: 'Sub') with input shapes: [32,?,1,128], [32].
I guess there are many places that would have to be adjusted to make it fit XD
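The mismatch is probably because the [32] mean cannot broadcast against the [32, ?, 1, 128] input. Keeping the reduced axes is one possible fix (a sketch assuming a TF 1.x API, not tested against this repo; the 1e-6 epsilon is an assumed value):

axes = list(range(1, x.shape.ndims))
mean = tf.reduce_mean(x, axes, keep_dims=True)                        # shape [32, 1, 1, 1] instead of [32]
variance = tf.reduce_mean(tf.square(x - mean), axes, keep_dims=True)  # same shape, broadcasts against x
normalized = (x - mean) / tf.sqrt(variance + 1e-6)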

@PanXiebit (Author)
@alznn Maybe you can try tf.nn.moments(inputs, axes). I use this and get no error. Also note that after computing the mean and variance, you need to use beta and gamma to rescale and shift the normalized distribution; the shape of beta and gamma is inputs.shape[-1].
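A minimal sketch of this suggestion put together (TF 1.x style; the function name, variable scope, and epsilon are assumptions, not this repo's code):

def layer_norm(inputs, epsilon=1e-6, scope="layer_norm"):
    with tf.variable_scope(scope):
        dim = inputs.shape.as_list()[-1]  # beta/gamma live on the last (feature) axis
        gamma = tf.get_variable("gamma", [dim], initializer=tf.ones_initializer())
        beta = tf.get_variable("beta", [dim], initializer=tf.zeros_initializer())
        axes = list(range(1, inputs.shape.ndims))          # every non-batch axis
        mean, variance = tf.nn.moments(inputs, axes, keep_dims=True)
        normalized = (inputs - mean) / tf.sqrt(variance + epsilon)
        return gamma * normalized + beta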

@localminimum (Owner)

Thanks for your question! You are right: right now it looks more like instance normalization than layer normalization. If you check this line, you'll see I actually attempted both normalization methods; they didn't show much difference, so I stuck with one!
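For anyone reading along, the difference comes down to the reduction axes. An illustrative sketch for a hypothetical [batch, length, 1, channels] tensor (not this repo's exact code):

ln_mean, ln_var = tf.nn.moments(x, axes=[1, 2, 3], keep_dims=True)  # layer norm: one mean/var per example
in_mean, in_var = tf.nn.moments(x, axes=[1, 2], keep_dims=True)     # instance norm: per example, per channel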
