
Update the epsilon in batch_norm_layer to a variable in v2. #5692

Merged
merged 6 commits into develop from add_bn_eq on Nov 22, 2017

Conversation

peterzhang2029
Contributor

Resolves #5548

@@ -94,6 +94,8 @@ class BatchNormBaseLayer : public Layer {
bool useGlobalStats_;
// use to compute moving mean and variance.
real movingAvgFraction_;
// Epsilon value used in the batch normalization formula.
real EPS;
Contributor

  • `EPS` is no longer a static constant; member variables should be lowercase with a trailing underscore: `epsilon_`.
  • Please also update the comment, e.g.: Epsilon is a small random noise used in batch normalization for stability.

Contributor Author

Done

@@ -2482,6 +2483,8 @@ def __init__(self,
self.config.use_global_stats = use_global_stats
if moving_average_fraction is not None:
self.config.moving_average_fraction = moving_average_fraction
if epsilon is not None:
Contributor

@lcy-seso lcy-seso Nov 17, 2017

Isn't this logic wrong? The default value of epsilon has already been set to 1e-5. Are we letting the user set epsilon to None, only to fall back to the default again? That is strange.

Contributor Author

This was originally assigned the same way as moving_average_fraction; it has been fixed.

@@ -3123,6 +3126,9 @@ def batch_norm_layer(input,
assert (batch_norm_type is None) or (batch_norm_type == "batch_norm") or \
(batch_norm_type == "mkldnn_batch_norm") or \
(batch_norm_type == "cudnn_batch_norm")

assert epsilon >= 1e-5, "Parameter epsilon must be no less than 1e-5."
Contributor

@lcy-seso lcy-seso Nov 17, 2017

  • `epsilon` here is not a parameter; remove "Parameter".
  • The logic is inconsistent: the config already guarantees that epsilon cannot be set below 1e-5, yet the C++ implementation still compares epsilon against 1e-5 and takes the smaller of the two. That C++ branch can never be reached, so is it still needed? What exactly is the intended behavior here?

Contributor Author

@peterzhang2029 peterzhang2029 Nov 17, 2017

  • "Parameter" has been removed.
  • The assert in layers.py constrains the epsilon used by BatchNormalizationLayer.cpp and MKLDNNBatchNormLayer.cpp; those two C++ sources impose no further limit on epsilon.
  • In CudnnBatchNormLayer.cpp, the cuDNN interface requires epsilon to be a double, so the input epsilon is converted to eps_. Testing showed that when the input epsilon is 1e-5, cuDNN's batch-norm interface reports a CUDNN_BN_MIN_EPSILON error. Looking at the cuDNN API, the likely cause is that eps_ after the type conversion does not satisfy the minimum-value requirement, so this code additionally takes the larger of the two values. This guarantees that eps_ is legal when the default input epsilon of 1e-5 is used with GPU training.
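The conversion issue described above is easy to reproduce: the nearest 32-bit float to 1e-5 is slightly *below* 1e-5, so widening a `real` epsilon back to `double` can land just under cuDNN's minimum. A small Python sketch, where the `struct` round-trip stands in for the C++ `real`-to-`double` conversion and `CUDNN_BN_MIN_EPSILON` is assumed to equal 1e-5:

```python
import struct

CUDNN_BN_MIN_EPSILON = 1e-5  # cuDNN's documented lower bound for epsilon

def float32_roundtrip(x):
    """Store a Python float as a 32-bit float, then widen it back to a
    64-bit double, mimicking a real (float) epsilon_ cast to double."""
    return struct.unpack('f', struct.pack('f', x))[0]

eps_single = float32_roundtrip(1e-5)
# The nearest float32 to 1e-5 is slightly below 1e-5, so the widened
# value fails cuDNN's minimum check even though the config allowed it.
assert eps_single < CUDNN_BN_MIN_EPSILON

# The fix discussed above: clamp to the cuDNN minimum before the call.
eps_ = max(eps_single, CUDNN_BN_MIN_EPSILON)
assert eps_ >= CUDNN_BN_MIN_EPSILON
```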

@@ -3106,6 +3107,8 @@ def batch_norm_layer(input,
will use the mean and variance of the current batch
of test data.
:type use_global_stats: bool | None.
:param epsilon: Small constant added to the variance to avoid numerical problems.
Contributor

  • Small --> A small
  • to avoid numerical problems
    • This makes me think what are the problems?
    • to improve numeric stability.

Contributor Author

Done


// for batch normalization layer
// small constant added to the variance to avoid numerical problems.
optional double epsilon = 60 [ default = 0.00001 ];
Contributor

  • small --> A / The . Please add the article.
  • This comment makes me think what are the problems?

Contributor Author

Done

@@ -2483,6 +2484,8 @@ def __init__(self,
if moving_average_fraction is not None:
self.config.moving_average_fraction = moving_average_fraction

self.config.epsilon = epsilon
Contributor

  • Please write this the same way as moving_average_fraction.
  • Why is epsilon's validity not checked here? What happens if the old interface is called directly to create a batch norm layer with an illegal epsilon?
  • Please add a check at line 2476: when type is cudnn_batch_norm, validate directly and give a message. Static settings can be checked at network-configuration time; there is no need to do it at runtime.
    cuDNN does not allow an eps value less than 1e-5

Contributor Author

To keep the old interface consistent when using the bn layer, the epsilon check has been unified in config_parser.py: every type of batch norm layer must guarantee that the input epsilon is no less than 1e-5, matching the cuDNN constraint.
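The unified check described here can be sketched as a small guard at configuration time. The helper name below is hypothetical; in the PR the check lives in config_parser.py and applies to all batch-norm variants:

```python
def check_batch_norm_epsilon(epsilon, batch_norm_type=None):
    """Reject epsilon below 1e-5 for every batch norm variant, so the
    old and new interfaces behave the same and cuDNN's minimum holds."""
    if epsilon < 1e-5:
        raise ValueError(
            "epsilon must be no less than 1e-5, got %g (layer type: %s)"
            % (epsilon, batch_norm_type or "batch_norm"))
    return epsilon

check_batch_norm_epsilon(1e-5)                      # boundary value is legal
check_batch_norm_epsilon(1e-3, "cudnn_batch_norm")  # larger values pass
```

Doing this once in the config layer means the runtime C++ layers never see an illegal epsilon, regardless of which Python entry point built the network.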

@@ -3123,6 +3126,9 @@ def batch_norm_layer(input,
assert (batch_norm_type is None) or (batch_norm_type == "batch_norm") or \
(batch_norm_type == "mkldnn_batch_norm") or \
(batch_norm_type == "cudnn_batch_norm")

assert epsilon >= 1e-5, "epsilon must be no less than 1e-5."
Contributor

  • Does this epsilon restriction apply only to cudnn, or must PaddlePaddle's own batch norm implementation satisfy it too?
  • Check the cuDNN restriction at static-configuration time; if the value is below 1e-5, give the user a warning telling them that PaddlePaddle will truncate it to 1e-5.

Contributor Author

All batch norm layer types must satisfy the restriction uniformly; the check has been moved into config_parse.py and is applied in one place.

@@ -21,7 +21,7 @@ namespace paddle {

REGISTER_LAYER(cudnn_batch_norm, CudnnBatchNormLayer);

const double CudnnBatchNormLayer::EPS = 1E-5;
const double CudnnBatchNormLayer::MIN_EPS = 1E-5;
Contributor

Don't use this magic number; use CUDNN_BN_MIN_EPSILON directly.

Contributor Author

Done

* static_cast<double>(epsilon_), The CUDNN_STATUS_BAD_PARAM error
* will occur due to eps_ value is less than
* CUDNN_BN_MIN_EPSILON.
* The following code is to ensure that the eps_ meets requirement.
Contributor

The comment is too verbose; don't explain the logic of this assignment, it is not useful. Suggested wording:

cuDNN does not allow an epsilon value less than CUDNN_BN_MIN_EPSILON.

Contributor Author

Done

* static_cast<double>(epsilon_), The CUDNN_STATUS_BAD_PARAM error
* will occur due to eps_ value is less than
* CUDNN_BN_MIN_EPSILON.
* The following code is to ensure that the eps_ meets requirement.
Contributor

The comment is too verbose. Suggested wording:

cuDNN does not allow an epsilon value less than CUDNN_BN_MIN_EPSILON.

Contributor Author

Done

Contributor

@lcy-seso lcy-seso left a comment

LGTM

@lcy-seso lcy-seso merged commit 6577760 into PaddlePaddle:develop Nov 22, 2017
@peterzhang2029 peterzhang2029 deleted the add_bn_eq branch November 22, 2017 10:30

Successfully merging this pull request may close these issues.

Should we change the epsilon in batch_norm_layer to a variable instead of a fixed value 1e-5.
2 participants