
Problem of BatchNorm in Fluid. #9273 (Closed)

qingqing01 opened this issue Mar 21, 2018 · 3 comments

qingqing01 (Contributor) commented Mar 21, 2018

Currently, the moving mean and variance in batch_norm are created as parameters with trainable set to False:

https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/layers/nn.py#L1512

    mean = helper.create_parameter(
        attr=ParamAttr(
            name=moving_mean_name, initializer=Constant(0.0), trainable=False),
        shape=param_shape,
        dtype=input.dtype)
    mean.stop_gradient = True

    variance = helper.create_parameter(
        attr=ParamAttr(
            name=moving_variance_name,
            initializer=Constant(1.0),
            trainable=False),
        shape=param_shape,
        dtype=input.dtype)
    variance.stop_gradient = True

But when inspecting the ProgramDesc proto string, there are still some computation operators related to the moving mean and variance. For example, when I print the ProgramDesc of MobileNet-SSD on one GPU, the following proto string involves the moving variance (batch_norm_x.w_2 is the moving variance):

  ops {
    inputs {
      parameter: "X"
      arguments: "batch_norm_2.w_2"
    }
    outputs {
      parameter: "Out"
      arguments: "_generated_var_52"
    }
    type: "scale"
    attrs {
      name: "scale"
      type: FLOAT
      f: 4.99999987369e-05
    }
  }
  ops {
    inputs {
      parameter: "X"
      arguments: "batch_norm_2.w_2@GRAD"
    }
    inputs {
      parameter: "Y"
      arguments: "_generated_var_52"
    }
    outputs {
      parameter: "Out"
      arguments: "batch_norm_2.w_2@GRAD"
    }
    type: "elementwise_add"
    attrs {
      name: "axis"
      type: INT
      i: -1
    }
  }
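
For reference, a minimal repro sketch that builds a small conv + batch_norm network with L2 decay and prints the resulting ProgramDesc. The network, shapes, and the use of the optimizer's global regularization argument are assumptions for illustration; only the 5e-5 decay factor comes from the run above:

    import paddle.fluid as fluid

    main = fluid.Program()
    startup = fluid.Program()
    with fluid.program_guard(main, startup):
        x = fluid.layers.data(name='x', shape=[3, 32, 32], dtype='float32')
        label = fluid.layers.data(name='label', shape=[1], dtype='int64')
        conv = fluid.layers.conv2d(input=x, num_filters=8, filter_size=3)
        bn = fluid.layers.batch_norm(input=conv)
        pool = fluid.layers.pool2d(
            input=bn, pool_size=2, pool_type='avg', global_pooling=True)
        logits = fluid.layers.fc(input=pool, size=10, act='softmax')
        loss = fluid.layers.mean(
            x=fluid.layers.cross_entropy(input=logits, label=label))
        # L2 decay of 5e-5 matches the scale factor 4.99999987369e-05
        # in the scale op above; assumed to be set via the optimizer's
        # global regularization argument.
        sgd = fluid.optimizer.SGD(
            learning_rate=0.01,
            regularization=fluid.regularizer.L2Decay(5e-5))
        sgd.minimize(loss)

    # Printing the program shows the scale/elementwise_add ops that the
    # regularization pass appends for the batch_norm moving statistics.
    print(main)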
jacquesqiao (Member) commented:

Can you paste the desc info of batch_norm_op as well as these elementwise_add ops?

qingqing01 (Contributor, Author) commented:

The above ProgramDesc is about regularization, since I set the L2 regularization to 0.00005. The un-trainable parameters do skip the parameter-update process in https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/optimizer.py#L202 , but there is no need to append these ops to the ProgramDesc in the first place.
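
One possible direction (just a sketch of the idea against my reading of python/paddle/fluid/regularizer.py, not an actual patch) is to skip un-trainable parameters in append_regularization_ops, so the scale/elementwise_add ops are never appended for the moving statistics:

    def append_regularization_ops(parameters_and_grads, regularization=None):
        params_and_grads = []
        for param, grad in parameters_and_grads:
            # Sketch: the moving mean/variance are created with
            # trainable=False, so skip regularization for them entirely
            # instead of appending scale/elementwise_add ops whose result
            # is later ignored by the parameter-update step.
            if grad is None or not param.trainable:
                params_and_grads.append((param, grad))
                continue
            regularization_term = None
            if param.regularizer is not None:
                regularization_term = param.regularizer(param, grad, grad.block)
            elif regularization is not None:
                regularization_term = regularization(param, grad, grad.block)
            if regularization_term is None:
                params_and_grads.append((param, grad))
                continue
            assert grad.shape == regularization_term.shape
            grad.block.append_op(
                type='elementwise_add',
                inputs={'X': grad, 'Y': regularization_term},
                outputs={'Out': grad})
            params_and_grads.append((param, grad))
        return params_and_grads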

qingqing01 (Contributor, Author) commented:

@jacquesqiao the desc for batch_norm_op:

 ops {
    inputs {
      parameter: "Bias"
      arguments: "batch_norm_2.b_0"
    }
    inputs {
      parameter: "Mean"
      arguments: "batch_norm_2.w_1"
    }
    inputs {
      parameter: "Scale"
      arguments: "batch_norm_2.w_0"
    }
    inputs {
      parameter: "Variance"
      arguments: "batch_norm_2.w_2"
    }
    inputs {
      parameter: "X"
      arguments: "conv2d_1.tmp_0"
    }
    outputs {
      parameter: "MeanOut"
      arguments: "batch_norm_2.w_1"
    }
    outputs {
      parameter: "SavedMean"
      arguments: "batch_norm_2.tmp_0"
    }
    outputs {
      parameter: "SavedVariance"
      arguments: "batch_norm_2.tmp_1"
    }
    outputs {
      parameter: "VarianceOut"
      arguments: "batch_norm_2.w_2"
    }
    outputs {
      parameter: "Y"
      arguments: "batch_norm_2.tmp_2"
    }
    type: "batch_norm"
    attrs {
      name: "data_layout"
      type: STRING
      s: "NCHW"
    }
    attrs {
      name: "epsilon"
      type: FLOAT
      f: 9.99999974738e-06
    }
    attrs {
      name: "momentum"
      type: FLOAT
      f: 0.899999976158
    }
    attrs {
      name: "is_test"
      type: BOOLEAN
      b: false
    }
  }

The scale_op and elementwise_add_op descs I pasted above are added by the transpiler, not defined in the model config.
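
One way to verify this is to list every op in the final program that touches the moving variance (an inspection sketch building on the repro above; 'batch_norm_0.w_2' as the default moving-variance name in that toy program is an assumption):

    # List every op that reads or writes the moving variance; besides the
    # batch_norm op itself, the appended scale and elementwise_add ops
    # are expected to show up here.
    target = 'batch_norm_0.w_2'
    for op in main.global_block().ops:
        names = []
        for slot in op.input_names:
            names += op.input(slot)
        for slot in op.output_names:
            names += op.output(slot)
        if target in names:
            print(op.type)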
