
add fuse_bn_act op #27230

Merged: 6 commits, Sep 23, 2020

Conversation

@zhangting2020 (Contributor) commented Sep 10, 2020

PR types

Function optimization

PR changes

OPs

Describe

This Op performs batch norm on input x, adds the result to input y, and then applies an activation to the sum. We use the cuDNN API to implement this function; the following points should be noted:

This Op will be used in automatic mixed precision (AMP) training of the ResNet model. The diagram below shows the relevant part of the model: the red parts represent the inputs of this Op, and the green parts represent the computation it performs.
[Model diagram: op inputs in red, fused computation in green]
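
To make the semantics concrete, here is a minimal NumPy sketch of what the fused op computes, out = act(batch_norm(x) + y). It illustrates only the math, not the cuDNN-backed kernel; the NHWC layout and ReLU activation are assumptions for illustration.

import numpy as np

def bn_add_act_reference(x, y, gamma, beta, eps=1e-5):
    # Per-channel batch norm statistics over the N, H, W axes (NHWC assumed).
    mean = x.mean(axis=(0, 1, 2))
    var = x.var(axis=(0, 1, 2))
    bn = gamma * (x - mean) / np.sqrt(var + eps) + beta
    # Add the shortcut input y, then apply the activation (ReLU assumed).
    return np.maximum(bn + y, 0.0)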

Performance of ResNet50 AMP Training

Test on V100, CUDA 10.1, cuDNN 7.6, single card, BS=128

  • before: 1015.18 imgs/s
  • after: 1085.98 imgs/s (+6.9%)

Loss and accuracy

Set fuse_bn_add_act=True and train for 63 epochs:
[Loss and accuracy curves over 63 epochs]

@paddle-bot-old commented:

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@luotao1 (Contributor) previously approved these changes Sep 21, 2020:

LGTM

@wangchaochaohu (Contributor) previously approved these changes Sep 21, 2020:

LGTM

@zhiqiu self-requested a review September 21, 2020 05:45
@zhiqiu (Contributor) previously approved these changes Sep 21, 2020:

LGTM for paddle.fluid.contrib.fused_bn_add_act without core.ops since it is used in static graph only.

Review comment on the AMP op lists:

white_list = {'conv2d', 'matmul', 'mul', 'fused_bn_add_activation'}

Contributor: I think fused_bn_add_activation should be added to the gray_list instead.

Author: done.
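
For context, the list choice matters because of how an AMP pass typically decides dtypes: white-list ops always run in float16, black-list ops always in float32, and gray-list ops follow the dtype of their inputs. The sketch below is illustrative only; the names and structure are assumptions, not Paddle's actual implementation.

def choose_dtype(op_type, input_dtype, white_list, black_list, gray_list):
    # White-list ops are numerically safe and always run in float16.
    if op_type in white_list:
        return 'float16'
    # Black-list ops are numerically sensitive and always keep float32.
    if op_type in black_list:
        return 'float32'
    # Gray-list ops inherit whatever dtype their producers emitted.
    if op_type in gray_list:
        return input_dtype
    # Unlisted ops fall back to the safe default.
    return 'float32'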

Comment on lines +1743 to +1745:

check_variable_and_dtype(x, 'input', ['float16', 'float32', 'float64'],
                         'fused_bn_add_act')
check_variable_and_dtype(y, 'input', ['float16', 'float32', 'float64'],

Contributor: BTW, you have only registered the float16 kernel, so 'float32' and 'float64' are not needed here.

Author: The dtype check is performed at compile time, and limiting the list to float16 would cause the check to fail.
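
A hypothetical usage sketch of the author's point (the shapes here are invented for illustration): when the network is built, the inputs are still declared float32, and the AMP pass inserts the float16 casts only afterwards, so a float16-only check would reject the graph at construction time.

import paddle.fluid as fluid

# At graph-construction time the inputs are declared float32; the AMP
# rewrite pass later casts them to float16 for the registered kernel.
x = fluid.data(name='x', shape=[-1, 56, 56, 64], dtype='float32')
y = fluid.data(name='y', shape=[-1, 56, 56, 64], dtype='float32')
out = fluid.contrib.fused_bn_add_act(x, y)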

Comment on lines 76 to 77:

if in_name != 'X' or in_name != 'Z':
    continue

@wzzju (Contributor) commented Sep 21, 2020: I think the condition test here may be wrong: in_name can never equal both 'X' and 'Z' at once, so this disjunction is always true and the loop continues on every iteration. Maybe the logic below is more understandable.

if in_name not in {'X', 'Z'}:
    continue

The batch_norm condition test can perhaps also be simplified as below:

if src_dtype == core.VarDesc.VarType.FP32 and op.type in {'batch_norm', 'fused_bn_add_activation'}:
    if in_name not in {'X', 'Z'}:
        continue

Author: done.

Comment on lines 108 to 111:

if op.type == 'batch_norm' and out_name != 'Y':
    continue
if op.type == 'fused_bn_add_activation' and out_name != 'Y':
    continue

Contributor: Suggested simplification:

if op.type in {'batch_norm', 'fused_bn_add_activation'} and out_name != 'Y':
    continue

Author: done.

Comment on lines 257 to 260:

const auto *saved_mean_data =
    saved_mean->template data<BatchNormParamType<float>>();
const auto *saved_var_data =
    saved_var->template data<BatchNormParamType<float>>();

Contributor: Please use T instead of float.

Author: done.

@wzzju (Contributor) left a comment:

LGTM.

@TCChenlong (Contributor) left a comment:

LGTM

@zhiqiu self-requested a review September 22, 2020 06:59
@zhangting2020 merged commit 906e7f9 into PaddlePaddle:develop on Sep 23, 2020