Add RFC for batchnorm1d optim #135

EsdeathYZH · 2022-05-22T15:27:36Z

No description provided.

paddle-bot-old · 2022-05-22T15:27:38Z

Your PR has been submitted. Thanks for your contribution to the open-source project!
Please check that its format and content are complete. For details, refer to the Template and Demo.

CLAassistant · 2022-05-22T15:27:40Z

All committers have signed the CLA.

rfcs/APIs/20220522_api_optim_batchnorm1d.md

paddle-bot-old · 2022-05-23T07:48:02Z

The format inspection passed. Your PR will be reviewed by Paddle experts and developers from the open-source community. Please keep an eye on the PR for updates.

JamesLim-sy

Good work!

JamesLim-sy · 2022-05-24T03:17:45Z

rfcs/APIs/20220522_api_optim_batchnorm1d.md

+batch_norm_out = batch_norm(x)
+```
+
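+A simple end-to-end experiment shows that at small data sizes both torch and paddle use cudnn for the computation. Once the data size exceeds a certain threshold, torch switches to its own operator, which gives it lower latency than paddle. Profiling further with nsight compute shows: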


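A question about this passage: it says that at small data sizes both torch and paddle use cudnn. In that case, is the performance of Paddle and torch the same, and how do they compare with oneflow, which is mentioned later in the document? I would like to confirm whether torch performs better in both cases, that is, whether it calls cudnn or uses its own hand-written kernel.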

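The shape I tested here was [200000, 1, 3]. With this shape both paddle and torch use cudnn, and their performance is similar. After I changed the shape to [136000, 16], torch called its own kernel at the small data size, and that kernel performs better than cudnn. I have not tested oneflow's performance yet; I will measure it and add the results to this document.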
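
For readers following the two benchmark shapes in this thread, here is a minimal pure-Python sketch of what 1D batch normalization computes in training mode, and of how many elements are reduced per channel for a given input shape. The names `batch_norm_1d` and `reduction_sizes` are hypothetical helpers for illustration only, not Paddle or torch APIs. The sketch assumes a `[N, C]` or `[N, C, L]` layout with channels on axis 1, omits the affine scale/shift and running statistics, and takes `eps=1e-5` as an assumed value.

```python
import math


def batch_norm_1d(x, eps=1e-5):
    """Normalize a [N, C] batch per channel using batch statistics.

    Hypothetical reference helper: training-mode statistics only,
    no learnable scale/shift, no running mean/variance.
    """
    n = len(x)
    c = len(x[0])
    out = [[0.0] * c for _ in range(n)]
    for j in range(c):
        col = [row[j] for row in x]
        mean = sum(col) / n
        # Biased (population) variance over the batch dimension.
        var = sum((v - mean) ** 2 for v in col) / n
        inv_std = 1.0 / math.sqrt(var + eps)
        for i in range(n):
            out[i][j] = (x[i][j] - mean) * inv_std
    return out


def reduction_sizes(shape):
    """Return (channels, elements reduced per channel).

    Assumes shape is [N, C] or [N, C, L] with channels on axis 1.
    """
    if len(shape) == 2:
        n, c = shape
        return c, n
    n, c, length = shape
    return c, n * length


if __name__ == "__main__":
    # The two shapes mentioned in the discussion above.
    for shape in ([200000, 1, 3], [136000, 16]):
        print(shape, reduction_sizes(shape))
```

Under these assumptions, `[200000, 1, 3]` has a single channel reduced over 600000 elements, while `[136000, 16]` has 16 channels, each reduced over 136000 elements.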

zkh2016 · 2022-05-27T05:41:21Z

rfcs/APIs/20220522_api_optim_batchnorm1d.md

+
+## Naming and Parameter Design
+Reference: [PaddlePaddle API Design and Naming Guidelines](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/dev_guides/api_contributing_guides/api_design_guidelines_standard_cn.html)
+## Underlying OP Design


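Since bn is used in many models, modifying the original operator directly would require verifying the accuracy of a large number of models, so I suggest the following:

For the current bn, first fix the lack of support for large batch_size.

Add a new batch_norm under the paddle/phi/sparse directory that supports the bn1d case, with performance that matches or exceeds competing frameworks as far as possible.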

add RFC for batchnorm1d optim

33a2a6e

paddle-bot-old bot added contributor status: proposed labels May 22, 2022

zkh2016 reviewed May 23, 2022


rfcs/APIs/20220522_api_optim_batchnorm1d.md

dingjiaweiww added status: open review and removed status: proposed labels May 23, 2022

refine

a2ecb6f

JamesLim-sy reviewed May 24, 2022


refiine profile

50a1ebc

zkh2016 reviewed May 27, 2022


EsdeathYZH added 5 commits May 29, 2022 22:55

refine impl

87a0926

impl kernel v2

3da35c8

minor

530191f

add 2d kernel design

d7514c6

refine RFC

6281d26

zkh2016 approved these changes Jul 13, 2022


zkh2016 merged commit fe897a1 into PaddlePaddle:master Jul 13, 2022

zkh2016 mentioned this pull request Jul 13, 2022

Optimize batchnorm1d using 2D kernel PaddlePaddle/Paddle#43530

Merged
