
Add local_response_norm in nn.functional and nn.layer #27725

Merged: 5 commits merged into PaddlePaddle:develop on Oct 13, 2020

Conversation

@huangjun12 (Contributor) commented Sep 29, 2020

PR types

New features

PR changes

APIs

Describe

Original API:
paddle.fluid.layers.lrn(input, n=5, k=1.0, alpha=0.0001, beta=0.75, name=None, data_format='NCHW')
It has the following problems:
(1) Its precision shows a large diff against the torch implementation; the maximum relative error is about 1e-4.
(2) It only supports 4-D input, not 3-D or 5-D input.
(3) The API name, parameter names, and parameter order differ from the torch interface.

The following two APIs are added under functional and layer, fixing the problems above and aligning the implementation and interface with torch:
(1) paddle.nn.functional.local_response_norm(x, size, alpha=1e-4, beta=0.75, k=1., data_format="NCHW", name=None)
(2) paddle.nn.LocalResponseNorm(size, alpha=0.0001, beta=0.75, k=1.0, data_format="NCHW", name=None)
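For reference, a hedged restatement of what these torch-aligned APIs compute; this matches the formula documented for torch.nn.LocalResponseNorm, which the PR says the new interface follows (here `C` is the number of channels):

```latex
b_{c} = a_{c}\,\Bigl(k + \frac{\alpha}{size}\sum_{c'=\max(0,\,c-size/2)}^{\min(C-1,\,c+size/2)} a_{c'}^{2}\Bigr)^{-\beta}
```

And a minimal usage sketch of the two new APIs (shapes and values are illustrative):

```python
import paddle
import paddle.nn.functional as F

x = paddle.rand([2, 8, 32, 32])          # 4-D NCHW input, values illustrative

# functional form
y = F.local_response_norm(x, size=5, alpha=1e-4, beta=0.75, k=1.0)

# layer form
lrn = paddle.nn.LocalResponseNorm(size=5)
y2 = lrn(x)

print(y.shape, y2.shape)                 # both [2, 8, 32, 32]
```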

@paddle-bot-old (bot):

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@paddle-bot-old (bot) commented Sep 29, 2020

✅ This PR's description meets the template requirements!
Please wait for other CI results.

@huangjun12 changed the title from "add local_response_norm in nn.functional and nn.layer, test=develop" to "Add local_response_norm in nn.functional and nn.layer" on Sep 29, 2020

Review thread on the docstring of the functional API:

```python
                        data_format="NCHW",
                        name=None):
    """
    Local Response Normalization performs a type of "lateral inhibition" by normalizing over local input regions.
```
Member: refer to its own class doc

huangjun12 (Contributor, Author): Done, the functional docstring now refers to the class doc.


input1 = fluid.data(
name="input1", shape=[3, 40, 40], dtype="float32")
input2 = fluid.data(
Member: input1/2 shapes should be [3, 3, 40] and [3, 40, 3]?

huangjun12 (Contributor, Author): [3, 40, 40] also works (with the last two dimensions equal, the same shape is valid for both the channel-first and channel-last examples).

```python
        self.check_static_4d_input(place=place)
        self.check_static_5d_input(place=place)

    def check_dygraph_3d_input(self, place):
```
Member: so it only checks the data_format?

huangjun12 (Contributor, Author) commented Oct 6, 2020: Yeah, accuracy is already aligned with the torch implementation, so for simplicity we just test data_format here.
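To illustrate the kind of check this implies (a hypothetical sketch, not the PR's actual unit test): the NHWC result should equal the NCHW result transposed, so only layout handling is exercised, not the math itself.

```python
import numpy as np
import paddle
import paddle.nn.functional as F

x = paddle.rand([2, 3, 8, 8])                        # NCHW input
y_nchw = F.local_response_norm(x, size=5)
y_nhwc = F.local_response_norm(
    x.transpose([0, 2, 3, 1]), size=5, data_format="NHWC")
# transposing the NHWC output back should reproduce the NCHW output
np.testing.assert_allclose(
    y_nchw.numpy(), y_nhwc.transpose([0, 3, 1, 2]).numpy(), rtol=1e-5)
```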

```python
            self.places.append(fluid.CUDAPlace(0))

    def test_dygraph(self):
        for place in self.places:
```
Member: why not use the new API to test dygraph?

huangjun12 (Contributor, Author): Done

willthefrog (Contributor) left a comment:

would be nice to see some benchmark in the description.


channel_last = True if data_format[-1] == "C" else False

div = fluid.layers.unsqueeze(paddle.multiply(x, x), axes=[1])
willthefrog (Contributor): use functional instead of layers?

huangjun12 (Contributor, Author): Done
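Presumably the change looks something like the following (a hedged sketch; paddle.unsqueeze is the 2.x functional-style replacement for fluid.layers.unsqueeze, taking `axis` rather than `axes`):

```python
import paddle

x = paddle.rand([2, 3, 8, 8])
# squared input with a dummy axis inserted at position 1
div = paddle.unsqueeze(paddle.multiply(x, x), axis=1)
print(div.shape)  # [2, 1, 3, 8, 8]
```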


div = fluid.layers.unsqueeze(paddle.multiply(x, x), axes=[1])
if channel_last:
if dim == 3:
willthefrog (Contributor): this part is a bit too repetitive, could use some consolidation.

huangjun12 (Contributor, Author): Done
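One plausible shape of that consolidation (purely illustrative; names and logic are hypothetical, not the code that was merged): derive the pooling kernel and channel padding from the input rank instead of branching on `dim == 3/4/5`.

```python
def lrn_pool_plan(rank, size, channel_last):
    """Derive the avg-pool kernel that sums `size` neighbouring channels
    for a 3-D/4-D/5-D input, instead of one hand-written branch per rank."""
    n_spatial = rank - 2                       # axes besides batch and channel
    if channel_last:
        kernel = (1,) * n_spatial + (size,)    # channel is the last axis
    else:
        kernel = (size,) + (1,) * n_spatial    # channel follows batch
    pad = (size // 2, (size - 1) // 2)         # asymmetric halves of the window
    return kernel, pad

for rank in (3, 4, 5):
    print(rank, lrn_pool_plan(rank, 5, channel_last=False))
```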

@huangjun12 (Contributor, Author):

> would be nice to see some benchmark in the description.

Benchmark results:

| Input tensor | Shape | torch time cost | paddle time cost |
| --- | --- | --- | --- |
| 3-D Tensor | (128, 32, 224) | 0.01972 | 2.0091 |
| 4-D Tensor | (128, 32, 224, 224) | 1.55033 | 2.0408 |
| 5-D Tensor | (5, 128, 32, 224, 224) | 9.423509 | 2.13808 |

@willthefrog (Contributor) commented Oct 9, 2020:

> | Input tensor | Shape | torch time cost | paddle time cost |
> | --- | --- | --- | --- |
> | 3-D Tensor | (128, 32, 224) | 0.01972 | 2.0091 |
> | 4-D Tensor | (128, 32, 224, 224) | 1.55033 | 2.0408 |
> | 5-D Tensor | (5, 128, 32, 224, 224) | 9.423509 | 2.13808 |

so the time cost of the 5-D input is about the same as the 3-D input, even though the data is 1120x larger?

@huangjun12 (Contributor, Author):

> so the time cost of the 5-D input is about the same as the 3-D input, even though the data is 1120x larger?

Yes... a little confused

@willthefrog (Contributor):

this is weird; it looks like it is implemented with mostly elementwise and reshape operations, so I would expect it to be linear in complexity.
maybe try increasing the input size to find the boundary of the constant time cost?

@huangjun12 (Contributor, Author) commented Oct 12, 2020:

> this is weird; it looks like it is implemented with mostly elementwise and reshape operations, so I would expect it to be linear in complexity.
> maybe try increasing the input size to find the boundary of the constant time cost?

The mean time cost of this API was re-measured after excluding initialization and data-copy time; results are as follows:

| Input tensor | Shape | torch time cost | paddle time cost |
| --- | --- | --- | --- |
| 3-D Tensor | (32, 224, 224) | 0.01972 | 0.001512 |
| 4-D Tensor | (10, 32, 224, 224) | 1.55033 | 0.002727 |
| 5-D Tensor | (50, 10, 32, 224, 224) | 9.423509 | 0.007594 |

This is faster than pytorch and shows linear complexity, as expected.
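A minimal sketch of how such numbers might be collected (not the author's actual script): allocate the tensor once, warm up, and average repeated calls so initialization and data copy stay outside the timed region.

```python
import time
import paddle
import paddle.nn.functional as F

FORMATS = {3: "NCL", 4: "NCHW", 5: "NCDHW"}    # match data_format to rank

def bench(shape, size=5, repeats=20):
    x = paddle.rand(shape, dtype="float32")    # allocated outside the timer
    for _ in range(5):                         # warm-up iterations
        F.local_response_norm(x, size, data_format=FORMATS[len(shape)])
    start = time.perf_counter()
    for _ in range(repeats):
        F.local_response_norm(x, size, data_format=FORMATS[len(shape)])
    # on GPU, a device synchronize before reading the clock is also needed
    return (time.perf_counter() - start) / repeats

# note: the 5-D shape from the thread allocates ~3.2 GB of float32 data
for shape in [(32, 224, 224), (10, 32, 224, 224), (50, 10, 32, 224, 224)]:
    print(shape, bench(shape))
```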

jzhang533 previously approved these changes Oct 12, 2020

jzhang533 (Contributor) left a comment: lgtm for api change

willthefrog (Contributor) left a comment: LGTM

XiaoguangHu01 (Contributor) left a comment: LGTM

jzhang533 (Contributor) left a comment: lgtm

@willthefrog merged commit 7409263 into PaddlePaddle:develop on Oct 13, 2020
chen-zhiyu pushed a commit to chen-zhiyu/Paddle that referenced this pull request Oct 15, 2020
…7725)

* add local_response_norm in nn.functional and nn.layer, test=develop

* update layers to functional api, test=develop

* fix ci coverage, test=develop

* fix unittests bug, test=develop