Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add channels last support for AdaptiveAvgPool2d on CPU path #42104

Closed

Conversation

mingfeima
Copy link
Collaborator

This patch adds channels last support for nn.AdaptiveAvgPool2d, both forward and backward included.

@mingfeima
Copy link
Collaborator Author

mingfeima commented Jul 27, 2020

Updates:

  1. Add support for ChannelsLast memory format on CPU, this path is vectorized upon the dimension of channels.
  2. Move Contiguous path to native/cpu.

adaptive_avg_pool2d has a fast path when output_size is 1x1 for contiguous memory format, this patch did not change that. Similar approach on channels last requires a reshape which makes it less performant, since the generic kernel adaptive_avg_pool2d_kernel on channels last already performs better than the fast path, i skip to the implement fast path for channels last when output size is 1x1.

Results:

Machine: CPU Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz, 2*20 cores.
Bench: Use this script to reproduce, ./run.sh adaptive_avg_pool2d.py.
Input size: [1, 2048, 7, 7], [128, 2048, 7, 7]: (size in ResNet50).
Output size: [1, 1], [2, 2]
Both single thread (1 core) and single socket (20 core) are tested.

Code base: before: 96aaa311, after: 86bf0cc7.

Time per iteration (unit: ms), the lower the better.

#cores input_size output_size before (contiguous) after (contiguous) after (channels_last) cl/contig
20 [1, 2028, 7, 7] [2, 2] 0.025 0.021 0.02 1.05
20 [128, 2028, 7, 7] [2, 2] 1.973 1.021 0.373 2.74
20 [1, 2028, 7, 7] [1, 1] 0.033 0.032 0.022 1.45
20 [128, 2028, 7, 7] [1, 1] 0.44 0.451 0.311 1.45
1 [1, 2028, 7, 7] [2, 2] 0.22 0.146 0.034 4.29
1 [128, 2028, 7, 7] [2, 2] 26.806 16.949 5.309 3.19
1 [1, 2028, 7, 7] [1, 1] 0.037 0.037 0.016 2.31
1 [128, 2028, 7, 7] [1, 1] 4.649 4.605 3.626 1.27

@dr-ci
Copy link

dr-ci bot commented Jul 27, 2020

💊 CI failures summary and remediations

As of commit 271c085 (more details on the Dr. CI page):


  • 2/2 failures possibly* introduced in this PR
    • 2/2 non-CircleCI failure(s)

Extra GitHub checks: 1 failed


ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 25 times.

@ngimel ngimel self-requested a review July 27, 2020 21:42
@ngimel ngimel added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Jul 27, 2020
@mingfeima
Copy link
Collaborator Author

@ngimel could you please enlighten me how to enable the nhwc test for CPU at test_nn.py, for example this one below:
test_adaptive_pooling_avg_nhwc: https://github.com/pytorch/pytorch/blob/master/test/test_nn.py#L3472

@mingfeima
Copy link
Collaborator Author

@VitalyFedyunin @ngimel is it OK to use #pragma omp simd in ATen when trying to vectorize the scalar code?

@mingfeima mingfeima force-pushed the channels_last/adaptive_avg_pool2d branch 3 times, most recently from d14f995 to 797744f Compare August 12, 2020 06:50
@VitalyFedyunin VitalyFedyunin self-requested a review August 20, 2020 15:41
Copy link
Contributor

@VitalyFedyunin VitalyFedyunin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good (I still need to dive into correctness). Please add tests for various memory format input and outputs.

@mingfeima mingfeima force-pushed the channels_last/adaptive_avg_pool2d branch from 797744f to e49de0c Compare November 5, 2020 01:36
@facebook-github-bot
Copy link
Contributor

Hi @mingfeima!

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file.

In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

optimize adaptive average pool2d forward path

optimize adaptive average pool2d backward path

remove unused headers

minor change

minor change

rename the header; add adaptive max pooling in future.

minor change

loosen adapative_pool2d test on nhwc to both device cuda and cpu

minor change
@mingfeima mingfeima force-pushed the channels_last/adaptive_avg_pool2d branch from e49de0c to 271c085 Compare November 5, 2020 01:52
@mingfeima
Copy link
Collaborator Author

@VitalyFedyunin test cases added for cpu path on channels last memory format, please review. Sorry for the delay, got distracted by some other job.
I will rebase other relevant PRs upon this one, #42719, #46234, etc.

@codecov
Copy link

codecov bot commented Nov 5, 2020

Codecov Report

Merging #42104 into master will decrease coverage by 0.00%.
The diff coverage is 94.11%.

@@            Coverage Diff             @@
##           master   #42104      +/-   ##
==========================================
- Coverage   60.82%   60.81%   -0.01%     
==========================================
  Files        2751     2753       +2     
  Lines      254434   254392      -42     
==========================================
- Hits       154748   154712      -36     
+ Misses      99686    99680       -6     

@mingfeima
Copy link
Collaborator Author

please review the new one #48916

@mingfeima mingfeima closed this Dec 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla signed open source triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants