
add channels last for AdaptiveAvgPool2d #48916

Closed (1 commit)

Conversation

@mingfeima (Collaborator) commented Dec 7, 2020

Stack from ghstack:

- optimize adaptive average pool2d forward path
- optimize adaptive average pool2d backward path
- remove unused headers
- rename the header; adaptive max pooling will be added in a future patch
- loosen adaptive_pool2d test on nhwc to cover both CUDA and CPU devices
- assorted minor changes

Differential Revision: D25399469

@mingfeima (Collaborator, Author) commented Dec 7, 2020

This PR replaces #42104.

Updates:

  1. Add support for the ChannelsLast memory format on CPU; this path is vectorized along the channels dimension.
  2. Move the contiguous path to native/cpu.

adaptive_avg_pool2d has a fast path for the contiguous memory format when output_size is 1x1; this patch does not change that. A similar fast path for channels last would require a reshape that makes it less performant, and since the generic adaptive_avg_pool2d_kernel on channels last already outperforms that fast path, I skipped implementing a 1x1 fast path for channels last.
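From the user's side, the channels-last path described above is reached through the public PyTorch API. A minimal sketch (tensor sizes here are small stand-ins, not the benchmark shapes):

```python
import torch

# Convert an NCHW tensor to the channels_last (NHWC) memory layout.
x = torch.randn(2, 8, 7, 7).to(memory_format=torch.channels_last)

pool = torch.nn.AdaptiveAvgPool2d((2, 2))
y = pool(x)                      # shape: [2, 8, 2, 2]

# With a 1x1 output, adaptive average pooling is just a global mean over H, W.
y1 = torch.nn.AdaptiveAvgPool2d(1)(x)
ref = x.mean(dim=(2, 3), keepdim=True)
print(torch.allclose(y1, ref))   # True
```

The 1x1 equivalence is what makes the contiguous fast path possible; on channels last the generic kernel handles that case directly.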

Results:

Machine: Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz, 2 sockets x 20 cores.
Bench: use this script to reproduce: ./run.sh adaptive_avg_pool2d.py.
Input sizes: [1, 2048, 7, 7] and [128, 2048, 7, 7] (shapes used in ResNet50).
Output sizes: [1, 1] and [2, 2].
Both single thread (1 core) and single socket (20 cores) were tested.

Code base: before: 96aaa311, after: 86bf0cc7.

Time per iteration (unit: ms), the lower the better.

| #cores | input_size        | output_size | before (contiguous) | after (contiguous) | after (channels_last) | cl/contig |
|--------|-------------------|-------------|---------------------|--------------------|-----------------------|-----------|
| 20     | [1, 2048, 7, 7]   | [2, 2]      | 0.025               | 0.021              | 0.020                 | 1.05      |
| 20     | [128, 2048, 7, 7] | [2, 2]      | 1.973               | 1.021              | 0.373                 | 2.74      |
| 20     | [1, 2048, 7, 7]   | [1, 1]      | 0.033               | 0.032              | 0.022                 | 1.45      |
| 20     | [128, 2048, 7, 7] | [1, 1]      | 0.440               | 0.451              | 0.311                 | 1.45      |
| 1      | [1, 2048, 7, 7]   | [2, 2]      | 0.220               | 0.146              | 0.034                 | 4.29      |
| 1      | [128, 2048, 7, 7] | [2, 2]      | 26.806              | 16.949             | 5.309                 | 3.19      |
| 1      | [1, 2048, 7, 7]   | [1, 1]      | 0.037               | 0.037              | 0.016                 | 2.31      |
| 1      | [128, 2048, 7, 7] | [1, 1]      | 4.649               | 4.605              | 3.626                 | 1.27      |
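The run.sh / adaptive_avg_pool2d.py harness that produced these numbers is not reproduced here; the following is a minimal, assumption-based timing sketch of the same comparison (warm-up and iteration counts are my choices, not the author's):

```python
import time
import torch

def bench_ms(x, output_size, iters=20, warmup=5):
    """Return average time per adaptive_avg_pool2d call in milliseconds."""
    pool = torch.nn.AdaptiveAvgPool2d(output_size)
    for _ in range(warmup):          # warm caches / allocator before timing
        pool(x)
    start = time.perf_counter()
    for _ in range(iters):
        pool(x)
    return (time.perf_counter() - start) / iters * 1e3

x = torch.randn(128, 2048, 7, 7)                 # contiguous NCHW layout
x_cl = x.to(memory_format=torch.channels_last)   # NHWC layout

contig_ms = bench_ms(x, (2, 2))
cl_ms = bench_ms(x_cl, (2, 2))
print(f"contiguous: {contig_ms:.3f} ms  channels_last: {cl_ms:.3f} ms")
```

Absolute numbers will vary with core count and thread settings (e.g. torch.set_num_threads), so only the contiguous vs channels_last ratio is meaningful to compare.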

@codecov bot commented Dec 7, 2020

Codecov Report

Merging #48916 (06a3207) into gh/mingfeima/2/base (e429d05) will decrease coverage by 0.00%.
The diff coverage is 94.11%.

@@                   Coverage Diff                   @@
##           gh/mingfeima/2/base   #48916      +/-   ##
=======================================================
- Coverage                80.76%   80.76%   -0.01%     
=======================================================
  Files                     1867     1869       +2     
  Lines                   201584   201542      -42     
=======================================================
- Hits                    162817   162782      -35     
+ Misses                   38767    38760       -7     

@facebook-github-bot (Contributor) commented: @VitalyFedyunin merged this pull request in 690eaf9.
