[MetaSchedule] Complete NCHW Conv2D Winograd Kernel Scheduling by zxybazh · Pull Request #12648 · apache/tvm

zxybazh · 2022-08-30T12:14:40Z

This PR is a follow-up to #12127. It updates a critical local read cache (`d`) in the `data_pack` block and adds scheduling for the kernel parts when they are available.
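For readers unfamiliar with the blocks named above, here is a minimal reference sketch of Winograd F(2x2, 3x3) on a single tile in plain Python. It is not the TVM schedule from this PR. It assumes `d` denotes the 4x4 input tile consumed by the `data_pack` step, as in the usual Winograd formulation, and the helper names (`winograd_tile`, `direct_tile`, `matmul`, `transpose`) are hypothetical and chosen only for illustration.

```python
# Reference computation of Winograd F(2x2, 3x3) for one tile.
# Assumption: `d` is the 4x4 input tile and `g` is the 3x3 kernel.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

# Standard F(2x2, 3x3) transform matrices.
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]

def winograd_tile(d, g):
    """Return the 2x2 output tile computed through the Winograd transforms."""
    kernel_pack = matmul(matmul(G, g), transpose(G))    # G g G^T, 4x4
    data_pack = matmul(matmul(BT, d), transpose(BT))    # B^T d B, 4x4
    # Element-wise product in the transformed domain.
    m = [[kernel_pack[i][j] * data_pack[i][j] for j in range(4)] for i in range(4)]
    return matmul(matmul(AT, m), transpose(AT))         # A^T m A, 2x2

def direct_tile(d, g):
    """Return the same 2x2 output tile with a direct sliding-window sum."""
    return [[sum(d[i + r][j + s] * g[r][s] for r in range(3) for s in range(3))
             for j in range(2)] for i in range(2)]

d = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
g = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
print(winograd_tile(d, g))
print(direct_tile(d, g))
```

Every element of `d` is read several times while forming `B^T d B`, which is one plausible reason a local read cache on `d` matters for performance on a GPU.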

This change brings MetaSchedule (MS) performance in line with AutoTVM for NCHW Conv2d on CUDA. Benchmarking results will follow, and the dispatch priority change will come in a separate PR.

CC @vinx13 @junrushao

cc @Hzfengsy @junrushao1994

zxybazh · 2022-08-30T16:51:17Z

Following up with CUDA benchmarking results on a GeForce RTX 3070. All data layouts are NCHW, with padding (1, 1) and kernel size (3, 3). Each workload is a single conv2d function, dispatched to nn.contrib_conv2d_winograd_without_weight_transform for both AutoTVM and MetaSchedule.

| Data Shape | Kernel Layout | Kernel Shape | Winograd | MS trials | AutoTVM trials | MS Perf (ms) | AutoTVM Perf (ms) | Perf Compare (AutoTVM/MS - 1) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (1, 512, 7, 7) | OIHW | (512, 512, 3, 3) | Yes | 2048 | 1024 | 0.05053524796891558 | 0.05030714765801846 | -0.4513687378% |
| (2, 64, 56, 56) | OIHW | (64, 64, 3, 3) | Yes | 2048 | 1024 | 0.040951505223880594 | 0.04111647433704021 | 0.4028401698% |
| (2, 48, 56, 56) | OIHW | (48, 48, 3, 3) | Yes | 2048 | 1024 | 0.029897891745708238 | 0.030240458663465385 | 1.1457895449% |
| (1, 64, 28, 28) | OIHW | (64, 64, 3, 3) | Yes | 2048 | 1024 | 0.010250015296394941 | 0.010370220173567484 | 1.1727287589% |
| (1, 128, 28, 28) | OIHW | (128, 128, 3, 3) | Yes | 2048 | 1024 | 0.026061056047912482 | 0.02657971123398702 | 1.9901541408% |
| (1, 64, 56, 56) | OIHW | (64, 64, 3, 3) | Yes | 2048 | 1024 | 0.022871235249414843 | 0.023843754776970795 | 4.2521513025% |
| (1, 128, 14, 14) | OIHW | (128, 128, 3, 3) | Yes | 2048 | 1024 | 0.011837556120361336 | 0.012637039015519788 | 6.7537833572% |
| (1, 256, 14, 14) | OIHW | (256, 256, 3, 3) | Yes | 2048 | 1024 | 0.028101132421289355 | 0.03071125413188647 | 9.2883150453% |
| (1, 80, 73, 73) | OIHW | (192, 80, 3, 3) | Yes | 2048 | 1024 | 0.2016123825065274 | 0.22382275871015564 | 11.0163750497% |
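The Perf Compare column is consistent with the ratio AutoTVM time / MS time minus 1, so a positive value means MetaSchedule is faster. The comment does not state the formula; it is inferred here from the reported timings. The short sketch below recomputes the column for two rows, with `perf_compare` as a hypothetical helper name.

```python
# Recompute the "Perf Compare" column from the reported latencies.
# Assumption: the column is (AutoTVM / MS - 1) expressed as a percentage.

def perf_compare(ms_latency, autotvm_latency):
    """Relative speedup of MetaSchedule over AutoTVM, in percent."""
    return (autotvm_latency / ms_latency - 1.0) * 100.0

# (data shape, MS latency in ms, AutoTVM latency in ms), copied from the table.
rows = [
    ("(1, 512, 7, 7)", 0.05053524796891558, 0.05030714765801846),
    ("(1, 80, 73, 73)", 0.2016123825065274, 0.22382275871015564),
]

for shape, ms_latency, autotvm_latency in rows:
    print(f"{shape}: {perf_compare(ms_latency, autotvm_latency):.4f}%")
```

Under this reading, MetaSchedule is slower than AutoTVM only on the first workload and faster on the other eight, despite using twice as many tuning trials.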

zxybazh · 2022-08-30T22:50:02Z

@tvm-bot rerun

zxybazh marked this pull request as ready for review August 30, 2022 12:14

github-actions bot requested a review from Hzfengsy August 30, 2022 22:45

zxybazh added 2 commits August 30, 2022 15:55

Complete winograd scheduling.

fda7b03

Fix test.

bbe5c05

zxybazh force-pushed the bugfix/2022-08-30/fix-winograd branch from b3cdbdf to bbe5c05 Compare August 30, 2022 22:55

vinx13 approved these changes Aug 30, 2022

Hzfengsy approved these changes Aug 31, 2022

Hzfengsy merged commit f7cc992 into apache:main Aug 31, 2022

xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022

[MetaSchedule] Complete NCHW Conv2D Winograd Kernel Scheduling (apach…

62d8d65

…e#12648) * Complete winograd scheduling. * Fix test.

Hzfengsy merged 2 commits into apache:main from zxybazh:bugfix/2022-08-30/fix-winograd
