MLAS: improve quantized depthwise convolution#6513

Merged
tracysh merged 9 commits into master from tracysh/qdwconv
Feb 2, 2021

Conversation

@tracysh
Contributor

@tracysh tracysh commented Jan 31, 2021

Description: Improve the performance of QLinearConv for depthwise convolutions.

Motivation and Context

  1. Added support for an im2col variant that returns an array of buffer pointers instead of memcpy'ing the original data (see https://arxiv.org/pdf/1907.02129.pdf). This saves memcpy time and reduces the size of the intermediate buffer needed to hold the im2col transform. This im2col will also be used to support quantized pooling ops.
  2. Added an AVX2 depthwise convolution kernel.
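The indirection-buffer idea in item 1 can be sketched as follows. This is an illustrative example only, not the MLAS implementation: all names (`BuildIndirectionBuffer`, `DepthwiseConv`) are hypothetical, and it handles a single channel with unit stride. Instead of memcpy'ing each kernel-sized patch into a dense im2col buffer, it stores a pointer per output position and kernel tap, with padded taps pointing at a shared zero buffer:

```cpp
// Sketch of indirection-buffer im2col for a 1-channel depthwise conv.
// Hypothetical names; not the MLAS API.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Classic im2col copies every kernel-sized patch into a dense buffer.
// The indirection variant instead stores, for each output position and
// kernel offset, a POINTER to the input element (or to a shared
// zero-point buffer for padded positions), avoiding the memcpy.
std::vector<const uint8_t*> BuildIndirectionBuffer(
    const uint8_t* input, int H, int W,
    int KH, int KW, int pad, const uint8_t* zero_point_buffer) {
  const int OH = H + 2 * pad - KH + 1;
  const int OW = W + 2 * pad - KW + 1;
  std::vector<const uint8_t*> indirection;
  indirection.reserve(size_t(OH) * OW * KH * KW);
  for (int oh = 0; oh < OH; ++oh) {
    for (int ow = 0; ow < OW; ++ow) {
      for (int kh = 0; kh < KH; ++kh) {
        for (int kw = 0; kw < KW; ++kw) {
          const int ih = oh + kh - pad;
          const int iw = ow + kw - pad;
          const bool inside = ih >= 0 && ih < H && iw >= 0 && iw < W;
          // Padded taps point at the shared zero-point buffer instead
          // of copied zeros, so the kernel needs no bounds checks.
          indirection.push_back(inside ? input + ih * W + iw
                                       : zero_point_buffer);
        }
      }
    }
  }
  return indirection;
}

// The convolution kernel then just walks the pointer array.
std::vector<int32_t> DepthwiseConv(const std::vector<const uint8_t*>& ind,
                                   const int8_t* filter, int taps,
                                   int num_outputs) {
  std::vector<int32_t> out(num_outputs, 0);
  for (int o = 0; o < num_outputs; ++o)
    for (int k = 0; k < taps; ++k)
      out[o] += int32_t(*ind[o * taps + k]) * int32_t(filter[k]);
  return out;
}
```

The intermediate buffer shrinks from one copied byte per (output, tap) pair to one pointer per pair that references the original data in place, which is what saves the memcpy time described above.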

On an older Broadwell test system, the mobilenetv2-7.quant.onnx from the E2E example drops from 4.6ms to 3.6ms per inference. The original FP32 model runs in 3.2ms on the same system.

@tracysh tracysh requested a review from a team as a code owner January 31, 2021 03:12
output_start,
output_count,
worker_col_buffer,
padding_data.data());
Member


The input data is contiguous. Say the size of one image is NSize. Would it be faster to just add NSize to each entry of the indirection buffer for each iteration?
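The suggestion in this comment can be sketched as a pointer-offset pass: since consecutive batch images are NSize bytes apart, the indirection buffer built for image 0 could be reused for the next image by shifting every pointer, rather than rebuilding the buffer. A minimal sketch, with a hypothetical name; in practice, entries pointing at a shared zero/padding buffer would have to be excluded from the shift:

```cpp
// Sketch of reusing an indirection buffer across contiguous images by
// adding the per-image stride NSize to every tap pointer.
// Hypothetical helper; padded entries that point into a shared zero
// buffer must not be advanced in a real implementation.
#include <cstddef>
#include <cstdint>
#include <vector>

void AdvanceIndirectionBuffer(std::vector<const uint8_t*>& indirection,
                              std::ptrdiff_t NSize) {
  for (const uint8_t*& p : indirection)
    p += NSize;  // shift each tap pointer to the next image
}
```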

Member

@yufenglee yufenglee left a comment


:shipit:

@tracysh tracysh merged commit 9a6e715 into master Feb 2, 2021
@tracysh tracysh deleted the tracysh/qdwconv branch February 2, 2021 05:22

2 participants