Optimize Convolution op when using DNNL EP #10051
jywu-msft merged 1 commit into microsoft:master
Conversation
If the Group attribute equals 1, allow the oneDNN library to optimize the memory layout for the device the Convolution operator is being run on. Without this optimization the default NCHW memory layout is used; on CPUs the NCHW memory layout can result in a significant performance decrease. Signed-off-by: George Nash <george.nash@intel.com>
@jywu-msft maintainer, please remember to

/azp run Linux DNNL CI Pipeline

Azure Pipelines successfully started running 1 pipeline(s).

@jywu-msft do you think you can launch the other CI builds now? Thanks!

/azp run Linux CPU Minimal Build E2E CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-python-checks-ci-pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux CPU x64 NoContribops CI Pipeline, Linux CPU x64 NoContribops CI Pipeline

/azp run Linux GPU CI Pipeline, Linux OpenVINO CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, Linux CPU CI Pipeline

Azure Pipelines successfully started running 6 pipeline(s).

Azure Pipelines successfully started running 9 pipeline(s).
Description:
If the Group attribute equals 1, allow the oneDNN library to optimize the
memory layout for the device the Convolution operator is being run on.
Without this optimization the default NCHW memory layout is used; on CPUs
the NCHW memory layout can result in a significant performance decrease.
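For reference, here is a minimal sketch (not the PR's actual code) of the oneDNN v2.x C++ pattern this optimization relies on: describing the convolution's tensors with dnnl::memory::format_tag::any lets the library resolve each layout to whatever is fastest for the target engine, instead of forcing plain NCHW. All shapes and variable names below are illustrative assumptions.

```cpp
// Minimal sketch (oneDNN v2.x C++ API), illustrating the technique rather
// than the PR's actual code: descriptors created with format_tag::any let
// oneDNN resolve each layout to the fastest one for the engine (e.g. a
// blocked format such as nChw16c on AVX-512 CPUs) instead of plain NCHW.
#include <dnnl.hpp>

int main() {
  using namespace dnnl;
  engine eng(engine::kind::cpu, 0);
  stream strm(eng);

  // Hypothetical shapes: N=1, C=32, 28x28 input; 64 3x3 filters, group = 1.
  memory::dims src_dims = {1, 32, 28, 28};
  memory::dims wei_dims = {64, 32, 3, 3};
  memory::dims dst_dims = {1, 64, 26, 26};

  // format_tag::any defers the layout choice to the primitive descriptor.
  memory::desc src_md(src_dims, memory::data_type::f32, memory::format_tag::any);
  memory::desc wei_md(wei_dims, memory::data_type::f32, memory::format_tag::any);
  memory::desc dst_md(dst_dims, memory::data_type::f32, memory::format_tag::any);

  auto conv_d = convolution_forward::desc(
      prop_kind::forward_inference, algorithm::convolution_direct,
      src_md, wei_md, dst_md,
      /*strides=*/{1, 1}, /*padding_l=*/{0, 0}, /*padding_r=*/{0, 0});
  convolution_forward::primitive_desc conv_pd(conv_d, eng);

  // The primitive descriptor now reports the layouts oneDNN picked. Data
  // arriving in plain NCHW would be converted with dnnl::reorder into
  // conv_pd.src_desc() before execution.
  memory src_mem(conv_pd.src_desc(), eng);
  memory wei_mem(conv_pd.weights_desc(), eng);
  memory dst_mem(conv_pd.dst_desc(), eng);

  convolution_forward(conv_pd).execute(strm, {{DNNL_ARG_SRC, src_mem},
                                              {DNNL_ARG_WEIGHTS, wei_mem},
                                              {DNNL_ARG_DST, dst_mem}});
  strm.wait();
  return 0;
}
```

Per the description, the EP opts into this behavior only when the Group attribute is 1; grouped convolutions keep their explicit layout as before. The trade-off is a possible reorder of NCHW user data into the chosen blocked format, which is typically far cheaper than running the convolution itself in NCHW.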
Motivation and Context
Using the default NCHW memory layout resulted in poor performance when running convolution models on CPU.