Stride-1 tf.nn.conv2d with XLA is 1.5x slower than without XLA, as is stride-2 tf.nn.depthwise_conv2d #60312
Labels
comp:xla, stat:awaiting tensorflower, TF 2.12, type:bug, type:performance
Issue Type
Bug
Have you reproduced the bug with TF nightly?
Yes
Source
binary
Tensorflow Version
2.12.0, 2.13.0-dev20230412
Custom Code
Yes
OS Platform and Distribution
Google Colab
Mobile device
No response
Python version
Google Colab
Bazel version
No response
GCC/Compiler version
No response
CUDA/cuDNN version
11.8
GPU model and memory
No response
Current Behaviour?
See the example below to reproduce. Here are the speed test results:
stride-1 conv 40.66
stride-1 conv_jit 64.11 // conv2d is slower with JIT but only if stride=1
stride-2 conv 40.18
stride-2 conv_jit 28.05 // when stride=2 it is FASTER with JIT
stride-1 dwconv 9.82
stride-1 dwconv_jit 5.72 // dwconv is faster with JIT but only if stride=1
stride-2 dwconv 2.59
stride-2 dwconv_jit 4.2 // when stride=2 it is SLOWER with JIT
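To make the comparison explicit, the JIT-to-plain ratios can be worked out from the numbers above. This is only a sketch of the arithmetic: the report does not state the units, and the dictionary layout and the `jit_ratio` helper are illustrative names, not part of the original notebook. It assumes, as the title and inline comments imply, that a larger number means slower.

```python
# Timings copied from the report above (units are not stated in the report).
# Each entry is (without JIT, with JIT); a larger value is assumed to mean slower.
timings = {
    ("conv", 1): (40.66, 64.11),
    ("conv", 2): (40.18, 28.05),
    ("dwconv", 1): (9.82, 5.72),
    ("dwconv", 2): (2.59, 4.2),
}


def jit_ratio(plain, jit):
    """Return jit / plain; a value above 1 means the JIT version is slower."""
    return jit / plain


# Hypothetical helper structure: ratio per (op, stride) pair.
ratios = {key: jit_ratio(*pair) for key, pair in timings.items()}

for (op, stride), ratio in ratios.items():
    verdict = "slower" if ratio > 1 else "faster"
    print(f"stride-{stride} {op}: JIT/plain = {ratio:.2f} ({verdict} with JIT)")
```

With these figures, the two regressions (stride-1 conv and stride-2 dwconv) both come out at roughly 1.6x, while the other two cases run in about 0.6x to 0.7x of the non-JIT time.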
Standalone code to reproduce the issue
https://colab.research.google.com/drive/1zqqPVVKt4ILRA1rCoWjB1uOtB3D0hDc-?usp=sharing
Relevant log output
No response