
[Bug] test_tensorrt.py::test_conv2d_transpose failed #9653

Closed

lp6m opened this issue Dec 6, 2021 · 4 comments
lp6m commented Dec 6, 2021

There is a bug in the code generation for TensorRT that causes the conv2d_transpose test to fail.

run_pytest cython a tests/python/contrib/test_tensorrt.py::test_conv2d_transpose
enabled targets: cuda; cuda -model=unknown -libs=cudnn
pytest marker: gpu
================================================================================================ test session starts ================================================================================================
platform linux -- Python 3.6.9, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
rootdir: /workspace
plugins: forked-1.3.0, xdist-2.3.0, profiling-1.7.0
collected 2 items                                                                                                                                                                                                   

tests/python/contrib/test_tensorrt.py .F 
...
---------------------------------------------------------------------------------------------- Captured stderr call ------------------------------------------------------------------------------------------------
[02:42:32] /workspace/src/runtime/contrib/tensorrt/tensorrt_runtime.cc:300: Finished building TensorRT engine for subgraph tvmgen_default_tensorrt_main_0 with batch size 1
[02:42:36] /workspace/src/runtime/contrib/tensorrt/tensorrt_runtime.cc:300: Finished building TensorRT engine for subgraph tvmgen_default_tensorrt_main_0 with batch size 1
------------------------------------------------------------------------- generated xml file: /workspace/build/pytest-results/a-cython.xml --------------------------------------------------------------------------
============================================================================================== short test summary info ==============================================================================================
FAILED tests/python/contrib/test_tensorrt.py::test_conv2d_transpose[run] - AssertionError: 
=========================================================================================== 1 failed, 1 passed in 32.66s ============================================================================================

This problem has not been detected by CI because CI only runs the compile test and not the run test.

If I set the number of output channels in the test to 1 as shown below, the test passes. I therefore suspect that the weights are not converted correctly during the layout conversion, but I have not yet identified where the bug is.

 def test_conv2d_transpose(run_module):
     def get_graph(
         x_shape=(1, 32, 8, 8),
-        k_shape=(32, 16, 3, 3),
+        k_shape=(32, 1, 3, 3),

Also, the test passes fine at commit ID 92ca782, so it is likely that a subsequent commit introduced this problem.
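
For reference, here is a minimal sketch of the graph the test builds, assuming the shapes shown in the diff above; the helper below is illustrative and not the exact test body:

from tvm import relay

def get_graph(x_shape=(1, 32, 8, 8), k_shape=(32, 16, 3, 3)):
    x = relay.var("x", shape=x_shape, dtype="float32")
    kernel = relay.var("kernel", shape=k_shape, dtype="float32")
    # With k_shape = (32, 16, 3, 3), the second dimension (16) is the number
    # of output channels; setting it to 1 is what makes the test pass.
    out = relay.nn.conv2d_transpose(
        x,
        kernel,
        channels=k_shape[1],
        kernel_size=k_shape[2:4],
        data_layout="NCHW",
    )
    f = relay.Function([x, kernel], out)
    return f, {"x": x_shape, "kernel": k_shape}, ["kernel"]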

Environment

TVM: Release v0.8
TensorRT: 7.2.3

lp6m added the type: bug label Dec 6, 2021
masahi (Member) commented Dec 6, 2021

There were several breaking changes in the conv2d transpose implementation.

Probably the necessary change didn't get applied to the TensorRT backend. @AndrewZhaoLuo @Laurawly

AndrewZhaoLuo (Contributor) commented

Yeah, this is probably my doing. Something odd is going on with the layout transforms.

If you change "nn.conv2d_transpose": ["NCHW", "default"] to "nn.conv2d_transpose": ["NCHW", "IOHW"] in python/tvm/relay/op/contrib/tensorrt.py, it works.

So the default should be IOHW, but I need to look closer.
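
Roughly, the fix amounts to pinning the kernel layout in the desired-layout table that gets handed to ConvertLayout (a sketch; the neighboring conv2d entry is illustrative):

from tvm import relay

# Desired layouts of the kind passed to ConvertLayout before partitioning
# for TensorRT. "default" lets the pass choose the kernel layout; pinning
# it to "IOHW" materializes the transposed-convolution weights in the order
# the TensorRT converter expects.
desired_layouts = {
    "nn.conv2d": ["NCHW", "default"],
    "nn.conv2d_transpose": ["NCHW", "IOHW"],  # was ["NCHW", "default"]
}
convert_layout = relay.transform.ConvertLayout(desired_layouts)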

lp6m (Author) commented Dec 7, 2021

@AndrewZhaoLuo Thank you for your quick reply. I confirmed that this change fixes the problem.
d0bb4bd

AndrewZhaoLuo (Contributor) commented

The above PR is ready for review. I think I just forgot to change the default layout in the old PR.

masahi closed this as completed Dec 14, 2021