Supporting 1D deconvolutions in TensorRT #1587

Closed
alecgunny opened this issue Nov 4, 2021 · 9 comments

@alecgunny

alecgunny commented Nov 4, 2021

Description

1D deconvolutions (or transposed convolutions, depending on who you ask) aren't supported in more recent versions of TensorRT, which I understand the operator support matrix explicitly mentions. However, this operation was supported as of TRT 7.2.1, and it became a critical part of a low-latency pipeline my team works on, which is built on top of the 20.11 NGC container release (which runs 7.2.1.6).

Unfortunately, we need to begin moving past Python 3.6, which limits our ability to continue to leverage TensorRT. For reference, the repo linked below includes a repro using the 21.10 container. Are there any plans to extend support to this operator in upcoming releases, and if so, is there a rough timeline?

Environment

TensorRT Version: 8.0.3.4
NVIDIA GPU: V100 16GB
NVIDIA Driver Version: 465.19.01
CUDA Version: 11.4
CUDNN Version:
Operating System: Ubuntu
Python Version (if applicable): 3.8
Tensorflow Version (if applicable):
PyTorch Version (if applicable): 1.10
Baremetal or Container (if so, version): Container, 21.10-py3

Relevant Files

https://github.com/alecgunny/trt-env-repro

Steps To Reproduce

Full steps can be found in the README of the attached repo, with logs for both functioning and non-functioning cases included. The main issue seems to be that after the first deconvolution layer, TensorRT throws the error

[network.cpp::setWeightsName::3013] Error Code 1: Internal Error (The given weights is not used in the network!)

which stops the rest of the network build, cutting subsequent layers off.

@pranavm-nvidia
Collaborator

This should be fixed in TRT 8.2. Could you try out the 8.2 EA release?
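A minimal sketch of how one might check whether an installed release is at or past the 8.2 line mentioned above. The helper names `parse_version` and `at_least` are hypothetical, and the sketch assumes purely numeric dotted version strings such as the `8.0.3.4` reported in this issue.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '8.0.3.4' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def at_least(version: str, minimum: str = "8.2") -> bool:
    """Return True if `version` is at or past `minimum` (major.minor only)."""
    # Only major and minor are compared, so '8.2.0.6' counts as 8.2.
    return parse_version(version)[:2] >= parse_version(minimum)[:2]


# Versions quoted in this thread.
print(at_least("8.0.3.4"))  # the failing 21.10 environment
print(at_least("8.2.0.6"))  # an illustrative 8.2-series version string
```

In a real environment the string would typically come from `tensorrt.__version__` rather than being hard-coded.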

@alecgunny
Author

Yes, that's great to know. I'm building an environment for it now, starting from the CUDA 11.4.2-devel-ubuntu20.04 container. I have things just about working and will update once I'm able to run it. Thanks very much.

Will this be the version released in the 21.11 NGC container? We also use Triton to serve our models at inference time, do you have any insights as to whether these versions will be coordinated in the next release for both containers?

@pranavm-nvidia
Collaborator

@rajeevsrao Do you know which container(s) 8.2 will be part of?

@rajeevsrao
Collaborator

@rajeevsrao Do you know which container(s) 8.2 will be part of?

21.12

@alecgunny
Author

@rajeevsrao got it, thank you

@pranavm-nvidia I tested using 8.2 and the logs indicate that things are working properly. The network builds and the inferred shapes match up correctly. I haven't been able to compare the outputs for accuracy since I'm having trouble building PyCUDA in the container, but it's good to know that we'll likely be able to start using TensorRT in our pipeline again come December.

Thanks for your help, closing this issue as resolved.
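For readers who want to reproduce the shape check described above without a GPU, here is a rough sketch in plain Python. It uses the output-length formula documented for PyTorch's `ConvTranspose1d` and a naive single-channel reference implementation. The function names are hypothetical, and the actual network from the repro repo is not shown here, so the inputs below are made up for illustration.

```python
def conv_transpose1d_out_len(l_in, kernel_size, stride=1, padding=0,
                             output_padding=0, dilation=1):
    """Expected output length of a 1D transposed convolution."""
    return ((l_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


def conv_transpose1d_ref(x, w, stride=1, padding=0):
    """Naive single-channel 1D transposed convolution on Python lists.

    Assumes dilation=1 and output_padding=0. Each input sample scatters a
    scaled copy of the kernel into the output, then `padding` samples are
    cropped from both ends.
    """
    full_len = (len(x) - 1) * stride + len(w)
    full = [0.0] * full_len
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            full[i * stride + k] += xi * wk
    return full[padding:full_len - padding]


# Illustrative input: 3 samples, kernel of 2 taps, stride 2.
x = [1.0, 2.0, 3.0]
w = [1.0, 1.0]
y = conv_transpose1d_ref(x, w, stride=2)
print(y)
print(len(y) == conv_transpose1d_out_len(len(x), len(w), stride=2))
```

A reference like this could also serve as a baseline for an accuracy comparison once engine outputs are available, though real models would need multi-channel support and a tolerance for floating-point differences.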

@wilbur-caper

@rajeevsrao Do you know which container(s) 8.2 will be part of?

21.12

@rajeevsrao hi, does 21.12 mean nvcr.io/nvidia/tensorrt:21.12-py3? I can't find it.

@rajeevsrao
Collaborator

@rajeevsrao hi the 21.12 mean nvcr.io/nvidia/tensorrt:21.12-py3 ? I can't find it

@hererookie the monthly containers are usually shipped towards the end of the month. 21.12 will be available sometime near Dec 25th. Since TensorRT 8.2 GA was ready only after the 21.11 container had been finalized and validated, it didn't make it into 21.11.

@alecgunny
Author

I don't know how far outside of your purview this is, but I know there's generally an attempt to align the software versions between concurrent container releases on NGC. Do you have any insight into whether this will be the version of TensorRT that gets released with the 21.12 Triton container?

@wilbur-caper

This should be fixed in TRT 8.2. Could you try out the 8.2 EA release?

@pranavm-nvidia hi, I hit the same error in #1654. Does TensorRT 8.0 still not support 2D deconvolutions? The TensorRT 8 operator support matrix (https://github.com/onnx/onnx-tensorrt/blob/8.0-EA/docs/operators.md) explicitly says that ConvTranspose is supported for 2D and 3D.
