
Srivastava kshitij new trt ops #332

Merged — jaybdub merged 29 commits into master from SrivastavaKshitij-new_trt_ops on Jun 10, 2020
Conversation

@jaybdub (Contributor) commented Jun 9, 2020

  • adds an "enabled" flag to tensorrt_converter and add_module_test
  • adds support for many TensorRT 7+ operations
  • switches the plugin build system to use PyTorch extensions

@jaybdub (Contributor, Author) commented Jun 9, 2020

Tested with

| Platform/GPU | PyTorch Version | TensorRT Version | Notes |
|---|---|---|---|
| Jetson Xavier NX | 1.4 | 7.1 | JetPack 4.4 DP |

@jaybdub (Contributor, Author) commented Jun 9, 2020

@SrivastavaKshitij I've created this PR based on your changes in #324.

It applies some refactoring:

  • adds a '--plugins' flag so that PyTorch versions < 1.3 are at least supported without plugins
  • replaces get_trt_version() -> float with trt_version() -> str
    • Python's lexicographic string ordering handles patch releases like '5.2' > '5.1.23', etc.
  • adds an 'enabled' flag to allow inline filtering of converters / tests
    • this avoids having to explicitly separate converters by TRT version, and also lets us re-use test cases for different converter implementations (see the sketch below)
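
For illustration, here is a minimal sketch of the pattern (the helper names follow the PR description; the body and the decorated converter are assumptions, not the exact code):

```python
# Sketch only: trt_version() returns the TensorRT version as a string.
def trt_version():
    import tensorrt as trt  # assumes the TensorRT Python bindings are installed
    return trt.__version__  # e.g. '7.1.0' (a str, not a float)

# Plain string comparison orders patch releases that a float cannot express:
assert '5.2' > '5.1.23'

# The 'enabled' flag can then filter converters (and tests) inline, e.g.:
# @tensorrt_converter('torch.nn.functional.interpolate', enabled=trt_version() >= '7.1')
```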

Are you able to test this for the configurations you use to make sure nothing is broken in the refactor?

@SrivastavaKshitij (Contributor) commented Jun 10, 2020

We should update the README.md file.

Test Results

| NGC Container | TRT Version | PyTorch Version | Status |
|---|---|---|---|
| PyTorch 19.07 | 5.1.5 | 1.2.0a0 | |
| PyTorch 19.12 | 6.0.1 | 1.4.0a0+a5b4d78 | ✔️ |
| PyTorch 20.03 | 7.0.0 | 1.5.0a0+8f84ded | ✔️ |
| Custom container | 5.1 | PyTorch 1.4, torchvision 0.5 | ✔️ |

@jaybdub: I am not sure how to build for PyTorch < 1.3 without plugins. When you say "adds a '--plugins' flag so that PyTorch versions < 1.3 are at least supported without plugins", I don't see a way not to use the --plugins flag in setup.py.

Error related to NGC container 19.07:

/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/serialize/output-archive.h:47:8: note:   no known conversion for argument 2 from ‘c10::IValue’ to ‘torch::serialize::OutputArchive&’
torch2trt/plugins/interpolate.cpp: In member function ‘virtual nvinfer1::Dims torch2trt::InterpolatePlugin::getOutputDimensions(int, const nvinfer1::Dims*, int)’:
torch2trt/plugins/interpolate.cpp:123:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     for (int i = 0; i < size.size(); i++) {
                     ~~^~~~~~~~~~~~~
error: command 'gcc' failed with exit status 1

@jaybdub (Contributor, Author) commented Jun 10, 2020

Thanks for the fast response!

Sorry, I forgot to push the --plugins addition to setup.py. It should be there now.

By updating the README, do you mean adding the test platform matrix?

@SrivastavaKshitij (Contributor) commented Jun 10, 2020

OK, let me test it quickly.

Under the Setup section in the README, we should say to skip the --plugins flag for torch < 1.3 and to pass it for torch >= 1.3. Something like the snippet below.
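
For instance (suggested wording only; the command is the one used later in this thread):

```
python setup.py build_ext --inplace             # PyTorch < 1.3: plugins are skipped
python setup.py build_ext --inplace --plugins   # PyTorch >= 1.3: build the plugins
```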

@SrivastavaKshitij (Contributor) commented Jun 10, 2020

Getting the following error when running the unit tests:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/sw/torch2trt/torch2trt/test.py", line 114, in <module>
    max_error, fps, fps_trt, ms, ms_trt = run(test)
  File "/sw/torch2trt/torch2trt/test.py", line 23, in run
    module_trt = torch2trt(module, inputs_conversion, max_workspace_size=1 << 20,  **self.torch2trt_kwargs)
  File "/sw/torch2trt/torch2trt/torch2trt.py", line 407, in torch2trt
    outputs = module(*inputs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 525, in __call__
    result = self.forward(*input, **kwargs)
  File "/sw/torch2trt/torch2trt/converters/interpolate.py", line 95, in forward
    return F.interpolate(x, self.size, mode=self.mode, align_corners=self.align_corners)
  File "/sw/torch2trt/torch2trt/torch2trt.py", line 217, in wrapper
    converter["converter"](ctx)
  File "/sw/torch2trt/torch2trt/converters/interpolate.py", line 36, in convert_interpolate_plugin
    plugin = get_interpolate_plugin(size=size, mode=mode, align_corners=align_corners)
  File "/sw/torch2trt/torch2trt/converters/interpolate.py", line 9, in get_interpolate_plugin
    from torch2trt.plugins import InterpolatePlugin
ImportError: cannot import name 'InterpolatePlugin'

which makes sense, because we didn't register the plugin. I think we can add an if condition in converters/__init__.py where we don't register the interpolate plugin op but print a warning saying the interpolate function is not compatible with PyTorch < 1.3 (see the sketch below).
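
A minimal sketch of that guard, assuming the interpolate converter lives in converters/interpolate.py (the exact module layout is an assumption):

```python
# In converters/__init__.py: skip plugin-backed converters on old PyTorch.
import warnings
import torch

if torch.__version__ >= '1.3':
    from .interpolate import *  # registers the plugin-based interpolate converter
else:
    warnings.warn('Interpolate plugin is not compatible with PyTorch < 1.3; skipping registration.')
```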

@jaybdub (Contributor, Author) commented Jun 10, 2020

Added a disclaimer to the README.

Also, I set enabled = ... and torch.__version__ >= '1.3' for the plugin-based interpolate converter and the relevant interpolate test cases.

@SrivastavaKshitij (Contributor) commented Jun 10, 2020

That's a very good idea! We can also add a test platform matrix. People use different combinations of PyTorch, torchvision, and TRT; a test platform matrix will give them an idea of which combinations have been tried and tested. We can add a Dockerfile for reference.

Something like this:

FROM nvcr.io/nvidia/tensorrt:19.07-py3
RUN apt-get update
RUN pip install torch==1.4.0 torchvision==0.5.0

RUN git clone --recursive https://github.com/NVIDIA-AI-IOT/torch2trt.git /sw/torch2trt && \
    cd /sw/torch2trt && \
    python setup.py build_ext --inplace

RUN pip install termcolor graphviz

This will help the community test their environments easily.

@SrivastavaKshitij (Contributor)

Final Results

| NGC Container | TRT Version | PyTorch Version | Status |
|---|---|---|---|
| PyTorch 19.07 | 5.1.5 | 1.2.0a0 | ✔️ |
| PyTorch 19.12 | 6.0.1 | 1.4.0a0+a5b4d78 | ✔️ |
| PyTorch 20.03 | 7.0.0 | 1.5.0a0+8f84ded | ✔️ |
| Custom container | 5.1 | PyTorch 1.4, torchvision 0.5 | ✔️ |

@jaybdub (Contributor, Author) commented Jun 10, 2020

Thanks! This gives some confidence; I'll consider adding a test matrix.

It might be useful to log warnings like you suggested, but we can probably save that for another, smaller PR.

A Dockerfile would also be great. Jetson platforms now heavily support cloud-native integration, so it could be used there as well.

I'd like to get this PR merged first, but then it would be great to consider these other features.

@SrivastavaKshitij (Contributor)

That's perfect. I think this PR is ready to be merged :-)

@SrivastavaKshitij (Contributor) commented Jun 10, 2020

[DO NOT MERGE]: I ran the build again and I can't get the plugins to build. I don't know what broke.

Now:

Step 6/7 : RUN git clone --recursive https://github.com/NVIDIA-AI-IOT/torch2trt.git /sw/torch2trt &&     cd /sw/torch2trt &&     git fetch origin pull/332/head:PR332 &&     git checkout PR332 &&     python setup.py build_ext --inplace
 ---> Running in 4c2bd93fe0b4
Cloning into '/sw/torch2trt'...
From https://github.com/NVIDIA-AI-IOT/torch2trt
 * [new ref]         refs/pull/332/head -> PR332
Switched to branch 'PR332'
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
running build_ext

Earlier:

Step 6/7 : RUN git clone --recursive https://github.com/NVIDIA-AI-IOT/torch2trt.git /sw/torch2trt &&     cd /sw/torch2trt &&     git fetch origin pull/332/head:PR332 &&     git checkout PR332 &&     python setup.py build_ext --inplace
 ---> Running in 7c200b4d45c4                 
Cloning into '/sw/torch2trt'...                       
From https://github.com/NVIDIA-AI-IOT/torch2trt                                                                                                                                                                                                                                  
 * [new ref]         refs/pull/332/head -> PR332                                                                                                                                                                                                                                 
Switched to branch 'PR332'                                                                                                                            
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'                                    
running build_ext                                                                                                                    
building 'plugins' extension                                                                                                            
creating /sw/torch2trt/build                             
creating /sw/torch2trt/build/temp.linux-x86_64-3.6     
creating /sw/torch2trt/build/temp.linux-x86_64-3.6/torch2trt                                        
creating /sw/torch2trt/build/temp.linux-x86_64-3.6/torch2trt/plugins                              
Emitting ninja build file /sw/torch2trt/build/temp.linux-x86_64-3.6/build.ninja...                       
Compiling objects...                                                                                            
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] c++ -MMD -MF /sw/torch2trt/build/temp.linux-x86_64-3.6/torch2trt/plugins/interpolate.o.d -pthread -B /opt/conda/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/usr/include/aarch64-linux-gnu -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/include/python3.6m -c -c /sw/torch2trt/torch2trt/plugins/interpolate.cpp -o /sw/torch2trt/build/temp.linux-x86_64-3.6/torch2trt/plugins/interpolate.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=plugins -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++14
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/nn/utils.h:5:0,
                 from /opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/nn.h:10,  
                 from /opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/all.h:7,                    
                 from /opt/conda/lib/python3.6/site-packages/torch/include/torch/extension.h:4,                                  
                 from /sw/torch2trt/torch2trt/plugins/interpolate.cpp:1:                                                             
/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/nn/utils/rnn.h: In function ‘torch::nn::utils::rnn::PackedSequence torch::nn::utils::rnn::pack_sequence(c10::ArrayRef<at::Tensor>, bool)’:
/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/nn/utils/rnn.h:336:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for (int64_t i = 0; i < sequences.size(); i++) {
                       ~~^~~~~~~~~~~~~~~~~~
/sw/torch2trt/torch2trt/plugins/interpolate.cpp: In member function ‘virtual nvinfer1::Dims torch2trt::InterpolatePlugin::getOutputDimensions(int, const nvinfer1::Dims*, int)’:
/sw/torch2trt/torch2trt/plugins/interpolate.cpp:123:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     for (int i = 0; i < size.size(); i++) {
                     ~~^~~~~~~~~~~~~
/sw/torch2trt/torch2trt/plugins/interpolate.cpp: In member function ‘virtual bool torch2trt::InterpolatePlugin::supportsFormat(nvinfer1::DataType, nvinfer1::PluginFormat) const’:
/sw/torch2trt/torch2trt/plugins/interpolate.cpp:131:33: warning: ‘kNCHW’ is deprecated [-Wdeprecated-declarations]
     if (format != PluginFormat::kNCHW) {
                                 ^~~~~
In file included from /usr/include/x86_64-linux-gnu/NvInferRuntime.h:59:0,
                 from /usr/include/x86_64-linux-gnu/NvInfer.h:53,
                 from /sw/torch2trt/torch2trt/plugins/interpolate.cpp:6:
/usr/include/x86_64-linux-gnu/NvInferRuntimeCommon.h:243:5: note: declared here
     kNCHW TRT_DEPRECATED_ENUM = kLINEAR, //! <-- Deprecated, used for backward compatibility
     ^~~~~
/sw/torch2trt/torch2trt/plugins/interpolate.cpp:131:33: warning: ‘kNCHW’ is deprecated [-Wdeprecated-declarations]
     if (format != PluginFormat::kNCHW) {
                                 ^~~~~
In file included from /usr/include/x86_64-linux-gnu/NvInferRuntime.h:59:0,
                 from /usr/include/x86_64-linux-gnu/NvInfer.h:53,
                 from /sw/torch2trt/torch2trt/plugins/interpolate.cpp:6:
/usr/include/x86_64-linux-gnu/NvInferRuntimeCommon.h:243:5: note: declared here
     kNCHW TRT_DEPRECATED_ENUM = kLINEAR, //! <-- Deprecated, used for backward compatibility
     ^~~~~
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/torch2trt
g++ -pthread -shared -B /opt/conda/compiler_compat -L/opt/conda/lib -Wl,-rpath=/opt/conda/lib -Wl,--no-as-needed -Wl,--sysroot=/ /sw/torch2trt/build/temp.linux-x86_64-3.6/torch2trt/plugins/interpolate.o -L/usr/lib/aarch64-linux-gnu -L/opt/conda/lib/python3.6/site-packages/torch/lib -L/usr/local/cuda/lib64 -lnvinfer -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-3.6/torch2trt/plugins.cpython-36m-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.6/torch2trt/plugins.cpython-36m-x86_64-linux-gnu.so -> torch2trt

@jaybdub (Contributor, Author) commented Jun 10, 2020

Which system configuration is this? I only made a small change since the last message, but it shouldn't affect building.

@SrivastavaKshitij (Contributor) commented Jun 10, 2020

Ohhhh! It broke after 57c8188. My Docker intermediate images were cached, hence I didn't catch it. This time I ran with --no-cache and bisected all the commits; 57c8188 is the commit that broke the plugin. Sorry for the oversight.

@SrivastavaKshitij (Contributor) commented Jun 10, 2020

Makes sense.

setup(
    name='torch2trt',
    version='0.1.0',
    description='An easy to use PyTorch to TensorRT converter',
    packages=find_packages(),
    ext_package='torch2trt',
    ext_modules=ext_modules,
    cmdclass={'build_ext': BuildExtension}
)

and ext_modules = [] on line 12. I think it's not appending properly (see the sketch below).
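
For reference, a sketch of the flag handling that would feed the setup() call above (details assumed, not the PR's exact code):

```python
import sys
from torch.utils.cpp_extension import CppExtension

ext_modules = []

if '--plugins' in sys.argv:
    sys.argv.remove('--plugins')  # setuptools does not recognize this flag
    ext_modules.append(
        CppExtension(
            'plugins',
            ['torch2trt/plugins/interpolate.cpp'],
            libraries=['nvinfer'],  # link TensorRT, as in the g++ line in the log above
        )
    )
```

With the flag omitted, ext_modules stays empty and build_ext has nothing to compile.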

@SrivastavaKshitij (Contributor)

There is no problem in the workflow.

@jaybdub: Sorry John, my bad; there was confusion on my side. I didn't add the --plugins flag.

When I ran the following command, it worked:
python setup.py build_ext --inplace --plugins

So after 57c8188 I was supposed to add the --plugins flag, but I didn't, and the Docker images were cached, so I never got an error. When I rebuilt the Dockerfile with --no-cache, I got the error and thought the workflow was broken.

@jaybdub (Contributor, Author) commented Jun 10, 2020

Did you add --plugins to the setup.py call?

Judging from the error you sent, it seems like it's attempting to build the plugin.

My guess would be that it's a linking / include-directory issue in the extension.

Right now it's hard-coded to add the TensorRT paths for Jetson platforms. Maybe your Docker environment was already set up, so it didn't need this.

Can you verify whether this is the issue?

If so, we can probably search for the correct path in setup.py, or, if it can't be found, allow the user to pass the path using flags (see the sketch below).
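
Something like this, perhaps (a hypothetical sketch; the candidate paths come from the build logs above):

```python
import os

def find_trt_include(candidates):
    # Return the first directory that actually contains the TensorRT headers.
    for d in candidates:
        if os.path.exists(os.path.join(d, 'NvInfer.h')):
            return d
    return None  # caller could fall back to a user-supplied flag (hypothetical)

trt_include = find_trt_include([
    '/usr/include/aarch64-linux-gnu',  # Jetson, the currently hard-coded path
    '/usr/include/x86_64-linux-gnu',   # x86 .deb install, as seen in your log
])
```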

@jaybdub (Contributor, Author) commented Jun 10, 2020

Ah, I see. So just to confirm: building with --plugins resolved the issue, and the test matrix is what you listed previously (all succeed)?

@SrivastavaKshitij (Contributor) commented Jun 10, 2020

@jaybdub: I ran the build command again for all the combinations in the test matrix and also ran the unit tests. Everything is fine. Sorry for the false alarm!

@jaybdub (Contributor, Author) commented Jun 10, 2020

No worries, good to hear!

Going to do a few sanity tests on torchvision models and then merge.

@jaybdub jaybdub merged commit 1f66266 into master Jun 10, 2020
@jaybdub jaybdub deleted the SrivastavaKshitij-new_trt_ops branch June 10, 2020 04:09
jaybdub added a commit that referenced this pull request Jun 28, 2021