
Loading plugins from libmyplugins.so with trtexec fails #107

Closed
philipp-schmidt opened this issue Jul 26, 2020 · 17 comments
Labels: wontfix (This will not be worked on)

Comments

@philipp-schmidt

philipp-schmidt commented Jul 26, 2020

I'm trying to run the serialized yolov4 engine with trtexec, which is included in the TensorRT NGC Docker containers. It basically loads the serialized engine and runs inference - not much different from what yolov4.cpp does.

docker run --gpus all -it --rm -v $(pwd)/tensorrtx:/tensorrtx nvcr.io/nvidia/tensorrt:19.10-py3
cd /workspace/tensorrt/bin
./trtexec --loadEngine=/tensorrtx/yolov4/build/yolov4.engine --verbose

...
[06/26/2020-16:59:06] [E] [TRT] INVALID_ARGUMENT: getPluginCreator could not find plugin Mish_TRT version 1
[06/26/2020-16:59:06] [E] [TRT] safeDeserializationUtils.cpp (259) - Serialization Error in load: 0 (Cannot deserialize plugin since corresponding IPluginCreator not found in Plugin Registry)
[06/26/2020-16:59:06] [E] [TRT] INVALID_STATE: std::exception
[06/26/2020-16:59:06] [E] [TRT] INVALID_CONFIG: Deserialize the cuda engine failed.
[06/26/2020-16:59:06] [E] Engine could not be created
&&&& FAILED TensorRT.trtexec # ./trtexec --loadEngine=/tensorrtx/yolov4/build/yolov4.engine --verbose

This obviously fails because TensorRT never loads the plugin library containing the Mish activation and YOLO layer. So next I tried:

./trtexec --loadEngine=/tensorrtx/yolov4/build/yolov4.engine --plugins=/tensorrtx/yolov4/build/libmyplugins.so --verbose

...
[06/26/2020-17:02:26] [I] Loading supplied plugin library: /tensorrtx/yolov4/build/libmyplugins.so
[06/26/2020-17:02:27] [E] [TRT] INVALID_ARGUMENT: getPluginCreator could not find plugin Mish_TRT version 1
[06/26/2020-17:02:27] [E] [TRT] safeDeserializationUtils.cpp (259) - Serialization Error in load: 0 (Cannot deserialize plugin since corresponding IPluginCreator not found in Plugin Registry)
[06/26/2020-17:02:27] [E] [TRT] INVALID_STATE: std::exception
[06/26/2020-17:02:27] [E] [TRT] INVALID_CONFIG: Deserialize the cuda engine failed.
[06/26/2020-17:02:27] [E] Engine could not be created
&&&& FAILED TensorRT.trtexec # ./trtexec --loadEngine=/tensorrtx/yolov4/build/yolov4.engine --plugins=/tensorrtx/yolov4/build/libmyplugins.so --verbose

So it loads the library containing the plugin, but it does not seem to register the plugin correctly. I checked the same approach with a different TensorRT engine using custom layers and it worked. Is there any trick to loading your custom plugins, @wang-xinyu? Is selecting the correct arch=compute_30;code=sm_30 in the CMakeLists.txt important? Is it possible to run the engine in a "standardized" way outside of your custom loading code in yolov4.cpp? (A sketch of that generic loading path follows at the end of this comment.)
Any hint greatly appreciated! :)

Edit: I tried with different TensorRT versions, including the most recent 20.06 container, which ships with TensorRT 7.1.2.
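
A minimal C++ sketch of that generic path, for reference, assuming the plugin library registers its plugin creators when it is loaded (file names are taken from the commands above; the Logger is a placeholder):

#include <cstdio>
#include <dlfcn.h>
#include <fstream>
#include <iterator>
#include <vector>
#include "NvInfer.h"

// Placeholder logger; any nvinfer1::ILogger implementation will do.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::printf("%s\n", msg);
    }
} gLogger;

int main() {
    // Loading the .so runs its static initializers - that is where
    // REGISTER_TENSORRT_PLUGIN would add the creators to the plugin registry.
    if (!dlopen("./libmyplugins.so", RTLD_LAZY)) return 1;

    // Read the serialized engine from disk.
    std::ifstream file("yolov4.engine", std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());

    // Deserialization fails exactly like the log above if the Mish_TRT
    // creator (and the YOLO layer's) was never registered.
    auto* runtime = nvinfer1::createInferRuntime(gLogger);
    auto* engine = runtime->deserializeCudaEngine(blob.data(), blob.size());
    return engine ? 0 : 1;
}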

@wang-xinyu
Owner

@philipp-schmidt I have never tried trtexec before. You may need to find docs on how to write a TRT plugin that is compatible with trtexec and adapt the plugins in this repo.

And why do you need trtexec? Is it just for testing?

@philipp-schmidt
Author

And why do you need trtexec? Is it just for testing?

Making it compatible would make it possible to deploy the model on Triton Inference Server, and therefore probably also in Kubeflow. Triton uses the same loading logic as trtexec and loads the plugin library via the LD_PRELOAD environment variable (see the command sketch after this comment). This would make all the models in this repo even more awesome, and I'm pretty sure it only takes a minor change in the way the lib is compiled.

You may need to find docs on how to write a TRT plugin that is compatible with trtexec and adapt the plugins in this repo

I will have a look into that. It is probably only a matter of calling the plugin initializer in the library's load logic, so calling REGISTER_TENSORRT_PLUGIN(pluginCreator) statically in the library might already do the trick.
What do you think? I can implement it, but it would be awesome if you could tell me whether you already see anything that would make your code incompatible with the TensorRT plugin guide:

https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#add_custom_layer

https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/custom_operation.html?highlight=ld_preload#tensorrt
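
Per the Triton doc linked above, the server then picks the library up via LD_PRELOAD. An illustrative command only, with the image tag and paths as placeholders (the server binary is named trtserver in older releases and tritonserver in newer ones):

docker run --gpus all --rm \
    -v $(pwd)/models:/models \
    -v $(pwd)/plugins:/plugins \
    -e LD_PRELOAD=/plugins/libmyplugins.so \
    nvcr.io/nvidia/tritonserver:<tag>-py3 \
    tritonserver --model-repository=/models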

@wang-xinyu
Owner

Can you try moving REGISTER_TENSORRT_PLUGIN(pluginCreator) to yololayer.h and rebuilding libmyplugins.so?
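
For illustration, the change might look roughly like this in yololayer.h. A sketch only, assuming the creator class name used in this repo:

namespace nvinfer1
{
    class YoloPluginCreator : public IPluginCreator
    {
        // ... existing creator implementation ...
    };

    // Static registration: this runs when libmyplugins.so is loaded
    // (via --plugins or LD_PRELOAD), so the IPluginCreator is already
    // in the registry when the engine is deserialized.
    REGISTER_TENSORRT_PLUGIN(YoloPluginCreator);
}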

@philipp-schmidt
Author

Yes, that worked right away.

./trtexec --loadEngine=/tensorrtx/yolov4/build/yolov4.engine --plugins=/tensorrtx/yolov4/build/libmyplugins.so

[06/27/2020-09:05:05] [I] Loading supplied plugin library: /tensorrtx/yolov4/build/libmyplugins.so
[06/27/2020-09:05:07] [I] Average over 10 runs is 13.4981 ms (host walltime is 13.6303 ms, 99% percentile time is 13.6868).
[06/27/2020-09:05:07] [I] Average over 10 runs is 12.3215 ms (host walltime is 12.4557 ms, 99% percentile time is 13.5178).
[06/27/2020-09:05:07] [I] Average over 10 runs is 11.1133 ms (host walltime is 11.2475 ms, 99% percentile time is 11.2093).
[06/27/2020-09:05:07] [I] Average over 10 runs is 11.0806 ms (host walltime is 11.2119 ms, 99% percentile time is 11.1464).
[06/27/2020-09:05:07] [I] Average over 10 runs is 11.0867 ms (host walltime is 11.2182 ms, 99% percentile time is 11.1598).
[06/27/2020-09:05:07] [I] Average over 10 runs is 11.0825 ms (host walltime is 11.2146 ms, 99% percentile time is 11.123).
[06/27/2020-09:05:07] [I] Average over 10 runs is 11.0968 ms (host walltime is 11.228 ms, 99% percentile time is 11.2434).
[06/27/2020-09:05:07] [I] Average over 10 runs is 11.0633 ms (host walltime is 11.1949 ms, 99% percentile time is 11.1936).
[06/27/2020-09:05:08] [I] Average over 10 runs is 11.0848 ms (host walltime is 11.2179 ms, 99% percentile time is 11.1514).
[06/27/2020-09:05:08] [I] Average over 10 runs is 11.0722 ms (host walltime is 11.2028 ms, 99% percentile time is 11.1555).

@philipp-schmidt
Author

I can write a little PR on how to build and deploy this to Triton if you want. Triton really steps up the game, I think: it takes care of model deployment and runs the model on multiple GPUs with overlapping data transfer, scheduling, etc.
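
For a rough picture: Triton serves a TensorRT engine from a model repository, one directory per model plus a small config. Layout and config sketched below, with the model name, tensor names, and dims as illustrative assumptions rather than values from this repo; the plugin library still has to be loaded via LD_PRELOAD as described above:

models/
  yolov4/
    config.pbtxt
    1/
      model.plan   # the serialized engine, renamed

config.pbtxt:

name: "yolov4"
platform: "tensorrt_plan"
max_batch_size: 1
input [
  {
    name: "data"
    data_type: TYPE_FP32
    dims: [ 3, 608, 608 ]
  }
]
output [
  {
    name: "prob"
    data_type: TYPE_FP32
    dims: [ 7001, 1, 1 ]
  }
]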

@wang-xinyu
Owner

@philipp-schmidt That would be great. Can you write a tutorial about Triton, put it in tensorrtx/tutorials, and submit a PR?

@cesarandreslopez
Contributor

I can write a little PR on how to build and deploy this to Triton if you want. Triton really steps up the game, I think: it takes care of model deployment and runs the model on multiple GPUs with overlapping data transfer, scheduling, etc.

Looking forward to that guidance! @philipp-schmidt

@philipp-schmidt
Author

Hi @cesarandreslopez @wang-xinyu
I'm working on it now. I'm planning to make a few more changes to optimize the engine (including INT8 optimization and calibration, which in my tests currently gives no speedup, because the custom layers don't support INT8 and introduce a lot of FP32 output layers in between), and I'm thinking about using the ReLU version instead of the Mish activations, so TensorRT can optimize conv/ReLU better.

Currently my yolov3 implementation is twice as fast as our yolov4 here (though it's with INT8 and without combining the three YOLO layers at the end, so some of the difference is expected).

Anyway, there will be so many changes that I might put it in a new repo. Of course I will share any goodies with you here.

@philipp-schmidt
Author

philipp-schmidt commented Aug 16, 2020

Hi,

I implemented everything to deploy to Triton and put it here: https://github.com/isarsoft/yolov4-triton-tensorrt

If Triton is an alternative for you instead of using TensorRT directly, @cesarandreslopez, maybe give it a try (Triton has some amazing benefits, and it has a Python SDK).

@wang-xinyu I included your license and mentioned you in the README. Feel free to copy the tutorial from my README into your repo if you want, or you can link to my repo.

@sudapure

@philipp-schmidt, I appreciate your work, but without client inference code it looks incomplete. Looking forward to a client inference script, if you wish to open-source it.

@philipp-schmidt
Author

philipp-schmidt commented Aug 16, 2020

Client code will be the next thing I'll add. It's pretty straightforward, so it should not take long.

@sudapure Can you add an issue in my repo and tell me whether you want C++ or Python first?

@cesarandreslopez
Contributor

@philipp-schmidt Looks very promising! I will check it out soon and perhaps even contribute versions for other neural networks. Thanks!

@wang-xinyu
Owner

@philipp-schmidt Thanks for sharing. I will add a link in the README.

@stale

stale bot commented Mar 5, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Mar 5, 2021
stale bot closed this as completed Mar 12, 2021
@pshwetank

Hey! Is anyone working on this issue? I want to deploy yolov5 but I seem to be getting an error while using trtexec:

root@dae16f369f04:/usr/src/tensorrt/bin# ./trtexec --loadengine=yolov5s.engine --plugins=libmyplugins.so 
&&&& RUNNING TensorRT.trtexec # ./trtexec --loadengine=yolov5s.engine --plugins=libmyplugins.so
[03/27/2021-13:48:29] [E] Model missing or format not recognized
&&&& FAILED TensorRT.trtexec # ./trtexec --loadengine=yolov5s.engine --plugins=libmyplugins.so

@pshwetank

@philipp-schmidt @wang-xinyu?

@wang-xinyu
Owner

Maybe it's because of a typo. Can you try --loadEngine, not --loadengine?
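
That is:

./trtexec --loadEngine=yolov5s.engine --plugins=libmyplugins.so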
