🐛 [Bug] Torch Tensor RT crash when trying to compile a script module on Windows (C++) #1144
The release notes for version 1.1.0 indicate
I cannot guarantee it is related, but I would try with TensorRT 8.2.
Thanks @gcuendet! I will try with TensorRT 8.2.
Unfortunately, I'm facing the same problem even using TensorRT 8.2.5.1.
Now that I have a second look at it, I think it's just your CMake. There are a few problems, but typically, when you do
you explicitly link a library (torchtrt.lib / dll) into your executable. That's a pretty old-fashioned way of using CMake. The main idea of what is sometimes referred to as "modern" CMake is to use targets instead. A CMake target can encapsulate more information about what to link and how to link it (such as the list of libraries, obviously, but also the headers that are exposed, the dependencies to link against, specific flags to use, etc.). The branch that you are using generates a CMake finder for torch-tensorRT that allows you to do
in the exact same way you are doing for libtorch.
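A minimal sketch of the target-based approach described above, assuming the finder exposes a `torchtrt` target (paths and target names here are illustrative, not the branch's confirmed API):

```cmake
# Hypothetical consumer CMakeLists.txt using imported targets rather than
# raw library paths. The torchtrt target name is an assumption.
cmake_minimum_required(VERSION 3.17)
project(example-app LANGUAGES CXX)

find_package(Torch REQUIRED)      # provides the torch target, as for libtorch
find_package(torchtrt REQUIRED)   # provided by the CMake branch's generated finder

add_executable(example-app main.cpp)

# Linking against targets pulls in include directories, transitive
# dependencies, and compile flags automatically.
target_link_libraries(example-app PRIVATE torch torchtrt)
```

The point of the target form is that consumers never spell out `.lib`/`.dll` paths themselves; the finder carries that information.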
Note that now
If you want to see an example of that, check the examples directory on that "CMake" branch: that's exactly how these are linked to torch-tensorRT, using CMake. There is one small problem that you might encounter when doing so. Again, as a conclusion, please have a look at how the torch-tensorRT examples are compiled and linked to torch-tensorRT.
Ok, I tried to change my
And it fails with:
So I tried with
And now it fails with
Since there's no
And I get
So I checked
So I added the following:
And I got another error:
But at this point I guess I'm just doing something wrong.
I think the main problem here is that you didn't install torchtrt. I am assuming that from looking at:
which looks like your build folder. My assumption is that you did something like:
but not the install step:
Is that correct? You should install it (the last command above); when copying everything into the install folder, some paths get "corrected" (typically the finders in cmake/Module get staged at the right place). Then the path
becomes the path to the install folder:
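The configure / build / install sequence being described might look like the following; the generator, paths, and prefix are illustrative, not the exact commands used in the thread:

```shell
# Configure, build, and stage the install tree. The install step is what
# places the CMake finders at the right relative paths inside the prefix.
cmake -S . -B build -DCMAKE_PREFIX_PATH="C:/libtorch" -DTensorRT_ROOT="C:/TensorRT"
cmake --build build --config Release
cmake --install build --prefix "C:/torch-tensorrt-install"
```

After this, the install prefix (not the build folder) is what downstream projects should point their `torchtrt_DIR` at.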
Actually, writing this answer reminded me of a similar recent issue. Maybe you could have a look at the answer there; it might be helpful.
Oh right, I added the install option in VS as a build command argument in the CMakeSettings for x64-Release, and it built an install directory with everything needed by CMake to work properly (almost: it still wanted me to specify TensorRT_ROOT). This is my new CMakeLists.txt:
Unfortunately it still crashes at the same point,
@narendasan Is this something that has ever happened before, or am I the first to experience this kind of behaviour? At this point I don't understand what I'm possibly doing wrong.
As an additional note, I was able to optimize the model on the same PC by using WSL. Then I tried to create a super trivial executable that just loads the model.
It compiles, and when I try to run the executable it asks me to copy the torch .dll files into the build directory (a known problem when using torch on Windows), but it doesn't ask me to copy
It crashes when it tries to load the model.
Update: by catching the exception as a
I was able to get the following message:
So I guess, based on #642, that maybe VS isn't linking torchtrt_runtime at all; that's probably due to the fact that the "-Wl,--no-as-needed" flag isn't working as expected (it is a GNU ld flag, which MSVC's linker does not understand). This is the full output from VS (compilation + linking):
And if you check the linking step ([2/2]), it seems that everything's fine, since "C:\src\Torch-TensorRT\out\install\x64-Release\lib\torchtrt_runtime.lib" appears. But still, it doesn't ask me to copy the .dll when I run the program.
That's a very good point! If my understanding is correct, on Linux, torch-tensorRT relies on static initialisation to register the Engine class with torch (see here). That works if you force linking to that library, even when no symbols are used (that's always the case for torchtrt_runtime), so that it's automatically loaded by the consumer executable/library. Anyway, a possible (ugly) workaround is to link to torchtrt (and not torchtrt_runtime) and use at least one symbol from that lib (just instantiate a
Not sure what the proper design will be, but I guess this will need to be addressed at some point if the Windows support becomes a bit more official, right @narendasan?
Update: By manually loading the .dlls the script does work! This of course isn't a proper solution and should be properly addressed in #1058 (maybe it would be beneficial to change the CMake example too, @gcuendet). If we want to optimize a model through Torch-TensorRT on Windows, make sure to manually load
If we want to use an optimized model on Windows, make sure to manually load
Hello @andreabonvini. I set up the code and fixed all compile-time errors, but now I'm getting weird linking errors for each access of torch_tensorrt:: in my code. Can you please explain the steps that made torch_tensorrt work on Windows? I'll be extremely grateful to you.
Hi @noman-anjum-retro, can you share your CMakeLists.txt, your code, and the linking errors you're receiving? |
Great that you had it working @andreabonvini! 😄 In my opinion, that issue is independent from the CMake support in itself. What I mean by that is:
Still, that problem puzzled me and I might have a proposal.
So basically, by just including that header (as documented in the README), the library is linked, since one symbol is required. So that could easily be done in torch-tensorRT as well. Moreover, that would be consistent with torchvision. What do you think @narendasan?
Hello @andreabonvini. I started with the basic cmake:
I ran CMake with the command: cmake -G "Visual Studio 16 2019" -DCMAKE_PREFIX_PATH=D:\Codes\libtorch .
Initially the code was showing compilation errors, as I copied it from the documentation. But now the compilation errors are fixed. When I run the code, each reference to torch_tensorrt throws a linking error. I'm sharing one below: Severity Code Description Project File Line Suppression State
You aren't linking to
Thanks, I'll take a look at it.
How did you build torch_tensorrt on Windows? Can you please explain it? When I run bazel build for the default WORKSPACE file it throws an error:
You have to build and install the library through CMake; you can do that only from this branch.
Alright, that makes sense now. Thanks a lot!
Hey @andreabonvini, I tried compiling with the above-mentioned steps. The code runs well with this script:
return 0; }
However, when I try to compile the script module it throws a linking error:
Any idea what's wrong?
It still seems that you are not linking against
Environment
So the steps are as follows: I copied this PR into my system and compiled it with the command
project(example-app)
set(TensorRT_ROOT "C:\Program Files\TensorRT\TensorRT-8.4.1.5\")
set(torchtrt_DIR "C:\Program Files (x86)\Torch-TensorRT\lib\cmake\torchtrt")
find_package(torchtrt REQUIRED)
find_package(Torch REQUIRED)
add_executable(example-app create_trt_module.cpp)
target_link_libraries(example-app PRIVATE torch "${TORCH_LIBRARIES}" "-Wl,--no-as-needed" torchtrt_runtime "-Wl,--no-as-needed")
Additional Step:
@noman-anjum-retro, as already said, you are not linking against
Moreover, using TensorRT 8.4.1.5 might not be the best idea, because of this.
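For reference, a sketch of what the linking line might look like when linking against the full torchtrt target instead of only torchtrt_runtime, per the workaround discussed earlier in the thread. The target name is an assumption, and the GNU-specific "-Wl,--no-as-needed" flag is omitted because MSVC's linker does not understand it:

```cmake
# Hypothetical fix: link against the torchtrt target so that at least one
# of its symbols can be referenced, keeping the DLL loaded on Windows.
find_package(Torch REQUIRED)
find_package(torchtrt REQUIRED)

add_executable(example-app create_trt_module.cpp)
target_link_libraries(example-app PRIVATE torch "${TORCH_LIBRARIES}" torchtrt)
```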
Thank you so much @andreabonvini. It is working now!
Update: To correctly load
(This is the output from running |
@gcuendet At first glance that seems reasonable. We could make a
Hello, it is working fine in Debug mode in Visual Studio; however, when I switched to Release mode (I also changed libtorch to the Release build), it is unable to load the TrTCompiledEngine and throws the following error. Any idea about it? I need to switch to Release mode because the OpenCV debug mode is very slow, and it's ruining the gain obtained via TRT.
@noman-anjum-retro Seems like the runtime is not properly registering. Are you running this in C++? Can you try the latest master?
Yeah, It's working now with the latest master. Thanks |
This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.
Bug Description
I can't compile a script module with Torch-TensorRT.
This is my code:
This is the error I get:
First, a WARNING gets printed:
WARNING: [Torch-TensorRT] - Interpolation layer will be run through ATen, not TensorRT. Performance may be lower than expected
and then, as you can see from the screenshot, I get an exception:
read access violation. creator was nullptr
when running the following lines:
The file interpolate.cpp is located at path/to/Torch-TensorRT/core/conversion/converters/impl.
What am I doing wrong?
This is my CMakeLists.txt:
I exported the traced script module with the following code:
Environment