C++ CUDA Project in Windows - Second call to DNN Module::forward() throws exception #47590

rtrahms · 2020-11-09T04:50:05Z

🐛 Bug

Environment/Setup

Using libtorch 1.7.0 / CUDA 10.1 / MS Visual Studio 2019

Using a TorchScript model (.pt) file.

To Reproduce

1. Confirm that a CUDA device is present by checking torch::cuda::is_available() and torch::cuda::device_count().
2. Load a model using the torch::jit::script::Module class, with torch::jit::load() specifying the TorchScript .pt file to load.
3. Call module.to(kCUDA) to run the DNN on CUDA.
4. Load an image and convert it to a tensor:

half_ = true; // CUDA FP16
cv::cvtColor(img_input, img_input, cv::COLOR_BGR2RGB); // BGR -> RGB
img_input.convertTo(img_input, CV_32FC3, 1.0f / 255.0f); // normalization 1/255
auto tensor_img = torch::from_blob(img_input.data, { 1, img_input.rows, img_input.cols, img_input.channels() }).to(kCUDA);

tensor_img = tensor_img.permute({ 0, 3, 1, 2 }).contiguous(); // BHWC -> BCHW (Batch, Channel, Height, Width)

if (half_) {
    tensor_img = tensor_img.to(torch::kHalf);
}
std::vector<torch::jit::IValue> inputs;
inputs.emplace_back(tensor_img);

5. Perform inference:

torch::jit::IValue output = module_.forward(inputs);

The first time this forward() call is made, it works. Any subsequent call will throw an exception. When using the kCPU version of this code, no exception is thrown.
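For readers unfamiliar with the layout change in the tensor-conversion step, here is a minimal standard-library sketch of the element reordering that permute({ 0, 3, 1, 2 }).contiguous() performs for a batch of one. The function name hwc_to_chw is hypothetical and is not part of libtorch, OpenCV, or the attached project. It only illustrates the index arithmetic and does not touch the GPU or explain the exception.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a libtorch or OpenCV API).
// Reorders an interleaved H x W x C image, the layout OpenCV stores,
// into planar C x H x W, the element order a contiguous BCHW tensor
// has when the batch size is 1.
std::vector<float> hwc_to_chw(const std::vector<float>& src,
                              std::size_t height,
                              std::size_t width,
                              std::size_t channels) {
    // The buffer must hold exactly one value per (row, column, channel).
    assert(src.size() == height * width * channels);

    std::vector<float> dst(src.size());
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            for (std::size_t c = 0; c < channels; ++c) {
                // Source index: channels are interleaved per pixel.
                // Destination index: each channel is its own H x W plane.
                dst[(c * height + y) * width + x] =
                    src[(y * width + x) * channels + c];
            }
        }
    }
    return dst;
}
```

Calling hwc_to_chw(pixels, rows, cols, 3) on an RGB buffer returns all red values first, then all green, then all blue, which is the order a BCHW model input expects.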

Expected behavior

Once a module is loaded, repeated calls to Module::forward() should return without exception.

cc @peterjc123 @maxluk @nbcsm @guyang3532 @gunandrose4u @smartcat2010 @mszhanyi @gmagogsfm

mszhanyi · 2020-11-10T07:35:43Z

@rtrahms, thank you for your feedback. Could you attach your example (.py and .pt) so that I can reproduce it easily?

rtrahms · 2020-11-11T01:35:47Z

It is not a Python file but a libtorch cpp file, plus the cpp/h files for a supporting Detector class. They are built as an MS Visual Studio 2019 project and linked against the libtorch libraries.
libTorchTestApp4.zip

rtrahms · 2020-11-11T01:38:27Z

CUDA PT file for the network attached.
yolov5s_ob12.torchscript-cuda.zip

Labels file as well.
labels_ob_12.txt

mszhanyi · 2020-11-12T04:17:33Z

@rtarquini, thank you for your reply. Could you share the image file?

rtrahms · 2020-11-12T14:14:51Z

It is just a standard JPG image loaded with OpenCV. You can use any image file really. The point is it shouldn't crash on feeding the image into forward() on the second pass.

mszhanyi · 2020-11-16T06:41:26Z

@rtrahms, I tried your example and it passed.
I used py1.7.0+cu110 because cu110 is what is installed on my local machine. OpenCV is opencv-4.5.0 and the VS2019 version is 16.7.7.
For convenience, I added the torch path and the OpenCV path to PATH.

The test result is

The cmakelist file is attached.
CMakeLists.txt

The image I used is Lenna

mszhanyi · 2020-11-17T03:44:14Z

@rtrahms, I double-checked with https://download.pytorch.org/libtorch/cu101/libtorch-win-shared-with-deps-debug-1.7.0%2Bcu101.zip. It passed too.

mszhanyi · 2020-11-18T02:20:07Z

Closing this issue since it's not a bug. @rtrahms, feel free to reopen it if you have any new findings.

DBraun · 2020-12-31T22:11:23Z

@rtrahms What was the solution?

heitorschueroff added the module: windows (Windows support for PyTorch) and oncall: jit (Add this issue/PR to JIT oncall triage queue) labels Nov 9, 2020

github-actions bot added this to Need triage in JIT Triage Nov 9, 2020

mszhanyi added the windows-triaged label Nov 10, 2020

wanchaol moved this from Need triage to In discussion in JIT Triage Nov 10, 2020

rtrahms mentioned this issue Nov 12, 2020

Question: Have you tried this code with CUDA? Nebula4869/YOLOv5-LibTorch#3

Closed

mszhanyi closed this as completed Nov 18, 2020

JIT Triage automation moved this from In discussion to Done Nov 18, 2020

mszhanyi mentioned this issue Nov 20, 2020

C++ project on Windows, linker links against torch_cpu.dll instead torch_cuda.dll #46161

Closed
