🐛 Bug
I'm trying to load a TorchScript module serialized as in the tutorial, but I get `vector<T> too long`
when trying to print the modules in the model, as well as a crash if `model->forward`
is called:
To Reproduce
Serialize model as in the tutorial:
(piptorch) C:\...\>python
Python 3.6.7 |Anaconda, Inc.| (default, Oct 28 2018, 19:44:12) [MSC v.1915 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import torchvision
>>> model = torchvision.models.resnet18()
>>> example = torch.rand(1, 3, 224, 224)
>>> traced_script_module = torch.jit.trace(model, example)
>>> output = traced_script_module(torch.ones(1, 3, 224, 224))
>>> output[0, :5]
tensor([-0.3891, -0.0314, 0.4143, 0.0229, 0.0064], grad_fn=<SliceBackward>)
>>> traced_script_module.save("model.pt")
>>> exit()
My C++ code looks like this:

```cpp
#include "pch.h"
#include <torch/script.h> // One-stop header.

#include <cassert>
#include <fstream>
#include <iostream>
#include <memory>

#include "ConsoleApplication1.h"

int main(int argc, const char* argv[]) {
  if (argc != 2) {
    std::cerr << "usage: example-app <path-to-exported-script-module>\n";
    return -1;
  }
  std::cout << "loading: " << argv[1] << '\n';
  std::ifstream in(argv[1], std::ios_base::binary);
  try {
    // Deserialize the ScriptModule from the stream using torch::jit::load().
    std::shared_ptr<torch::jit::script::Module> module = torch::jit::load(in);
    in.close();
    assert(module != nullptr);

    // Print the names of the submodules.
    auto mods = module->get_modules();
    for (const auto& m : mods) {
      std::cout << m.key() << "\n";
    }

    // Move the module and an example input to the GPU, then run the model.
    c10::Device gpu(c10::DeviceType::CUDA, 0);
    module->to(gpu);
    std::cout << "module moved to gpu ok\n";

    torch::jit::Stack inputs;
    auto tensor = torch::ones({1, 3, 224, 224}).to(gpu);
    inputs.push_back(tensor);
    std::cout << "inputs constructed ok\n";

    module->forward(inputs);
    std::cout << "ok\n";
  } catch (const c10::Error& e) {
    in.close();
    std::cerr << e.what() << "\n";
  } catch (const std::exception& exn) {
    in.close();
    std::cerr << exn.what() << "\n";
  }
}
```
But when I run the above I get:
PS C:\Users\a\Documents\libtorch-shared-with-deps-1.0\proj\x64\Debug> .\ConsoleApplication1.exe model.pt
loading: model.pt
vector<T> too long
If I remove the `get_modules()` call and the loop that prints the module names, I get:
PS C:\Users\a\Documents\libtorch-shared-with-deps-1.0\proj\x64\Debug> .\ConsoleApplication1.exe model.pt
loading: model.pt
module moved to gpu ok
inputs constructed ok
PS C:\Users\a\Documents\libtorch-shared-with-deps-1.0\proj\x64\Debug>
(i.e., the process crashes when `module->forward()` is called, with no output after "inputs constructed ok")
If I run it in the VS2017 debugger, the crash happens inside `std::unordered_map`. Walking up the stack trace leads into torch's code.
Environment
Please copy and paste the output from our environment collection script (or fill out the checklist below manually).
Collecting environment information...
PyTorch version: 1.0
Is debug build: No
CUDA used to build PyTorch: 9.0
OS: Microsoft Windows 10 Home
GCC version: Could not collect
CMake version: version 3.13.0-rc1
Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 9.0.176
GPU models and configuration: GPU 0: GeForce GTX 1070 With Max-Q Design
Nvidia driver version: 411.70
cuDNN version: Probably one of the following:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin\cudnn64_7.dll
Versions of relevant libraries:
[pip] Could not collect
[conda] torch 1.0 <pip>
[conda] torchvision 0.2.1 <pip>
NB: I installed PyTorch 1.0 & LibTorch from https://github.com/peterjc123/pytorch-scripts, as I couldn't find an official Windows build or get it to build myself.
Additional context
I had some 'std' ambiguous symbol errors in some of the headers when I included them in Visual Studio 2017, on uses of `std::vector` and `std::unordered_map`. I had to change them to `::std::vector` and `::std::unordered_map` to get it to compile.