
Loading traced pytorch model to C++ #124009

Closed
ZarinaMaks opened this issue Apr 13, 2024 · 5 comments
Labels
intel This tag is for PR from Intel
module: cpu CPU specific problem (e.g., perf, algorithm)
module: mkl Related to our MKL support
module: windows Windows support for PyTorch
triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

ZarinaMaks commented Apr 13, 2024

🐛 Describe the bug

Will be grateful for any help :(

I'm having a problem with using a PyTorch 2.2.2+cpu pretrained model in C++ via LibTorch 2.2.2 (CPU version).
I'm also using Visual Studio 2022 for the C++ project, which is located on the D:\ drive. I've added LibTorch to CMake, and it successfully loads the model and works until the line:

at::Tensor output = module.forward(inputs).toTensor();

After this line it crashes with the error: INTEL MKL ERROR: The specified module could not be found. mkl_avx2.1.dll. Intel MKL FATAL ERROR: Cannot load mkl_avx2.1.dll or mkl_def.1.dll. I haven't figured out what is wrong with LibTorch here. It seems the problem is in some part of the library related to Python, which is strange because the model itself loads in C++ via LibTorch successfully.

The .cpp code is here:

#include <torch/script.h>

#include <iostream>
#include <string>
#include <vector>
...
    // Load the traced TorchScript module from disk.
    std::string path = "traced_energynn_model.pt";
    torch::jit::script::Module module;
    try {
        module = torch::jit::load(path);
    }
    catch (const c10::Error& e) {
        std::cerr << "error loading the model\n";
        return -1;
    }
    std::cout << "Model traced_energynn_model loaded fine\n";

    // Build a single (1, 2) float input and run a forward pass.
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::randn({ 1, 2 }));

    at::Tensor output = module.forward(inputs).toTensor();  // crashes here with the MKL error
    std::cout << output << "\n";
...

The way I saved the model in python 3.9:

import torch
from random import randrange

# Pick a random row from the test set and use it as the example input for tracing.
x = X_test.reset_index(drop=True).iloc[randrange(len(X_test))]
example = torch.tensor([x]).to(torch.float32)

traced_model = torch.jit.trace(model, example)
traced_model.save("traced_energynn_model.pt")
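
A quick sanity check (a minimal sketch, not part of the original script) is to re-load the traced file and run it once in Python with the same (1, 2) float input shape used on the C++ side; if this works, the crash is isolated to the C++/MKL runtime rather than the traced model itself:

import torch

# Sketch: re-load the traced module and run a single forward pass in Python.
reloaded = torch.jit.load("traced_energynn_model.pt")
out = reloaded(torch.randn(1, 2))
print(out)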

Versions

Collecting environment information...
PyTorch version: 2.2.2+cpu
Is debug build: False
CUDA used to build PyTorch: Could not collect
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 10 Home
GCC version: Could not collect
Clang version: Could not collect
CMake version: version 3.28.1
Libc version: N/A

Python version: 3.9.13 (main, Aug 25 2022, 23:51:50) [MSC v.1916 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19045-SP0
Is CUDA available: False
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1660 Ti
Nvidia driver version: 516.94
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=2592
DeviceID=CPU0
Family=198
L2CacheSize=1536
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=2592
Name=Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] flake8==4.0.1
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.25.2
[pip3] numpydoc==1.4.0
[pip3] torch==2.2.2
[pip3] torchvision==0.17.1
[conda] blas 1.0 mkl
[conda] mkl 2021.4.0 haa95532_640
[conda] mkl-service 2.4.0 py39h2bbff1b_0
[conda] mkl_fft 1.3.1 py39h277e83a_0
[conda] mkl_random 1.2.2 py39hf11a4ad_0
[conda] numpy 1.25.2 pypi_0 pypi
[conda] numpydoc 1.4.0 py39haa95532_0
[conda] torch 2.2.2 pypi_0 pypi
[conda] torchvision 0.17.1 pypi_0 pypi

cc @peterjc123 @mszhanyi @skyline75489 @nbcsm @vladimir-aubrecht @iremyux @Blackhex @cristianPanaite @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

@jgong5 jgong5 added module: windows Windows support for PyTorch module: cpu CPU specific problem (e.g., perf, algorithm) labels Apr 15, 2024
@zou3519 zou3519 added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Apr 15, 2024
@LLimerence

Hello, have you solved this problem?

xuhancn (Collaborator) commented Apr 19, 2024

I checked the pytorch builder scripts: https://github.com/search?q=repo%3Apytorch%2Fbuilder+path%3A%2F%5Ewindows%5C%2Finternal%5C%2F%2F+mkl&type=code

It seems the Windows libtorch packages do not copy all of the MKL dependent DLLs, while the current PyTorch Linux builds already link MKL statically.
@ZarinaMaks could you please share example code so that I can reproduce the issue?
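
For anyone who wants to confirm this locally, a small sketch along these lines lists which MKL DLLs a libtorch download actually ships (the path below is a hypothetical placeholder for wherever libtorch was unzipped):

from pathlib import Path

# Sketch: list the MKL DLLs bundled in a libtorch distribution.
# "D:/libs/libtorch/lib" is an assumed location; replace with your unzip path.
libtorch_lib = Path("D:/libs/libtorch/lib")
print(sorted(p.name for p in libtorch_lib.glob("mkl*.dll")))
# If mkl_avx2.1.dll / mkl_def.1.dll are not in the list, MKL's runtime dispatch
# cannot find them and fails with the error reported above.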

xuhancn (Collaborator) commented Apr 19, 2024

pytorch/builder#1790 @jgong5 Please review and comment on this PR.

malfet pushed a commit to pytorch/builder that referenced this issue Apr 24, 2024
From pytorch issue pytorch/pytorch#124009, I found that libtorch seems to use the shared MKL library and is missing some MKL DLL files.
1. Currently the PyTorch Linux builds already use the static MKL library.
2. Windows can also support the static MKL library; I have validated this in pytorch/pytorch#116946

So, this PR switches PyTorch to use the static MKL library.
I have tested the PR on my local PC.
malfet pushed a commit to pytorch/builder that referenced this issue Apr 26, 2024
Resubmit #1790 with the fix from PR #1797.

From pytorch issue pytorch/pytorch#124009, I found that libtorch seems to use the shared MKL library and is missing some MKL DLL files.
1. Currently the PyTorch Linux builds already use the static MKL library.
2. Windows can also support the static MKL library; I have validated this in pytorch/pytorch#116946

Tested in https://github.com/pytorch/pytorch/actions/runs/8836875904/job/24264643410
@xuhancn xuhancn added the intel This tag is for PR from Intel label Apr 26, 2024
xuhancn (Collaborator) commented Apr 27, 2024

Hi @ZarinaMaks
I have submitted some PRs to fix this issue: #124925 & pytorch/builder#1798
Could you please help test the latest nightly build?
Install command:

python -m pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu --upgrade --force-reinstall
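
A minimal way to verify the nightly wheel (a sketch, not an official test) is to print the build configuration and run an MKL-backed op:

import torch

# Sketch: confirm the nightly build installed and exercise a BLAS/MKL code path.
print(torch.__version__)
print(torch.__config__.show())   # shows how BLAS/MKL was linked into this build
x = torch.randn(256, 256)
print((x @ x).sum())             # CPU matmul goes through the MKL/BLAS backend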

Thanks.

@xuhancn xuhancn added the module: mkl Related to our MKL support label Apr 27, 2024
leslie-fang-intel (Collaborator) commented

Closing this issue since the fixing PR has landed. Feel free to re-open if any further discussion is needed.
