
what should i do to disable libcaffe2_nvrtc.so #31985

Closed
HardLaugh opened this issue Jan 9, 2020 · 7 comments
Assignees
Labels
module: build (Build system issues), module: cuda (Related to torch.cuda, and CUDA support in general), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

@HardLaugh

HardLaugh commented Jan 9, 2020

I find that my executable must link against libcaffe2_nvrtc.so, but because of my project's requirements I don't want to link against it. I need a pure executable with no torch .so dependencies, so I built static torch libs, and unfortunately libcaffe2_nvrtc.so stopped me.

The question is about this code in CUDAHooks.cpp:

static std::pair<std::unique_ptr<at::DynamicLibrary>, at::cuda::NVRTC*> load_nvrtc() {
#if defined(_WIN32)
  std::string libcaffe2_nvrtc = "caffe2_nvrtc.dll";
#elif defined(__APPLE__)
  std::string libcaffe2_nvrtc = "libcaffe2_nvrtc.dylib";
#else
  std::string libcaffe2_nvrtc = "libcaffe2_nvrtc.so";
#endif
  std::unique_ptr<at::DynamicLibrary> libnvrtc_stub(
      new at::DynamicLibrary(libcaffe2_nvrtc.c_str()));
  auto fn = (at::cuda::NVRTC * (*)()) libnvrtc_stub->sym("load_nvrtc");
  return std::make_pair(std::move(libnvrtc_stub), fn());
}

and the error is:

terminate called after throwing an instance of 'std::runtime_error'
  what():  Error in dlopen or dlsym: libcaffe2_nvrtc.so: cannot open shared object file: No such file or directory
The above operation failed in interpreter, with the following stack trace:

cc @ngimel

@zou3519 zou3519 added the module: build, triaged, and module: cuda labels Jan 9, 2020
@zou3519
Contributor

zou3519 commented Jan 9, 2020

I'm not following what "pure executable file without anything .so about torch" means, could you please clarify?

@HardLaugh
Author

HardLaugh commented Jan 10, 2020

@zou3519 If the system environment changes to one where only MKL and CUDA are installed, the executable cannot run, because it needs the shared library libcaffe2_nvrtc.so:

linux-vdso.so.1 =>  (0x00007ffed67ed000)
        libmkl_intel_lp64.so => /home/46799/anaconda3/envs/eptorch/lib/libmkl_intel_lp64.so (0x00007f31abf5f000)
        libmkl_gnu_thread.so => /home/46799/anaconda3/envs/eptorch/lib/libmkl_gnu_thread.so (0x00007f31aa84c000)
        libmkl_core.so => /home/46799/anaconda3/envs/eptorch/lib/libmkl_core.so (0x00007f31a6843000)
        libgomp.so.1 => /home/46799/anaconda3/envs/eptorch/lib/libgomp.so.1 (0x00007f31a6815000)
        libcuda.so.1 => /usr/local/nvidia/lib64/libcuda.so.1 (0x00007f31a5687000)
        libnvrtc.so.10.0 => /usr/local/cuda/lib64/libnvrtc.so.10.0 (0x00007f31a406b000)
        libnvToolsExt.so.1 => /usr/local/cuda/lib64/libnvToolsExt.so.1 (0x00007f31a3e61000)
        libcudart.so.10.0 => /usr/local/cuda/lib64/libcudart.so.10.0 (0x00007f31a3be7000)
        libcufft.so.10.0 => /usr/local/cuda/lib64/libcufft.so.10.0 (0x00007f319d733000)
        libcublas.so.10.0 => /usr/local/cuda/lib64/libcublas.so.10.0 (0x00007f319919c000)
        libcurand.so.10.0 => /usr/local/cuda/lib64/libcurand.so.10.0 (0x00007f3195035000)
        libcusparse.so.10.0 => /usr/local/cuda/lib64/libcusparse.so.10.0 (0x00007f31915cd000)
        libcudnn.so.7 => /usr/local/cuda/lib64/libcudnn.so.7 (0x00007f317d03e000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f317ce20000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f317cc18000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f317ca13000)
        libstdc++.so.6 => /home/46799/anaconda3/envs/eptorch/lib/libstdc++.so.6 (0x00007f317c6d9000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f317c3d3000)
        libgcc_s.so.1 => /home/46799/anaconda3/envs/eptorch/lib/libgcc_s.so.1 (0x00007f317c3be000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f317bff5000)
        /lib64/ld-linux-x86-64.so.2 (0x0000562a03662000)
        libnvidia-fatbinaryloader.so.430.14 => /usr/local/nvidia/lib64/libnvidia-fatbinaryloader.so.430.14 (0x00007f317bda7000)

@wahlberg82

I'm in the same situation. I'm building a plugin for a commercial host application. The plugin is a shared library (.so file) that statically links libtorch (i.e., it bakes all of libtorch into its own .so file); the only libraries it links dynamically are the CUDA libs. The plugin needs to be portable to other systems without installing lots of dependencies (CUDA is fine as an additional install). Everything works, except that I can't stop my shared library from also needing the torch library libcaffe2_nvrtc.so. Since no static version of this library (libcaffe2_nvrtc.a) exists, I can't seem to get around it. Is there a way to keep libtorch from dynamically depending on this library while still having GPU support? If I build libtorch with CPU support only, the problem goes away, but I need the GPU acceleration.
Cheers, David

@yeoserene

I am having the same problem as @wahlberg82. Running my executable with CPU only doesn't cause any problems, but trying to run it on the GPU after compilation gives: "Error in dlopen or dlsym: libcaffe2_nvrtc.so: cannot open shared object file: No such file or directory". I also need this as a standalone that runs perfectly by itself.

@ghost

ghost commented Oct 1, 2020

I have the same problem. How do I statically link libtorch?

@malfet
Contributor

malfet commented Oct 8, 2020

The PR against master should address the issue (although your application will still have a dynamic dependency on the NVRTC library).

@wahlberg82

I did a pull of the nightly master yesterday and recompiled statically. I can confirm that this fix is working well and solving the issue. Thanks all for fixing this so quickly and being such a great community!
Cheers, David

jjsjann123 pushed a commit to jjsjann123/nvfuser that referenced this issue Oct 29, 2022
Summary:
Instead of dynamically loading `caffe2_nvrtc`, lazyNVRTC provides the same functionality by binding all the hooks to a lazy-binding implementation, very similar to shared-library jump tables:
On the first call, each function in the list gets a global handle to the respective shared library and replaces itself with the dynamically resolved symbol, using the following template:
```
  auto fn = reinterpret_cast<decltype(&NAME)>(getCUDALibrary().sym(C10_SYMBOLIZE(NAME)));
  if (!fn)
    throw std::runtime_error("Can't get " C10_SYMBOLIZE(NAME));
  lazyNVRTC.NAME = fn;
  return fn(...);
```
Fixes pytorch/pytorch#31985

Pull Request resolved: pytorch/pytorch#45674

Reviewed By: ezyang

Differential Revision: D24073946

Pulled By: malfet

fbshipit-source-id: 1479a75e5200e14df003144625a859d312885874
jjsjann123 pushed a commit to jjsjann123/nvfuser that referenced this issue Nov 10, 2022
5 participants