Add LazyNVRTC #45674

malfet · 2020-10-01T15:49:53Z

Instead of dynamically loading `caffe2_nvrtc`, lazyNVRTC provides the same functionality by binding every hook to a lazy-binding stub, much like a shared library's jump table.
On its first call, each function from the list tries to get a global handle to the respective shared library, replaces itself with the dynamically resolved symbol, and forwards the call, using the following template:

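  // Resolve the real symbol from the lazily loaded CUDA library.
  auto fn = reinterpret_cast<decltype(&NAME)>(getCUDALibrary().sym(C10_SYMBOLIZE(NAME)));
  if (!fn)
    throw std::runtime_error("Can't get " #NAME);
  // Replace this stub in the hook table so later calls go straight to fn.
  lazyNVRTC.NAME = fn;
  return fn(...);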
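
The template above can be illustrated with a small self-contained sketch. It is not the code from this PR: `fakelib`, `Hooks`, `lazyHooks`, and `stubs` are hypothetical names, and `fakelib::sym` is a stand-in for `getCUDALibrary().sym(...)` that looks symbols up in an in-memory table rather than a real shared library. The sketch only shows the self-replacing hook pattern and does no synchronization.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Generic function-pointer type used to carry any resolved symbol.
using AnyFn = void (*)();

// Hypothetical stand-in for the shared library; not part of PyTorch.
namespace fakelib {
int add_impl(int a, int b) { return a + b; }

int lookups = 0;  // how many symbol lookups have been performed

// Returns the address registered under `name`, or nullptr if unknown.
AnyFn sym(const char* name) {
  assert(name != nullptr);
  ++lookups;
  static const std::map<std::string, AnyFn> table = {
      {"add", reinterpret_cast<AnyFn>(&add_impl)},
  };
  auto it = table.find(name);
  return it == table.end() ? nullptr : it->second;
}
}  // namespace fakelib

// Hook table: every entry initially points at its lazy stub.
struct Hooks {
  int (*add)(int, int);
  int (*missing)(int);
};

namespace stubs {
int add(int a, int b);
int missing(int a);
}  // namespace stubs

Hooks lazyHooks = {&stubs::add, &stubs::missing};

// First call: resolve the symbol, patch the hook table, forward the call.
int stubs::add(int a, int b) {
  auto fn = reinterpret_cast<decltype(&stubs::add)>(fakelib::sym("add"));
  if (!fn) throw std::runtime_error("Can't get add");
  lazyHooks.add = fn;  // later calls bypass this stub entirely
  return fn(a, b);
}

// This symbol is absent from the table, so the stub always throws.
int stubs::missing(int a) {
  auto fn = reinterpret_cast<decltype(&stubs::missing)>(fakelib::sym("missing"));
  if (!fn) throw std::runtime_error("Can't get missing");
  lazyHooks.missing = fn;
  return fn(a);
}
```

In this sketch the first `lazyHooks.add(...)` call goes through the stub and performs one lookup; every later call jumps directly to the resolved function. A symbol that cannot be resolved leaves the stub in place and raises `std::runtime_error`.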

Fixes #31985

facebook-github-bot

@malfet has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

codecov · 2020-10-02T10:25:34Z

Codecov Report

Merging #45674 into master will decrease coverage by 0.00%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #45674      +/-   ##
==========================================
- Coverage   68.59%   68.59%   -0.01%     
==========================================
  Files         410      410              
  Lines       52667    52667              
==========================================
- Hits        36129    36128       -1     
- Misses      16538    16539       +1

| Impacted Files | Coverage Δ | |
|---|---|---|
| torch/testing/_internal/expecttest.py | `77.55% <0.00%> (-1.03%)` | ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f6dc256...3885eba.

aten/src/ATen/cuda/detail/LazyNVRTC.cpp

ngimel · 2020-10-02T19:27:17Z

That's cool, looks like we don't need caffe2_nvrtc at all then? Should we clean up the code related to that?

malfet · 2020-10-05T17:04:31Z

@ngimel I will clean up build and packaging issues in the followup PR

facebook-github-bot · 2020-10-06T02:13:27Z

@malfet merged this pull request in 1558a36.

Summary:
Instead of dynamically loading `caffe2_nvrtc`, lazyNVRTC provides the same functionality by binding every hook to a lazy-binding stub, much like a shared library's jump table.
On its first call, each function from the list tries to get a global handle to the respective shared library, replaces itself with the dynamically resolved symbol, and forwards the call, using the following template:

```
  auto fn = reinterpret_cast<decltype(&NAME)>(getCUDALibrary().sym(C10_SYMBOLIZE(NAME)));
  if (!fn)
    throw std::runtime_error("Can't get " #NAME);
  lazyNVRTC.NAME = fn;
  return fn(...);
```

Fixes pytorch/pytorch#31985

Pull Request resolved: pytorch/pytorch#45674

Reviewed By: ezyang

Differential Revision: D24073946

Pulled By: malfet

fbshipit-source-id: 1479a75e5200e14df003144625a859d312885874

malfet requested a review from apaszke as a code owner October 1, 2020 15:49

malfet marked this pull request as draft October 1, 2020 15:50

facebook-github-bot added the `oncall: jit` label Oct 1, 2020

malfet force-pushed the malfet/get-rid-of-caffe2-nvrtc branch 3 times, most recently from 90d9aa0 to 3885eba October 2, 2020 05:45

malfet requested review from ezyang and ngimel October 2, 2020 05:45

malfet added the `module: cuda` and `triaged` labels Oct 2, 2020

malfet marked this pull request as ready for review October 2, 2020 05:46

facebook-github-bot reviewed Oct 2, 2020

ezyang reviewed Oct 2, 2020

aten/src/ATen/cuda/detail/LazyNVRTC.cpp (resolved)

ezyang approved these changes Oct 2, 2020

malfet force-pushed the malfet/get-rid-of-caffe2-nvrtc branch from 3885eba to 9185949 October 3, 2020 22:29

facebook-github-bot reviewed Oct 5, 2020

facebook-github-bot closed this in 1558a36 Oct 5, 2020

facebook-github-bot added the merged label Oct 6, 2020

malfet deleted the malfet/get-rid-of-caffe2-nvrtc branch October 13, 2020 02:01

mruberry added the Merged label Oct 28, 2020

malfet mentioned this pull request Nov 20, 2023

Avoid cuda stubs libraries being RPATHed #109493

Closed
