Move thnvrtc and DynamicLibrary to ATen #22362

ssnl · 2019-06-29T03:18:49Z

Having the NVRTC stub in ATen is necessary to call driver APIs in ATen. This is currently blocking #22229.

DynamicLibrary is also moved as it is used in the stub code, and seems general enough.

ssnl · 2019-06-30T07:02:54Z

The test error is from #22052, not from this PR.

zdevito

Code changes look fine to me. I didn't review the build changes because I do not have the context for how they work anymore.

vadimkantorov · 2019-07-01T17:40:27Z

Does this open a way to custom fast user-provided CUDA-snippets? (like element-wise fused ops)

ssnl · 2019-07-01T19:30:11Z

@vadimkantorov could you elaborate a bit on what you mean exactly?

vadimkantorov · 2019-07-01T19:45:44Z

I mean providing a wrapper that, given a user-provided CUDA expression like x => sin(cos(x)), can generate a kernel at runtime that applies over a CUDA tensor (in-place or out-of-place). This would work without saving anything to disk, as cpp_extension.load_inline would do, and I think load_inline also has some restrictions on the ABI matching the compiler that was used to build PyTorch. Binary apply functions could probably exist as well. Even if this existed only in contrib, an example of NVRTC runtime compilation and creation of a function for dense contiguous float CUDA tensors would be helpful.

In principle this is the JIT's job, but while the JIT is still evolving, having CuPy-like functionality (https://docs-cupy.chainer.org/en/stable/reference/kernel.html) would be cool.
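
To make the request above concrete, here is a minimal sketch of only the first step: turning a user expression into CUDA source text for a unary element-wise kernel over contiguous float data. The helper `make_elementwise_kernel_source`, the kernel signature, and the template are assumptions made for illustration and are not part of PyTorch or of this PR. The sketch stops at source generation; compiling and launching the result would still need NVRTC and the CUDA driver API.

```python
import re

# Hypothetical helper, not part of PyTorch: builds CUDA C source for a unary
# element-wise kernel from a user expression written in terms of `x`.
_KERNEL_TEMPLATE = """\
extern "C" __global__
void {name}(const float* in, float* out, long n) {{
  long i = (long)blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {{
    float x = in[i];
    out[i] = {expr};
  }}
}}
"""

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def make_elementwise_kernel_source(expr, name="user_kernel"):
    """Return CUDA source that applies `expr` (a C expression in `x`) per element."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid kernel name: {name!r}")
    if ";" in expr or "{" in expr or "}" in expr:
        # Keep the sketch to a single expression; reject statements and blocks.
        raise ValueError("expr must be a single C expression")
    return _KERNEL_TEMPLATE.format(name=name, expr=expr)


# The source exists only in memory; nothing is written to disk.
source = make_elementwise_kernel_source("sinf(cosf(x))", name="sin_cos")
print(source)
```

Because each thread reads its input element before writing its output element, the same generated kernel could in principle serve the in-place case by passing one buffer as both `in` and `out`.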

ssnl · 2019-07-01T19:52:32Z

@vadimkantorov I see.

Let me explain a bit. This PR adds nothing new (unless you are using ATen and only ATen). The NVRTC stub currently lives in libtorch, where it is used only by the JIT. The PR only moves it to ATen. Technically, what you described could be implemented in libtorch without this PR, using the existing stub there. But maybe moving the stub to ATen makes bindings easier, idk. :)

ssnl · 2019-07-02T01:52:15Z

I think it would be great if this is also reviewed by a build person :)

zou3519 · 2019-07-02T14:18:36Z

I'm not sure if this is relevant, but I think you can call libtorch code from ATen.

ssnl · 2019-07-02T15:25:51Z

@zou3519 fascinating! Are there examples on how to do that?

zou3519 · 2019-07-02T15:34:44Z

You can include a torch header from any ATen file:

pytorch/aten/src/ATen/NamedTensor.cpp

Line 4 in b768777

#include <torch/csrc/utils/memory.h>

This is possible because @kostmo did some build work on merging libcaffe2 and libtorch so that they are the same library. This will definitely break the ATen standalone library build though, if it still exists and if it hasn't been broken already. I don't know what our plan regarding the standalone is.

ssnl · 2019-07-02T17:13:44Z

@zou3519 Oh that's cool. Good to know that. Thanks! :)

ssnl · 2019-07-03T18:15:21Z

Hey @kostmo, can you review this? Thank you :)

ssnl · 2019-07-03T20:18:53Z

@pytorchbot merge this please :)

facebook-github-bot

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ezyang · 2019-07-08T14:15:44Z

It looks like this diff needs some internal build updates.

ssnl · 2019-07-08T14:19:00Z

@ezyang Let me know if there is anything I can help with. I am more than happy to!

ezyang · 2019-07-08T14:19:49Z

The internal CI error is

caffe2/aten/src/ATen/cuda/detail/CUDAHooks.cpp: included an untracked header: 
caffe2/aten/src/ATen/cuda/nvrtc_stub/ATenNVRTC.h

I'm waiting for my internal source copy to check out and I'll look some more.

ezyang · 2019-07-08T14:21:04Z

I got a simple fix. Testing.

Summary:
Having the NVRTC stub in ATen is necessary to call driver APIs in ATen. This is currently blocking pytorch/pytorch#22229.

`DynamicLibrary` is also moved as it is used in the stub code, and seems general enough.

Pull Request resolved: pytorch/pytorch#22362
Differential Revision: D16131787
Pulled By: ezyang
fbshipit-source-id: add2ee8a8865229578aa00001a00d5a6671e0e73

facebook-github-bot · 2019-07-09T16:03:53Z

@ezyang merged this pull request in 31d821e.

ezyang · 2019-07-10T14:36:15Z

Internally, users are reporting this error:

Error in dlopen or dlsym: libcaffe2_nvrtc.so: cannot open shared object file: No such file or directory
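
An error like this appears at run time rather than at link time because a stub library is opened lazily, on first use. A minimal sketch of that behaviour, assuming a wrapper loosely modelled on the idea of `DynamicLibrary`: the `LazyLibrary` class and the library file name are hypothetical, and this is not ATen's actual implementation.

```python
import ctypes


class LazyLibrary:
    """Hypothetical stand-in for a lazily loaded shared library (not ATen's class)."""

    def __init__(self, name):
        self.name = name
        self._handle = None  # nothing is opened until the first symbol lookup

    def sym(self, symbol):
        if self._handle is None:
            try:
                self._handle = ctypes.CDLL(self.name)  # dlopen on POSIX
            except OSError as err:
                raise RuntimeError(f"Error in dlopen or dlsym: {err}") from err
        try:
            return getattr(self._handle, symbol)  # dlsym on POSIX
        except AttributeError as err:
            raise RuntimeError(f"Error in dlopen or dlsym: {err}") from err


# Constructing the wrapper never fails; the missing file is only noticed
# when a symbol is first requested.
nvrtc = LazyLibrary("libhypothetical_missing_stub.so")
try:
    nvrtc.sym("nvrtcVersion")
    error_message = None
except RuntimeError as err:
    error_message = str(err)
print(error_message)
```

With this pattern a build that never produces or ships the stub library still links and starts normally, and only fails once some code path first needs a symbol from it.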

ssnl · 2019-07-10T14:38:49Z

@ezyang hmmm I know that internal builds are not using CMake. Maybe some internal build changes are needed for it to build libcaffe2_nvrtc?

ezyang · 2019-07-10T14:40:23Z

So, what I'm thinking right now is that all I need to do is turn USE_DIRECT_NVRTC back on for the internal build. fbcode has its own strategy for lazy loading of CUDA libraries, so what you did here isn't necessary in that case. Unfortunately, I didn't actually review this PR, so I am not sure.

ssnl · 2019-07-10T14:48:57Z

@ezyang Ah I see. So the fbcode build just puts everything in one library, including the libcaffe2_nvrtc bits. Let me know how I can help with this!

also cc @kostmo who reviewed :)

pytorchbot added the caffe2, oncall: jit, module: build, module: cuda, and module: internals labels Jun 29, 2019

ssnl requested review from soumith and zdevito June 29, 2019 03:40

ssnl mentioned this pull request Jun 29, 2019

pin_memory malloc now uses existing context if available. #22229

Closed

ssnl force-pushed the thnvrtc_aten branch 6 times, most recently from e4af4de to 2b41e59 Compare June 30, 2019 01:29

Move thnvrtc and DynamicLibrary to ATen

ce15f39

ssnl force-pushed the thnvrtc_aten branch from 2b41e59 to ce15f39 Compare June 30, 2019 14:09

zdevito reviewed Jul 1, 2019

ezyang added the open source label Jul 1, 2019

zhangguanheng66 requested a review from colesbury July 1, 2019 22:16

zhangguanheng66 added the triaged label Jul 1, 2019

ssnl requested a review from kostmo July 3, 2019 18:15

kostmo approved these changes Jul 3, 2019

pytorchbot added the merge-this-please label Jul 3, 2019

facebook-github-bot reviewed Jul 5, 2019

facebook-github-bot closed this in 31d821e Jul 9, 2019

ssnl deleted the thnvrtc_aten branch July 9, 2019 14:32

facebook-github-bot added the merged label Jul 9, 2019

mruberry added the Merged label Oct 28, 2020
