Expose GPU pointers to Python #5904

Merged
merged 1 commit into BVLC:master on Sep 19, 2017

Conversation

@longjon (Contributor) commented Sep 7, 2017

This is a rebased and cleaned-up version of a commit from @tnarihi, 7b99436.

The purpose of this is to expose the GPU blob pointers, permitting use of libraries like PyCUDA without needless and expensive blob roundtripping.

This commit has been modified from the original in the following ways:

  • Wrapper functions are removed; instead the accessors are cast directly. This reduces the amount of code and also makes the low-level nature of the call a bit more evident at the place where such things are normally specified.
  • reinterpret_cast is used instead of C-style casting.
  • uintptr_t is used instead of size_t.
  • An underscore is prepended to the property names, emphasizing that these are low-level, at-your-own-risk properties. Since these are raw pointers, it's up to the user to decide when they remain valid.

I am making this PR now to have minimal mergeable access to blob GPU memory. On the Python side, it's easy to write code to convert Blobs to (e.g.) GPUArrays. Such code could also be brought in as desired (see @tnarihi's PRs).
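To make the intended use concrete, here is a minimal sketch of such a conversion, assuming PyCUDA is available and that the exposed property is named `_gpu_data_ptr`; the property name and the model file below are illustrative, not prescribed by this PR:

```python
import numpy as np
import pycuda.autoinit  # noqa: F401; note this creates its own CUDA
                        # context, so in a real program you would want
                        # PyCUDA to share Caffe's context instead
import pycuda.gpuarray as gpuarray

import caffe

caffe.set_mode_gpu()
net = caffe.Net('deploy.prototxt', caffe.TEST)  # placeholder model
blob = net.blobs['data']

# Wrap the blob's raw device pointer as a GPUArray without any copy.
# The view is only valid while the blob's GPU allocation stays alive
# and unchanged (e.g. not reshaped); hence the at-your-own-risk naming.
data_view = gpuarray.GPUArray(blob.data.shape, np.float32,
                              gpudata=blob._gpu_data_ptr)
```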

Expose GPU pointers to Python
The pointers could be used by CUDA wrapper libraries in Python such as
PyCUDA, gnumpy, Theano etc.
@shaibagon (Member) commented Sep 7, 2017

Is there any automatic test for this feature?

@shelhamer added the focus label on Sep 8, 2017

@vimalthilak commented Sep 8, 2017

Would be a very nice feature! ಠ_ಠ (yes, I lifted this from @longjon's comment)

@longjon (Contributor, Author) commented Sep 11, 2017

@shaibagon No test is included here; since this code just exposes an existing call, it's not clear what to test, especially without bringing in additional Python dependencies. The idea here is to have a small mergeable patch which just makes it possible to access GPU memory from Python. If you are a regular old user, the public interface is unchanged and there is no new functionality. If you are a hacker, you can dig into these functions and interface to PyCUDA &c. free of round-trips (this is what I am doing). If you are a developer, you can think about adding these interfaces to pycaffe.

@vimalthilak I'm flattered but I'm not sure you've grokked the expression (https://www.google.com/search?q=ಠ_ಠ).

@vimalthilak commented Sep 11, 2017

@longjon You are right! Glad to stand corrected :).

My usage was more along the lines of "I'm keeping my eyes on this thread!"

@shaibagon (Member) commented Sep 11, 2017

@longjon This is indeed a very simple (yet important) feature, but I suppose it merits a very simple test verifying that the exposed GPU pointer can indeed be used in Python.

@longjon (Contributor, Author) commented Sep 12, 2017

Well, I don't think there is such a thing as a simple test here: in order to check that these functions return something valid, we would need to read GPU memory from Python. We would need a new Python library to do that, which we would not want to make a hard dependency of pycaffe. That means we would need a way to disable some tests, or to acknowledge that not all tests should run and pass all the time. Then we would need to make sure that the new test was sometimes run, even though it wouldn't be runnable on Travis.

That's all possible (after all, Caffe has a CPU-only mode and GPU tests on the C++ side), but I don't think I'm going to do any of that in this PR, and I don't really feel bad about it, because there is no change to the public interface, only a private exposure of calls that are already tested in Caffe.

Future code which adds a public interface wrapping these calls probably merits testing. If such code interfaced to PyCUDA, e.g., it would already have to deal with the issue of dependencies.
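For the record, if such a test were written, it could look roughly like the sketch below. It assumes PyCUDA as the optional dependency (exactly the dependency issue discussed above) and assumes the exposed property is named `_gpu_data_ptr`; both are illustrative:

```python
import numpy as np
import pycuda.autoinit  # noqa: F401; provides a CUDA context for PyCUDA
import pycuda.driver as drv

import caffe

def check_gpu_data_ptr(net, blob_name='data'):
    """Round-trip check: read the blob back through the raw device
    pointer and compare with the host-side view of the same blob."""
    caffe.set_mode_gpu()
    blob = net.blobs[blob_name]
    blob.data[...] = np.random.rand(*blob.data.shape).astype(np.float32)
    # Reading the pointer goes through Blob::gpu_data(), which syncs the
    # host data to the device before returning the address.
    ptr = blob._gpu_data_ptr  # assumed property name
    host_copy = np.empty_like(blob.data)
    drv.memcpy_dtoh(host_copy, ptr)
    assert np.array_equal(host_copy, blob.data)
```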

@naibaf7 (Member) commented Sep 12, 2017

I'm currently working on something on the OpenCL branch to allow writing GPU layers for both CUDA and OpenCL from Python (single source, runtime-compiled for both CUDA and OpenCL). PyCUDA won't even be necessary there.

@shaibagon (Member) commented Sep 12, 2017

@longjon Sounds reasonable; no need to overcomplicate things. What I like most in Caffe is the automatic tests: I can be certain that everything does what it is expected to do.

@shaibagon (Member) commented Sep 12, 2017

@naibaf7 A GPU Python layer is a huge step forward for Caffe. Thanks!

@shelhamer merged commit effcdb0 into BVLC:master on Sep 19, 2017

1 check passed: continuous-integration/travis-ci/pr (The Travis CI build passed)
@shelhamer (Member) commented Sep 19, 2017

Thanks for bringing this commit back so we can sip from the GPU through Python @longjon and thanks @tnarihi for the original patch!

@naibaf7 (Member) commented Sep 19, 2017

@shaibagon @shelhamer Since this PR is usable for CUDA only: I'll release an initial version of the "device-abstracted" Caffe soon; Python GPU support will follow in approximately 1-2 months and will make it possible to write device-abstracted GPU code directly.
