
Interoperability with PyTorch memory pool #2762

Closed · kmaehashi opened this issue Dec 5, 2019 · 12 comments

@kmaehashi (Member)
Both CuPy and PyTorch have their own memory pools. For interoperability, it would be better if the memory pools could be shared.

Currently we provide a way to use the PyTorch memory pool as a CuPy memory pool. One idea is to move this code into the CuPy code base (maybe under cupyx). https://github.com/chainer/chainer-pytorch-migration/blob/master/chainer_pytorch_migration/allocator.py
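
For reference, the linked allocator.py boils down to roughly the following sketch (paraphrased, not the verbatim code): a throwaway torch tensor provides the memory, and it is passed to CuPy as the owner so PyTorch's pool reclaims the block only after CuPy drops it.

```python
import cupy
import torch


def _torch_alloc(size):
    # Take memory from PyTorch's caching allocator by allocating a uint8
    # tensor, then expose it to CuPy as an unowned (borrowed) region.
    device = torch.cuda.current_device()
    tensor = torch.empty(size, dtype=torch.uint8, device=device)
    # `tensor` is registered as the owner, so the block returns to
    # PyTorch's pool only after CuPy releases its last reference.
    mem = cupy.cuda.UnownedMemory(tensor.data_ptr(), size, tensor, device)
    return cupy.cuda.MemoryPointer(mem, 0)


def use_torch_in_cupy_malloc():
    cupy.cuda.set_allocator(_torch_alloc)
```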

@kmaehashi added the cat:feature (New features/APIs) label Dec 5, 2019
@niboshi (Member) commented Dec 5, 2019

PyTorch is a higher-level library.
It's not smart to introduce a dependency from CuPy to PyTorch.
As CuPy already has a generalized API for plugging in an arbitrary allocator, I think it's cleaner to make PyTorch expose its allocator.

@niboshi added the st:needs-discussion label and removed the cat:feature (New features/APIs) label Dec 5, 2019
@kmaehashi (Member, Author)

> As CuPy already has a generalized API for plugging in an arbitrary allocator, I think it's cleaner to make PyTorch expose its allocator.

Yes, I agree with that. That's another option.

> It's not smart to introduce a dependency from CuPy to PyTorch.

What we need to do is provide the features users need, when they need them. Actual user needs sometimes take precedence over smart or clean design, and I think there is an immediate need for this feature.

@niboshi (Member) commented Dec 5, 2019

We don't have to spoil modularity in order to meet the immediate need; that can be done by a separate library.

@kmaehashi (Member, Author)

> that can be done by a separate library.

I agree that it is another option to consider 👍

@leofang (Member) commented Dec 6, 2019

Is #2710 relevant? I know some core functionalities of PyTorch are implemented in C++, but I am not sure about their memory pool.

@niboshi (Member) commented Dec 9, 2019

Is #2710 about exposing the memory pool in Cython for use within CuPy, or exposing it in C/C++ for use in external apps?

I think the former is already accomplished in the current CuPy code (and is irrelevant to PyTorch interoperability).

As for the latter, I'm not familiar with this kind of usage of Cython so I might be misunderstanding, but I think it's not easy for users to use.
They would have to compile PyTorch's C++ code themselves against a CuPy installation.

@leofang (Member) commented Dec 16, 2019

Sorry for my late reply.

> or exposing it in C/C++ for use in external apps?

#2710 was raised for this use case. In cupy.cuda.thrust, a handle to the CuPy memory pool is implemented so that the C++ functions in cupy_thrust.cu can use it. I believe it is better to move that handle elsewhere, as I suggested in #2710.

> but I think it's not easy for users to use.
> They would have to compile PyTorch's C++ code themselves against a CuPy installation.

That's right. This is meant for developers building projects on top of (or dependent on) CuPy, not for general users. In the case of PyTorch, a lot of nasty things are done at the C++ level, so I think this could be useful.

@kmaehashi self-assigned this Dec 17, 2019
@kmaehashi (Member, Author) commented Jan 9, 2020

> PyTorch is a higher-level library.

I don't think we can say which of PyTorch and CuPy is higher-level, as they don't depend on each other; they're both tensor libraries.

Ideas:

  1. Create an interface in CuPy (e.g., cupyx.utils.use_torch_allocator())
  • Pros
    • Can provide the feature at minimum cost (only a few lines of code added)
  • Cons
    • Dependency from CuPy to PyTorch (this can be solved by making it an optional dependency)
  2. Create an interface in PyTorch (e.g., cupy.cuda.set_allocator(pytorch.cuda.cupy_allocator))
  • Pros
    • No modifications in CuPy
  • Cons
    • Dependency from PyTorch to CuPy (note that the allocator function must return a cupy.cuda.memory.MemoryPointer)
  3. Create a separate library for the interface (e.g., cupy_interop.use_torch_allocator())
  • Pros
    • No dependency issues
  • Cons
    • The overhead of maintaining a separate library is too large compared to the size of the feature
  4. Add a feature to replace the memory allocator in PyTorch (e.g., pytorch.cuda.set_allocator(cupy.cuda.pytorch_allocator))
  • Pros
    • The most versatile solution
  • Cons
    • Relatively large fix in PyTorch (e.g., a lot of code directly depends on CUDACachingAllocator) and also in CuPy (e.g., we would need to provide a statistics API compatible with PyTorch's)
    • PyTorch seems to have no expected use case other than CuPy interoperability

Note that ideas 1 to 3 use the PyTorch allocator in CuPy, whereas idea 4 uses the CuPy allocator in PyTorch. (Idea 2 is sketched below.)
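
A minimal sketch of what idea 2's contract would look like, assuming PyTorch exposed raw caching-allocator entry points (current PyTorch has torch.cuda.caching_allocator_alloc / torch.cuda.caching_allocator_delete; the cupy_allocator name itself is hypothetical):

```python
import cupy
import torch


def cupy_allocator(size):
    # Hypothetical pytorch.cuda.cupy_allocator: it must return a
    # cupy.cuda.memory.MemoryPointer, which is why this direction puts
    # the dependency on PyTorch's side.
    ptr = torch.cuda.caching_allocator_alloc(size)
    mem = cupy.cuda.UnownedMemory(ptr, size, None)
    # NOTE: a complete implementation must also arrange for
    # torch.cuda.caching_allocator_delete(ptr) to run once CuPy is done
    # with the block; that bookkeeping is omitted here.
    return cupy.cuda.MemoryPointer(mem, 0)


cupy.cuda.set_allocator(cupy_allocator)
```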

@niboshi (Member) commented Jan 9, 2020

I agree that CuPy and PyTorch should not depend on each other, so I think options 1, 2, and 4 should be avoided.

Another library (option 3) is not necessary either.

My suggestion is to make PyTorch expose its bare allocators and wrap them with cupy.cuda.memory.CFunctionAllocator, just as is done with ChainerX:

https://github.com/chainer/chainer/blob/53521cbfd7827c613396f6afcba04ca2362a612a/chainerx/_cuda.py#L29-L34
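
Very roughly, the CuPy side of that pattern would look like the sketch below. None of the placeholder values exist today; they stand in for raw C function pointers (as Python ints) that PyTorch would have to export, e.g. via pybind11.

```python
import cupy

# All three values are assumptions: an opaque parameter plus pointers to
# C functions with roughly these shapes,
#   void* malloc(void* param, size_t size, int device_id);
#   void  free(void* param, void* ptr, int device_id);
# which PyTorch would need to export (e.g. through pybind11).
param = ...       # intptr_t passed back to both callbacks
malloc_ptr = ...  # address of the malloc function
free_ptr = ...    # address of the free function
owner = ...       # Python object kept alive while the allocator is in use

allocator = cupy.cuda.memory.CFunctionAllocator(
    param, malloc_ptr, free_ptr, owner)
cupy.cuda.set_allocator(allocator.malloc)
```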

@emcastillo (Member) commented Jan 28, 2020

BTW, I found this in the PyTorch code while digging in:

https://github.com/pytorch/pytorch/blob/master/aten/src/THC/THCThrustAllocator.cuh

This allows Thrust to get memory from PyTorch.
It seems easy to add a similar interface that exports the actual allocator through pybind11 and then hooks it into CuPy, as @niboshi said above.

More:
https://github.com/pytorch/pytorch/blob/d9115b533aef0fc37d323c8c52b9c74a7807fa2b/c10/core/Allocator.h#L127-L170

@leofang (Member) commented Mar 10, 2020

Would this issue be closed by #3126?

@kmaehashi (Member, Author)

Closing, as we implemented this in a separate library:
https://github.com/pfnet/pytorch-pfn-extras/blob/master/docs/cuda.md
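
For completeness, the resulting interface there is roughly as follows (see the linked document for the authoritative usage):

```python
import pytorch_pfn_extras as ppe

# Route CuPy allocations through PyTorch's memory pool...
ppe.cuda.use_torch_mempool_in_cupy()

# ...and switch back to CuPy's default pool when done.
ppe.cuda.use_default_mempool_in_cupy()
```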
