## Sharing is Caring: GPU Interoperability and <3 of All Frameworks

In [1]:
import cupy as cp
import numpy as np
from numba import cuda

# PyTorch 1.4 supports direct __cuda_array_interface__ handoff.
import torch

# RFC: https://github.com/tensorflow/community/pull/180
# !pip install tfdlpack-gpu
import tfdlpack

### Create GPU Arrays and Move to DL Frameworks with `__cuda_array_interface__`

Arrays from libraries that implement the `__cuda_array_interface__` (CuPy, Numba, cuSignal, etc.) can be handed directly to compatible frameworks such as PyTorch, without copying and without an intermediate tensor format like [DLPack](https://github.com/dmlc/dlpack).
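
To make the handoff a little less magical, here is a minimal sketch of what a consumer sees when it reads `__cuda_array_interface__`. The `FakeDeviceArray` class and the `device_pointer` / `same_buffer` helpers are hypothetical names invented for illustration so the snippet runs without a GPU; the dictionary keys (`shape`, `typestr`, `data`, `version`) follow the interface specification published by Numba, and the pointer values are arbitrary.

```python
class FakeDeviceArray:
    """Hypothetical stand-in for a CuPy/Numba/PyTorch GPU array (no GPU needed)."""

    def __init__(self, ptr, shape, typestr="<f8"):
        self._ptr = ptr
        self._shape = tuple(shape)
        self._typestr = typestr

    @property
    def __cuda_array_interface__(self):
        return {
            "shape": self._shape,
            "typestr": self._typestr,    # "<f8" means little-endian float64
            "data": (self._ptr, False),  # (device pointer, read_only flag)
            "version": 2,                # real value depends on the library release
        }


def device_pointer(arr):
    """Return the raw device pointer an array advertises through the interface."""
    return arr.__cuda_array_interface__["data"][0]


def same_buffer(a, b):
    """True when two arrays advertise the same device pointer (no copy was made)."""
    return device_pointer(a) == device_pointer(b)


producer = FakeDeviceArray(ptr=140657360371712, shape=(10_000, 10_000))
# A zero-copy consumer wraps the producer's pointer instead of allocating memory.
consumer = FakeDeviceArray(ptr=device_pointer(producer), shape=(10_000, 10_000))
# A copy lives at a different address.
copied = FakeDeviceArray(ptr=140655749758976, shape=(10_000, 10_000))

print(same_buffer(producer, consumer))
print(same_buffer(producer, copied))
```

With real arrays, a check like `same_buffer(gpu_arr, torch_arr)` should replace eyeballing the printed pointers in the cells that follow.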

**CuPy <-> PyTorch**

In [2]:
# CuPy - GPU Array (like NumPy!)
gpu_arr = cp.random.rand(10_000, 10_000)

# Look at pointer
print('CuPy GPU Array Pointer: ', gpu_arr.__cuda_array_interface__['data'])

# Migrate from CuPy to PyTorch
torch_arr = torch.as_tensor(gpu_arr, device='cuda')

# Look at pointer -- it's the same as the CuPy array above!
print('PyTorch GPU Tensor Pointer: ', torch_arr.__cuda_array_interface__['data'])

# Migrate from PyTorch to CuPy
cupy_arr = cp.asarray(torch_arr)

# Look at pointer
print('CuPy GPU Pointer: ', cupy_arr.__cuda_array_interface__['data'])

CuPy GPU Array Pointer:  (140657360371712, False)
PyTorch GPU Tensor Pointer:  (140657360371712, False)
CuPy GPU Pointer:  (140657360371712, False)


**Numba CUDA <-> PyTorch**

In [3]:
# NumPy - CPU Array
cpu_arr = np.random.rand(10_000, 10_000)

# Use Numba to move to GPU
numba_gpu_arr = cuda.to_device(cpu_arr)

# Migrate from Numba (used for custom CUDA JIT kernels) to PyTorch
torch_arr_numba = torch.as_tensor(numba_gpu_arr, device='cuda')

# Migrate from PyTorch back to Numba
numba_arr_from_torch = cuda.to_device(torch_arr_numba)

# Pointer love again
print('Numba GPU Array Pointer: ', numba_gpu_arr.__cuda_array_interface__['data'])
print('PyTorch GPU Tensor Pointer: ', torch_arr_numba.__cuda_array_interface__['data'])
print('Numba GPU Pointer: ', numba_arr_from_torch.__cuda_array_interface__['data'])

Numba GPU Array Pointer:  (140655749758976, False)
PyTorch GPU Tensor Pointer:  (140655749758976, False)
Numba GPU Pointer:  (140655749758976, False)


### Create GPU Arrays and Move to DL Frameworks with DLPack

Not all major frameworks currently support the `__cuda_array_interface__`, cough, [TensorFlow](https://www.tensorflow.org/). We can use the aforementioned DLPack as a bridge between the GPU ecosystem and TensorFlow with `tfdlpack`. See [this RFC](https://github.com/tensorflow/community/pull/180) for more information.
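
The DLPack exchange pattern can be tried without a GPU or TensorFlow. A minimal sketch, assuming NumPy 1.22 or newer (which added `np.from_dlpack`): it shows that importing through DLPack yields a view on the producer's memory rather than a copy. This is a CPU analogue for illustration only, not the `tfdlpack` path used in this notebook.

```python
import numpy as np

# Producer: any object that implements the DLPack protocol
# (__dlpack__ and __dlpack_device__); NumPy arrays do.
src = np.arange(6, dtype=np.float64).reshape(2, 3)

# Consumer: import through DLPack; the data is not copied.
view = np.from_dlpack(src)

# Both arrays are backed by the same memory.
shares = np.shares_memory(src, view)

# A write through the producer is visible through the consumer.
src[0, 0] = 42.0
seen = float(view[0, 0])

print(shares, seen)
```

The GPU cells below follow the same producer/consumer shape, except that the handoff goes through an explicit capsule object. Keep in mind that a DLPack capsule is meant to be consumed once; create a fresh capsule for each import.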

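Optional: Enable GPU memory growth in TensorFlow; otherwise TF reserves nearly all GPU memory up front. The variable has to be set in the kernel's own environment before TensorFlow initializes the GPU, which `%env` does and a `!export` subshell does not.

In [4]:
%env TF_FORCE_GPU_ALLOW_GROWTH=true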

**CuPy <-> TensorFlow**

In [5]:
# CuPy - GPU Array (like NumPy!)
gpu_arr = cp.random.rand(10_000, 10_000)

# Use CuPy's built-in `toDlpack` method to create a DLPack capsule
dlpack_arr = gpu_arr.toDlpack()

# Use `tfdlpack` to migrate to TensorFlow
tf_tensor = tfdlpack.from_dlpack(dlpack_arr)

# Confirm TF tensor is on GPU
print(tf_tensor.device)

# Use `tfdlpack` to migrate back to CuPy
dlpack_capsule = tfdlpack.to_dlpack(tf_tensor)
cupy_arr = cp.fromDlpack(dlpack_capsule)

/job:localhost/replica:0/task:0/device:GPU:0


**Numba CUDA <-> TensorFlow**

In [6]:
# Reset CUDA memory
cuda.close()

# NumPy - CPU Array
cpu_arr = np.random.rand(10_000, 10_000)

# Use Numba to move to GPU
numba_gpu_arr = cuda.to_device(cpu_arr)

# Use CuPy's asarray function and toDlpack to create a DLPack capsule. There are other ways to do this (e.g. PyTorch's torch.utils.dlpack)
dlpack_arr = cp.asarray(numba_gpu_arr).toDlpack()

# Migrate from Numba (used for custom CUDA JIT kernels) to TensorFlow
tf_tensor = tfdlpack.from_dlpack(dlpack_arr)

# Confirm TF tensor is on GPU
print(tf_tensor.device)

# Use `tfdlpack` to migrate back to Numba
dlpack_capsule = tfdlpack.to_dlpack(tf_tensor)
numba_arr = cuda.to_device(cp.fromDlpack(dlpack_capsule))

/job:localhost/replica:0/task:0/device:GPU:0


**PyTorch <-> TensorFlow**

In [7]:
import torch
import tfdlpack
from torch.utils import dlpack as th_dlpack

# PyTorch - GPU Tensor
gpu_arr = torch.rand(10_000, 10_000).cuda()

# Use Torch's DLPack function to get DLPack Capsule
dlpack_arr = th_dlpack.to_dlpack(gpu_arr)

# Use `tfdlpack` to migrate to TensorFlow
tf_tensor = tfdlpack.from_dlpack(dlpack_arr)

# Confirm TF tensor is on GPU
print(tf_tensor.device)

# Use `tfdlpack` to migrate back to PyTorch
dlpack_capsule = tfdlpack.to_dlpack(tf_tensor)
torch_arr = th_dlpack.from_dlpack(dlpack_capsule)

/job:localhost/replica:0/task:0/device:GPU:0
