## A way to map numpy to cupy

[CuPy](https://cupy.dev/) is a NumPy/SciPy-compatible library for GPU-accelerated computing with Python.

I've been experimenting with ways to quickly convert numpy code to cupy, and this notebook shows one potential approach.

### Caveats

I program in a lot of languages and am an expert in none, so I may be missing an existing pattern or utility that already does this.  Also, remapping numpy to cupy all over an existing code base would probably cause a bunch of bugs.  For new code you might instead consider using an alias for the array module and pointing it at cupy or numpy depending on whether cupy is available.

That being said, this technique might be useful if you have small, well-contained scripts, notebook cells, or functions and want to quickly see whether cupy will speed them up.
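
The alias idea mentioned above could look something like the following minimal sketch. The names `xp` and `using_gpu` are my own placeholders, not an established convention from this notebook, and the sketch assumes that a successful `import cupy` also means a usable GPU is present.

```python
import numpy as np

try:
    import cupy as xp  # use the GPU library when it is importable
    using_gpu = True
except ImportError:
    xp = np  # otherwise fall back to NumPy under the same alias
    using_gpu = False

# Code written against `xp` runs unchanged on either backend.
a = xp.arange(6, dtype=xp.float64).reshape(2, 3)
col_sums = xp.sum(a, 0)  # sum along the first axis
print(xp.__name__, col_sums)
```

Because the choice is made once at import time, this suits new code better than retrofitting an existing code base.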

## Dependencies

This notebook requires a Python environment that supports notebooks, with both numpy and cupy installed.

To install cupy you may have to research the exact installation command for your OS and CUDA version.  For Windows 10 and CUDA 11 I have used

```
conda install -c conda-forge cupy cudatoolkit=11.
```

## Create some test arrays

Create test arrays and also push them to the GPU.

In this small fun example I create 3 test arrays.  I do this because my notebooks, which I run in VSCode, output the time each cell takes to run, and processing 3 arrays makes these times 'a bit' more reliable by diluting the potential effect of initialization.

For more formal benchmarking one would research code timing and profiling libraries and think through how to properly do initialization and 'warm up' stages to avoid biasing timing results. 
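
As one possible starting point for more formal timing, here is a small sketch using the standard library's `timeit` module. The array size, the function name `work`, and the repeat counts are arbitrary choices for illustration, and the array is kept small so the sketch runs quickly.

```python
import timeit
import numpy as np

img = np.random.rand(50, 50, 50)  # small stand-in array for illustration

def work():
    # the operation being measured
    return np.sum(img, 0)

work()  # warm-up call, not timed

# 5 independent measurements, each timing 10 calls
times = timeit.repeat(work, repeat=5, number=10)
best_per_call = min(times) / 10
print(f"best of 5: {best_per_call:.6f} s per call")
```

Taking the minimum over several repeats is a common way to reduce the influence of unrelated background activity on the machine.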

In [15]:
import numpy as np
import cupy as cp

n=500

# create random images of size n x n x n (500x500x500 here)
img1 = np.random.rand(n,n,n)
img2 = np.random.rand(n,n,n)
img3 = np.random.rand(n,n,n)

img_gpu1 = cp.asarray(img1)
img_gpu2 = cp.asarray(img2)
img_gpu3 = cp.asarray(img3)


## Define utilities to switch between numpy and cupy

Here we define a utility that checks whether cupy is installed, and a second utility that inspects an array and returns the cupy module if it is a cupy array and the numpy module otherwise.

In [16]:
import numpy as np
import importlib

try:
    cupy = importlib.import_module('cupy')
    has_cupy = True
except ImportError:
    has_cupy = False

def is_cupy_installed():
    """
    Returns True if Cupy is installed, False otherwise.
    """
    return has_cupy

def get_platform(x):
    """
    Returns the appropriate package (NumPy or Cupy) depending on whether Cupy is installed and `x` is a
    Cupy array.

    Args:
        x (ndarray): Input array

    Returns:
        Module: The appropriate package (either NumPy or Cupy)
    """
    if has_cupy:
        if hasattr(cupy, 'ndarray') and isinstance(x, cupy.ndarray):
            return cupy
    return np
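
On the question of whether a utility for this already exists: CuPy provides `cupy.get_array_module`, which returns `cupy` or `numpy` depending on its arguments. The sketch below wraps it so that it still works when cupy is not installed, and shows a function written against the returned module. The names `array_module` and `rescale` are hypothetical helpers made up for this example.

```python
import numpy as np

try:
    import cupy as cp
except ImportError:
    cp = None  # CuPy is optional in this sketch

def array_module(x):
    """Return cupy if `x` is a CuPy array and CuPy is available, else numpy."""
    if cp is not None:
        return cp.get_array_module(x)
    return np

def rescale(x):
    """Scale `x` to the range [0, 1] using whichever library owns the array."""
    xp = array_module(x)
    lo, hi = xp.min(x), xp.max(x)
    return (x - lo) / (hi - lo)

out = rescale(np.array([2.0, 4.0, 6.0]))
print(type(out).__module__, out)
```

Looking the module up inside the function keeps the choice local to that function, instead of rebinding a global name for the whole notebook.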


## Test with a numpy array

Here we get the platform for a numpy array, then run some simple numpy style code.

Note we print out the np object to check if it is a numpy module or cupy.  In this case it should be ```numpy```.

In [17]:
import time

### 1. Warmup stage

Here we do what is called a warmup stage.  When profiling fairly short code it is a good idea to call it once without timing, then call it multiple times in a timing stage.

Behind the scenes, Python may initialize numpy, cupy, or other libraries on first use, which would bias timing results.

In [18]:
# rebind the name np to whichever module matches the array
np = get_platform(img1)
print(np)
# warm-up call, not timed
test1=np.sum(img1, 0)

<module 'numpy' from 'c:\\Users\\bnort\\miniconda3\\envs\\decon_bioformats\\lib\\site-packages\\numpy\\__init__.py'>


### 2. Timing stage

In [19]:

start = time.time()
test1=np.sum(img1, 0)
test2=np.sum(img2, 0)
test3=np.sum(img3, 0)
end = time.time()
print(end - start)

0.4059131145477295


## Test with a CuPy array

Run the same tests with the cupy arrays.  We should see the ```np``` variable get mapped to cupy instead of numpy.

First map ```np``` to the right module and run a warmup.  Then time 3 applications of the sum function. 

In [13]:
# rebind the name np to whichever module matches the array
np = get_platform(img_gpu1)
print(np)
# warm-up call, not timed
test1=np.sum(img_gpu1, 0)

<module 'cupy' from 'c:\\Users\\bnort\\miniconda3\\envs\\decon_bioformats\\lib\\site-packages\\cupy\\__init__.py'>


In [14]:

start = time.time()
test1=np.sum(img_gpu1, 0)
test2=np.sum(img_gpu2, 0)
test3=np.sum(img_gpu3, 0)
end = time.time()
print(end - start)

0.000997304916381836
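
One caveat on the GPU timing above: CuPy launches GPU kernels asynchronously, so a host-side timer can stop before the GPU has actually finished the work. The sketch below is a hedged variation on the timing cell, not the method used to produce the number above. It synchronizes the default stream before reading the clock. The helper name `timed_sum` is hypothetical, the arrays are kept small so the sketch runs quickly, and it falls back to NumPy if cupy cannot be imported.

```python
import time
import numpy as np

try:
    import cupy as cp  # optional; the sketch falls back to NumPy without it
except ImportError:
    cp = None

def timed_sum(arrays):
    """Sum each array along axis 0 and return (results, elapsed seconds)."""
    on_gpu = cp is not None and isinstance(arrays[0], cp.ndarray)
    xp = cp if on_gpu else np
    if on_gpu:
        cp.cuda.Stream.null.synchronize()  # finish any pending GPU work first
    start = time.perf_counter()
    results = [xp.sum(a, 0) for a in arrays]
    if on_gpu:
        cp.cuda.Stream.null.synchronize()  # wait for the sum kernels to finish
    elapsed = time.perf_counter() - start
    return results, elapsed

small = [np.random.rand(50, 50, 50) for _ in range(3)]
results, elapsed = timed_sum(small)
print(elapsed)
```

Recent CuPy versions also ship `cupyx.profiler.benchmark`, which handles warm-up runs and synchronization for you and may be the simpler choice for GPU code.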
