## Accelerating Python code with CuPy

The example is adapted from [this article](https://towardsdatascience.com/heres-how-to-use-cupy-to-make-numpy-700x-faster-4b920dda1f56).

In [1]:
import numpy as np
import cupy as cp
import time

In the next cells, switching between NumPy and CuPy is as easy as replacing NumPy’s `np` with CuPy’s `cp`.
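
One way to make this swap explicit is to pass the array module into a function as an argument. The sketch below is only an illustration and not part of the original example: the function `scale_and_sum` and the parameter name `xp` are hypothetical, and the CuPy branch runs only if CuPy can be imported.

```python
import numpy as np

try:
    import cupy as cp  # optional: needs an NVIDIA GPU and a CUDA-enabled CuPy install
except ImportError:
    cp = None


def scale_and_sum(xp, n=100):
    """Create an n x n array of ones with module `xp`, multiply it by 5, return the sum."""
    x = xp.ones((n, n))
    x *= 5
    # float() copies the scalar result back to the host when xp is CuPy
    return float(x.sum())


print("NumPy:", scale_and_sum(np))
if cp is not None:
    print("CuPy:", scale_and_sum(cp))
```

The same function body runs on the CPU or the GPU depending only on which module is passed in.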

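To measure how long it takes to create the arrays, Python’s built-in `time` module is used. CuPy launches GPU work asynchronously, so the GPU cells call `cp.cuda.Stream.null.synchronize()` before reading the end time: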

In [2]:
### Numpy and CPU
s = time.time()
x_cpu = np.ones((1000,1000,1000))
e = time.time()
print(e - s)


2.289909839630127


In [3]:
### CuPy and GPU
s = time.time()
x_gpu = cp.ones((1000,1000,1000))
cp.cuda.Stream.null.synchronize()
e = time.time()
print(e - s)

0.44948768615722656


As the cells above show, CuPy was considerably faster. NumPy created the array of 1 billion ones in about 2.29 seconds, while CuPy took about 0.45 seconds: roughly a 5x speedup.
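
The size of the array is worth keeping in mind when running these cells. The following back-of-the-envelope calculation assumes the default `float64` dtype, which is what `np.ones` and `cp.ones` use when no dtype is given; the variable names are illustrative.

```python
import math

shape = (1000, 1000, 1000)
bytes_per_element = 8  # float64, the default dtype of np.ones and cp.ones

n_elements = math.prod(shape)
n_bytes = n_elements * bytes_per_element

print(f"elements: {n_elements:,}")
print(f"size: {n_bytes / 1e9:.1f} GB ({n_bytes / 2**30:.2f} GiB)")
```

Each array therefore needs about 8 GB, so the CuPy cells require a GPU with more free memory than that.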

The cells below run a mathematical operation on the arrays. This time we multiply the entire array by 5 in place and again compare the speed of NumPy and CuPy.

In [4]:
### Numpy and CPU
s = time.time()
x_cpu *= 5
e = time.time()
print(e - s)

1.083198070526123


In [5]:
### CuPy and GPU
s = time.time()
x_gpu *= 5
cp.cuda.Stream.null.synchronize()
e = time.time()
print(e - s)

0.3018193244934082


Here too, CuPy is faster than NumPy: about 0.30 seconds versus 1.08 seconds, roughly a 3.6x speedup.
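
Single-shot timings like these can be noisy, and the first GPU call to an operation can include one-time setup cost. A minimal sketch of a more careful measurement is shown below. The helper `time_call` and its parameters are hypothetical names introduced here, and it uses `time.perf_counter` instead of `time.time` because the former is intended for measuring short intervals.

```python
import time


def time_call(fn, sync=None, repeats=3, warmup=1):
    """Return the best wall-clock time in seconds over `repeats` calls to `fn`.

    `sync`, if given, is called before the clock starts and again after `fn`
    returns, so that asynchronous work is included in the measurement.
    """
    for _ in range(warmup):
        fn()  # untimed warm-up calls
    best = float("inf")
    for _ in range(repeats):
        if sync is not None:
            sync()  # make sure earlier work has finished
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for the work started by fn
        best = min(best, time.perf_counter() - start)
    return best


# CPU-only usage with a stand-in workload
elapsed = time_call(lambda: sum(range(100_000)))
print(f"best of 3: {elapsed:.6f} s")
```

For the GPU cells, `sync=cp.cuda.Stream.null.synchronize` could be passed so that the timer waits for the GPU to finish. Note that repeating an in-place operation such as `x_gpu *= 5` changes the array values on every call.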

Next, we chain several operations on the same array. The code below does the following:
1. Multiply the array by 5
2. Multiply the array by itself
3. Add the array to itself
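
Because every element of the array holds the same value, the effect of these three steps can be traced on a single number. This sketch assumes the cells are run once and in order, so each element starts at 5 after the earlier multiplication; the variable `v` is only illustrative.

```python
v = 5.0   # element value after the earlier `*= 5` on an array of ones
v *= 5    # step 1: multiply by 5       -> 25.0
v *= v    # step 2: multiply by itself  -> 625.0
v += v    # step 3: add to itself       -> 1250.0
print(v)  # 1250.0
```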

In [6]:
### Numpy and CPU
s = time.time()
x_cpu *= 5
x_cpu *= x_cpu
x_cpu += x_cpu
e = time.time()
print(e - s)

3.253412961959839


In [7]:
### CuPy and GPU
s = time.time()
x_gpu *= 5
x_gpu *= x_gpu
x_gpu += x_gpu
cp.cuda.Stream.null.synchronize()
e = time.time()
print(e - s)

0.5709450244903564
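
To compare the three experiments side by side, the speedup can be computed directly from the recorded outputs. The snippet below simply divides the CPU time by the GPU time for each pair of cells. The dictionary keys are labels chosen here, and the numbers come from single runs on one machine, so they should be read as indicative only.

```python
# Wall-clock times in seconds, copied from the cell outputs above: (NumPy, CuPy)
timings = {
    "create ones": (2.289909839630127, 0.44948768615722656),
    "multiply by 5": (1.083198070526123, 0.3018193244934082),
    "three chained ops": (3.253412961959839, 0.5709450244903564),
}

speedups = {name: cpu / gpu for name, (cpu, gpu) in timings.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x")
```

In these runs CuPy comes out roughly 5.1x, 3.6x, and 5.7x faster than NumPy, respectively.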
