### Installing recommended libraries through PIP

In [None]:
# Optional requirement as library for CuPY
!pip install numpy

In [None]:
# Optional requirement as library for CuPY
!pip install scipy

In [None]:
# Optional requirement as library for CuPY
!pip install optuna

### Installing CuPy through Anaconda

#### Installing Anaconda

##### Windows
To install Anaconda for Windows, you just have to download and execute the executable on [Anaconda Website](https://www.anaconda.com/download).

#####
To install Anaconda for Linux, you just have to follow [this Anaconda Website tutorial](https://docs.anaconda.com/free/anaconda/install/linux/).

#### Installing CuPy
Technically, we can execute bash commands through Python notebook by adding a '!', but it could be that it has some problems with conda commands, so I recommend to use it directly on terminal.

In [None]:
# See https://docs.cupy.dev/en/stable/install.html#install-cupy-from-conda-forge
!conda install -c conda-forge cupy

##### Installing extra recommended cuda libraries for CuPy
These libraries are not needed, but theoretically it might speed up CuPy execution. If any of them fail to be installed, you might just skip it.

In [None]:
# Adding more cuda libraries through conda
# See https://docs.cupy.dev/en/stable/install.html#additional-cuda-libraries

In [None]:
# cuTENSOR https://anaconda.org/conda-forge/cutensor
!conda install -c conda-forge cutensor

In [None]:
# NCCL https://anaconda.org/conda-forge/nccl
!conda install -c conda-forge nccl

In [None]:
# cuDNN https://anaconda.org/anaconda/cudnn
!conda install -c anaconda cudnn

In [None]:
# cuSPARSELt https://anaconda.org/conda-forge/cusparselt
!conda install -c conda-forge cusparselt

### Testing CuPy

In [None]:
# grep command is only available for Linux/Mac OS. This will verify if cupy was correctly installed.
!pip freeze | grep cupy

The code below should print at least 1 cuda device available for CuPy.

In [1]:
import cupy as cp
x = cp.array([1, 2, 3])
print(x.device)
print(cp.cuda.runtime.getDeviceCount())

<CUDA Device 0>
1


#### Analyzing improvement

In [2]:
import numpy as np
import cupy as cp
import time

In [3]:
# Matrix multiplication with NumPy
start_time = time.time()
np_array = np.random.rand(10000, 10000)
np_result = np.matmul(np_array, np_array)
np_time = time.time() - start_time

In [4]:
# Matrix multiplication with CuPy
start_time = time.time()
cp_array = cp.random.randn(10000, 10000)
cp_result = cp.matmul(cp_array, cp_array)
cp_time = time.time() - start_time

In [5]:
print(f"NumPy Time: {np_time}")
print(f"CuPy Time: {cp_time}")

NumPy Time: 9.415247201919556
CuPy Time: 0.9719512462615967
