## Running tensors and PyTorch objects on GPUs (and making computations faster)

GPUs = faster computation on numbers, thanks to CUDA + NVIDIA hardware + PyTorch working together behind the scenes.

### 1. Getting a GPU

1. Easiest: use Google Colab for a free GPU (with options to upgrade as well).
2. Use your own GPU: this takes a little bit of setup and requires the investment of purchasing a GPU. There are lots of options; see this post for which one to get: https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/
3. Use cloud computing: GCP, AWS, and Azure let you rent computers in the cloud and access them remotely.

For options 2 and 3, getting PyTorch working with the GPU drivers (CUDA) takes a little bit of setting up. To do this, refer to the PyTorch setup documentation: https://pytorch.org/get-started/locally/

In [None]:
# show the attached NVIDIA GPU, its driver version and memory usage
!nvidia-smi

Thu Aug 22 19:41:46 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   55C    P8              12W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

### 2. Check for GPU access with PyTorch

In [1]:
# check for GPU access with PyTorch
import torch
torch.cuda.is_available()

True

In [5]:
# set up device-agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

Because PyTorch can run computations on either the GPU or the CPU, it's best practice to set up device-agnostic code: https://pytorch.org/docs/stable/notes/cuda.html

E.g. run on GPU if available, else default to CPU
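A minimal sketch of what device-agnostic code can look like in practice: pick the device once, then pass it along when creating tensors instead of hard-coding `"cuda"`. The tensor names `a`, `b` and `total` are just for illustration.

```python
import torch

# Pick the device once, near the top of the script.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Create tensors directly on that device instead of hard-coding "cuda".
a = torch.tensor([1.0, 2.0, 3.0], device=device)
b = torch.ones(3, device=device)

# Operations on tensors that share a device produce a result on that device.
total = a + b
print(total, total.device)
```

The same script should then run unchanged on a machine with or without a GPU.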

In [3]:
# count number of devices
torch.cuda.device_count()

1
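
If more than one device is visible, it can help to see which is which. A small sketch using `torch.cuda.get_device_name`; the list name `gpu_names` is only illustrative.

```python
import torch

# Collect the name of every CUDA device PyTorch can see.
# On a machine without a GPU, device_count() is 0 and the list stays empty.
gpu_names = [
    torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())
]

for index, name in enumerate(gpu_names):
    print(f"cuda:{index} -> {name}")
```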

### 3. Putting tensors (and models) on the GPU

We want our tensors and models on the GPU because a GPU typically performs the underlying numerical computations faster than a CPU.
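
One rough way to see the difference is to time a matrix multiplication on each device. This is only a sketch: the helper `time_matmul` and the matrix size are arbitrary choices, and the actual numbers depend heavily on hardware. For small tensors the GPU may not be faster at all.

```python
import time
import torch


def time_matmul(device, size=512, repeats=5):
    """Return the average seconds for one (size x size) matrix multiplication."""
    x = torch.rand(size, size, device=device)
    y = torch.rand(size, size, device=device)

    result = x @ y  # warm-up so one-time startup cost is not timed
    if device == "cuda":
        torch.cuda.synchronize()  # CUDA calls are asynchronous, wait for them

    start = time.perf_counter()
    for _ in range(repeats):
        result = x @ y
    if device == "cuda":
        torch.cuda.synchronize()  # make sure the work has finished before stopping the clock
    return (time.perf_counter() - start) / repeats


cpu_seconds = time_matmul("cpu")
print(f"cpu:  {cpu_seconds:.6f} s per matmul")

# Only time the GPU when one is actually available.
if torch.cuda.is_available():
    gpu_seconds = time_matmul("cuda")
    print(f"cuda: {gpu_seconds:.6f} s per matmul")
```

The `torch.cuda.synchronize()` calls matter: without them the timer can stop before the GPU has finished its work.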

In [2]:
# create a tensor (default on the CPU)
tensor = torch.tensor([1, 2, 3], device="cpu")

# tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [3]:
# without a device argument, the tensor is also created on the CPU
tensor = torch.tensor([1, 2, 3])

# tensor not on GPU
print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [6]:
# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
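
Models move the same way. A minimal sketch, assuming a tiny `nn.Linear` layer as a stand-in for a real model (it is not part of the notes above):

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# A tiny stand-in model, used here only for illustration.
model = nn.Linear(in_features=3, out_features=1)

# For modules, .to() moves the parameters and returns the same module.
model = model.to(device)

# Inputs must live on the same device as the model's parameters.
inputs = torch.tensor([[1.0, 2.0, 3.0]], device=device)
outputs = model(inputs)

print(next(model.parameters()).device, outputs.device)
```

If the inputs and the model sit on different devices, PyTorch raises an error rather than moving anything for you.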

### 4. Moving tensors back to the CPU

In [7]:
# if the tensor is on the GPU, it can't be converted to NumPy directly (this raises an error)
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [8]:
# to fix the GPU tensor with NumPy issue, first copy the tensor to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

In [10]:
# the original tensor_on_gpu is unchanged and still on device 'cuda:0'
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
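
To avoid the error regardless of where a tensor lives, the conversion can be wrapped in a small helper. This is a sketch: `to_numpy` is a hypothetical name, and the extra `.detach()` call is an assumption added so the helper also works for tensors that track gradients.

```python
import numpy as np
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"


def to_numpy(tensor):
    """Hypothetical helper: return a NumPy array from a tensor on any device."""
    # .detach() drops gradient tracking, which NumPy cannot represent.
    # .cpu() copies to host memory only if the tensor is not already there.
    return tensor.detach().cpu().numpy()


tensor_on_device = torch.tensor([1, 2, 3], device=device)
array = to_numpy(tensor_on_device)

print(array, type(array))
print(tensor_on_device.device)  # the original tensor stays where it was
```

Note that when the tensor is already on the CPU, the returned array shares memory with it, so changing one in place changes the other.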