<a href="https://colab.research.google.com/github/Amrapali03/Learn-PyTorch/blob/main/PyTorch_colab_notebook_guide.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Recitation 0E: Intro to Google Colab

# Basics

#### Google Colab

- Colab is developed by Google Research and provides a Jupyter Notebook-style Python execution environment accessible directly through a web browser
- Its main benefit is free access to computing resources, including GPUs and TPUs


#### Accessing Colab
- Go to https://colab.research.google.com/ to create and access your notebooks
- Or create and open notebooks directly from Google Drive

# Bash and Magic Commands

Colab runs in a Linux environment, and you can run terminal commands from a cell by prefixing them with `!`.




#### Bash Commands

In [None]:
!nvidia-smi

Tue May 21 19:54:00 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   61C    P8              10W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [None]:
!pip install torch
import torch

Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch)
  Using cached nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch)
  Using cached nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch)
  Using cached nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch)
  Using cached nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch)
  Using cached nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch)
  Using cached nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
Collecting nvidia-curand-cu12==10.3.2.106 (from torch)
  Using cached nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
Collectin

In [None]:
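# List the files in the current working directory
!ls
# Each ! command runs in its own subshell, so `!cd ..` does not persist; use `%cd ..` instead
# !cd ..
# Create a directory (pass the directory name as an argument)
# !mkdir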

sample_data
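
The `!` prefix hands the rest of the line to a shell that runs as a separate process. The sketch below is a rough plain-Python analogue built on `subprocess`, not Colab's actual implementation; `run_shell` is a hypothetical helper name. It illustrates why a `cd` issued through `!` does not change the notebook's own working directory.

```python
import os
import subprocess


def run_shell(command: str) -> str:
    """Run a command in a new shell process and return its standard output."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return result.stdout


before = os.getcwd()
run_shell("cd ..")  # the directory change happens inside the child shell only
after = os.getcwd()
print(before == after)  # the parent process keeps its working directory

# Output of a command comes back as a string that Python code can use
print(run_shell("echo hello from the shell").strip())
```

Inside a notebook, `%cd` is the usual way to change the working directory so that the change persists across cells.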


#### Magic commands


In [None]:
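# Time the execution of a single statement
# %time
# List or set environment variables
# %env
# Render matplotlib figures inline in the notebook
# %matplotlib inline
# Open the interactive debugger after an exception
# %debug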

ERROR:root:No traceback has been produced, nothing to debug.


# Runtime


In [None]:
!nvidia-smi

Tue May 21 20:12:14 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   39C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [None]:
import torch
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Device: ", DEVICE)

Device:  cpu


### Utilizing Free GPU/TPU Resources

#### Changing Runtime
- Runtime > Change runtime type
- Select GPU/TPU and High-RAM option



#### GPUs: Training Time of ResNet50
- T4: 1x speedup (baseline)
- V100: 3.6x speedup (compared to T4)
- A100: 10x speedup (compared to T4)
- TPU: a completely different architecture that imposes many constraints on training
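
To make the speedup figures concrete, the sketch below converts them into estimated training times. The 10-hour T4 baseline is a hypothetical number chosen for illustration, not a measurement, and `estimated_hours` is a helper name introduced here.

```python
# Relative ResNet50 training speedups quoted above (T4 is the baseline)
SPEEDUP_VS_T4 = {"T4": 1.0, "V100": 3.6, "A100": 10.0}


def estimated_hours(t4_hours: float, gpu: str) -> float:
    """Estimate training time on `gpu` from the time the same job takes on a T4."""
    return t4_hours / SPEEDUP_VS_T4[gpu]


T4_HOURS = 10.0  # hypothetical baseline: assume the job takes 10 hours on a T4

for gpu in SPEEDUP_VS_T4:
    print(f"{gpu}: {estimated_hours(T4_HOURS, gpu):.1f} h")
```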

# Managing your files

### Mounting to Google Drive

Mounting Google Drive is very useful because all files in the runtime are lost when the runtime ends.

In [None]:
from google.colab import drive
drive.mount("/content/drive")

Mounted at /content/drive


### Saving/Loading files - Model checkpoints

Model checkpoints can be moved to and from Google Drive so that they survive the end of the runtime.

In [None]:
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Multilayer perceptron whose layer widths are given by `size`."""

    def __init__(self, size):
        super().__init__()

        # Hidden blocks: Linear -> ReLU -> BatchNorm -> Dropout
        self.layers = []
        for in_dim, out_dim in zip(size[:-2], size[1:-1]):
            self.layers.extend([
                nn.Linear(in_dim, out_dim),
                nn.ReLU(),
                nn.BatchNorm1d(out_dim),
                nn.Dropout(0.5),
            ])
        # Output layer, with no activation after it
        self.layers.append(nn.Linear(size[-2], size[-1]))
        self.model = nn.Sequential(*self.layers)
        self.model.apply(self.init_param)

    def init_param(self, module):
        # Xavier-initialize the weights of every linear layer
        if isinstance(module, nn.Linear):
            nn.init.xavier_uniform_(module.weight)

    def forward(self, x):
        return self.model(x)

model = MLP([40, 2048, 512, 256, 71])

In [None]:
# -p avoids an error if the directory already exists
!mkdir -p /content/drive/MyDrive/checkpoints

MODEL_SAVE_PATH = "/content/drive/MyDrive/checkpoints/checkpoint.pt"

torch.save({
  'epoch': 10,
  'model_state_dict': model.state_dict(),
  'loss': 0.001,
}, MODEL_SAVE_PATH)
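
The cell above only saves a checkpoint. Loading it back could look like the sketch below. So that it runs on its own, it makes two assumptions: a small `nn.Linear` stands in for the `MLP`, and a temporary directory stands in for the checkpoint folder on the mounted Drive. The dictionary keys match the ones used when saving.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in model so the sketch is self-contained (assumption: replace with the MLP)
model = nn.Linear(4, 2)

# Hypothetical location; in Colab this would be MODEL_SAVE_PATH on the mounted Drive
checkpoint_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")

torch.save({
    "epoch": 10,
    "model_state_dict": model.state_dict(),
    "loss": 0.001,
}, checkpoint_path)

# map_location="cpu" lets a checkpoint saved on a GPU runtime load on a CPU-only one
checkpoint = torch.load(checkpoint_path, map_location="cpu")

# Rebuild a model with the same architecture, then restore its weights
restored = nn.Linear(4, 2)
restored.load_state_dict(checkpoint["model_state_dict"])

start_epoch = checkpoint["epoch"]
print(start_epoch, checkpoint["loss"])
```

The restored model must be constructed with the same architecture as the saved one, since `load_state_dict` matches parameters by name and shape.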

### Managing dataset

Obtaining a dataset
- Kaggle command
- Manual upload

Do NOT work with a dataset directly from Google Drive. Instead, use one of these options:
- Download or upload the dataset every time
- Move the dataset from Google Drive into the content folder
- Connect to GCP or AWS
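
The second option, moving the dataset from Google Drive into the content folder, can be sketched as follows. `copy_dataset` is a hypothetical helper, and the demo uses temporary directories so that it runs anywhere. In Colab the source would be a folder under `/content/drive/MyDrive` and the destination a folder under `/content`; both paths depend on your own layout.

```python
import shutil
import tempfile
from pathlib import Path


def copy_dataset(src_dir, dst_dir) -> Path:
    """Copy a dataset folder to local storage and return the destination path."""
    src, dst = Path(src_dir), Path(dst_dir)
    # dirs_exist_ok=True makes the copy safe to re-run (requires Python 3.8+)
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


# Stand-in for a dataset folder on the mounted Drive (assumption for the demo)
drive_dir = Path(tempfile.mkdtemp()) / "MyDrive" / "dataset"
drive_dir.mkdir(parents=True)
(drive_dir / "train.csv").write_text("id,label\n0,1\n")

# Stand-in for a folder under /content
local_dir = Path(tempfile.mkdtemp()) / "dataset"
copy_dataset(drive_dir, local_dir)

print(sorted(p.name for p in local_dir.iterdir()))
```

Copying once at the start of a session and then reading from local disk typically avoids repeated slow reads through the Drive mount during training.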


In [None]:
# Download the dataset from Kaggle

!pip install --upgrade --force-reinstall --no-deps kaggle==1.5.8
!mkdir /root/.kaggle

# Put your Kaggle username and API key here
with open("/root/.kaggle/kaggle.json", "w") as f:
    f.write('{"username":"amrapalisamanta","key":"<API KEY>"}')
# Restrict the credentials file to the current user
!chmod 600 /root/.kaggle/kaggle.json

! kaggle competitions download -c 11785-sp23-intro-to-colab
! unzip /content/competitions/11785-sp23-intro-to-colab/11785-sp23-intro-to-colab.zip -d /content

Collecting kaggle==1.5.8
  Using cached kaggle-1.5.8-py3-none-any.whl
Installing collected packages: kaggle
  Attempting uninstall: kaggle
    Found existing installation: kaggle 1.5.8
    Uninstalling kaggle-1.5.8:
      Successfully uninstalled kaggle-1.5.8
Successfully installed kaggle-1.5.8
mkdir: cannot create directory ‘/root/.kaggle’: File exists
Traceback (most recent call last):
  File "/usr/local/bin/kaggle", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/kaggle/cli.py", line 67, in main
    out = args.func(**command_args)
  File "/usr/local/lib/python3.10/dist-packages/kaggle/api/kaggle_api_extended.py", line 757, in competition_download_cli
    self.competition_download_files(competition, path, force,
  File "/usr/local/lib/python3.10/dist-packages/kaggle/api/kaggle_api_extended.py", line 720, in competition_download_files
    url = response.retries.history[0].redirect_location.split('?')[0]
IndexError: tuple index out of range
unzi
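
The `mkdir: cannot create directory` message in the output above appears when the cell is run a second time. As an alternative sketch, the credentials file can be written from Python in a way that is safe to re-run and that builds the JSON with the `json` module instead of by hand. `write_kaggle_credentials` is a hypothetical helper. The demo writes to a temporary directory with placeholder values; in Colab the directory would be `/root/.kaggle`.

```python
import json
import tempfile
from pathlib import Path


def write_kaggle_credentials(username: str, key: str, config_dir) -> Path:
    """Write kaggle.json into config_dir and restrict it to the current user."""
    config_dir = Path(config_dir)
    # exist_ok=True means re-running the cell does not raise an error
    config_dir.mkdir(parents=True, exist_ok=True)
    path = config_dir / "kaggle.json"
    path.write_text(json.dumps({"username": username, "key": key}))
    path.chmod(0o600)  # same effect as `chmod 600` on Linux
    return path


# Placeholder values and a temporary directory (assumptions for the demo)
demo_dir = Path(tempfile.mkdtemp()) / ".kaggle"
cred_path = write_kaggle_credentials("<USERNAME>", "<API KEY>", demo_dir)
print(cred_path.read_text())
```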

### Restart session vs restart runtime


Restart session
- Runtime > Restart session
- Clears all session variables

Restart runtime
- Runtime > Disconnect and delete runtime
- Deletes the session
- Files in the content folder are lost
- Switching GPUs also deletes the current runtime

# Colab Pro

- Longer session runtime, reducing risk of timeout
- Priority access to GPU
- Increased storage
- Supposedly background execution
