## Recitation 0E: Introduction to Google Colab

### What is Google Colab?

Google Colab is a tool from Google Research that provides a Jupyter-notebook-style Python 3 execution environment in your browser.

### Creating a New Notebook

1. Go to https://colab.research.google.com/
2. Select "New notebook" at the bottom, OR select an existing notebook from the various sources (Google Drive / GitHub / Upload)

### Setting up the Processor and Runtime Environment

1. Click on "Runtime" in the toolbar
2. Select "Change runtime type"
3. Select the appropriate hardware accelerator (for example, a GPU) and save

### Using Terminal
#### Installing Libraries

In [None]:
! pip install torch

#### Checking GPU instance

In [None]:
! nvidia-smi

In [None]:
import torch
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Device: ", DEVICE)
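The two checks above can also be approximated without importing PyTorch. The following is a minimal sketch, assuming only that a GPU runtime ships the `nvidia-smi` executable; the helper name `gpu_visible` is hypothetical and not part of Colab or PyTorch.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi` is installed and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # no NVIDIA driver tools on this runtime
    try:
        # `nvidia-smi -L` prints one line per detected GPU.
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", gpu_visible())
```

If this prints `False` on a runtime where you expected a GPU, revisit the runtime type selection before starting a long training run.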

### Connecting to your Google Drive
#### Mounting the Drive

In [None]:
from google.colab import drive

drive.mount("/content/drive")

#### Creating a Directory

In [None]:
import os
# exist_ok=True makes the cell safe to re-run if the folder already exists
os.makedirs("/content/drive/MyDrive/IDL-Sp23-Colab-Tutorial", exist_ok=True)

In [None]:
! ls /content/drive/MyDrive/IDL-Sp23-Colab-Tutorial
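Notebook cells tend to be re-run many times, so it can help to make directory creation idempotent. The sketch below uses `pathlib` for that; the helper name `ensure_dir` is hypothetical, and a temporary directory stands in for the mounted Drive path so the example runs anywhere.

```python
import tempfile
from pathlib import Path


def ensure_dir(path) -> Path:
    """Create `path` (and any missing parents) if needed and return it."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


# Stand-in for "/content/drive/MyDrive" so the sketch runs outside Colab.
base = Path(tempfile.mkdtemp())
project_dir = ensure_dir(base / "IDL-Sp23-Colab-Tutorial")
ensure_dir(project_dir)  # calling it again does not raise
print(project_dir.is_dir())
```

In Colab, after mounting, the same helper could be called with `"/content/drive/MyDrive/IDL-Sp23-Colab-Tutorial"`.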

### Writing and Loading files from the directory
#### Dummy DataFrame

In [None]:
import pandas as pd
dummy_data = {"col_1": [1, 2, 3], "col_2": [10, 20, 4]}
df = pd.DataFrame(dummy_data)
df.head()

##### Saving File

In [None]:
df.to_csv("/content/drive/MyDrive/IDL-Sp23-Colab-Tutorial/test.csv", index=False)

In [None]:
! ls /content/drive/MyDrive/IDL-Sp23-Colab-Tutorial

##### Reading File

In [None]:
df_load = pd.read_csv("/content/drive/MyDrive/IDL-Sp23-Colab-Tutorial/test.csv", index_col="col_1")
df_load.head()
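Note that reading with `index_col="col_1"` turns that column into the row index, so the loaded frame has one fewer regular column than the frame that was saved. A minimal sketch of the difference, using an in-memory buffer in place of the Drive file (the variable names are illustrative):

```python
import io

import pandas as pd

df = pd.DataFrame({"col_1": [1, 2, 3], "col_2": [10, 20, 4]})

# Write the CSV to an in-memory buffer instead of Google Drive.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Plain read: both columns come back as regular columns.
buffer.seek(0)
plain = pd.read_csv(buffer)

# Read with index_col: "col_1" becomes the index, only "col_2" remains a column.
buffer.seek(0)
indexed = pd.read_csv(buffer, index_col="col_1")

print(list(plain.columns))
print(list(indexed.columns), indexed.index.name)
```

Omit `index_col` if you want the reloaded frame to have the same layout as the one you saved.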

#### Dummy PyTorch Model

In [None]:
import torch
import torch.nn as nn

class MLP(nn.Module):

    def __init__(self, size):
        super().__init__()

        # One hidden block (Linear -> ReLU -> BatchNorm -> Dropout)
        # per consecutive pair of sizes, excluding the output size
        self.layers = []
        for in_dim, out_dim in zip(size[:-2], size[1:-1]):
            self.layers.extend([
                nn.Linear(in_dim, out_dim),
                nn.ReLU(),
                nn.BatchNorm1d(out_dim),
                nn.Dropout(0.5),
            ])
        # Final projection to the output size, with no activation
        self.layers.append(nn.Linear(size[-2], size[-1]))
        self.model = nn.Sequential(*self.layers)
        self.model.apply(self.init_param)

    def init_param(self, module):
        # Xavier-initialize the weights of every Linear layer
        if isinstance(module, nn.Linear):
            nn.init.xavier_uniform_(module.weight)

    def forward(self, x):
        return self.model(x)

model = MLP([40, 2048, 512, 256, 71])

##### Saving Model

In [None]:
MODEL_SAVE_PATH = "/content/drive/MyDrive/IDL-Sp23-Colab-Tutorial/dummy_model.pt"

torch.save({
  'epoch': 10,
  'model_state_dict': model.state_dict(),
  'loss': 0.001,
}, MODEL_SAVE_PATH)

In [None]:
! ls /content/drive/MyDrive/IDL-Sp23-Colab-Tutorial

##### Loading Model

In [None]:
# map_location="cpu" lets the checkpoint load even on a runtime without a GPU
saved = torch.load(MODEL_SAVE_PATH, map_location="cpu")

print("epoch", saved["epoch"], ", loss", saved["loss"])


new_model = MLP([40, 2048, 512, 256, 71])
new_model.load_state_dict(saved["model_state_dict"])

### Kaggle
#### Connecting to Kaggle
You can generate your Kaggle API token at:
Your Profile > Account tab > API section > Create New API token

In [None]:
import json

TOKEN = {"username":"<INSERT YOUR USERNAME HERE>","key":"<INSERT YOUR KEY HERE>"}


! pip install kaggle==1.5.12
! mkdir -p .kaggle
! mkdir -p /content && mkdir -p /content/.kaggle && mkdir -p /root/.kaggle/

with open('/content/.kaggle/kaggle.json', 'w') as file:
    json.dump(TOKEN, file)

! pip install --upgrade --force-reinstall --no-deps kaggle
! ls "/content/.kaggle"
! chmod 600 /content/.kaggle/kaggle.json
! cp /content/.kaggle/kaggle.json /root/.kaggle/

! kaggle config set -n path -v /content
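The token-writing part of the setup above can also be done entirely in Python. This is a minimal sketch rather than the recitation's method: the helper name `write_kaggle_token` is hypothetical, the credentials are placeholders, and a temporary directory stands in for `/root/.kaggle` so the example does not touch real credentials.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def write_kaggle_token(token: dict, config_dir) -> Path:
    """Write `kaggle.json` into `config_dir`, readable by the owner only."""
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    token_path = config_dir / "kaggle.json"
    token_path.write_text(json.dumps(token))
    # Same effect as `chmod 600`: owner read/write, no access for others.
    os.chmod(token_path, stat.S_IRUSR | stat.S_IWUSR)
    return token_path


# Stand-in directory; in Colab this would be "/root/.kaggle".
demo_dir = Path(tempfile.mkdtemp()) / ".kaggle"
token_path = write_kaggle_token(
    {"username": "<USERNAME>", "key": "<KEY>"}, demo_dir
)
print(token_path.name, oct(token_path.stat().st_mode & 0o777))
```

Restricting the file to its owner matters because the file holds your API key in plain text.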

#### Sample Kaggle Competition
https://www.kaggle.com/competitions/11785-sp23-intro-to-colab/
##### Downloading the Data

In [None]:
! kaggle competitions download -c 11785-sp23-intro-to-colab
! unzip /content/competitions/11785-sp23-intro-to-colab/11785-sp23-intro-to-colab.zip -d /content
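The extraction step can also be done with Python's standard `zipfile` module, which is convenient when you want to handle errors or inspect the archive contents in code. A minimal sketch, assuming a hypothetical helper `extract_archive`; it builds a tiny demo archive with made-up contents so that it runs without the competition download.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir) -> list:
    """Extract every member of `zip_path` into `dest_dir`; return their names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Build a small stand-in archive; the CSV contents are placeholders.
work = Path(tempfile.mkdtemp())
demo_zip = work / "demo.zip"
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("sampleSubmission.csv", "id,label\n0,1\n")

members = extract_archive(demo_zip, work / "data")
print(members)
```

In Colab, the same helper could be pointed at the downloaded competition archive with `/content` as the destination.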

##### Making a submission

In [None]:
! kaggle competitions submit -c 11785-sp23-intro-to-colab -f /content/sampleSubmission.csv -m "A test submission"

### Variable Inspector

In [None]:
import numpy as np
array = np.ones((10, 10))

### Resetting the Runtime
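1. Runtime > Restart Runtime: restarts the Python process and clears all variables; files on the runtime's disk are kept
2. Runtime > Disconnect and Delete Runtime: discards the virtual machine entirely, including installed packages and any files not saved to Google Drive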

### Tips When Upgrading to Google Colab Pro
- Purchasing Colab Pro with your AndrewID account should spare you the storage issues that can arise when saving model checkpoints to Google Drive folders
- If you get an error saying payments are not enabled for your ID, email IT Services at it-help@cmu.edu; they should be able to resolve it
- Please follow the best practices for saving model metrics and checkpoints outlined in the Recitation 0 series. Doing so helps immensely in writing cleaner code and collaborating on ablations with your study group peers