# Colab Helper Functions

## Mount the Google Drive
Colab needs permission to access your Drive each time you start a new runtime. Run the following cell to grant it.

In [1]:
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

Mounted at /content/gdrive


Once Drive is authenticated, you can browse your files by clicking the folder icon in the left sidebar. For Eric, the Capstone project folder can be found using the path in the next cell. To get the same path, add a shortcut to the shared Drive folder in your personal Google Drive.
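
As a quick sanity check after mounting, a sketch like the following confirms that the project folder is visible. The helper name `project_dir` is hypothetical, and the default folder name is an assumption taken from the symlink commands used later in this notebook; adjust it if your shortcut is named differently.

```python
from pathlib import Path


def project_dir(mount_root="/content/gdrive/MyDrive", name="CDS Capstone Project"):
    """Return the project folder as a Path, or raise if it is not visible.

    Both defaults are assumptions based on the paths used in this notebook.
    """
    path = Path(mount_root) / name
    if not path.is_dir():
        raise FileNotFoundError(
            f"{path} not found: mount Drive first, then add a shortcut "
            "to the shared folder in your own Drive"
        )
    return path
```

In Colab, calling `project_dir()` right after the mount cell should either return the folder or fail with a message that says what to fix.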

## Read in the monkey data spreadsheet from Google Drive
A copy is also stored in the GitHub repo (`data/monkey_data.csv`), and you should prefer that copy.

### Important columns
`id`: The ID assigned by the NYU CDS team to the scan. All the scan files are named according to this ID.

`monkey_id`: The number assigned by the Langone team to the monkey.

`constructed_filepath`: The filepath to the original .OCT file, if we were able to find it.

`pt_present`: Boolean indicating whether we have a PyTorch array corresponding to the scan.

`scan_shape`: The original dimensions of the scan.

`iop`: The intraocular pressure associated with the scan.

`icp`: The intracranial pressure associated with the scan. Used as label.

In [3]:
import pandas as pd 
import os 
import numpy as np
import shutil

import gspread
from google.colab import auth
from oauth2client.client import GoogleCredentials

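# Authenticate this Colab session, then open the spreadsheet by its title.
auth.authenticate_user()
gc = gspread.authorize(GoogleCredentials.get_application_default())
worksheet = gc.open('Monkey Data').sheet1
# get_all_values() returns a list of rows; the first row holds the column names.
rows = worksheet.get_all_values()
data = pd.DataFrame.from_records(rows[1:], columns=rows[0])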

In [4]:
data.head(5)

Unnamed: 0,id,monkey_id,training,oct_present,tiff_present,original scan folder (OCT),original scan folder (tiff),original scan file,scan date,scan time,eyeball (od or os),iop,icp,scan area,scan notes,handler,eric notes,constructed_filepath,exists,scan_shape
0,1,1,False,False,False,monkey_1_4.25.13/raw bog and tiffs/icpnormal/1...,baseline,9,04/25/2013,11:43,OS,15,8.5,5x5,,zixiao,,,False,
1,2,1,True,True,True,monkey_1_4.25.13/raw bog and tiffs/icpnormal/1...,monkey_1_4.25.13/'raw bog and tiffs'/icpnormal...,1,04/25/2013,11:45,OS,15,8.5,5x5,,zixiao,,monkey_1_4.25.13/raw bog and tiffs/icpnormal/1...,True,"[512, 512, 2048]"
2,3,1,True,True,True,monkey_1_4.25.13/raw bog and tiffs/icpnormal/1...,monkey_1_4.25.13/'raw bog and tiffs'/icpnormal...,2,04/25/2013,11:46,OS,15,8.5,5x5,good image,zixiao,,monkey_1_4.25.13/raw bog and tiffs/icpnormal/1...,True,"[512, 512, 2048]"
3,4,1,True,True,True,monkey_1_4.25.13/raw bog and tiffs/icpnormal/1...,monkey_1_4.25.13/'raw bog and tiffs'/icpnormal...,3,04/25/2013,11:50,OS,15,8.5,3x3,,zixiao,,monkey_1_4.25.13/raw bog and tiffs/icpnormal/1...,True,"[512, 512, 2048]"
4,5,1,True,True,True,monkey_1_4.25.13/raw bog and tiffs/icpnormal/1...,monkey_1_4.25.13/'raw bog and tiffs'/icpnormal...,4,04/25/2013,11:51,OS,15,8.5,3x3,centered,zixiao,,monkey_1_4.25.13/raw bog and tiffs/icpnormal/1...,True,"[512, 512, 2048]"
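
When the table is read through gspread, `get_all_values()` should return every cell as a string, so numeric columns such as `iop` and `icp` need converting before use. The following is a minimal sketch of that conversion, not part of the repo: the helper name `coerce_types` and the toy rows are hypothetical, and only the column names come from the spreadsheet above.

```python
import ast

import pandas as pd


def coerce_types(df):
    """Return a copy with numeric and list-like columns parsed from strings."""
    out = df.copy()
    for col in ("iop", "icp"):
        # Blank or non-numeric cells become NaN instead of raising.
        out[col] = pd.to_numeric(out[col], errors="coerce")
    # scan_shape is stored as text such as "[512, 512, 2048]"; blanks become None.
    out["scan_shape"] = out["scan_shape"].map(
        lambda s: ast.literal_eval(s) if s else None
    )
    return out


# Toy rows mimicking what the sheet returns (values are illustrative only).
demo = pd.DataFrame(
    {"iop": ["15", ""], "icp": ["8.5", ""], "scan_shape": ["[512, 512, 2048]", ""]}
)
typed = coerce_types(demo)
```

Reading `data/monkey_data.csv` with `pd.read_csv` avoids most of this, since pandas infers numeric types on its own.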


## Clone the GitHub repo
This works the same as on your local command line. Since the repo is public, you do not need to authenticate. Once you are done experimenting, we recommend exporting the notebook from Colab and opening a pull request that adds it to the `src/notebooks` folder of the repo.

Let's also add some symbolic links so that the same relative paths work everywhere.

- The original raw OCTs are put in the `Capstone2021/data/Raw/` folder.
- The downsized Torch arrays are put in the `Capstone2021/data/torch_arrays_128/` folder.
- Model weights are put in the `Capstone2021/models/` folder.

In [30]:
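# Use the %cd magic: `!cd` runs in a throwaway subshell and does not change the notebook's directory.
%cd /content/
!git clone https://github.com/Bulbasaurzc/Capstone2021
# Link the Drive data folders into the repo so relative paths match a local checkout.
!ln -s /content/gdrive/MyDrive/"CDS Capstone Project"/Data/torch_arrays_128/ Capstone2021/data/
!ln -s /content/gdrive/MyDrive/"CDS Capstone Project"/Data/Raw Capstone2021/data/
os.chdir('Capstone2021')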

fatal: destination path 'Capstone2021' already exists and is not an empty directory.
ln: failed to create symbolic link 'Capstone2021/data/torch_arrays_128': File exists
ln: failed to create symbolic link 'Capstone2021/data/Raw': File exists
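
The "File exists" messages above are what `ln -s` prints when the cell is run a second time. If you would rather have a cell that can be re-run quietly, one option is to create the links from Python. This is a sketch only; the helper name `ensure_symlink` is hypothetical and not part of the repo.

```python
import os


def ensure_symlink(target, link_path):
    """Create link_path -> target unless a correct link is already there.

    Returns True if a link was created, False if nothing needed doing.
    """
    if os.path.islink(link_path):
        if os.readlink(link_path) == target:
            return False  # already in place, nothing to do
        os.remove(link_path)  # stale link pointing somewhere else
    elif os.path.exists(link_path):
        # Refuse to overwrite a real file or directory.
        raise FileExistsError(f"{link_path} exists and is not a symlink")
    os.symlink(target, link_path)
    return True
```

For example, `ensure_symlink('/content/gdrive/MyDrive/CDS Capstone Project/Data/Raw', 'Capstone2021/data/Raw')` mirrors the second `ln -s` command, assuming Drive is mounted and the repo is cloned under `/content/`.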


## You might want to switch to your branch

In [55]:
!git pull
!git checkout eric
!git checkout main

remote: Enumerating objects: 7, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 4 (delta 2), reused 4 (delta 2), pack-reused 0
Unpacking objects: 100% (4/4), done.
From https://github.com/Bulbasaurzc/Capstone2021
   dd138e0..f9c9e58  eric       -> origin/eric
Updating dd138e0..f9c9e58
Fast-forward
 data/monkey_data.csv | 3534 +++++++++++++++++++++++++-------------------------
 1 file changed, 1767 insertions

## Use our custom Dataset class
The `MonkeyEyeballsDataset` class lazily loads the downsized PyTorch arrays for the monkeys. You can pass it to a standard PyTorch `DataLoader`.

In [66]:
from src.data.torch_utils import MonkeyEyeballsDataset
from torch.utils.data import Dataset, DataLoader

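labels = pd.read_csv('data/monkey_data.csv')
# Keep scans that have a Torch array, non-null IOP and ICP, and a positive ICP.
# The comparison needs its own parentheses because & binds more tightly than >.
labels = labels[labels['torch_present'] & labels['icp'].notnull() & labels['iop'].notnull() & (labels['icp'] > 0)]
# Hold out monkey 14 for validation; train on all other monkeys.
train_labels = labels[labels['monkey_id'] != 14]
val_labels = labels[labels['monkey_id'] == 14]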

med_train = MonkeyEyeballsDataset('data/torch_arrays_128', train_labels)
med_val = MonkeyEyeballsDataset('data/torch_arrays_128', val_labels)

dataloader_train = DataLoader(med_train, batch_size=8, num_workers=4, shuffle=True, pin_memory=True) 
dataloader_val = DataLoader(med_val, batch_size=8, num_workers=4, shuffle=False, pin_memory=True)

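
The label filtering and the hold-out split can be tried on a small table without Drive or the Torch arrays. The sketch below assumes the column names used in the cell above; the function name `split_by_monkey` and the toy rows are hypothetical.

```python
import pandas as pd


def split_by_monkey(labels, holdout_id=14):
    """Keep usable rows, then hold out one monkey for validation."""
    usable = (
        labels["torch_present"]
        & labels["icp"].notnull()
        & labels["iop"].notnull()
        & (labels["icp"] > 0)  # parentheses needed: & binds tighter than >
    )
    kept = labels[usable]
    train = kept[kept["monkey_id"] != holdout_id]
    val = kept[kept["monkey_id"] == holdout_id]
    return train, val


# Toy table; values are illustrative, not from the real spreadsheet.
toy = pd.DataFrame({
    "monkey_id": [1, 1, 14, 14],
    "torch_present": [True, False, True, True],
    "iop": [15.0, 15.0, 20.0, None],
    "icp": [8.5, 8.5, 10.0, 12.0],
})
train, val = split_by_monkey(toy)
```

Splitting by `monkey_id` rather than by row keeps every scan from the held-out monkey out of training, so validation scores are not inflated by near-duplicate scans of the same animal.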


## Installing requirements
The GitHub repo should have a `requirements.txt` file, which we can use to set up our environment.

In [None]:
!pip3 install -r requirements.txt
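
To see which packages are still missing before or after running the install, a rough check like the one below can help. It is a sketch under simplifying assumptions: the helper name `missing_requirements` is hypothetical, it only looks at package names (not versions), and it skips option lines such as `-r` or `-e`.

```python
import re
from importlib import metadata


def missing_requirements(lines):
    """Return package names from requirements-style lines that are not installed."""
    missing = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options
        # The package name ends at the first version specifier, extra, or marker.
        name = re.split(r"[<>=!~\[;\s]", line, maxsplit=1)[0]
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

From the repo root, `missing_requirements(open('requirements.txt'))` would list whatever still needs installing.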