<a href="https://colab.research.google.com/github/Mroschelle/CS289A_project/blob/main/ColabGPU_289_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Colab GPU Acceleration Setup Guide

**First, make a copy of this notebook for yourself so that you can make edits. "File" -> "Save a copy in Drive"**

Above select the "Runtime" dropdown -> "Change runtime type". You should use Python 3, and "hardware accelerator" should be "GPU".

The following code cell will ensure that you have the GPU enabled. It will also provide system specifications. You should see something similar (with possibly different specs) as:

```
GPU 0: Tesla P100-PCIE-16GB (UUID: GPU-d977e794-801c-3a65-2cd2-2fe83043d501)
Wed Apr  8 23:17:24 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64.00    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P0    25W / 250W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Model name:          Intel(R) Xeon(R) CPU @ 2.20GHz
Socket(s):           1
Core(s) per socket:  1
Thread(s) per core:  2
L3 cache:            56320K
CPU MHz:             2200.000
13G
Avail
34G
```

In [None]:
#GPU count and name
!nvidia-smi -L

#use this command to see GPU activity while doing Deep Learning tasks, for this command 'nvidia-smi' and for above one to work, go to 'Runtime > change runtime type > Hardware Accelerator > GPU'
!nvidia-smi

!lscpu |grep 'Model name'

#no.of sockets i.e available slots for physical processors
!lscpu | grep 'Socket(s):'

#no.of cores each processor is having 
!lscpu | grep 'Core(s) per socket:'

#no.of threads each core is having
!lscpu | grep 'Thread(s) per core'

!lscpu | grep "L3 cache"

#if it had turbo boost it would've shown Min and Max MHz also but it is only showing current frequency this means it always operates at 2.3GHz
!lscpu | grep "MHz"

#memory that we can use
!free -h --si | awk  '/Mem:/{print $2}'

#hard disk space that we can use
!df -h / | awk '{print $4}'

/bin/bash: nvidia-smi: command not found
/bin/bash: nvidia-smi: command not found
Model name:                      Intel(R) Xeon(R) CPU @ 2.20GHz
Socket(s):                       1
Core(s) per socket:              1
Thread(s) per core:              2
L3 cache:                        55 MiB
CPU MHz:                         2200.166
12G
Avail
69G


# IMPORTANT: For every 12hrs or so Disk, RAM, VRAM, CPU cache etc data that is on our alloted virtual machine will get erased. MAKE SURE TO SAVE YOUR DATA.

The following code cell runs some code for Tensorflow-gpu to ensure that it can access the colab GPU. The output should contain something similar to:

```
physical_device_desc: "device: XLA_GPU device"
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 15701463552
locality {
  bus_id: 1
  links {
  }
}
incarnation: 12081463421592476599
physical_device_desc: "device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0"
]
``` 

In [8]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [11]:
%cd /content/gdrive/My Drive/Coursework/Spring 2023/289A Intro to ML/289A Project

/content/gdrive/My Drive/Coursework/Spring 2023/289A Intro to ML/289A Project


In [12]:
import tensorflow
from tensorflow.python.client import device_lib 
print(device_lib.list_local_devices())

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 205680351718223895
xla_global_id: -1
]


The following command can be used as a bash command to monitor the usage and memory of your GPU.

```
!nvidia-smi
```

If you would like to monitor constant updates of the GPU, you can run the following for 1 second updates

```
!watch -n 1 nvidia-smi
```

In [None]:
!nvidia-smi

Wed Apr 19 00:27:19 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   56C    P0    27W /  70W |    373MiB / 15360MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [14]:
from google.colab import drive

drive.mount('/content/gdrive')
%cd /content/gdrive/My Drive/Coursework/Spring 2023/289A Intro to ML/289A Project

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).
/content/gdrive/My Drive/Coursework/Spring 2023/289A Intro to ML/289A Project


In [30]:
#dependencies
#Anaconda 3.6
#Pytorch
#Tensorboard
#OpenCV python
#torchvision
#scipy
import torch
from math import ceil, sqrt
import pickle
import torch.nn as nn
import os
from torch.autograd import Variable
# from skimage import io, transform
import numpy as np
import cv2
import matplotlib.pyplot as plt
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, utils
import re
# from tensorboardX import SummaryWriter
from torch.utils.tensorboard import SummaryWriter
import time
from Neural_Network_Class import conv_deconv #Class where the network is defined
import pdb, glob
import scipy.io as spio

load = 1
train = 1
test = 1
tests_num = 1000
writer = SummaryWriter('runs_hn')
# once done, type in "tensorboard --logdir=runs_hn --bind_all" in terminal and go to the link being shown to visualize data

class ImageDataset(Dataset): #Defining the class to load datasets

    def __init__(self,input_dir,train=True):
        self.input_dir=input_dir
        self.train=train
        self.pix_size = 71
        self.current_datain=np.array([],dtype=np.float).reshape(0,self.pix_size,self.pix_size)
        self.current_dataout=np.array([],dtype=np.float).reshape(0,self.pix_size,self.pix_size)
        self.test_packts = 1
        self.train_packts = len(glob.glob(self.input_dir+'/*.mat')) - self.test_packts
    def __len__ (self):
        if self.train:
            return self.train_packts*2000 #I have kept size of testing data to be 50
        else:
            return self.test_packts*2000

    def __getitem__(self,idx):
        if self.train:
            if idx % 2000 == 0:
                ind = (int(idx/2000))%(self.train_packts)+1
                self.current_datain = spio.loadmat(self.input_dir+'/'+str(ind)+'.mat', squeeze_me=True)['tumorImage_withPSF']
                self.current_dataout = spio.loadmat(self.input_dir+'/'+str(ind)+'.mat', squeeze_me=True)['tumorImage_noPSF']
        else:
             if idx % 2000 == 0:
                ind = (int(idx/2000))%(self.test_packts)+1
                print(self.input_dir+'/'+str(self.train_packts + ind)+'.mat')
                self.current_datain =  spio.loadmat(self.input_dir+'/'+str(self.train_packts + ind)+'.mat', squeeze_me=True)['tumorImage_withPSF']
                self.current_dataout = spio.loadmat(self.input_dir+'/'+str(self.train_packts + ind)+'.mat', squeeze_me=True)['tumorImage_noPSF']   
        input_image= self.current_datain[idx%2000].reshape((1,self.pix_size,self.pix_size))     
        input_image = (input_image - input_image.min())/(input_image.max()-input_image.min())
        output_image=self.current_dataout[idx%2000].reshape((1,self.pix_size,self.pix_size))       
        output_image = (output_image - output_image.min())/(output_image.max() - output_image.min())              

        sample = {'input_image': input_image, 'output_image': output_image}             

        return sample

image_dir = "/content/gdrive/My Drive/Coursework/Spring 2023/289A Intro to ML/289A Project/images"
# train_dataset=ImageDataset(input_dir="images") #Training Dataset
train_dataset=ImageDataset(input_dir=image_dir) #Training Dataset
print(len(train_dataset))

test_dataset=ImageDataset(input_dir=image_dir,train=False) #Testing Dataset
batch_size = 250 #mini-batch size
n_iters = 240 #total iterations
num_epochs = n_iters / (len(train_dataset) / batch_size)
num_epochs = ceil(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 
                                           batch_size=batch_size, 
                                           shuffle=False)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 
                                          batch_size=batch_size, 
                                          shuffle=False)

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
# torch.cuda.set_device(device)
# model=conv_deconv().cuda(1) # Neural network model object
model=conv_deconv().to(device)

iter=0
iter_new=0 
check=os.listdir("checkpoints") #checking if checkpoints exist to resume training
if load and len(check):
    check.sort(key=lambda x:int((x.split('_')[2]).split('.')[0]))
    model=torch.load("checkpoints/"+check[-1],map_location=torch.device('cpu')).to(device)
    iter=int(re.findall(r'\d+',check[-1])[0])
    iter_new=iter
    print("Resuming from iteration " + str(iter))
    #os.system('python visualise.py')

                                                                              # https://discuss.pytorch.org/t/can-t-import-torch-optim-lr-scheduler/5138/6 
beg=time.time() #time at the beginning of training
if train:
    print("Training Started!")
    criterion=nn.MSELoss() #.cuda(1)  #Loss Class
        
    learning_rate = 0.005
    optimizer = torch.optim.Adam(model.parameters(),lr=learning_rate) #optimizer class
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)# this will decrease the learning rate by factor of 0.1
    for epoch in range(num_epochs):
        print("\nEPOCH " +str(epoch+1)+" of "+str(num_epochs)+"\n")
        for i,datapoint in enumerate(train_loader):
            datapoint['input_image']=datapoint['input_image'].type(torch.FloatTensor) #typecasting to FloatTensor as it is compatible with CUDA
            datapoint['output_image']=datapoint['output_image'].type(torch.FloatTensor)
            # input_image = Variable(datapoint['input_image'].cuda(1)) #Converting a Torch Tensor to Autograd Variable
            input_image = Variable(datapoint['input_image']) #.cuda(0)) #Converting a Torch Tensor to Autograd Variable
            # output_image = Variable(datapoint['output_image'].cuda(1))
            output_image = Variable(datapoint['output_image']) #.cuda(0))
            
            optimizer.zero_grad()  #https://discuss.pytorch.org/t/why-do-we-need-to-set-the-gradients-manually-to-zero-in-pytorch/4903/3
            outputs = model(input_image)
            # loss = criterion(outputs.to(torch.device('cpu')), output_image)
            # loss = criterion(outputs, output_image.cuda(1))
            loss = criterion(outputs, output_image) #.cuda(0))
            loss.backward() #Backprop
            optimizer.step()    #Weight update
            writer.add_scalar('Training Loss',loss.data.item(), iter)
            iter=iter+1
            if iter % 25 == 0 or iter==1:
                # Calculate Accuracy         
                test_loss = 0
                total = 0
                # Iterate through test dataset
                for j,datapoint_1 in enumerate(test_loader): #for testing
                    datapoint_1['input_image']=datapoint_1['input_image'].type(torch.FloatTensor)
                    datapoint_1['output_image']=datapoint_1['output_image'].type(torch.FloatTensor)
                
                    input_image_1 = Variable(datapoint_1['input_image']) #.cuda(1))
                    output_image_1 = Variable(datapoint_1['output_image']) #.cuda(1))
                    
                    # Forward pass only to get logits/output
                    outputs = model(input_image_1)
                    test_loss += criterion(outputs, output_image_1).data.item()
                    total+=1 # datapoint_1['output_image'].size(0)
                test_loss= test_loss/total   #sum of test loss for all test cases/total cases
                writer.add_scalar('Test Loss',test_loss, iter) 
                # Print Loss
                time_since_beg=(time.time()-beg)/60
                print('Iteration: {}. Loss: {}. Test Loss: {}. Time(mins) {}'.format(iter, loss.data.item(), test_loss,time_since_beg))
            if iter % 500 ==0:
                torch.save(model,'checkpoints/model_iter_'+str(iter)+'.pt')
                print("model saved at iteration : "+str(iter))
                # writer.export_scalars_to_json("runs_hn/scalars.json") #saving loss vs iteration data to be used by visualise.py
        scheduler.step()        
    writer.close()          

if test:
    iter = 0
    print('Testing the %d first samples'%tests_num)
    fig = plt.figure(figsize=(30,30))
    #input_imgs = zeros(71,71,test_num)
    #output_imgs = zeros(71,71,test_num)
    #true_outs = zeros(71,71,test_num)
    #input_psf = zeros(71,71,test_num)
    #output_psf = zeros(71,71,test_num)
    for i in range(tests_num):
        # Calculate Accuracy  
        datapoint_1 = test_dataset[i]
        datapoint_1['input_image']=torch.tensor(datapoint_1['input_image']).type(torch.FloatTensor)
        datapoint_1['output_image']=torch.tensor(datapoint_1['output_image']).type(torch.FloatTensor)
   
        if torch.cuda.is_available():
            input_image_1 = Variable(datapoint_1['input_image'].to(device))
            output_image_1 = Variable(datapoint_1['output_image'].to(torch.device('cpu')))
        else:
            input_image_1 = Variable(datapoint_1['input_image'])
            output_image_1 = Variable(datapoint_1['output_image'])
        
        outputs = model(input_image_1.reshape((1,1,71,71)))
        point_src = outputs*0
        point_src[0,0,36,36] = 1
        PSF = model(point_src).reshape((71,71))
        point_src = ((point_src.cpu()).reshape((71,71))).data.numpy()
        PSF = (PSF.cpu()).data.numpy()
        time_since_beg=(time.time()-beg)/60
        print('Iteration: {}. Time(mins) {}'.format(iter, time_since_beg))           
        # plt.subplot(221)
        # plt.imshow(((input_image_1.cpu()).reshape((71,71))).data.numpy())
        # plt.subplot(222)
        # plt.imshow(((outputs.cpu()).reshape((71,71))).data.numpy())
        # plt.subplot(224)
        # plt.imshow(((output_image_1.cpu()).reshape((71,71))).data.numpy())
        # plt.savefig('test_results/' + str(iter) + '.png') 
        iter = iter + 1  
        spio.savemat('test_results/data_'+ str(iter) + '.mat', {     'input_image':((input_image_1.cpu()).reshape((71,71))).data.numpy(),
                                                        'output':((outputs.cpu()).reshape((71,71))).data.numpy(),
                                                        'gnd_truth':((output_image_1.cpu()).reshape((71,71))).data.numpy()})
    plt.close('all')
    fig = plt.figure(figsize=(30,30))
    plt.subplot(121)
    plt.imshow(point_src)
    plt.subplot(122)
    plt.imshow(PSF)
    plt.savefig('test_results/PSF' + '.png')
    plt.close('all')

writer.close()
#decrease learning rate

Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  self.current_datain=np.array([],dtype=np.float).reshape(0,self.pix_size,self.pix_size)
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  self.current_dataout=np.array([],dtype=np.float).reshape(0,self.pix_size,self.pix_size)


18000
test18000
Training Started!

EPOCH 1 of 4

/content/gdrive/My Drive/Coursework/Spring 2023/289A Intro to ML/289A Project/images/10.mat
Iteration: 1. Loss: 0.01691884733736515. Test Loss: 0.014308645972050726. Time(mins) 0.21441726287206014
/content/gdrive/My Drive/Coursework/Spring 2023/289A Intro to ML/289A Project/images/10.mat
Iteration: 25. Loss: 0.005788101349025965. Test Loss: 0.0056763505563139915. Time(mins) 0.8415995438893636
/content/gdrive/My Drive/Coursework/Spring 2023/289A Intro to ML/289A Project/images/10.mat


KeyboardInterrupt: ignored

In [17]:
from google.colab import drive

drive.mount('/content/gdrive')


Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
 !ls /content/gdrive/My\ Drive/_2022:23/_SP23/'1 CS289A - Intro to ML'/project/code/images


10.mat	1.mat  2.mat  3.mat  4.mat  5.mat  6.mat  7.mat  8.mat	9.mat
