# **Alzheimer-Stall-Catchers-Point-Cloud**

This notebook contains the code needed to run inference on the test data with the **Point Cloud Based Approach**, using 3D convolutional models such as:
- Resnet3D 18
- Resnet3D 101, 152, 200
- Densenet3D 121, 169, 201, 264

### Test GPU

This cell checks which GPU is available on the machine. The PyTorch models run on CUDA-enabled devices, so using such a machine is recommended. When running the code on Google Colab, check which GPU has been assigned after connecting to a new session. **Tesla K80** is the slowest GPU and takes 5-6 times as long to train as **Tesla T4** or **Tesla P100**.

In [None]:
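import torch
# get_device_name raises an error without CUDA, so check availability first
print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else "No CUDA device found")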

### Mount Google Drive 

**Skip this step if running on local machine**

Running the notebook on Colab requires the data files and custom Python modules to be either copied from Google Drive or uploaded to the Colab session. Because the dataset is large, uploading it in every session is not a viable option. It is recommended to upload the data to Google Drive once and run the following cell, which grants the session access to that drive so the necessary files can be copied easily.

(Note that free Google Drive accounts provide only 15 GB of storage. If the data is larger than that, you can split it into segments hosted on different drives and then add shortcuts to those folders in a single drive, which gives access to all the data from one Google Drive.)

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Importing Necessary Libraries

In [None]:
import pandas as pd
import numpy as np
from tqdm import tqdm
import gc
import time
import shutil
import os
import glob
import math  # math.ceil is used in the inference loop

from sklearn.metrics import matthews_corrcoef as mcc

# PyTorch libraries and modules
import torch
from torch.autograd import Variable
import torch.nn as nn
import torch.utils.data
from torchvision.models.video import r3d_18

torch.manual_seed(100)

import csv

from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import itertools

# Copy data

**Skip this cell if running on local machine**

After mounting Google Drive, copy the necessary custom Python modules, dataset, and other files into the session. The following cell copies the custom modules and the submission format file into the current Colab workspace; the large zipped dataset is extracted in a later cell.

In [None]:
shutil.copyfile("/content/drive/My Drive/Alzheimer/resnet.py", "resnet.py")
shutil.copyfile("/content/drive/My Drive/Alzheimer/densenet.py", "densenet.py")
shutil.copyfile("/content/drive/My Drive/Alzheimer/submission_format.csv", "submission_format.csv")


submission_format_csv = "submission_format.csv"

print("Done")

Done


If you have not yet converted the original dataset to the point cloud dataset, head to the <a href="https://github.com/ClockWorkKid/Alzheimers-Stall-Catchers/tree/master/Dataset%20Visualization%20and%20Processing">**DATA VISUALIZATION AND PROCESSING**</a> section of the repository.


To import the test dataset from Google Drive, the data has been partitioned into 15 zip files, which are extracted into the Colab session one after another.

In [None]:
print("This section is going to take about 20 minutes")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_1.zip";
print("partition 1 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_2.zip";
print("partition 2 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_3.zip";
print("partition 3 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_4.zip";
print("partition 4 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_5.zip";
print("partition 5 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_6.zip";
print("partition 6 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_7.zip";
print("partition 7 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_8.zip";
print("partition 8 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_9.zip";
print("partition 9 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_10.zip";
print("partition 10 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_11.zip";
print("partition 11 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_12.zip";
print("partition 12 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_13.zip";
print("partition 13 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_14.zip";
print("partition 14 imported")

!jar -xf "/content/drive/My Drive/SayeedColab/Alzheimer Data/test_15.zip";
print("partition 15 imported")

path="./test/"
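
The fifteen extraction commands above follow one pattern, so they can also be written as a loop. The following is a minimal sketch using Python's standard `zipfile` module instead of `jar`, assuming the archives are named `test_1.zip` through `test_15.zip` as in the cell above. The function name `extract_partitions` and its parameters are hypothetical and not part of the original notebook.

```python
import zipfile
from pathlib import Path


def extract_partitions(archive_dir, out_dir=".", count=15, prefix="test_"):
    """Extract <prefix>1.zip ... <prefix><count>.zip from archive_dir into out_dir.

    Returns the partition numbers whose archives were not found.
    """
    missing = []
    for part in range(1, count + 1):
        archive = Path(archive_dir) / f"{prefix}{part}.zip"
        if not archive.is_file():
            # Remember the gap instead of stopping, so one missing file
            # does not block the remaining partitions
            missing.append(part)
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out_dir)
        print(f"partition {part} imported")
    return missing
```

On Colab this could be called as `extract_partitions("/content/drive/My Drive/SayeedColab/Alzheimer Data")`; a non-empty return value points at partitions that still need to be uploaded.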

The Alzheimer Stall Catchers **Test** dataset contains 14160 data samples. The following sanity check confirms that all the data files have been imported successfully. If running on a local machine, point it at your data folder to check that all files are there.

In [None]:
files = glob.glob("test/*.pt")
print("Total: " + str(len(files)) + " should be 14160")
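
A count alone does not say which files are absent. As a hedged sketch, the check can be extended to list the samples that appear in the submission template but have no `.pt` file on disk. It assumes, as the later cells of this notebook do, that the first CSV column holds the `.mp4` names and that the first row is a header; `find_missing_samples` is a hypothetical helper name.

```python
import csv
import os


def find_missing_samples(csv_path, data_dir):
    """Return names from the CSV's first column that have no matching .pt file."""
    with open(csv_path, newline="") as infile:
        rows = list(csv.reader(infile))
    expected = [row[0] for row in rows[1:] if row]  # skip the header row
    missing = []
    for name in expected:
        stem = name.replace(".mp4", "")
        if not os.path.isfile(os.path.join(data_dir, stem + ".pt")):
            missing.append(name)
    return missing
```

For example, `find_missing_samples("submission_format.csv", "test")` returning an empty list would mean every listed sample was imported.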

# Functions

In [None]:
def string2torch(test_y):
  # Strip the ".mp4" extension and convert the numeric file names to an integer tensor
  names = [str(filename).split(".mp4")[0] for filename in test_y]

  # np.int was removed from recent NumPy releases; use an explicit integer type
  processed = np.array(names).astype(np.int64)
  processed = torch.from_numpy(processed)

  return processed

In [None]:
def make_submission_file(filenames, y_pred):
  # Rebuild the "<id>.mp4" names and write them next to the rounded probabilities
  filenames = filenames.astype(int)
  submit = [str(i) + '.mp4' for i in filenames]

  submission_dict = {"filename": submit, "stalled": np.round(y_pred,3)}
  submission_csv = pd.DataFrame(submission_dict)
  submission_csv.to_csv("submission3D.csv", index=False)
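
To make the resulting file layout concrete, here is a small standard-library sketch that builds the same two columns, `filename` and `stalled`, from numeric ids and probabilities. The ids and probabilities are made-up illustration values, and `submission_rows` is a hypothetical helper rather than part of the notebook.

```python
import csv
import io


def submission_rows(file_ids, probabilities):
    """Pair numeric ids with probabilities rounded to 3 decimals as '<id>.mp4' rows."""
    return [
        (f"{int(i)}.mp4", round(float(p), 3))
        for i, p in zip(file_ids, probabilities)
    ]


# Write to an in-memory buffer so the example leaves no file behind
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["filename", "stalled"])
writer.writerows(submission_rows([100109.0, 100289.0], [0.91234, 0.0456]))
print(buffer.getvalue())
```

The ids arrive as floats because the inference loop accumulates them in a NumPy float array, which is why they are cast back to integers before the `.mp4` suffix is added.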

In [None]:
# Read the submission template; the first column holds the test file names
with open(submission_format_csv, mode='r') as infile:
    reader = csv.reader(infile)
    test_list_csv = {rows[0]: rows[1] for rows in reader}


files = list(test_list_csv.keys())
files.pop(0)  # drop the header row
test_len = len(files)
print("Total data: " + str(test_len))

# **Network Model**

As mentioned at the beginning of the notebook, Resnet3D/Densenet3D models are used. The model must be imported and sent to the device before it is used, and depending on which model you wish to use, you have to import it either from the torchvision library (Resnet3D 18) or from our custom Python modules (resnet.py/densenet.py).

### **Resnet3D 18**
```
from torchvision.models.video import r3d_18

model = r3d_18(pretrained = False) # Change to true for a pretrained model 

# Replace the final layer for 2 classes; setting out_features alone does not resize the weights
model.fc = nn.Linear(model.fc.in_features, 2)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
```

### **Resnet 101, 152, 200**
```
import resnet

# models are 101 152 200
model = resnet.resnet101(   
                num_classes=2,
                shortcut_type='B',
                sample_size=64,
                sample_duration=32)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
```

### **Densenet 121, 169, 201, 264**
```
from densenet import generate_model

# models are 121 169 201 264
model = generate_model(model_depth = 264 , num_classes = 2) 

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
```

### **Loading Model Checkpoints**
Finally, be sure to load the trained model weight file.
```
checkpoint_model = "weight_3D.pth"  # weight file location

# map_location lets the checkpoint load on CPU-only machines as well
model.load_state_dict(torch.load(checkpoint_model, map_location=device))
```

In [None]:
depth, height, width = 32, 64, 64   # dimension for converting point cloud to voxels

shutil.copyfile("/content/drive/My Drive/Alzheimer/densenet264_ep_9_acc_90.317_mcc_0.761.pth", "weight_3D.pth")
checkpoint_model = "weight_3D.pth"

In [None]:
from densenet import generate_model

model = generate_model(model_depth = 264 , num_classes = 2) # values are 121 169 201 264
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# map_location lets the checkpoint load on CPU-only machines as well
model.load_state_dict(torch.load(checkpoint_model, map_location=device))

# **Inference**

In [None]:
y_pred = []
filenames = np.array([])


big_batch_size = 1024   # number of test data loaded at a time

batch_size = 128


for big_batch_no in range(math.ceil(test_len/big_batch_size)):

  this_batch_len = big_batch_size
  if (test_len - big_batch_size * big_batch_no) < big_batch_size:
    this_batch_len = test_len - big_batch_size * big_batch_no


  test_x = np.zeros((this_batch_len, 3, depth, height, width), dtype=np.float32)
  test_yy = np.zeros(this_batch_len)


  # Load one big batch
  for i in tqdm(range(this_batch_len)):

    original_idx = i + big_batch_size * big_batch_no

    f = files[original_idx]
    original_name = f.replace(".mp4", "")

    test_x[i, :, :, :, :] = torch.load(path + original_name + ".pt")
    test_yy[i] = int(original_name)

  # Test one big batch
  test_y = torch.from_numpy(test_yy).int()   # numeric file names, used as labels
  test_x = torch.from_numpy(test_x).float()
  
  test = torch.utils.data.TensorDataset(test_x, test_y)
  test_loader = torch.utils.data.DataLoader(test, batch_size = batch_size, shuffle = False, num_workers=4)

  model.eval()
  for i,(images,labels) in tqdm(enumerate(test_loader)):
      # Keep the numeric file names so predictions can be paired with files
      filenames = np.append(filenames, labels)
      
      images = images.view(-1,3,depth,height,width).to(device)

      with torch.no_grad():
        # Forward propagation
        outputs = model(images)

        # Softmax over the class dimension; column 1 is used as the stalled probability
        predicted = nn.functional.softmax(outputs, dim=1)[:,1]
        y_pred = np.append(y_pred, predicted.cpu().numpy())

make_submission_file(filenames, y_pred)

print("Done")
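
The inference cell loads the test set in "big batches" so that only part of it is held in memory at once, with a shorter final chunk when the total is not a multiple of the chunk size. The index arithmetic can be sketched on its own as follows, using the 14160 test samples and the chunk size of 1024 from this notebook; `big_batch_ranges` is a hypothetical helper name.

```python
import math


def big_batch_ranges(total, big_batch_size):
    """Yield (start, length) for each chunk, with a shorter final chunk if needed."""
    for batch_no in range(math.ceil(total / big_batch_size)):
        start = batch_no * big_batch_size
        yield start, min(big_batch_size, total - start)


ranges = list(big_batch_ranges(14160, 1024))
print(len(ranges))   # 14 chunks
print(ranges[-1])    # (13312, 848): the last chunk holds the remaining 848 samples
```

Using `min` for the chunk length is equivalent to the explicit `if` on the remaining sample count in the cell above.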