<a href="https://colab.research.google.com/github/yecatstevir/teambrainiac/blob/main/source/DL/visualization_playground.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Visualization Playground
## For 3D Convolutional Neural Network on Group Brain fMRI

This notebook turns fMRI brain images from flat MATLAB files into 4D tensor objects for CNN training.

To start:
- Mount Google Drive in Colab, clone the fMRI repository locally, and create a path to AWS for saving and loading
- Select the desired brain images by subject id and split them into train, validation, and test sets
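
The subject-level split can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function name `split_subject_ids`, the split fractions, the seed, and the example ids are all assumptions made for the example.

```python
import random


def split_subject_ids(subject_ids, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle subject ids and split them into train, validation, and test lists.

    Splitting by subject (not by image) keeps all images from one person
    in a single set. The fractions and the seed are illustrative assumptions.
    """
    ids = sorted(subject_ids)          # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(ids)

    n_val = round(len(ids) * val_frac)
    n_test = round(len(ids) * test_frac)

    val_ids = ids[:n_val]
    test_ids = ids[n_val:n_val + n_test]
    train_ids = ids[n_val + n_test:]
    return train_ids, val_ids, test_ids


# Hypothetical subject ids, for illustration only
example_ids = ["sub_%02d" % i for i in range(1, 21)]
train_ids, val_ids, test_ids = split_subject_ids(example_ids)
print(len(train_ids), len(val_ids), len(test_ids))
```

Fixing the seed makes the split repeatable across Colab sessions, which matters when the processed sets are saved and reloaded later.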

Pipeline flow for each batch of images:
- Import the desired brain images from the AWS paths in data_path_dict
- Drop brain images that are unlabeled
- Mask out the brain, normalize the pixel values, and cast into 4D space
- Aggregate images into tensor-compatible objects for model use
- Upload tensor object dictionary of labels and images to AWS S3
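
The middle steps of this pipeline (drop unlabeled images, mask, normalize, reshape to 4D) might look roughly like the sketch below. It is written under several assumptions that do not come from the repository: unlabeled images are marked with a sentinel value (`-1` here), "masking" means keeping only voxels inside a boolean brain mask, normalization is a per-image z-score, and the volume shape is a toy value. The function name `flat_to_4d` is hypothetical.

```python
import numpy as np


def flat_to_4d(flat_images, labels, brain_mask, volume_shape, unlabeled_value=-1):
    """Convert flat (n_images, n_voxels) data into an (n_kept, X, Y, Z) array.

    All parameter names and the unlabeled_value sentinel are assumptions
    made for this sketch.
    """
    flat_images = np.asarray(flat_images, dtype=np.float32)
    labels = np.asarray(labels)
    brain_mask = np.asarray(brain_mask, dtype=bool)

    # 1. Drop images that have no label
    keep = labels != unlabeled_value
    images, kept_labels = flat_images[keep], labels[keep]

    # 2. Z-score each image using only the voxels inside the brain mask
    in_mask = images[:, brain_mask]
    mean = in_mask.mean(axis=1, keepdims=True)
    std = in_mask.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid dividing by zero for constant images

    # 3. Leave voxels outside the mask at zero, write normalized values inside it
    normalized = np.zeros_like(images)
    normalized[:, brain_mask] = (in_mask - mean) / std

    # 4. Reshape every flat image into a 3D volume, giving a 4D array
    return normalized.reshape((-1,) + tuple(volume_shape)), kept_labels


# Toy data: 6 images of 4 x 5 x 3 = 60 voxels, two of them unlabeled
rng = np.random.default_rng(0)
volume_shape = (4, 5, 3)
flat = rng.normal(size=(6, 60))
labels = np.array([1, 0, -1, 1, -1, 0])
mask = np.zeros(60, dtype=bool)
mask[10:50] = True

volumes, kept = flat_to_4d(flat, labels, mask, volume_shape)
print(volumes.shape, kept)
```

The resulting NumPy array can be handed to `torch.from_numpy` to obtain a tensor for model use.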
        
  

## Mount Google Drive in Colab and Import Images

In [72]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/gdrive')  

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


In [73]:
# Clone the entire repo.
!git clone -l -s https://github.com/yecatstevir/teambrainiac.git

# Change directory into cloned repo DL folder
%cd teambrainiac/source/DL

# !ls

Cloning into 'teambrainiac'...
remote: Enumerating objects: 2004, done.
remote: Counting objects: 100% (189/189), done.
remote: Compressing objects: 100% (160/160), done.
remote: Total 2004 (delta 101), reused 75 (delta 29), pack-reused 1815
Receiving objects: 100% (2004/2004), 110.41 MiB | 41.14 MiB/s, done.
Resolving deltas: 100% (1292/1292), done.
/content/teambrainiac/source/DL/teambrainiac/source/DL


### Load path_config.py to access AWS credentials

In [74]:
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

Saving path_config.py to path_config.py
User uploaded file "path_config.py" with length 196 bytes


## Import Packages

In [75]:
# Possible Missing Packages
!pip install boto3
!pip install nilearn

# General Library Imports
import re
import scipy.io
import os
import pickle
import numpy as np
import nibabel as nib
import pandas as pd
import boto3
import tempfile
import tqdm
import random
from path_config import mat_path
from botocore.exceptions import ClientError
from collections import defaultdict
from sklearn.preprocessing import StandardScaler

# From Local Directory
from access_data_dl import *
from process_dl import *

# PyTorch Libraries
import torch

import altair as alt



## Import Dictionary of Paths to Flat Matlab Images

In [5]:
# Open path dictionary file to get subject ids
path = "../data/data_path_dictionary.pkl"
data_path_dict = open_pickle(path)

In [6]:
# Load the label file referenced by the first entry under 'labels'
label_data_dict = access_load_data(data_path_dict['labels'][0], True)


In [7]:
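# The first column of rt_labels is used as the label code of each image
labels = np.array(label_data_dict['rt_labels']).T[0]
df = pd.DataFrame(data=labels, columns=['Patient Status'])
df['image_index'] = df.index + 1  # 1-based position of the image

# Map label codes to readable names: 1 is up regulation, 0 is down regulation, anything else is buffer
reg_type = ['Up Regulation' if x==1 else 'Down Regulation' if x==0 else 'Buffer (No Regulation)' for x in df['Patient Status']]
df['Patient Status'] = reg_type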

df.head()

|   | Patient Status | image_index |
|---|---|---|
| 0 | Buffer (No Regulation) | 1 |
| 1 | Buffer (No Regulation) | 2 |
| 2 | Buffer (No Regulation) | 3 |
| 3 | Up Regulation | 4 |
| 4 | Up Regulation | 5 |
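
Because the labels come in consecutive blocks, it can help to collapse the sequence into runs before plotting. The sketch below applies the same mapping as the cell above (1 is up regulation, 0 is down regulation, anything else is buffer). The helper names and the example codes are hypothetical; `9999` only stands in for "any other value".

```python
from itertools import groupby


def status_name(code):
    """Map a label code to a readable status name."""
    if code == 1:
        return "Up Regulation"
    if code == 0:
        return "Down Regulation"
    return "Buffer (No Regulation)"


def label_runs(codes):
    """Collapse a label sequence into (status, first_image_index, n_images) runs.

    Image indices are 1-based, matching the image_index column.
    """
    runs = []
    position = 1
    for status, group in groupby(status_name(c) for c in codes):
        length = len(list(group))
        runs.append((status, position, length))
        position += length
    return runs


# Hypothetical label codes, for illustration only
example_codes = [9999, 9999, 9999, 1, 1, 1, 0, 0, 9999]
for run in label_runs(example_codes):
    print(run)
```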


In [8]:
reg_in_scanner = alt.Chart(df).mark_tick(thickness=5).encode(
    x = 'image_index:Q',
    color = alt.Color('Patient Status:N', scale=alt.Scale(scheme='dark2'))
).properties(
    width = 800,
    title = 'Patient Regulation in Scanner'
)


# Color notes: '#446CCF' is blue, '#F58518' is orange

reg_in_scanner

## Pulling the output all together

In [69]:
def avg_tensors(nested_tensors, n_epochs=10):
  """Average the tensors of each epoch and pad the result to n_epochs.

  nested_tensors is a pandas Series (named 'accuracy' or 'loss') whose
  entries are lists of single-element tensors, one list per epoch.
  """
  # Mean of the scalar values recorded in each epoch
  metric_list = [sum(tensor.item() for tensor in tensor_list) / len(tensor_list)
                 for tensor_list in nested_tensors]

  # Shorter runs are padded: accuracy with 1, any other metric with 0
  fill_value = 1 if getattr(nested_tensors, 'name', None) == 'accuracy' else 0
  metric_list += [fill_value] * (n_epochs - len(metric_list))

  return metric_list
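
The averaging and padding logic is easier to check on plain floats than on tensors. The sketch below is a simplified stand-in, not the function above: `average_and_pad` is a hypothetical name, it takes the fill value explicitly instead of reading it from the Series name, and the numbers are made up.

```python
def average_and_pad(epoch_batches, fill_value, n_epochs=10):
    """Average the batch values of each epoch, then pad the list to n_epochs."""
    averages = [sum(batch) / len(batch) for batch in epoch_batches]
    # A non-positive repeat count yields an empty list, so long runs are left as-is
    averages += [fill_value] * (n_epochs - len(averages))
    return averages


# Two recorded epochs, padded out to four with the accuracy fill value of 1
accuracy_batches = [[0.5, 1.0], [0.25, 0.75, 0.5]]
padded = average_and_pad(accuracy_batches, fill_value=1, n_epochs=4)
print(padded)
```

Padding keeps every column the same length, so runs with different epoch counts can share one DataFrame. The padded entries are placeholders rather than measured values, which is worth keeping in mind when reading the table.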

In [71]:
filenames = ['metrics_batch_1_1', 'metrics_batch_1_2', 'metrics_batch_2_1', 'metrics_batch_4_1']
train_error = None

for i, file in enumerate(filenames):
  pkl_path = '/content/gdrive/My Drive/%s.pkl' % file
  print(pkl_path)
  # Each pickle stores the per-epoch metrics under the 'round_0' key
  metrics_dict = open_pickle(pkl_path)['round_0']
  df = pd.DataFrame(metrics_dict).T

  # Start the summary table from the epoch index of the first file
  if train_error is None:
    train_error = pd.DataFrame(index=df.index)
  train_error['accuracy_'+str(i)] = avg_tensors(df['accuracy'])
  train_error['loss_'+str(i)] = avg_tensors(df['loss'])

train_error

/content/gdrive/My Drive/metrics_batch_1_1.pkl
/content/gdrive/My Drive/metrics_batch_1_2.pkl
/content/gdrive/My Drive/metrics_batch_2_1.pkl
/content/gdrive/My Drive/metrics_batch_4_1.pkl


|   | accuracy_0 | loss_0 | accuracy_1 | loss_1 | accuracy_2 | loss_2 | accuracy_3 | loss_3 |
|---|---|---|---|---|---|---|---|---|
| epoch_1 | 0.507937 | 0.710441 | 0.507937 | 0.710441 | 0.588624 | 1.039068 | 0.679894 | 0.61838 |
| epoch_2 | 0.529762 | 0.686388 | 0.529762 | 0.686388 | 0.832011 | 0.448916 | 0.878307 | 0.35328 |
| epoch_3 | 0.521825 | 0.691642 | 0.521825 | 0.691642 | 0.916667 | 0.333925 | 0.955026 | 0.150648 |
| epoch_4 | 0.569444 | 0.682483 | 0.569444 | 0.682483 | 0.950617 | 0.182544 | 1.0 | 0.0 |
| epoch_5 | 0.53373 | 0.684524 | 0.53373 | 0.684524 | 1.0 | 0.0 | 1.0 | 0.0 |
| epoch_6 | 0.543651 | 0.681526 | 0.543651 | 0.681526 | 1.0 | 0.0 | 1.0 | 0.0 |
| epoch_7 | 0.555556 | 0.67529 | 0.555556 | 0.67529 | 1.0 | 0.0 | 1.0 | 0.0 |
| epoch_8 | 0.621032 | 0.6646 | 0.621032 | 0.6646 | 1.0 | 0.0 | 1.0 | 0.0 |
| epoch_9 | 0.619048 | 0.657988 | 0.619048 | 0.657988 | 1.0 | 0.0 | 1.0 | 0.0 |
| epoch_10 | 0.65873 | 0.6433 | 0.65873 | 0.6433 | 1.0 | 0.0 | 1.0 | 0.0 |


In [65]:
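# Plot the per-epoch training accuracy of the first metrics file
source = train_error.reset_index().rename(columns={'index': 'epoch'})

alt.Chart(source).mark_line(
    point=alt.OverlayMarkDef(color="red")
).encode(
    x=alt.X('epoch:N', sort=None),
    y='accuracy_0:Q'
)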

'accuracy'

|   | accuracy_1 | loss_1 |
|---|---|---|
| epoch_1 | 0.507937 | 0.710441 |
| epoch_2 | 0.529762 | 0.686388 |
| epoch_3 | 0.521825 | 0.691642 |
| epoch_4 | 0.569444 | 0.682483 |
| epoch_5 | 0.53373 | 0.684524 |
| epoch_6 | 0.543651 | 0.681526 |
| epoch_7 | 0.555556 | 0.67529 |
| epoch_8 | 0.621032 | 0.6646 |
| epoch_9 | 0.619048 | 0.657988 |
| epoch_10 | 0.65873 | 0.6433 |


In [39]:
# Average the accuracy tensors recorded within each epoch

acc_list = []
for epoch_tensors in df['accuracy']:
  temp_sum = 0
  for tensor in epoch_tensors:
    temp_sum += tensor.item()
  acc_list.append(temp_sum/len(epoch_tensors))

acc_list



[0.5079365032059806,
 0.5297619104385376,
 0.5218254008463451,
 0.5694444456270763,
 0.5337301641702652,
 0.5436507953064782,
 0.555555556501661,
 0.6210317526544843,
 0.6190476247242519,
 0.6587301535265786]

## To Do: Finishing Up
- Finish training the model on training data and save it
- Put all metrics in the same dictionary or dataframe for the first round of training with 10 epochs
- Build visualizations for epoch accuracies during training
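
One possible starting point for the last two items is to reshape the wide `train_error` table into long form, with one row per epoch, run, and metric, since that layout is convenient for Altair. The sketch below is only an assumption about how this could be done: `to_long_form` is a hypothetical helper, and the small table reuses a few values from the summary above.

```python
import pandas as pd


def to_long_form(train_error):
    """Reshape accuracy_<i>/loss_<i> columns into one row per (epoch, run, metric)."""
    long_df = (
        train_error
        .rename_axis("epoch")
        .reset_index()
        .melt(id_vars="epoch", var_name="column", value_name="value")
    )
    # Split "accuracy_0" into the metric name "accuracy" and the run id "0"
    parts = long_df["column"].str.rsplit("_", n=1, expand=True)
    long_df["metric"] = parts[0]
    long_df["run"] = parts[1]
    # Turn "epoch_3" into the integer 3 so epochs sort numerically
    long_df["epoch_number"] = long_df["epoch"].str.split("_").str[-1].astype(int)
    return long_df.drop(columns="column")


# Small wide table built from two epochs of the summary above
wide = pd.DataFrame(
    {
        "accuracy_0": [0.507937, 0.529762],
        "loss_0": [0.710441, 0.686388],
        "accuracy_2": [0.588624, 0.832011],
        "loss_2": [1.039068, 0.448916],
    },
    index=["epoch_1", "epoch_2"],
)
long_df = to_long_form(wide)
print(long_df)
```

After filtering to `metric == 'accuracy'`, the long table can be passed to `alt.Chart` with `epoch_number` on the x axis, `value` on the y axis, and `run` as the color channel to draw one line per run.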



For validation and testing:
- Import validation dataset
- Change metrics dictionary to contain predictions
- Run and train on validation set
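
A metrics dictionary that also stores predictions could keep the round and epoch nesting that the saved pickles already use (`'round_0'`, then `'epoch_1'`, and so on, each with `'accuracy'` and `'loss'`). The sketch below is a guess at one workable layout: the `'predictions'` key and the `record_batch` helper are hypothetical, and plain floats stand in for the tensors the training loop records.

```python
def record_batch(metrics, round_id, epoch, accuracy, loss, predictions):
    """Append one batch's results under metrics['round_<r>']['epoch_<e>']."""
    epoch_entry = metrics.setdefault("round_%d" % round_id, {}).setdefault(
        "epoch_%d" % epoch, {"accuracy": [], "loss": [], "predictions": []}
    )
    epoch_entry["accuracy"].append(accuracy)
    epoch_entry["loss"].append(loss)
    epoch_entry["predictions"].extend(predictions)  # hypothetical new key


# Made-up values, for illustration only
metrics = {}
record_batch(metrics, round_id=0, epoch=1, accuracy=0.5, loss=0.7, predictions=[1, 0, 1])
record_batch(metrics, round_id=0, epoch=1, accuracy=0.75, loss=0.6, predictions=[0, 0])
record_batch(metrics, round_id=0, epoch=2, accuracy=1.0, loss=0.2, predictions=[1])
print(metrics["round_0"]["epoch_1"])
```

Plain nested dictionaries are used here because they pickle without extra work, which matters when the metrics are saved to Drive or S3.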
 

Other:
- Do write-up