<a href="https://colab.research.google.com/github/dominikklepl/Neural-Networks-Intracranial-hemorrhage-detection/blob/master/03_Channel_models.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Models with CTs processed with 3 windows
In this notebook I will build neural networks on CT images preprocessed with three windows (bone, brain and subdural), with each window saved in its own color channel.

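As a rough illustration of what "three windows in color channels" means, here is a minimal sketch. The actual preprocessing was done before this notebook and is not shown here, so the window settings (center and width in Hounsfield units), the channel order, and the names `WINDOWS`, `apply_window` and `to_three_channels` are all assumptions made for the example.

```python
import numpy as np

# Assumed (center, width) settings in Hounsfield units; not taken from this notebook.
# The channel order (bone, brain, subdural) is also an assumption.
WINDOWS = {
    "bone": (600, 2800),
    "brain": (40, 80),
    "subdural": (80, 200),
}

def apply_window(hu, center, width):
    """Clip HU values to the window and rescale them to [0, 1]."""
    low, high = center - width / 2, center + width / 2
    return (np.clip(hu, low, high) - low) / (high - low)

def to_three_channels(hu):
    """Stack the three windowed views along the last axis -> (H, W, 3)."""
    channels = [apply_window(hu, c, w) for c, w in WINDOWS.values()]
    return np.stack(channels, axis=-1).astype(np.float32)

# Hypothetical 2x2 slice in Hounsfield units: air, water, soft tissue, dense bone
demo_slice = np.array([[-1000.0, 0.0], [40.0, 2000.0]])
demo_image = to_three_channels(demo_slice)
print(demo_image.shape)
```

Each channel shows the same slice with a different contrast range, so a network that expects RGB input sees three complementary views of one CT slice.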
## Setup and paths to data

Import all required packages

In [0]:
#dealing with zip
import zipfile

#importing labels and working with dataframe
import pandas as pd

#manipulation with images
import numpy as np
import imageio as iio
import cv2

#models

Set paths to all required data

In [15]:
GDRIVE_PATH = "/gdrive"
WORK_DIR = "/content/"
BASE_DIR = GDRIVE_PATH + "/My Drive/RSNA-comp/"
ZIP_PATH = BASE_DIR + "channels_proto.zip"
TRAIN_PATH = BASE_DIR + "train_proto.csv"
MODEL_PATH = BASE_DIR + "models" #for saving learned weights
RESULT_PATH = BASE_DIR + "results" #for saving performance of models
IMAGES_PATH = WORK_DIR + "images"

#create the model and results folders if they don't exist yet
#(-p suppresses the "File exists" error; Drive must already be mounted)
!mkdir -p /gdrive/My\ Drive/RSNA-comp/models
!mkdir -p /gdrive/My\ Drive/RSNA-comp/results

#also create temporary folder in working directory for unzipping images
!mkdir -p /content/images


Connect Google Drive. That's where my data is stored.

In [12]:
from google.colab import drive
drive.mount(GDRIVE_PATH)

Drive already mounted at /gdrive; to attempt to forcibly remount, call drive.mount("/gdrive", force_remount=True).


Images are saved in a zip file. For easier and faster access, let's extract them to the folder in the working directory that we created before ("/content/images").

In [0]:
#the with-block closes the archive once extraction is done
with zipfile.ZipFile(ZIP_PATH) as img_archive:
    img_archive.extractall(path=IMAGES_PATH)

Load the CSV with labels and metadata.

In [29]:
train_df = pd.read_csv(TRAIN_PATH)
train_df.sample(3)

Unnamed: 0.2,Unnamed: 0,level_0,Unnamed: 0.1,index,SOPInstanceUID,Modality,PatientID,StudyInstanceUID,SeriesInstanceUID,StudyID,ImagePositionPatient,ImageOrientationPatient,SamplesPerPixel,PhotometricInterpretation,Rows,Columns,PixelSpacing,BitsAllocated,BitsStored,HighBit,PixelRepresentation,WindowCenter,WindowWidth,RescaleIntercept,RescaleSlope,fname,MultiImagePositionPatient,ImagePositionPatient1,ImagePositionPatient2,MultiImageOrientationPatient,ImageOrientationPatient1,ImageOrientationPatient2,ImageOrientationPatient3,ImageOrientationPatient4,ImageOrientationPatient5,MultiPixelSpacing,PixelSpacing1,img_min,img_max,img_mean,img_std,img_pct_window,MultiWindowCenter,WindowCenter1,MultiWindowWidth,WindowWidth1,any,epidural,intraparenchymal,intraventricular,subarachnoid,subdural,pct_cut
72,72,81588,81588,566803,ID_09e0aa4e0,CT,ID_8f804543,ID_018977ae65,ID_c0ddf3fb24,,-114.39,1.0,1,MONOCHROME2,512,512,0.488281,16,16,15,1,50.0,100.0,-1024.0,1.0,../input/rsna-intracranial-hemorrhage-detectio...,1,-104.372,75.512,1,0.0,0.0,0.0,1.0,0.0,1,0.488281,-1024,2512,310.988605,864.250186,0.307724,,,,,1,0,0,0,0,1,"(0.3, 1.0]"
79,79,47391,47391,328703,ID_859ff1837,CT,ID_5200eb1d,ID_a91fe6a504,ID_a9f543e20b,,-131.0,1.0,1,MONOCHROME2,512,512,0.488281,16,12,11,0,40.0,80.0,-1024.0,1.0,../input/rsna-intracranial-hemorrhage-detectio...,1,20.229671,-83.471729,1,0.0,0.0,0.0,0.961262,-0.275637,1,0.488281,0,2580,550.002155,569.645085,0.348465,1.0,40.0,1.0,80.0,1,0,0,1,0,0,"(0.3, 1.0]"
505,505,101848,101848,127838,ID_dca99fea3,CT,ID_ad95b001,ID_3997fd0bf7,ID_5d8f9a6ffb,,-125.0,1.0,1,MONOCHROME2,512,512,0.488281,16,16,15,1,40.0,150.0,-1024.0,1.0,../input/rsna-intracranial-hemorrhage-detectio...,1,-116.698,25.573,1,0.0,0.0,0.0,0.93358,-0.358368,1,0.488281,-1024,2573,208.605007,823.634313,0.214363,,,,,0,0,0,0,0,0,"(0.2, 0.3]"


Most of these columns are not needed at this point, so let's keep just the ones we actually use. Some of the names are also too long (yes, SOPInstanceUID, I'm talking about you), so we'll rename them.

In [28]:
train = train_df[['SOPInstanceUID',
                  'PatientID',
                  'any',
                  'epidural',
                  'intraparenchymal',
                  'intraventricular',
                  'subarachnoid',
                  'subdural']].copy()
train.rename(columns={'SOPInstanceUID': 'ID',
                      'PatientID': 'Patient'},
              inplace=True)
train.head(5)

Unnamed: 0,ID,Patient,any,epidural,intraparenchymal,intraventricular,subarachnoid,subdural
0,ID_1e3b6bd54,ID_139ecbbf,1,0,1,1,0,0
1,ID_069fe65a8,ID_ac4691db,1,0,0,1,0,0
2,ID_6e4410d09,ID_3d4f7d62,1,0,0,0,0,1
3,ID_50cdcac76,ID_010f1536,1,0,0,0,1,0
4,ID_1cfbbc596,ID_0463ca7e,1,0,0,0,0,1

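The same selection could also be done at load time, which avoids parsing the unused metadata columns at all. The sketch below uses a tiny made-up CSV in place of `train_proto.csv`; the `csv_text`, `keep` and `demo` names are hypothetical, and only two of the label columns are included to keep it short.

```python
import io
import pandas as pd

# Hypothetical two-row stand-in for train_proto.csv with one stray index column
csv_text = (
    "Unnamed: 0,SOPInstanceUID,Modality,PatientID,any,subdural\n"
    "0,ID_aaa,CT,ID_p1,1,1\n"
    "1,ID_bbb,CT,ID_p2,0,0\n"
)

keep = ["SOPInstanceUID", "PatientID", "any", "subdural"]

# usecols restricts parsing to the listed columns; the rest are skipped
demo = pd.read_csv(io.StringIO(csv_text), usecols=keep)
demo = demo.rename(columns={"SOPInstanceUID": "ID", "PatientID": "Patient"})
print(list(demo.columns))
```

Loading everything first, as done above, is still handy when you want to inspect the full metadata before deciding what to keep.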

Let's add a filename column to the dataframe that points to each image file. This will be useful later in the data loader.

In [31]:
train['filename'] = IMAGES_PATH + "/" + train['ID'] + ".png"
train.head(2)

Unnamed: 0,ID,Patient,any,epidural,intraparenchymal,intraventricular,subarachnoid,subdural,filename
0,ID_1e3b6bd54,ID_139ecbbf,1,0,1,1,0,0,/content/images/ID_1e3b6bd54.png
1,ID_069fe65a8,ID_ac4691db,1,0,0,1,0,0,/content/images/ID_069fe65a8.png

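Since the filenames are built from IDs rather than read from disk, it may be worth checking that every path actually points to an extracted image before training. Below is a minimal sketch of such a check; `missing_files` is a hypothetical helper, and the demo runs on a temporary folder rather than on the real data.

```python
import os
import tempfile
import pandas as pd

def missing_files(df, column="filename"):
    """Return the rows whose path in `column` does not exist on disk."""
    exists = df[column].map(os.path.exists)
    return df[~exists]

# Hypothetical demo: a temporary folder with one of two expected images present
demo_dir = tempfile.mkdtemp()
demo = pd.DataFrame({"ID": ["ID_present", "ID_absent"]})
demo["filename"] = demo_dir + "/" + demo["ID"] + ".png"
open(demo["filename"].iloc[0], "wb").close()  # create an empty placeholder file

missing = missing_files(demo)
print(len(missing), "of", len(demo), "files missing")
```

On the real dataframe the call would be `missing_files(train)`; an empty result means every row has a matching image.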

## Helper functions
Before we can start building awesome neural networks, we need a few helper functions that turn the data (i.e. images) into a form the models can learn from.

First, a function for loading a single image and resizing it to the input size.