<a href="https://colab.research.google.com/github/nirbhik-datta/CrimesInCommunities/blob/master/EjectionFractionPrediction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Ejection Fraction Prediction

### Overview

The deterioration of cardiac function is a key indicator of heart disease. To assess cardiac function, doctors measure the end-systolic volume (ESV) and end-diastolic volume (EDV) of the left ventricle. These measurements are used to determine the ejection fraction (EF), the percentage of blood ejected from the left ventricle with each heartbeat.
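
As a quick illustration, EF can be computed from the two volumes with the standard formula EF = (EDV - ESV) / EDV, where EDV is the end-diastolic volume and ESV the end-systolic volume. The function name and the choice of millilitres are assumptions made for this sketch.

```python
def ejection_fraction(edv_ml, esv_ml):
    """Return the ejection fraction in percent.

    edv_ml: end-diastolic volume (ventricle at its fullest), in mL
    esv_ml: end-systolic volume (ventricle at its emptiest), in mL
    """
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("end-systolic volume must lie between 0 and EDV")
    # Stroke volume (blood ejected per beat) as a fraction of the full volume.
    return 100.0 * (edv_ml - esv_ml) / edv_ml


print(ejection_fraction(120.0, 50.0))  # roughly 58.3
```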

![alt text](https://storage.googleapis.com/kaggle-competitions/kaggle/4729/media/heart-illustration2-resized.jpg)

(https://www.kaggle.com/c/second-annual-data-science-bowl)

The dataset comprises cardiac MRI images of 1000 patients, compiled by the NIH and Children's National Medical Center. It is known as the Sunnybrook Cardiac Data (SCD) or as the 2009 Cardiac MR Left Ventricle Segmentation Challenge data. The segmentation challenge consists of accurately delineating the endocardial (inner heart tissue) and epicardial (outer heart layer) contours of the left ventricle (LV). Manual delineation is time-consuming and tedious, and may vary between observers.

Four main challenges are associated with automatic segmentation of the LV using cine MRI:
* overlap between intensity distributions within cardiac regions
* lack of edge information
* shape variability between endocardial and epicardial slices / phases (inner and outer slices of heart)
* variability of these factors from subject to subject

The goal of the contest is to compare LV segmentation with expert contours. The contest provides open-source code for contour evaluation. The evaluation metric is the Continuous Ranked Probability Score (CRPS). For each MRI, predict the cumulative probability distribution for both the systolic and diastolic volumes.
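
A minimal sketch of the CRPS, assuming the Kaggle formulation in which each predicted cumulative distribution is evaluated on 600 volume bins (0 to 599 mL) and compared with a step function located at the true volume. The function name and argument layout are illustrative, not taken from the contest code.

```python
import numpy as np


def crps(pred_cdfs, true_volumes, n_bins=600):
    """Mean squared gap between predicted CDFs and ideal step functions.

    pred_cdfs: array of shape (rows, n_bins); entry [m, n] is P(volume <= n mL)
    true_volumes: array of shape (rows,), the measured volumes in mL
    n_bins: number of 1 mL bins (600 is an assumption about the contest setup)
    """
    pred = np.asarray(pred_cdfs, dtype=float)
    vols = np.asarray(true_volumes, dtype=float).reshape(-1, 1)
    grid = np.arange(n_bins).reshape(1, -1)
    # Heaviside step: 0 below the true volume, 1 at and above it.
    step = (grid >= vols).astype(float)
    return float(np.mean((pred - step) ** 2))
```

A prediction that puts all probability mass exactly at the true volume scores 0; lower is better.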

https://www.midasjournal.org/browse/publication/658


### Dataset

The dataset consists of 45 cine-MRI images from four main pathological groups:
* Heart Failure with infarction (HF-I) (EF < 40% and evidence of Gd enhancement)
* Heart Failure without infarction (HF) (EF < 40% and no evidence of Gd enhancement)
* LV Hypertrophy (HYP) (normal EF (EF > 55%) and ratio of left ventricular (LV) mass to body surface area > 83 g/m^2)
* Healthy group (N) (normal EF and no hypertrophy)

(Gd enhancement - Indicator of adverse cardiovascular outcomes)
(LV Hypertrophy - Thickening of heart walls of main heart pumping chamber)
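
The group criteria above can be written as a small decision function. This is only a sketch of the listed thresholds: the function and argument names are made up here, and cases the list does not cover (EF between 40% and 55%) return `None` rather than guessing.

```python
def classify_group(ef_percent, gd_enhancement, lv_mass_index):
    """Map measurements to one of the four groups listed above.

    ef_percent: ejection fraction in percent
    gd_enhancement: True if there is evidence of Gd enhancement
    lv_mass_index: LV mass over body surface area, in g/m^2
    """
    if ef_percent < 40:
        return "HF-I" if gd_enhancement else "HF"
    if ef_percent > 55:
        return "HYP" if lv_mass_index > 83 else "N"
    return None  # not covered by the listed criteria


print(classify_group(35, True, 70))   # HF-I
print(classify_group(60, False, 90))  # HYP
```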

http://www.cardiacatlas.org/studies/sunnybrook-cardiac-data/


### Potential Models

U-Net : CNNs for Biomedical Image Segmentation
* For biomedical image processing, the issue is that training images are not easily accessible
* Builds on the fully convolutional network - modified to work with few training images and yield more precise segmentation
* Given little data, model uses data augmentation by applying elastic deformations on the available data
* Starts with contracting path for context, ends with symmetric expanding path for precise localization
* Biomedical image processing - desired output should include localization (class label assigned to each pixel); number of training images rarely reaches 1000s
* Sliding window + patches inefficient due to overlapping patches; also has trade off between localization and use of context
* This model builds on FCN (below) - extended to train on few examples and yield more precise segmentation
* Main idea of FCN - supplement a contracting network with successive layers where pooling operators are replaced by upsampling operators (increases resolution of output); use skip connections to combine high-resolution features from the contracting path with the upsampled output
* Modification to FCN - include large number of feature channels in the upsampling part to allow propagation of context to higher-resolution layers -> resulting network has expansive section with symmetric shape to contracting section (U-shaped)
* Little training data available - use excessive data augmentation by applying elastic deformations to available images
* Allows network to learn invariance to such deformations

--Network Architecture
* Contracting path (left) and expansive path (right)
* Contracting path - repeated application of two 3x3 convolutions (unpadded), each followed by ReLU, then 2x2 max-pooling with stride 2 for downsampling (each downsampling step doubles the number of feature channels)
* Expansive path - every step upsamples the feature map, followed by a 2x2 convolution ("up-convolution") that halves the number of feature channels; then concatenation with the correspondingly cropped feature map from the contracting path (skip connection; cropping is needed because border pixels are lost in every convolution), followed by two 3x3 convolutions, each with ReLU activation
* Final layer is a 1x1 convolution mapping each 64-component feature vector to the desired number of classes (total of 23 convolutional layers)
* To allow seamless tiling of the output segmentation map (so large images can be processed tile by tile), choose the input tile size so that every 2x2 max-pooling is applied to a layer with even x and y size
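
Because the convolutions are unpadded, the output map is smaller than the input tile. The sketch below traces the spatial size through the architecture described above (four downsampling steps is the depth used in the paper; the function name is an assumption) and rejects tile sizes that would hit a max-pooling layer with an odd size.

```python
def unet_output_size(input_size, depth=4):
    """Spatial size of the U-Net output for a square input tile."""
    size = input_size
    for _ in range(depth):
        size -= 4  # two unpadded 3x3 convolutions remove 2 pixels each
        if size <= 0 or size % 2:
            raise ValueError("max-pooling would see an odd or empty layer")
        size //= 2  # 2x2 max-pooling with stride 2
    size -= 4  # two convolutions at the bottom of the U
    for _ in range(depth):
        size = size * 2 - 4  # up-convolution, then two 3x3 convolutions
    if size <= 0:
        raise ValueError("input tile is too small for this depth")
    return size


print(unet_output_size(572))  # 388, the tile sizes shown in the paper's figure
```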

--Training
* Input images and their corresponding segmentation maps are used to train the network with the SGD implementation of Caffe
* Energy function computed using pixel-wise softmax over the final feature map combined with the cross-entropy loss function
* Pre-compute a weight map for each ground truth segmentation to compensate for the different frequency of pixels from each class and to force the network to learn the separation borders between touching objects of the same class
* Weight initialization using a Gaussian distribution with standard deviation sqrt(2/N), where N = number of incoming nodes of one neuron (e.g. N = 9 * 64 = 576 for a 3x3 convolution with 64 input feature channels)
* Further details in paper
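
A NumPy sketch of the two formulas in this list: the weighted pixel-wise softmax cross-entropy and the sqrt(2/N) initialization scale. It is not the Caffe implementation; the function names and array layouts are assumptions, and the weight map is taken as given rather than computed.

```python
import numpy as np


def weighted_pixel_cross_entropy(logits, labels, weight_map):
    """Weighted cross-entropy over a pixel-wise softmax.

    logits: float array (num_classes, H, W), the final feature map
    labels: int array (H, W), the true class of each pixel
    weight_map: float array (H, W), the pre-computed per-pixel weights
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=0, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    rows, cols = np.indices(labels.shape)
    log_p_true = log_softmax[labels, rows, cols]
    return float(-(weight_map * log_p_true).sum())


def init_std(kernel_size, in_channels):
    """Standard deviation sqrt(2/N), N = incoming nodes of one neuron."""
    n_incoming = kernel_size * kernel_size * in_channels
    return float(np.sqrt(2.0 / n_incoming))


# Example: draw initial weights for a 3x3 convolution, 64 -> 128 channels.
rng = np.random.default_rng(0)
kernel = rng.normal(0.0, init_std(3, 64), size=(128, 64, 3, 3))
```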

--Data Augmentation
* Data augmentation important to teach desired invariance and robustness properties with few training examples
* For microscopical images, need shift and rotation invariance as well as robustness to deformation and grey value variations
* Random elastic deformations are key for training a segmentation network with few annotated images
* Smooth deformations generated using random displacement vectors on a coarse 3x3 grid; displacements sampled from a Gaussian distribution with 10-pixel standard deviation; per-pixel displacements then computed using bicubic interpolation (extension of cubic interpolation to 2D)
* Dropout layers at the end of the contracting path perform further implicit data augmentation
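
A rough sketch of the elastic deformation described above, assuming SciPy is available. Cubic spline interpolation (`order=3` in `scipy.ndimage.map_coordinates`) stands in for the bicubic interpolation of the paper, and the function name, the `seed` parameter and the boundary modes are choices made for this example only.

```python
import numpy as np
from scipy.ndimage import map_coordinates


def elastic_deform(image, sigma=10.0, grid_size=3, seed=None):
    """Warp a 2D image with a smooth random displacement field."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    # Random displacement vectors (dy, dx) on a coarse grid, in pixels.
    coarse = rng.normal(0.0, sigma, size=(2, grid_size, grid_size))
    # Position of every pixel in coarse-grid coordinates.
    gy = np.linspace(0, grid_size - 1, h)
    gx = np.linspace(0, grid_size - 1, w)
    grid_coords = np.meshgrid(gy, gx, indexing="ij")
    # Interpolate the coarse field to one displacement per pixel.
    dy = map_coordinates(coarse[0], grid_coords, order=3, mode="nearest")
    dx = map_coordinates(coarse[1], grid_coords, order=3, mode="nearest")
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Resample the image at the displaced positions (bilinear).
    return map_coordinates(image, [rows + dy, cols + dx], order=1, mode="reflect")
```

For segmentation, the same displacement field has to be applied to the image and to its label mask (with `order=0` for the mask so that labels are not blended), for example by passing the same `seed` to both calls.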

https://arxiv.org/pdf/1505.04597

U-Net Implementation, trained network and more information:
https://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/

Fully Convolutional Networks for Semantic Segmentation
* Pixel level prediction - natural progression from coarse inference to fine inference
* Semantic segmentation - each pixel labeled with the class of its enclosing object (prior approaches have shortcomings that the paper addresses)
* Paper shows a fully convolutional network (FCN) trained end-to-end exceeds the state of the art without need for further machinery - first work to train FCNs end-to-end for pixel-wise prediction and from supervised pre-training
* Whole-image-at-a-time forward and backprop
* In-network upsampling allows for pixel-wise prediction and learning in nets with subsampled pooling
* Semantic segmentation has an inherent tension between semantics and location - global info resolves what while local info resolves where - use skip architecture to make use of the feature spectrum (local-to-global pyramid)
* Convnets built on translation invariance - basic operations operate on local input regions and depend only on relative spatial coordinates
* Layer-by-layer computation over the whole image is faster than patch-by-patch (looking at subimages), since overlapping receptive fields share computation
* Typical recognition nets (LeNet, AlexNet) take fixed-size inputs and produce non-spatial outputs - fully connected layers have fixed dimensions and throw away spatial coordinates
* Treating FC layers as convolutions with kernels covering their entire input region allows the net to take input of any size and output a classification map
* Replacing FC layers with convolutions allows faster computation over a whole image; output size is reduced by subsampling (which also keeps filters small)
* Connect coarse outputs to dense pixels using interpolation
* Can interpolate by factor f with "backward convolution (deconvolution)" with output stride f (easily implemented by reversing forward and backward passes of convolution)
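
To make the "FC layer as convolution" point concrete, the NumPy sketch below reuses the weights of a fully connected layer as k x k convolution kernels. On a k x k input both give the same scores; on a larger input the convolution yields a score map with one classification per window. Names and shapes are illustrative assumptions, and the loops favour clarity over speed.

```python
import numpy as np


def fc_forward(patch, weights, bias):
    """Fully connected layer on a flattened k x k patch -> class scores."""
    return weights @ patch.ravel() + bias


def conv_forward(image, weights, bias, k):
    """The same weights applied as k x k kernels slid over any image size."""
    kernels = weights.reshape(-1, k, k)  # one kernel per output class
    h, w = image.shape
    out = np.empty((kernels.shape[0], h - k + 1, w - k + 1))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            window = image[i:i + k, j:j + k]
            out[:, i, j] = np.tensordot(kernels, window, axes=([1, 2], [0, 1])) + bias
    return out


rng = np.random.default_rng(0)
weights, bias = rng.normal(size=(3, 16)), rng.normal(size=3)
score_map = conv_forward(rng.normal(size=(6, 7)), weights, bias, k=4)
print(score_map.shape)  # (3, 3, 4): 3 classes at 3 x 4 window positions
```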


--Segmentation Architecture
* Cast ILSVRC classifiers (image classification nets) into FCNs and augment them for dense prediction with in-network upsampling and a pixel-wise loss - train for segmentation by fine-tuning; utilize skip connections to fuse coarse, semantic info and local, appearance info (Figure 3 in paper below)
* Train with per-pixel multinomial logistic loss (softmax) - validate with mean pixel IoU (intersection over union: overlap between target and predicted mask) - both ignore pixels masked out in ground truth for being ambiguous / difficult
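
A small sketch of mean IoU with ignored pixels. The `ignore_label` value of 255 and the choice to skip classes absent from both masks are assumptions for this example, not details taken from the paper.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean intersection over union across classes, skipping ignored pixels."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    valid = target != ignore_label  # pixels masked out in the ground truth
    ious = []
    for c in range(num_classes):
        p = (pred == c) & valid
        t = (target == c) & valid
        union = (p | t).sum()
        if union == 0:
            continue  # class absent from both masks
        ious.append((p & t).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")
```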

--From classifier to dense FCN
* Convolutionalizing image classifiers - take models (AlexNet, GoogLeNet, VGG) and remove the final classification layer - append a 1x1 convolution layer with channel dimension 21 to predict scores at each coarse output location, followed by a deconvolution layer to bilinearly upsample the coarse outputs to pixel-dense outputs (bilinear upsampling = computing each output pixel by linear interpolation from nearby pixels)
* Address the coarseness of the segmentation by adding skips from lower layers with finer strides
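
Bilinear upsampling by a factor f can be expressed as a transposed ("backward") convolution with stride f and a fixed bilinear kernel. The NumPy sketch below shows the idea for a single channel; the kernel size `2f - f % 2` and the border crop follow a common convention and are assumptions here, not necessarily the paper's exact settings.

```python
import numpy as np


def bilinear_kernel(size):
    """2D bilinear interpolation kernel of the given side length."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    rows, cols = np.ogrid[:size, :size]
    return (1 - abs(rows - center) / factor) * (1 - abs(cols - center) / factor)


def upsample_bilinear(x, f):
    """Upsample a 2D map by factor f via transposed convolution."""
    size = 2 * f - f % 2
    kernel = bilinear_kernel(size)
    h, w = x.shape
    full = np.zeros(((h - 1) * f + size, (w - 1) * f + size))
    for i in range(h):
        for j in range(w):
            # Each coarse value stamps a scaled kernel onto the fine grid.
            full[i * f:i * f + size, j * f:j * f + size] += x[i, j] * kernel
    pad = (size - f) // 2
    return full[pad:pad + h * f, pad:pad + w * f]  # crop to exactly f times the input


print(upsample_bilinear(np.ones((4, 4)), 2).shape)  # (8, 8)
```

In an FCN the kernel can either stay fixed or serve as the initial value of a learnable deconvolution layer.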

https://arxiv.org/pdf/1605.06211.pdf

Caffe: Convolutional Architecture for Fast Feature Embedding
* Framework for deep learning algorithms and models; Python and MATLAB bindings
* Separates model representation from implementation, allowing seamless switching between platforms - from prototyping machines to cloud environments
* Caffe model definitions written in config files using Protocol Buffer language (similar to XML but more lightweight; used to structure serialized data)
* Spurred by reproducible research

--Architecture
* Caffe stores and communicates using 4-dimensional arrays (blobs)
* blobs provide unified memory interface - holds batch of images (or other forms of data), parameters, or parameter updates
* Data storage - more info in paper
* Layers - a Caffe layer takes one or more blobs as inputs and yields one or more blobs as outputs; responsible for the forward pass (inputs to outputs) and the backward pass (takes the gradient with respect to the output and computes the gradients with respect to the parameters and inputs, which are backpropagated to earlier layers)
* Layer types include convolution, pooling, inner-product, nonlinearities such as ReLU, local normalization, losses, etc.
* Caffe models are end-to-end ML systems
* Trains with fast, standard stochastic gradient descent
* Vital to training - learning rate decay schedules, momentum and snapshots for stopping and resuming
* Finetuning - adaptation of an existing model to a new architecture or data; Caffe implements it through snapshots of the existing model and a model definition for the new network - old model weights are finetuned for the new task and new weights are initialized as needed
* Finetuning (finer adjustments to improve performance) vs Transfer Learning (taking a previous model and using it for a new task)
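
The training ingredients mentioned above (SGD, momentum, learning rate decay) can be sketched in a few lines. This is plain Python, not Caffe code: the parameter names mirror Caffe's solver settings for the "step" policy (`base_lr`, `gamma`, `stepsize`, `momentum`), but the functions themselves are illustrative.

```python
def step_lr(base_lr, gamma, stepsize, iteration):
    """Step decay: multiply the rate by gamma every `stepsize` iterations."""
    return base_lr * gamma ** (iteration // stepsize)


def sgd_momentum_step(weights, velocity, grad, lr, momentum=0.9):
    """One SGD update with momentum; returns the new weights and velocity."""
    velocity = momentum * velocity - lr * grad
    return weights + velocity, velocity


# Toy usage: minimise f(w) = w^2, whose gradient is 2w.
w, v = 1.0, 0.0
for it in range(200):
    lr = step_lr(base_lr=0.1, gamma=0.5, stepsize=50, iteration=it)
    w, v = sgd_momentum_step(w, v, grad=2.0 * w, lr=lr)
```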

More info regarding usage and tutorial in paper / website

https://arxiv.org/pdf/1408.5093.pdf


PixelNet
https://www.cs.cmu.edu/~aayushb/pixelNet/pixelnet.pdf

Relevant Pieces for LV Segmentation Challenge
* https://www.midasjournal.org/browse/publication/686
* https://www.midasjournal.org/browse/publication/678
* https://www.midasjournal.org/browse/publication/683
* https://www.midasjournal.org/browse/publication/677
* https://www.midasjournal.org/browse/publication/679
* https://www.midasjournal.org/browse/publication/684
* https://www.midasjournal.org/browse/publication/691

### Other Terms


Cine MRI

* Cine studies obtained by repeatedly imaging the heart at a single slice location throughout the cardiac cycle

* Multiple lines of k-space (echoes) can be acquired in a single heart beat - data from multiple heart beats used to fill this k-space matrix (further research needed)

* To properly evaluate heart function, multiple slices at different locations needed
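
One reason a stack of slices is needed is volume estimation. A simple approach, which is an assumption of this sketch and not something the source above describes, sums the segmented LV area of each slice multiplied by the spacing between slices. The function name and units are illustrative.

```python
def volume_ml(slice_areas_mm2, slice_spacing_mm):
    """Approximate a volume from per-slice areas (slice summation).

    slice_areas_mm2: segmented LV cavity area of each slice, in mm^2
    slice_spacing_mm: distance between neighbouring slices, in mm
    """
    # 1 mL = 1000 mm^3
    return sum(slice_areas_mm2) * slice_spacing_mm / 1000.0


print(volume_ml([1000.0, 1200.0, 800.0], 8.0))  # 24.0
```

Applying this at the end-diastolic and end-systolic frames gives the two volumes from which an ejection fraction is computed.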

http://mriquestions.com/beating-heart-movies.html




DICOM (Digital Imaging and Communications in Medicine) - used to store, exchange, and transmit images.

* Images have associated patientID
* DICOM data object consists of number of attributes (name, ID, etc.)
* Special attribute containing pixel data (typically single image but can consist of multiple frames - allowing for multi-frame data)
* 3 data element encoding schemes - with explicit VR (VR = value representation), for VRs not in OB, OW, OF, SQ, UT, or UN, each data element is laid out as GROUP (2 bytes), ELEMENT (2 bytes), VR (2 bytes), length in bytes (2 bytes), then the data (variable length); for VRs in this list, and for implicit VR, see the table in the link below
* Value multiplicity gives the number of data elements (values) contained in an attribute (for strings, multiple values are encoded as successive data elements separated by the '\' character)
* To display an image, a lookup table maps digitally assigned pixel values to displayed values (allows for visual consistency) - DICOM grayscale standard display function (GSDF)
* DICOM Part 10 files - offline files (further info at link below)
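
The explicit-VR layout above can be decoded with the standard `struct` module. This sketch assumes little-endian byte order and only handles the short form (VRs outside OB, OW, OF, SQ, UT, UN); the function names are hypothetical, and real files are better read with a dedicated DICOM library.

```python
import struct

LONG_FORM_VRS = {"OB", "OW", "OF", "SQ", "UT", "UN"}


def parse_explicit_vr_element(buf, offset=0):
    """Decode one short-form explicit-VR element (little-endian assumed).

    Returns ((group, element), vr, value_bytes, offset_of_next_element).
    """
    group, element = struct.unpack_from("<HH", buf, offset)
    vr = buf[offset + 4:offset + 6].decode("ascii")
    if vr in LONG_FORM_VRS:
        raise NotImplementedError("long-form VRs use a different layout")
    (length,) = struct.unpack_from("<H", buf, offset + 6)
    start = offset + 8
    return (group, element), vr, buf[start:start + length], start + length


def split_values(value):
    """Split a multi-valued string attribute on the backslash separator."""
    return value.decode("ascii").rstrip(" \x00").split("\\")


# Hand-built example element: Patient ID (0010,0020), VR "LO", 6-byte value.
raw = struct.pack("<HH", 0x0010, 0x0020) + b"LO" + struct.pack("<H", 6) + b"PAT001"
tag, vr, value, next_offset = parse_explicit_vr_element(raw)
```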


https://en.wikipedia.org/wiki/DICOM



### Code

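In [0]:
# Enable the deb-src (source package) entries in the apt sources list.
# Keep a backup copy of the original file before modifying it.
with open('/etc/apt/sources.list') as f:
  txt = f.read()
with open('/etc/apt/sources.list.backup', 'w') as f:
  f.write(txt)
# Uncomment every '# deb-src' line so apt can fetch source packages.
with open('/etc/apt/sources.list', 'w') as f:
  f.write(txt.replace('# deb-src', 'deb-src'))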