# Introduction
(150 - 200 words)

The long acquisition time in fully sampled MRI leads to low patient throughput, problems with patient comfort and compliance, artefacts from patient motion, and high exam costs. Reducing acquisition time by under-sampling helps to mitigate these issues, but at the cost of reconstructed image quality. The aim of this machine learning task is to improve the viability of the more efficient under-sampling strategy by designing a system that maximises the quality of under-sampled reconstructions.

The MRI dataset provided contained raw k-space data from 100 3D volumes, each having approximately 30-40 2D slices. The data was split into training and testing sets in a 7:3 ratio. The training set contained only fully sampled k-space data, from which ground truth and under-sampled data could be derived. The test set contained 4-fold and 8-fold under-sampled k-space data as well as the 4-fold and 8-fold masks that generated this data.
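As a rough illustration of how ground truth and under-sampled data can be derived from fully sampled k-space, the sketch below applies a column mask and reconstructs both images with an inverse FFT. The mask pattern (`equispaced_mask`, every n-th column plus a central band) and the random stand-in k-space are assumptions made for illustration; the masks actually supplied with the dataset may follow a different pattern.

```python
import numpy as np

def equispaced_mask(num_cols, acceleration, num_center=16):
    """Hypothetical mask: keep every `acceleration`-th column plus a central band."""
    mask = np.zeros(num_cols, dtype=bool)
    mask[::acceleration] = True
    start = (num_cols - num_center) // 2
    mask[start:start + num_center] = True
    return mask

def undersample(kspace, mask):
    """Zero out the k-space columns that the mask does not keep."""
    return kspace * mask[np.newaxis, :]

def reconstruct(kspace):
    """Magnitude image from centred k-space via an inverse 2D FFT."""
    image = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(kspace)))
    return np.abs(image)

# Random complex array standing in for one fully sampled slice (not real data).
rng = np.random.default_rng(0)
kspace = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))

mask = equispaced_mask(64, acceleration=4, num_center=8)
ground_truth = reconstruct(kspace)                    # target image
zero_filled = reconstruct(undersample(kspace, mask))  # degraded network input
```

In this sketch the zero-filled reconstruction plays the role of the network input and the fully sampled reconstruction plays the role of the target.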

# Design
(450 - 600 words)

After some thorough research into various types of neural networks, we decided that the best model for this project would be a U-Net. This is because, at the time of writing, U-Nets are regarded as one of the best types of Convolutional Neural Network for biomedical imaging.

The major attraction of U-Nets for such tasks is their apparent resistance to overfitting.

[Ref 1] The performance of the U-Net models continues to increase with increasing model capacity, and even the largest model with over 200 million parameters is unable to overfit the training data.

The other types of Neural Networks that we looked at for this task were: 
- Convolutional Neural Network
- Recurrent Neural Network
- 

We chose to experiment with the following hyperparameters (change as necessary):
- Epochs
- Learning Rate
- Dropout Probability
- Step Size
- Batch Size

Guess we should explain what each are and why we c
In addition to this we altered the layers in the neural network...

# Implementation
(600 - 800 words)

Initially, we began implementing a U-Net from scratch; however, we quickly ran into problems that were difficult to diagnose. After some research, we found a pre-existing model provided in the Facebook Research fastMRI project (https://github.com/facebookresearch/fastMRI/tree/master/models/unet).

By opting to use a preexisting model, we could ensure that our basic model configuration was correct. It also allowed us to deduce that any issues along the way were caused by parts of the code that were not directly related to the model class.

The initial implementation of the neural network had the following structure. The model carries out down-sampling and up-sampling, resulting in the formation of two deep convolutional networks within the U-Net.
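To make the down-sampling and up-sampling paths more concrete, the sketch below traces channel counts and spatial sizes through a generic U-Net. It assumes that channels double and resolution halves at each pooling step, and that the input is 320 x 320; the function name `unet_shapes` and these sizes are illustrative assumptions, and the exact layer layout of the fastMRI model may differ.

```python
def unet_shapes(chans=8, num_pool_layers=4, size=320):
    """Trace (stage, channels, spatial size) through a generic U-Net layout."""
    stages = []
    ch = chans
    # Encoder: convolutions at this level, then 2x2 pooling halves the resolution.
    for _ in range(num_pool_layers):
        stages.append(("down", ch, size))
        size //= 2
        ch *= 2
    # Bottleneck at the lowest resolution.
    stages.append(("bottleneck", ch, size))
    # Decoder: up-sampling doubles the resolution and halves the channel count.
    for _ in range(num_pool_layers):
        size *= 2
        ch //= 2
        stages.append(("up", ch, size))
    return stages

for stage, ch, size in unet_shapes():
    print(f"{stage:<10} channels={ch:<4} size={size}x{size}")
```

The decoder mirrors the encoder, which is what allows skip connections to join feature maps of matching resolution.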

### Optimizer

We tested the model with a variety of different optimizers. Through experimentation, we found that the optimizer with the best and most consistent results across epochs was RMSprop.
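One way to swap optimizers without touching the rest of the training code is a small lookup table, sketched below. The helper name `build_optimizer` and the stand-in convolution layer are hypothetical and not taken from our notebook; the optimizer classes themselves are the standard ones in `torch.optim`.

```python
import torch

# Optimizers compared in this report, keyed by name.
OPTIMIZERS = {
    "Adagrad": torch.optim.Adagrad,
    "Adam": torch.optim.Adam,
    "ASGD": torch.optim.ASGD,
    "RMSprop": torch.optim.RMSprop,
    "SGD": torch.optim.SGD,
}

def build_optimizer(name, params, learning_rate=0.1, weight_decay=0.0):
    """Construct the named optimizer with shared hyperparameters."""
    return OPTIMIZERS[name](params, lr=learning_rate, weight_decay=weight_decay)

# Stand-in module used only so the example runs; the real model is the U-Net.
model = torch.nn.Conv2d(1, 8, kernel_size=3, padding=1)
optimizer = build_optimizer("RMSprop", model.parameters())
```

Keeping the learning rate and weight decay as shared arguments means each optimizer is constructed under the same settings.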

Experimentation details can be found in the Experiment section of this report.

### Training Validation Split

An essential part of the task was to monitor for signs of overfitting by splitting the training data into training and validation sets. Overfitting occurs when the model learns the training data too well, so that the training loss continues to decrease while the validation loss increases.

The mechanism for splitting the data is rather naive: it takes the first fraction x of the images as training data and the remaining 1-x as validation data. A future enhancement would be to randomise the selection, as this could improve model accuracy if images towards the end of the dataset differ structurally from those at the start.
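A randomised split could look something like the sketch below. The function name `split_indices`, the 0.8 fraction and the fixed seed are illustrative assumptions rather than values used in our experiments; the indices could refer to either slices or whole volumes.

```python
import random

def split_indices(num_items, train_fraction=0.8, seed=0):
    """Shuffle item indices and split them into training and validation sets."""
    indices = list(range(num_items))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    cut = int(num_items * train_fraction)
    return indices[:cut], indices[cut:]

train_idx, val_idx = split_indices(10, train_fraction=0.8, seed=0)
print("train:", train_idx)
print("validation:", val_idx)
```

Fixing the seed keeps experiments comparable, since every run is then validated on the same held-out items.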

This allows for each epoch to run a 'training epoch' and a 'validation epoch', as shown in the code snippet below.

As epochs progress, we monitor the training and validation loss, storing values into an array. Doing this allows us to plot a graph of the training/validation loss over time, as shown below.

In [None]:
# Run epochs for training
from torch.optim.lr_scheduler import StepLR

# StepLR sets the learning rate of each parameter group to the initial lr decayed by gamma every step_size epochs.
scheduler = StepLR(optimizer, step_size, lr_gamma)
current_epoch = 0
# record loss over time for plotting
train_loss_ot = []
val_loss_ot = []

print("Training on " + str(len(train_data)))
print("Validating on " + str(len(val_data)))
    
# run model epochs
for epoch in range(current_epoch, epochs):
    print("Epoch: " + str(epoch+1) + "/" + str(epochs))
    train_loss, train_time = training_epoch(epoch, model, train_loader, optimizer) # run a training epoch
    val_loss, val_time = validation_epoch(epoch, model, val_loader, optimizer) # run a validation epoch
    train_loss_ot.append(train_loss)
    val_loss_ot.append(val_loss)
    print(" Train Loss: " + str(train_loss) + " | Validation Loss: " + str(val_loss)) # print loss for the completed epoch
    print("Train Time: " + str(train_time) + " | Validation Time: " + str(val_time))
    scheduler.step() # step the scheduler after the epoch's optimizer updates; passing the epoch number is deprecated
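
The recorded loss lists can also be checked programmatically for the overfitting pattern described earlier. The sketch below flags epochs where the training loss fell while the validation loss rose; the function name `overfitting_epochs` and the example loss values are made up for illustration and are not results from our runs.

```python
def overfitting_epochs(train_losses, val_losses):
    """Return 1-based epochs where training loss fell but validation loss rose."""
    flagged = []
    for epoch in range(1, len(train_losses)):
        train_down = train_losses[epoch] < train_losses[epoch - 1]
        val_up = val_losses[epoch] > val_losses[epoch - 1]
        if train_down and val_up:
            flagged.append(epoch + 1)
    return flagged

# Illustrative values only: validation loss turns upwards from epoch 4.
example_train = [0.10, 0.08, 0.07, 0.06, 0.05]
example_val = [0.11, 0.09, 0.08, 0.085, 0.09]
print(overfitting_epochs(example_train, example_val))  # [4, 5]
```

A run of consecutive flagged epochs is a stronger signal than a single one, since the validation loss can fluctuate from epoch to epoch.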

#### insert image of some graphs here


### Performance Measure



To measure performance, we used training and validation loss, as discussed above, as well as SSIM.

SSIM (Structural Similarity Index Measure) is a measurement of how structurally similar two images are, making it a form of accuracy measure. This allows us to calculate the average 'accuracy' of the model's generated images against their corresponding ground truths.
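
For reference, the standard definition of SSIM between two image windows $x$ and $y$ is given below. The symbols are the usual ones from that definition rather than names from our code: $\mu$ denotes a window mean, $\sigma^2$ a variance and $\sigma_{xy}$ the covariance.

$$
\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}
$$

Here $C_1 = (k_1 L)^2$ and $C_2 = (k_2 L)^2$ are small stabilising constants and $L$ is the data range of the images. The score is computed over local windows and averaged, and it equals 1 when the two images are identical.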

In [None]:
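# compare_ssim is the name used by older scikit-image releases;
# newer releases provide skimage.metrics.structural_similarity instead
from skimage.measure import compare_ssim

def ssim(gt, pred):
    """ Compute Structural Similarity Index Metric (SSIM). Requires 3D NumPy arrays as input. """
    return compare_ssim(
        gt.transpose(1,2,0), pred.transpose(1,2,0), multichannel=True, data_range=gt.max()
    )

# calculate average SSIM for training and validation data
length = len(gts)
ssim_comb = 0
for i in range(0,length):
    ssim_comb += ssim(gts[i], preds[i])

# use a separate name so the ssim() function is not overwritten
avg_ssim = ssim_comb / length

print("Average SSIM: " + str(avg_ssim))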

# Experiments
(1350 - 1800 words)

Note: This is just an example of the kind of layout and stuff to put in this section...we'll need to choose the most worthwhile experimentations to talk about

### Experimenting With Hyperparameters

| Epochs  | Learning Rate | Dropout Probability  | Step Size | Training Loss  | Validation Loss | SSIM |
|---|---|---|---|---|---|---|
|30 |0.01|0.001|15|0.0488 |0.0448|0.41407|
|20 |0.1|0.001|15|0.056 |0.054 |0.41165|
|20 |0.1|0.01 |15|0.0685|0.0642|0.42711|
|50 |0.1|0.01 |15|0.0659|0.0641|0.41018|
|75 |0.1|0.01 |15|0.0648|0.0622|0.42522|
|20 |0.1|0.01 |25|0.1144|0.1141|0.38437|
|75 |0.1|0.01 |25|0.0581|0.0549|0.46331|




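As the table above shows, a lower training/validation loss does not necessarily mean a higher average SSIM. The first row has the lowest validation loss (0.0448) but an SSIM of only 0.41407, whereas the final row has a higher validation loss (0.0549) and the highest SSIM (0.46331)...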
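
The mismatch between loss and SSIM can be checked directly from the table. The sketch below copies the rows of the hyperparameter table into a list and picks the best run under each criterion; the variable names are illustrative only.

```python
# Rows copied from the hyperparameter table:
# (epochs, learning rate, dropout, step size, train loss, val loss, SSIM)
runs = [
    (30, 0.01, 0.001, 15, 0.0488, 0.0448, 0.41407),
    (20, 0.1, 0.001, 15, 0.056, 0.054, 0.41165),
    (20, 0.1, 0.01, 15, 0.0685, 0.0642, 0.42711),
    (50, 0.1, 0.01, 15, 0.0659, 0.0641, 0.41018),
    (75, 0.1, 0.01, 15, 0.0648, 0.0622, 0.42522),
    (20, 0.1, 0.01, 25, 0.1144, 0.1141, 0.38437),
    (75, 0.1, 0.01, 25, 0.0581, 0.0549, 0.46331),
]

best_by_loss = min(runs, key=lambda run: run[5])  # lowest validation loss
best_by_ssim = max(runs, key=lambda run: run[6])  # highest SSIM

print("Best by validation loss:", best_by_loss)
print("Best by SSIM:", best_by_ssim)
```

The two criteria select different rows, which suggests that model selection should consider SSIM directly rather than relying on the loss alone.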

#### Channels

|Fold number|Channels|Training Loss|Validation Loss|SSIM Accuracy|
|---|---|---|---|---|
|4|8|0.052945|0.049208|0.41429|
|4|16|0.049980|0.046485|0.37170|
|4|32|0.052396|0.049223|0.37234|
|4|64|0.050388|0.048842|0.40308|
|4|128||||
|8|8||||
|8|16||||
|8|32||||
|8|64||||
|8|128||||

### Experimenting with Optimizers

To ensure all optimizers were tested equally, a consistent setting was used for all other hyperparameters, as shown below. The values were chosen at random, and are not drawn from the results of hyperparameter testing.

    epochs = 30
    dropout_prob = 0.01
    learning_rate = 0.1
    weight_decay = 0.0
    step_size = 15
    lr_gamma = 0.1
    num_pool_layers = 4
    chans = 8             # num of channels
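
With these settings, the step decay applied by StepLR can be worked out by hand: the learning rate at a given epoch is the initial rate multiplied by `lr_gamma` once for every completed block of `step_size` epochs. The helper `step_lr` below is a plain-Python sketch of that formula, not part of PyTorch.

```python
def step_lr(initial_lr, gamma, step_size, epoch):
    """Learning rate at a 0-based epoch under step decay."""
    return initial_lr * gamma ** (epoch // step_size)

# Settings from the block above: learning_rate=0.1, lr_gamma=0.1, step_size=15.
schedule = [step_lr(0.1, 0.1, 15, epoch) for epoch in range(30)]

print(f"epochs 1-15:  lr = {schedule[0]:.3f}")
print(f"epochs 16-30: lr = {schedule[15]:.3f}")
```

So over 30 epochs the optimizers train at 0.1 for the first half and at roughly 0.01 for the second half.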

The table below lists the SSIM measure and final training/validation loss for each optimizer at both under-sampling rates.

|Fold number|Optimizer|Training Loss|Validation Loss|SSIM Accuracy|
|---|---|---|---|---|
|4|Adagrad|0.051821|0.048778|0.41008|
|4|Adam|0.049791|0.049859|0.37895|
|4|ASGD (Averaged Stochastic Gradient Descent)|0.052209|0.050458|0.40193|
|4|RMSprop|0.067036|0.064925|0.42793|
|4|SGD (Stochastic Gradient Descent)|0.069914|0.069806|0.31157|
|8|Adagrad|0.073590|0.074585|0.32659|
|8|Adam|0.074420|0.077764|0.39571|
|8|ASGD (Averaged Stochastic Gradient Descent)|0.073846|0.074285|0.37586|
|8|RMSprop|0.078464|0.081506|0.38945|
|8|SGD (Stochastic Gradient Descent)|0.076306|0.071779|0.38620|



RMSprop gave the highest SSIM after 30 epochs for 4-fold under-sampling (0.42793), although Adam scored slightly higher for 8-fold (0.39571 against 0.38945). It could also be argued that other optimizers, such as Adam, would perform better with a higher number of epochs.
This is because, from the graphs displayed below, it appears that RMSprop hits its peak at around 20 epochs...

4 Fold Graphs:

|Adagrad|Adam|ASGD|RMSprop|SGD|
|---|---|---|---|---|
|![](loss-4f-30adagrad.png)|![](loss-4f-30adam.png)|![](loss-4f-30asgd.png)|![](loss-4f-30rmsprop.png)|![](loss-4f-30sgd.png)|

8 Fold Graphs:

|Adagrad|Adam|ASGD|RMSprop|SGD|
|---|---|---|---|---|
|![](loss-8f-30adagrad.png)|![](loss-8f-30adam.png)|![](loss-8f-30asgd.png)|![](loss-8f-30rmsprop.png)|![](loss-8f-30sgd.png)|



# Conclusions
(300 - 400 words)

# Description of Contribution
(150 - 200 words)

# Reference List

## References
1. https://arxiv.org/pdf/1811.08839.pdf
2. https://arxiv.org/ftp/arxiv/papers/1704/1704.06825.pdf