# LUNA Train Unet

# Dependency Descriptions
1. **keras**: a high-level neural networks library that allows for easy and fast prototyping

In [1]:
from __future__ import print_function

import numpy as np
from keras.models import Model
from keras.layers import Input, merge, Convolution2D, MaxPooling2D, UpSampling2D
from keras.optimizers import Adam
from keras.optimizers import SGD
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras import backend as K

Using TensorFlow backend.


In [2]:
WORKING_PATH = "../../../../output/build-simple-model/"
IMG_ROWS = 512
IMG_COLS = 512

K.set_image_dim_ordering('th')  # use Theano dimension ordering in this code
# dimension ordering is the order of the axes in an image tensor;
# Theano's convention is channels first: (samples, channels, rows, cols)
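
To make the `'th'` ordering concrete, here is a small NumPy sketch. The array names and the tiny 4x4 image size are made up for illustration; the real images here are 512x512 with a single channel.

```python
import numpy as np

# Hypothetical batch of 2 single-channel slices (4x4 instead of 512x512).
batch_th = np.zeros((2, 1, 4, 4), dtype=np.float32)  # (samples, channels, rows, cols)

# Moving the channel axis to the end gives the TensorFlow-style layout.
batch_tf = np.transpose(batch_th, (0, 2, 3, 1))      # (samples, rows, cols, channels)

print(batch_th.shape)  # (2, 1, 4, 4)
print(batch_tf.shape)  # (2, 4, 4, 1)
```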

**[Dice Coefficient Loss Function](https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient)**: compares the predicted and actual node masks (a metric similar to the one used in the Ultrasound Nerve Segmentation challenge, which this U-net code was originally written for)
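
A minimal NumPy sketch of the Dice coefficient on two small masks may help. The function name, the `smooth` term (added so that two empty masks do not divide by zero), and negating the score to get a loss are all assumptions for illustration, not necessarily how the tutorial's `dice_coef` and `dice_coef_loss` are written.

```python
import numpy as np

def dice_coefficient(mask_true, mask_pred, smooth=1.0):
    """Dice overlap: 2*|A n B| / (|A| + |B|), with a smoothing term (assumed)."""
    t = mask_true.ravel().astype(np.float64)
    p = mask_pred.ravel().astype(np.float64)
    intersection = np.sum(t * p)
    return (2.0 * intersection + smooth) / (np.sum(t) + np.sum(p) + smooth)

# Toy masks: 3 pixels set in each, 2 of them overlap.
true_mask = np.array([[0, 1, 1], [0, 1, 0]])
pred_mask = np.array([[0, 1, 0], [0, 1, 1]])

score = dice_coefficient(true_mask, pred_mask)  # (2*2 + 1) / (3 + 3 + 1)
loss = -score  # one common way to turn "bigger is better" into something to minimize
```

A perfect prediction gives a score of 1, and no overlap gives a score close to 0.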


First get everything working exactly as in the tutorial, so you are sure you understand their code and how it works. Then change the code gradually to fit your own ideas; that way any errors come from your changes rather than from mistakes made while copying the tutorial's code.

Therefore training and prediction will first use the usual train/test split that the tutorial recommends. Once the tutorial runs successfully, 10-fold cross-validation can replace that split to choose a model, which is then trained on the entire dataset and used to predict.

# Understanding Of Sequential Order of Code
## Loading / Preprocessing Training Data
```python
# WORKING_PATH is the constant defined in cell In [2]
imgs_train = np.load(WORKING_PATH+"trainImages.npy").astype(np.float32)
imgs_mask_train = np.load(WORKING_PATH+"trainMasks.npy").astype(np.float32)

imgs_test = np.load(WORKING_PATH+"testImages.npy").astype(np.float32)
imgs_mask_test_true = np.load(WORKING_PATH+"testMasks.npy").astype(np.float32)
    
mean = np.mean(imgs_train)  # mean for data centering
std = np.std(imgs_train)  # std for data normalization

imgs_train -= mean  # images should already be standardized, but just in case
imgs_train /= std
```
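
The snippet above only standardizes the training images. As a sketch of what standardization does, and of the common practice of reusing the training statistics on held-out data, here is a version on random stand-in arrays. The arrays and their sizes are invented, and applying the same mean and std to the test set is an assumption, not something the tutorial code shown here does.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for trainImages.npy / testImages.npy, shaped (samples, channels, rows, cols).
train = rng.normal(loc=5.0, scale=2.0, size=(8, 1, 4, 4)).astype(np.float32)
test = rng.normal(loc=5.0, scale=2.0, size=(4, 1, 4, 4)).astype(np.float32)

mean = np.mean(train)  # statistics come from the training set only
std = np.std(train)

train_standardized = (train - mean) / std
test_standardized = (test - mean) / std  # reuse training mean/std (assumed practice)

print(train_standardized.mean(), train_standardized.std())
```

After this step the training data has mean close to 0 and standard deviation close to 1.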

## Actually Creating the Unet
*goal: to understand exactly how this set of code works*
## Steps:
1. Create the initial structure of a Unet
2. Create checkpoints for the unet to save its best weights (at that time period)
3. Give the unet an initial set of weights (optional)
4. Train the unet on training data (consisting of lung image, and node mask)

## Getting the Unet
```python
# where the return of the function should give you the "model"
model = get_unet()
```

## Creating the Unet 
*using keras define the initial structure of the model (layers, nodes, etc...)*

**so how does this code create the structure of a model?**

## Research into Unet
### To Understand Everything:
1. Go through [this guide](https://keras.io/getting-started/sequential-model-guide/)
2. Go through [other guide](https://keras.io/getting-started/functional-api-guide/)

### Sequential Models ([reference](https://keras.io/getting-started/sequential-model-guide/))
- Sequential Model: linear stack of layers
- tell the model what input shape to expect (the first layer must receive info about the input shape)
- before training a model, configure the learning process, which requires a `compile` method which contains:
  1. an optimizer: from existing optimizers, or instance of optimizer class, [reference](https://keras.io/optimizers/)
  2. a loss function: the objective the model will try to minimize, either an existing loss function or a custom objective function, [reference](https://keras.io/objectives/)
     - note custom objective functions have specific structures (like must have y_true, y_pred and return a scalar)
  3. list of metrics: existing metrics, custom metrics must return single tensor value, [reference](https://keras.io/metrics/)
- keras models: trained on Numpy arrays of input data and labels, use the `fit` function (sequential model api [complete reference](https://keras.io/models/sequential/))

#### Examples 
- *[complete examples folder](https://github.com/fchollet/keras/tree/master/examples)*
- good demonstration of CNN ([here](https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py))

### Getting started with the Keras functional API [reference](https://keras.io/getting-started/functional-api-guide/#the-concept-of-layer-node)
- Keras functional API: allows you to define complex models
- layer instances are callable on a tensor, and calling one returns a new tensor
  - then the input and output tensors are used to define a `Model`
  - understand tensors below in the section called "**Tensor Understanding**"
- then the model is trained exactly the same way as a `Sequential` model
  - perfect example:
    ```python
    from keras.layers import Input, Dense
    from keras.models import Model

    # this returns a tensor
    inputs = Input(shape=(784,)) # your inputs are a tensor

    # a layer instance is callable on a tensor, and returns a tensor
    x = Dense(64, activation='relu')(inputs) # apply a Dense layer to the input tensor
    x = Dense(64, activation='relu')(x)      # and so on and so forth
    predictions = Dense(10, activation='softmax')(x) # final output layer

    # this creates a model that includes
    # the Input layer and three Dense layers
    model = Model(input=inputs, output=predictions) 
    # input tensor, and output tensor wraps everything together
    # the rest is directly from `Sequential` models
    model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(data, labels)  # starts training
    ```
- the entire created model can itself be treated like a layer: it is callable on a tensor and can be reused inside another model!

#### Examples!
- [here](https://keras.io/getting-started/functional-api-guide/#more-examples)

### Tensor Understanding
- [wikipedia definition](https://en.wikipedia.org/wiki/Tensor): tensors are geometric objects that describe linear relations between geometric vectors, scalars, and other tensors (like the dot product, cross product or even linear maps)
  - Given a coordinate basis or fixed frame of reference, a tensor can be represented as an organized multidimensional array of numerical values.
  - The order (also degree or rank) of a tensor is the dimensionality of the array needed to represent it, or equivalently, the number of indices needed to label a component of that array. For example, a linear map is represented by a matrix (a 2-dimensional array) in a basis, and therefore is a 2nd-order tensor. 
  - there are many definitions (the definitions describe the same geometric concept in different languages and differing levels of abstraction)
- simple machine learning definition: multidimensional arrays (generalizing arrays and matrices), [reference](http://stats.stackexchange.com/questions/144860/how-are-tensors-used-in-neural-networks)
  - more explanations [here](http://stats.stackexchange.com/questions/198061/why-the-sudden-fascination-with-tensors)
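
In the machine learning sense, the order of a tensor is just the number of axes of the array. A quick NumPy check (the example arrays are made up; the last one mimics a small batch of single-channel images):

```python
import numpy as np

scalar = np.array(3.0)              # order 0: a single number
vector = np.array([1.0, 2.0, 3.0])  # order 1
matrix = np.eye(2)                  # order 2
images = np.zeros((2, 1, 4, 4))     # order 4: (samples, channels, rows, cols)

orders = [a.ndim for a in (scalar, vector, matrix, images)]
print(orders)  # [0, 1, 2, 4]
```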

## Understanding Our Convolutional Network
```python
def get_unet():
    inputs = Input((1,IMG_ROWS, IMG_COLS)) # input tensor with size of imgs (1, 512, 512)
    
    # a stack of layers, each one called on the output tensor of the previous one
    # contracting path: conv, conv, pool (repeated); expanding path: upsample + merge, conv, conv
      # this down-then-up shape with merged skip connections is how U-nets work
    conv1 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(inputs)
    conv1 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

    conv2 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(pool1)
    conv2 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)

    conv3 = Convolution2D(128, 3, 3, activation='relu', border_mode='same')(pool2)
    conv3 = Convolution2D(128, 3, 3, activation='relu', border_mode='same')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)

    conv4 = Convolution2D(256, 3, 3, activation='relu', border_mode='same')(pool3)
    conv4 = Convolution2D(256, 3, 3, activation='relu', border_mode='same')(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(conv4)

    conv5 = Convolution2D(512, 3, 3, activation='relu', border_mode='same')(pool4)
    conv5 = Convolution2D(512, 3, 3, activation='relu', border_mode='same')(conv5)

    up6 = merge([UpSampling2D(size=(2, 2))(conv5), conv4], mode='concat', concat_axis=1)
    conv6 = Convolution2D(256, 3, 3, activation='relu', border_mode='same')(up6)
    conv6 = Convolution2D(256, 3, 3, activation='relu', border_mode='same')(conv6)

    up7 = merge([UpSampling2D(size=(2, 2))(conv6), conv3], mode='concat', concat_axis=1)
    conv7 = Convolution2D(128, 3, 3, activation='relu', border_mode='same')(up7)
    conv7 = Convolution2D(128, 3, 3, activation='relu', border_mode='same')(conv7)

    up8 = merge([UpSampling2D(size=(2, 2))(conv7), conv2], mode='concat', concat_axis=1)
    conv8 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(up8)
    conv8 = Convolution2D(64, 3, 3, activation='relu', border_mode='same')(conv8)

    up9 = merge([UpSampling2D(size=(2, 2))(conv8), conv1], mode='concat', concat_axis=1)
    conv9 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(up9)
    conv9 = Convolution2D(32, 3, 3, activation='relu', border_mode='same')(conv9)

    conv10 = Convolution2D(1, 1, 1, activation='sigmoid')(conv9) # final 1x1 conv: one value in (0, 1) per pixel

    # create the Model by telling it its input/output
    # where output is a tensor of other layers
    model = Model(input=inputs, output=conv10) 

    # compile the model with our very own loss and metric, and the existing Adam optimizer
    model.compile(optimizer=Adam(lr=1.0e-5), loss=dice_coef_loss, metrics=[dice_coef])

    return model
```
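
To see why the layers line up, it helps to trace the tensor shapes through `get_unet()` by hand. The sketch below does that bookkeeping in plain Python, assuming `'same'` padding (convolutions keep rows and cols), 2x2 pooling (halves them), 2x2 upsampling (doubles them), and concatenation along the channel axis. The function name `unet_shapes` is made up for this illustration; no Keras is involved.

```python
def unet_shapes(rows=512, cols=512, filters=(32, 64, 128, 256, 512)):
    """Trace (channels, rows, cols) through the contracting and expanding paths."""
    shapes = []
    skips = []
    r, c = rows, cols
    # Contracting path: each conv block keeps the size, each pool halves it.
    for f in filters[:-1]:
        shapes.append(("conv", (f, r, c)))
        skips.append((f, r, c))          # remembered for the merge later
        r, c = r // 2, c // 2
        shapes.append(("pool", (f, r, c)))
    # Bottom of the "U".
    ch = filters[-1]
    shapes.append(("conv", (ch, r, c)))
    # Expanding path: upsample, concatenate with the matching skip, then conv.
    for f in reversed(filters[:-1]):
        r, c = r * 2, c * 2
        skip_channels = skips.pop()[0]
        shapes.append(("merge", (ch + skip_channels, r, c)))
        shapes.append(("conv", (f, r, c)))
        ch = f
    shapes.append(("output", (1, r, c)))  # final 1x1 convolution
    return shapes

for name, shape in unet_shapes():
    print(name, shape)
```

The first merge has 512 + 256 = 768 channels at 64x64, and the output is back at (1, 512, 512), the same size as the input mask.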

## Clarification:
1. what are those tensors?
   1. what do they do?
   2. what are those parameters within them?
2. why those tensors in that specific order?
   1. how does it create the model we want to do segmentation?
3. how does U-net relate? 
   1. what makes U-net different than a NN, or a CNN?
4. what is the Adam optimizer?
5. what does our custom loss and metric do? how does it do it?

**look at examples, documentation (of both keras and U-net) to answer the above questions to completely understand all the code**

## Clarification Answers:
### Convolutional Neural Networks
- 

### Convolution2D
- [documentation](https://keras.io/layers/convolutional/#convolution2d)
- if convolutions and CNNs are not yet understood, none of this will make sense; refer to "**Convolutional Neural Networks**" above
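
As a starting point, here is a rough NumPy sketch of what a single 3x3 filter with `border_mode='same'` and `activation='relu'` does to one image. In `Convolution2D(32, 3, 3, ...)` the arguments are 32 filters of size 3x3, and the filter values are learned; the averaging kernel below is just a stand-in. The sketch does not flip the kernel, which makes no difference for a symmetric kernel like this one.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Slide an odd-sized kernel over the image, zero-padding so the size is kept."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="constant")
    out = np.zeros_like(image, dtype=np.float64)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # Weighted sum of the neighbourhood around pixel (i, j).
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=np.float64).reshape(4, 4)
kernel = np.ones((3, 3)) / 9.0                        # stand-in for a learned filter
feature_map = np.maximum(conv2d_same(image, kernel), 0.0)  # ReLU keeps positives

print(feature_map.shape)  # (4, 4)
```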

### MaxPooling2D
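A small NumPy sketch of what `MaxPooling2D(pool_size=(2, 2))` does to one feature map: every non-overlapping 2x2 block is replaced by its maximum, so rows and cols are halved. The helper name and the example values are made up, and the sketch assumes even dimensions.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Take the max of each non-overlapping 2x2 block (even dimensions assumed)."""
    rows, cols = feature_map.shape
    blocks = feature_map.reshape(rows // 2, 2, cols // 2, 2)
    return blocks.max(axis=(1, 3))

fm = np.array([[1, 2, 5, 6],
               [3, 4, 7, 8],
               [9, 1, 2, 3],
               [0, 5, 4, 1]])
pooled = max_pool_2x2(fm)
print(pooled)  # [[4 8]
               #  [9 4]]
```
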
### UpSampling2D
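`UpSampling2D(size=(2, 2))` goes the other way: it repeats each row and each column twice, doubling rows and cols without learning anything. A NumPy sketch (the helper name and values are made up):

```python
import numpy as np

def upsample_2x2(feature_map):
    """Repeat every row and every column twice."""
    return np.repeat(np.repeat(feature_map, 2, axis=0), 2, axis=1)

small = np.array([[1, 2],
                  [3, 4]])
big = upsample_2x2(small)
print(big)  # [[1 1 2 2]
            #  [1 1 2 2]
            #  [3 3 4 4]
            #  [3 3 4 4]]
```
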
### merge
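With `mode='concat', concat_axis=1` and channels-first ordering, `merge` stacks two tensors along the channel axis, so they must agree on every other axis. A NumPy sketch using the channel counts of `up6` (512 from the upsampled `conv5`, 256 from `conv4`) but a tiny made-up 8x8 spatial size:

```python
import numpy as np

# (samples, channels, rows, cols); 8x8 is a stand-in for the real 64x64.
upsampled = np.zeros((1, 512, 8, 8))
skip = np.ones((1, 256, 8, 8))

merged = np.concatenate([upsampled, skip], axis=1)  # stack along the channel axis
print(merged.shape)  # (1, 768, 8, 8)
```
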
### Structure of Layers
### Adam Optimizer
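Adam is gradient descent with two running averages: one of the gradient (momentum) and one of the squared gradient (a per-parameter step scale). Below is a NumPy sketch of the standard update rule, minimizing a toy function `(theta - 3)^2`. The function name, the toy objective, and the large learning rate of 0.1 are choices for illustration only; the model here uses `Adam(lr=1.0e-5)`.

```python
import numpy as np

def adam_minimize(grad_fn, theta, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Run `steps` Adam updates starting from `theta` and return the result."""
    m = np.zeros_like(theta)  # running average of gradients
    v = np.zeros_like(theta)  # running average of squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

# Gradient of (theta - 3)^2 is 2 * (theta - 3); the minimum is at theta = 3.
theta_final = adam_minimize(lambda th: 2.0 * (th - 3.0), np.array([0.0]))
print(theta_final)
```
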
### Custom Loss and Metric
#### Dice Coefficient
#### Understanding of Code for Function
#### Difference Between Loss and Metric


## Save the trained model at checkpoints
```python
model_checkpoint = ModelCheckpoint('unet.hdf5', monitor='loss', save_best_only=True)
```
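
The idea behind `monitor='loss', save_best_only=True` can be sketched in plain Python: after each epoch, the weights are written only if the monitored loss is lower than the best seen so far. This is a sketch of the idea, not Keras's implementation, and the loss values below are invented.

```python
def epochs_that_save(losses):
    """Return the epoch indices at which a save-best-only checkpoint would write."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(losses):
        if loss < best:  # lower loss is better
            best = loss
            saved.append(epoch)
    return saved

# Hypothetical per-epoch training losses.
saved_epochs = epochs_that_save([-0.10, -0.25, -0.20, -0.31, -0.30])
print(saved_epochs)  # [0, 1, 3]
```

So `unet.hdf5` always holds the weights from the best epoch so far, not necessarily the last one.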

## Use weights given by tutorial
```python
if use_existing:
    model.load_weights('./unet.hdf5')
```
## Train model on training data
```python
model.fit(imgs_train, imgs_mask_train, batch_size=2, nb_epoch=20, verbose=1, shuffle=True,
              callbacks=[model_checkpoint])
```
_**The final weights are what you want: load them into the model and you can start making predictions**_