# Skin Cancer Classification -- Minihackathon

<img src="images/skin.jpg" style="width: 256px;">

The Mini-Hackathon is based on work by Andre Esteva et al.

In this notebook, you will be shown the ropes of transfer learning. First you train a model from scratch. Then you load the pre-trained InceptionV3 model and train a new classifier on top of it.

Afterwards you will be shown some options to modify your model further via layer unfreezing and early stopping.
You can skip the unfreezing and early stopping parts; they serve the tutorial part of the hackathon.

The aim of the hackathon is to create the model with the highest classification accuracy.

Note that the images are complex and training times might be significant.

The original challenge related to this hackathon is [here](https://challenge.kitware.com/#phase/5840f53ccad3a51cc66c8dab).
See also [Udacity's wrapper on the contest](https://github.com/udacity/dermatologist-ai) and
[dasoto](https://dasoto.github.io/skincancer/).


## Step 1 : Import modules 

InceptionV3: the InceptionV3 model

Dense, Dropout: fully connected and dropout layers used in the CNN

In [None]:
### switch off deprecation and future warnings
import warnings

def fxn():
    warnings.warn("deprecated", DeprecationWarning)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    fxn()
warnings.simplefilter(action='ignore', category=FutureWarning)

In [None]:
# Limit GPU Usage
import tensorflow as tf
config = tf.compat.v1.ConfigProto() # Tensorflow 2.0 version
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)

In [None]:
# Load libraries required for transfer learning
# Before importing any package, install it first with a command such as conda install tensorflow.
from keras.applications.inception_v3 import InceptionV3, preprocess_input,decode_predictions
from keras.preprocessing import image
import numpy as np
from keras.layers import Dense, GlobalAveragePooling2D,Dropout,Input
from keras.models import Sequential, Model
from keras import backend as K
from IPython.display import display
import matplotlib.pyplot as plt

## Step 2 : Pre-process data, create image generator
 
Create an image data generator using the ImageDataGenerator class. The generator makes it easy to load and augment the data.

In [None]:
#Define the dictionary for Image data Generator
data_gen_args = dict(
    preprocessing_function=preprocess_input,
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    vertical_flip = True
)

# create two instances with the same arguments for training and validation data
train_datagen = image.ImageDataGenerator(**data_gen_args)
test_datagen = image.ImageDataGenerator(**data_gen_args)
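
For orientation, the `preprocess_input` function of `keras.applications.inception_v3` rescales 8-bit pixel values from [0, 255] to [-1, 1]. The following minimal sketch reproduces that arithmetic in plain Python; the helper name `inception_style_scale` is hypothetical and is not part of Keras.

```python
def inception_style_scale(pixels):
    """Rescale 8-bit pixel values from [0, 255] to [-1.0, 1.0]."""
    return [p / 127.5 - 1.0 for p in pixels]

# Black, mid-grey and white pixel values
scaled = inception_style_scale([0.0, 127.5, 255.0])
print(scaled)
```

Because the generator applies this function to every image, the network never sees raw 0 to 255 values.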

### 2.1 Data parsing

Load the data using the `flow_from_directory` method of the data generator, which takes the path to a directory and generates batches of augmented/normalized data.

In [None]:
train_generator = train_datagen.flow_from_directory(
    "Data_Minihackathon/train",
    target_size=(299,299),
    batch_size=100
)

valid_generator = test_datagen.flow_from_directory(
    "Data_Minihackathon/valid",
    target_size=(299,299),
    batch_size=100
)
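
`flow_from_directory` infers one class per subdirectory, sorts the class names alphanumerically, and (with the default `class_mode='categorical'`) yields one-hot labels. A rough sketch of that mapping, and of how many batches make up one full pass over the data, is shown below. The class names other than `melanoma` and the sample count of 2000 are assumptions made for illustration.

```python
import math

# Hypothetical subdirectory names; only "melanoma" appears in this notebook
class_dirs = ["nevus", "melanoma", "seborrheic_keratosis"]

# Class names are sorted, then mapped to integer indices
class_indices = {name: i for i, name in enumerate(sorted(class_dirs))}

def one_hot(index, num_classes):
    """Return a one-hot list with a 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

label = one_hot(class_indices["nevus"], len(class_indices))

# Batches needed to see every image once (assumed 2000 training images)
num_samples = 2000
batch_size = 100
steps_for_full_epoch = math.ceil(num_samples / batch_size)

print(class_indices, label, steps_for_full_epoch)
```

With `steps_per_epoch = 2` and `batch_size=100`, as in the training cells of this notebook, each epoch sees only 200 images, so raising `steps_per_epoch` is one way to use more of the data.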

In [None]:
#take a look at one image
z = plt.imread("Data_Minihackathon/test/melanoma/ISIC_0013766.jpg")
plt.imshow(z)

## Step 3:  Model definition

In [None]:
# Define your CNN model such as 3 convolutional layers with max pooling and 2 fully connected layers with dropout here:
#  e.g. conv2d—>maxpooling—>conv2d—>maxpooling—>conv2d—>maxpooling—> dropout—>Flatten—>Dense—>Dropout—>Dense

from keras.layers import Conv2D,MaxPooling2D,Flatten

model = Sequential()
model.add(Conv2D(filters = 16, kernel_size = 2, padding = 'same', activation = 'relu', input_shape = (299,299,3)))
model.add(MaxPooling2D(pool_size=2,padding='same'))
model.add(Conv2D(filters = 32, kernel_size = 2, padding = 'same', activation = 'relu'))
model.add(MaxPooling2D(pool_size=2,padding='same'))
model.add(Conv2D(filters = 64, kernel_size = 2, padding = 'same', activation = 'relu'))
model.add(MaxPooling2D(pool_size=2,padding='same'))
model.add(Dropout(0.3))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(3, activation='softmax'))

In [None]:
# Summary
model.summary()
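
To cross-check what `model.summary()` reports, the layer output sizes and parameter counts of the network above can be worked out by hand. The sketch below assumes Keras defaults (stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`) together with `padding='same'`; the helper names are hypothetical.

```python
import math

def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of a square-kernel Conv2D layer."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(in_units, out_units):
    """Weights plus biases of a Dense layer."""
    return in_units * out_units + out_units

size, channels, total = 299, 3, 0
for filters in (16, 32, 64):
    total += conv_params(2, channels, filters)  # 'same' padding keeps the size
    channels = filters
    size = math.ceil(size / 2)  # 2x2 max pooling with 'same' padding

flat = size * size * channels            # units after Flatten
total += dense_params(flat, 512)         # first fully connected layer
total += dense_params(512, 3)            # softmax output layer

print(size, flat, total)
```

In this calculation nearly all parameters come from the first `Dense` layer after `Flatten`; the three convolutional layers contribute only about ten thousand.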

### Compile model

In [None]:
# Hint: use the model.compile function to compile your model
# Recommended hyper-parameters for training: epochs=60, validation_steps=3

model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

In [None]:
# include early stopping to avoid overfitting and save time
from keras.callbacks import ModelCheckpoint,EarlyStopping

# Save the model with the best weights; create the folder first with mkdir saved_model
checkpointer = ModelCheckpoint('saved_model/model.hdf5', verbose=1, save_best_only=True)
# Stop the training if the model shows no improvement 
stopper = EarlyStopping(monitor='val_loss', min_delta=0.1, patience=0, verbose=1, mode='auto')
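
The interplay of `min_delta` and `patience` is easier to see on made-up numbers. The function below is a simplified model of the stopping rule, not Keras' actual implementation: an epoch counts as an improvement only if the validation loss drops by more than `min_delta` below the best value so far, and training stops once `patience` epochs pass without one. The loss values are invented for illustration.

```python
def early_stopping_epoch(val_losses, min_delta=0.0, patience=0):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss      # counts as an improvement
            wait = 0
        else:
            wait += 1        # one more epoch without improvement
            if wait >= patience:
                return epoch
    return None

losses = [1.00, 0.95, 0.93, 0.70]  # invented validation losses
strict = early_stopping_epoch(losses, min_delta=0.1, patience=0)
patient = early_stopping_epoch(losses, min_delta=0.1, patience=2)
lenient = early_stopping_epoch(losses, min_delta=0.0, patience=2)
print(strict, patient, lenient)
```

With `min_delta=0.1`, a drop from 1.00 to 0.95 does not count as an improvement, so such a setting can end training very early.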

## Step 4: Training of initial model

In [None]:
# use model.fit_generator function to train your model
import time
start = time.time()
history = model.fit_generator(
    train_generator, 
    steps_per_epoch = 2,
    validation_data=valid_generator,
    validation_steps=3, 
    epochs = 2, 
    verbose=1,
   # callbacks=[checkpointer]
)
end = time.time()
print(end-start)

The accuracy will be mediocre. How can we improve it?

One solution is to use transfer learning.

## Step 5:  Transfer Learning

Load a pre-trained InceptionV3 model with the InceptionV3 class of the keras.applications module.

Signature:

keras.applications.inception_v3.InceptionV3(
    include_top=True,
    weights='imagenet',
    input_tensor=None,
    input_shape=None,
    pooling=None,
    classes=1000
)

Note that the default input image size for this model is 299x299.

### Arguments
    include_top: whether to include the fully-connected
        layer at the top of the network.
    weights: one of `None` (random initialization)
        or 'imagenet' (pre-training on ImageNet).
    input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
        to use as image input for the model.
    input_shape: optional shape tuple, only to be specified
        if `include_top` is False (otherwise the input shape
        has to be `(299, 299, 3)` (with `channels_last` data format)
        or `(3, 299, 299)` (with `channels_first` data format)).
        It should have exactly 3 input channels,
        and width and height should be no smaller than 139.
        E.g. `(150, 150, 3)` would be one valid value.
    pooling: Optional pooling mode for feature extraction
        when `include_top` is `False`.
        - `None` means that the output of the model will be
            the 4D tensor output of the
            last convolutional layer.
        - `avg` means that global average pooling
            will be applied to the output of the
            last convolutional layer, and thus
            the output of the model will be a 2D tensor.
        - `max` means that global max pooling will
            be applied.
    classes: optional number of classes to classify images
        into, only to be specified if `include_top` is True, and
        if no `weights` argument is specified.

### Returns
    A Keras model instance.
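
To make the `pooling` options concrete: global average pooling collapses each channel of the last feature map to its mean, so one image's `(height, width, channels)` map becomes a vector of length `channels`. A minimal plain-Python sketch on a tiny made-up feature map follows; the function name is hypothetical.

```python
def global_average_pool(feature_map):
    """Average a (height, width, channels) nested list over height and width."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    sums = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s / (height * width) for s in sums]

# A 2x2 feature map with 3 channels
fmap = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[5.0, 0.0, 2.0], [7.0, 4.0, 2.0]],
]
pooled = global_average_pool(fmap)
print(pooled)
```

The `GlobalAveragePooling2D` layer used later in this notebook performs the same reduction for every image in a batch.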

Use the pre-trained feature-extraction part of the InceptionV3 image classification model and learn a new classification layer on top of it.

To do:

1. Get the output of InceptionV3 once the pre-trained model is loaded
2. Define your model as the classification part
3. Load the pre-trained weights from an HDF5 file
4. Freeze the original layers of the pre-trained model (InceptionV3)
5. Train the classification part with your dataset

In [None]:
base_model  = InceptionV3(weights= 'imagenet', include_top=False)
print('loaded model')

In [None]:
# Get the output of Inceptionv3
# then input it to your classification part model
# Define the output layers for Inceptionv3

last = base_model.output
x = GlobalAveragePooling2D()(last)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
preds = Dense(3,activation='softmax')(x)

model = Model(inputs=base_model.input,outputs=preds)
model.summary()
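
For a sense of scale, the new classification head is small. Assuming the base model's last feature map has 2048 channels (the value for InceptionV3 without its top), global average pooling yields a 2048-vector, and the head's parameter count follows directly. The helper name is hypothetical.

```python
def dense_params(in_units, out_units):
    """Weights plus biases of a Dense layer."""
    return in_units * out_units + out_units

features = 2048  # assumed channel count of the InceptionV3 feature map
head_params = dense_params(features, 512) + dense_params(512, 3)
print(head_params)
```

`GlobalAveragePooling2D` and `Dropout` add no trainable parameters, so these two `Dense` layers are all that is trained while the base model is frozen.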

### Loading weights for your model via HDF5


In [None]:
# Load the weights for the common layers from the benchmark model
# Tip: use the load_weights method of the InceptionV3 model instance
# Note: saved_model/model.hdf5 is only written if the checkpointer callback was enabled in Step 4
base_model.load_weights('saved_model/model.hdf5', by_name=True)

Freeze the original layers of InceptionV3 so that the weights of the feature extractor are not trainable.

In [None]:
for layer in base_model.layers:
    layer.trainable = False

### Compile the model 

In [None]:
model.compile(
    optimizer='adam', 
    loss='categorical_crossentropy', 
    metrics=['accuracy']
)

In [None]:
from keras.callbacks import ModelCheckpoint,EarlyStopping

# Save the model with best weights
checkpointer = ModelCheckpoint('saved_model/transfer_learning.hdf5', 
                               verbose=1,save_best_only=True)
# Stop the training if the model shows no improvement
stopper = EarlyStopping(monitor='val_loss',min_delta=0.1,patience=1,
                        verbose=1,mode='auto')

In [None]:
# Train the model
history_transfer = model.fit_generator(
    train_generator, 
    steps_per_epoch = 2,
    validation_data=valid_generator,
    validation_steps=3, 
    epochs=2,
    verbose=1,
    callbacks=[checkpointer]
)

#### (Optional) Display the dictionary of training metrics values

In [None]:
display(history_transfer.history)
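
`history_transfer.history` is a plain dictionary mapping each metric name to a list with one value per epoch. Below is a short sketch of picking the best epoch by validation loss. The numbers are invented, and the accuracy key may be named `acc` or `accuracy` depending on the Keras version, so only `val_loss` is used here.

```python
# Invented example values in the shape of history_transfer.history
example_history = {
    "loss": [1.10, 0.85, 0.70],
    "val_loss": [1.00, 0.90, 0.95],
}

val_losses = example_history["val_loss"]
# Index of the smallest validation loss
best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
print("best epoch (0-based):", best_epoch, "val_loss:", val_losses[best_epoch])
```

A training loss that keeps falling while the validation loss rises again, as in these made-up values, is the usual sign of overfitting.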

### Manual finetuning via unfreezing classification layers

In [None]:
# This is how you unfreeze: keep the first 197 layers frozen and make the rest trainable

for layer in model.layers[:197]:
    layer.trainable = False
for layer in model.layers[197:]:
    layer.trainable = True
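
After changing `trainable` flags it is worth checking which layers were actually affected. The sketch below mimics the slicing logic with stand-in objects instead of real Keras layers; the total of 315 layers is an assumption for illustration, so check `len(model.layers)` on the real model.

```python
from types import SimpleNamespace

# Stand-ins for model.layers; 315 is an assumed layer count
layers = [SimpleNamespace(name=f"layer_{i}", trainable=False) for i in range(315)]

split = 197
for layer in layers[:split]:
    layer.trainable = False   # keep the early feature extractor frozen
for layer in layers[split:]:
    layer.trainable = True    # fine-tune the later layers and the new head

num_trainable = sum(layer.trainable for layer in layers)
num_frozen = len(layers) - num_trainable
print(num_frozen, "frozen,", num_trainable, "trainable")
```

On the real model, the same count can be taken over `model.layers`. Changes to `trainable` take effect only after the model is compiled again.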

### Re-compilation with different learning rate

What happens if we slow down the learning rate?

In [None]:
from keras.optimizers import Adam

# use a small learning rate and keep the momentum terms (beta_1, beta_2) at their standard values
model.compile(
    optimizer=Adam(lr=0.0001, beta_1=0.9, beta_2=0.999),
    loss = 'categorical_crossentropy',
    metrics = ['accuracy']
)
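
To build intuition for the effect of a smaller learning rate, the sketch below runs plain gradient descent (not Adam) on the toy function f(w) = w^2 with two step sizes: 0.001, which is Keras' default for Adam, and the 0.0001 used above. The toy function and the step count are chosen for illustration only.

```python
def gradient_descent(lr, steps=100, w=1.0):
    """Minimise f(w) = w**2 with plain gradient descent."""
    for _ in range(steps):
        grad = 2.0 * w      # derivative of w**2
        w -= lr * grad
    return w

w_default = gradient_descent(lr=0.001)
w_small = gradient_descent(lr=0.0001)
print(w_default, w_small)
```

The smaller step size moves the weight less in the same number of steps, which is the intent during fine-tuning: pre-trained weights are adjusted gently rather than overwritten.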

In [None]:
# Save the model with the best validation loss

checkpointer = ModelCheckpoint(
    "saved_model/fine_tuning.hdf5",
    verbose = 1,
    save_best_only = True,
    monitor = "val_loss"
)

# Ensure that training stops if the validation loss does not improve

stoptheshow = EarlyStopping(
    monitor = 'val_loss',
    min_delta = 0.1,
    patience = 2,
    verbose = 1,
    mode = 'auto'
)

#### Model training

In [None]:
history = model.fit_generator(
    train_generator, 
    steps_per_epoch=2,
    validation_data = valid_generator,
    validation_steps = 3,
    epochs = 2,
    verbose = 1,
    callbacks = [checkpointer]
)

In [None]:
# Load the best weights saved during fine-tuning
model.load_weights('saved_model/fine_tuning.hdf5')

# Hackathon

Now train a model yourself to beat the baseline.

In [None]:
# Your source code here
