## Transfer Learning using InceptionV3
Inception is a deep convolutional neural network for image classification, trained on ImageNet (https://cloud.google.com/tpu/docs/inception-v3-advanced). Here, we use the fully trained InceptionV3 module to compute transfer values for images it has never seen during training. This lets us use the feature extraction capabilities of Inception without having to collect enough data to train the neural network ourselves.

We subsequently train a support vector machine (SVM) classifier on the transfer values from Inception to distinguish between different types of images. The transfer values are extracted at the last, red layer on the right of InceptionV3.

<img src=https://cloud.google.com/tpu/docs/images/inceptionv3onc--oview.png width = 600>

In [None]:
"""
Created on Wed Feb 24 17:41 2019

@author: Soeren Brandt
"""
# import some generally useful modules
import numpy as np
import matplotlib.pyplot as plt
import scipy
from collections import OrderedDict

# import modules for Inception
from PIL import Image
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers

### Loading experimental image data
This is a sample set of experimental data containing 11 classes and 76 examples.

In [None]:
# Example dictionary
exp_set = OrderedDict([('C1', [148, 174, 202, 203, 239, 240, 241, 434, 435]),
                       ('C2', [175, 184, 185, 186, 187, 199, 204, 205, 424, 425]),
                       ('C3', [150, 206, 207]),
                       ('C4', [151, 177, 208, 209, 242, 243, 244, 283]),
                       ('C5', [152, 178, 188, 189, 267]),
                       ('C6', [153, 179, 190, 191, 197, 268, 284]),
                       ('C7', [157, 171, 324, 330, 440, 441, 442]),
                       ('C8', [169, 170, 312, 313, 443, 444, 445]),
                       ('C9', [159, 160, 326, 327, 436, 437, 438, 439]),
                       ('C10', [154, 181, 446, 447, 448, 449, 450]),
                       ('C11', [163, 164, 452, 453, 454])
                      ])
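
The class sizes in a dictionary like this are uneven, which matters later when the data are split into training and test sets. A minimal sketch for tallying them and building one label per example follows. The three-class `demo_set` is a hypothetical stand-in (its IDs are copied from C3, C5 and C11 above), and deriving labels this way is an assumption, since the notebook does not show how `labels` is built.

```python
from collections import OrderedDict

# Hypothetical stand-in for exp_set: class name -> list of experiment IDs
demo_set = OrderedDict([('C3', [150, 206, 207]),
                        ('C5', [152, 178, 188, 189, 267]),
                        ('C11', [163, 164, 452, 453, 454])])

# Number of examples per class and in total
class_sizes = OrderedDict((name, len(ids)) for name, ids in demo_set.items())
total_examples = sum(class_sizes.values())

# One label per example, in dictionary order (assumed labeling scheme)
demo_labels = [name for name, ids in demo_set.items() for _ in ids]

print(class_sizes)
print(total_examples, len(demo_labels))
```

Running the same tally on the full `exp_set` is a quick way to check the class and example counts quoted in the text.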

##### Loading
Here, images are loaded from experimental data stored in CSV files. Alternatively, (grayscale) photographs can be loaded instead.

In [None]:
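import importlib

# The module name contains a hyphen, so a plain `from ... import ...` statement
# is a syntax error; import it by name instead
toolbox = importlib.import_module("Data_preparation_for_ML-toolbox")
load_images_from_CSV = toolbox.load_images_from_CSV

# Load the images listed in exp_set, resized to 299 x 299 pixels
load_images_from_CSV(exp_set, dim=299, folder='../exp_csv')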

##### Normalization
Data normalization is an important step in machine learning. Here, we apply a linear normalization to the image data to obtain values ranging from 0 to 1.

In [None]:
def normalize(arr):
    """
    Linear normalization
    http://en.wikipedia.org/wiki/Normalization_%28image_processing%29
    """
    arr = arr.astype('float')
    minval = arr[...,0].min()
    maxval = arr[...,0].max()
    if minval != maxval:
        # shift values into the range from 0 to 1
        arr[...,0] -= minval
        arr[...,0] *= (1.0/(maxval-minval))
    return arr
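
Linear (min-max) normalization maps each value x to (x - min) / (max - min). The function above applies this to the first channel only. The sketch below shows the same formula on a small made-up array; `minmax_scale_demo` is a hypothetical helper, and unlike the function above it returns zeros for constant input rather than leaving the values unchanged.

```python
import numpy as np

def minmax_scale_demo(values):
    """Map values linearly onto [0, 1]; constant input is returned as zeros."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # Avoid division by zero when all values are identical
        return np.zeros_like(values)
    return (values - lo) / (hi - lo)

# Hypothetical 2 x 3 "image" with raw intensities
raw = np.array([[10.0, 20.0, 30.0],
                [40.0, 50.0, 60.0]])
scaled = minmax_scale_demo(raw)

print(scaled)
print(scaled.min(), scaled.max())
```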

### Load Inception model
We extract the feature extraction module (left dashed box at the top of the figure).

In [None]:
feature_extractor_url = "https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1"

# Create the module, and check the expected image size
def feature_extractor(x):
    feature_extractor_module = hub.Module(feature_extractor_url)
    return feature_extractor_module(x)

IMAGE_SIZE = hub.get_expected_image_size(hub.Module(feature_extractor_url))

# we just need the Inception v3 layer
features_extractor_layer = layers.Lambda(feature_extractor, input_shape=IMAGE_SIZE+[3])
model = tf.keras.Sequential([features_extractor_layer])
model.summary()

import tensorflow.keras.backend as K
sess = K.get_session()
init = tf.global_variables_initializer()
sess.run(init)

##### Feature extraction and visualization using t-SNE
t-distributed stochastic neighbor embedding (t-SNE) converts the distances between pairs of data points in the high-dimensional space into probabilities that the two points are neighbors, and then finds a lower-dimensional representation whose pairwise neighbor probabilities match these as closely as possible. Similar to principal component analysis (PCA), this can be used to display high-dimensional data more effectively.
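
For reference, the standard t-SNE formulation can be written as follows, with $x_i$ the $N$ high-dimensional points (here, the transfer values), $y_i$ their low-dimensional counterparts, and $\sigma_i$ a per-point bandwidth set through the perplexity:

$$
p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j\rVert^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\lVert x_i - x_k\rVert^2 / 2\sigma_i^2\right)}, \qquad p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}
$$

$$
q_{ij} = \frac{\left(1 + \lVert y_i - y_j\rVert^2\right)^{-1}}{\sum_{k \neq l} \left(1 + \lVert y_k - y_l\rVert^2\right)^{-1}}, \qquad \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log\frac{p_{ij}}{q_{ij}}
$$

The embedding is obtained by minimizing the Kullback-Leibler divergence $\mathrm{KL}(P \,\Vert\, Q)$ with respect to the $y_i$ by gradient descent.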

In [None]:
from sklearn.manifold import TSNE
from sklearn.decomposition import PCA

# Extract transfer_values from InceptionV3
transfer_value = model.predict(train_data)

# Perform PCA
pca = PCA(n_components=np.min([exp_set_size, 67]))
transfer_values_50d = pca.fit_transform(transfer_value)

# Apply t-SNE after PCA
tsne = TSNE(n_components=2)
transfer_values_reduced = tsne.fit_transform(transfer_values_50d)

# Plot t-SNE
%matplotlib inline
labels_arr = np.array(labels)
for chem in set(labels):
    # select the embedded points belonging to the current class
    mask = labels_arr == chem
    plt.plot(transfer_values_reduced[mask, 0], transfer_values_reduced[mask, 1], '.', markersize=12)

### Validate the machine learning model using held-out test data

In [None]:
from sklearn import datasets, svm, metrics
from sklearn.model_selection import train_test_split

data_img = transfer_value
data_label = labels

# Splitting data set into training and testing
train_img, test_img, train_lbl, test_lbl = train_test_split(data_img, data_label, test_size=1/2.5, random_state=0)

# Training support vector machine
SVC = svm.SVC(C=1e2,gamma=1e-2, kernel='rbf')
SVC.fit(train_img, train_lbl)
predicted = SVC.predict(test_img)

print("Training accuracy: " + str(SVC.score(train_img, train_lbl)) + " (" + str(len(train_lbl)) + ")")
print("Validation accuracy: " + str(np.sum(predicted == test_lbl)/float(len(test_lbl))) + " (" + str(np.sum(predicted == test_lbl)) + "/" + str(len(test_lbl)) + ")")
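
The cell above scores the classifier on a single hold-out split. With so few examples per class, the result can depend noticeably on which examples land in the test set, and k-fold cross-validation averages over several splits. Below is a minimal sketch using scikit-learn's `cross_val_score` on synthetic stand-in features; `demo_features` and `demo_labels` are hypothetical and are not the experimental data.

```python
import numpy as np
from sklearn import svm
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Hypothetical stand-in for the transfer values: 3 classes, 12 examples each,
# drawn around well-separated centers in an 8-dimensional feature space
rng = np.random.default_rng(0)
centers = np.array([[0.0] * 8, [5.0] * 8, [-5.0] * 8])
demo_features = np.vstack([c + rng.normal(size=(12, 8)) for c in centers])
demo_labels = np.repeat(['A', 'B', 'C'], 12)

# Same SVM hyperparameters as in the cell above
clf = svm.SVC(C=1e2, gamma=1e-2, kernel='rbf')

# Stratified folds keep the class proportions similar in every fold;
# n_splits must not exceed the size of the smallest class
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
scores = cross_val_score(clf, demo_features, demo_labels, cv=cv)

print("Fold accuracies:", scores)
print("Mean accuracy: %.3f" % scores.mean())
```

For the data set used here, the smallest class (C3) has only three examples, so at most three stratified folds are possible.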