## Loading the Inception model from the [Applications of Keras](https://keras.io/applications/) or [Transfer learning with a pretrained ConvNet](https://www.tensorflow.org/tutorials/images/transfer_learning)
Keras Applications are deep learning models that are made available alongside pre-trained weights. These models can be used for prediction, feature extraction, and fine-tuning.

Weights are downloaded automatically when instantiating a model.

## (a) The Ising Model – try your show that the square lattice data can be trained perfectly using the embeddings of Inception.

Get the embeddings first, then build a classifier

Solution to (a):

In [3]:
import numpy as np
from numpy.random import rand
import matplotlib.pyplot as plt

import jax.numpy as jnp
from jax import jit, vmap

import tensorflow as tf
import tensorflow_datasets as tfds

from sklearn.model_selection import train_test_split

import keras
from keras.models import Sequential
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Flatten
from keras.layers.core import Dropout
from keras.layers.core import Dense
from keras import optimizers

Using TensorFlow backend.


In [22]:
# Import the data and shape it for training
N = 250
nx, ny = 32, 32

Xsq = np.ndarray((4*N,nx,ny,1))
ysq = np.ndarray(4*N)

for i in np.arange(N):
    Xsq[i + 0*N] = np.loadtxt("./square_T1/square_T1/{:03d}".format(i), delimiter=",").reshape(nx,ny,1)
    ysq[i + 0*N] = 0
    Xsq[i + 1*N] = np.loadtxt("./square_T2/square_T2/{:03d}".format(i), delimiter=",").reshape(nx,ny,1)
    ysq[i + 1*N] = 1
    Xsq[i + 2*N] = np.loadtxt("./square_T3/square_T3/{:03d}".format(i), delimiter=",").reshape(nx,ny,1)
    ysq[i + 2*N] = 2
    Xsq[i + 3*N] = np.loadtxt("./square_T4/square_T4/{:03d}".format(i), delimiter=",").reshape(nx,ny,1)
    ysq[i + 3*N] = 3

Xsq_train, Xsq_test, ysq_train, ysq_test = train_test_split(Xsq, ysq, test_size=0.2, random_state=0)
train_dataset = tf.data.Dataset.from_tensor_slices((Xsq_train, ysq_train))
test_dataset = tf.data.Dataset.from_tensor_slices((Xsq_test, ysq_test))


In [23]:
# Split 'value' into 3 tensors with sizes [4, 15, 11] along dimension 1
#train_data, test_data = tf.split(dataset, [800, 200], 1)

BATCH_SIZE = 64
SHUFFLE_BUFFER_SIZE = 100

train_dataset = train_dataset.shuffle(SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE)
test_dataset = test_dataset.batch(BATCH_SIZE)

In [30]:
IMG_SIZE = 250

def format_example(image, label):
    image = tf.cast(image, tf.float32)
    #image = (image/127.5) - 1
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    return image, label

In [31]:
# Final datasets to be put through tensorflow
train_dataset = train_dataset.map(format_example)
test_dataset = test_dataset.map(format_example)

In [50]:
# Get the inception model without last layer and disable training
base_model = keras.applications.inception_v3.InceptionV3(include_top=False, 
                                                        weights='imagenet', 
                                                        input_shape=Xsq_train[0].shape,  
                                                        classes=4)

base_model.trainable = False

In [134]:
# Get the embedded data
global_avg_layer = keras.layers.GlobalAveragePooling2D()
Xsq_train_emb, Xsq_test_emb = global_avg_layer(base_model(Xsq_train)), global_avg_layer(base_model(Xsq_test))
prediction_layer = Dense(32)
Xsq_train_emb, Xsq_test_emb = prediction_layer(Xsq_train_emb), prediction_layer(Xsq_test_emb)

In [138]:
model = keras.Sequential([
    base_model,
    global_avg_layer,
    prediction_layer
])

In [139]:
base_learning_rate = 0.0001
model.compile(optimizer=tf.keras.optimizers.RMSprop(lr=base_learning_rate),
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])

initial_epochs = 10
validation_steps=20

history = model.fit(train_batches,
                    epochs=initial_epochs,
                    validation_data=validation_batches)

In [117]:
class small_FNN_Embedded:
    def __init__(self):
        model = self
    
    @staticmethod
    def build(input_shape, num_classes, channels_first=False):
        model = Sequential()
        model.add(keras.layers.InputLayer(input_shape=input_shape))
            
        model.add(Flatten())
        
        model.add(Dense(256,  activation='relu'))
        model.add(Dropout(0.2))
        model.add(BatchNormalization())
        
        model.add(Dense(128,  activation='relu'))
        model.add(Dropout(0.2))
        model.add(BatchNormalization())
        
        model.add(Dense(64, activation='relu'))
        model.add(Dropout(0.2))
        model.add(BatchNormalization())
        
        model.add(Dense(32, activation='relu'))
        model.add(Dropout(0.2))
        model.add(BatchNormalization())
        
        model.add(Dense(16, activation='relu'))
        model.add(Dropout(0.2))
        model.add(BatchNormalization())
        
        model.add(Dense(num_classes, activation="softmax"))
        
        return model

In [129]:
def train_model(model_class, train_data, train_lbls, test_data, 
                test_lbls, num_classes, input_shape, hyperparams):
    
    # Ensure data is shaped properly, assumes channels last set up
    x_train = train_data
    x_test = test_data
    
    # Create categorical labels
    y_train = keras.utils.to_categorical(train_lbls, num_classes)
    y_test = keras.utils.to_categorical(test_lbls, num_classes)
    
    # Instantiate the model
    model = model_class.build(input_shape=(2048,1),
                   num_classes=num_classes)
    
    # Set hyperparameters
    INIT_LR = FNN_hyperparams[0]# learning rate
    EPOCHS = FNN_hyperparams[1] # number of epochs
    BS = FNN_hyperparams[2] # batch size
    OPT = optimizers.Adagrad(lr=INIT_LR) # optimizing function
    
    # Compile the model
    model.compile(loss='categorical_crossentropy',optimizer=OPT,metrics=['accuracy'])
    
    
    H = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=EPOCHS,
                 steps_per_epoch=32, validation_steps=200)
    
    return H, model

In [130]:
FNN_hyperparams = (0.01, 50, 64)
H_sq_FNN, sq_FNN_model = train_model(small_FNN_Embedded, Xsq_train_emb, ysq_train, Xsq_test_emb, ysq_test, 4, (2048,1,1), FNN_hyperparams)

Train on 800 samples, validate on 200 samples
Epoch 1/50


_SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [<tf.Tensor 'global_average_pooling2d_5/Mean:0' shape=(800, 2048) dtype=float32>]

In [121]:
Xsq_train_emb

<tf.Tensor 'global_average_pooling2d_5/Mean:0' shape=(800, 2048) dtype=float32>

In [59]:
Xsq_train.shape

TensorShape([800, 256, 256, 3])

In [47]:
classification_layers = [
    Dense(128, activation='relu'),
    Dropout(0.25),
    keras.layers.Dense(4, activation='softmax')
]

## (b)  [Rayleigh-Bénard Convection](https://en.wikipedia.org/wiki/Rayleigh%E2%80%93B%C3%A9nard_convection)

RB convection, in which a flow is heated from below and cooled  from  top,  is  one  of  the  paradigmatic  system  in  fluid  dynamics. When the temperature difference between the two plates (in dimensionless form Rayleigh number Ra) is beyond certain threshold, hot fluid tends to go up and cold fluid tends to go down, thus forming convection cells. What we supply here are the temperature snapshots from four different Ra, i.e., $Ra=10^{14}$ as `class0`,$Ra= 10^{13}$ as `class1`, $Ra= 10^{12}$ as `class2`,and $Ra= 10^{11}$ as `class3`.  The flow you see is highly turbulent; not only there are big convection cells but also lots of small vortices.  The original dataset  is  around  4000*2000.   We  have  already  downsampled  the  data into the zip file `fluid_org.zip`.

### (1) Train the data in `fluid_org.zip` with inception.  Show that these images can be classified  into  different $Ra$ nicely  with  inception.  

Take the length 2048 embeddings from the Inception model first. Then visualizing how the embeddings distribute using a two component PCA or two component T-SNE, whichever you prefer. Then use any of the previously learned method to train a classifier using the embeddings as input. **Note that T-SNE normally gives you better separation**

In [None]:
import tensorflow
import numpy as np
import matplotlib.pyplot as plt

In [3]:
import os
from PIL import Image
imgs = []
labels = []
for file in os.listdir('./fluid_org/'):
    imgs.append(np.array(Image.open('./fluid_org/'+file))/255)
    labels.append(int(file[-5]))

In [4]:
imgs = np.array(imgs)
labels = np.array(labels)

Solution to (1):

### (2) For advanced use of trainsfer learning from the pre-trained models such as fine-tuning, we need to do the transfer learning in-place, by building a network consists of the Inception and your classifier layers. 
Freeze the part you take from Inception, train
the model and report the accuracy. Then do the fine-tuning. Report
how much increase of accuracy you can manage to get. Fine tuning
by making the top few layer of the Inception model trainable instead
of freezing all the layers. Due to the slowness of training, unleash the
layers one by one. Make comments about how the accuracy change. It is
highly recommended that you train this on Google Colab with the GPU
activated.

Solution to (2):

### (3) Explore the potential of transfer learning on cropped data `fluid-crop`, which are randomly choosen regions of 100*100 pixels from each original 4000*2000 pictures, i.e.,just around 1% of the original picture! 
You can use either method you use in (1) or (2).

Solution to (3):

### (4) Build your own classifier for (2) and (3) without using Inception. Compare the performance of your own classifier with the result in (2) and (3)

Solution to (4):

### (5) Continue (3), construct two examples where a different layer's output is used as the embedding. There are over 300 layers in Inception. Pick one at around the 100th layer and one at around 200th layer. The exact layer you pick is based on your preference. Show the following.
- (i) The distributions of the embeddings similar to what you've done in (1). Together with the result you get in (1), comment the similarity and difference between what you get using the three embedding layers.
- (ii) What is the test accuracy of the three classifiers. What is the test accuracy of the three classifiers? For speeding up the training you can choose to get the embeddings first and put those into a classifier, as you did in (1).

Solution to (5):