# Image Modelling Part 2 - Use Pipeline

In the first notebook we used shell commands to prepare and split our data into a train and evaluation set. 
Furthermore, we defined some functions that will allow us to directly import our pictures and the corresponding class labels and if we want to also augment our data. 
Now, we will import the functions from the `image_modelling.py` file and use them to facilitate the data preparation step in this notebook. 
Lastly, we will use Tensorflow and Keras to create and train our neuronal network to identify turtles.

In [1]:
# Import required packages 
import tensorflow as tf
import image_modeling   # import image_modeling.py file
import tensorflow_hub as hub
import datetime
import csv
import numpy as np
import pandas as pd
# Load the TensorBoard notebook extension
%load_ext tensorboard



In [None]:
# Clear any logs from previous runs
!rm -rf ./logs/

In [None]:
# Check for Tensorflow version
print(tf.__version__)
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.INFO)

In [2]:
# Import variables from image_modelling.py file
file = open("../data/train.csv")
reader = csv.reader(file)

HEIGHT = image_modeling.HEIGHT
WIDTH = image_modeling.WIDTH
NCLASSES = image_modeling.NCLASSES
CLASS_NAMES = image_modeling.CLASS_NAMES
BATCH_SIZE = image_modeling.BATCH_SIZE
TRAINING_SIZE = image_modeling.TRAINING_SIZE
TRAINING_STEPS = (TRAINING_SIZE // BATCH_SIZE)

Double check if the variables now contain the correct values. ;) 

In [None]:
# You can compare this output with the variables in the image_modelling.py file...
print(HEIGHT)
print(CLASS_NAMES)
print(NCLASSES)
print(TRAINING_STEPS)
print(TRAINING_SIZE)

## Building our Model

Building and training a neural network involves various steps: 
1. define the architecture of the model
2. compile the model
3. train the model
4. evaluate the model

We have to start with defining the architecture. Our neural network will consist of several layers that are chained together. The input layer of our model will take our input data and hand it over to the flatten layer, which is responsible for reformatting our data. It will transform the format of our images from a three-dimensional array (HEIGHT, WIDTH, 3) to a one-dimensional array of size HEIGHT * WIDTH * 3. 
After the pixels are flattened we use a dense layer that returns a logits array with length `NCLASSES`. Each node in this layer contains a score that indicates the current image belongs to one of the n classes. 

### Simple Model

In [None]:
'''
# Lets create a simple linear model.
def linear_model():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.InputLayer(input_shape=[HEIGHT, WIDTH, 3], name='image'))
    model.add(tf.keras.layers.Flatten(data_format="channels_last"))
    # We want to have a simple linear model so we have 
    # no activation function. 
    model.add(tf.keras.layers.Dense(units=NCLASSES, activation=None))
    return model
    '''

Before we can train our model we need to compile it and define more settings. We have to choose a loss function, an optimizer and metrics. 
* The **loss function** measures how accurate the model is during training by calculating the model error. Usually we want to minimize this function to improve our model. As you can see in the [TensorFlow documentation](https://www.tensorflow.org/api_docs/python/tf/keras/losses) there are lot's of different loss functions to choose from. Some, e.g. the mean squared errror, hopefully look familiar to you. ;) 
* The **[optimizer](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers)** defines how the model is updated based on the data and the loss function. One optimizer we've already covered earlier and which is also used for neural networks is the stochastic gradient descent (SGD) algorithm.   
* The **metric** is used to monitor the training process. Here we can choose one of the metrics we've already encountered or [many more](https://www.tensorflow.org/api_docs/python/tf/keras/metrics). 

The following function compiles our model, loads the data using the `load_dataset()` function from the image_modelling.py file and trains the model on the loaded data. In the end the function returns our fitted model. 

In [None]:
'''
def train_and_evaluate(model,batch_size=32):

    model.compile(
        optimizer="adam", 
        # The model outputs one-hot-encoded logits, so we need
        # use the sparse version of the crossentropy loss.
        loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
        metrics=['accuracy']
    )
    
    
    log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
    train_datagen, test_datagen = image_modeling.preprocess()
    train_generator, validation_generator = image_modeling.use_image_generator(train_datagen, test_datagen, training=True)
    
    model.fit(
        train_generator, 
        validation_data=validation_generator,
        steps_per_epoch=TRAINING_STEPS, 
        epochs=10,
        callbacks=[tensorboard_callback])
          
    return model
    '''

In [None]:
''' # Build and train our model using the prior defined functions 
model = linear_model()
trained_model = train_and_evaluate(model, BATCH_SIZE)
'''

Let us use Tensorboard to monitor our results:

In [None]:
'''
%tensorboard --logdir logs/fit
'''

### Deep Neural Network

Our simple model is not performing well. Maybe we can boost its performance by adding more layers.

In the following `dnn_model()` function we add three more hidden, dense layers after the flatten layer to increase our models complexity. 

In [None]:
'''
# Lets compare a neural network with hidden layers to the linear model
def dnn_model():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.InputLayer(input_shape=[HEIGHT, WIDTH, 3], name='image'))
    model.add(tf.keras.layers.Flatten(data_format="channels_last"))
    model.add(tf.keras.layers.Dense(units = 40, activation = "relu"))
    model.add(tf.keras.layers.Dense(units = 40, activation = "relu"))
    model.add(tf.keras.layers.Dense(units = 30, activation = "relu"))
    # We want to have a simple linear model so we have 
    # no activation function. 
    model.add(tf.keras.layers.Dense(units=NCLASSES, activation=None))
    return model
    '''

In [None]:
'''
# Let us fit the deep neural network
model = dnn_model()
trained_model = train_and_evaluate(model, BATCH_SIZE)
'''

Adding more hidden layers to our model, did indeed increase the accuracy. But still, the model's performance leaves something to be desired. Since we are working with images, switching to a convolutional neural network might help.

### Convolutional Neural Network

CNN's are widely used for image recognition. They are regularized versions of DNN's able to be deeper without generating as much parameters due to its [convolutional](https://machinelearningmastery.com/convolutional-layers-for-deep-learning-neural-networks/) and pooling layers.

The architecture of our CNN is even more complex. This time we combine dense layers with `Conv2D` and `MaxPooling2D` layers. The convolutional and max pooling layers are inserted between the input layer and the flatten layer. 

In [None]:
'''
# Now let us move on to a CNN model. 
def cnn_model():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.InputLayer(input_shape=[HEIGHT, WIDTH, 3], name='image'))
    model.add(tf.keras.layers.Conv2D(filters=10, kernel_size=[5, 5], padding="same", activation="relu"))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=[2, 2], strides=2))
    model.add(tf.keras.layers.Conv2D(filters=20, kernel_size=[5, 5], padding="same", activation="relu"))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=[2, 2], strides=2))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(units=300, activation="relu"))
    model.add(tf.keras.layers.Dense(units=NCLASSES, activation=None))
    return model
    '''

We can have a look at the architecture of our model with the method `.summary`. As you can see in the summary below, the output of each `Conv2D` and `MaxPooling2D` layer is also a three dimensional tensor of shape (height, width, channels). As we go deeper into the network the dimensions shrink. One advantage of the shrinking dimensions is that we can computationally afford to add more output channels in each convolutional layer. We can control the number of the output channels of those layers with the `filters` argument. 

However, at the end of our model we still need the combination of the flatten and dense layers to perform classification. 

In [None]:
'''
model = cnn_model()
model.summary()
'''

In [None]:
'''
# Let us fit the convolutional neural network
model = cnn_model()
trained_model = train_and_evaluate(model, BATCH_SIZE)
'''

We see the CNN give better results than the DNN. But we still have heavy overfitting.

## Transfer Learning

Transfer learning is when a model is trained on one task and is then reused for another task. One approach to transfer learning is fine-tuning. Here you take a trained neural net, exchange the last layer (head) for another layer, that fits the new task and then train the weights of the last layer only. 

First we need to download the headless model (this can take a while) we use MobileNetV2 which is a CNN that was trained on the [ImageNet](https://en.wikipedia.org/wiki/ImageNet) dataset, consisting of over 14 million images:

In [None]:
'''
feature_extractor_url = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/2"
feature_extractor_layer = hub.KerasLayer(feature_extractor_url,input_shape=(HEIGHT,WIDTH,3))
'''

We only want to train the last layer therefore we freeze the layers of our headless model:

In [None]:
'''
feature_extractor_layer.trainable = False
'''

Now we define our model by simply adding the output layer to our pretrained net.

In [None]:
'''
def transfer_learning_model():
    model = tf.keras.models.Sequential()
    model.add(feature_extractor_layer)
    # TODO: add the correct output layer here
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(units=300, activation="relu"))
    model.add(tf.keras.layers.Dense(units=NCLASSES, activation=None))
    return model
    '''

In [None]:
'''
# Let us fit our transfer learning model
model = transfer_learning_model()
trained_model = train_and_evaluate(model, BATCH_SIZE)
'''

In [None]:
'''
%tensorboard --logdir logs/fit
'''

As we see the results of fine-tuning surpass the results of the linear model, DNN and CNN. Fine-tuning is a very powerful approach which can generalize well even with limited amount of data.

### Transfer Learning InceptionV3

In [None]:
from tensorflow.keras.applications.inception_v3 import InceptionV3
base_model = InceptionV3(input_shape = (224, 224, 3), include_top = False, weights = 'imagenet')

In [6]:
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras import datasets, layers, models

In [4]:
loaded_model = tf.keras.models.load_model('extra_images_location')

Metal device set to: Apple M1


In [7]:
#change the last layer
for layer in loaded_model.layers:
    layer.trainable = False

x = layers.Flatten()(loaded_model.output)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dropout(0.2)(x)

# Add a final softmax layer with 101 nodes for classification output
x = layers.Dense(NCLASSES, activation='softmax')(x)

model = tf.keras.models.Model(loaded_model.input, x)

model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = tf.keras.metrics.TopKCategoricalAccuracy(k=5))
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
train_datagen, test_datagen = image_modeling.preprocess()
train_generator, validation_generator = image_modeling.use_image_generator(train_datagen, test_datagen, training=True)
    
inception =  model.fit(
        train_generator, 
        validation_data=validation_generator,
        steps_per_epoch=1000 // 32, 
        epochs=10,
        callbacks=[tensorboard_callback])

Found 1502 validated image filenames belonging to 101 classes.
Found 643 validated image filenames belonging to 101 classes.


2022-02-11 08:21:51.897435: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [None]:
#%tensorboard --logdir logs/fit

In [None]:
#Quick and dirty: get a test_generator to predict probabilities

test_datagen = image_modeling.ImageDataGenerator(rescale=1./255)

data=pd.read_csv('../data/test.csv')
data.image_id= data.image_id.apply(lambda x: x.strip()+".JPG")
train_dir="../images/"

test_generator = test_datagen.flow_from_dataframe(dataframe =data, 
                directory = train_dir,
                x_col="image_id",
                target_size=(224, 224),
                batch_size=1,
                class_mode=None)

## Prepare data for submission

In [8]:
test_generator = image_modeling.use_image_generator(train_datagen, test_datagen, training=False)

Found 490 validated image filenames.


In [9]:
#Get probabilities for all turtle id's
y_preds = model.predict(test_generator)
print(y_preds[0:3])
#Get indices from top 5 predictions
# Corrected: [:,:-6:-1] instead of [:,-5:]
y_preds = np.argsort(y_preds, axis=1)[:,:-6:-1]

#Save indices of top 5 predictions as dataframe
df = pd.DataFrame(y_preds)

[[0.00128775 0.0058658  0.00563102 0.01302973 0.00697664 0.00693395
  0.008296   0.00747766 0.00781973 0.01353229 0.00598821 0.00461514
  0.00706126 0.00800171 0.00912131 0.00387888 0.01591681 0.01159309
  0.00883776 0.02409496 0.00588918 0.01520464 0.0074457  0.01139925
  0.00715375 0.00617969 0.00442235 0.02169328 0.0072285  0.00437924
  0.00427897 0.00724966 0.00452739 0.02273203 0.00407767 0.0134466
  0.00457134 0.00906615 0.00494454 0.00625579 0.01225382 0.00614074
  0.00580656 0.00288526 0.02266942 0.01036452 0.03349439 0.00508936
  0.00535642 0.01045091 0.00895459 0.00480678 0.00792744 0.03061415
  0.01048016 0.02405805 0.01191712 0.00565253 0.02785902 0.015731
  0.00547631 0.0033218  0.01422122 0.00998242 0.01657759 0.01290275
  0.0058864  0.02134658 0.00431259 0.00566628 0.00395388 0.01489166
  0.01425694 0.00390338 0.00323622 0.00505687 0.00456915 0.03373691
  0.00474063 0.00511677 0.0071164  0.00455384 0.00637645 0.00608567
  0.00846918 0.01649156 0.00481377 0.00927037 0.011

In [10]:
#Create a DataFrame with top 5 predictions in submission form
list = []
array = []
for line in y_preds:
    for id in line:
        list.append(CLASS_NAMES[id])
    array.append(list)
    list = []

titles = ['prediction1', 'prediction2','prediction3','prediction4','prediction5']

submission = pd.DataFrame(array, columns= titles)

#Insert image_ids from test_data
test_data = pd.read_csv('../data/test.csv')
submission.insert(loc=0, column='image_id', value=test_data['image_id'])
submission

Unnamed: 0,image_id,prediction1,prediction2,prediction3,prediction4,prediction5
0,ID_6NEDKOYZ,t_id_uMOOrQu7,t_id_smNwfXAT,t_id_IP1t15lD,t_id_HcnnlRda,t_id_dhdJMT1K
1,ID_57QZ4S9N,t_id_smNwfXAT,t_id_IP1t15lD,t_id_uMOOrQu7,t_id_HcnnlRda,t_id_dhdJMT1K
2,ID_OCGGJS5X,t_id_uMOOrQu7,t_id_smNwfXAT,t_id_IP1t15lD,t_id_HcnnlRda,t_id_dhdJMT1K
3,ID_R2993S3S,t_id_smNwfXAT,t_id_IP1t15lD,t_id_uMOOrQu7,t_id_HcnnlRda,t_id_dhdJMT1K
4,ID_2E011NB0,t_id_smNwfXAT,t_id_IP1t15lD,t_id_uMOOrQu7,t_id_HcnnlRda,t_id_dhdJMT1K
...,...,...,...,...,...,...
485,ID_0RVNUKK1,t_id_uMOOrQu7,t_id_smNwfXAT,t_id_IP1t15lD,t_id_HcnnlRda,t_id_dhdJMT1K
486,ID_6405IKG3,t_id_uMOOrQu7,t_id_smNwfXAT,t_id_IP1t15lD,t_id_HcnnlRda,t_id_dhdJMT1K
487,ID_6WVPVB7S,t_id_uMOOrQu7,t_id_smNwfXAT,t_id_IP1t15lD,t_id_HcnnlRda,t_id_dhdJMT1K
488,ID_47C5LL2G,t_id_uMOOrQu7,t_id_smNwfXAT,t_id_IP1t15lD,t_id_HcnnlRda,t_id_dhdJMT1K


In [12]:
#Save submission data as CSV
submission.to_csv('../data/submission_extra_loc.csv', index = False)

### Misc

In [None]:
#Save model
'''
model.save('InceptionV3')
'''

In [None]:
#Load model
'''
loaded_model = tf.keras.models.load_model('InceptionV3)
loaded_model.layers[0].input_shape #(None, 150, 150, 3)
'''