# Image Modelling Part 2 - Use Pipeline

In the first notebook we used shell commands to prepare our data and split it into a training and an evaluation set. 
We also defined functions that import our pictures together with their class labels and, optionally, augment the data. 
Now we will import those functions from the `image_modeling.py` file and use them to simplify the data preparation step in this notebook. 
Finally, we will use TensorFlow and Keras to create and train a neural network that identifies turtles.

In [1]:
# Import required packages 
import tensorflow as tf
import image_modeling   # import image_modeling.py file
import tensorflow_hub as hub
import datetime
import csv
# Load the TensorBoard notebook extension
%load_ext tensorboard



In [2]:
# Clear any logs from previous runs
!rm -rf ./logs/

In [3]:
# Check the TensorFlow version
print(tf.__version__)
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.INFO)

2.7.0


In [4]:
# Import variables from the image_modeling.py file
file = open("../data/train.csv")
reader = csv.reader(file)

HEIGHT = image_modeling.HEIGHT
WIDTH = image_modeling.WIDTH
NCLASSES = image_modeling.NCLASSES
CLASS_NAMES = image_modeling.CLASS_NAMES
BATCH_SIZE = image_modeling.BATCH_SIZE
TRAINING_SIZE = image_modeling.TRAINING_SIZE
TRAINING_STEPS = (TRAINING_SIZE // BATCH_SIZE)
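
`TRAINING_STEPS` is the number of complete batches that fit into one pass over the training data. The following minimal sketch works through the arithmetic with assumed values: 1502 is the number of training images reported when the generators are created, and 32 is the default `batch_size` of `train_and_evaluate()`. The actual values in `image_modeling.py` may differ.

```python
# Assumed values, not read from image_modeling.py.
training_size = 1502
batch_size = 32

# Floor division keeps only the complete batches of one epoch.
training_steps = training_size // batch_size

# Images that do not fill a complete batch.
leftover_images = training_size % batch_size

print(training_steps)
print(leftover_images)
```

With these assumed numbers, one epoch covers 46 full batches and leaves 30 images over.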

Double-check that the variables now contain the correct values. ;) 

In [5]:
# You can compare this output with the variables in the image_modeling.py file...
print(HEIGHT)
print(CLASS_NAMES)
print(NCLASSES)
print(TRAINING_STEPS)
print(TRAINING_SIZE)

224
['t_id_VP2NW7aV', 't_id_qZ0iZYsC', 't_id_3b65X5Lw', 't_id_YjXYTCGC', 't_id_d6aYXtor', 't_id_ksTLswDT', 't_id_hRzOoJ2t', 't_id_utw0thCe', 't_id_k1rScFLB', 't_id_n2FBHk6d', 't_id_ZfvZBX4Q', 't_id_G5eoqwD8', 't_id_FBsGDJhU', 't_id_Ts5LyVQz', 't_id_NW7wn8TC', 't_id_JI6ba2Yx', 't_id_ifWwxWF4', 't_id_uIlC9Gfo', 't_id_dVQ4x3wz', 't_id_3K93fQBS', 't_id_IlO9BOKc', 't_id_DPYQnZyv', 't_id_ROFhVsy2', 't_id_BI99coHt', 't_id_GrxmyS59', 't_id_AOWArhGb', 't_id_4XiPKIk7', 't_id_mpuNp8mf', 't_id_stWei2Uq', 't_id_15bo4NKD', 't_id_QqeoI5F3', 't_id_Kf73l69A', 't_id_Kc1tXDbJ', 't_id_2Yn71r7R', 't_id_iZQiE7wb', 't_id_m2JvEcsg', 't_id_a4VYrmyA', 't_id_UVQa4BMz', 't_id_tjWepji1', 't_id_BXWccqAn', 't_id_1KIezxkh', 't_id_e9i3Lbq4', 't_id_bYageLYA', 't_id_8b8sprYe', 't_id_2QmcRkNj', 't_id_9GFmcOd5', 't_id_smNwfXAT', 't_id_hibDzPAP', 't_id_D3kHUEgp', 't_id_B7LaSiac', 't_id_fjHGjp1w', 't_id_gJaKYxBQ', 't_id_72SiiZCp', 't_id_IP1t15lD', 't_id_uJXT7dGu', 't_id_7gFFZy7i', 't_id_87CLFCvE', 't_id_J5dngbNA', 't_id_Hcn

## Building our Model

Building and training a neural network involves various steps: 
1. define the architecture of the model
2. compile the model
3. train the model
4. evaluate the model

We start by defining the architecture. Our neural network consists of several layers that are chained together. The input layer takes our input data and hands it over to the flatten layer, which reformats it: each image is transformed from a three-dimensional array of shape (HEIGHT, WIDTH, 3) into a one-dimensional array of size HEIGHT * WIDTH * 3. 
After the pixels are flattened, a dense layer returns an array of logits with length `NCLASSES`. Each node in this layer holds a score that indicates how strongly the model believes the current image belongs to the corresponding class. 
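
To make the flattening step concrete, here is a small pure-Python sketch. The `flatten` helper and the tiny 2 x 2 image are purely illustrative and not part of the notebook's pipeline. The final size assumes HEIGHT = WIDTH = 224.

```python
def flatten(image):
    """Flatten a nested (height, width, channels) list into one flat list."""
    return [value for row in image for pixel in row for value in pixel]


# Tiny 2 x 2 RGB "image", purely illustrative.
tiny_image = [
    [[0, 1, 2], [3, 4, 5]],
    [[6, 7, 8], [9, 10, 11]],
]

flat = flatten(tiny_image)
print(flat)  # the values 0..11 in row-major, channels-last order

# Assumed notebook values: HEIGHT = WIDTH = 224 with 3 color channels.
height, width, channels = 224, 224, 3
flat_size = height * width * channels
print(flat_size)
```

Each 224 x 224 RGB image therefore becomes a vector of 150528 numbers before it reaches the dense layer.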

### Simple Model

In [6]:
# Let's create a simple linear model.
def linear_model():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.InputLayer(input_shape=[HEIGHT, WIDTH, 3], name='image'))
    model.add(tf.keras.layers.Flatten(data_format="channels_last"))
    # We want to have a simple linear model so we have 
    # no activation function. 
    model.add(tf.keras.layers.Dense(units=NCLASSES, activation=None))
    return model

Before we can train our model we need to compile it and define more settings. We have to choose a loss function, an optimizer and metrics. 
* The **loss function** measures how accurate the model is during training by calculating the model error. Usually we want to minimize this function to improve our model. As you can see in the [TensorFlow documentation](https://www.tensorflow.org/api_docs/python/tf/keras/losses) there are lots of different loss functions to choose from. Some, e.g. the mean squared error, hopefully look familiar to you. ;) 
* The **[optimizer](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers)** defines how the model is updated based on the data and the loss function. One optimizer we've already covered earlier and which is also used for neural networks is the stochastic gradient descent (SGD) algorithm.   
* The **metric** is used to monitor the training process. Here we can choose one of the metrics we've already encountered or [many more](https://www.tensorflow.org/api_docs/python/tf/keras/metrics). 
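
Since our models return raw logits, it helps to see what a cross-entropy loss computed from logits does. The sketch below is a hand-written illustration for a single sample with three classes; it is not the Keras implementation, and the helper names are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def crossentropy_from_logits(one_hot_label, logits):
    """Cross-entropy between a one-hot label and softmax(logits)."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs))


logits = [2.0, 1.0, 0.1]

# The true class has the highest score: small loss.
loss_correct = crossentropy_from_logits([1, 0, 0], logits)
# The true class has the lowest score: large loss.
loss_wrong = crossentropy_from_logits([0, 0, 1], logits)

print(round(loss_correct, 3), round(loss_wrong, 3))
```

The loss is small when the highest logit belongs to the true class and grows as the true class receives a lower score, which is what the optimizer tries to correct.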

The following function compiles our model, creates the training and validation generators with the `preprocess()` and `generate_augmented_image()` functions from the `image_modeling.py` file, and trains the model on that data. At the end, the function returns our fitted model. 

In [7]:
def train_and_evaluate(model,batch_size=32):

    model.compile(
        optimizer="adam", 
        # The model outputs raw logits, hence from_logits=True. This loss assumes
        # one-hot-encoded labels; integer labels would need the sparse version.
        loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
        metrics=['accuracy']
    )
    
    
    log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
    train_datagen, test_datagen = image_modeling.preprocess()
    train_generator, validation_generator = image_modeling.generate_augmented_image(train_datagen, test_datagen, augment_randomly=False)
    
    model.fit(
        train_generator, 
        validation_data=validation_generator,
        steps_per_epoch=TRAINING_STEPS, 
        epochs=10,
        callbacks=[tensorboard_callback])
          
    return model

In [8]:
# Build and train our model using the previously defined functions 
model = linear_model()
trained_model = train_and_evaluate(model, BATCH_SIZE)

Metal device set to: Apple M1
Found 1502 validated image filenames belonging to 100 classes.
Found 641 validated image filenames belonging to 100 classes.


2022-02-03 14:52:16.090085: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


Let us use TensorBoard to monitor our results:

In [9]:
%tensorboard --logdir logs/fit

### Deep Neural Network

Our simple model is not performing well. Maybe we can boost its performance by adding more layers.

In the following `dnn_model()` function we add three hidden dense layers after the flatten layer to increase our model's complexity. 

In [12]:
# Let's compare a neural network with hidden layers to the linear model
def dnn_model():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.InputLayer(input_shape=[HEIGHT, WIDTH, 3], name='image'))
    model.add(tf.keras.layers.Flatten(data_format="channels_last"))
    model.add(tf.keras.layers.Dense(units = 40, activation = "relu"))
    model.add(tf.keras.layers.Dense(units = 40, activation = "relu"))
    model.add(tf.keras.layers.Dense(units = 30, activation = "relu"))
    # The output layer has no activation function, 
    # so the model returns raw logits. 
    model.add(tf.keras.layers.Dense(units=NCLASSES, activation=None))
    return model
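
As a rough check on how much the hidden layers change the model size, the parameters of a stack of dense layers can be counted by hand. This is a sketch under the assumptions HEIGHT = WIDTH = 224 and NCLASSES = 100; `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(layer_sizes):
    """Total weights and biases of a stack of fully connected layers."""
    pairs = zip(layer_sizes, layer_sizes[1:])
    return sum(n_in * n_out + n_out for n_in, n_out in pairs)


flat_size = 224 * 224 * 3  # assumed HEIGHT = WIDTH = 224, 3 channels
n_classes = 100            # assumed NCLASSES

# Linear model: flatten -> output layer.
linear_total = dense_params([flat_size, n_classes])
# DNN: flatten -> 40 -> 40 -> 30 -> output layer.
dnn_total = dense_params([flat_size, 40, 40, 30, n_classes])

print(linear_total)
print(dnn_total)
```

Under these assumptions the DNN actually has fewer parameters than the linear model, because its first layer maps the pixels to only 40 units. The extra expressiveness comes from the stacked ReLU nonlinearities rather than from the raw parameter count.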

In [13]:
# Let us fit the deep neural network
model = dnn_model()
trained_model = train_and_evaluate(model, BATCH_SIZE)

Found 1502 validated image filenames belonging to 100 classes.
Found 641 validated image filenames belonging to 100 classes.
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [None]:
test(model)

Adding more hidden layers to our model did indeed increase the accuracy. Still, the model's performance leaves something to be desired. Since we are working with images, switching to a convolutional neural network might help.

### Convolutional Neural Network

CNNs are widely used for image recognition. They are regularized versions of DNNs that can be deeper without generating as many parameters, thanks to their [convolutional](https://machinelearningmastery.com/convolutional-layers-for-deep-learning-neural-networks/) and pooling layers.

The architecture of our CNN is even more complex. This time we combine dense layers with `Conv2D` and `MaxPooling2D` layers. The convolutional and max pooling layers are inserted between the input layer and the flatten layer. 

In [14]:
# Now let us move on to a CNN model. 
def cnn_model():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.InputLayer(input_shape=[HEIGHT, WIDTH, 3], name='image'))
    model.add(tf.keras.layers.Conv2D(filters=10, kernel_size=[5, 5], padding="same", activation="relu"))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=[2, 2], strides=2))
    model.add(tf.keras.layers.Conv2D(filters=20, kernel_size=[5, 5], padding="same", activation="relu"))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=[2, 2], strides=2))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(units=300, activation="relu"))
    model.add(tf.keras.layers.Dense(units=NCLASSES, activation=None))
    return model

We can have a look at the architecture of our model with the method `.summary()`. As you can see in the summary below, the output of each `Conv2D` and `MaxPooling2D` layer is also a three-dimensional tensor of shape (height, width, channels). As we go deeper into the network, the height and width shrink. One advantage of the shrinking dimensions is that we can computationally afford to add more output channels in each convolutional layer. We can control the number of output channels of those layers with the `filters` argument. 

However, at the end of our model we still need the combination of the flatten and dense layers to perform classification. 
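
The output shapes and parameter counts that `model.summary()` reports for this architecture can be reproduced by hand. The following sketch assumes HEIGHT = WIDTH = 224; the two helper functions are hypothetical and only encode the standard formulas for a convolution with `padding="same"` and a 2 x 2 max pooling with stride 2.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Each filter has one weight per kernel cell and input channel, plus a bias."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def max_pool_size(size, pool=2, stride=2):
    """Spatial size after max pooling without padding."""
    return (size - pool) // stride + 1


side = 224  # assumed HEIGHT = WIDTH = 224

# Conv2D with padding="same" and stride 1 keeps height and width unchanged.
conv1_params = conv2d_params(5, 5, in_channels=3, filters=10)
side_after_pool1 = max_pool_size(side)

conv2_params = conv2d_params(5, 5, in_channels=10, filters=20)
side_after_pool2 = max_pool_size(side_after_pool1)

# The flatten layer turns the (56, 56, 20) tensor into one long vector.
flat_size = side_after_pool2 * side_after_pool2 * 20

print(conv1_params, conv2_params)
print(side_after_pool1, side_after_pool2, flat_size)
```

Note how few parameters the convolutional layers need compared with a dense layer applied to the flattened image: the same small kernel is reused at every position.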

In [15]:
model = cnn_model()
model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 224, 224, 10)      760       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 112, 112, 10)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 112, 112, 20)      5020      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 56, 56, 20)       0         
 2D)                                                             
                                                                 
 flatten_2 (Flatten)         (None, 62720)             0         
                                                                 
 dense_5 (Dense)             (None, 300)              

In [16]:
# Let us fit the convolutional neural network
model = cnn_model()
trained_model = train_and_evaluate(model, BATCH_SIZE)

Found 1502 validated image filenames belonging to 100 classes.
Found 641 validated image filenames belonging to 100 classes.
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [None]:
test(model)

We see that the CNN gives better results than the DNN, but the model still overfits heavily.

>__Exercise__: Try to reduce the overfitting of your model by: 
- making full use of your data augmentation by increasing `steps_per_epoch` in the `train_and_evaluate` function,
- [adding regularization](https://keras.io/api/layers/regularizers/) to the `Conv2D` and `Dense` layers,
- adding dropout layers (see the section "CNN Dropout Regularization" of [this article](https://machinelearningmastery.com/how-to-reduce-overfitting-with-dropout-regularization-in-keras/)).

You can also try adding additional `Conv2D`, `MaxPooling2D` or `Dense` layers.

## Transfer Learning

Transfer learning means that a model trained on one task is reused for another task. One approach to transfer learning is fine-tuning: you take a trained neural net, exchange the last layer (the head) for a new layer that fits the new task, and then train only the weights of that new layer. 

First we need to download the headless model (this can take a while). We use MobileNetV2, a CNN that was trained on the [ImageNet](https://en.wikipedia.org/wiki/ImageNet) dataset, which consists of over 14 million images:

In [17]:
feature_extractor_url = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/2"
feature_extractor_layer = hub.KerasLayer(feature_extractor_url,input_shape=(HEIGHT,WIDTH,3))

We only want to train the last layer, so we freeze the layers of our headless model:

In [18]:
feature_extractor_layer.trainable = False

Now we define our model by simply adding the output layer to our pretrained net.

In [19]:
def transfer_learning_model():
    model = tf.keras.models.Sequential()
    model.add(feature_extractor_layer)
    # TODO: add the correct output layer here
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(units=300, activation="relu"))
    model.add(tf.keras.layers.Dense(units=NCLASSES, activation=None))
    return model
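
With the feature extractor frozen, only the newly added dense layers are trained. The sketch below estimates how many trainable parameters that leaves. It assumes that the MobileNetV2 feature vector has 1280 entries and that NCLASSES = 100; both values are assumptions made for illustration and should be checked against `model.summary()`.

```python
# Assumed values, not read from the notebook.
feature_size = 1280  # length of the MobileNetV2 feature vector
hidden_units = 300   # size of the hidden dense layer in the new head
n_classes = 100      # NCLASSES

# Each dense layer has one weight per (input, output) pair plus one bias per output.
hidden_params = feature_size * hidden_units + hidden_units
output_params = hidden_units * n_classes + n_classes
trainable_params = hidden_params + output_params

print(trainable_params)
```

Under these assumptions only a few hundred thousand weights are updated during training, which is one reason this approach can cope with a small dataset.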

In [20]:
# Let us fit our transfer learning model
model = transfer_learning_model()
trained_model = train_and_evaluate(model, BATCH_SIZE)

Found 1502 validated image filenames belonging to 100 classes.
Found 641 validated image filenames belonging to 100 classes.
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [21]:
%tensorboard --logdir logs/fit

Reusing TensorBoard on port 6006 (pid 6822), started 0:24:32 ago. (Use '!kill 6822' to kill it.)

In [None]:
test(model)

As we can see, the results of fine-tuning surpass those of the linear model, the DNN and the CNN. Fine-tuning is a very powerful approach that can generalize well even with a limited amount of data.

In [22]:
from tensorflow.keras.applications.inception_v3 import InceptionV3
base_model = InceptionV3(input_shape = (224, 224, 3), include_top = False, weights = 'imagenet')

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/inception_v3/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5


In [26]:
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras import datasets, layers, models

In [38]:
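# Freeze the pretrained layers and add a new classification head
for layer in base_model.layers:
    layer.trainable = False

x = layers.Flatten()(base_model.output)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dropout(0.2)(x)

# Add a final softmax layer with one node per class for the classification output
x = layers.Dense(NCLASSES, activation='softmax')(x)

model = tf.keras.models.Model(base_model.input, x)

# The output layer already applies softmax, so the loss works on probabilities.
# This assumes one-hot-encoded labels, as in train_and_evaluate().
model.compile(optimizer=RMSprop(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['acc'])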
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
train_datagen, test_datagen = image_modeling.preprocess()
train_generator, validation_generator = image_modeling.generate_augmented_image(train_datagen, test_datagen, augment_randomly=False)
    
inception =  model.fit(
        train_generator, 
        validation_data=validation_generator,
        steps_per_epoch=1000 // 32, 
        epochs=10,
        callbacks=[tensorboard_callback])



Found 1502 validated image filenames belonging to 100 classes.
Found 641 validated image filenames belonging to 100 classes.
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [31]:
%tensorboard --logdir logs/fit

Reusing TensorBoard on port 6006 (pid 6822), started 0:45:42 ago. (Use '!kill 6822' to kill it.)

In [51]:
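import numpy as np

# A Keras generator yields (images, labels) batches, so draw one batch with next()
array, label = next(train_generator)
y_preds = model.predict(train_generator)
# Keep the indices of the five highest-scoring classes for every image
y_preds = np.argsort(y_preds, axis=1)[:,-5:]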
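
Selecting the five highest-scoring classes per image is the basis of a top-5 accuracy. The following pure-Python sketch shows the idea on made-up scores for two images and seven classes. The helper names and the numbers are hypothetical. Unlike `np.argsort(...)[:, -5:]`, which returns the five best indices in ascending order of score, this helper returns them best first.

```python
def top_k_indices(scores, k=5):
    """Indices of the k highest scores, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def top_k_accuracy(all_scores, true_labels, k=5):
    """Share of samples whose true class is among the k best predictions."""
    hits = sum(
        label in top_k_indices(scores, k)
        for scores, label in zip(all_scores, true_labels)
    )
    return hits / len(true_labels)


# Made-up class scores for two images and seven classes.
all_scores = [
    [0.10, 0.05, 0.30, 0.20, 0.15, 0.12, 0.08],
    [0.40, 0.01, 0.02, 0.03, 0.30, 0.20, 0.04],
]
true_labels = [0, 1]  # made-up class indices

print(top_k_indices(all_scores[0]))
print(top_k_accuracy(all_scores, true_labels))
```

In this toy example the true class of the first image is among its five best predictions, while the true class of the second image is not.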