## Training the end to end model

In this notebook the end to end model of the project is trained. That is, we train a convolutional neural network to identify one of 17 cities in which a picture was taken.

In [1]:
# First import all the relevant libraries
import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers,models,Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Conv2D, Dropout, Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.applications import MobileNetV2



2024-05-29 22:13:51.665039: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [2]:
number_of_epochs_to_train=6 #how many epochs you want to train for
batch_size = 128 #the batch size - after a few attempts this delivered the best result with the current network architecture

# Since not all pictures in the dataset are the same resolution we need to decide on how to scale the pictures, and we chose 400 x 300
# pixels
img_height = 400 
img_width = 300

## Mobilenet
The backbone of our neural network is a pretrained model named MobileNetV2 (https://keras.io/api/applications/mobilenet/). This is a very high performing image recognition network, which has the advantage of being complex but not having too many parameters. We used this at the back of our model and added two additional Dense layers of 256 neurons each with a 20% dropout rate between them. We only trained the parameters of the two additional  layers.

In [3]:
# First we download the MobileNet model
pretrained_model = tf.keras.applications.MobileNetV2(
    input_shape=(img_height, img_width, 3),
    include_top=False, # So we can customize our output layers
    weights='imagenet',
    pooling='avg'
)

pretrained_model.trainable = False # to fix the parameters of mobilenet

  pretrained_model = tf.keras.applications.MobileNetV2(


In [4]:
inputs = pretrained_model.input


# Create two additional dense layers
x = Dense(256, activation='relu')(pretrained_model.output)
x = Dropout(0.2)(x)
x = Dense(256, activation='relu')(x)
x = Dropout(0.2)(x)

# Finally we add an output layer
outputs = Dense(17, activation='relu')(x) #Because we have 17 different classes

model = Model(inputs=inputs, outputs=outputs)


model.compile(optimizer='adam',
          loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
          metrics=['accuracy'])


model.summary()



In the next block of code we create a generator and create a training and validation dataset. The use of generators was necessary for us because without them we had to deal with memory overflow problems during the model training.

In [17]:
# The preprocessing_function attribute is important in order to take full advantage of the pretrained mobilenet model 

dir_train = '/kaggle/input/train-train-clean/train_train' #The directory of the training set
dir_val = '/kaggle/input/train-val-clean/train_val' # The directory of the validation set

datagen = tf.keras.preprocessing.image.ImageDataGenerator(preprocessing_function=tf.keras.applications.mobilenet_v2.preprocess_input)

# The method flow_from_directory automatically assigns labels to the pictures based on the subdirectory in which they are stored.

train_ds = datagen.flow_from_directory(dir_train,
                                       shuffle=True,
                                       target_size=(img_height, img_width), 
                                       batch_size=batch_size, 
                                       class_mode='sparse',
                                       color_mode = 'rgb',
                                       classes = ['Madrid',
                                                 'Phoenix',
                                                 'Miami',
                                                 'Boston',
                                                 'Brussels',
                                                 'Rome',
                                                 'Barcelona',
                                                 'Chicago',
                                                 'Lisbon',
                                                 'Melbourne',
                                                 'Minneapolis',
                                                 'Bangkok',
                                                 'TRT',
                                                 'London',
                                                 'PRG',
                                                 'Osaka',
                                                 'PRS'])

val_ds = datagen.flow_from_directory(dir_val,
                                     shuffle=True,
                                     target_size=(img_height, img_width),
                                     batch_size=batch_size, 
                                     color_mode = 'rgb',
                                     class_mode='sparse',
                                     classes = ['Madrid', 
                                                 'Phoenix',
                                                 'Miami',
                                                 'Boston',
                                                 'Brussels',
                                                 'Rome',
                                                 'Barcelona',
                                                 'Chicago',
                                                 'Lisbon',
                                                 'Melbourne',
                                                 'Minneapolis',
                                                 'Bangkok',
                                                 'TRT',
                                                 'London',
                                                 'PRG',
                                                 'Osaka',
                                                 'PRS'])


class_names = train_ds.class_indices.keys()
print(class_names)
num_classes = len(class_names)

Found 20898 images belonging to 17 classes.
Found 5232 images belonging to 17 classes.
dict_keys(['Madrid', 'Phoenix', 'Miami', 'Boston', 'Brussels', 'Rome', 'Barcelona', 'Chicago', 'Lisbon', 'Melbourne', 'Minneapolis', 'Bangkok', 'TRT', 'London', 'PRG', 'Osaka', 'PRS'])


The next block trains the model using the training data.

In [16]:
model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=number_of_epochs_to_train
)


[1m164/164[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m136s[0m 802ms/step - accuracy: 0.5935 - loss: 1.3021 - val_accuracy: 0.5598 - val_loss: 1.4067


<keras.src.callbacks.history.History at 0x7a4ab363ae00>

Finally we save the model using the .keras format.

In [None]:
model.save('/kaggle/working/clean_batch_size_128_big.keras')