## Preparing the environment

In [None]:
!gdown 1y-LdQ_4dbOip6sBgZ-Ub1FI6Hh5kl3h1

In [None]:
!pip install plot_keras_history

## Preparing the dataset

In [None]:
dataset_name = "places365_300"

In [None]:
!tar -xf {dataset_name}.tar

## Importing the necessary libraries

In [None]:
import os

import numpy as np
import tensorflow as tf

from plot_keras_history import show_history
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
from mpl_toolkits.axes_grid1 import ImageGrid
import math

## Preprocessing the dataset

The following code loads the images and builds a preprocessing pipeline with `tf.data` Datasets (TensorFlow structures that feed input data to a model).

Documentation: https://www.tensorflow.org/guide/data
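The pipeline derives each label from the name of the directory that contains the image. As a rough plain-Python sketch of that idea (the helper `label_from_path` and the file name are hypothetical and only serve as an illustration):

```python
import os

# Same class list as in the notebook; the index of a class is its integer label.
class_names = ["alley", "entrance_hall", "park"]


def label_from_path(file_path):
    """Plain-Python counterpart of the label lookup (illustrative helper)."""
    # The class is the name of the directory that contains the image.
    class_dir = file_path.split(os.path.sep)[-2]
    # Compare the directory name with every class name: a one-hot list of booleans.
    one_hot = [class_dir == name for name in class_names]
    # The position of the single True value is the integer label.
    return one_hot.index(True)


# Hypothetical path following the <dataset>/<split>/<class>/<file> layout.
example_path = os.path.join("places365_300", "train", "park", "00000001.jpg")
print(label_from_path(example_path))
```

The TensorFlow version does the same comparison on tensors and uses `tf.argmax` to find the position of the matching class.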

In [None]:
class_names = ["alley", "entrance_hall", "park"]


def get_label(file_path):
  parts = tf.strings.split(file_path, os.path.sep)
  one_hot = parts[-2] == class_names
  return tf.argmax(one_hot)


def process_path(file_path, img_size=(224, 224)):
  label = get_label(file_path)
  img = tf.io.read_file(file_path)
  img = tf.io.decode_jpeg(img, channels=3)
  img = tf.image.resize(img, [img_size[0], img_size[1]])
  return img, label


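def build_dataset(path, sub_path):
  # List every file matching <path>/<sub_path>/<class>/<file>.
  ds = tf.data.Dataset.list_files(f"{path}/{sub_path}/*/*")
  # Decode and resize the images in parallel.
  ds = ds.map(process_path, num_parallel_calls=tf.data.AUTOTUNE)
  # Cache the decoded images so later passes skip file reading and decoding.
  ds = ds.cache()
  ds = ds.batch(64)
  # Prepare the next batches while the current one is being consumed.
  ds = ds.prefetch(tf.data.AUTOTUNE)
  return ds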

In [None]:
train_ds = build_dataset(dataset_name, "train")
val_ds = build_dataset(dataset_name, "val")

## Building the model

This code builds a model that will be trained in the following cells. The field `base_model` is initialized with a pre-defined architecture loaded from Keras.

You can find the list of possible architectures at https://keras.io/api/applications/

On top of `base_model`, we add a fully-connected layer with 3 neurons (one per class) and a softmax activation function. This setting scales the output so that it can be interpreted as a probability distribution in which all of the probabilities sum up to 1.
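To make the effect of softmax concrete, here is a minimal plain-Python sketch. The input values are made up and stand in for the raw outputs of the 3-neuron layer for a single image; Keras computes the same function internally on tensors.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability; the result is unchanged.
    shifted = [value - max(logits) for value in logits]
    exps = [math.exp(value) for value in shifted]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical raw outputs of the 3-neuron layer for one image.
logits = [2.0, 1.0, 0.1]
probabilities = softmax(logits)

print(probabilities)       # three positive values, largest for the first class
print(sum(probabilities))  # approximately 1
```

The class with the largest raw output also gets the largest probability, which is why `np.argmax` can be applied directly to the model output later on.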


In [None]:
class Places365Model(tf.keras.Model):
  def __init__(self):
    super().__init__()
    self.base_model = tf.keras.applications.MobileNet(
      input_shape=(224, 224, 3),
      include_top=False,
      weights=None,
      classes=3,
      pooling="avg"
    )
    self.fc = tf.keras.layers.Dense(3, activation='softmax')

  def call(self, x):
    x = self.base_model(x)
    return self.fc(x)


model = Places365Model()
# Pass metrics as a list; newer Keras versions reject a bare string here.
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

## Training the model

In [None]:
history = model.fit(
    train_ds,
    epochs=10)

## Plotting the metrics

Let's see the results of our training!

In [None]:
show_history(history)
plt.close()

## Confusion matrix

A confusion matrix is a tool that helps us interpret the results and draw conclusions about the characteristics of the model.

It is a table that shows how many predictions were correct and how many mistakes the model made, focusing on which classes it confuses with each other.

Each row represents a true class, and each column represents a predicted class.
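As a small illustration of this layout, the sketch below builds a confusion matrix by hand in plain Python. The six labels are made up for the example and are not taken from the validation set.

```python
class_names = ["alley", "entrance_hall", "park"]

# Made-up labels for six images, for illustration only.
y_true = ["alley", "alley", "park", "park", "entrance_hall", "park"]
y_pred = ["alley", "park", "park", "park", "entrance_hall", "alley"]

# Map each class name to its row/column index.
index = {name: i for i, name in enumerate(class_names)}

# matrix[i][j] counts images of true class i that were predicted as class j.
matrix = [[0] * len(class_names) for _ in class_names]
for true_name, pred_name in zip(y_true, y_pred):
    matrix[index[true_name]][index[pred_name]] += 1

for name, row in zip(class_names, matrix):
    print(f"{name:>14}: {row}")
```

Correct predictions end up on the diagonal, and every off-diagonal cell counts one specific kind of confusion, for example a true `alley` predicted as `park`.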

In [None]:
results = model.predict(val_ds)
ds = list(val_ds.unbatch().as_numpy_iterator())

y_true = [class_names[entry[1]] for entry in ds]
y_pred = [class_names[np.argmax(result)] for result in results]

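# Fix the label order explicitly so rows and columns always match the tick labels.
matrix_confusion = confusion_matrix(y_true, y_pred, labels=class_names)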
f, ax = plt.subplots(figsize=(10, 10))
ax = sns.heatmap(matrix_confusion, square=True, annot=True, cmap='Blues', fmt='d', cbar=False,
                 xticklabels=class_names, yticklabels=class_names, annot_kws={"fontsize": 20})
ax.xaxis.tick_top()
ax.xaxis.set_label_position('top')
plt.tick_params(axis='both', which='major', labelsize=25, labelbottom=False, left=False, bottom=False, top=False, labeltop=True)

## Classification report

Let's see the classification report!

We can observe the following metrics:
- **precision** - this is a frequentist probability that the predicted class is true. It is calculated with the equation TP / (TP + FP). A single positive prediction that happens to be correct is enough for a precision of 1, even if every other positive example was missed (False Negatives).
- **recall** - this is a frequentist probability that the true class is not missed. It is calculated with the equation TP / (TP + FN). If all the predictions are positive, recall will have a value of 1.

As a rule of thumb, increasing precision comes at the cost of recall, and vice versa. This is why we have another metric:
- **f1** - the harmonic mean of precision and recall: 2 * precision * recall / (precision + recall).
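The trade-off between precision and recall is easiest to see with a binary classifier whose decision threshold can be moved. The sketch below is an illustration under that assumption: the scores, labels, thresholds, and the helper `precision_recall_f1` are all made up, and the notebook's model itself picks the class with `argmax` rather than a threshold.

```python
def precision_recall_f1(scores, labels, threshold):
    # An example is predicted positive when its score reaches the threshold.
    predicted = [score >= threshold for score in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    # Undefined ratios (division by zero) are treated as 0 here by convention.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up scores for the positive class and the matching true labels (1 = positive).
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 1, 0, 1, 1]

results_by_threshold = {}
for threshold in (0.2, 0.5, 0.8):
    results_by_threshold[threshold] = precision_recall_f1(scores, labels, threshold)
    precision, recall, f1 = results_by_threshold[threshold]
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}, f1={f1:.2f}")
```

On this toy data, raising the threshold makes the positive predictions more reliable (precision goes up) but misses more true positives (recall goes down), while f1 summarizes both in a single number.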

On the other hand, we have:
- **accuracy** - the ratio of correct predictions (both True Positives and True Negatives) to all predictions. It is calculated with the equation (TP + TN) / (TP + FP + TN + FN).

In general, f1 and accuracy summarize the overall quality of the model, while precision and recall describe its additional characteristics.

When the dataset is imbalanced, f1 is much more reliable than accuracy, because a model can reach a high accuracy simply by always predicting the majority class.
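A toy example shows why accuracy can be misleading on imbalanced data. The numbers below are made up: a validation set with 95 negative and 5 positive examples, and a degenerate model that always predicts the negative class.

```python
# Made-up imbalanced validation set: 95 negatives and 5 positives.
y_true = [0] * 95 + [1] * 5
# A degenerate model that always predicts the majority (negative) class.
y_pred = [0] * 100

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = (tp + tn) / (tp + fp + tn + fn)
# Undefined ratios (division by zero) are treated as 0 here by convention.
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(f"accuracy={accuracy:.2f}, f1={f1:.2f}")
```

The accuracy looks excellent even though the model never finds a single positive example, whereas the f1 score for the positive class drops to 0 and exposes the problem.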


The following article explains all the above quite well:

https://medium.com/analytics-vidhya/confusion-matrix-accuracy-precision-recall-f1-score-ade299cf63cd

In [None]:
print(classification_report(y_true, y_pred, target_names=class_names))

## Misclassified images

Let's plot the misclassified images and see how our model confused their classes!

In [None]:
images_to_plot = []

for result, gt in zip(results, ds):
  img, label = gt
  predicted_class = np.argmax(result)
  if predicted_class != label:
    images_to_plot.append((img, label, predicted_class))

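# Use 4 columns and keep at least one row so the grid stays valid even if nothing was misclassified.
n_cols = 4
n_rows = max(1, math.ceil(len(images_to_plot) / n_cols))

fig = plt.figure(figsize=(128., 128.))
grid = ImageGrid(fig, 111, nrows_ncols=(n_rows, n_cols), axes_pad=0.6)

for ax, im in zip(grid, images_to_plot):
  ax.set_title(f"True: {class_names[im[1]]}, predicted: {class_names[im[2]]}", loc='center', color="k", fontsize=15)
  # Pixel values are floats in [0, 255] after resizing; imshow expects floats in [0, 1].
  ax.imshow(im[0] / 255)

plt.show()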