# Notes from module-8 live session of ml-zoomcamp 

Course homepage: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Lisa Novello

### 8.1 Fashion Classification

The goal of this module is to build a multiclass classifier that uses a neural network to sort clothing images into 10 different classes. The dataset used in this notebook can be retrieved [here](https://github.com/alexeygrigorev/clothing-dataset-small).

### 8.2 TensorFlow and Keras

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.image import load_img

%matplotlib inline

Let's now load an image. We specify the `target_size` because a neural network (NN) expects its input images to have a fixed size, and `load_img` resizes the image to that size.

In [9]:
path = "./dataset/train/t-shirt"
name = '5f0a3fa0-6a3d-4b68-b213-72766a643de7.jpg'
fullname = f'{path}/{name}'
img = load_img(fullname, target_size = (299, 299))
print(img)

<PIL.Image.Image image mode=RGB size=299x299 at 0x7FDE015B5390>


We can easily convert this image into a NumPy array as follows:

In [10]:
x = np.array(img)
x.shape

(299, 299, 3)
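
The three axes of this array are height, width, and colour channels. Because our image is square, the axis order is not obvious from the shape alone. A minimal sketch, assuming Pillow is installed and using a synthetic non-square image in place of a dataset file (the image and its size are made up for illustration), makes the order visible:

```python
import numpy as np
from PIL import Image

# Synthetic stand-in for a dataset image: 150 px wide, 100 px high, solid red.
img = Image.new("RGB", (150, 100), color=(255, 0, 0))

x = np.array(img)

print(img.size)   # PIL reports (width, height)
print(x.shape)    # NumPy reports (height, width, channels)
print(x.dtype)    # pixel intensities are stored as 8-bit unsigned integers
print(x[0, 0])    # RGB values of the top-left pixel
```

Note that PIL and NumPy list width and height in opposite orders, which matters as soon as the images are not square.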

### 8.3 Pre-trained Convolutional Neural Networks

Let's use a pre-trained CNN to understand what is in our image.

At [this webpage](https://keras.io/api/applications/) you can find several pre-trained models. They were trained on [ImageNet](https://www.image-net.org/), a large dataset of labeled images.

Among them, there is the _Xception_ model. Let's use it.

In [11]:
from tensorflow.keras.applications.xception import Xception

Now, let's create our model, specifying two parameters:
- `weights = 'imagenet'` specifies that we want to use the weights obtained by training the model on ImageNet (so, again, we want a model that is already trained);
- `input_shape = (299, 299, 3)` specifies the shape of the input, which is our image.

In [12]:
model = Xception(weights = 'imagenet', input_shape = (299, 299, 3))

2022-10-27 19:09:51.026169: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-27 19:09:51.034089: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.


Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/xception/xception_weights_tf_dim_ordering_tf_kernels.h5


Now, let's use this model to classify our image, which shows a t-shirt. However, the model expects not just one image, but a batch of images, so we first add a batch dimension.

In [13]:
X = np.array([x])
X.shape

(1, 299, 299, 3)
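
Wrapping the image in a list, as above, is one way to add the batch axis. An equivalent way, sketched here on placeholder arrays rather than on the real image, is `np.expand_dims`; several images of the same size can be combined into a larger batch with `np.stack`:

```python
import numpy as np

# Placeholder arrays standing in for two loaded images (not real data).
x1 = np.zeros((299, 299, 3), dtype=np.uint8)
x2 = np.ones((299, 299, 3), dtype=np.uint8)

# Add a leading batch axis to a single image.
single = np.expand_dims(x1, axis=0)
print(single.shape)  # (1, 299, 299, 3)

# Stack several images of the same size into one batch.
batch = np.stack([x1, x2])
print(batch.shape)   # (2, 299, 299, 3)
```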

In [14]:
model.predict(X)



array([[0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e

We can see that these are mostly zeros or very small numbers. This is not what the output should look like. The reason is that the model expects its input in a particular format, so we have some preprocessing to do.

To do that, Keras provides a function called `preprocess_input`. The same function was applied to the images used for training the model, so we need to apply exactly the same function to the images for which we want predictions.

In [15]:
from tensorflow.keras.applications.xception import preprocess_input

In [16]:
X = preprocess_input(X)

Let's have a look at the preprocessed image:

In [17]:
X[0]

array([[[ 0.4039216 ,  0.3411765 , -0.2235294 ],
        [ 0.4039216 ,  0.3411765 , -0.2235294 ],
        [ 0.41960788,  0.35686278, -0.20784312],
        ...,
        [ 0.96862745,  0.9843137 ,  0.94509804],
        [ 0.96862745,  0.9843137 ,  0.94509804],
        [ 0.96862745,  0.99215686,  0.9372549 ]],

       [[ 0.47450984,  0.4039216 , -0.12156862],
        [ 0.4666667 ,  0.39607847, -0.12941176],
        [ 0.45882356,  0.38823533, -0.15294117],
        ...,
        [ 0.96862745,  0.9764706 ,  0.9372549 ],
        [ 0.96862745,  0.9764706 ,  0.9372549 ],
        [ 0.96862745,  0.9764706 ,  0.92941177]],

       [[ 0.56078434,  0.48235297, -0.00392157],
        [ 0.5686275 ,  0.4901961 ,  0.00392163],
        [ 0.5686275 ,  0.49803925, -0.01176471],
        ...,
        [ 0.9607843 ,  0.96862745,  0.92156863],
        [ 0.9607843 ,  0.96862745,  0.92156863],
        [ 0.9607843 ,  0.96862745,  0.92156863]],

       ...,

       [[ 0.2941177 ,  0.18431377, -0.40392154],
        [ 0

Now the values are no longer integers between 0 and 255: they have been rescaled to numbers between -1 and 1. This is the preprocessing the model needs in order to work correctly.
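
For Xception, this preprocessing amounts to a linear rescaling of each pixel value from the range [0, 255] to the range [-1, 1]. The function below is a hand-written sketch of that rescaling, not the Keras implementation; the name `scale_pixels` and the sample values are made up for illustration:

```python
import numpy as np

def scale_pixels(x):
    """Map pixel values from [0, 255] to [-1, 1]."""
    x = np.asarray(x, dtype=np.float32)
    return x / 127.5 - 1.0

# A few sample intensities: darkest, mid-range, and brightest.
pixels = np.array([0, 127.5, 179, 255])
print(scale_pixels(pixels))
```

Under this rescaling a pixel value of 179 maps to roughly 0.404, which is consistent with the first value printed for the preprocessed image.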

Let's now use the model to compute the predictions on our preprocessed image.

In [19]:
pred = model.predict(X)
pred



array([[3.23711633e-04, 1.57383489e-04, 2.13492734e-04, 1.52370194e-04,
        2.47625809e-04, 3.05035355e-04, 3.20591440e-04, 1.47498984e-04,
        2.03621399e-04, 1.49272106e-04, 1.95662491e-04, 2.10136932e-04,
        7.59262621e-05, 1.13971975e-04, 1.62683180e-04, 2.04638171e-04,
        1.97415735e-04, 1.44288439e-04, 1.40217075e-04, 1.73685446e-04,
        7.46688398e-04, 2.56966101e-04, 2.66808289e-04, 2.96513608e-04,
        3.73601360e-04, 2.77403655e-04, 2.16570581e-04, 2.27269644e-04,
        3.80812329e-04, 1.72165601e-04, 3.05400521e-04, 1.96430905e-04,
        3.92114365e-04, 4.78070637e-04, 2.91750825e-04, 3.25692788e-04,
        1.47394938e-04, 1.62361728e-04, 2.12710293e-04, 1.34028101e-04,
        2.40069989e-04, 6.75210380e-04, 2.54943036e-04, 1.44478588e-04,
        4.12820285e-04, 2.04408017e-04, 3.02957691e-04, 1.49339321e-04,
        1.99653310e-04, 2.27005366e-04, 2.93728663e-04, 2.27444398e-04,
        6.37643330e-04, 7.82614108e-04, 2.49556644e-04, 4.052699

Now the results are different: the values are no longer zero, though they are still quite small numbers.

In [20]:
pred.shape

(1, 1000)

The shape of the prediction array is `(1, 1000)` because we have one image and 1000 classes. Each value is the probability that the image belongs to the corresponding class.
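
Since each row holds class probabilities, its entries should sum to approximately 1, and `argmax` gives the index of the most likely class. A small sketch with a made-up 5-class prediction row, standing in for the real 1000-class output:

```python
import numpy as np

# Made-up prediction for 1 image and 5 classes (the real output has 1000).
toy_pred = np.array([[0.05, 0.70, 0.10, 0.05, 0.10]])

print(toy_pred.shape)            # one row per image, one column per class
print(toy_pred.sum(axis=1))      # each row sums to about 1
print(toy_pred.argmax(axis=1))   # index of the most likely class per image
```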

To make sense of this output, we need to know which class each of these values corresponds to.

For that, the xception module provides another function, `decode_predictions`, which makes the predictions human-readable. Let's apply it to our predictions:

In [21]:
from tensorflow.keras.applications.xception import decode_predictions

In [22]:
decode_predictions(pred)

Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json


[[('n03595614', 'jersey', 0.68196374),
  ('n02916936', 'bulletproof_vest', 0.038139913),
  ('n04370456', 'sweatshirt', 0.034324728),
  ('n03710637', 'maillot', 0.01135422),
  ('n04525038', 'velvet', 0.0018453576)]]

So, the top class is `'jersey'` (a sort of pullover or sweater). The reason is that the labels the model can predict do not include a `'t-shirt'` class.
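
Conceptually, decoding the predictions means sorting the probabilities of each image in descending order and looking up the label for each of the top indices. The sketch below imitates that idea with a made-up list of five labels; `top_k`, `toy_labels`, and the probabilities are hypothetical, and this is not the Keras implementation:

```python
import numpy as np

# Made-up labels and probabilities for illustration only.
toy_labels = ["jersey", "sweatshirt", "velvet", "maillot", "apron"]
toy_pred = np.array([[0.70, 0.15, 0.02, 0.08, 0.05]])

def top_k(pred, labels, k=3):
    """Return the k most probable (label, probability) pairs per image."""
    results = []
    for row in pred:
        # argsort is ascending, so reverse it and keep the first k indices.
        best = np.argsort(row)[::-1][:k]
        results.append([(labels[i], float(row[i])) for i in best])
    return results

print(top_k(toy_pred, toy_labels))
```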

So, this model is not a good fit for our purpose of clothing classification. We need a different model, trained on the classes that we need for our particular case. But we don't need to train it from scratch: we can reuse this pre-trained model, _i.e._ build on top of it and adapt it to our particular use case.