# Week 8 Notes

## 8.1 [Fashion classification](https://github.com/DataTalksClub/machine-learning-zoomcamp/blob/master/08-deep-learning/01-fashion-classification.md)

This week our data will be images rather than tabular data. The images are of clothing and the objective is to classify the images into one of 10 categories (multi-class classification).

For example, a user uploads a picture to a website to sell an item of clothing, and a service suggests a category for it. That classification service will use a neural network.

For training neural networks, we will use TensorFlow and Keras. For further theoretical background, check out [cs231n](https://cs231n.github.io).

## 8.1b [Setting up the Environment on Saturn Cloud](https://github.com/DataTalksClub/machine-learning-zoomcamp/blob/master/08-deep-learning/01b-saturn-cloud.md)


Set up Saturn Cloud with GitHub. Instead of using my general GitHub SSH key, I created a dedicated one for my ml-zoomcamp repository.

Steps I followed:
1. Generate a new SSH key pair
1. Add the public key to GitHub (as a deploy key in the repository settings, not in the account-level settings)
1. Associate the key with the repository by adding a host alias to `~/.ssh/config`; the Git remote URL then uses `github-my-repo` in place of `github.com`
    ```bash
    Host github-my-repo
    HostName github.com
    User git
    IdentityFile ~/.ssh/my-repo-key
    ```
1. Add the private key to Saturn Cloud (same as the course instructions)

I ran into an error in the Saturn Cloud terminal when I tried to verify my SSH connection or ran `ssh -V`:

```
Saturn Cloud: OpenSSL version mismatch. Built against 30000020, you have 30300020
```

This error occurs because the OpenSSH client was built against one version of OpenSSL but loads a different one at runtime. The two numbers are hexadecimal version codes: read in the OpenSSL 3.x layout, `30000020` corresponds to 3.0.2 and `30300020` to 3.3.2. The mismatch prevents the SSH client from starting.
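The numbers in the error message can be decoded by hand. The sketch below assumes the OpenSSL 3.x layout `0xMNN00PP0` (major, minor, patch); `decode_openssl_version` is a hypothetical helper written for these notes, not part of any library.

```python
def decode_openssl_version(hex_code):
    """Split an OpenSSL 3.x version code (0xMNN00PP0) into (major, minor, patch)."""
    n = int(hex_code, 16)
    major = (n >> 28) & 0xF    # M: top hex digit
    minor = (n >> 20) & 0xFF   # NN: next two hex digits
    patch = (n >> 4) & 0xFF    # PP: two hex digits before the final one
    return major, minor, patch


# The two codes from the error message above.
built_against = decode_openssl_version("30000020")
loaded = decode_openssl_version("30300020")

print(built_against)  # (3, 0, 2)
print(loaded)         # (3, 3, 2)
```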

Solution:
Make the linker look in the system library directory first by running the following line in the terminal (the `export` only applies to the current shell session):
```bash
export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu
```


## 8.2 [TensorFlow and Keras](https://github.com/DataTalksClub/machine-learning-zoomcamp/blob/master/08-deep-learning/02-tensorflow-keras.md)


In [1]:
import numpy as np

In [3]:
import tensorflow as tf
from tensorflow.keras.utils import load_img

In [4]:
path = "./clothing-dataset-small/train/t-shirt"
name = "5f0a3fa0-6a3d-4b68-b213-72766a643de7.jpg"

In [5]:
fullname = f"{path}/{name}"

In [6]:
img = load_img(fullname, target_size=(299, 299))


In [11]:
x = np.array(img)

## 8.3 [Pre-Trained Convolutional Neural Networks](https://github.com/DataTalksClub/machine-learning-zoomcamp/blob/master/08-deep-learning/03-pretrained-models.md)

We will use an off-the-shelf neural network, Xception, pre-trained on ImageNet. Such models are available in `tensorflow.keras.applications`.

In [28]:
from tensorflow.keras.applications import Xception
from tensorflow.keras.applications.xception import preprocess_input
from tensorflow.keras.applications.xception import decode_predictions

In [10]:
model = Xception(
    # include_top=True,
    weights="imagenet",
    # input_tensor=None,
    input_shape=(299, 299, 3),
    # pooling=None,
    # classes=1000,
    # classifier_activation="softmax",
)

`x` is a single image; `X` is a batch of images. The model expects a batch, so we wrap `x` in an array even when there is only one image.

In [13]:
X = np.array([x])

In [14]:
X.shape

(1, 299, 299, 3)
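The leading `1` is the batch axis. As a minimal sketch of how a batch is built from one or several images, the example below uses placeholder arrays instead of real photos; `img_a` and `img_b` are assumptions for illustration, with the same shape that `np.array(img)` has for a 299x299 RGB image.

```python
import numpy as np

# Placeholder "images" with the shape of a 299x299 RGB image.
img_a = np.zeros((299, 299, 3), dtype=np.uint8)
img_b = np.ones((299, 299, 3), dtype=np.uint8)

# One image: wrapping it in a list adds the batch axis.
single_batch = np.array([img_a])

# Several images of the same size: stack them along a new first axis.
multi_batch = np.stack([img_a, img_b])

print(single_batch.shape)  # (1, 299, 299, 3)
print(multi_batch.shape)   # (2, 299, 299, 3)
```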

First we run the model on the raw image, without preprocessing. This gives bogus results, as shown below.

In [16]:
model.predict(X)



1/1 ━━━━━━━━━━━━━━━━━━━━ 3s 3s/step



array([[0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
        0.0000000e+00, 0.0000000e+00, ...

Now, let's preprocess:

In [18]:
X = preprocess_input(X)

In [19]:
X[0]

array([[[ 0.4039216 ,  0.3411765 , -0.2235294 ],
        [ 0.4039216 ,  0.3411765 , -0.2235294 ],
        [ 0.41960788,  0.35686278, -0.20784312],
        ...,
        [ 0.96862745,  0.9843137 ,  0.94509804],
        [ 0.96862745,  0.9843137 ,  0.94509804],
        [ 0.96862745,  0.99215686,  0.9372549 ]],

       [[ 0.47450984,  0.4039216 , -0.12156862],
        [ 0.4666667 ,  0.39607847, -0.12941176],
        [ 0.45882356,  0.38823533, -0.15294117],
        ...,
        [ 0.96862745,  0.9764706 ,  0.9372549 ],
        [ 0.96862745,  0.9764706 ,  0.9372549 ],
        [ 0.96862745,  0.9764706 ,  0.92941177]],

       [[ 0.56078434,  0.48235297, -0.00392157],
        [ 0.5686275 ,  0.4901961 ,  0.00392163],
        [ 0.5686275 ,  0.49803925, -0.01176471],
        ...,
        [ 0.9607843 ,  0.96862745,  0.92156863],
        [ 0.9607843 ,  0.96862745,  0.92156863],
        [ 0.9607843 ,  0.96862745,  0.92156863]],

       ...,

       [[ 0.2941177 ,  0.18431377, -0.40392154],
        ...

The values no longer range from 0 to 255; they have been scaled to lie between -1 and 1. The model was trained on data preprocessed this way, so we must apply the same preprocessing to images before making a prediction. This is similar to what we did with sklearn, where the scaling and other preprocessing steps fitted during training also had to be applied at prediction time.
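The scaled values above are consistent with the mapping `x / 127.5 - 1`. The sketch below reproduces the first pixel of that output. The raw pixel `(179, 171, 99)` is back-computed from the output rather than read from the image, and `scale_pixels` is a hypothetical stand-in, not the Keras `preprocess_input` function.

```python
import numpy as np


def scale_pixels(pixels):
    """Map uint8 pixel values in [0, 255] to floats in [-1, 1]."""
    return pixels.astype("float32") / 127.5 - 1.0


# Assumed raw RGB value of the first pixel (inferred from the output above).
raw_pixel = np.array([179, 171, 99], dtype=np.uint8)
scaled = scale_pixels(raw_pixel)
print(scaled)  # approximately 0.4039, 0.3412, -0.2235

# The extremes of the uint8 range map to the ends of the new range.
bounds = scale_pixels(np.array([0, 255], dtype=np.uint8))
print(bounds)  # -1 and 1
```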

In [21]:
pred = model.predict(X)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 21ms/step


In [22]:
pred.shape

(1, 1000)

There is 1 image and 1000 classes (the ImageNet classes the model was trained on). Each number is the predicted probability for the corresponding class.

In [27]:
f"Class {pred.argmax()} has the highest probability: {pred.max():.2f}"

'Class 610 has the highest probability: 0.68'

Keras has a utility function for this. It maps class indices to human-readable labels and returns the top 5 predictions by default:

In [30]:
decode_predictions(pred)



[[('n03595614', 'jersey', 0.68196344),
  ('n02916936', 'bulletproof_vest', 0.0381399),
  ('n04370456', 'sweatshirt', 0.034324728),
  ('n03710637', 'maillot', 0.011354211),
  ('n04525038', 'velvet', 0.0018453576)]]