# Deep Learning: Image Classification Using Keras

### Task 0: Getting Started

In this project, you'll learn how to train a deep neural network on image data. You will create a model that classifies images into three categories based on the image topic:

- City landscape
- Mountain landscape
- Beach landscape

The image data, 600 images in total, was collected from Pixabay. You will start by investigating the data: the directory structure, some sample images, and so on. Then you will create the datasets (data loaders) for the training process. After that, you will select an appropriate model for the task, together with the loss function, metrics, and optimizer. You'll use Keras to train the model, see how to evaluate the results, and finally use the trained model to run inference on new images.

This project uses a Jupyter Notebook. Open the `/usercode/project.ipynb` file and write the code for each task in its own cell.

Let's start!

### Task 1: Import Packages

Let’s start by importing the necessary packages for this project. You will need:

- `os` and `glob` for getting the data paths
- `random` for random data sampling
- `matplotlib.pyplot` for visualization
- `tensorflow` for specific tensor operations
- `keras` from `tensorflow` for the Deep Learning part
- `keras_cv` for the model

If you’re unsure how to do this, click the “Show Hint” button.

To import packages:

- Use the `import` statement
  ```python
  import <package_name>
  ```

- Use the following template to import a module from a package:
  ```python
  from <package_name> import <module_name>
  ```

If you're stuck, click the "Show Solution" button.

Use the following code to complete this task:
```python
import os
import glob
import random
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
import keras_cv
```
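
Before running the imports, it can help to confirm that every package is installed in the notebook environment. The following is a minimal sketch using only the standard library; the helper name `find_missing_packages` is hypothetical and not part of the project.

```python
import importlib.util

# Top-level packages this project imports (matplotlib covers matplotlib.pyplot)
REQUIRED_PACKAGES = ["os", "glob", "random", "matplotlib", "tensorflow", "keras_cv"]


def find_missing_packages(package_names):
    """Return the names that cannot be located in the current environment."""
    return [
        name for name in package_names
        if importlib.util.find_spec(name) is None
    ]


missing = find_missing_packages(REQUIRED_PACKAGES)
print("Missing packages:", missing or "none")
```

If the list is not empty, install the missing packages before continuing with the next tasks.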

### Task 2: Explore Directories Structure

The image data is located under the `/usercode/data/images/` directory.

The following sub-directories are of major concern for this project:

- `/usercode/data/images/beach`: This folder contains images of beach landscapes.
- `/usercode/data/images/city`: This folder contains images of city landscapes.
- `/usercode/data/images/mountain`: This folder contains images of mountain landscapes.

Each of these subdirectories contains 200 images.

Create a dictionary where the keys will be the names of the sub-directories of the above directory and the values will be a list with the file paths under each sub-directory. This way you will have a better idea of the structure of the dataset and the directories.

After you create the dictionary, print each directory name, the number of files inside it, and its first five (5) file paths.

If you're unsure how to do this, click the "Show Hint" button.

- Use the `os.listdir()` function to get the names of the sub-directories, as follows:
  ```python
  os.listdir("</some/path>")
  ```

- Use the `glob.glob()` function for getting the list of the file paths of one directory, as follows:
  ```python
  glob.glob("</some/path>/*")
  ```

If you're stuck, click the "Show Solution" button.

Use the following code to complete this task:
```python
IMAGES_DIR = "/usercode/data/images/"

dir_names = os.listdir(IMAGES_DIR)
img_paths_dict = dict()
for dir_name in dir_names:
    img_paths_dict[dir_name] = glob.glob(f"{IMAGES_DIR}{dir_name}/*")

for dir_name, file_paths in img_paths_dict.items():
    print(dir_name, len(file_paths))
    for file_path in file_paths[:5]:
        print(file_path)
```
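
The same dictionary-building pattern can be tried on a throwaway directory tree, which makes the expected structure easy to see without the real data. This is only a sketch: the folder names, file names, and file counts below are made up for illustration.

```python
import glob
import os
import tempfile

# Build a temporary tree with hypothetical class folders and empty files
with tempfile.TemporaryDirectory() as root:
    for class_name, n_files in [("beach", 3), ("city", 2)]:
        class_dir = os.path.join(root, class_name)
        os.makedirs(class_dir)
        for i in range(n_files):
            open(os.path.join(class_dir, f"img_{i}.jpg"), "w").close()

    # Same pattern as the solution: one dictionary entry per sub-directory
    demo_paths = {
        name: sorted(glob.glob(os.path.join(root, name, "*")))
        for name in sorted(os.listdir(root))
    }
    demo_counts = {name: len(paths) for name, paths in demo_paths.items()}

print(demo_counts)
```

Neither `os.listdir()` nor `glob.glob()` guarantees an order, so the sketch wraps both in `sorted()` to get reproducible output.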

### Task 3: Visualize Image Data

In this task, let’s plot some images to get an idea of what they look like.

For this, use the `pyplot` module from the `matplotlib` library to show the first 5 images of each directory.

If you're unsure how to do this, click the "Show Hint" button.

- Use `plt.imread()` to read an image
- Use `plt.imshow()` to show the image

If you're stuck, click the "Show Solution" button.

Use the following code to complete this task:
```python
for dir_name, file_paths in img_paths_dict.items():
    print(dir_name)
    for file_path in file_paths[:5]:
        image = plt.imread(file_path)
        plt.imshow(image)
        plt.show()
```

### Task 4: Create a Data Loader

While training a deep learning model, it is often impractical to hold all the data in memory. For that reason, you use data loaders that read the data from files and serve it to the model in batches. Keras/TF comes with many data-loading functions that cover a variety of cases. Since this is an image classification task with the images of each class in separate directories, you can use the `image_dataset_from_directory()` function.

In this task, create a dataset using the `image_dataset_from_directory()` function with all the available images.

Use an image size of 128x128; this is also the size you will use for training. Training goes faster when the images are smaller, especially if it runs on a CPU instead of a GPU. However, keep in mind that dropping the resolution sometimes leads to worse results.

The `label_mode` should be `"categorical"` since you want each label to be a one-hot vector, meaning 1 for the correct class and 0 for all the other classes.
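
To make the one-hot idea concrete, here is a small pure-Python sketch. It assumes the class order `beach`, `city`, `mountain`, since Keras uses alphanumerical order for inferred class names; the `one_hot` helper is hypothetical and only illustrates the encoding.

```python
# Assumed class order: inferred class names are sorted alphanumerically
CLASS_NAMES = ["beach", "city", "mountain"]


def one_hot(class_name, class_names=CLASS_NAMES):
    """Return a list with 1.0 at the index of class_name and 0.0 elsewhere."""
    index = class_names.index(class_name)
    return [1.0 if i == index else 0.0 for i in range(len(class_names))]


for name in CLASS_NAMES:
    print(name, one_hot(name))
```

Each label vector has exactly one non-zero entry, and its position identifies the class.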

If you're unsure how to do this, click the "Show Hint" button.

- Here is the template of the `image_dataset_from_directory()` function from the Keras documentation:

  ```python
  keras.utils.image_dataset_from_directory(
      directory,
      labels="inferred",
      label_mode="int",
      class_names=None,
      color_mode="rgb",
      batch_size=32,
      image_size=(256, 256),
      shuffle=True,
      seed=None,
      validation_split=None,
      subset=None,
      interpolation="bilinear",
      follow_links=False,
      crop_to_aspect_ratio=False,
      data_format=None,
  )
  ```
- Use `dataset.class_names` to get access to the names of the classes.
- Use `tf.math.argmax()` to get the index of the largest element of a tensor.
- Use the `numpy()` method on a tensor to make it into a NumPy array. Also, cast it to `uint8` type to plot it.

If you're stuck, click the "Show Solution" button.

Use the following code to complete this task:
```python
dataset = keras.utils.image_dataset_from_directory(
    IMAGES_DIR, image_size=(128, 128), label_mode="categorical"
)
```

### Task 5: Inspect Dataset Output

The `dataset` object you created in **Task 4** will iterate over the image paths, load the images as tensors, and return them in batches. Let's inspect the type and dimensionality of the returned objects and visualize some of these images.

Write the code to:
- Load one batch of image-label pairs.
- Print
  1. The type of the returned object.
  2. The dtype of the tensor.
  3. The dimensions/shape of it.
- Plot the first image of the batch.

If you're unsure how to do this, click the "Show Hint" button.

- To get one batch (images, labels) from the dataset, create an iterator `iter()` from the dataset, and call `next()` on it.
- To plot the image correctly with `matplotlib.pyplot`, change the type of the image from `float32` to `uint8`.

If you're stuck, click the "Show Solution" button.

Use the following code to complete this task:
```python
images, labels = next(iter(dataset))

print(f"type: {type(images)}")
print(f"dtype: {images.dtype}")
print(f"shape: {images.shape}")

print(dataset.class_names[tf.math.argmax(labels[0])])
plt.imshow(images[0].numpy().astype("uint8"))
plt.show()
```
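
The first dimension of the printed shape is the batch size. As a rough sketch of what to expect over one full pass, the arithmetic below assumes the defaults from the template in **Task 4** (`batch_size=32`, `color_mode="rgb"`, so 3 channels) and the 600 images of this project; `batch_shapes` is a hypothetical helper, not a Keras function.

```python
import math


def batch_shapes(num_images, batch_size=32, image_size=(128, 128), channels=3):
    """List the (batch, height, width, channels) shape of every batch in one pass."""
    num_batches = math.ceil(num_images / batch_size)
    shapes = []
    for i in range(num_batches):
        # The last batch holds whatever is left over
        size = min(batch_size, num_images - i * batch_size)
        shapes.append((size, *image_size, channels))
    return shapes


shapes = batch_shapes(600)
print(len(shapes), shapes[0], shapes[-1])
# prints: 19 (32, 128, 128, 3) (24, 128, 128, 3)
```

The last batch is smaller because 600 is not a multiple of 32.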

### Task 6: Create Train and Validation Datasets

The data is usually split into three subsets during model training:
- Train data: these are the images that you will use to train the model. You will iterate over them multiple times (epochs) and provide the ground truth (correct class for each image) so that the model "learns" to predict the correct output itself.
- Validation data: these are images that you don't feed to the model during training. Instead, after each epoch you get the model predictions on these images and compare them with the ground truth. This way you can tell how the model performs on new images.
- Test data: these are the images you use at the end of the training process to evaluate the final model performance.

In this project, you will use the validation data as test data for simplicity.

Create the two datasets with a split ratio of:
- Train dataset: 80%.
- Validation dataset: 20%.

Use the same parameters as for the `dataset` from the previous task:
- Image size 128x128.
- Label mode "categorical".

If you're unsure how to do this, click the "Show Hint" button.

- To create a validation split use the `image_dataset_from_directory()` function. This function takes the following arguments:
  - `validation_split`: Optional float between 0 and 1, fraction of data to reserve for validation.
  - `subset`: Subset of the data to return. One of `"training"`, `"validation"`, or `"both"`. Only used if `validation_split` is set. When `subset="both"`, the utility returns a tuple of two datasets (the training and validation datasets respectively).

If you're stuck, click the "Show Solution" button.

Use the following code to complete this task:
```python
train_ds = keras.utils.image_dataset_from_directory(
    IMAGES_DIR,
    image_size=(128, 128),
    validation_split=0.2,
    subset="training",
    label_mode="categorical",
    seed=0,
)
valid_ds = keras.utils.image_dataset_from_directory(
    IMAGES_DIR,
    image_size=(128, 128),
    validation_split=0.2,
    subset="validation",
    label_mode="categorical",
    seed=0,
)
```
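
Both calls in the solution pass the same `seed`, so the two subsets come from the same shuffled order and do not overlap. The sketch below illustrates that idea in plain Python. It is not how Keras implements the split internally, and `split_indices` is a hypothetical helper.

```python
import random


def split_indices(num_items, validation_split, seed):
    """Shuffle indices with a fixed seed, then reserve a fraction for validation."""
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)
    num_val = int(num_items * validation_split)
    return indices[num_val:], indices[:num_val]


train_idx, valid_idx = split_indices(600, validation_split=0.2, seed=0)
print(len(train_idx), len(valid_idx))  # prints: 480 120
```

With a different seed in each call, the shuffled orders would differ and some images could end up in both subsets, which would make the validation accuracy look better than it really is.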

### Task 7: Create the Model

Now, let's create the model. There are two steps in this task. 

The first step is to create the backbone of the model. This is the big part that is pre-trained on a large amount of data and contains the general knowledge of the model. Instead of starting from scratch, in most cases, it is a good idea to use an already trained backbone and then fine-tune it on your own data.

One of the best image-related model families for general tasks is EfficientNet. For this task, you will use a small model, EfficientNetV2 B0 (the `efficientnetv2_b0_imagenet` preset). If this is not good enough for other tasks, you can always use a bigger model from the family, like EfficientNet B4 or EfficientNet B7.

The second step is to create the classifier. You'll use the backbone to define an `ImageClassifier` object. This is the model that you will train on the data and then use for inference. Since there are three classes (city, mountain, beach), the model output will be three numbers that sum to 1. The class with the highest number is the one that the model predicts for the given image.
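
The softmax activation used in this task is one common way to turn raw scores into positive numbers that sum to 1. The following pure-Python sketch shows the computation; the input scores are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into positive values that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for three classes
probs = softmax([2.0, 0.5, -1.0])
print([round(p, 3) for p in probs], round(sum(probs), 6))
```

The largest raw score always maps to the largest probability, so taking the class with the highest output is the same as taking the class with the highest score.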

So, in conclusion, for this task:

1. For the first step, you will make use of the `keras_cv.models.EfficientNetV2Backbone` class and call the `from_preset()` method with `efficientnetv2_b0_imagenet` as the backbone name.
2. For the second step, you will use the `keras_cv.models.ImageClassifier` class and create an instance using the backbone of the first step. You will also define the number of classes (three in this case) and the activation. 

>**Note:** For classification tasks, where each sample can belong to one and only one class, softmax is a good activation choice.

If you're unsure how to do this, click the "Show Hint" button.

- Use the `keras_cv.models.EfficientNetV2Backbone.from_preset()` to get the backbone.
- Use the `keras_cv.models.ImageClassifier()` to get the final model.

If you're stuck, click the "Show Solution" button.

Use the following code to complete this task:
```python
backbone = keras_cv.models.EfficientNetV2Backbone.from_preset("efficientnetv2_b0_imagenet")

model = keras_cv.models.ImageClassifier(
    backbone=backbone,
    num_classes=3,
    activation="softmax",
)
```

### Task 8: Define Loss and Metrics

The loss function is the function that training seeks to minimize. It usually measures how far the predictions are from the ground truth (correct classes). The choice of loss function is very important since it drives the training process. In this project, there are more than two mutually exclusive classes. For this case, a good choice is `CategoricalCrossentropy`.
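
For a single sample with a one-hot label, categorical cross-entropy reduces to the negative logarithm of the probability assigned to the correct class. The sketch below computes it by hand in plain Python. The clipping constant `eps` is an assumed value that keeps `log()` finite, and the prediction vectors are hypothetical.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for one sample: -sum(true_i * log(pred_i))."""
    # Clip predictions away from 0 and 1 so log() stays finite
    clipped = [min(max(p, eps), 1.0 - eps) for p in y_pred]
    return -sum(t * math.log(p) for t, p in zip(y_true, clipped))


y_true = [0.0, 1.0, 0.0]       # one-hot label, correct class at index 1
confident = [0.1, 0.8, 0.1]    # most probability on the correct class
wrong = [0.7, 0.2, 0.1]        # most probability on a wrong class

print(categorical_crossentropy(y_true, confident))
print(categorical_crossentropy(y_true, wrong))
```

The second value is larger than the first: the less probability the model puts on the correct class, the higher the loss.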

The metrics are functions that are used as indicators of the model’s performance. They don’t affect the training process themselves, but they give an idea of how well the model will perform. For this project, a simple yet good metric is `CategoricalAccuracy`, which is the ratio of correct predictions.
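
As an illustration of what "ratio of correct predictions" means for one-hot labels, here is a hand-written sketch in plain Python. The helper `categorical_accuracy` and the sample values are hypothetical; in the project, Keras computes this metric for you.

```python
def categorical_accuracy(y_true_batch, y_pred_batch):
    """Fraction of samples whose highest-scoring class matches the one-hot label."""
    def argmax(values):
        return max(range(len(values)), key=values.__getitem__)

    hits = sum(
        argmax(t) == argmax(p) for t, p in zip(y_true_batch, y_pred_batch)
    )
    return hits / len(y_true_batch)


true_labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
predicted = [[0.8, 0.1, 0.1], [0.2, 0.5, 0.3], [0.6, 0.3, 0.1], [0.1, 0.7, 0.2]]
accuracy = categorical_accuracy(true_labels, predicted)
print(accuracy)  # 3 of 4 predictions are correct -> 0.75
```

Only the position of the largest value matters here, so a barely correct prediction counts the same as a very confident one.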

> **Note:** The metric result is usually a number that can be interpreted in a meaningful way (e.g. % of success, % of failed cases, number of false positives, etc.). In contrast, the loss function result is not always easy to interpret. Another difference between the loss function and the metric functions is that the former has to be differentiable so that its gradients can be backpropagated and the model weights updated. The latter can be any function, as long as it can be represented in the framework.

In this task, you’ll define the loss function and an evaluation metric.

If you're unsure how to do this, click the "Show Hint" button.

- Use the `CategoricalCrossentropy()` class from the `keras.losses` module to compute the categorical cross-entropy.
- Use the `CategoricalAccuracy()` class from the `keras.metrics` module to calculate the prediction accuracy.

If you're stuck, click the "Show Solution" button.

Use the following code to complete this task:
```python
loss = keras.losses.CategoricalCrossentropy()
metric = keras.metrics.CategoricalAccuracy()
```

### Task 9: Compile the Model

This project uses Keras as the DL framework, and Keras provides a simple way to make use of the loss function and the metrics.
In this task, you'll compile the model using the loss and metric functions. For this, call the `compile()` method on the model and pass the `loss` and `metric` defined in **Task 8**; Keras will take care of the rest.


If you're unsure how to do this, click the "Show Hint" button.

This is the template for the `compile()` method from the Keras documentation:

```python
Model.compile(
    optimizer="rmsprop",
    loss=None,
    loss_weights=None,
    metrics=None,
    weighted_metrics=None,
    run_eagerly=False,
    steps_per_execution=1,
    jit_compile="auto",
    auto_scale_loss=True,
)
```

If you're stuck, click the "Show Solution" button.

Use the following code to complete this task:
```python
# `metrics` expects a list; the optimizer is left at its default ("rmsprop")
model.compile(loss=loss, metrics=[metric])
```

### Task 10: Train the Model

Now it's time to train the model. Again, Keras offers a very convenient way to train a model on the datasets: the `fit()` method.

Train the model for 2 epochs. After each epoch:
- The model is run on the validation data and the validation metric is calculated.
- This number estimates the accuracy of the model on new (unseen) data.
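
To get a feel for how much work two epochs involve, the arithmetic can be sketched as follows. It assumes the 80/20 split of the 600 images from **Task 6**, the default batch size of 32, and one weight update per training batch.

```python
import math

train_images = 480   # 80% of the 600 images
valid_images = 120   # remaining 20%
batch_size = 32      # default of image_dataset_from_directory()
epochs = 2

steps_per_epoch = math.ceil(train_images / batch_size)
validation_steps = math.ceil(valid_images / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, validation_steps, total_updates)  # prints: 15 4 30
```

Under these assumptions, the model sees 15 training batches per epoch and 4 validation batches after each epoch.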

>**Note:** One epoch is one pass through all the training data.

If you’re unsure how to do this, click the “Show Hint” button.

This is the template of the `fit()` method from the Keras documentation:
```python
Model.fit(
    x=None,
    y=None,
    batch_size=None,
    epochs=1,
    verbose="auto",
    callbacks=None,
    validation_split=0.0,
    validation_data=None,
    shuffle=True,
    class_weight=None,
    sample_weight=None,
    initial_epoch=0,
    steps_per_epoch=None,
    validation_steps=None,
    validation_batch_size=None,
    validation_freq=1,
)
```

If you’re stuck, click the “Show Solution” button.


Use the following code to complete this task:

```python
model.fit(train_ds, validation_data=valid_ds, epochs=2)
```

### Task 11: Evaluate the Model

Now the model is trained. Let's see how it performs. Use the `evaluate()` method to get the loss and the accuracy of the model on the validation data, then print the two numbers.


If you’re unsure how to do this, click the “Show Hint” button.


This is the `evaluate()` method template from the Keras documentation:

```python
Model.evaluate(
    x=None,
    y=None,
    batch_size=None,
    verbose="auto",
    sample_weight=None,
    steps=None,
    callbacks=None,
    return_dict=False,
    **kwargs
)
```

If you’re stuck, click the “Show Solution” button.


Use the following code to complete this task:
```python
loss, acc = model.evaluate(valid_ds, verbose=False)
print(f"{loss=:.3}, {acc=:.3}")
```
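
The `print()` call in the solution relies on the `=` specifier of f-strings (Python 3.8 or later), which prints the variable name together with its value. A short sketch with made-up numbers, not real training results:

```python
loss, acc = 0.41237, 0.925  # hypothetical values, not real results

# `=` prints "name=value"; `.3` keeps three significant digits
formatted = f"{loss=:.3}, {acc=:.3}"
print(formatted)  # prints: loss=0.412, acc=0.925

# An alternative layout with fixed decimals and a percentage
print(f"loss: {loss:.4f}, accuracy: {acc:.1%}")
```

Either layout works; the first is shorter, while the second is easier to read in a report.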

### Task 12: Use Model on an Unseen Image

Now, for the last task, you will use the trained model to get the prediction for a new photo.
- Pick a random photo from any of the directories.
- Use the `predict()` model method to get the prediction for this photo.
- Get the name of the class from the model prediction.
- Print the label and plot the image to confirm that it is correct.

If you’re unsure how to do this, click the “Show Hint” button.


This is the `predict()` template from the Keras documentation:

```python
Model.predict(x, batch_size=None, verbose="auto", steps=None, callbacks=None)
```


If you’re stuck, click the “Show Solution” button.

Use the following code to complete this task:
```python
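# IMAGES_DIR already ends with "/", so no extra separator is needed
image_paths = glob.glob(f"{IMAGES_DIR}*/*")
image = plt.imread(random.choice(image_paths))

# Resize to the training resolution and add a batch dimension
model_input = tf.image.resize(image, (128, 128))[None, ...]
predictions = model.predict(model_input, verbose=False)[0]
pred_cls = valid_ds.class_names[predictions.argmax()]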

print(pred_cls)
plt.imshow(image)
plt.show()
```
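
The step from the prediction vector to a class name can also be followed by hand. The sketch below uses a hypothetical prediction and assumes the alphanumerical class order `beach`, `city`, `mountain`.

```python
# Assumed class order (alphanumerical, as in `valid_ds.class_names`)
class_names = ["beach", "city", "mountain"]

# Hypothetical softmax output for a single image
predictions = [0.08, 0.15, 0.77]

# Index of the largest probability, then its class name and confidence
best_index = max(range(len(predictions)), key=predictions.__getitem__)
pred_cls = class_names[best_index]
confidence = predictions[best_index]

print(f"{pred_cls} ({confidence:.0%})")  # prints: mountain (77%)
```

Printing the confidence next to the label gives a quick sense of how sure the model is about its answer.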

### Task 13: Congratulations

Congratulations! In this project, you successfully trained a classification model on image data and used it to predict the class of an image.