
# Deep Learning in production

We have seen how to push a machine learning model into production. Now it is your turn to do the same with a deep learning model.

It is up to you to choose which model you would like to use on this <a href="https://www.kaggle.com/prasunroy/natural-images" target="_blank">Natural images dataset</a>.

As a reminder, here are the steps:

1. Train your image classifier (you can take the opportunity to use MLflow to track your experiments),
2. Deploy it to SageMaker,
3. Start making inferences on test images.

## Solution

First download the dataset and unzip it in your working folder.

You may want to do a little exploratory data analysis (EDA) before building any model. Here, we skip this part to stay concise.

### Prepare dataset

In [None]:
# Change this path before running this notebook
DATAS_PATH = "data/natural_images/"

In [None]:
import os
import cv2
import numpy as np
from sklearn.preprocessing import LabelEncoder
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from keras.layers import Dense, Conv2D, MaxPool2D, Flatten, Dropout
from keras.models import Sequential
import mlflow

In [None]:
labels = os.listdir(DATAS_PATH)
print("Labels: ", labels)

Labels:  ['cat', 'car', 'fruit', 'dog', 'person', 'flower', 'motorbike', 'airplane']


In [None]:
def generate_dataset(labels):
    """Generate the dataset.

    Args:
        labels (list[str]): list of labels, which are the same as the folder names

    Returns:
        tuple[list[np.ndarray], list[str]]: two lists X and y, respectively the
            resized images and their labels
    """
    X = []
    y = []
    for label in labels:
        path = os.path.join(DATAS_PATH, label)
        folder_data = os.listdir(path)
        for image_path in folder_data:
            # Read the image (cv2.imread returns None if the file cannot be decoded)
            image = cv2.imread(os.path.join(path, image_path))
            if image is None:
                continue
            # Resize the image to fit model input size
            image_resized = cv2.resize(image, (32, 32))
            # Append image and associated label
            X.append(image_resized)
            y.append(label)
    return X, y

In [None]:
X_raw, y_raw = generate_dataset(labels)

In [None]:
# Convert these lists into NumPy arrays
X_raw = np.array(X_raw)
y_raw = np.array(y_raw)
print(f"X shape: {X_raw.shape}\ny shape: {y_raw.shape}")

In [None]:
# As a little preprocessing, we rescale the pixel values to the [0, 1] range
X = X_raw / 255.0
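
To make the rescaling step concrete, here is a minimal sketch on a made-up 2x2 image. The array `fake_image` is hypothetical and only stands in for one entry of `X_raw`; the point is that dividing `uint8` pixels by `255.0` yields floats between 0 and 1.

```python
import numpy as np

# Hypothetical 2x2 image with 3 channels and uint8 pixels, like cv2.imread returns
fake_image = np.array(
    [[[0, 128, 255], [64, 64, 64]],
     [[255, 255, 255], [10, 20, 30]]],
    dtype=np.uint8,
)

# Dividing by a float converts the array to float64 and maps [0, 255] to [0, 1]
scaled = fake_image / 255.0

print(scaled.dtype)                 # float64
print(scaled.min(), scaled.max())   # 0.0 1.0
```

Note that this is a min-max rescaling, not a standardization to zero mean and unit variance.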

In [None]:
# One-hot encode y in order to get vectors of 0s and 1s
y_encoded = LabelEncoder().fit_transform(y_raw)
y = to_categorical(y_encoded)
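
The encoding above can be reproduced with plain NumPy, which helps to see which column corresponds to which class. This is only an illustrative sketch on a hypothetical `toy_labels` array, not the code used in the notebook: `LabelEncoder` numbers the classes in sorted order, which is also what `np.unique` does.

```python
import numpy as np

# Hypothetical labels, standing in for y_raw
toy_labels = np.array(["dog", "cat", "car", "dog"])

# np.unique returns the sorted class names and, for each sample, the index of
# its class; LabelEncoder assigns integers in the same sorted order
classes, encoded = np.unique(toy_labels, return_inverse=True)

# Picking rows of the identity matrix yields one-hot vectors, like to_categorical
one_hot = np.eye(len(classes))[encoded]

print(classes)   # ['car' 'cat' 'dog']
print(encoded)   # [2 1 0 2]
print(one_hot)
```

A practical consequence: the columns of `y` follow the alphabetical order of the class names, not the order returned by `os.listdir`.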

In [None]:
# Split into train and test set
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.3,
                                                    random_state=42,
                                                    shuffle=True)

### Define our model

It is inspired by <a href="http://yann.lecun.com/exdb/lenet/" target="_blank">LeNet5</a>.

In [None]:
def lenet(X_train):
    model = Sequential()
    model.add(Conv2D(filters=32, kernel_size=(5,5), activation="relu", input_shape=X_train.shape[1:]))
    model.add(MaxPool2D(pool_size=(2, 2)))
    model.add(Conv2D(filters=64, kernel_size=(3, 3), activation="relu"))
    model.add(MaxPool2D(pool_size=(2, 2)))
    model.add(Dropout(rate=0.25))
    model.add(Flatten())
    model.add(Dense(256, activation="relu"))
    model.add(Dropout(rate=0.5))
    model.add(Dense(8, activation="softmax"))
    return model

In [None]:
model = lenet(X_train)
model.compile(
    loss="categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"]
)
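
To check the layer stack in `lenet` by hand, the sketch below traces the tensor shapes and counts the trainable parameters. It assumes the Keras defaults (no padding and stride 1 for `Conv2D`, pooling stride equal to the pool size) and 32x32 images with 3 channels. The helpers `conv_out` and `pool_out` are hypothetical names introduced for this illustration.

```python
def conv_out(size, kernel):
    """Output side length of a convolution without padding, stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output side length of non-overlapping max pooling."""
    return size // pool


side, channels = 32, 3                     # input: 32x32 images, 3 channels
params = 0

params += (5 * 5 * channels + 1) * 32      # Conv2D(32, 5x5): weights + biases
side, channels = conv_out(side, 5), 32     # 28x28x32
side = pool_out(side, 2)                   # 14x14x32

params += (3 * 3 * channels + 1) * 64      # Conv2D(64, 3x3)
side, channels = conv_out(side, 3), 64     # 12x12x64
side = pool_out(side, 2)                   # 6x6x64

flat = side * side * channels              # Flatten: 2304 values
params += (flat + 1) * 256                 # Dense(256)
params += (256 + 1) * 8                    # Dense(8), one output per class

print(flat, params)  # 2304 613064
```

Dropout layers have no parameters and do not change shapes, so they are left out. Calling `model.summary()` in the notebook is the direct way to compare against these numbers.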

### Train

Using MLflow!

In [None]:
# Enable MLflow auto-logging so that training parameters and per-epoch metrics are recorded.
mlflow.tensorflow.autolog()

with mlflow.start_run():
    history = model.fit(X_train, y_train, epochs=25, validation_split=0.2)
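
Once the model is trained, the last step is to turn its softmax outputs back into class names. The sketch below shows the idea with NumPy only. The array `fake_probs` is made up and stands in for what `model.predict(X_test)` would return; the class names are sorted because that is the order `LabelEncoder` uses for its integer codes.

```python
import numpy as np

# Class names in sorted order, matching the integer codes from LabelEncoder
class_names = np.array(sorted(
    ["cat", "car", "fruit", "dog", "person", "flower", "motorbike", "airplane"]
))

# Hypothetical softmax outputs for two images (each row sums to 1);
# in the notebook these would come from model.predict(X_test)
fake_probs = np.array([
    [0.05, 0.70, 0.05, 0.05, 0.05, 0.04, 0.03, 0.03],
    [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.04, 0.90],
])

# The predicted class is the one with the highest probability in each row
predicted = class_names[np.argmax(fake_probs, axis=1)]

print(predicted)  # ['car' 'person']
```

The same `argmax` applied to `y_test` recovers the true class indices, which makes it easy to compare predictions and ground truth.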