# **ML MODEL CREATION AND EVALUATION**

## Objectives

* Create a classification model that can distinguish healthy from infected leaves

## Inputs

* Prepared train, validation and test datasets
  - outputs/dataset/train_X.npy
  - outputs/dataset/train_y.npy
  - outputs/dataset/validation_X.npy
  - outputs/dataset/validation_y.npy
  - outputs/dataset/test_X.npy
  - outputs/dataset/test_y.npy

## Outputs

* Classification model
  - outputs/model/powdery_mildew_alerter_v?.keras 

---

# Preparation for model design

## Change working directory

In [3]:
import os

os.chdir("./..")  # change to parent directory
working_dir = os.getcwd()
working_dir  # check output for correct directory

'd:\\vscode-projects\\mildew-alert'

## Load prepared data

In [4]:
import numpy as np

output_dataset = working_dir + "/outputs/dataset"
train_X = np.load(output_dataset + "/train_X.npy")
train_y = np.load(output_dataset + "/train_y.npy")
validation_X = np.load(output_dataset + "/validation_X.npy")
validation_y = np.load(output_dataset + "/validation_y.npy")
test_X = np.load(output_dataset + "/test_X.npy")
test_y = np.load(output_dataset + "/test_y.npy")
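
Before loading, it can help to confirm that all six arrays listed under Inputs are actually on disk. The following is a minimal sketch using only the standard library; the helper names `dataset_paths` and `missing_files` are hypothetical and not part of the notebook, and the demo runs in a temporary directory rather than in `outputs/dataset`.

```python
from pathlib import Path
import tempfile

SPLITS = ("train", "validation", "test")


def dataset_paths(dataset_dir):
    """Map names such as 'train_X' to the .npy path expected under dataset_dir."""
    dataset_dir = Path(dataset_dir)
    return {
        f"{split}_{part}": dataset_dir / f"{split}_{part}.npy"
        for split in SPLITS
        for part in ("X", "y")
    }


def missing_files(dataset_dir):
    """Return the sorted names of expected arrays that are not on disk."""
    paths = dataset_paths(dataset_dir)
    return sorted(name for name, path in paths.items() if not path.exists())


# Demo in a throwaway directory that contains only train_X.npy
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "train_X.npy").touch()
    demo_missing = missing_files(tmp)

print(demo_missing)
```

In the notebook, calling `missing_files(output_dataset)` before the `np.load` calls would give a readable list of missing arrays instead of a `FileNotFoundError` on the first one.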

## Pre-processing with encoder

In [5]:
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()

train_y_encoded = encoder.fit_transform(train_y)
validation_y_encoded = encoder.transform(validation_y)
test_y_encoded = encoder.transform(test_y)
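
To make the encoding step concrete, here is a plain-Python sketch of the mapping that `LabelEncoder` builds: the sorted unique labels are numbered from 0. The class names `"healthy"` and `"powdery_mildew"` are assumptions for illustration; the actual names are whatever `train_y` contains and can be read from `encoder.classes_`.

```python
def fit_label_mapping(labels):
    """Assign 0..n-1 to the sorted unique labels, mirroring LabelEncoder's ordering."""
    classes = sorted(set(labels))
    return {label: index for index, label in enumerate(classes)}


def encode(labels, mapping):
    """Encode labels with a fitted mapping; an unseen label raises KeyError."""
    return [mapping[label] for label in labels]


# Hypothetical class names, for illustration only
train_labels = ["powdery_mildew", "healthy", "healthy", "powdery_mildew"]
mapping = fit_label_mapping(train_labels)
encoded = encode(train_labels, mapping)

print(mapping)  # {'healthy': 0, 'powdery_mildew': 1}
print(encoded)  # [1, 0, 0, 1]
```

Fitting only on the training labels and reusing the same mapping for validation and test, as the cell above does, keeps the integer assigned to each class identical across all three splits.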

---

# Model

## Model design
  - input layer
  - three convolution + max pooling layers
  - flatten layer
  - two dense layers with 35% dropout between them
  - output layer
  - Adam (adaptive moment estimation) optimizer and binary cross-entropy loss, chosen for their empirical performance

In [6]:
from keras.models import Sequential
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Dropout 

def create_model():
    model = Sequential()

    # input layer
    model.add(Input(shape=(75, 75, 3)))

    # convolution + maxpool layer 1 - 32 filters
    model.add(
        Conv2D(
            filters=32,
            kernel_size=(3,3),
            activation="relu"
        )
    )
    model.add(MaxPooling2D(2, 2))

    # convolution + maxpool layer 2 - 64 filters
    model.add(
        Conv2D(
            filters=64,
            kernel_size=(3,3),
            activation="relu"
        )
    )
    model.add(MaxPooling2D(2, 2))

    # convolution + maxpool layer 3 - 128 filters
    model.add(
        Conv2D(
            filters=128,
            kernel_size=(3,3),
            activation="relu"
        )
    )
    model.add(MaxPooling2D(2, 2))

    # flatten layer
    model.add(Flatten())

    # two dense layers with 35% dropout
    model.add(Dense(256, activation="relu"))
    model.add(Dropout(0.35))
    model.add(Dense(256, activation="relu"))

    # output layer
    model.add(Dense(1, activation="sigmoid"))

    # specify optimizer, loss function and metric
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"]
    )

    return model
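
The layer sizes in `create_model` determine how large the flattened feature vector and the model become. The sketch below works through that arithmetic in plain Python, assuming the Keras defaults that the code relies on (`Conv2D` with stride 1 and no padding, `MaxPooling2D(2, 2)` with a 2x2 window and stride 2). The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Side length after a stride-1 convolution without padding."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Side length after non-overlapping max pooling (any remainder is dropped)."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


side, channels = 75, 3  # input shape (75, 75, 3)
param_total = 0

# three convolution + max pooling stages
for filters in (32, 64, 128):
    param_total += conv_params(channels, filters)
    side = pool_out(conv_out(side))
    channels = filters

# flatten, two dense layers of 256 units, one sigmoid output unit
flattened = side * side * channels
inputs = flattened
for units in (256, 256, 1):
    param_total += dense_params(inputs, units)
    inputs = units

print(side, flattened, param_total)  # 7 6272 1765185
```

The spatial size shrinks 75 → 73 → 36 → 34 → 17 → 15 → 7, so the first dense layer sees 7 × 7 × 128 = 6272 inputs and accounts for most of the parameters. These figures can be compared against `powdery_mildew_alerter.summary()` once the model has been created.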

## Create the model

In [7]:
powdery_mildew_alerter = create_model()

## Fit data / train model

In [8]:
powdery_mildew_alerter.fit(
    train_X,
    train_y_encoded,
    epochs=10,
    verbose=1,
    validation_data=(validation_X, validation_y_encoded)
    )

Epoch 1/10
92/92 ━━━━━━━━━━━━━━━━━━━━ 9s 74ms/step - accuracy: 0.7879 - loss: 0.4135 - val_accuracy: 0.9833 - val_loss: 0.0656
Epoch 2/10
92/92 ━━━━━━━━━━━━━━━━━━━━ 7s 72ms/step - accuracy: 0.9786 - loss: 0.0652 - val_accuracy: 0.9929 - val_loss: 0.0310
Epoch 3/10
92/92 ━━━━━━━━━━━━━━━━━━━━ 8s 87ms/step - accuracy: 0.9896 - loss: 0.0365 - val_accuracy: 1.0000 - val_loss: 0.0049
Epoch 4/10
92/92 ━━━━━━━━━━━━━━━━━━━━ 8s 86ms/step - accuracy: 0.9951 - loss: 0.0158 - val_accuracy: 1.0000 - val_loss: 0.0028
Epoch 5/10
92/92 ━━━━━━━━━━━━━━━━━━━━ 8s 86ms/step - accuracy: 0.9968 - loss: 0.0151 - val_accuracy: 0.9976 - val_loss: 0.0033
Epoch 6/10
92/92 ━━━━━━━━━━━━━━━━━━━━ 8s 87ms/step - accuracy: 0.9920 - loss: 0.0229 - val_accuracy: 1.0000 - val_loss: 0.0016
Epoch 7/10
92/92 ━━━━

<keras.src.callbacks.history.History at 0x23a3439e390>
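
The per-epoch validation metrics can be used to pick the epoch that generalised best. The sketch below does this for the six epochs that are fully visible in the log above (the output is cut off during epoch 7, so later epochs are not included). The function name `best_epoch` is hypothetical.

```python
# Values copied from the complete epochs (1 to 6) of the training log
val_loss = [0.0656, 0.0310, 0.0049, 0.0028, 0.0033, 0.0016]
val_accuracy = [0.9833, 0.9929, 1.0000, 1.0000, 0.9976, 1.0000]


def best_epoch(losses):
    """Return the 1-based epoch with the lowest loss (the first one on ties)."""
    return min(range(len(losses)), key=losses.__getitem__) + 1


epoch = best_epoch(val_loss)
print(epoch, val_loss[epoch - 1], val_accuracy[epoch - 1])  # 6 0.0016 1.0
```

In the notebook itself the same lists are available without copying them by hand: if the return value of `fit` is assigned to a variable (say `history`, a name not used in the original cell), they can be read from `history.history["val_loss"]` and `history.history["val_accuracy"]`.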

## Evaluate performance

In [9]:
powdery_mildew_alerter.evaluate(test_X, test_y_encoded, verbose=1)

27/27 ━━━━━━━━━━━━━━━━━━━━ 1s 20ms/step - accuracy: 0.9977 - loss: 0.0840


[0.04259925335645676, 0.9988151788711548]

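The model reaches 100% validation accuracy during training and 99.88% accuracy on the test set (loss 0.0426). These are the values returned by `evaluate`; the progress bar above them shows 0.9977 and 0.0840 instead.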
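
The two numbers returned by `evaluate` are the loss and the metric passed to `compile`, in that order. As a rough illustration of what they measure, the sketch below computes mean binary cross-entropy and thresholded accuracy in plain Python. The labels and probabilities are made up for the example, and the clipping constant and the 0.5 threshold are assumptions of this sketch rather than values read from the notebook.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Share of samples whose thresholded probability matches the label."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


# Made-up labels and sigmoid outputs, for illustration only
y_true = [0, 1, 1, 0]
y_prob = [0.1, 0.9, 0.4, 0.2]

loss = binary_crossentropy(y_true, y_prob)
accuracy = binary_accuracy(y_true, y_prob)
print(round(loss, 4), accuracy)  # 0.3375 0.75
```

Accuracy only counts which side of the threshold a prediction falls on, while the loss also penalises low confidence, so the two numbers can move independently of each other.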

## Save model (change version variable if needed)

In [None]:
version = "v1"
save_dir = working_dir + "/outputs/model"

os.makedirs(save_dir, exist_ok=True)  # create the folder if it does not exist yet

powdery_mildew_alerter.save(f"{save_dir}/powdery_mildew_alerter_{version}.keras")
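
Because the version is set by hand, re-running the cell with an unchanged `version` overwrites the earlier file. One possible guard, sketched here with hypothetical helpers `model_path` and `next_free_version` that are not part of the notebook, is to look for the first version number that is still unused. The demo runs in a temporary directory.

```python
from pathlib import Path
import tempfile


def model_path(save_dir, version):
    """Build the file name used in the save cell for a given version string."""
    return Path(save_dir) / f"powdery_mildew_alerter_{version}.keras"


def next_free_version(save_dir, start=1):
    """Return the first 'v<n>' whose model file does not exist yet."""
    n = start
    while model_path(save_dir, f"v{n}").exists():
        n += 1
    return f"v{n}"


# Demo: pretend that v1 has already been saved
with tempfile.TemporaryDirectory() as tmp:
    model_path(tmp, "v1").touch()
    demo_version = next_free_version(tmp)

print(demo_version)  # v2
```

In the notebook, `version = next_free_version(save_dir)` could replace the hard-coded string when an automatic bump is preferred over a manual one.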

---

# Generate model evaluation visuals

---