# **Data Modelling and Evaluation**

---

## Objectives

* Answer business requirement 2: 
    * The client seeks to predict whether a cherry leaf is healthy or infected with powdery mildew.

## Inputs

* inputs/cherry_leaves_dataset/cherry-leaves/train
* inputs/cherry_leaves_dataset/cherry-leaves/test
* inputs/cherry_leaves_dataset/cherry-leaves/validation
* image shape embeddings

## Outputs

* Image distribution plot for the train, validation, and test sets
* Image augmentation
* Class indices to map prediction output to labels
* Machine learning model creation and training
* Saved model
* Learning curve plot for model performance
* Model evaluation saved as a pickle file
* Prediction on a random image file





## Additional Comments:

N/A


---

# Set Data Directory

---

## Import libraries

In [15]:
import os
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.preprocessing import StandardScaler
import joblib



## Set working directory

In [16]:
cwd = os.getcwd()

In [17]:
os.chdir('/workspace/Portfolio_5_Cherry_Leaves_Mildew')
print("You set a new current directory")

You set a new current directory


In [18]:
work_dir = os.getcwd()
work_dir

'/workspace/Portfolio_5_Cherry_Leaves_Mildew'

## Set input directories

Set train, validation and test paths.

In [19]:
my_data_dir = 'inputs/cherry_leaves_dataset/cherry-leaves'
train_path = my_data_dir + '/train'
val_path = my_data_dir + '/validation'
test_path = my_data_dir + '/test'
print(train_path, val_path, test_path)

inputs/cherry_leaves_dataset/cherry-leaves/train inputs/cherry_leaves_dataset/cherry-leaves/validation inputs/cherry_leaves_dataset/cherry-leaves/test


## Set output directory

In [21]:
version = 'v1'
file_path = f'outputs/{version}'

if 'outputs' in os.listdir(work_dir) and version in os.listdir(work_dir + '/outputs'):
    # The folder already exists; change `version` above to write to a fresh one.
    print('Old version is already available create a new version.')
else:
    os.makedirs(name=file_path)

Old version is already available create a new version.


### Set label names

In [22]:
# Set labels
labels = os.listdir(train_path)
print('Label for the images are', labels)

Label for the images are ['healthy', 'powdery_mildew']


### Set image shape

In [23]:
## Import saved image shape embedding
version = 'v1'
image_shape = joblib.load(filename=f"outputs/{version}/image_shape.pkl")
image_shape

(50, 50)

---

## Number of images in the train, test, and validation data

---

In [26]:
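def load_images_from_folder(folder, label, size=(50, 50)):
    """Read every image in `folder`, resize it, and return flattened pixels with labels."""
    images = []
    labels = []
    for filename in os.listdir(folder):
        img = cv2.imread(os.path.join(folder, filename))
        if img is not None:  # cv2.imread returns None for unreadable or non-image files
            img = cv2.resize(img, size)      # Resize to (width, height)
            images.append(img.flatten())     # Flatten to a 1-D feature vector
            labels.append(label)
    return images, labels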
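
The heading above promises image counts per set, but the cell only defines a loader. A minimal sketch of the counting step follows, assuming the folder-per-label layout under `my_data_dir`; `count_images` is a hypothetical helper, not part of the original notebook, and it counts every file on the assumption that the label folders contain only images.

```python
import os
from collections import Counter


def count_images(data_dir, sets=("train", "validation", "test")):
    """Return {(set_name, label): file_count} for a folder-per-label layout."""
    counts = Counter()
    for set_name in sets:
        set_dir = os.path.join(data_dir, set_name)
        if not os.path.isdir(set_dir):
            continue  # skip splits that are not present on disk
        for label in sorted(os.listdir(set_dir)):
            label_dir = os.path.join(set_dir, label)
            if os.path.isdir(label_dir):
                # Assumption: every file in a label folder is an image.
                counts[(set_name, label)] = len(os.listdir(label_dir))
    return dict(counts)
```

Calling `count_images(my_data_dir)` gives a dictionary that can be turned into a `pandas` DataFrame and passed to a `seaborn` bar plot for the distribution figure listed under Outputs.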



---

## Image data augmentation

---

### ImageDataGenerator

* Initialize ImageDataGenerator

* Augment training image dataset

* Augment validation image dataset

* Augment test image dataset
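
`ImageDataGenerator` is a Keras utility, while the imports in this notebook are scikit-learn based. The sketch below is therefore a plain NumPy stand-in for the augmentation step, not the Keras API: it applies random flips and small wrap-around shifts to images of shape `(H, W, C)`. The function names, the 10% shift limit, and the number of copies are all assumptions chosen for illustration.

```python
import numpy as np


def augment_image(img, rng):
    """Return a randomly flipped and shifted copy of an H x W x C image."""
    out = img.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]   # horizontal flip
    if rng.random() < 0.5:
        out = out[::-1, :]   # vertical flip
    # Shift by up to 10% of each dimension; np.roll wraps pixels around the edge.
    max_dy = max(1, img.shape[0] // 10)
    max_dx = max(1, img.shape[1] // 10)
    dy = int(rng.integers(-max_dy, max_dy + 1))
    dx = int(rng.integers(-max_dx, max_dx + 1))
    return np.roll(out, shift=(dy, dx), axis=(0, 1))


def augment_batch(images, copies=2, seed=0):
    """Stack the originals with `copies` augmented variants of each image."""
    rng = np.random.default_rng(seed)
    augmented = [augment_image(img, rng) for img in images for _ in range(copies)]
    return np.concatenate([images, np.stack(augmented)], axis=0)
```

Augmentation is usually applied to the training set only, with validation and test images left unchanged so that evaluation reflects real inputs. The labels must be repeated in the same order as the augmented images before flattening and fitting.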

### Plot augmented training image

### Plot augmented validation and test images

### Save class_indices
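
A minimal sketch of this step, assuming the two labels printed earlier and integer class indices assigned in sorted order. The helper names and the file name `class_indices.pkl` are hypothetical; `joblib` is used because the notebook already imports it.

```python
import os
import joblib


def build_class_indices(labels):
    """Map each label name to an integer index, in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(labels))}


class_indices = build_class_indices(["healthy", "powdery_mildew"])
# The inverse mapping turns a predicted index back into a readable label.
index_to_label = {idx: label for label, idx in class_indices.items()}


def save_class_indices(mapping, out_dir):
    """Persist the mapping next to the other artefacts (file name is an assumption)."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, "class_indices.pkl")
    joblib.dump(value=mapping, filename=path)
    return path
```

In the notebook this would be called as `save_class_indices(class_indices, file_path)`, so the mapping is stored in the versioned output folder.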

---

## Model creation

---

### ML model


* Import model packages

* Model

* Model Summary

* Early Stopping

### Fit model for model training

### Save model
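
The model, fit, and save steps can be sketched together. This is a hedged example built from the classes the notebook imports (`StandardScaler`, `RandomForestClassifier`, `joblib`); the pipeline layout, hyperparameters, and the file name `mildew_detector_model.pkl` are assumptions rather than the author's confirmed choices. A random forest has no training epochs, so Keras-style early stopping does not apply here.

```python
import os
import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_model(n_estimators=100, random_state=42):
    """Scale the flattened pixel features, then classify with a random forest."""
    return make_pipeline(
        StandardScaler(),
        RandomForestClassifier(n_estimators=n_estimators, random_state=random_state),
    )


def fit_and_save(X_train, y_train, out_dir, filename="mildew_detector_model.pkl"):
    """Fit the pipeline and store it with joblib; returns (model, saved_path)."""
    model = build_model()
    model.fit(X_train, y_train)
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, filename)
    joblib.dump(model, path)
    return model, path
```

Here `X_train` would be the stacked output of `load_images_from_folder` for both labels and `y_train` the matching label list. Printing the returned pipeline shows its steps and parameters, which serves as a simple model summary.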


---

## Model Performance

---

### Model learning curve
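
A Keras-style learning curve tracks loss per epoch, which a random forest does not have. As a swapped-in analogue, the sketch below uses scikit-learn's `learning_curve` to measure accuracy against training-set size. The estimator settings, the five size steps, and the helper name `compute_learning_curve` are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import learning_curve


def compute_learning_curve(X, y, cv=3, random_state=42):
    """Return training sizes with mean train and validation accuracy per size."""
    model = RandomForestClassifier(n_estimators=50, random_state=random_state)
    sizes, train_scores, val_scores = learning_curve(
        model,
        X,
        y,
        train_sizes=np.linspace(0.2, 1.0, 5),
        cv=cv,
        scoring="accuracy",
        shuffle=True,
        random_state=random_state,
    )
    # Average over the cross-validation folds (axis 1).
    return sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)
```

The three returned arrays can be passed to `plt.plot` (sizes on the x-axis, one line per score) and saved to the versioned output folder. A large gap between the two lines suggests overfitting.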

### Model Evaluation

Load saved model

Evaluate model on test set

Save evaluation pickle
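
The three evaluation steps above can be sketched as one function, assuming the model was stored with `joblib` and the test images were flattened the same way as the training images. The dictionary keys and the file name `evaluation.pkl` are hypothetical.

```python
import os
import joblib
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix


def evaluate_and_save(model_path, X_test, y_test, out_dir, target_names=None):
    """Load a saved model, score it on the test set, and pickle the results."""
    model = joblib.load(model_path)
    y_pred = model.predict(X_test)
    evaluation = {
        "accuracy": float(np.mean(y_pred == np.asarray(y_test))),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "report": classification_report(
            y_test, y_pred, target_names=target_names, zero_division=0
        ),
    }
    os.makedirs(out_dir, exist_ok=True)
    # File name is an assumption; it sits next to the other versioned outputs.
    joblib.dump(evaluation, os.path.join(out_dir, "evaluation.pkl"))
    return evaluation
```

Printing `evaluation["report"]` shows per-class precision and recall, and the confusion matrix can be drawn with `sns.heatmap`.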

### Predict on new data

Load a random image as PIL

Convert image to array and prepare for prediction

Predict class probabilities
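
A minimal sketch of the prediction step, assuming the image has already been loaded and resized to the saved `image_shape` (for example with PIL or `cv2.resize`) and that the model was trained on integer class indices. `predict_leaf` and `index_to_label` are hypothetical names.

```python
import numpy as np


def predict_leaf(model, img_array, index_to_label):
    """Return (label, probability) for one already-resized image array."""
    # Flatten to a single row, matching the flattened training features.
    features = np.asarray(img_array).reshape(1, -1)
    proba = model.predict_proba(features)[0]
    best = int(np.argmax(proba))
    # Assumption: model.classes_ holds integer indices such as 0 and 1.
    class_index = int(model.classes_[best])
    return index_to_label[class_index], float(proba[best])
```

With `index_to_label = {0: "healthy", 1: "powdery_mildew"}`, the call returns the predicted label together with the probability the model assigns to it.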

---

## Push files to Repo

#### Push generated/new files from this Session to GitHub repo

* .gitignore

In [None]:
!cat .gitignore

* Git status

In [None]:
!git status

* Git add

In [None]:
!git add .

* Git commit

In [None]:
!git commit -am "Add new plots"

* Git push

In [None]:
!git push