# **Modelling and Evaluation**

## Objectives

### Answer **business requirement 2**:
- The client is interested in predicting whether a cherry leaf is healthy or contains powdery mildew. 

## Inputs

* inputs/cherryleaves_dataset/leaf_images/train
* inputs/cherryleaves_dataset/leaf_images/validation
* inputs/cherryleaves_dataset/leaf_images/test
* image shape embeddings

## Outputs

* Images distribution plot in train, validation and test set.
* Image augmentation.
* Class indices to map prediction output to class labels.
* Machine Learning model creation and training.
* Save model.
* Learning curve plot for model performance.
* Model evaluation saved as a pickle file.
* Prediction on random image file.


---

## Import Packages 

In [None]:
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.image import imread


---
##  Set Working directory

In [None]:
import os
current_dir = os.getcwd()
current_dir

In [None]:
os.chdir(os.path.dirname(current_dir))
print("You set a new current directory")

In [None]:
work_dir = os.getcwd()
work_dir

### Set Input Directories
Set train, validation, and test paths

In [None]:
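my_data_dir = 'inputs/cherryleaves_dataset/leaf_images'
train_path = my_data_dir + '/train'
val_path = my_data_dir + '/validation'
test_path = my_data_dir + '/test'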

### Set Output Directory 

In [None]:
# version = 'v1'
# file_path = f'outputs/{version}'

# if 'outputs' in os.listdir(work_dir) and version in os.listdir(work_dir + '/outputs'):

# etc 
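The version check in this cell can be sketched as a small helper. This is only one possible way to complete it: `prepare_output_dir` is a hypothetical name, and the `outputs/<version>` layout is assumed from the `outputs/v1/...` paths used later in this notebook. The demonstration runs in a temporary directory so that no real outputs are touched.

```python
import os
import tempfile


def prepare_output_dir(work_dir, version):
    """Create outputs/<version> under work_dir unless it already exists.

    Returns (file_path, created), where created is False when the
    version folder was already there.
    """
    file_path = os.path.join(work_dir, 'outputs', version)
    if os.path.isdir(file_path):
        # An older run already used this version name; nothing is overwritten.
        return file_path, False
    os.makedirs(file_path)
    return file_path, True


# Demonstration in a throwaway directory (hypothetical, not the project folder).
with tempfile.TemporaryDirectory() as demo_dir:
    first = prepare_output_dir(demo_dir, 'v1')
    second = prepare_output_dir(demo_dir, 'v1')
    print(first[1], second[1])
```

Returning a flag instead of raising keeps the cell safe to re-run: a second run with the same version name leaves the existing folder alone.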


### Set Labels 

In [None]:
# labels = os.listdir(train_path)
# print(f'Labels are: {labels}')

### Set Image Shape

In [None]:
## Import saved image shape embedding 
# import joblib
# version = 'v1'

# etc 

---
## Number of Images in Train, Test, and Validation Datasets


In [None]:
# df_freq = pd.DataFrame([])
# etc 
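A minimal sketch of how the per-set, per-label image counts could be collected, using only the standard library. The helper name `count_images`, the column names, and the label folder names (`healthy`, `powdery_mildew`) are assumptions for illustration; the demonstration builds a tiny fake dataset of empty files rather than reading the real images.

```python
import os
import tempfile


def count_images(data_dir, sets=('train', 'validation', 'test')):
    """Return one record per (set, label) pair with its file count."""
    records = []
    for set_name in sets:
        set_dir = os.path.join(data_dir, set_name)
        for label in sorted(os.listdir(set_dir)):
            n_files = len(os.listdir(os.path.join(set_dir, label)))
            records.append({'Set': set_name, 'Label': label, 'Frequency': n_files})
    return records


# Tiny fake dataset (empty placeholder files) to show the output format.
with tempfile.TemporaryDirectory() as demo_dir:
    for set_name, n in [('train', 3), ('validation', 1), ('test', 2)]:
        for label in ('healthy', 'powdery_mildew'):
            folder = os.path.join(demo_dir, set_name, label)
            os.makedirs(folder)
            for i in range(n):
                open(os.path.join(folder, f'leaf_{i}.jpg'), 'w').close()
    records = count_images(demo_dir)

for row in records:
    print(row)
```

Passing `records` to `pd.DataFrame(records)` would give a table that can be drawn as a bar plot, for example with `sns.barplot(data=df_freq, x='Set', y='Frequency', hue='Label')`.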

---
# Image Data Augmentation
---

## ImageDataGenerator used: 

In [None]:
# from tensorflow.keras.preprocessing.image import ImageDataGenerator

### **Initialise** ImageDataGenerator:


In [None]:
# augmented_image_data = ImageDataGenerator(etc)
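One possible set of augmentation settings is sketched below as a plain dictionary. The values are illustrative assumptions, not settings confirmed by this notebook; only the key names are fixed, since each is a keyword argument of `ImageDataGenerator`.

```python
# Illustrative augmentation settings (assumed values, not tuned for this dataset).
# Every key is a keyword argument accepted by
# tensorflow.keras.preprocessing.image.ImageDataGenerator.
augmentation_params = {
    'rotation_range': 20,        # random rotations of up to 20 degrees
    'width_shift_range': 0.10,   # horizontal shift, as a fraction of width
    'height_shift_range': 0.10,  # vertical shift, as a fraction of height
    'shear_range': 0.1,          # shear intensity
    'zoom_range': 0.1,           # zoom in or out by up to 10%
    'horizontal_flip': True,
    'vertical_flip': True,
    'fill_mode': 'nearest',      # how pixels exposed by a transform are filled
    'rescale': 1. / 255,         # scale pixel values from [0, 255] to [0, 1]
}

# With TensorFlow installed, the generator would then be created as:
# augmented_image_data = ImageDataGenerator(**augmentation_params)
```

Keeping the settings in a dictionary makes it easy to record them alongside the saved model for a given version.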

### Augment training image dataset:

In [None]:
"""
# set batch size first 
batch_size = 20
train_set = augmented_image_data.flow_from_directory(train_path,
                                              target_size=image_shape[:2],
                                              color_mode='rgb',
                                              batch_size=batch_size,
                                              class_mode='binary',
                                              shuffle=True
                                              )

train_set.class_indices
"""

### Rescale validation image dataset (no augmentation):

In [None]:
"""
validation_set = ImageDataGenerator(rescale=1./255).flow_from_directory(val_path,
                                                          target_size=image_shape[:2],
                                                          color_mode='rgb',
                                                          batch_size=batch_size,
                                                          class_mode='binary',
                                                          shuffle=False
                                                          )

validation_set.class_indices
"""

### Rescale test image dataset (no augmentation):

In [None]:
"""
test_set = ImageDataGenerator(rescale=1./255).flow_from_directory(test_path,
                                                    target_size=image_shape[:2],
                                                    color_mode='rgb',
                                                    batch_size=batch_size,
                                                    class_mode='binary',
                                                    shuffle=False
                                                    )

test_set.class_indices
"""

### **Plot Augmented Training Image**

In [None]:
"""
for _ in range(3):
    img, label = next(train_set)
    print(img.shape)   
    plt.imshow(img[0])
    plt.show()
"""

### **Plot Validation and Test Images**

In [None]:
"""
for _ in range(3):
    img, label = next(validation_set)
    print(img.shape)   
    plt.imshow(img[0])
    plt.show()
"""

In [None]:
"""
for _ in range(3):
    img, label = next(test_set)
    print(img.shape)  
    plt.imshow(img[0])
    plt.show()
"""

### Save class indices:

In [None]:
"""
joblib.dump(value=train_set.class_indices ,
            filename=f"{file_path}/class_indices.pkl")
"""
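`flow_from_directory` assigns class indices in alphanumeric order of the subfolder names, so the saved mapping is what lets a numeric prediction be translated back into a label. The sketch below mimics that assignment and inverts it; the folder names are assumptions about how the dataset is organised.

```python
# Hypothetical subfolder names found under the train directory.
folder_names = ['powdery_mildew', 'healthy']

# Same rule flow_from_directory follows: indices in alphanumeric order.
class_indices = {name: index for index, name in enumerate(sorted(folder_names))}

# Invert the mapping so a predicted class index can be turned into a label.
index_to_label = {index: label for label, index in class_indices.items()}

print(class_indices)
print(index_to_label[1])
```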

---
# Model Creation
---

## ML Model

### Import Model Packages:

In [None]:
"""
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dropout, Flatten, Dense, Conv2D, MaxPooling2D
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.utils import plot_model
"""

## Model

In [None]:
def image_tf_model():
    """
    Docstring for this model
    """

    # model = Sequential()
    # input layer
    # model.add(etc)

    # return model

### Model Summary 

In [None]:
# image_tf_model().summary()

### Early Stopping
* Uses the Keras `EarlyStopping` callback to stop training once the validation loss stops improving:

In [None]:
# early_stop = EarlyStopping(monitor='val_loss',patience=3)
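To make the effect of `patience=3` concrete, here is a simplified pure-Python illustration of the stopping rule. It is not the Keras implementation, only a sketch of the idea: training stops once the monitored validation loss has failed to improve on its best value for `patience` consecutive epochs. The function name and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 0-based epoch at which training would stop, or None.

    Simplified illustration: training stops once the validation loss has
    failed to improve on its best value for `patience` consecutive epochs.
    """
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses: the best value is reached at epoch 2.
val_losses = [0.60, 0.45, 0.40, 0.41, 0.42, 0.43, 0.39]
stopped_at = early_stopping_epoch(val_losses, patience=3)
print(stopped_at)
```

In this example the improvement at the last epoch is never seen, because training has already stopped. By default Keras keeps the weights from the final epoch; passing `restore_best_weights=True` to `EarlyStopping` restores those from the best epoch instead.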

## Fit the Model for Training
* Using the .fit() method 

In [None]:
"""
model = image_tf_model()
model.fit(train_set,
          epochs=25,
          steps_per_epoch = len(train_set.classes) // batch_size,
          validation_data=validation_set,
          callbacks=[early_stop],
          verbose=1
          )
"""
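The `steps_per_epoch` expression uses floor division, so each epoch runs only the full batches. A short worked example, with a hypothetical number of training images:

```python
# Hypothetical number of training images; the real value is len(train_set.classes).
n_train_images = 2944
batch_size = 20

steps_per_epoch = n_train_images // batch_size   # full batches per epoch
images_left_over = n_train_images % batch_size   # not covered by those full batches

print(steps_per_epoch, images_left_over)
```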

### Save Model:

In [None]:
# model.save('outputs/v1/powdery_mildew_detector_model.h5')

---
# Model Performance
---

## Model Learning Curve


In [None]:
"""
losses = pd.DataFrame(model.history.history)

sns.set_style("darkgrid") # CHANGE THE SEABORN STYLE
losses[['loss','val_loss']].plot(style='.-')
plt.title("Loss")
plt.savefig(f'{file_path}/model_training_losses.png', bbox_inches='tight', dpi=150)
plt.show()


print("\n")
losses[['accuracy','val_accuracy']].plot(style='.-')
plt.title("Accuracy")
plt.savefig(f'{file_path}/model_training_acc.png', bbox_inches='tight', dpi=150)
plt.show()
"""
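When reading the learning curves, two quantities are often worth checking: the epoch with the lowest validation loss, and the gap between validation and training loss at the end of training. The sketch below computes both from a made-up history dictionary; the numbers are invented for illustration and are not results from this model.

```python
# Made-up training history with the same keys Keras records when the model
# is compiled with metrics=['accuracy'].
history = {
    'loss': [0.52, 0.31, 0.20, 0.12, 0.07],
    'val_loss': [0.40, 0.27, 0.22, 0.24, 0.29],
    'accuracy': [0.78, 0.89, 0.93, 0.96, 0.98],
    'val_accuracy': [0.84, 0.91, 0.93, 0.92, 0.91],
}

# Epoch (0-based) with the lowest validation loss.
best_epoch = min(range(len(history['val_loss'])), key=history['val_loss'].__getitem__)

# Gap between validation and training loss at the last epoch;
# a gap that keeps growing is a common sign of overfitting.
final_gap = history['val_loss'][-1] - history['loss'][-1]

print(best_epoch, round(final_gap, 2))
```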

## Model Evaluation

### Load saved model:

In [None]:
"""
from tensorflow.keras.models import load_model
model = load_model('outputs/v1/powdery_mildew_detector_model.h5')
"""

### Evaluate model on test set:

In [None]:
# evaluation = model.evaluate(test_set) 

### Save evaluation pickle:

In [None]:
"""
joblib.dump(value=evaluation ,
            filename=f"outputs/v1/evaluation.pkl")
"""

## Predict On New Image Data 

Load random image as PIL:

In [None]:
"""
from tensorflow.keras.preprocessing import image

pointer = 66 
label = labels[0] # select healthy or Powdery Mildew

pil_image = image.load_img(test_path + '/'+ label + '/'+ os.listdir(test_path+'/'+ label)[pointer],
                          target_size=image_shape[:2], color_mode='rgb')
print(f'Image shape: {pil_image.size}, Image mode: {pil_image.mode}')
pil_image
"""

Convert image to an array to make a prediction:

In [None]:
# my_image = image.img_to_array(pil_image)
# etc 
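The remaining preparation steps are typically to add a batch dimension and to apply the same rescaling used for the validation and test generators. A minimal NumPy sketch follows; the random array stands in for the output of `image.img_to_array(pil_image)`, and the 256x256x3 shape is an assumption, since the notebook reads the real shape from the saved image shape embedding.

```python
import numpy as np

# Stand-in for image.img_to_array(pil_image): a random RGB image.
# The 256x256x3 shape is an assumption for this sketch.
rng = np.random.default_rng(seed=0)
my_image = rng.integers(0, 256, size=(256, 256, 3)).astype('float32')

# Add a leading batch dimension and rescale to [0, 1], matching the
# rescale=1./255 applied to the validation and test generators.
my_image = np.expand_dims(my_image, axis=0) / 255

print(my_image.shape)
```

The model expects a batch of images, which is why a single image still needs the extra leading axis.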

Predict class probabilities:

In [None]:
# pred_proba = model.predict(my_image)[0, 0]
# etc 
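Assuming the model ends in a single sigmoid unit (consistent with `class_mode='binary'` and the `[0, 0]` indexing above), `pred_proba` is the predicted probability of the class with index 1. The sketch below turns that number into a label and the probability of that label. `interpret_prediction`, the class mapping, and the example output are all hypothetical.

```python
def interpret_prediction(pred_proba, class_indices):
    """Turn a sigmoid output into (label, probability of that label).

    pred_proba is the predicted probability of the class with index 1.
    """
    index_to_label = {index: label for label, index in class_indices.items()}
    if pred_proba >= 0.5:
        return index_to_label[1], pred_proba
    # Below the threshold the model favours class 0, whose probability
    # is the complement of the sigmoid output.
    return index_to_label[0], 1 - pred_proba


# Hypothetical class mapping and model output.
class_indices = {'healthy': 0, 'powdery_mildew': 1}
label, proba = interpret_prediction(0.12, class_indices)
print(label, round(proba, 2))
```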

# Push files to Repository 

## Pushing generated/new files to GitHub repository

* .gitignore 

In [None]:
# !cat .gitignore  # print the .gitignore contents to check them

* Add, commit, push files to GitHub. 

---
## Conclusions and Next Steps
* The model has now been created and trained, and can predict on new/live data.
* The dashboard will now be created utilising the model and files created in the notebooks. 

In [None]:
import os
try:
    # create here your folder
    # os.makedirs(name='')
    pass
except Exception as e:
    print(e)
