# **Modeling and Evaluation Notebook**

## Objectives

* Answer business requirement 2:
    * Predict whether a leaf is infected with powdery mildew or not.

## Inputs

* The client provides leaf images from the following dataset folders:
    * inputs/cherry-leaves_dataset/cherry-leaves/train
    * inputs/cherry-leaves_dataset/cherry-leaves/validation
    * inputs/cherry-leaves_dataset/cherry-leaves/test
* Image shape embeddings

## Outputs

* Image distribution plot for the train, validation, and test sets
* Image augmentation where needed
* Class indices to map prediction inference to labels
* Machine learning model creation and training
* Saved model
* Learning curve plot for model performance
* Model evaluation saved as a pickle file
* Prediction on a random image file



---

## Import regular packages 

In [1]:
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.image import imread

## Change working directory

* Since we are storing the notebooks in a subfolder, we change the working directory.

In [2]:
# Remember where the notebook started, then move to the project root
current_dir = os.getcwd()
os.chdir('/workspace/mildew-detection-in-cherry-leaves')
print("You set a new current directory")

You set a new current directory


Confirm the new current directory

In [3]:
current_dir = os.getcwd()
current_dir

'/workspace/mildew-detection-in-cherry-leaves'
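
The absolute path used above only works in a workspace with that exact layout. As a minimal sketch of a more portable alternative, the project root could be located by walking up from the notebook folder until a marker entry is found. `find_project_root` and the `README.md` marker are hypothetical choices for illustration, not part of the original notebook.

```python
import os


def find_project_root(start, marker="README.md"):
    """Walk upwards from `start` until a directory containing `marker` is found.

    `marker` is an assumption: any file or folder that exists only at the
    repository root would work.
    """
    current = os.path.abspath(start)
    while True:
        if os.path.exists(os.path.join(current, marker)):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without a match
            raise FileNotFoundError(
                f"No directory containing {marker!r} found above {start!r}"
            )
        current = parent
```

It could then be used as `os.chdir(find_project_root(os.getcwd()))` in place of the hard-coded path.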

## Set input directories 

Set the train, validation, and test set paths

In [4]:
my_data_dir = 'inputs/cherry-leaves_dataset/cherry-leaves'
train_path = my_data_dir + '/train'
val_path = my_data_dir + '/validation'
test_path = my_data_dir + '/test'
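
A typo in any of these paths would only surface later, when the images are read. The sketch below, assuming the same `train`/`validation`/`test` folder layout, builds the paths with `os.path.join` and fails early if a folder is missing. `build_split_paths` is a hypothetical helper name.

```python
import os


def build_split_paths(data_dir, splits=("train", "validation", "test")):
    """Return a {split: path} dict, raising early if a split folder is missing."""
    paths = {split: os.path.join(data_dir, split) for split in splits}
    missing = [path for path in paths.values() if not os.path.isdir(path)]
    if missing:
        raise FileNotFoundError(f"Missing dataset folders: {missing}")
    return paths
```

For example, `paths = build_split_paths(my_data_dir)` followed by `train_path = paths["train"]` would give the same value as the cell above on a POSIX system.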

Set output directory 

In [5]:
version = 'v1'
file_path = f'outputs/{version}'

# Create the output folder for this version unless it already exists
if 'outputs' in os.listdir(current_dir) and version in os.listdir(current_dir + '/outputs'):
    print('Old Version is already available, create a new version.')
else:
    os.makedirs(name=file_path)

Old Version is already available, create a new version.
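
The existence check can also be delegated to the standard library: `os.makedirs` accepts `exist_ok=True`, which makes the call safe to repeat. The following is a minimal sketch under that approach. `ensure_output_dir` is a hypothetical helper, and the returned flag stands in for the printed message above.

```python
import os


def ensure_output_dir(version, base="outputs"):
    """Create `base/version` if needed and return (path, created)."""
    path = os.path.join(base, version)
    created = not os.path.isdir(path)  # True only when the folder is new
    os.makedirs(path, exist_ok=True)   # no error if the folder already exists
    return path, created
```

Calling `file_path, created = ensure_output_dir('v1')` would then leave an existing `outputs/v1` folder untouched and report `created` as `False`.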


Set Labels

In [6]:
labels = os.listdir(train_path)

print(
    f"Project Labels: {labels}"
    )

Project Labels: ['healthy', 'powdery_mildew']
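
`os.listdir` returns entries in arbitrary order and also includes any stray files, such as hidden system files. A minimal sketch of a more defensive label lookup, assuming each class is a sub-folder of the train folder, could look like this. `list_labels` is a hypothetical helper name.

```python
import os


def list_labels(train_dir):
    """Return the class sub-folder names in `train_dir`, sorted alphabetically."""
    return sorted(
        entry
        for entry in os.listdir(train_dir)
        # Keep only visible directories; skip files and hidden entries
        if os.path.isdir(os.path.join(train_dir, entry))
        and not entry.startswith(".")
    )
```

Sorting keeps the label order stable between runs, which matters when labels are later matched to class indices.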


Set image shape 

In [7]:
import joblib
version = 'v1'
image_shape = joblib.load(filename=f"outputs/{version}/image_shape.pk1")
image_shape

(256, 256, 3)

# Number of images in train, test and validation data

Count the images for each label in each set, plot the counts as a bar chart, and save the plot in the `outputs/{version}` directory.

In [23]:
# Count the images per set and label, then plot the distribution
rows = []
for folder in ['train', 'validation', 'test']:
    for label in labels:
        # Count the files in this specific set/label folder
        n_images = len(os.listdir(f"{my_data_dir}/{folder}/{label}"))
        rows.append({'Set': folder, 'Label': label, 'Frequency': n_images})
        print(f"* {folder} - {label}: {n_images} images")

# Build the DataFrame in one step (DataFrame.append was removed in pandas 2.0)
df_freq = pd.DataFrame(rows)

print("\n")
sns.set_style('darkgrid')
plt.figure(figsize=(8, 5))
sns.barplot(data=df_freq, x='Set', y='Frequency', hue='Label')
plt.savefig(f"{file_path}/labels_distribution.png", bbox_inches='tight', dpi=200)
plt.show()

---

# Push files to Repo

In [None]:
import os
try:
    # create here your folder
    # os.makedirs(name='')
    pass  # placeholder so the otherwise empty try block is valid Python
except Exception as e:
    print(e)
