# **Data Visualization**

## Objectives
Develop, train, and evaluate a machine learning model for image classification, with image augmentation and a performance analysis based on several metrics and visualizations. The work covers modifying class indices, plotting augmented images, creating and summarizing the model, and saving the trained model. Model performance is assessed with accuracy, ROC curves, and classification reports; the confusion matrix is then plotted and the evaluation results are saved. Finally, the model makes a prediction on a random image file.

## Inputs
* Images are taken from the `test`, `train`, and `validation` folders and their subfolders.
```
└───inputs/ 
    └───potato_disease_dataset/ 
        ├───test/
        │   ├───healthy
        │   ├───early_blight
        │   └───late_blight                   
        ├───train/
        │   ├───healthy
        │   ├───early_blight
        │   └───late_blight          
        └───validation/
            ├───healthy
            ├───early_blight
            └───late_blight               
```
* Image shape embeddings.
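
As a quick sanity check on the folder layout above, the images in each split and label folder can be counted. This is a minimal sketch, not part of the original notebook: `count_images_per_label` is a hypothetical helper, and it assumes the notebook runs from the project root so that `inputs/potato_disease_dataset` resolves.

```python
import os


def count_images_per_label(dataset_dir):
    """Return {split: {label: n_files}} for a dataset laid out as split/label/files.

    Hypothetical helper for illustration; it counts every file in a label
    folder and does not check that the file is a valid image.
    """
    counts = {}
    for split in sorted(os.listdir(dataset_dir)):
        split_path = os.path.join(dataset_dir, split)
        if not os.path.isdir(split_path):
            continue
        counts[split] = {}
        for label in sorted(os.listdir(split_path)):
            label_path = os.path.join(split_path, label)
            if os.path.isdir(label_path):
                counts[split][label] = sum(
                    os.path.isfile(os.path.join(label_path, name))
                    for name in os.listdir(label_path)
                )
    return counts


# Path taken from the tree above; guarded so the cell is safe to run
# even when the dataset has not been downloaded yet.
dataset_dir = os.path.join("inputs", "potato_disease_dataset")
if os.path.isdir(dataset_dir):
    for split, labels in count_images_per_label(dataset_dir).items():
        print(split, labels)
```

A large imbalance between labels within a split would be worth knowing about before training, since it affects how accuracy should be read.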


## Outputs
- Image augmentation.
    - Plot augmented images for each dataset.
- Modify class indices to change how predictions map to labels.
- Create a machine learning model and display its summary.
    - Train the model.
    - Save the model.
    - Plot the learning curve to show model performance.
        - Model A - generate separate plots for accuracy and loss.
        - Model B - create a comprehensive model history plot.
        - Model C - visualize model history using Plotly.
- Evaluate the model using a saved file.
    - Calculate accuracy.
    - Plot the ROC curve.
    - Generate a classification report for Model A.
    - Model B - provide a classification report including macro average and weighted average.
    - Model C - produce a synthetic classification report per label.
- Plot the confusion matrix.
- Save the evaluation results in a pickle file.
- Predict on a random image file.

---

## Import necessary packages for this notebook

```python
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.image import imread
```

---

## Set Working Directory

The notebooks live in a subfolder, so when running them in the editor we need to change the working directory from the notebook folder to its parent folder.
* We access the current directory with `os.getcwd()`

```python
current_dir = os.getcwd()
current_dir
```

```
'/workspace/ci-ms5-spudscan/jupyter_notebooks'
```

We want to make the parent of the current directory the new current directory.
* `os.path.dirname()` gets the parent directory
* `os.chdir()` sets the new current directory

```python
os.chdir(os.path.dirname(current_dir))
print("You set a new current directory")
```

```
You set a new current directory
```


Confirm the new current directory.

```python
current_dir = os.getcwd()
current_dir
```

```
'/workspace/ci-ms5-spudscan'
```
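
One caveat with the `os.chdir()` step above: running that cell a second time moves up another level, out of the project folder. A possible guard is sketched below. It is an illustration rather than the notebook's own code; `move_to_project_root` is a hypothetical helper, and the folder name `jupyter_notebooks` is taken from the path printed earlier.

```python
import os


def move_to_project_root(notebook_folder="jupyter_notebooks"):
    """Move to the parent directory only while still inside the notebook folder.

    Hypothetical helper: calling it repeatedly is safe, because the
    directory changes only when the current folder name matches.
    """
    cwd = os.getcwd()
    if os.path.basename(cwd) == notebook_folder:
        os.chdir(os.path.dirname(cwd))
    return os.getcwd()


# Safe to re-run: after the first call the folder name no longer matches.
print(move_to_project_root())
```

The check relies on the folder name alone, so it would need adjusting if the notebooks were moved or the folder renamed.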

---