# **Data Visualization**

<p style="text-align: center;">
    <img style="width: 35%; height: 20%; float: left;" src="../assets/images/data_visualization.jpg" alt="Data Visualization image">
</p>

## Objectives

* Address the first business requirement:

  _**"The client is interested in conducting a study to visually differentiate a cherry leaf that is healthy from one that contains powdery mildew."**_

## Inputs Required

* Image data will be sourced from the following directories and their subfolders:

  - **Training Images**: inputs/cherry_leaves_dataset/cherry-leaves/train
  - **Validation Images**: inputs/cherry_leaves_dataset/cherry-leaves/validation
  - **Test Images**: inputs/cherry_leaves_dataset/cherry-leaves/test

## Generated Outputs

1. Embeddings for Image Shapes.
2. Plot of the average image and the image variability per label.
3. Plot contrasting healthy leaves with leaves infected with powdery mildew.
4. Code that satisfies business requirement 1 and can be used to build an image montage on the Streamlit dashboard.

---

# Set up the working environment

## Import libraries

In [1]:
import os
import joblib
import itertools
import random
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.image import imread
from tensorflow.keras.preprocessing import image
sns.set_style("white")
print("\033[92mLibraries Imported Successfully!\033[0m")

Libraries Imported Successfully!


# Change working directory

* To keep the application's folder structure simple, the working directory must be moved from the notebook folder to its parent folder. First, use `os.getcwd()` to get the current directory.

In [2]:
current_dir = os.getcwd()
current_dir

'/workspaces/mildew-detection-in-cherry-leaves/jupyter_notebooks'

* To update the current directory to its parent directory, follow these steps:

  * Use `os.path.dirname()` to obtain the parent directory.
  * Use `os.chdir()` to make the parent directory the new current directory.

In [3]:
os.chdir(os.path.dirname(current_dir))
print("\033[92mYou set a new current directory!\033[0m")

You set a new current directory!


* Confirm the new current directory.

In [4]:
new_current_dir = os.getcwd()
new_current_dir

'/workspaces/mildew-detection-in-cherry-leaves'
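
Re-running the cell that calls `os.chdir()` moves up one more level each time, which can leave the notebook outside the project folder. The following is a minimal sketch of a guard against that. It assumes the notebooks live in a folder named `jupyter_notebooks`, as in the path printed earlier; the helper name `project_root_from` is hypothetical and not part of the original notebook.

```python
import os


def project_root_from(current_dir, notebooks_folder="jupyter_notebooks"):
    """Return the parent of current_dir only if current_dir is the notebooks folder.

    Hypothetical helper: the folder name is an assumption taken from the
    path printed in this notebook, not a fixed convention.
    """
    normalized = os.path.normpath(current_dir)
    if os.path.basename(normalized) == notebooks_folder:
        # We are inside the notebooks folder, so step up one level.
        return os.path.dirname(normalized)
    # Already at (or outside) the project root: leave the path unchanged.
    return normalized
```

With this helper, `os.chdir(project_root_from(os.getcwd()))` can be run repeatedly without climbing above the project root.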

# Set input and output directory paths

**Inputs**

In [5]:
data_dir = 'inputs/cherry_leaves_dataset/cherry-leaves'
train_path = data_dir + '/train'
validation_path = data_dir + '/validation'
test_path = data_dir + '/test'
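
Before going further, it may help to confirm that the three input folders exist, since a missing folder otherwise only surfaces later as a less obvious error. This is a sketch under the assumption that the dataset has already been downloaded and split; `find_missing_dirs` is a hypothetical helper name.

```python
import os


def find_missing_dirs(base_dir, subfolders=("train", "validation", "test")):
    """Return the expected subfolder paths that do not exist under base_dir."""
    expected = [os.path.join(base_dir, name) for name in subfolders]
    return [path for path in expected if not os.path.isdir(path)]
```

For example, `find_missing_dirs(data_dir)` returns an empty list when all three folders are present; a non-empty result could be used to raise a `FileNotFoundError` with the missing paths.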

**Outputs**

In [6]:
version = 'V_1'

file_path = f'outputs/{version}'
version_file_path = os.path.join(new_current_dir, file_path)

if os.path.exists(version_file_path):
    # The version directory already exists, so nothing is created.
    print(f"\033[91mVersion {version} already exists. Create a new version please! \033[0m")
else:
    # Create the version directory, including any missing parent folders.
    os.makedirs(name=version_file_path)
    print(f"\033[92mVersion {version} created successfully! \033[0m")

Version V_1 already exists. Create a new version please!

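The existence check can also be wrapped in a small function so that the calling code can decide how to react to an existing version. The sketch below is one possible shape, not the notebook's own method; `ensure_version_dir` is a hypothetical name, and the `outputs/<version>` layout is taken from the cell above.

```python
import os


def ensure_version_dir(base_dir, version):
    """Create outputs/<version> under base_dir if it does not exist yet.

    Returns (path, created), where created is False when the folder
    already existed before the call.
    """
    path = os.path.join(base_dir, "outputs", version)
    created = not os.path.isdir(path)
    # exist_ok=True makes the call safe to re-run.
    os.makedirs(path, exist_ok=True)
    return path, created
```

Calling `ensure_version_dir(new_current_dir, version)` returns the full path together with a flag that can drive the "already exists" or "created successfully" message.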

# Set label names

In [7]:
labels = os.listdir(train_path)
print('Labels for the images are:', labels)

Labels for the images are: ['healthy', 'powdery_mildew']
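
Note that `os.listdir()` returns entries in arbitrary order and also includes any stray files in the folder. If later steps depend on a stable label order, a sorted, directories-only listing is one option. This is a sketch; `list_labels` is a hypothetical helper, and it assumes each class is stored in its own subfolder of the training directory.

```python
import os


def list_labels(train_dir):
    """Return the class folder names in train_dir, sorted alphabetically."""
    return sorted(
        entry
        for entry in os.listdir(train_dir)
        # Keep only subfolders; skip files such as hidden system files.
        if os.path.isdir(os.path.join(train_dir, entry))
    )
```

It can be used as a drop-in replacement for the listing above: `labels = list_labels(train_path)`.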


---