# Data Visualisation

## Objectives

- This notebook addresses Business Requirement 1: analyzing and visualizing the differences between healthy leaves and those affected by mildew.

## Inputs

- inputs/cherry-leaves_dataset/cherry-leaves/train
- inputs/cherry-leaves_dataset/cherry-leaves/validation
- inputs/cherry-leaves_dataset/cherry-leaves/test

## Outputs

- Pickle files storing the image shape embeddings.
- Plots of the image mean and variability for each label.
- A contrast plot highlighting the differences between leaves with and without mildew.
- A randomly generated montage of healthy and mildew-affected leaves.

## Additional Comments | Insights | Conclusions

- This notebook outlines steps that offer data insights and include code to address Business Requirement 1.

---

## Import Libraries

In [1]:
import os

import joblib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib.image import imread

# Use a plain white background for all seaborn plots
sns.set_style('white')

## Set the working directory

The working directory must be changed from the notebook's folder to its parent folder, so that relative paths such as `inputs/...` and `outputs/...` resolve from the project root.

In [5]:
current_dir = os.getcwd()
current_dir

'/workspaces/mildew-detection-in-cherry-leaves/jupyter_notebooks'

We want to make the parent of the current directory the new current directory:
* os.path.dirname() gets the parent directory
* os.chdir() sets the new current directory

In [6]:
os.chdir(os.path.dirname(current_dir))
print("You set a new current directory")

You set a new current directory


Confirm the new current directory

In [None]:
current_dir = os.getcwd()
current_dir

'/workspaces/mildew-detection-in-cherry-leaves'
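Re-running the `os.chdir(os.path.dirname(current_dir))` cell moves up one more level each time, so a second run would leave the project root. A minimal sketch of a guard against that is shown here. It assumes the notebooks live in a folder named `jupyter_notebooks` (as in the path printed above). The helper name `move_to_project_root` is hypothetical and is not part of the original notebook.

```python
import os


def move_to_project_root(notebook_dir_name="jupyter_notebooks"):
    """Move to the parent folder only when currently inside the notebooks folder.

    Hypothetical helper: the folder name is an assumption taken from the
    path printed earlier in this notebook.
    """
    cwd = os.getcwd()
    if os.path.basename(cwd) == notebook_dir_name:
        # Still inside the notebooks folder, so step up one level
        os.chdir(os.path.dirname(cwd))
    # Otherwise leave the working directory untouched
    return os.getcwd()


project_root = move_to_project_root()
print(project_root)
```

Calling the helper a second time leaves the working directory unchanged, which makes the cell safe to re-run.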

---

## Set input directories

Set train, validation and test paths.

In [11]:
my_data_dir = 'inputs/cherry-leaves_dataset/cherry-leaves'
train_path = os.path.join(my_data_dir, 'train')
val_path = os.path.join(my_data_dir, 'validation')
test_path = os.path.join(my_data_dir, 'test')
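Before going further, it can help to confirm that the three input folders exist, since a wrong working directory otherwise only surfaces later as a `FileNotFoundError`. The following is a small sketch of such a check. The helper name `missing_directories` is hypothetical, and the folder names are the ones listed under Inputs.

```python
import os


def missing_directories(paths):
    """Return the entries of `paths` that are not existing directories (hypothetical helper)."""
    return [path for path in paths if not os.path.isdir(path)]


my_data_dir = 'inputs/cherry-leaves_dataset/cherry-leaves'
split_paths = [os.path.join(my_data_dir, split) for split in ('train', 'validation', 'test')]

missing = missing_directories(split_paths)
if missing:
    print('Missing input folders:', missing)
else:
    print('All input folders found.')
```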

## Set output directory

In [12]:
version = 'v1'
file_path = f'outputs/{version}'

# Create the versioned output folder unless it already exists
if os.path.isdir(os.path.join(current_dir, file_path)):
    print('A version already exists, please create a new version.')
else:
    os.makedirs(name=file_path)

A version already exists, please create a new version.
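An alternative to checking the folder listing by hand is to let `os.makedirs` handle the existing-folder case through its `exist_ok` parameter. The sketch below assumes the same `outputs/<version>` layout as the cell above. The helper name `ensure_output_dir` is hypothetical.

```python
import os


def ensure_output_dir(version, base_dir='outputs'):
    """Create base_dir/version if needed; report whether it was already there.

    Hypothetical helper, assuming the outputs/<version> layout used above.
    """
    file_path = os.path.join(base_dir, version)
    already_existed = os.path.isdir(file_path)
    # exist_ok=True avoids an error when the folder is already present
    os.makedirs(file_path, exist_ok=True)
    return file_path, already_existed
```

Calling `ensure_output_dir('v1')` returns the output path together with a flag that can be used to warn before overwriting the results of an earlier run.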


## Set Label Names

In [13]:
# Set the labels from the class sub-folder names (sorted for a stable order)
labels = sorted(os.listdir(train_path))
print('Labels for the images are', labels)

Labels for the images are ['healthy', 'powdery_mildew']
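`os.listdir` returns every entry in the folder in arbitrary order, so stray files (for example a hidden `.DS_Store`) would show up as labels. A minimal sketch that keeps only sub-folders and sorts them is given below. The helper name `list_labels` is hypothetical and not part of the original notebook.

```python
import os


def list_labels(split_path):
    """Return the sorted names of the sub-folders of `split_path` (hypothetical helper)."""
    return sorted(
        name
        for name in os.listdir(split_path)
        # Keep directories only, so stray files are not treated as labels
        if os.path.isdir(os.path.join(split_path, name))
    )
```

With the dataset layout used here, `list_labels(train_path)` should give the same two labels as the cell above.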


---