# Data Visualization for Mildew Detection in Cherry Leaves

## Objectives
* Understand the visual differences between healthy and powdery mildew-infected cherry leaves.
* Analyze the distribution of image sizes and check whether the dataset is balanced across classes.
* Prepare visual content that can be used for the dashboard or further analysis.

## Inputs
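* Preprocessed dataset under `inputs/cherry_leaves_dataset/cherry-leaves`, split into `train`, `validation`, and `test` folders, with images categorized into `healthy` and `powdery_mildew` classes.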

## Outputs
* Visualizations including sample images, average images, and variability images for each class.
* Intermediate files and plots saved to the output directory for reuse in the dashboard or in presentations.
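
The average and variability images listed above can be sketched as a per-pixel mean and standard deviation over a stack of equally sized images. This is a minimal illustration rather than the notebook's actual implementation: the helper name `mean_and_variability` is hypothetical, and the two tiny synthetic arrays stand in for real leaf images.

```python
import numpy as np


def mean_and_variability(images):
    """Return the per-pixel mean and standard deviation of equally sized images."""
    # Stack along a new first axis so each pixel position has one value per image.
    stack = np.stack([np.asarray(img, dtype=np.float64) for img in images], axis=0)
    return stack.mean(axis=0), stack.std(axis=0)


# Synthetic stand-in for real data: two 2x2 grayscale "images" (assumption).
demo_images = [np.zeros((2, 2)), np.full((2, 2), 2.0)]
avg_img, var_img = mean_and_variability(demo_images)
print("Average image:\n", avg_img)
print("Variability image:\n", var_img)
```

Real images would first need to be resized to a common shape, since `np.stack` requires every array to have the same dimensions.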

## Additional Comments
* Visualization is key to understanding the data and guiding the model development process.

---

# Set Data Directory

## Import libraries

In [None]:
import matplotlib.pyplot as plt
from matplotlib.image import imread
import seaborn as sns
import numpy as np
import os
from PIL import Image

## Set working directory

In [None]:
current_dir = os.getcwd()
print("Original working directory:", current_dir)

In [None]:
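# Change the current working directory to the project root.
# Assumption: the notebook lives one level below the project root;
# adjust relative_path_to_root if your layout differs.
relative_path_to_root = os.pardir
os.chdir(os.path.abspath(os.path.join(current_dir, relative_path_to_root)))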

# Verify the change
print("New current working directory:", os.getcwd())

## Set input directories

In [None]:
base_path = "inputs/cherry_leaves_dataset/cherry-leaves"
train_path = os.path.join(base_path, 'train')
val_path = os.path.join(base_path, 'validation')
test_path = os.path.join(base_path, 'test')
categories = ['healthy', 'powdery_mildew']
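
With the split and category paths defined, one way to check class balance is to count the image files in each folder. The sketch below is only an illustration: the helper `count_images` and the extension list are assumptions, not part of the original notebook, and a missing folder is reported as zero rather than raising an error.

```python
import os

# Assumption: images use one of these extensions; adjust to match the dataset.
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg')


def count_images(base_path, splits, categories):
    """Count image files per (split, category) folder under base_path."""
    counts = {}
    for split in splits:
        for category in categories:
            folder = os.path.join(base_path, split, category)
            if not os.path.isdir(folder):
                # Missing folders count as empty instead of raising.
                counts[(split, category)] = 0
                continue
            counts[(split, category)] = sum(
                1 for name in os.listdir(folder)
                if name.lower().endswith(IMAGE_EXTENSIONS)
            )
    return counts


counts = count_images(
    "inputs/cherry_leaves_dataset/cherry-leaves",
    ['train', 'validation', 'test'],
    ['healthy', 'powdery_mildew'],
)
for (split, category), n in counts.items():
    print(f"{split}/{category}: {n} images")
```

Roughly equal counts for `healthy` and `powdery_mildew` within each split would suggest the dataset is balanced.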

## Set output directory

In [None]:
# Define the base output directory name
output_base_path = "outputs/data_visualization"

# Optional: add versioning or categorization
version = "v1"
output_dir = os.path.join(output_base_path, version)

# Create the output directory if it doesn't exist
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
    print(f"Created output directory: {output_dir}")
else:
    print(f"Output directory already exists: {output_dir}")


## Set Label Names

In [None]:
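# Set the labels from the class subdirectories in train_path.
# Sorting keeps the order stable; the isdir check skips stray files such as .DS_Store.
labels = sorted(
    name for name in os.listdir(train_path)
    if os.path.isdir(os.path.join(train_path, name))
)
print('Labels for the images:', labels)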
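
Because the labels are read from disk while `categories` is hard-coded, it can help to confirm that the two agree. The following sketch shows one possible check; the helper `compare_labels` is hypothetical, and the `found` list is made-up example data rather than real output from this dataset.

```python
def compare_labels(found, expected):
    """Return (missing, unexpected) label names, each as a sorted list."""
    missing = sorted(set(expected) - set(found))
    unexpected = sorted(set(found) - set(expected))
    return missing, unexpected


# Example data (assumption): a directory listing that contains a stray file
# and lacks one of the expected classes.
found = ['healthy', '.DS_Store']
expected = ['healthy', 'powdery_mildew']

missing, unexpected = compare_labels(found, expected)
print("Missing labels:", missing)
print("Unexpected entries:", unexpected)
```

In the notebook this would be called as `compare_labels(labels, categories)`; two empty lists indicate that the folders on disk match the expected classes.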

---