# 04 — Hypothesis H1: Texture & Color Variability (GLCM)

**Objective**  
Quantify texture differences between healthy and mildew-infected cherry leaves using GLCM features to test H1:
> Mildew-infected leaves exhibit higher texture and color variability than healthy leaves.

**Inputs**  
- Image dataset: `inputs/cherry_leaves_dataset/{healthy, powdery_mildew}`

**Outputs**  
- CSV: `inputs/features/v1/glcm_features.csv` (per-image GLCM features)  
- CSV: `inputs/features/v1/glcm_stats.csv` (p-values & effect sizes)  
- CSV: `inputs/features/v1/glcm_feature_means.csv` (class-wise means/medians)  
- Plot: `plots/v2/glcm_boxplots.png` (boxplots per feature by class)

**Notes**  
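Each image is resized to 100×100 pixels, converted to grayscale, and quantized to 32 gray levels. GLCM features (contrast, energy, homogeneity, correlation) are computed at distances of 1, 2, and 3 pixels and angles of 0, π/4, π/2, and 3π/4, then averaged across the distance and angle combinations. Because the features are computed on grayscale intensities, they capture texture variability rather than color differences directly.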
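The feature-extraction cell itself is not part of this section, so the following is only a minimal NumPy sketch of the steps described in the notes, not the notebook's actual implementation. The function names (`quantize`, `glcm`, `glcm_features`, `mean_glcm_features`) are hypothetical. The offset convention (rows shift by sin, columns by cos), the symmetric normalized matrix, and energy defined as the square root of the angular second moment are assumptions modeled on common library conventions such as scikit-image's. Resizing is omitted; the input is assumed to be an 8-bit grayscale array.

```python
import numpy as np

GRAY_LEVELS = 32
DISTANCES = [1, 2, 3]
ANGLES = [0, 0.25, 0.5, 0.75]  # units of pi, as in the configuration cell


def quantize(gray, levels=GRAY_LEVELS):
    """Map 8-bit intensities (0..255) to integer bins 0..levels-1."""
    gray = np.asarray(gray, dtype=np.float64)
    return np.minimum((gray / 256.0 * levels).astype(np.int64), levels - 1)


def glcm(q, distance, angle, levels=GRAY_LEVELS):
    """Symmetric, normalized co-occurrence matrix for one (distance, angle)."""
    # Assumed offset convention: rows shift by sin(angle), columns by cos(angle).
    dr = int(round(np.sin(angle) * distance))
    dc = int(round(np.cos(angle) * distance))
    rows, cols = q.shape
    # Slice two aligned views so that b[r, c] is the neighbor of a[r, c].
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    a = q[r0:r1, c0:c1]
    b = q[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
    m = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(m, (a.ravel(), b.ravel()), 1.0)  # count co-occurring pairs
    m = m + m.T                                # make the matrix symmetric
    return m / m.sum()                         # normalize to probabilities


def glcm_features(p):
    """Contrast, energy, homogeneity, and correlation of a normalized GLCM."""
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt((((i - mu_i) ** 2) * p).sum())
    sd_j = np.sqrt((((j - mu_j) ** 2) * p).sum())
    if sd_i * sd_j < 1e-15:
        correlation = 1.0  # convention for a perfectly uniform image
    else:
        correlation = (((i - mu_i) * (j - mu_j)) * p).sum() / (sd_i * sd_j)
    return {
        "contrast": float((((i - j) ** 2) * p).sum()),
        "energy": float(np.sqrt((p ** 2).sum())),  # sqrt of ASM (assumed definition)
        "homogeneity": float((p / (1.0 + (i - j) ** 2)).sum()),
        "correlation": float(correlation),
    }


def mean_glcm_features(gray):
    """Average each feature over all distance and angle combinations."""
    q = quantize(gray)
    per_offset = [
        glcm_features(glcm(q, d, a * np.pi))  # convert units of pi to radians
        for d in DISTANCES
        for a in ANGLES
    ]
    return {k: float(np.mean([f[k] for f in per_offset])) for k in per_offset[0]}


# Usage on synthetic data: a random-noise patch versus a flat patch.
rng = np.random.default_rng(0)
noisy = rng.integers(0, 256, size=(100, 100), dtype=np.uint8)
flat = np.full((100, 100), 128, dtype=np.uint8)
print("noisy:", mean_glcm_features(noisy))
print("flat: ", mean_glcm_features(flat))
```

Averaging over the four angles makes the features less sensitive to leaf orientation, and averaging over several distances mixes fine and slightly coarser texture scales into one value per feature.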

In [1]:
from pathlib import Path
import sys

def find_project_root(start: Path) -> Path:
    """Check start and up to four parent folders for 'src'; return the first match, else start."""
    p = start
    for _ in range(5):
        if (p / "src").exists():
            return p
        p = p.parent
    return start

PROJECT_ROOT = find_project_root(Path.cwd())
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

# Note: this import rebinds PROJECT_ROOT to the value defined in src.paths.
from src.paths import DATA_DIR, PLOTS_DIR, PROJECT_ROOT

print("PROJECT_ROOT:", PROJECT_ROOT)
print("DATA_DIR:", DATA_DIR)
print("PLOTS_DIR:", PLOTS_DIR)

PROJECT_ROOT: C:\Users\ksstr\Documents\Coding\milestone-project-5
DATA_DIR: C:\Users\ksstr\Documents\Coding\milestone-project-5\inputs\cherry_leaves_dataset
PLOTS_DIR: C:\Users\ksstr\Documents\Coding\milestone-project-5\plots\v1


In [2]:
# Configuration and output directories
from pathlib import Path

CLASSES = ("healthy", "powdery_mildew")
ALLOWED = {".jpg", ".jpeg", ".png", ".JPG", ".JPEG", ".PNG"}

IMG_SIZE = (100, 100)           # (width, height)
GRAY_LEVELS = 32                # quantization levels for GLCM
DISTANCES = [1, 2, 3]           # pixel distances
ANGLES = [0, 0.25, 0.5, 0.75]   # angles in units of pi (0, π/4, π/2, 3π/4); multiply by pi to get radians
FEATURES = ["contrast", "energy", "homogeneity", "correlation"]

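# Output locations
# NOTE: this relative path resolves against the current working directory,
# so running from jupyter_notebooks/ writes to jupyter_notebooks/inputs/features/v1.
FEATURES_DIR = Path("inputs") / "features" / "v1"
FEATURES_DIR.mkdir(parents=True, exist_ok=True)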

PLOTS_V2_DIR = PLOTS_DIR.parent / "v2"
PLOTS_V2_DIR.mkdir(parents=True, exist_ok=True)

print("FEATURES_DIR:", FEATURES_DIR.resolve())
print("PLOTS_V2_DIR:", PLOTS_V2_DIR.resolve())

FEATURES_DIR: C:\Users\ksstr\Documents\Coding\milestone-project-5\jupyter_notebooks\inputs\features\v1
PLOTS_V2_DIR: C:\Users\ksstr\Documents\Coding\milestone-project-5\plots\v2
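
As an illustration of how `CLASSES` and `ALLOWED` might be used to collect the images for each class, here is a small sketch. The helper name `list_images` is hypothetical and not taken from the notebook, and lowercasing the suffix before the comparison is a simplification that makes the upper-case entries in `ALLOWED` unnecessary.

```python
from pathlib import Path

CLASSES = ("healthy", "powdery_mildew")
ALLOWED = {".jpg", ".jpeg", ".png"}  # lower-case only; suffixes are lowercased below


def list_images(data_dir, classes=CLASSES, allowed=ALLOWED):
    """Return {class_name: sorted list of image paths} for each class folder."""
    found = {}
    for cls in classes:
        folder = Path(data_dir) / cls
        if not folder.is_dir():
            found[cls] = []  # missing class folder: report it as empty
            continue
        found[cls] = sorted(
            p for p in folder.iterdir()
            if p.is_file() and p.suffix.lower() in allowed
        )
    return found
```

In the notebook this could be called as `list_images(DATA_DIR)`, and the per-class counts compared before extracting features to spot any class imbalance.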
