# SakuraScan – Modelling (PyTorch)

## Objectives
- Train a binary image classifier to distinguish healthy cherry leaves from leaves with powdery mildew.
- Use transfer learning with a pretrained CNN (ResNet18) so the model reuses features learned on ImageNet instead of training from scratch.
- Save the trained model for use in the SakuraScan Streamlit dashboard.

## Inputs
- Image dataset stored in `Data/source_images/healthy` and `Data/source_images/powdery_mildew`.

## Outputs
- Trained PyTorch model weights saved to `app_pages/src/models/sakuramodel_resnet18.pth`.
- Printed training and validation accuracy and loss.


In [None]:
"""
Model training script for SakuraScan using PyTorch and transfer learning.
"""

from pathlib import Path
import os
from typing import Tuple, Dict, List

import torch
from torch import nn, optim
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms, models

In [None]:
"""
Set up paths, constants, and device configuration.
"""

PROJECT_ROOT = Path('.').resolve()

DATA_DIR = PROJECT_ROOT / 'Data' / 'source_images'
MODEL_DIR = PROJECT_ROOT / 'app_pages' / 'src' / 'models'
MODEL_DIR.mkdir(parents=True, exist_ok=True)

MODEL_PATH = MODEL_DIR / 'sakuramodel_resnet18.pth'

BATCH_SIZE = 32  # Images processed per training step. Larger batches need fewer steps per epoch but use more memory.
NUM_EPOCHS = 8  # Full passes over the training set. Increase to train longer; gains level off after a point.
LEARNING_RATE = 1e-4  # Size of each weight update. Too high is unstable, too low learns slowly.
VAL_SPLIT = 0.2  # Fraction of the dataset held out for validation.
IMAGE_SIZE = 224  # Target resolution for all inputs (ImageNet-pretrained ResNets are trained on 224×224 images).

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device
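
To make the constants above concrete, here is a small worked example of how `VAL_SPLIT` and `BATCH_SIZE` translate into split sizes and steps per epoch. The dataset size of 1000 images and the helper `split_sizes` are hypothetical, chosen only for the arithmetic; the real count depends on the contents of `Data/source_images`.

```python
import math


def split_sizes(n_images: int, val_split: float) -> tuple:
    """Hypothetical helper: return (n_train, n_val) for a given split fraction."""
    n_val = int(n_images * val_split)
    # Give the remainder to training so the two sizes always sum to n_images.
    n_train = n_images - n_val
    return n_train, n_val


# Assumed dataset size, for illustration only.
n_train, n_val = split_sizes(1000, 0.2)
steps_per_epoch = math.ceil(n_train / 32)  # batch size of 32

print(n_train, n_val, steps_per_epoch)  # 800 200 25
```

Computing the training size as the remainder matters because `random_split` expects its integer lengths to sum to the dataset length.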