# Crop Leaf Disease Detection — Starter Notebook

This Colab notebook sets up the environment, downloads the PlantVillage dataset from Kaggle, and trains a baseline fastai/PyTorch image classifier.

**Sections:**
1. Setup & Dependencies  
2. Dataset Download & Organization  
3. Data Exploration & Visualization  
4. Baseline Model (Fastai CNN)  
5. Evaluation & Metrics  
6. Improvements (Transfer Learning, Augmentations)  


In [None]:
%pip install -q kaggle


In [None]:
from google.colab import files
files.upload()  # choose kaggle.json

!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json


In [None]:
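# Requires Google Drive to be mounted first (drive.mount('/content/drive')); otherwise this path does not exist.
%cd /content/drive/MyDrive/CropLeafDiseaseDetectionApp/crop-leaf-disease-detection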

In [None]:
# Remove any stale or partial archive, then download into data/ and unpack it there
!rm -f data/plantdisease.zip
!kaggle datasets download -d emmarex/plantdisease -p data/
!unzip -q -o data/plantdisease.zip -d data/


In [13]:
!ls data/PlantVillage | wc -l
!ls data/plantvillage/PlantVillage/ | wc -l

15
15
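
The folder counts above only show how many classes exist. To see how many images each class holds, a minimal standard-library sketch might look like the following. The helper name `count_images_per_class` and the `IMAGE_EXTS` set are illustrative assumptions, not part of the dataset or of fastai; the root path is the `data/PlantVillage` folder listed above.

```python
from collections import Counter
from pathlib import Path

# Assumed set of image extensions; extend it if your files use others.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Return a Counter mapping each class folder name to its image count."""
    counts = Counter()
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files at the top level
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTS
        )
    return counts


if __name__ == "__main__":
    dataset_root = Path("data/PlantVillage")
    if dataset_root.exists():
        # Largest classes first, to make imbalance easy to spot
        for name, n in count_images_per_class(dataset_root).most_common():
            print(f"{n:6d}  {name}")
```

A large gap between the biggest and smallest class is worth noting before training, since plain accuracy can hide poor performance on rare classes.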


In [None]:
# ===============================
# 1) Setup & Dependencies
# ===============================
# If your repo is public on GitHub, you can clone it in Colab (optional):
# !git clone <YOUR_REPO_URL>
# %cd <YOUR_REPO_FOLDER>

# Install from ../requirements.txt if it exists (e.g. when this notebook was opened as notebooks/starter.ipynb from the repo);
# otherwise fall back to the core libraries.
try:
    import os
    req_path = os.path.abspath(os.path.join(os.getcwd(), '..', 'requirements.txt'))
    if os.path.exists(req_path):
        print('Installing from', req_path)
        !pip -q install -r "$req_path"
    else:
        print('requirements.txt not found in the parent folder. Installing core libs...')
        !pip -q install torch torchvision fastai matplotlib pandas numpy scikit-learn
except Exception as e:
    print('Warning:', e)

import torch, torchvision
from fastai.vision.all import *
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
print('✅ Setup complete. Torch:', torch.__version__, '| CUDA available:', torch.cuda.is_available())


In [None]:
# ===============================
# 2) GPU check (optional)
# ===============================
!nvidia-smi || echo 'No NVIDIA GPU available in this runtime.'


In [None]:
# ===============================
# 3) (Optional) Mount Google Drive if your data is stored there
# ===============================
USE_DRIVE = False  # set to True if you want to mount
if USE_DRIVE:
    from google.colab import drive
    drive.mount('/content/drive')
    print('Drive mounted.')


## Next Steps
- **Dataset Download & Organization**: Put your dataset under `data/` with one subfolder per class inside each split (e.g., `data/train/<class_name>/...`, `data/valid/<class_name>/...`). The Kaggle archive unpacks to `data/PlantVillage/<class_name>/...` with no train/valid split, so either create the split yourself or let fastai split at random via `valid_pct`.
- **Exploration**: Visualize a few images and check the per-class image counts.
- **Baseline**: Create a `DataBlock` and a `Learner` with a simple CNN (or use transfer learning with ResNet-34/50).
- **Metrics**: Track accuracy and inspect the confusion matrix.
- **Improvements**: Augmentations, LR finder, unfreezing, Grad-CAM.
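
One way to get from a single folder of class subfolders to the `train/` and `valid/` layout is to copy a random fraction of each class into `valid/`. The sketch below is only an illustration: `split_train_valid`, its parameters, and the 80/20 default are assumptions rather than anything prescribed by the dataset or by fastai. It copies files instead of moving them, so it needs roughly twice the disk space.

```python
import random
import shutil
from pathlib import Path


def split_train_valid(src, dst, valid_pct=0.2, seed=42):
    """Copy src/<class>/* into dst/train/<class>/ and dst/valid/<class>/.

    Returns {class_name: (n_train, n_valid)} so the split can be checked.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    src, dst = Path(src), Path(dst)
    summary = {}
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_valid = int(len(files) * valid_pct)  # split each class separately
        for i, f in enumerate(files):
            split = "valid" if i < n_valid else "train"
            target = dst / split / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target / f.name)
        summary[class_dir.name] = (len(files) - n_valid, n_valid)
    return summary


# Hypothetical usage for this notebook's layout:
# split_train_valid("data/PlantVillage", "data")
```

Splitting class by class keeps the class proportions in `train/` and `valid/` close to those of the full dataset, which a single global shuffle does not guarantee for small classes.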


In [None]:
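# ===============================
# 4) Fastai DataLoaders + baseline learner (template)
# ===============================
from fastai.vision.all import *
data_path = Path('data')  # adjust if your dataset lives elsewhere

# Assumes data/train/<class_name>/ and data/valid/<class_name>/ exist.
# For an unsplit folder such as data/PlantVillage, point data_path at it and
# pass valid_pct=0.2 (plus a seed) so fastai holds out a random validation set.
dls = ImageDataLoaders.from_folder(
    data_path,
    train='train',
    valid='valid',
    item_tfms=Resize(224)  # resize every image to 224x224
)
dls.show_batch(max_n=8)

# Baseline learner with a pretrained model (ResNet-34)
learn = vision_learner(dls, resnet34, metrics=accuracy)
learn.fine_tune(1)  # one epoch with the body frozen, then one epoch unfrozen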


In [None]:
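# ===============================
# 5) Evaluation
# ===============================
# Runs the trained learner on the validation set and collects predictions
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(6,6))
# List the (actual, predicted) class pairs that are confused most often
interp.most_confused(min_val=2)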
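
To make the "most confused" idea concrete without a trained model, the sketch below counts misclassified (actual, predicted) pairs from two plain lists of labels. It illustrates the concept only and is not fastai's implementation; the function is a hypothetical stand-in and the label lists are made-up example data.

```python
from collections import Counter


def most_confused(actual, predicted, min_val=1):
    """Return (actual, predicted, count) for wrong predictions, most frequent first.

    Only pairs that occur at least `min_val` times are kept.
    """
    pairs = Counter((a, p) for a, p in zip(actual, predicted) if a != p)
    rows = [(a, p, n) for (a, p), n in pairs.items() if n >= min_val]
    # Sort by count (descending), then by label names for a stable order
    return sorted(rows, key=lambda r: (-r[2], r[0], r[1]))


# Made-up labels purely for illustration
actual = ["blight", "blight", "blight", "healthy", "healthy", "rust"]
predicted = ["rust", "rust", "blight", "healthy", "blight", "rust"]

print(most_confused(actual, predicted))
# [('blight', 'rust', 2), ('healthy', 'blight', 1)]
print(most_confused(actual, predicted, min_val=2))
# [('blight', 'rust', 2)]
```

Pairs at the top of this list are good candidates for closer inspection, for example by viewing the misclassified images side by side.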
