This repository contains the code used in the manuscript to classify lemon, mango, and pomegranate leaves as healthy vs diseased by:
- extracting deep features from ImageNet-pretrained CNN backbones (EfficientNet-B0, DenseNet201, MobileNetV2, ResNet50),
- selecting features with a mutual-information mRMR variant (no leakage; applied per training fold), and
- training classical classifiers (SVM-RBF, Bagged Trees, kNN, MLP).
The pipeline reports mean ± SD over stratified 10-fold cross-validation, computes imbalance-robust metrics (Balanced Accuracy, MCC, Cohen’s κ), per-class AUPRC, and pooled confusion matrices, and generates qualitative activation heatmaps.
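The imbalance-robust metrics above map directly onto scikit-learn calls. A minimal sketch for one CV fold — the labels and scores here are synthetic and only illustrate the API, not the pipeline's actual data:

```python
# Sketch of the per-fold imbalance-robust metrics using scikit-learn.
# y_true / y_score are synthetic stand-ins for one fold's labels and scores.
import numpy as np
from sklearn.metrics import (balanced_accuracy_score, matthews_corrcoef,
                             cohen_kappa_score, average_precision_score)

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)                     # 0 = Diseased, 1 = Healthy
y_score = np.clip(0.35 * y_true + rng.random(200) * 0.6, 0, 1)
y_pred = (y_score > 0.5).astype(int)

metrics = {
    "balanced_accuracy": balanced_accuracy_score(y_true, y_pred),
    "mcc": matthews_corrcoef(y_true, y_pred),
    "cohen_kappa": cohen_kappa_score(y_true, y_pred),
    # per-class AUPRC: average precision with each class treated as positive
    "auprc_healthy": average_precision_score(y_true, y_score),
    "auprc_diseased": average_precision_score(1 - y_true, 1 - y_score),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In the pipeline these values would be collected per fold and summarized as mean ± SD.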
## 1. Requirements
- Python ≥ 3.10
- PyTorch ≥ 2.0, torchvision ≥ 0.15
- scikit-learn, pandas, numpy, matplotlib, joblib, Pillow, scipy
Example (pip):

```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121  # pick the wheel for your CUDA/CPU setup
pip install scikit-learn pandas numpy matplotlib joblib pillow scipy
```
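To verify the environment before running, the snippet below checks that each dependency is importable (note the import names differ from the pip names for Pillow → `PIL` and scikit-learn → `sklearn`):

```python
# Report which required packages are importable, without importing them all.
# find_spec returns None for a missing package instead of raising.
from importlib.util import find_spec

REQUIRED = ["torch", "torchvision", "sklearn", "pandas", "numpy",
            "matplotlib", "joblib", "PIL", "scipy"]

missing = [name for name in REQUIRED if find_spec(name) is None]
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages found.")
```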
## 2. Data Layout
The code expects a folder with the structure below (class names are matched case-insensitively and remapped to ["Diseased","Healthy"]):
```
DATA_ROOT/
├── Healthy/
│   ├── img_001.jpg
│   └── ...
└── Diseased/
    ├── img_101.jpg
    └── ...
```
## 3. How to Run
Edit the CONFIG block at the top of the script:
- `DATA_ROOT`: dataset root
- `OUT_DIR`: output directory for tables and figures
- optional: `AUGMENT="light"` to enable mild train-time augmentation

Run:

```bash
python main.py
```
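As a reference, the CONFIG block might look like the following — the `DATA_ROOT`, `OUT_DIR`, `AUGMENT`, and `CACHE_FEATURES` names come from this README, but the exact shape of the block is an assumption, not the script's actual code:

```python
# Illustrative CONFIG block; adapt paths and values to your setup.
CONFIG = {
    "DATA_ROOT": "/path/to/dataset",  # root folder with Healthy/ and Diseased/
    "OUT_DIR": "./outputs",           # tables and figures are written here
    "AUGMENT": "light",               # optional mild train-time augmentation
    "CACHE_FEATURES": True,           # cache extracted deep features as .npz
}
```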
## 4. Outputs
Under OUT_DIR you will find:
- `tables_3_7.xlsx`
  - `Table3`, `Table4`: best-by-F1 summaries (mean ± SD)
  - `Table5`: average prediction speed (obs/sec) and training time (sec)
  - `Table6_CM`: pooled confusion matrices for the top-4 configurations
  - `Table7_mean` and `Table7_mean_sd_raw` (to render “mean ± SD” as in the manuscript)
- `figure5_roc_best_pipeline.png` / `.svg`
- `figure6_densenet201_svm_heatmaps.png` / `.svg`
- `cache_full_[backbone]/feats_[backbone].npz` (only if `CACHE_FEATURES=True`)
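The cached feature files can be reloaded with NumPy for reuse in other experiments. A sketch of the round trip — the array key names (`features`, `labels`) and the EfficientNet-B0 feature width of 1280 are assumptions for illustration:

```python
# Write and reload a feature cache in .npz format (key names are assumed).
import numpy as np

rng = np.random.default_rng(0)
feats = rng.random((10, 1280)).astype(np.float32)   # e.g. EfficientNet-B0 features
labels = rng.integers(0, 2, size=10)                # 0 = Diseased, 1 = Healthy

np.savez_compressed("feats_demo.npz", features=feats, labels=labels)

with np.load("feats_demo.npz") as data:
    X, y = data["features"], data["labels"]
print(X.shape, y.shape)
```

Inspect the real cache's keys with `np.load(path).files` before assuming their names.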