Automated bean leaf disease classification with prototype machine learning models that classify three leaf conditions: angular leaf spot, bean rust, and healthy.
Custom architectures: Bean-CNN (baseline CNN), Bean-CNN-LSTM (hybrid CNN-LSTM)
Pretrained adaptations: EfficientNet-B7 + FC, EfficientNet-B7 + LSTM
The custom models were developed in Python with TensorFlow 2.17.0 and verified against Python 3.10.12 and 3.12.4 with TensorFlow 2.16.0 and 2.17.0, in Jupyter Notebook (Windows 11, macOS Sequoia 15.0.1) and Google Colab (Ubuntu 22.04 LTS). Further experiments with EfficientNet were carried out with PyTorch 2.6.0 on Kaggle's P100 GPU.
Using the ibean dataset from the Makerere AI Research Lab, we developed our models by integrating a CNN with an LSTM.
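A minimal sketch of such a CNN-LSTM hybrid in Keras is shown below. The layer counts, filter sizes, and input resolution here are illustrative assumptions, not the exact Bean-CNN-LSTM configuration: the convolutional stack extracts a spatial feature map, which is then read row by row as a sequence by the LSTM.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn_lstm(input_shape=(128, 128, 3), num_classes=3):
    """Illustrative CNN-LSTM hybrid for 3-class bean leaf classification.

    Hyperparameters are assumptions for demonstration, not the
    repository's exact architecture.
    """
    model = models.Sequential([
        layers.Input(shape=input_shape),
        # Convolutional feature extractor
        layers.Conv2D(32, 3, activation="relu"),  # -> 126x126x32
        layers.MaxPooling2D(),                    # -> 63x63x32
        layers.Conv2D(64, 3, activation="relu"),  # -> 61x61x64
        layers.MaxPooling2D(),                    # -> 30x30x64
        # Treat each row of the feature map as one timestep so the
        # LSTM scans the image top to bottom
        layers.Reshape((30, 30 * 64)),
        layers.LSTM(64),
        layers.Dense(num_classes, activation="softmax"),
    ])
    return model
```

The key design step is the `Reshape`, which converts the 2-D feature map into a sequence the recurrent layer can consume.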
Five additional training sets were created using various augmentation methods: increased brightness, cropping, flipping, rotation, and a multi-technique combination.
For these experiments, each training dataset contains 2715 images.
The custom models (Bean-CNN and Bean-CNN-LSTM) were developed on the original dataset, and then trained further on all augmented sets.
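The five augmentation methods named above can be sketched with `tf.image` operations as follows. The specific parameter values (brightness delta, crop fraction, rotation angle) and the choice of transforms in the combination are assumptions, not the exact settings used to build the training sets.

```python
import tensorflow as tf

# Illustrative per-image augmentations; parameter values are assumed.
def augment_brightness(img):
    # Increase brightness by a fixed delta (assumed value)
    return tf.image.adjust_brightness(img, delta=0.2)

def augment_crop(img, frac=0.8):
    # Crop the center region, then resize back to the original size
    h, w = img.shape[0], img.shape[1]
    cropped = tf.image.central_crop(img, central_fraction=frac)
    return tf.image.resize(cropped, (h, w))

def augment_flip(img):
    return tf.image.flip_left_right(img)

def augment_rotate(img):
    # 90-degree rotation as a simple stand-in for arbitrary angles
    return tf.image.rot90(img)

def augment_combination(img):
    # Chain several transforms for the multi-technique set
    return augment_flip(augment_brightness(img))
```

Applying each function to every original training image yields one augmented set of the same size per method, matching the 2715-image sets described above only in spirit (the originals were also kept).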
Each Jupyter Notebook contains one model development process and uses identical code with a different training set.
This repo contains three notebooks showing the custom model development:
the original Bean-CNN (905 training samples), the original Bean-CNN-LSTM (905 training samples), and the best custom model, Bean-CNN-LSTM trained on the flip set (2,715 samples). The data are described below.
▪️Data description
| Set | Sample size |
|---|---|
| Training, original | 905 |
| Training, increased brightness | 2,715 |
| Training, combination | 2,715 |
| Training, cropping | 2,715 |
| Training, flipping | 2,715 |
| Training, rotation | 2,715 |
| Validation | 195 |
| Test | 195 |
The pretrained models were adapted through a fine-tuning process and are titled EfficientNet-B7+FC and EfficientNet-B7+LSTM.
For these experiments, we used a single training set expanded by applying multiple augmentation methods (rotation, flips, scaling, blurring, and contrast/brightness adjustment).
▪️Data description
| Set | Sample size |
|---|---|
| Training | 41,300 |
| Validation | 5,320 |
| Test | 128 |
🔷 Important Note: Due to the repository size limit, our data is stored in a Hugging Face repository.