LeafGAN: An Effective Data Augmentation Method for Practical Plant Disease Diagnosis
Quan Huu Cap, Hiroyuki Uga, Satoshi Kagiwada, Hitoshi Iyatomi
Paper: https://arxiv.org/abs/2002.10100
Accepted for publication in the IEEE Transactions on Automation Science and Engineering (T-ASE)
Abstract: Many applications for the automated diagnosis of plant disease have been developed based on the success of deep learning techniques. However, these applications often suffer from overfitting, and the diagnostic performance is drastically decreased when used on test datasets from new environments. In this paper, we propose LeafGAN, a novel image-to-image translation system with own attention mechanism. LeafGAN generates a wide variety of diseased images via transformation from healthy images, as a data augmentation tool for improving the performance of plant disease diagnosis. Thanks to its own attention mechanism, our model can transform only relevant areas from images with a variety of backgrounds, thus enriching the versatility of the training images. Experiments with five-class cucumber disease classification show that data augmentation with vanilla CycleGAN cannot help to improve the generalization, i.e. disease diagnostic performance increased by only 0.7% from the baseline. In contrast, LeafGAN boosted the diagnostic performance by 7.4%. We also visually confirmed the generated images by our LeafGAN were much better quality and more convincing than those generated by vanilla CycleGAN.
- Jul 25, 2021: Added a new option to load the mask images from disk. Running the LFLSeg module during training is quite slow. Instead, the masks of all training images can be generated beforehand and loaded during training. See prepare_mask.py for how to generate mask images from the pre-trained LFLSeg, and unaligned_masked_dataset.py for how to load them. See below for how to train with this new feature.
A tutorial on how to create the dataset and train the LFLSeg module is available in LFLSeg.
- Normal dataset: A normal dataset will have 4 directories for two domains A (trainA, testA) and B (trainB, testB). Each directory must contain only images (no other file types).
An example of a dataset named healthy2brownspot:
/path/to/healthy2brownspot/trainA
/path/to/healthy2brownspot/testA
/path/to/healthy2brownspot/trainB
/path/to/healthy2brownspot/testB
- Masked dataset: This is a normal dataset plus pre-generated mask images. First, generate your own mask images using prepare_mask.py. An example of a masked dataset named healthy2brownspot_mask:
/path/to/healthy2brownspot/trainA
/path/to/healthy2brownspot/trainA_mask # mask images of trainA
/path/to/healthy2brownspot/testA
/path/to/healthy2brownspot/trainB
/path/to/healthy2brownspot/trainB_mask # mask images of trainB
/path/to/healthy2brownspot/testB
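The two layouts above can be checked before training with a short script. This is a minimal sketch, not part of the LeafGAN code: the function name `check_dataset`, the set of accepted image extensions, and the rule that a mask shares its image's file stem are all assumptions made for illustration (check prepare_mask.py for the naming it actually produces).

```python
from pathlib import Path

# Extensions treated as images; this set is an assumption of the sketch.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def check_dataset(dataroot, masked=False):
    """Return a list of problems found under dataroot (empty list if none)."""
    root = Path(dataroot)
    required = ["trainA", "testA", "trainB", "testB"]
    if masked:
        required += ["trainA_mask", "trainB_mask"]

    problems = []
    for name in required:
        directory = root / name
        if not directory.is_dir():
            problems.append(f"missing directory: {name}")
            continue
        # Each directory must contain only images (no other file types).
        for entry in sorted(directory.iterdir()):
            if not entry.is_file() or entry.suffix.lower() not in IMAGE_EXTENSIONS:
                problems.append(f"non-image entry: {name}/{entry.name}")

    if masked:
        for split in ("trainA", "trainB"):
            image_dir, mask_dir = root / split, root / f"{split}_mask"
            if not (image_dir.is_dir() and mask_dir.is_dir()):
                continue
            # Assumption: a mask shares its image's file stem.
            mask_stems = {p.stem for p in mask_dir.iterdir()}
            for image in sorted(image_dir.iterdir()):
                if image.stem not in mask_stems:
                    problems.append(f"no mask for: {split}/{image.name}")
    return problems
```

Calling `check_dataset("/path/to/healthy2brownspot", masked=True)` returns an empty list when all six directories exist, hold only images, and every training image has a matching mask under the stem-matching assumption.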
- Make sure to prepare the dataset first
- Train a model (example with the dataset healthy2brownspot):
python train.py --dataroot /path/to/healthy2brownspot --name healthy2brownspot_leafGAN --model leaf_gan
- Train a model with mask images (example with the dataset healthy2brownspot_mask):
python train.py --dataroot /path/to/healthy2brownspot --name healthy2brownspot_leafGAN --model leaf_gan --dataset_mode unaligned_masked
To see more intermediate results, check out ./checkpoints/healthy2brownspot_leafGAN/web/index.html.
- Test the model:
python test.py --dataroot /path/to/healthy2brownspot --name healthy2brownspot_leafGAN --model leaf_gan
- The test results will be saved to an HTML file here: ./results/healthy2brownspot_leafGAN/latest_test/index.html.
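The commands above differ only in the script name and, for masked training, the extra --dataset_mode flag. The sketch below assembles them from Python, which may help when scripting several runs. The helper name `build_command` is hypothetical, and it uses only the flags shown in this README; whether test.py also accepts --dataset_mode unaligned_masked is not stated here, so `use_masks` is intended for train.py.

```python
def build_command(script, dataroot, name, use_masks=False):
    """Build an argument list for train.py or test.py from the flags shown above."""
    command = ["python", script, "--dataroot", str(dataroot),
               "--name", name, "--model", "leaf_gan"]
    if use_masks:
        # Load pre-generated mask images instead of running LFLSeg during training.
        command += ["--dataset_mode", "unaligned_masked"]
    return command


if __name__ == "__main__":
    train = build_command("train.py", "/path/to/healthy2brownspot",
                          "healthy2brownspot_leafGAN", use_masks=True)
    print(" ".join(train))
```

The returned list can be passed to `subprocess.run(command, check=True)` from the repository root to launch the run.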
@article{cap2020leafgan,
title = {LeafGAN: An Effective Data Augmentation Method for Practical Plant Disease Diagnosis},
author = {Quan Huu Cap and Hiroyuki Uga and Satoshi Kagiwada and Hitoshi Iyatomi},
journal = {IEEE Transactions on Automation Science and Engineering},
year = {2020},
doi = {10.1109/TASE.2020.3041499}
}
Our code is inspired by pytorch-CycleGAN.