This is a university project that aims to solve an apple leaf disease detection problem. In addition, an Android application is implemented to deploy the model.

This project has been developed by final-year students of the Mechanical Electrical Engineering program at Piura University in Peru.

(I) Abstract

(II) Dataset used

Due to the limitations imposed by the COVID-19 pandemic, collecting our own data was complicated. Therefore, the Plant Pathology 2021 - FGVC8 dataset is used. This dataset contains more than 18,000 images of apple leaves affected by various diseases, together with a CSV (Comma Separated Values) file that contains the name of each image and its respective label. We would like to emphasize that, to simplify this work, we only use the images of healthy leaves, leaves affected by a single type of disease, and leaves affected by multiple diseases but only under a single specific label; the dataset is thus reduced to around 17,000 images (a minimal filtering sketch is shown after the class list below). The dataset is available at the following link:

The classes used for this project are:

  • Healthy
  • Scab
  • Powdery Mildew
  • Complex
  • Rust
  • Frog Eye Leaf Spot

(Example images for each class: Complex, Frog Eye Leaf Spot, Healthy, Rust, Scab, Powdery Mildew.)
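
The filtering itself is not part of this README; below is a minimal sketch of how the CSV could be reduced to these six classes with pandas. The file name `train.csv` and the columns `image` / `labels` are assumptions based on the competition format, and the exact label spellings should be checked against the CSV.

```python
import pandas as pd

# Keep only the six labels used in this project.
# "train.csv" and the column names "image" / "labels" are assumptions.
KEEP = {"healthy", "scab", "frog_eye_leaf_spot", "rust", "complex", "powdery_mildew"}

df = pd.read_csv("train.csv")
df = df[df["labels"].isin(KEEP)].reset_index(drop=True)
print(len(df), "images kept")
print(df["labels"].value_counts())   # images per class after filtering
```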

(III) Exploratory Data Analysis (EDA)

When you are starting a machine learning (ML) project, the most important thing to keep in mind is that data is the fundamental basis of everything. Andrew Ng, former Baidu chief scientist and founding leader of the Google Brain team, said in an interview with Wired magazine: “The analogy with deep learning [one of the key processes in creating artificial intelligence] is that the rocket engine is the deep learning models and the fuel is the huge amounts of data that we can feed to these algorithms.”

We can see the number of images of each class in the data set in the following figure:

To get an overview of the data we can see the next figure:

As we can see in the figure above, the classes are imbalanced. One practical solution is to take smaller samples of the over-represented classes so that the proportions become closer. The final class proportions are shown in the following figure:
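
A minimal sketch of that kind of undersampling, reusing the filtered DataFrame `df` from the sketch above; the per-class cap of 2,000 images is an illustrative value, not taken from this README.

```python
# Undersample the over-represented classes so the class proportions get closer.
# The cap of 2000 images per class is illustrative.
MAX_PER_CLASS = 2000

balanced = (
    df.groupby("labels", group_keys=False)
      .apply(lambda g: g.sample(n=min(len(g), MAX_PER_CLASS), random_state=42))
      .reset_index(drop=True)
)
print(balanced["labels"].value_counts(normalize=True))  # final class proportions
```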

(IV) Image Processing

There are many techniques to process images, such as Otsu thresholding, the Canny edge detector, and ROI / edge-ROI extraction. In this project, the preprocessing provided by the Keras API in TensorFlow has been used, which lets us process the images with the same preprocessing the MobileNet architecture was trained with. This preprocessing scales the image pixels to values between -1 and 1 and returns a float32 tensor. The images have also been resized to 224 x 224 pixels, since this is the image size MobileNet was trained with.
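
A minimal sketch of this preprocessing for a single image (the file name is only illustrative):

```python
import numpy as np
from tensorflow.keras.applications.mobilenet import preprocess_input
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Load one leaf image, resize it to MobileNet's 224 x 224 input size,
# and scale the pixel values to the [-1, 1] range MobileNet expects.
img = load_img("leaf_example.jpg", target_size=(224, 224))  # illustrative path
x = img_to_array(img)            # float32 array of shape (224, 224, 3)
x = preprocess_input(x)          # pixels scaled from [0, 255] to [-1, 1]
x = np.expand_dims(x, axis=0)    # add a batch dimension -> (1, 224, 224, 3)
```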

(V) MobileNet Architecture

MobileNet features an architecture based on depthwise separable convolutions, which makes the network deep yet lightweight.

The MobileNet architecture is imported through the Keras applications API. It has been imported with the following arguments:

| Argument | Value |
| --- | --- |
| Input shape | 224 x 224 |
| Include top | False |
| Weights | ImageNet |

The input size is chosen according to the size of the images the model will be trained with; since the images have been scaled to 224 x 224 during preprocessing, this size is used. Include top is set to False because the final layers are added in the final model: MobileNet was trained to classify 1,000 classes, while our image set only needs 6 classes. The weights come from the model pre-trained on the ImageNet dataset.
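
In code, this import looks roughly as follows (Keras applications API):

```python
import tensorflow as tf

# MobileNet backbone pre-trained on ImageNet, without its 1000-class top.
base_model = tf.keras.applications.MobileNet(
    input_shape=(224, 224, 3),   # matches the preprocessed 224 x 224 RGB images
    include_top=False,           # the classification head is added later
    weights="imagenet",          # ImageNet pre-trained weights
)
```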

(VI) Building Model

The final model contains the MobileNet architecture plus two added layers: a GlobalAveragePooling2D layer and a Dense layer with 6 units (a minimal sketch of the model definition follows the table below). All layers of the model are trainable; the total, trainable and non-trainable parameter counts are the following:

| Parameters | Count |
| --- | --- |
| Total parameters | 3,235,014 |
| Trainable parameters | 3,213,126 |
| Non-trainable parameters | 21,888 |
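
A minimal sketch of the model definition, reusing `base_model` from the previous section. The softmax activation and the compile settings (optimizer, loss, F1 metric) are assumptions; the README only reports loss and F1 values.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# MobileNet backbone + GlobalAveragePooling2D + a 6-unit Dense output layer.
model = models.Sequential([
    base_model,                              # all layers kept trainable
    layers.GlobalAveragePooling2D(),
    layers.Dense(6, activation="softmax"),   # one unit per class; softmax assumed
])

# Optimizer, loss and metric are assumptions, not stated in the README.
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=[tf.keras.metrics.F1Score(average="macro")],  # needs TF >= 2.13
)
model.summary()  # prints the total / trainable / non-trainable parameter counts
```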

(VII) Results

The model started with good performance, with initial F1 scores of 0.7868 for training and 0.8731 for validation. Training stopped after 18 epochs because the validation loss stopped decreasing (the EarlyStopping callback was triggered). The training and validation curves can be seen in the following figure:
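
A sketch of the training loop with the EarlyStopping callback; `train_ds` and `val_ds` are hypothetical dataset names, and the patience value is an assumption (the README only states that training stopped when the validation loss stopped decreasing).

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop training once the validation loss stops improving.
early_stopping = EarlyStopping(
    monitor="val_loss",
    patience=3,                  # assumed value
    restore_best_weights=True,   # assumed setting
)

history = model.fit(
    train_ds,                    # hypothetical training pipeline
    validation_data=val_ds,      # hypothetical validation pipeline
    epochs=50,
    callbacks=[early_stopping],
)
```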

The model is evaluated with the evaluate method and the following results are obtained:

| Split | Loss | F1 score |
| --- | --- | --- |
| Training | 0.0085 | 0.998 |
| Validation | 0.2911 | 0.9206 |
| Test | 0.3382 | 0.9157 |
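
Assuming the dataset names and compile settings sketched earlier, the calls behind the table above would look like this:

```python
# Each call returns [loss, f1] with the compile settings sketched above.
train_loss, train_f1 = model.evaluate(train_ds)
val_loss, val_f1 = model.evaluate(val_ds)
test_loss, test_f1 = model.evaluate(test_ds)   # test_ds is a hypothetical name
print(f"Test loss: {test_loss:.4f}  Test F1: {test_f1:.4f}")
```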

(VIII) Confusion Matrix

The confusion matrix has been plotted for the predictions on the test set; it is shown in the following figure:
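
A minimal sketch of how such a confusion matrix can be produced with scikit-learn; `test_ds` is the hypothetical test pipeline from above, and one-hot encoded labels are assumed.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

class_names = ["complex", "frog_eye_leaf_spot", "healthy",
               "powdery_mildew", "rust", "scab"]

# Predicted class = argmax over the six softmax outputs for each test image.
y_prob = model.predict(test_ds)
y_pred = np.argmax(y_prob, axis=1)
y_true = np.concatenate([np.argmax(y, axis=1) for _, y in test_ds])  # one-hot labels assumed

cm = confusion_matrix(y_true, y_pred)
ConfusionMatrixDisplay(cm, display_labels=class_names).plot(xticks_rotation=45)
plt.show()
```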

A summary table of true positives and false positives is displayed:

| Class | Number of samples | TP | FP | Accuracy (%) |
| --- | --- | --- | --- | --- |
| Complex | 140 | 121 | 19 | 86.4 |
| Frog eye leaf spot | 147 | 140 | 7 | 95.2 |
| Healthy | 165 | 156 | 9 | 94.5 |
| Powdery mildew | 125 | 116 | 9 | 92.8 |
| Rust | 216 | 184 | 32 | 85.2 |
| Scab | 151 | 147 | 4 | 97.4 |

References:
