## Introduction
We will work mostly with the high-level Keras API that comes bundled with TensorFlow. The workflow is pretty straightforward:
- Load the data:
To get the most out of Google Colab, I recommend uploading the dataset to your Drive as a zipped file or tarball, then mounting your Drive on Colab and running the `unzip` command on the path to the dataset. This may seem like a lot of work, but it is worth it: unzipping from Colab saves the extracted files **temporarily** to the Colab runtime's local disk, which reduces IO overhead when the directory iterators run.
- Once the files are all set up, we load MobileNetV2 and fine-tune it to suit our needs.
- From there, all that's left is training the model, evaluating its metrics and iterating until we get something desirable.
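
The data-loading step above can be sketched in Python roughly as follows. This is only a minimal sketch, not the notebook's own code: `extract_dataset`, the archive name and the destination directory are hypothetical, and the Drive mount is shown only as a comment because it works only inside Colab.

```python
import shutil
from pathlib import Path


def extract_dataset(archive_path, dest_dir):
    """Unpack a .zip or tarball into dest_dir and return its top-level entries."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # shutil.unpack_archive infers the archive format from the file extension.
    shutil.unpack_archive(str(archive_path), str(dest))
    return sorted(p.name for p in dest.iterdir())


# On Colab (assumption: the archive was uploaded to Drive beforehand;
# both paths below are hypothetical):
#   from google.colab import drive
#   drive.mount("/content/drive")
#   extract_dataset("/content/drive/MyDrive/plant_disease.zip", "/content/data")
```

Extracting to the runtime's local disk rather than reading images straight from the mounted Drive is what avoids the repeated Drive reads during training.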

**Remember to set the Colab runtime to a GPU to get faster results.**


## First, all the usual imports:

In [10]:
import os
import pathlib
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import plot_model
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.layers import Dense, Activation, Flatten, Dropout, GlobalAveragePooling2D


## Setting global constants and mounting the data path
Once everything is extracted, we set the directories for the training and validation datasets and define some global constants to be used in the rest of the notebook.

In [11]:
base_path = r"C:\Users\sricharan namburi\OneDrive\Desktop\DATA\Plant Disease Dataset"

train_data_dir = os.path.join(base_path, "Train")
val_data_dir = os.path.join(base_path, "Validation")

os.chdir(base_path)
!dir 

 Volume in drive C is Windows-SSD
 Volume Serial Number is 5E9E-5009

 Directory of C:\Users\sricharan namburi\OneDrive\Desktop\DATA\Plant Disease Dataset

02-05-2023  19:10    <DIR>          .
14-04-2023  12:31    <DIR>          ..
26-04-2023  15:07    <DIR>          Train
26-04-2023  15:07    <DIR>          Validation
               0 File(s)              0 bytes
               4 Dir(s)  70,733,713,408 bytes free


In [14]:
img_size, batch_size, shuffle_size = 256, 32, 1000
img_shape = (img_size, img_size, 3)

## Creating the Image Data Generators
Finally, we get to the fun stuff. We create the generator with some data augmentation options, which effectively enlarges the dataset, and then feed it to directory iterators that supply the model with images.

The output of the following cell shows how many images and classes were discovered.

In [15]:
gen = ImageDataGenerator(
    horizontal_flip=True,
    zoom_range=0.1,
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    preprocessing_function=preprocess_input
)

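# Each subdirectory of the data directory is treated as one class.
train_generator = gen.flow_from_directory(
    directory=train_data_dir,
    target_size=(img_size, img_size),
    batch_size=batch_size
)

val_generator = gen.flow_from_directory(
    directory=val_data_dir,
    target_size=(img_size, img_size),
    batch_size=batch_size
)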

Found 33548 images belonging to 18 classes.
Found 8387 images belonging to 18 classes.
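
To sanity-check the counts reported above, the directory layout can be inspected by hand. The sketch below is an approximation, not Keras's own discovery logic: `count_images_per_class` is a hypothetical helper, and the extension set is an assumption that may not match the list of formats Keras accepts.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(data_dir):
    """Map each class subdirectory name to the number of image files in it."""
    counts = {}
    for class_dir in sorted(Path(data_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.rglob("*")
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts
```

Calling `count_images_per_class(train_data_dir)` should then give one entry per class, which also makes class imbalance easy to spot.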


## Saving the dataset labels
The generators have an attribute named `class_indices` that maps each class name to the integer label assigned by Keras.

We save this into a variable since we will require it later.

In [16]:
labels = train_generator.class_indices
labels

{'Apple___Apple_scab': 0,
 'Apple___Black_rot': 1,
 'Apple___Cedar_apple_rust': 2,
 'Apple___healthy': 3,
 'Grape___Black_rot': 4,
 'Grape___Esca_(Black_Measles)': 5,
 'Grape___Leaf_blight_(Isariopsis_Leaf_Spot)': 6,
 'Grape___healthy': 7,
 'Potato___Early_blight': 8,
 'Potato___healthy': 9,
 'Tomato___Early_blight': 10,
 'Tomato___Leaf_Mold': 11,
 'Tomato___Septoria_leaf_spot': 12,
 'Tomato___Spider_mites Two-spotted_spider_mite': 13,
 'Tomato___Target_Spot': 14,
 'Tomato___Tomato_Yellow_Leaf_Curl_Virus': 15,
 'Tomato___Tomato_mosaic_virus': 16,
 'Tomato___healthy': 17}
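
One likely later use of this mapping is turning a softmax output back into a class name. The sketch below assumes that use; `decode_prediction` is a hypothetical helper, and the example dictionary is just the first three entries of the mapping above.

```python
def decode_prediction(probabilities, class_indices):
    """Return (class name, probability) for the highest-scoring class."""
    # Invert the name -> index mapping so we can look names up by index.
    index_to_label = {index: name for name, index in class_indices.items()}
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return index_to_label[best], probabilities[best]


# First three entries of the class_indices mapping shown above.
example_labels = {
    "Apple___Apple_scab": 0,
    "Apple___Black_rot": 1,
    "Apple___Cedar_apple_rust": 2,
}
print(decode_prediction([0.1, 0.7, 0.2], example_labels))
# ('Apple___Black_rot', 0.7)
```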

## Preparing MobileNetV2
Keras comes packed with some pretrained models; for this notebook we will use MobileNetV2. The first time this cell runs, it fetches the weights from the internet; afterwards they are cached. Since we want the model to come with its pretrained weights, we set `weights` to `'imagenet'`, and we discard the final 1000-unit softmax layer used for ImageNet classification by setting `include_top` to `False`.

In [17]:
mobile_net = MobileNetV2(weights='imagenet', include_top=False, input_shape=img_shape)

mobile_net.summary()

Model: "mobilenetv2_1.00_224"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_1 (InputLayer)           [(None, 256, 256, 3  0           []                               
                                )]                                                                
                                                                                                  
 Conv1 (Conv2D)                 (None, 128, 128, 32  864         ['input_1[0][0]']                
                                )                                                                 
                                                                                                  
 bn_Conv1 (BatchNormalization)  (None, 128, 128, 32  128         ['Conv1[0][0]']                  
                                )                                              

## Fine Tuning MobileNet
In the following cell, we start fine-tuning the model. Through some experimentation of my own, I found that the architecture below gives pretty satisfying results, with training accuracy reaching 0.98 and validation accuracy not too far behind.

In [18]:
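# Attach a new classification head on top of the MobileNetV2 feature extractor.
x = mobile_net.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
x = Dense(512, activation='relu')(x)
x = Dense(18, activation='softmax')(x)  # one output unit per class (18 classes found above)

model = Model(mobile_net.input, x)

# Freeze everything except the last 23 layers, so only those are updated during training.
for layer in model.layers[:-23]:
    layer.trainable = False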

model.summary()

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_1 (InputLayer)           [(None, 256, 256, 3  0           []                               
                                )]                                                                
                                                                                                  
 Conv1 (Conv2D)                 (None, 128, 128, 32  864         ['input_1[0][0]']                
                                )                                                                 
                                                                                                  
 bn_Conv1 (BatchNormalization)  (None, 128, 128, 32  128         ['Conv1[0][0]']                  
                                )                                                             

In [19]:
model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['acc'])

In [20]:
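# save_freq='epoch' checks val_acc once per epoch; an integer value would save every N batches, when val_acc is not yet available.
checkpoint = ModelCheckpoint("pdd_model.h5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', save_freq='epoch')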
early = EarlyStopping(monitor='val_acc', min_delta=0, patience=10, verbose=1, mode='auto')
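
To make the `patience` and `min_delta` settings concrete, here is a simplified pure-Python sketch of patience-based early stopping on a metric that should increase. It is an illustration only, not Keras's implementation, and `early_stopping_epoch` is a hypothetical name.

```python
def early_stopping_epoch(val_scores, patience, min_delta=0.0):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("-inf")
    wait = 0  # epochs since the last improvement
    for epoch, score in enumerate(val_scores):
        if score - min_delta > best:
            best = score
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


print(early_stopping_epoch([0.50, 0.60, 0.55, 0.58, 0.59], patience=2))   # 3
print(early_stopping_epoch([0.50, 0.60, 0.55, 0.58, 0.59], patience=10))  # None
```

As the second call suggests, a patience of 10 cannot trigger within a run of only a few epochs; it only matters for longer training runs.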

In [21]:
step_size = train_generator.n//train_generator.batch_size

model_history = model.fit(
    train_generator,
    epochs=5,
    steps_per_epoch=step_size,
    validation_data=val_generator,
    validation_steps=25,
    callbacks = [checkpoint, early]
)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
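
The step counts used in `model.fit` follow from simple arithmetic on the image counts reported earlier. The sketch below assumes a batch size of 32, which is both the `batch_size` constant defined above and the default of `flow_from_directory`.

```python
import math

train_images, val_images = 33548, 8387  # counts reported by flow_from_directory
batch_size = 32                         # assumed batch size (Keras default)

steps_per_epoch = train_images // batch_size         # full batches per epoch
full_val_steps = math.ceil(val_images / batch_size)  # steps to cover all validation images
val_images_seen = 25 * batch_size                    # images covered by validation_steps=25

print(steps_per_epoch)  # 1048
print(full_val_steps)   # 263
print(val_images_seen)  # 800
```

So with `validation_steps=25`, each epoch is validated on about 800 of the 8387 validation images; covering the whole validation set would take 263 steps.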


In [2]:
acc = model_history.history['acc']
val_acc = model_history.history['val_acc']

loss = model_history.history['loss']
val_loss = model_history.history['val_loss']

epochs_range = model_history.epoch

plt.figure(figsize=(16, 6))

# Left panel: accuracy curves.
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.grid(True)
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

# Right panel: loss curves.
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.grid(True)
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')

plt.suptitle('Plant Leaf Disease Detection')
plt.show()

NameError: name 'model_history' is not defined

In [1]:
model.save(r'C:\Users\sricharan namburi\Downloads\Plant_Disease_ML_Model2-main\Plant_Disease_ML_Model2-main\models\pddmobilenet2v1.h5')
model.summary()

NameError: name 'model' is not defined