# **Crop Disease Detection System**
Detecting diseases in plants from leaf images is an image classification problem, and CNN-based deep-learning models are a popular choice for image processing. However, deep CNNs are difficult to train from scratch because the process is computationally expensive. To address this, various researchers have proposed models based on transfer learning. Popular transfer learning models
include VGG-16, ResNet, DenseNet, and Inception. Among these models, we choose VGG-16, whose simple, uniform architecture and ImageNet-pretrained weights make it suitable for plant disease detection tasks. By using the capabilities of
VGG-16, we can develop an efficient system for plant disease detection.

### **Importing all the necessary Libraries**

In [1]:
import tensorflow as tf
import pandas as pd
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras import layers, models
from tensorflow.keras.callbacks import ReduceLROnPlateau
import json
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix, classification_report
import matplotlib.pyplot as plt
import seaborn as sns

## **1. Data Collection**
The dataset used for this project is taken from the Kaggle repository: https://www.kaggle.com/datasets/vipoooool/new-plant-diseases-dataset. It consists of about **87K RGB** images of healthy and diseased crop leaves, categorized into **38 different classes**. The full dataset is divided into training and validation sets in an **80/20 ratio**, preserving the directory structure. A new directory containing **33 test images** is created later for prediction purposes.

## **2. Data Pre-processing**
The dataset undergoes pre-processing steps to ensure its suitability for training the crop disease
detection model. All images in the dataset are resized to a consistent dimension of 224 × 224 pixels.

### **2.1 Training Images Loading and Preprocessing**

In [3]:
training_set = tf.keras.utils.image_dataset_from_directory(
    'Dataset/train',
    labels="inferred",
    label_mode="categorical",
    class_names=None,
    color_mode="rgb",
    batch_size=32,
    image_size=(224, 224),
    shuffle=True,
    seed=None,
    validation_split=None,
    subset=None,
    interpolation="bilinear",
    follow_links=False,
    crop_to_aspect_ratio=False
)

Found 70295 files belonging to 38 classes.


### **2.2 Validation Images Loading and Preprocessing**

In [4]:
validation_set = tf.keras.utils.image_dataset_from_directory(
    'Dataset/valid',
    labels="inferred",
    label_mode="categorical",
    class_names=None,
    color_mode="rgb",
    batch_size=32,
    image_size=(224, 224),
    shuffle=True,
    seed=None,
    validation_split=None,
    subset=None,
    interpolation="bilinear",
    follow_links=False,
    crop_to_aspect_ratio=False
)

Found 17572 files belonging to 38 classes.
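
The file counts reported by the two loaders can be checked against the stated 80/20 split and used to work out how many batches make up one epoch. The following is a small arithmetic sketch that uses only the counts printed above and the `batch_size=32` passed to the loaders; the variable names are illustrative.

```python
import math

# Counts reported by the two loaders above.
train_images = 70295
valid_images = 17572
batch_size = 32  # batch_size passed to image_dataset_from_directory

total_images = train_images + valid_images
train_fraction = train_images / total_images

# The last batch may be partial, so round up.
train_steps = math.ceil(train_images / batch_size)
valid_steps = math.ceil(valid_images / batch_size)

print(f"total images:   {total_images}")
print(f"train fraction: {train_fraction:.4f}")
print(f"train batches:  {train_steps}")
print(f"valid batches:  {valid_steps}")
```

The number of training batches should agree with the step counter that Keras prints during training.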


## **3. VGG-16 Architecture**
The **VGG16** architecture is a deep convolutional neural network (CNN) designed for image
classification tasks. It was introduced by the Visual Geometry Group at the University of
Oxford. VGG-16 is characterized by its simplicity and uniform architecture, making it easy to
understand and implement.

The VGG-16 configuration consists of 16 weight layers: 13 convolutional layers and
3 fully connected layers. The convolutional layers are organized into five blocks, with each block containing
two or three convolutional layers followed by a max-pooling layer for downsampling.
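
To make the block structure concrete, here is a minimal sketch in plain Python that walks through the standard VGG-16 configuration and tracks the layer count, feature-map size, and parameter count of the convolutional part. It assumes 3 × 3 kernels with 'same' padding and 2 × 2 max-pooling, as in the published architecture; `VGG16_BLOCKS` and `describe_vgg16_base` are names introduced here for illustration and are not part of Keras.

```python
# (number of 3x3 conv layers, output channels) for each of the five blocks
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def describe_vgg16_base(input_size=224, input_channels=3):
    """Return (conv_layer_count, output_shape, parameter_count) of the conv base."""
    size, channels = input_size, input_channels
    conv_layers = 0
    params = 0
    for n_convs, filters in VGG16_BLOCKS:
        for _ in range(n_convs):
            # 3x3 kernel weights plus one bias per filter; 'same' padding keeps the size
            params += (3 * 3 * channels + 1) * filters
            channels = filters
            conv_layers += 1
        size //= 2  # 2x2 max-pooling with stride 2 halves height and width
    return conv_layers, (size, size, channels), params


conv_layers, output_shape, params = describe_vgg16_base()
print("convolutional layers:", conv_layers)
print("output shape for a 224x224 input:", output_shape)
print("parameters in the convolutional base:", params)
```

Adding the 3 fully connected layers to the convolutional layers counted here gives the 16 weight layers in the name.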

### **3.1 Creating VGG-16 Base Model**

In [5]:
# Create the base model from the pre-trained model VGG16
base_model = VGG16( weights='imagenet', include_top=False, input_shape=(224, 224, 3) )
base_model.trainable = False # Freeze the base model

base_model.summary()

### **3.2 Adding Custom layers**
A flatten layer, three dense layers, a dropout layer, and a 38-way softmax output layer are added on top of the frozen base model.

In [6]:
# Create the new layers
x = base_model.output
x = layers.Flatten()(x)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dense(512, activation='relu')(x)
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.4)(x)
predictions = layers.Dense(38, activation='softmax')(x)

# Combine the base model and the new layers
model = models.Model(
    inputs = base_model.input,
    outputs = predictions
    )

# Compiling the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model.summary()
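
Because the base model is frozen, only the newly added layers are trained. As a rough check on the size of this head, the sketch below counts its parameters by hand. It assumes the base model outputs a 7 × 7 × 512 feature map for a 224 × 224 input, and `dense_params` is a helper introduced here for illustration; the figure reported by `model.summary()` is the authoritative one.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


# Flatten turns the 7 x 7 x 512 output of the VGG-16 base into one vector.
flattened = 7 * 7 * 512
head_units = [1024, 512, 256, 38]  # Dense layer widths from the cell above

head_params = 0
n_inputs = flattened
for n_units in head_units:
    head_params += dense_params(n_inputs, n_units)
    n_inputs = n_units  # Dropout has no parameters and keeps the width

print(f"flattened features: {flattened}")
print(f"trainable head parameters: {head_params}")
```

Almost all of these parameters sit in the first dense layer, which connects the flattened feature map to 1024 units.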

### **3.3 VGG-16 Model Training Phase**

In [8]:
history = model.fit(
    training_set,
    steps_per_epoch=None,
    epochs=5,
    validation_data=validation_set,
    validation_steps=None,
    verbose=1,
    callbacks=[ReduceLROnPlateau(monitor='val_loss', factor=0.3,patience=3, min_lr=0.000001)],
    shuffle=True
    )
model.save('model_VGG16.h5')
model.save('model_VGG16.keras')

Epoch 1/5
2197/2197 ━━━━━━━━━━━━━━━━━━━━ 22407s 10s/step - accuracy: 0.1192 - loss: 4.0734 - val_accuracy: 0.1503 - val_loss: 3.1311 - learning_rate: 0.0010
Epoch 2/5
1089/2197 ━━━━━━━━━━━━━━━━━━━━ 3:06:05 10s/step - accuracy: 0.1498 - loss: 3.1401
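
The `ReduceLROnPlateau` callback used in the training cell lowers the learning rate when `val_loss` stops improving. The sketch below is a simplified plain-Python model of that rule, using the same `factor`, `patience`, and `min_lr` values as the cell above. It is not the Keras implementation: it ignores the cooldown option, assumes a `min_delta` of 1e-4, and runs on made-up loss values. `simulate_reduce_lr_on_plateau` is a hypothetical helper name.

```python
def simulate_reduce_lr_on_plateau(val_losses, lr=1e-3, factor=0.3,
                                  patience=3, min_lr=1e-6, min_delta=1e-4):
    """Return the learning rate in effect after each epoch.

    Simplified model of the callback: no cooldown, and 'improvement'
    means the loss dropped by more than min_delta below the best value.
    """
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Shrink the learning rate, but never below min_lr
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule


# Hypothetical validation losses, not taken from the training run.
losses = [3.0, 2.5, 2.6, 2.7, 2.8, 2.9]
schedule = simulate_reduce_lr_on_plateau(losses)
print(schedule)
```

Under this simplified rule, with `epochs=5` and `patience=3` a reduction can take effect no earlier than the end of the fourth epoch, so it would influence only the final epoch of a five-epoch run.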

##### **Saving Model**

In [None]:
model.save('model_VGG16.h5')
model.save('model_VGG16.keras')

##### **Training History**

In [None]:
print(history.history.keys())  # Names of the metrics recorded during training
history.history  # Dictionary mapping each metric name to its per-epoch values

# Recording history in JSON
with open('training_hist.json', 'w') as f:
    json.dump(history.history, f)

### **3.4 Training and Validation Loss/Accuracy**
The loss function is the primary measure of the model's efficacy: it quantifies how far the predicted
class probabilities are from the actual labels. Here the loss is categorical cross-entropy, as set in `model.compile`.
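
To give the loss values a scale, the following plain-Python sketch computes categorical cross-entropy for a single image with 38 classes, once for a uniform (chance-level) prediction and once for a confident, correct one. The function name, the clipping constant `eps`, and the example probabilities are assumptions made for illustration; Keras applies its own clipping and averages over the batch.

```python
import math


def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


num_classes = 38
one_hot = [0.0] * num_classes
one_hot[5] = 1.0  # arbitrary true class, for illustration only

# A model that has learned nothing spreads probability evenly.
uniform = [1.0 / num_classes] * num_classes
chance_loss = categorical_cross_entropy(one_hot, uniform)

# A confident, correct prediction drives the loss towards zero.
confident = [0.001 / (num_classes - 1)] * num_classes
confident[5] = 0.999
confident_loss = categorical_cross_entropy(one_hot, confident)

print(f"chance-level loss: {chance_loss:.4f}")
print(f"confident loss:    {confident_loss:.4f}")
```

For 38 classes the chance-level value is ln 38, roughly 3.64, which gives a reference point for reading the loss values in the training log.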

In [None]:
# Training set loss
train_loss, train_acc = model.evaluate(training_set)
print('Training Loss:', train_loss)

# Validation set loss
val_loss, val_acc = model.evaluate(validation_set)
print('Validation Loss:', val_loss)

In [None]:
# One x-axis point per recorded epoch
epochs = range(1, len(history.history['loss']) + 1)
plt.plot(epochs, history.history['loss'], color='red', label='Training set Loss')
plt.plot(epochs, history.history['val_loss'], color='blue', label='Validation set Loss')
plt.xlabel('No. of Epochs')
plt.title('Visualization of Loss')
plt.legend()
plt.show()

In [None]:
# Training set accuracy
train_loss, train_acc = model.evaluate(training_set)
print('Training accuracy:', train_acc)

# Validation set accuracy
val_loss, val_acc = model.evaluate(validation_set)
print('Validation accuracy:', val_acc)

In [None]:
# One x-axis point per recorded epoch (the model was trained for 5 epochs, not 10)
epochs = range(1, len(history.history['accuracy']) + 1)
plt.plot(epochs, history.history['accuracy'], color='red', label='Training Accuracy')
plt.plot(epochs, history.history['val_accuracy'], color='blue', label='Validation Accuracy')
plt.xlabel('No. of Epochs')
plt.title('Visualization of Accuracy Result')
plt.legend()
plt.show()

## **5. Evaluation**
The model is evaluated using several metrics: accuracy, precision, recall, and
F1-score.
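
With 38 classes, precision, recall, and F1-score are computed per class and then averaged. The sketch below shows one common choice, macro-averaging, in plain Python on a made-up three-class example. `macro_scores` is a hypothetical helper written for illustration; treating an undefined ratio as 0 is an assumption of this sketch.

```python
def macro_scores(y_true, y_pred, num_classes):
    """Return (accuracy, macro precision, macro recall, macro F1)."""
    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        # Per-class counts of true positives, false positives, false negatives
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return (accuracy, sum(precisions) / num_classes,
            sum(recalls) / num_classes, sum(f1s) / num_classes)


# Toy labels for three classes, made up for illustration.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
accuracy, precision, recall, f1 = macro_scores(y_true, y_pred, 3)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Macro-averaging gives every class equal weight regardless of how many images it has; a weighted average is the alternative when class sizes differ substantially.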

In [None]:
class_name = validation_set.class_names

# Reload the validation images in a fixed order so predictions and labels line up
test_set = tf.keras.utils.image_dataset_from_directory(
    'Dataset/valid',  # same directory as validation_set above
    labels="inferred",
    label_mode="categorical",
    class_names=None,
    color_mode="rgb",
    batch_size=1,
    image_size=(224, 224),  # must match the model's input shape
    shuffle=False,
    seed=None,
    validation_split=None,
    subset=None,
    interpolation="bilinear",
    follow_links=False,
    crop_to_aspect_ratio=False
)

# Predicted class index for every image
predicted_categories = model.predict(test_set)
y_pred = tf.argmax(predicted_categories, axis=1)

# True class index recovered from the one-hot labels
true_categories = tf.concat([y for x, y in test_set], axis=0)
y_true = tf.argmax(true_categories, axis=1)

### **5.1 Accuracy**

In [None]:
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy:", accuracy)

### **5.2 Precision**

In [None]:
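# Multiclass targets need an explicit averaging mode; macro-averaging is used here
precision = precision_score(y_true, y_pred, average='macro')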
print("Precision:", precision)

### **5.3 Recall**

In [None]:
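# Multiclass targets need an explicit averaging mode; macro-averaging is used here
recall = recall_score(y_true, y_pred, average='macro')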
print("Recall:", recall)

### **5.4 F1-score**

In [None]:
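# Multiclass targets need an explicit averaging mode; macro-averaging is used here
f1 = f1_score(y_true, y_pred, average='macro')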
print("F1-score:", f1)

### **5.5 Classification Report**

In [None]:
report = classification_report(y_true, y_pred)
# Print classification report
print("Classification Report:")
print(report)

### **5.6 Confusion Matrix**

In [None]:
# Plot confusion matrix
print("Confusion Matrix:")
cm = confusion_matrix(y_true, y_pred)
plt.figure(figsize=(40, 40))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', cbar=False, annot_kws={"size": 12})
plt.xlabel('Predicted Label', fontsize = 20)
plt.ylabel('True Label', fontsize = 20)
plt.title('Confusion Matrix')
plt.show()
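
A 38 × 38 heatmap can be hard to read at a glance, so it may help to list the largest off-diagonal entries, which are the class pairs the model confuses most often. The sketch below does this in plain Python on a made-up 3 × 3 matrix; `top_confusions` and the example values are hypothetical. It assumes rows are true labels and columns are predictions, which is the layout `confusion_matrix` returns.

```python
def top_confusions(cm, class_names, k=3):
    """Return the k largest off-diagonal entries as (count, true, predicted)."""
    pairs = []
    for i, row in enumerate(cm):
        for j, count in enumerate(row):
            if i != j and count > 0:
                pairs.append((count, class_names[i], class_names[j]))
    pairs.sort(key=lambda item: item[0], reverse=True)
    return pairs[:k]


# Hypothetical 3-class matrix; rows are true labels, columns are predictions.
example_cm = [
    [50, 2, 0],
    [7, 40, 3],
    [0, 1, 49],
]
example_names = ["class_a", "class_b", "class_c"]
worst = top_confusions(example_cm, example_names, k=2)
print(worst)
```

For the notebook's own results, the same function could be called with `cm.tolist()` and the `class_name` list from the evaluation cell.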