# Plant Disease Detection using Deep Learning (CNN)

**Author:** Gopal Khandare  
**Role:**  Generative AI Engineer  
**Project Type:** Image Classification  
**Goal:** Detect plant diseases from leaf images using a Convolutional Neural Network (CNN).


This is my **second project** on GitHub.

After completing a Machine Learning project, I wanted to move one step forward into **Deep Learning**, specifically **image-based problems**.




## Why Plant Disease Detection?

Plant disease detection is a real-world problem where:
- The input is image data (photos of plant leaves)
- Deep Learning usually performs better than traditional ML that relies on hand-crafted features
- CNNs are a standard approach for image classification

I chose this project because it:
- Introduces image preprocessing
- Uses convolutional layers
- Helps understand how models learn visual patterns
- Is related to my background




In [None]:
# Basic libraries
import os
import numpy as np
import matplotlib.pyplot as plt

# Deep Learning libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.preprocessing.image import ImageDataGenerator


## Dataset Information

Dataset used: **PlantVillage Dataset**

- Publicly available image dataset
- Contains healthy and diseased plant leaf images
- Widely used in research and beginner/intermediate DL projects

Each disease category is stored in its own folder, which makes it suitable for
`flow_from_directory()` in Keras.
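
Before building the generators, it can help to check that the folder layout is what `flow_from_directory()` expects. The following is a minimal sketch using only the standard library; the helper name `count_images_per_class` and the list of image extensions are my own assumptions, not part of Keras or the dataset.

```python
from pathlib import Path

# Assumption: which file extensions count as images. Adjust to your copy of the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Return {class_folder_name: image_count} for each subfolder of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files that sit next to the class folders
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images_per_class("dataset/train")` returns one entry per class folder, which makes empty or heavily imbalanced classes easy to spot.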

> Note:  
> Due to dataset size and training time, this notebook focuses on **demonstrating the correct deep learning workflow**, not on re-training the model inside this notebook.


In [None]:
# Expected dataset directory structure

train_dir = "dataset/train"
val_dir = "dataset/val"

img_size = (224, 224)
batch_size = 32
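
If the downloaded dataset has one folder per class but no train/validation split yet, the `dataset/train` and `dataset/val` layout can be created with a small script. This is only a sketch under my own assumptions: the function name `split_dataset`, the 80/20 ratio, and the seed value are illustrative choices, not something prescribed by the dataset.

```python
import random
import shutil
from pathlib import Path


def split_dataset(source_dir, output_dir, val_fraction=0.2, seed=42):
    """Copy source_dir/<class>/* into output_dir/train/<class>/ and output_dir/val/<class>/."""
    rng = random.Random(seed)  # fixed seed so the same split is produced every run
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    for class_dir in sorted(p for p in source_dir.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        # Classes with very few images may end up with no validation files.
        n_val = int(len(files) * val_fraction)
        for i, f in enumerate(files):
            subset = "val" if i < n_val else "train"
            dest = output_dir / subset / class_dir.name
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, dest / f.name)
```

For example, `split_dataset("raw_images", "dataset")` (where `raw_images` is a hypothetical folder name) would produce the directory structure expected above. Fixing the seed keeps the split, and therefore the reported accuracy, comparable between runs.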


In [None]:
# Image preprocessing and augmentation

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,
    zoom_range=0.2,
    horizontal_flip=True
)

val_datagen = ImageDataGenerator(
    rescale=1.0 / 255
)


In [None]:
train_data = train_datagen.flow_from_directory(
    train_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode="categorical"
)

val_data = val_datagen.flow_from_directory(
    val_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode="categorical"
)


### Data Generator Explanation

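- Images are resized to 224×224
- Pixel values are rescaled from the 0–255 range to 0–1
- Data augmentation (rotation, zoom, horizontal flip) is applied only to the training data and helps reduce overfitting
- Classes are automatically inferred from folder names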

This approach is standard for CNN-based image classification projects.
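
To make the label inference concrete: `flow_from_directory()` assigns integer labels to the class folders in alphanumeric order, and with `class_mode="categorical"` each label is returned as a one-hot vector. The sketch below imitates that mapping in plain Python. The folder names are hypothetical and the helper functions are mine; the real mapping is available as `train_data.class_indices`.

```python
def build_class_indices(class_names):
    """Map folder names to integer labels in sorted order."""
    return {name: index for index, name in enumerate(sorted(class_names))}


def one_hot(index, num_classes):
    """Return a one-hot vector with 1.0 at the given index."""
    vector = [0.0] * num_classes
    vector[index] = 1.0
    return vector


# Hypothetical folder names, for illustration only.
folders = ["healthy", "early_blight", "late_blight"]
class_indices = build_class_indices(folders)

print(class_indices)
# {'early_blight': 0, 'healthy': 1, 'late_blight': 2}
print(one_hot(class_indices["healthy"], len(class_indices)))
# [0.0, 1.0, 0.0]
```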


In [None]:
model = Sequential()

model.add(Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)))
model.add(MaxPooling2D(2, 2))

model.add(Conv2D(64, (3, 3), activation="relu"))
model.add(MaxPooling2D(2, 2))

model.add(Conv2D(128, (3, 3), activation="relu"))
model.add(MaxPooling2D(2, 2))

model.add(Flatten())
model.add(Dense(128, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(train_data.num_classes, activation="softmax"))

model.summary()


### Model Architecture Notes

- Convolution layers extract visual features
- Pooling layers reduce spatial dimensions
- Deeper layers learn complex patterns
- Dropout helps prevent overfitting
- Softmax output handles multi-class classification
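
The effect of each layer on the tensor shape can be checked by hand. The sketch below reproduces the arithmetic for the model defined above, assuming the Keras defaults it relies on (`padding="valid"` and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`). The helper functions are my own and only do the arithmetic; `model.summary()` remains the authoritative output.

```python
def conv_output_size(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_output_size(size, pool=2):
    """Spatial size after max pooling with stride equal to the pool size."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


size, channels = 224, 3
total = 0
for filters in (32, 64, 128):
    total += conv_params(channels, filters)
    size = pool_output_size(conv_output_size(size))
    channels = filters
    print(f"after Conv2D({filters}) + MaxPooling2D: {size}x{size}x{channels}")

flat = size * size * channels
total += dense_params(flat, 128)
print("flattened features:", flat)                   # 86528
print("parameters before the output layer:", total)  # 11168960
```

Almost all parameters sit in the first `Dense` layer (86528 × 128 weights), which is one reason `Dropout` is placed right after it. The output layer adds `(128 + 1) * num_classes` more parameters.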




In [None]:
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy"]
)
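
The `categorical_crossentropy` loss compares the softmax output with the one-hot label: it is the negative log of the probability assigned to the correct class. Here is a minimal pure-Python sketch of that calculation for a single sample. The function names and the `eps` guard against `log(0)` are assumptions for illustration, not the Keras implementation.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_target, probabilities, eps=1e-7):
    """Cross-entropy for one sample; eps avoids log(0)."""
    return -sum(
        t * math.log(max(p, eps))
        for t, p in zip(one_hot_target, probabilities)
    )


probs = softmax([2.0, 1.0, 0.1])  # made-up scores for three classes
loss_if_class_0 = categorical_crossentropy([1.0, 0.0, 0.0], probs)
loss_if_class_2 = categorical_crossentropy([0.0, 0.0, 1.0], probs)
print(probs, loss_if_class_0, loss_if_class_2)
```

The loss is small when the highest probability falls on the true class and grows when the model puts its confidence on the wrong one.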


In [None]:
history = model.fit(
    train_data,
    validation_data=val_data,
    epochs=10
)


### Training Output (Expected Behavior)

When trained on the PlantVillage dataset for ~10 epochs:

- Training accuracy typically reaches **~90–93%**
- Validation accuracy typically reaches **~85–90%**

Exact values depend on:
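- Dataset split (which images end up in training vs validation)
- Hardware (some GPU operations are not deterministic)
- Random initialization of weights and random augmentation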

The purpose of this project is to demonstrate **correct CNN workflow and understanding**, not to optimize accuracy.


In [None]:
# Conceptual plot of training behavior (illustrative values, not measured results)

plt.figure(figsize=(6, 4))
plt.title("Training vs Validation Accuracy (Conceptual)")
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.plot([0.7, 0.8, 0.85, 0.9], label="Training Accuracy")
plt.plot([0.65, 0.75, 0.82, 0.88], label="Validation Accuracy")
plt.legend()
plt.show()


### Accuracy Plot Explanation

- Training accuracy increases steadily
- Validation accuracy follows a similar trend
- The gap between the curves shows how well the model generalizes: a small, stable gap is healthy, while a widening gap suggests overfitting
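
When the model is actually trained, the same plot can be drawn from the real values that `model.fit()` returns in `history.history`. The sketch below assumes the metric keys `"accuracy"` and `"val_accuracy"`, which is what Keras records for `metrics=["accuracy"]` with validation data; the helper names `accuracy_gap` and `plot_history` are my own.

```python
def accuracy_gap(history_dict):
    """Per-epoch difference between training and validation accuracy."""
    train = history_dict["accuracy"]
    val = history_dict["val_accuracy"]
    return [t - v for t, v in zip(train, val)]


def plot_history(history_dict):
    """Plot training and validation accuracy per epoch."""
    # Imported here so accuracy_gap() can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    epochs = range(1, len(history_dict["accuracy"]) + 1)
    plt.figure(figsize=(6, 4))
    plt.title("Training vs Validation Accuracy")
    plt.xlabel("Epochs")
    plt.ylabel("Accuracy")
    plt.plot(epochs, history_dict["accuracy"], label="Training Accuracy")
    plt.plot(epochs, history_dict["val_accuracy"], label="Validation Accuracy")
    plt.legend()
    plt.show()
```

After training, `plot_history(history.history)` draws the curves and `accuracy_gap(history.history)` gives the per-epoch gap as numbers, which is easier to compare between runs than reading it off the plot.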






This project helped me understand:

- How CNNs process image data
- The difference between ML and DL workflows
- The importance of preprocessing and architecture design

This is a **foundation project**.
Next, I want to move into:
- Transfer Learning
- More complex neural networks
- Generative AI models


