# 🍅 **Tomato Leaf Disease** #

## 🎯 Project Description  ###
Tomatoes are among the most widely grown vegetables worldwide, and their diseases can significantly reduce yield and quality. Accurate, early detection of tomato diseases is crucial for reducing losses and improving crop management.

In this project, I apply Deep Learning to detect blight in tomato leaves and classify each leaf as healthy, early blight, or late blight, making automated disease detection more efficient and accessible.

## 📃 The Dataset #
The dataset consists of three folders, each containing images of tomato leaves in one of the following conditions:
- Healthy
- Early blight
- Late blight
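
As a quick sanity check, the per-class image counts can be read straight from this folder layout. The sketch below assumes one subfolder per class under a root directory; `count_images_per_class` and the extension list are illustrative names chosen here, not part of the notebook.

```python
import os

# Assumed image extensions; adjust to match the files actually present in the dataset.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')

def count_images_per_class(root_dir):
    """Return {class_folder_name: number_of_image_files} for each subfolder of root_dir."""
    counts = {}
    for entry in sorted(os.listdir(root_dir)):
        class_dir = os.path.join(root_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files next to the class folders
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts
```

Called on the raw data folder once the dataset has been downloaded, this should return one entry per class, which makes a missing or empty class folder easy to spot.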

## 📚 The Notebook Structure ###
- ⚙️ Notebook Preparation - Import libraries, initialize constants and download dataset.
- 🔎 Data Analysis - Visualize data and confirm the distribution of different classes.
- 🧹 Data Preprocessing - Load and preprocess images to ensure compatibility with neural networks.
- 🛠️ Model Building - Build a CNN model using TensorFlow.
- 📈 Model Training - Train model on the dataset, and use a separate validation set to monitor its performance.
- 📊 Model Evaluation - Evaluate the model's performance on a separate test set.
- 🔝 Model Improvement - Try different strategies to improve the model's performance.
- 🖼️ Result Visualization - Visualize the results.

## 1. ⚙️ Notebook Preparation

In [42]:
# General libraries
import warnings
warnings.filterwarnings('ignore')
import os
import datetime
import random
import shutil
import pandas as pd
import numpy as np
from kaggle.api.kaggle_api_extended import KaggleApi

# Data Analysis libraries
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns

# Machine Learning libraries
# Keras is imported through TensorFlow so every class comes from the same package
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Flatten, Dropout
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [66]:
# Dataset
KAGGLE_DATASET = 'charuchaudhry/plantvillage-tomato-leaf-dataset'

# Paths
MAIN_PATH = os.path.abspath(os.path.join(os.getcwd(), '..', 'data'))
RAW_PATH = os.path.join(MAIN_PATH, 'raw_data', '')
TEST_PATH = os.path.join(MAIN_PATH, 'test_split', '')
TRAIN_PATH = os.path.join(MAIN_PATH, 'train_split', '')
PLANTVILLAGE = os.path.join(MAIN_PATH, 'plantvillage', '')

# Constants
BATCH_SIZE = 32
IMG_HEIGHT = 180
IMG_WIDTH = 180
AUTOTUNE = tf.data.AUTOTUNE
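
The download step in the next cell needs a local Kaggle API key. The Kaggle client reads credentials either from the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables or from a `kaggle.json` file, by default in `~/.kaggle`. A minimal sketch of a pre-flight check, assuming those two locations only; `kaggle_credentials_available` is a hypothetical helper, not part of the Kaggle package:

```python
import os

def kaggle_credentials_available(config_dir=None, env=None):
    """Best-effort check for Kaggle credentials before calling api.authenticate()."""
    env = os.environ if env is None else env

    # Option 1: credentials supplied through environment variables
    if env.get('KAGGLE_USERNAME') and env.get('KAGGLE_KEY'):
        return True

    # Option 2: a kaggle.json token file in the config directory (default: ~/.kaggle)
    if config_dir is None:
        config_dir = os.path.join(os.path.expanduser('~'), '.kaggle')
    return os.path.isfile(os.path.join(config_dir, 'kaggle.json'))
```

Running this check first lets the notebook stop with a clear message instead of failing partway through the folder setup.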

In [71]:
# Create Data folder structure
if not os.path.exists(MAIN_PATH):
    os.makedirs(MAIN_PATH, exist_ok=True)

    for path in [RAW_PATH, TRAIN_PATH, TEST_PATH]:
        if not os.path.exists(path):
            os.makedirs(path, exist_ok=True)

    # Instantiate Kaggle API (requires local API Key)
    api = KaggleApi()
    api.authenticate()

    # Download the dataset to the local folder
    api.dataset_download_files(KAGGLE_DATASET, path=MAIN_PATH, unzip=True)

    for folder in os.listdir(PLANTVILLAGE):
        if folder in ('Tomato___healthy', 'Tomato___Early_blight', 'Tomato___Late_blight'):
            source_path = os.path.join(PLANTVILLAGE, folder)

            # Rename folder
            new_folder = folder.replace('Tomato___', '').lower()
            new_path = os.path.join(PLANTVILLAGE, new_folder)
            os.rename(source_path, new_path)

            # Move the subfolder to the destination folder
            shutil.move(new_path, RAW_PATH)

    shutil.rmtree(PLANTVILLAGE)

# Define split ratio of 80% train and 20% test
split_ratio = 0.8

# Fix the seed so the split is reproducible across runs (the value 42 is arbitrary)
random.seed(42)

for class_folder in sorted(os.listdir(RAW_PATH)):
    class_path = os.path.join(RAW_PATH, class_folder)
    if not os.path.isdir(class_path):
        continue

    dest_train_folder = os.path.join(TRAIN_PATH, class_folder)
    dest_test_folder = os.path.join(TEST_PATH, class_folder)

    # Skip classes that were already split in a previous run
    if os.path.exists(dest_train_folder) or os.path.exists(dest_test_folder):
        continue
    os.makedirs(dest_train_folder)
    os.makedirs(dest_test_folder)

    # Sort before shuffling so the result depends only on the seed
    files = sorted(os.listdir(class_path))
    random.shuffle(files)

    # Separate raw data into train and test splits
    split_index = int(len(files) * split_ratio)
    train_files = files[:split_index]
    test_files = files[split_index:]

    # Copy each split into its destination folder (the raw data is left untouched)
    for dest_folder, split_files in [(dest_train_folder, train_files),
                                     (dest_test_folder, test_files)]:
        for file in split_files:
            shutil.copy(os.path.join(class_path, file), os.path.join(dest_folder, file))
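
A per-class split like the one above is reproducible only when the shuffle is seeded and the input order is fixed. The following is a minimal sketch of that idea, separated from the file copying so it can be checked on plain lists; `split_filenames` and its default seed are illustrative assumptions, not the notebook's own code.

```python
import random

def split_filenames(filenames, train_ratio=0.8, seed=0):
    """Return (train, test) lists; the result is deterministic for a given seed."""
    files = sorted(filenames)            # remove any dependence on directory listing order
    random.Random(seed).shuffle(files)   # local RNG, leaves the global random state untouched
    split_index = int(len(files) * train_ratio)
    return files[:split_index], files[split_index:]

names = [f'leaf_{i:03d}.jpg' for i in range(10)]
train, test = split_filenames(names)
print(len(train), len(test))  # 8 2
```

Because the two slices come from the same shuffled list, every file ends up in exactly one split, and running the function again with the same seed returns the same partition.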

## 2. 🔎 Data Analysis ##