# **Potato Disease Classification**

## Import Modules


In [48]:
import tensorflow as tf
from tensorflow.keras import models, layers
import matplotlib.pyplot as plt
from IPython.display import HTML


* tensorflow is imported as tf.
* The models and layers sub-modules are imported from tensorflow.keras. models provides classes for building and training neural networks, such as Sequential, and layers provides the building blocks they are assembled from, such as Conv2D and Dense.
* matplotlib.pyplot is imported as plt, which allows you to create different types of plots and visualizations to analyze your data.
* IPython.display.HTML is imported to display HTML content in the Jupyter notebook environment.

## Constants

In [49]:
IMAGE_SIZE = 256
CHANNELS = 3

## Data Augmentation

In [50]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
        rescale=1./255,
        rotation_range=10,
        horizontal_flip=True
)
train_generator = train_datagen.flow_from_directory(
        'dataset/train',
        target_size=(IMAGE_SIZE,IMAGE_SIZE),
        batch_size=32,
        class_mode="sparse",
)

Found 32917 images belonging to 16 classes.


* This code uses the ImageDataGenerator class from the tensorflow.keras.preprocessing.image module to augment the image data while the model trains. Augmentation applies random, slight modifications to the existing images as each batch is generated, so the model rarely sees exactly the same image twice. No new files are written, but the effect is similar to training on a larger dataset, which helps improve robustness and reduce overfitting.

* The train_datagen object is created with several parameters to specify the types of image augmentations to apply, such as rotating the image by a random angle between -10 and 10 degrees, and randomly flipping the image horizontally. The rescale parameter is used to normalize the pixel values in the image to be between 0 and 1.

* The train_generator object is then created using the flow_from_directory() method, which takes the path to the training directory, the target size of the images (IMAGE_SIZE x IMAGE_SIZE), the batch size, and the class mode (sparse in this case, which means that the labels are integers). The generator will read images from the directory, apply the specified augmentations, and generate batches of augmented images for training the model.
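
To make two of these transformations concrete, here is a minimal pure-Python sketch of what rescaling and a horizontal flip do to pixel values. The toy 2x3 image and the helper names rescale and horizontal_flip are illustrative assumptions, not part of the Keras API; ImageDataGenerator performs the equivalent operations internally on arrays, and the random rotation is omitted here.

```python
# Toy 2x3 single-channel "image" with 8-bit pixel values (illustrative data).
image = [
    [0, 128, 255],
    [64, 32, 16],
]


def rescale(img, factor=1.0 / 255):
    """Multiply every pixel by `factor`, as rescale=1./255 does."""
    return [[pixel * factor for pixel in row] for row in img]


def horizontal_flip(img):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in img]


scaled = rescale(image)           # every value now lies between 0 and 1
flipped = horizontal_flip(image)  # columns appear in reverse order
print(flipped)  # [[255, 128, 0], [16, 32, 64]]
```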

## Class Names with Indices

In [51]:
train_generator.class_indices

{'Apple___Black_rot': 0,
 'Apple___Cedar_apple_rust': 1,
 'Apple___healthy': 2,
 'Corn_(maize)___Common_rust_': 3,
 'Corn_(maize)___Northern_Leaf_Blight': 4,
 'Corn_(maize)___healthy': 5,
 'Grape___Black_rot': 6,
 'Grape___Esca_(Black_Measles)': 7,
 'Grape___healthy': 8,
 'Potato___Early_blight': 9,
 'Potato___Late_blight': 10,
 'Potato___healthy': 11,
 'Tomato___Bacterial_spot': 12,
 'Tomato___Early_blight': 13,
 'Tomato___Late_blight': 14,
 'Tomato___healthy': 15}
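
Because the model predicts integer labels, it is often handy to invert this mapping so that an index can be turned back into a class name. The sketch below uses only a subset of the dictionary printed above; index_to_class and decode_label are hypothetical helper names introduced for illustration.

```python
# Subset of the class_indices mapping printed above.
class_indices = {
    'Potato___Early_blight': 9,
    'Potato___Late_blight': 10,
    'Potato___healthy': 11,
}

# Invert name -> index into index -> name.
index_to_class = {index: name for name, index in class_indices.items()}


def decode_label(label):
    """Return the class name for an integer label."""
    return index_to_class[int(label)]


print(decode_label(10))  # Potato___Late_blight
```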

## Class Names

In [52]:
class_names = list(train_generator.class_indices.keys())
class_names

['Apple___Black_rot',
 'Apple___Cedar_apple_rust',
 'Apple___healthy',
 'Corn_(maize)___Common_rust_',
 'Corn_(maize)___Northern_Leaf_Blight',
 'Corn_(maize)___healthy',
 'Grape___Black_rot',
 'Grape___Esca_(Black_Measles)',
 'Grape___healthy',
 'Potato___Early_blight',
 'Potato___Late_blight',
 'Potato___healthy',
 'Tomato___Bacterial_spot',
 'Tomato___Early_blight',
 'Tomato___Late_blight',
 'Tomato___healthy']
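
The class names follow a Plant___Condition pattern, with a triple underscore between the two parts. If readable labels are needed, for example in plot titles, a small helper along these lines could split them. split_class_name is a hypothetical function, and the sketch assumes every name contains exactly one triple-underscore separator.

```python
def split_class_name(class_name):
    """Split 'Plant___Condition' into a (plant, condition) pair."""
    plant, condition = class_name.split('___', 1)
    # Single underscores stand in for spaces; strip any left at the ends.
    return plant.replace('_', ' ').strip(), condition.replace('_', ' ').strip()


print(split_class_name('Potato___Early_blight'))        # ('Potato', 'Early blight')
print(split_class_name('Corn_(maize)___Common_rust_'))  # ('Corn (maize)', 'Common rust')
```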

## Validation Dataset

In [53]:
validation_datagen = ImageDataGenerator(
        rescale=1./255,
        rotation_range=10,
        horizontal_flip=True)
validation_generator = validation_datagen.flow_from_directory(
        'dataset/val',
        target_size=(IMAGE_SIZE,IMAGE_SIZE),
        batch_size=32,
        class_mode="sparse"
)

Found 15016 images belonging to 16 classes.


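* This code creates a validation generator in the same way as train_generator, but for the validation set. The validation_datagen object is created with the same parameters as train_datagen, so the validation images are also randomly rotated and flipped. Augmentation is usually reserved for training data; using only rescale here would make the validation metrics repeatable from epoch to epoch.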

* The validation_generator is created using the flow_from_directory() method, which takes the path to the validation directory, the target size of the images (IMAGE_SIZE x IMAGE_SIZE), the batch size, and the class mode (sparse in this case, which means that the labels are integers). The generator will read images from the directory and generate batches of images for validation during the training process.

## Test Dataset

In [54]:
test_datagen = ImageDataGenerator(
        rescale=1./255,
        rotation_range=10,
        horizontal_flip=True)

test_generator = test_datagen.flow_from_directory(
        'dataset/test',
        target_size=(IMAGE_SIZE,IMAGE_SIZE),
        batch_size=32,
        class_mode="sparse"
)

Found 1073 images belonging to 16 classes.


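* This code creates a test generator in the same way as train_generator and validation_generator. The test_datagen object is created with the same parameters as train_datagen and validation_datagen, so the test images are also randomly rotated and flipped. As with validation, evaluation is usually run on unaugmented images so that the reported score does not vary between runs.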

* The test_generator is created using the flow_from_directory() method, which takes the path to the test directory, the target size of the images (IMAGE_SIZE x IMAGE_SIZE), the batch size, and the class mode (sparse in this case, which means that the labels are integers). The generator will read images from the directory and generate batches of images for testing the model after training is complete.
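
As an alternative to the generators above, evaluation data is commonly passed through a generator that only rescales. The following is a sketch under that assumption, not the notebook's original code: eval_datagen_kwargs and make_eval_generator are hypothetical helper names, while rescale, target_size, batch_size, class_mode, and shuffle are existing ImageDataGenerator and flow_from_directory arguments.

```python
def eval_datagen_kwargs():
    """Preprocessing for evaluation: normalisation only, no random transforms."""
    return {"rescale": 1.0 / 255}


def make_eval_generator(directory, image_size=256, batch_size=32):
    """Build a non-augmenting generator for a validation or test directory."""
    # Imported lazily so the helper above can be used without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(**eval_datagen_kwargs())
    return datagen.flow_from_directory(
        directory,
        target_size=(image_size, image_size),
        batch_size=batch_size,
        class_mode="sparse",
        shuffle=False,  # keep file order fixed so results are reproducible
    )
```

Under these assumptions, make_eval_generator('dataset/test') would be a drop-in replacement for test_generator.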

## Model

In [60]:
input_shape = (IMAGE_SIZE, IMAGE_SIZE, CHANNELS)
n_classes = 16

model = models.Sequential([
    layers.InputLayer(input_shape=input_shape),
    layers.Conv2D(32, kernel_size=(3,3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, kernel_size=(3,3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, kernel_size=(3,3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(256, kernel_size=(3,3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(n_classes, activation='softmax'),
])


* This code defines a convolutional neural network (CNN) model using the Sequential API from tensorflow.keras. The model architecture consists of several convolutional layers, max pooling layers, and dense layers.

* The input_shape variable specifies the shape of the input tensor to the model, which is (IMAGE_SIZE, IMAGE_SIZE, CHANNELS), where IMAGE_SIZE is the target size of the input images and CHANNELS is the number of color channels in the images (3 for RGB images).

* The n_classes variable specifies the number of classes in the dataset.

* The model architecture consists of four convolutional layers, each followed by a 2x2 max pooling layer to downsample the feature maps. The convolutional layers have 32, 64, 128, and 256 filters respectively, all with a 3x3 kernel size and the relu activation function.

* After the convolutional layers, the feature maps are flattened into a 1D array and passed through three dense layers. The first two have 256 and 128 units with a relu activation function, and each is followed by a Dropout layer with a rate of 0.5 to reduce overfitting. The final dense layer has n_classes units with a softmax activation function, which outputs a probability distribution over the classes.

* This model can be used for image classification on any dataset with n_classes classes, which is 16 here.
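
The spatial size of the feature maps can be traced by hand. The sketch below assumes the Keras defaults used in the model code: Conv2D with no padding and stride 1, and MaxPooling2D with a stride equal to its pool size. The helper names conv_output_size and pool_output_size are introduced here for illustration.

```python
def conv_output_size(size, kernel=3):
    """Spatial size after a stride-1 convolution with no padding."""
    return size - kernel + 1


def pool_output_size(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


size = 256  # IMAGE_SIZE
sizes = []
for _ in range(4):  # four Conv2D + MaxPooling2D pairs
    size = conv_output_size(size)
    sizes.append(size)
    size = pool_output_size(size)
    sizes.append(size)

flattened_units = size * size * 256  # 256 filters in the last Conv2D layer
print(sizes)            # [254, 127, 125, 62, 60, 30, 28, 14]
print(flattened_units)  # 50176
```

The first six values match the output shapes listed by model.summary().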

## Summary

In [61]:
model.summary()

Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_25 (Conv2D)          (None, 254, 254, 32)      896       
                                                                 
 max_pooling2d_25 (MaxPoolin  (None, 127, 127, 32)     0         
 g2D)                                                            
                                                                 
 conv2d_26 (Conv2D)          (None, 125, 125, 64)      18496     
                                                                 
 max_pooling2d_26 (MaxPoolin  (None, 62, 62, 64)       0         
 g2D)                                                            
                                                                 
 conv2d_27 (Conv2D)          (None, 60, 60, 128)       73856     
                                                                 
 max_pooling2d_27 (MaxPoolin  (None, 30, 30, 128)     
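
The Param # column for the convolutional layers can be checked with a short calculation: each filter has one weight per kernel position and input channel, plus one bias. conv2d_params is a hypothetical helper written for this check; the last value is for the fourth Conv2D layer, which falls outside the portion of the summary shown above.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Trainable parameters in a Conv2D layer: weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


# Input channels followed by the filter count of each Conv2D layer.
channels = [3, 32, 64, 128, 256]
counts = [
    conv2d_params(3, 3, c_in, c_out)
    for c_in, c_out in zip(channels, channels[1:])
]
print(counts)  # [896, 18496, 73856, 295168]
```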

## Model Compile with Optimizer 

In [57]:
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    metrics=['accuracy']
)

This code compiles the CNN model defined in the previous code block using the compile() method.

* The optimizer parameter is set to 'adam', which is an optimization algorithm that is commonly used for deep learning models.

* The loss parameter is set to tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False), the loss function that measures the difference between the predicted and actual labels. SparseCategoricalCrossentropy is used because the labels are integers rather than one-hot vectors, and from_logits=False is appropriate because the final layer already applies softmax and outputs probabilities.

* The metrics parameter is set to ['accuracy'], so accuracy is reported alongside the loss during training and evaluation.
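
For intuition, sparse categorical cross-entropy on probabilities reduces to the negative log of the probability assigned to the true class, averaged over the batch. The following pure-Python sketch illustrates that formula with made-up probabilities; it is not the Keras implementation and omits numerical safeguards such as clipping.

```python
import math


def sparse_categorical_crossentropy(label, probabilities):
    """Loss for one sample: negative log of the true class probability."""
    return -math.log(probabilities[label])


def mean_loss(labels, batch_probabilities):
    """Average the per-sample losses over a batch."""
    losses = [
        sparse_categorical_crossentropy(y, p)
        for y, p in zip(labels, batch_probabilities)
    ]
    return sum(losses) / len(losses)


# Illustrative softmax outputs for two samples and three classes.
probs = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]]
labels = [1, 0]  # integer labels, as produced by class_mode="sparse"
print(round(mean_loss(labels, probs), 4))  # 0.2899
```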

## Training 

In [47]:
history = model.fit(
    train_generator,
    steps_per_epoch=1000,
    batch_size=32,
    validation_data=validation_generator,
    validation_steps=16,
    verbose=1,
    epochs=20,
)

Epoch 1/2

KeyboardInterrupt: 

This code trains the CNN model on the training set using the fit() method.

* The train_generator and validation_generator variables are the generators that generate batches of training and validation data respectively. These generators use data augmentation techniques such as rotation and horizontal flip to increase the amount of training data and prevent overfitting.

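* The steps_per_epoch parameter is set to 1000, so each epoch draws 1000 batches from train_generator. With 32917 training images and a batch size of 32, a full pass would take 1029 batches, so each epoch covers slightly less than the whole training set. batch_size=32 is also passed to fit(), but the batch size is already fixed by flow_from_directory(), and the Keras documentation advises against specifying batch_size when the input is a generator.

* The validation_steps parameter is set to 16, so validation at the end of each epoch uses 16 batches (512 images) rather than all 15016 validation images.

* The verbose parameter is set to 1, which displays a progress bar for each epoch.

* The epochs parameter is set to 20, so training runs for 20 epochs of 1000 batches each.

* The training progress and evaluation metrics are stored in the history variable. The captured output above ends in a KeyboardInterrupt, so no training metrics are shown for this run.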
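
The number of batches needed for one full pass over a directory follows from the image count and batch size. steps_for_full_pass is a hypothetical helper; the image counts are those reported by flow_from_directory() earlier in the notebook.

```python
import math


def steps_for_full_pass(num_images, batch_size):
    """Batches needed to see every image once; the last batch may be smaller."""
    return math.ceil(num_images / batch_size)


print(steps_for_full_pass(32917, 32))  # 1029 batches for the training set
print(steps_for_full_pass(15016, 32))  # 470 batches for the validation set
```

Passing these values as steps_per_epoch and validation_steps would make each epoch cover every image exactly once.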

## Accuracy of the Model

In [26]:
scores = model.evaluate(test_generator)
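
With the compile() settings used in this notebook, evaluate() returns a list holding the loss followed by the accuracy. A small helper such as the hypothetical label_scores below can pair the values with names for reporting. The numbers in the example are placeholders for illustration only, not results from this model.

```python
def label_scores(metric_names, scores):
    """Pair each value returned by evaluate() with its metric name."""
    return dict(zip(metric_names, scores))


# Placeholder values, not measured results.
example = label_scores(['loss', 'accuracy'], [0.25, 0.9])
print(example)  # {'loss': 0.25, 'accuracy': 0.9}
```

In the notebook this would be called as label_scores(['loss', 'accuracy'], scores).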



## Saving the Model

In [27]:
model.save("Model_final.h5")