**Importing necessary modules**

*   `numpy` for numerical computing.
*   `to_categorical` from `keras.utils` for one-hot encoding labels.
*   `ImageDataGenerator` from `tensorflow.keras.preprocessing.image` for image data preprocessing.
*   `applications` from `keras` for the pre-trained VGG16 network.
*   `Sequential` from `keras.models`, plus the `Dropout`, `Flatten`, and `Dense` layers from `keras.layers`, for building the neural network model.

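To make the one-hot encoding step concrete, here is a minimal NumPy sketch of the transformation that `to_categorical` performs on integer class labels. The helper name `one_hot` and the sample labels are illustrative assumptions, not part of the notebook.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float array with a single 1 per row.

    Hypothetical helper that mirrors the effect of keras.utils.to_categorical.
    """
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Set the column that matches each label to 1.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


# Three samples drawn from a 4-class problem (made-up labels).
example = one_hot([0, 2, 3], num_classes=4)
print(example)
```

Each row contains exactly one 1, in the column given by the label, which is the target format that the `categorical_crossentropy` loss expects.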




In [None]:
import numpy as np
from keras import applications
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential
# Import to_categorical from the public keras.utils module;
# keras.src is a private namespace and may change between releases.
from keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import ImageDataGenerator


**Setting up Paths and Image Dimensions for Transfer Learning with Keras**

In [None]:
import math

# dimensions of our images.
img_width, img_height = 150, 150

# Keras 3 requires files written by save_weights to end in ".weights.h5".
top_model_weights_path = 'bottleneck_fc_model.weights.h5'
train_data_dir = '/Users/saryuvasishat/Desktop/SOEN Project/es-crop-disease-diagnosis-master/dataset/train'
validation_data_dir = '/Users/saryuvasishat/Desktop/SOEN Project/es-crop-disease-diagnosis-master/dataset/test'


**Generating and saving bottleneck features using a pre-trained VGG16 model**


*   Load the VGG16 model pre-trained on the ImageNet dataset, excluding the top classification layers (`include_top=False`).

*   Set up an `ImageDataGenerator` that rescales pixel values to the range [0, 1].

*   Create a generator for the training data directory with the target size and batch size, using `class_mode=None` and `shuffle=False` so that the features come out in the same order as the files on disk.

*   Use the generator to predict the bottleneck features for the training data with the pre-trained VGG16 model, and save them to `bottleneck_features_train.npy`.

*   Generate the bottleneck features for the validation data in the same way and save them to `bottleneck_features_validation.npy`.

*   Print the number of training files, the class indices, and the number of classes found in the training directory.

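The number of prediction steps follows from the sample count and the batch size. The sketch below works through that arithmetic; 43456 and 16 are the training image count and batch size that appear later in this notebook, while `steps_for` and the 1000-sample case are illustrative assumptions.

```python
import math


def steps_for(num_samples, batch_size):
    """Number of batches needed to cover every sample once (hypothetical helper)."""
    return int(math.ceil(num_samples / batch_size))


# 43456 training images in batches of 16 divide evenly.
train_steps = steps_for(43456, 16)
# A count that does not divide evenly needs one extra, partially filled batch.
uneven_steps = steps_for(1000, 16)
print(train_steps, uneven_steps)
```

Rounding up rather than down matters here: with floor division the last partial batch would be skipped, and the saved feature array would have fewer rows than there are labels.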


In [None]:
# batch_size defaults to 16, the value used in the training cell below.
def save_bottlebeck_features(batch_size=16):
    # build the VGG16 network
    model = applications.VGG16(include_top=False, weights='imagenet')

    datagen = ImageDataGenerator(rescale=1. / 255)

    generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

    print(len(generator.filenames))
    print(generator.class_indices)
    print(len(generator.class_indices))

    nb_train_samples = len(generator.filenames)
    num_classes = len(generator.class_indices)

    predict_size_train = int(math.ceil(nb_train_samples / batch_size))

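    # Pass the number of batches as `steps`; the second positional
    # argument of predict() is batch_size, not the step count.
    bottleneck_features_train = model.predict(
        generator, steps=predict_size_train)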

    np.save('bottleneck_features_train.npy', bottleneck_features_train)

    generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

    nb_validation_samples = len(generator.filenames)

    predict_size_validation = int(
        math.ceil(nb_validation_samples / batch_size))

    # predict_generator is deprecated; predict() accepts generators directly.
    bottleneck_features_validation = model.predict(
        generator, steps=predict_size_validation)

    np.save('bottleneck_features_validation.npy',
            bottleneck_features_validation)


**Training the model**


*   The `train_top_model` function starts by setting up an `ImageDataGenerator`, which is used here to read the file order and class labels from the data directories.

*   It loads the bottleneck features (precomputed VGG16 representations) for the training data from a file using `np.load`.

*   The class labels for the training data are retrieved in file order and converted into one-hot (categorical) labels.

*   Bottleneck features and labels for the validation data are loaded and processed in the same way.

*   A new neural network model is defined with the Keras `Sequential` API. It consists of a `Flatten` layer, a `Dense` layer with ReLU activation, and a `Dropout` layer to reduce overfitting. The output layer is a `Dense` layer with softmax activation for multi-class classification.

*   The model is compiled with the `rmsprop` optimizer and the `categorical_crossentropy` loss function.

*   The model is trained with the `fit` method on the training features and labels, with the validation features and labels passed as validation data. The training history is stored in the `history` variable.

*   After training, the model weights are saved to a file.

*   The model is evaluated with the `evaluate` method on the validation data.

*   Finally, the accuracy and loss of the model are printed.

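As a rough check on the size of this top model, the sketch below works out the bottleneck feature shape and the number of trainable parameters. It assumes 150 x 150 inputs, the five 2 x 2 max-pooling stages and 512 output channels of VGG16, and the 38 classes reported in the cell output further down; the helper names are illustrative.

```python
def vgg16_feature_shape(height, width, num_pools=5, channels=512):
    """Spatial size after VGG16's pooling stages (each one halves and floors)."""
    for _ in range(num_pools):
        height //= 2
        width //= 2
    return (height, width, channels)


def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


# 150 -> 75 -> 37 -> 18 -> 9 -> 4 along each spatial axis.
feature_shape = vgg16_feature_shape(150, 150)
flat_size = feature_shape[0] * feature_shape[1] * feature_shape[2]
hidden_params = dense_params(flat_size, 256)   # Dense(256, activation='relu')
output_params = dense_params(256, 38)          # Dense(num_classes), 38 classes
total_params = hidden_params + output_params
print(feature_shape, flat_size, total_params)
```

`Flatten` and `Dropout` add no parameters, so nearly all of the trainable weights sit in the first `Dense` layer.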


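The softmax output and the `categorical_crossentropy` loss can be illustrated with plain NumPy. This is a sketch of the standard formulas, not Keras's internal implementation; the logits and label below are made-up values.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def categorical_crossentropy(one_hot_labels, probabilities, eps=1e-7):
    """Mean negative log-probability assigned to the true class."""
    clipped = np.clip(probabilities, eps, 1.0)
    return float(-(one_hot_labels * np.log(clipped)).sum(axis=1).mean())


logits = np.array([[2.0, 1.0, 0.1]])   # one sample, three classes
labels = np.array([[1.0, 0.0, 0.0]])   # true class is index 0
probs = softmax(logits)
loss = categorical_crossentropy(labels, probs)
print(probs, loss)
```

The loss shrinks toward zero as the probability assigned to the true class approaches 1, which is what training the top model tries to achieve.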

In [None]:
import numpy as np
from keras import applications
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential
from keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import math

# dimensions of our images.
img_width, img_height = 150, 150

top_model_weights_path = 'bottleneck_fc_model.weights.h5'
train_data_dir = '/Users/saryuvasishat/Desktop/SOEN Project/es-crop-disease-diagnosis-master/dataset/train'
validation_data_dir = '/Users/saryuvasishat/Desktop/SOEN Project/es-crop-disease-diagnosis-master/dataset/test'

# number of epochs to train top model
epochs = 50
# batch size used by flow_from_directory and predict
batch_size = 16


def save_bottlebeck_features(batch_size):
    # build the VGG16 network
    model = applications.VGG16(include_top=False, weights='imagenet')

    datagen = ImageDataGenerator(rescale=1. / 255)

    generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

    print(len(generator.filenames))
    print(generator.class_indices)
    print(len(generator.class_indices))

    nb_train_samples = len(generator.filenames)
    num_classes = len(generator.class_indices)

    predict_size_train = int(math.ceil(nb_train_samples / batch_size))

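    # Pass the number of batches as `steps`; the second positional
    # argument of predict() is batch_size, not the step count.
    bottleneck_features_train = model.predict(
        generator, steps=predict_size_train)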

    np.save('bottleneck_features_train.npy', bottleneck_features_train)

    generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

    nb_validation_samples = len(generator.filenames)

    predict_size_validation = int(
        math.ceil(nb_validation_samples / batch_size))

    bottleneck_features_validation = model.predict(
        generator, steps=predict_size_validation)

    np.save('bottleneck_features_validation.npy',
            bottleneck_features_validation)


def train_top_model(batch_size, epochs):
    datagen_top = ImageDataGenerator(rescale=1. / 255)
    generator_top = datagen_top.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='categorical',
        shuffle=False)

    nb_train_samples = len(generator_top.filenames)
    num_classes = len(generator_top.class_indices)

    # save the class indices for use in later predictions
    np.save('class_indices.npy', generator_top.class_indices)

    train_data = np.load('bottleneck_features_train.npy')

    # get the class labels for the training data, in the original order
    train_labels = generator_top.classes

    train_labels = to_categorical(train_labels, num_classes=num_classes)

    generator_top = datagen_top.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

    nb_validation_samples = len(generator_top.filenames)

    validation_data = np.load('bottleneck_features_validation.npy')

    validation_labels = generator_top.classes
    validation_labels = to_categorical(
        validation_labels, num_classes=num_classes)

    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))

    model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy', metrics=['accuracy'])

    history = model.fit(train_data, train_labels,
                        epochs=epochs,
                        batch_size=batch_size,
                        validation_data=(validation_data, validation_labels))

    model.save_weights(top_model_weights_path)

    (eval_loss, eval_accuracy) = model.evaluate(
        validation_data, validation_labels, batch_size=batch_size, verbose=1)

    print("[INFO] accuracy: {:.2f}%".format(eval_accuracy * 100))
    print("[INFO] Loss: {}".format(eval_loss))


# batch_size and epochs are defined near the top of this cell.

save_bottlebeck_features(batch_size)
train_top_model(batch_size, epochs)


Found 43456 images belonging to 38 classes.
43456
{'Apple___Apple_scab': 0, 'Apple___Black_rot': 1, 'Apple___Cedar_apple_rust': 2, 'Apple___healthy': 3, 'Blueberry___healthy': 4, 'Cherry_(including_sour)___Powdery_mildew': 5, 'Cherry_(including_sour)___healthy': 6, 'Corn_(maize)___Cercospora_leaf_spot Gray_leaf_spot': 7, 'Corn_(maize)___Common_rust_': 8, 'Corn_(maize)___Northern_Leaf_Blight': 9, 'Corn_(maize)___healthy': 10, 'Grape___Black_rot': 11, 'Grape___Esca_(Black_Measles)': 12, 'Grape___Leaf_blight_(Isariopsis_Leaf_Spot)': 13, 'Grape___healthy': 14, 'Orange___Haunglongbing_(Citrus_greening)': 15, 'Peach___Bacterial_spot': 16, 'Peach___healthy': 17, 'Pepper,_bell___Bacterial_spot': 18, 'Pepper,_bell___healthy': 19, 'Potato___Early_blight': 20, 'Potato___Late_blight': 21, 'Potato___healthy': 22, 'Raspberry___healthy': 23, 'Soybean___healthy': 24, 'Squash___Powdery_mildew': 25, 'Strawberry___Leaf_scorch': 26, 'Strawberry___healthy': 27, 'Tomato___Bacterial_spot': 28, 'Tomato___Earl

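The class indices saved by `train_top_model` are a Python dict, so NumPy stores them with pickle and they need `allow_pickle=True` when read back. A minimal sketch of the round trip, and of inverting the mapping to turn a predicted index into a class name, follows; the three-entry dict, the temporary file, and the predicted index are stand-ins for illustration.

```python
import os
import tempfile

import numpy as np

# Stand-in for generator_top.class_indices (first three entries of the output above).
class_indices = {
    'Apple___Apple_scab': 0,
    'Apple___Black_rot': 1,
    'Apple___Cedar_apple_rust': 2,
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'class_indices.npy')
    np.save(path, class_indices)
    # A dict is stored as a 0-d object array; .item() unwraps it.
    loaded = np.load(path, allow_pickle=True).item()

# Invert the mapping: index -> class name.
index_to_class = {index: name for name, index in loaded.items()}
predicted_index = 1  # hypothetical argmax of a softmax output
print(index_to_class[predicted_index])
```

Only load pickled `.npy` files from sources you trust, since unpickling can execute arbitrary code.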
**For our task, we think TensorFlow with Keras is better suited than scikit-learn**, because:

*   The task involves working with CNNs, in particular the VGG16 architecture. Keras ships pre-trained CNN models such as VGG16 in `keras.applications`, which makes it straightforward to implement transfer learning and work with convolutional layers.
*   TensorFlow with Keras lets us customize network architectures, loss functions, optimizers, and other training settings.

TensorFlow with Keras therefore gives us a suitable framework for training deep learning models to diagnose plant diseases.