# Cassava Leaf Disease Classification with Pre-trained Model
A pre-trained model has already been trained on a dataset, and its weights and biases encode the features of that dataset. Learned features often transfer to different data. For example, a model trained on a large dataset of bird images will have learned low-level features such as edges and horizontal lines that also carry over to your own dataset.

In [None]:
import pandas  as pd
import numpy as np
import matplotlib.pyplot  as plt
from sklearn.utils import shuffle
import cv2

import tensorflow as tf 
from tensorflow.keras import applications
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization, GlobalAveragePooling2D

**Define Dataset path**

In [None]:
data_path = "../input/cassava-leaf-disease-classification/"
train_csv_data_path = data_path+"train.csv"
label_json_data_path = data_path+"label_num_to_disease_map.json"
images_dir_data_path = data_path+"train_images"

In [None]:
train_csv = pd.read_csv(train_csv_data_path)
train_csv['label'] = train_csv['label'].astype('string')

label_class = pd.read_json(label_json_data_path, orient='index')
label_class = label_class.values.flatten().tolist()

We have 5 labels in our dataset

In [None]:
print("Label names :")
for i, label in enumerate(label_class):
    print(f" {i}. {label}")

In [None]:
train_csv.head()

### Data augmentation and pre-processing using Keras

In [None]:
train_gen = ImageDataGenerator(
                                rotation_range=360,
                                width_shift_range=0.1,
                                height_shift_range=0.1,
                                brightness_range=[0.1,0.9],
                                shear_range=25,
                                zoom_range=0.3,
                                channel_shift_range=0.1,
                                horizontal_flip=True,
                                vertical_flip=True,
                                rescale=1/255,
                                validation_split=0.15
                               )
                                    
    
valid_gen = ImageDataGenerator(rescale=1/255,
                               validation_split = 0.15
                              )



In [None]:
BATCH_SIZE = 18
IMG_SIZE = 224

In [None]:
train_generator = train_gen.flow_from_dataframe(
                            dataframe=train_csv,
                            directory = images_dir_data_path,
                            x_col = "image_id",
                            y_col = "label",
                            target_size = (IMG_SIZE, IMG_SIZE),
                            class_mode = "categorical",
                            batch_size = BATCH_SIZE,
                            shuffle = True,
                            subset = "training",

)

valid_generator = valid_gen.flow_from_dataframe(
                            dataframe=train_csv,
                            directory = images_dir_data_path,
                            x_col = "image_id",
                            y_col = "label",
                            target_size = (IMG_SIZE, IMG_SIZE),
                            class_mode = "categorical",
                            batch_size = BATCH_SIZE,
                            shuffle = False,
                            subset = "validation"
)

In [None]:
batch = next(train_generator)
images = batch[0]
labels = batch[1]

### Plot Images

In [None]:
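plt.figure(figsize=(12,9))
# Show the first 6 images of the batch in a 2x3 grid
for i, (img, label) in enumerate(zip(images[:6], labels[:6])):
    plt.subplot(2, 3, i + 1)
    plt.axis('off')
    plt.imshow(img)
    # Labels are one-hot encoded, so argmax recovers the class index
    plt.title(label_class[np.argmax(label)])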

# Building The Model

# ResNet-152
The abstract of the ResNet paper (linked below) describes the approach as follows:

Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers---8x deeper than VGG nets but still having lower complexity.

An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers.

The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

[Paper link](https://arxiv.org/abs/1512.03385)

![](https://i.imgur.com/nyYh5xH.jpg)
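
The residual reformulation described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the ResNet-152 implementation: the two dense layers standing in for the residual function, the weight names `w1` and `w2`, and the helper `residual_block` are all assumptions made for this example (real ResNet blocks use convolutions and batch normalization).

```python
import numpy as np

def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Hypothetical dense residual block: output = relu(F(x) + x)."""
    # F(x): the residual function the block has to learn
    residual = relu(x @ w1) @ w2
    # Skip connection: add the unchanged input back before the final activation
    return relu(residual + x)

rng = np.random.default_rng(0)
x = rng.random((2, 4))        # non-negative inputs, batch of 2 with 4 features
zeros = np.zeros((4, 4))

# With all-zero weights F(x) is 0, so the block reduces to the identity mapping
identity_out = residual_block(x, zeros, zeros)
print(np.allclose(identity_out, x))
```

The point of the sketch is that a block only has to learn a correction to its input. If the best thing a layer can do is pass its input through, the weights can simply shrink toward zero, which is part of why very deep stacks of such blocks remain trainable.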

In [None]:
# Loading the ResNet152 architecture with imagenet weights as base
#base = tf.keras.applications.ResNet152(include_top=False, weights='imagenet',input_shape=[IMG_SIZE,IMG_SIZE,3])

In [None]:
#base.summary()

### Model

In [None]:
# model = tf.keras.Sequential()
# model.add(base)
# model.add(BatchNormalization(axis=-1))
# model.add(GlobalAveragePooling2D())
# model.add(Dropout(0.5))
# model.add(Dense(5, activation='softmax'))

In [None]:
model = tf.keras.models.load_model("../input/trained3/10feb.h5") #../input/trained3/trained3.h5

In [None]:
model.compile(loss=tf.keras.losses.CategoricalCrossentropy(), optimizer=tf.keras.optimizers.Adamax(learning_rate=0.01), metrics=['acc'])

In [None]:
model.summary()

In [None]:

# Save the weights with the lowest validation loss seen so far
model_save = tf.keras.callbacks.ModelCheckpoint("Model", 
                             save_best_only = True, 
                             save_weights_only = True,
                             monitor = 'val_loss', 
                             mode = 'min', verbose = 1)
# Stop when validation accuracy stops improving; accuracy is maximized, so mode must be 'max'
early_stop = tf.keras.callbacks.EarlyStopping(monitor = 'val_acc', min_delta = 0.001, 
                           patience = 5, mode = 'max', verbose = 1,
                           restore_best_weights = True)
# Multiply the learning rate by 0.3 after 2 epochs without validation-loss improvement
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor = 'val_loss', factor = 0.3, 
                              patience = 2, min_delta = 0.001, 
                              mode = 'min', verbose = 1)
# Save the full model every 5 epochs (`period` is deprecated in newer TensorFlow releases in favor of `save_freq`)
checkpoint = tf.keras.callbacks.ModelCheckpoint('model{epoch:08d}.h5', period=5)

# Training The Model

The training call below is commented out because a previously trained model is loaded from disk instead.


In [None]:
# The batch size is set by the generators, so `batch_size` must not be passed to fit()
# history = model.fit(
#       train_generator,
#       steps_per_epoch=train_generator.samples // train_generator.batch_size,
#       epochs=30,
#       validation_data=valid_generator,
#       validation_steps = valid_generator.samples // valid_generator.batch_size,
#       callbacks = [checkpoint]
#       )



In [None]:
# model.save('model.h5')

### Training and validation acc/loss

`acc` and `val_acc` are tracked to evaluate how well the model fits. A large gap between the two, with training accuracy well above validation accuracy, indicates overfitting. For a model that generalizes well, the validation accuracy should be equal to or only slightly lower than the training accuracy.
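
As a rough illustration of the check described above, the gap between training and validation accuracy can be computed per epoch from a Keras-style history dictionary. This is only a sketch: the helper names, the example numbers, and the `0.05` threshold are assumptions chosen for the example, not values from this notebook's training run.

```python
def generalization_gap(history):
    """Per-epoch difference between training and validation accuracy."""
    return [a - v for a, v in zip(history["acc"], history["val_acc"])]

def first_overfit_epoch(history, threshold=0.05):
    """Index of the first epoch whose gap exceeds the (hypothetical) threshold, else None."""
    for epoch, gap in enumerate(generalization_gap(history)):
        if gap > threshold:
            return epoch
    return None

# Made-up history with the same keys as `history.history` in Keras
example_history = {
    "acc":     [0.60, 0.72, 0.81, 0.90, 0.96],
    "val_acc": [0.58, 0.70, 0.77, 0.78, 0.77],
}

overfit_epoch = first_overfit_epoch(example_history)
print("Gap first exceeds the threshold at epoch index:", overfit_epoch)
```

In this made-up run the training accuracy keeps climbing while the validation accuracy plateaus, which is the pattern to look for in the real curves.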

In [None]:
# import matplotlib.pyplot as plt

# acc = history.history['acc']
# val_acc = history.history['val_acc']

# epochs = range(len(acc))

# # Dots for training and a solid line for validation, so the two curves can be told apart
# plt.plot(epochs, acc, 'bo', label='Training acc')
# plt.plot(epochs, val_acc, 'b', label='Validation acc')
# plt.title('Training and validation accuracy')
# plt.legend()

# plt.show()

In [None]:
# import matplotlib.pyplot as plt

# loss = history.history['loss']
# val_loss = history.history['val_loss']

# epochs = range(len(loss))

# # Dots for training and a solid line for validation, so the two curves can be told apart
# plt.plot(epochs, loss, 'bo', label='Training loss')
# plt.plot(epochs, val_loss, 'b', label='Validation loss')
# plt.title('Training and validation loss')
# plt.legend()

# plt.show()

# Evaluation Metrics


### Confusion Matrix
A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing.
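
To make the table concrete, here is a small sketch that builds a confusion matrix by hand and derives per-class recall and precision from it. The labels are made up for illustration and `confusion_matrix_counts` is a hypothetical helper; the layout (rows are true classes, columns are predicted classes) follows the convention used by scikit-learn's `confusion_matrix`.

```python
import numpy as np

def confusion_matrix_counts(y_true, y_pred, num_classes):
    """Count how often true class t (row) was predicted as class p (column)."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Made-up labels for a 3-class problem
y_true_demo = [0, 0, 1, 1, 2, 2, 2]
y_pred_demo = [0, 1, 1, 1, 2, 0, 2]

cm_demo = confusion_matrix_counts(y_true_demo, y_pred_demo, num_classes=3)

# Diagonal entries are correct predictions
recall = cm_demo.diagonal() / cm_demo.sum(axis=1)     # correct / all samples of the true class
precision = cm_demo.diagonal() / cm_demo.sum(axis=0)  # correct / all samples predicted as the class

print(cm_demo)
print("recall:", recall)
print("precision:", precision)
```

Off-diagonal entries show which classes get mistaken for which, which an overall accuracy number cannot reveal.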

### Load Model

In [None]:
loaded_model = tf.keras.models.load_model("../input/trained3/10feb.h5")

### Confusion Matrix

In [None]:
from sklearn.metrics import classification_report, confusion_matrix

#num_of_test_samples = 3209
#batch_size = 1
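# Reset so predictions start at the first validation sample and line up with valid_generator.classes
valid_generator.reset()
# Exactly one pass over the validation set; extra steps would wrap around and produce more predictions than labels
Y_pred = loaded_model.predict(valid_generator, steps=len(valid_generator))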
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
print(confusion_matrix(valid_generator.classes, y_pred))

### Classification Report

In [None]:
target_names = list(train_generator.class_indices.keys()) # Classes
print(classification_report(valid_generator.classes, y_pred, target_names=target_names))

In [None]:
import seaborn as sns

cm = confusion_matrix(valid_generator.classes, y_pred)
labels = ['Cassava Bacterial Blight (CBB)', 'Cassava Brown Streak Disease (CBSD)', 'Cassava Green Mottle (CGM)', 'Cassava Mosaic Disease (CMD)','Healthy']
plt.figure(figsize=(8,6))
sns.heatmap(cm,xticklabels=labels, yticklabels=labels, annot=True, fmt='d', cmap="Blues", vmin = 0.2);
plt.title('Confusion Matrix')
plt.ylabel('True Class')
plt.xlabel('Predicted Class')
plt.show()

### Area Under Curve (AUC)
The Area Under the Curve (AUC) is the measure of the ability of a classifier to distinguish between classes and is used as a summary of the ROC curve. The higher the AUC, the better the performance of the model at distinguishing between the positive and negative classes.
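
AUC is defined for a positive and a negative class, while this dataset has five classes. One common way to bridge that, assumed here and not taken from the notebook, is one-vs-rest: each class in turn is treated as positive and all others as negative. The sketch below computes AUC as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one; `binary_auc` and `one_vs_rest_auc` are hypothetical helpers and the data is made up.

```python
import numpy as np

def binary_auc(y_true, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Compare every positive score with every negative score
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

def one_vs_rest_auc(y_true, probs):
    """One AUC per class, treating that class as positive and the rest as negative."""
    y_true = np.asarray(y_true)
    probs = np.asarray(probs, dtype=float)
    return [binary_auc((y_true == k).astype(int), probs[:, k])
            for k in range(probs.shape[1])]

# Binary example: 3 of the 4 positive/negative pairs are ranked correctly
auc_binary = binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Three-class example in which every sample gets its highest probability on the true class
probs_demo = [[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1],
              [0.1, 0.2, 0.7]]
auc_per_class = one_vs_rest_auc([0, 1, 2], probs_demo)

print(auc_binary, auc_per_class)
```

With real predictions, the class-probability matrix returned by the model and the integer labels in `valid_generator.classes` would take the place of the made-up arrays.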

In [None]:
# from sklearn.preprocessing import LabelBinarizer
# from sklearn.metrics import roc_curve
# from sklearn.metrics import auc

# from sklearn.metrics import roc_curve, auc, roc_auc_score
# import matplotlib.pyplot as plt

# # make a prediction
# y_pred_keras = loaded_model.predict_generator(valid_generator, valid_generator.samples // valid_generator.batch_size+5) #(test_gen, steps=len(df_val), verbose=1)
# fpr_keras, tpr_keras, thresholds_keras = roc_curve(valid_generator.classes, y_pred_keras)
# auc_keras = auc(fpr_keras, tpr_keras)


# plt.figure(1)
# plt.plot([0, 1], [0, 1], 'k--')
# plt.plot(fpr_keras, tpr_keras, label='area = {:.3f}'.format(auc_keras))
# plt.xlabel('False positive rate')
# plt.ylabel('True positive rate')
# plt.title('ROC curve')
# plt.legend(loc='best')
# plt.show()

In [None]:
test_img_path = data_path+"test_images/2216849948.jpg"

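img = cv2.imread(test_img_path)
# OpenCV loads images as BGR; convert to RGB so matplotlib displays the correct colors
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# Resize, add a batch dimension, and scale pixel values to [0, 1]
resized_img = cv2.resize(img, (IMG_SIZE, IMG_SIZE)).reshape(-1, IMG_SIZE, IMG_SIZE, 3)/255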

plt.figure(figsize=(8,4))
plt.title("TEST IMAGE")
plt.imshow(resized_img[0])

In [None]:
preds = []
ss = pd.read_csv(data_path+'sample_submission.csv')

for image in ss.image_id:
    img = tf.keras.preprocessing.image.load_img(data_path+'test_images/' + image)
    img = tf.keras.preprocessing.image.img_to_array(img)
    img = tf.keras.preprocessing.image.smart_resize(img, (IMG_SIZE, IMG_SIZE))
    img = tf.reshape(img, (-1, IMG_SIZE, IMG_SIZE, 3))
    prediction = loaded_model.predict(img/255)
    preds.append(np.argmax(prediction))

my_submission = pd.DataFrame({'image_id': ss.image_id, 'label': preds})
my_submission.to_csv('submission.csv', index=False) 

In [None]:
# Submission file output
print("Submission File: \n---------------\n")
print(my_submission.head()) # Predicted Output

### Please see my other notebook: [Cassava Leaf Disease Classification](https://www.kaggle.com/mnavaidd/casava-leaf-disease-classification)