# What is Cassava? What are the types of diseases?

As the second-largest provider of carbohydrates in Africa, cassava is a key food security crop grown by smallholder farmers because it can withstand harsh conditions. At least 80% of household farms in Sub-Saharan Africa grow this starchy root, but viral diseases are major sources of poor yields.

Existing methods of disease detection require farmers to solicit the help of government-funded agricultural experts to visually inspect and diagnose the plants. This approach is labor-intensive, costly, and limited by the small number of available experts. As an added challenge, effective solutions for farmers must perform well under significant constraints, since African farmers may only have access to mobile-quality cameras and low-bandwidth connections.

So, in this competition, we will use the provided training set, image processing techniques, and AI to classify each cassava image as healthy or as showing one of four disease types.


<img alt="Profit-making idea: Industrialisation of cassava one of Africa's biggest opportunities" src="https://www.howwemadeitinafrica.com/wp-content/uploads/2020/07/PMI-Philafrica-cassava-1200x630-1.jpg" style="width: 1024px; margin: 0px;">

# Let's look at the types of diseases:

**1 - Cassava Bacterial Blight (CBB)**

Xanthomonas axonopodis pv. manihotis is the pathogen that causes bacterial blight of cassava. Originally discovered in Brazil in 1912, the disease has followed cultivation of cassava across the world.[1] Among diseases which afflict cassava worldwide, bacterial blight causes the largest losses in terms of yield.

**Symptoms:**

* Symptoms include leaf spotting, wilting, dying, gum oozing on young shoots, and vascular coloration of mature stems and roots of susceptible varieties.

**2 - Cassava Brown Streak Disease (CBSD)**

Cassava brown streak virus disease (CBSD) is a damaging disease of cassava plants, and is especially troublesome in East Africa. It was first identified in 1936 in Tanzania, and has spread to other coastal areas of East Africa, from Kenya to Mozambique. Recently, it was found that two distinct viruses are responsible for the disease: cassava brown streak virus (CBSV) and Ugandan cassava brown streak virus (UCBSV).

**Symptoms:**

* CBSD is characterized by severe chlorosis and necrosis on infected leaves, giving them a yellowish, mottled appearance.
* Chlorosis may be associated with the veins, spanning from the mid vein, secondary and tertiary veins, or rather in blotches unconnected to veins.
* Leaf symptoms vary greatly depending on a variety of factors. 
* The growing conditions (i.e. altitude, rainfall quantity), plant age, and the virus species account for these differences. 
* Brown streaks may appear on the stems of the cassava plant. Also, a dry brown-black necrotic rot of the cassava tuber exists, which may progress from a small lesion to the whole root. 
* Finally, the roots can become constricted due to the tuber rot, stunting growth.

**3 - Cassava Green Mottle (CGM)**

The causal virus has not been confirmed to be a nepovirus (nepoviruses are viruses transmitted by nematodes, hence the name). Its distribution is narrow: it is only known from Solomon Islands. It was first found on Choiseul in the 1970s; more recently (2010), similar symptoms were seen on Malaita.

**Symptoms:**

* Look for yellow patterns on the leaves, from small dots to irregular patches of yellow and green. 
* Look for leaf margins that are distorted. 
* The plants may be stunted.

**4 - Cassava Mosaic Disease (CMD)**

Cassava mosaic virus is the common name used to refer to any of eleven different species of plant pathogenic virus in the genus Begomovirus. African cassava mosaic virus (ACMV), East African cassava mosaic virus (EACMV), and South African cassava mosaic virus (SACMV) are distinct species of circular single-stranded DNA viruses which are transmitted by whiteflies and primarily infect cassava plants; these have thus far only been reported from Africa.

**Symptoms:**

* Systemic symptoms develop soon after a cassava plant is infected by a cassava geminivirus.
* These symptoms include chlorotic mosaic of the leaves, leaf distortion, and stunted growth. 
* Leaf stalks have a characteristic S-shape.
* Infection can be overcome by the plant especially when a rapid onset of symptoms occurs. A slow onset of disease development usually correlates with death of the plant.
* Disease spread is affected by whiteflies, which transmit the virus.
* It is also affected by environmental factors such as temperature, wind, precipitation and plant density.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import pathlib
import imageio

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

# Import Necessary Libraries

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt 
import plotly.express as px
import os
import cv2
from PIL import Image
import keras
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import tensorflow as tf
from tensorflow.keras import models, layers
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.applications import EfficientNetB0, Xception
from tensorflow.keras.optimizers import Adam

In [None]:
input_dir = "../input/cassava-leaf-disease-classification"

train_images_path = os.path.join(input_dir,"train_images")
test_images_path = os.path.join(input_dir,'test_images')

In [None]:
train = pd.read_csv('../input/cassava-leaf-disease-classification/train.csv')
train.head(10)

In [None]:
print("Total Number of Images in Training Data : ",train.shape[0])

In [None]:
image_list = train['image_id'].to_list()
label_list = train['label'].to_list()

In [None]:
import json

with open("../input/cassava-leaf-disease-classification/label_num_to_disease_map.json") as f:
    class_mapping = json.load(f)

class_mapping2 ={int(k):v for k,v in class_mapping.items()}

class_mapping2
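
JSON object keys are always strings, which is why the cell above converts them with `int(k)` before the map is used with the integer `label` column. Here is a minimal sketch of the same conversion, using an inline JSON string in place of the competition file. The two entries come from the class list in this notebook; `raw_json` and the other variable names are made up for illustration.

```python
import json

# Stand-in for label_num_to_disease_map.json; only two entries shown for brevity.
raw_json = '{"0": "Cassava Bacterial Blight (CBB)", "4": "Healthy"}'

mapping_str_keys = json.loads(raw_json)  # keys come back as str
mapping_int_keys = {int(k): v for k, v in mapping_str_keys.items()}

print(list(mapping_str_keys))  # string keys
print(list(mapping_int_keys))  # integer keys
```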

# Distribution of Diseases:

In [None]:
plt.figure(figsize=(8,5))

sns.set_style('whitegrid')

ax=sns.countplot(data=train, x='label', palette="Pastel1")


# Label meanings:
#   0: Cassava Bacterial Blight (CBB)
#   1: Cassava Brown Streak Disease (CBSD)
#   2: Cassava Green Mottle (CGM)
#   3: Cassava Mosaic Disease (CMD)
#   4: Healthy

In [None]:
train2 = train.copy()
train2.replace({"label": class_mapping2}, inplace=True)

pie_df = train2['label'].value_counts().reset_index()
pie_df.columns = ['label', 'count']
fig = px.pie(pie_df, values = 'count', names = 'label', hole=.3, color_discrete_sequence = px.colors.qualitative.Pastel1)
fig.show()

# Visualization

In [None]:
def plot_samples(class_):
    
    print(f'Some Sample Images belonging to Class {class_mapping[f"{class_}"]}')
    
    sample_images = train[train.label == class_].sample(8)
    
    plt.rcParams["axes.grid"] = False

    fig,ax = plt.subplots(nrows=2,ncols=4,figsize=(20,12))

    for e,img in enumerate(sample_images.image_id):
        image_path = os.path.join(input_dir,f'train_images/{img}')
        # cv2.imread returns pixels in BGR order, so colors look swapped when shown with imshow
        image = cv2.imread(image_path)
        ax[e//4][e%4].imshow(image)
    
    plt.show()

In [None]:
plot_samples(0) #Cassava Bacterial Blight

In [None]:
plot_samples(1) #Cassava Brown Streak Disease

In [None]:
plot_samples(2) #Cassava Green Mottle

In [None]:
plot_samples(3) #Cassava Mosaic Disease

# Converting from BGR to RGB
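
`cv2.imread` returns pixels in blue-green-red (BGR) channel order, while `plt.imshow` expects red-green-blue (RGB), so images shown without conversion have their red and blue channels swapped. A small sketch with a made-up 1x2 array (no image file involved) suggests that the conversion amounts to reversing the last axis:

```python
import numpy as np

# A made-up 1x2 "image" in BGR order: one pure-blue and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the channel axis turns BGR into RGB.
rgb = bgr[..., ::-1]

print(rgb[0, 0])  # the blue pixel, now in RGB order
print(rgb[0, 1])  # the red pixel, now in RGB order
```

For 3-channel images this yields the same pixel values as `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)`, which is the call used in the cells of this section.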

### Cassava Bacterial Blight (CBB) Samples

In [None]:
sample_images = train[train.label == 0].sample(5)
plt.figure(figsize=(35, 20))
for e,img in enumerate(sample_images.image_id):
    plt.subplot(1, 5, e + 1)
    img = cv2.imread(os.path.join(input_dir,f'train_images/{img}'))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(img)
    
plt.show()

### Cassava Brown Streak Disease (CBSD) Samples

In [None]:
sample_images = train[train.label == 1].sample(5)
plt.figure(figsize=(35, 20))
for e,img in enumerate(sample_images.image_id):
    plt.subplot(1, 5, e + 1)
    img = cv2.imread(os.path.join(input_dir,f'train_images/{img}'))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(img)

plt.show()

### Cassava Green Mottle (CGM) Samples

In [None]:
sample_images = train[train.label == 2].sample(5)
plt.figure(figsize=(35, 20))
for e,img in enumerate(sample_images.image_id):
    plt.subplot(1, 5, e + 1)
    img = cv2.imread(os.path.join(input_dir,f'train_images/{img}'))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(img)
    
plt.show()

### Cassava Mosaic Disease (CMD) Samples

In [None]:
sample_images = train[train.label == 3].sample(5)
plt.figure(figsize=(35, 20))
for e,img in enumerate(sample_images.image_id):
    plt.subplot(1, 5, e + 1)
    img = cv2.imread(os.path.join(input_dir,f'train_images/{img}'))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(img)
    
plt.show()

### Healthy Samples

In [None]:
sample_images = train[train.label == 4].sample(5)
plt.figure(figsize=(35, 20))
for e,img in enumerate(sample_images.image_id):
    plt.subplot(1, 5, e + 1)
    img = cv2.imread(os.path.join(input_dir,f'train_images/{img}'))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(img)
    
plt.show()

# Model Implementation and Augmentation

In [None]:
BATCH_SIZE =8 #Mini-Batch Gradient Descent
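# Step counts must be whole numbers, so round up to cover every image once per epoch
STEPS_PER_EPOCH = int(np.ceil(len(train) * 0.8 / BATCH_SIZE))
VALIDATION_STEPS = int(np.ceil(len(train) * 0.2 / BATCH_SIZE))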
EPOCHS = 20
TARGET_SIZE = 350
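
With an 80/20 split, the step counts above are the number of batches needed to see each subset once per epoch. Keras expects them to be whole numbers, so rounding up is a common choice. Here is a sketch of the arithmetic, assuming a hypothetical dataset of 1,000 images (`n_images` and `steps_for` are illustrative names, not part of this notebook):

```python
import math

def steps_for(n_images, fraction, batch_size):
    """Number of batches needed to cover `fraction` of the images once."""
    return math.ceil(n_images * fraction / batch_size)

n_images = 1000  # hypothetical dataset size
batch_size = 8

train_steps = steps_for(n_images, 0.8, batch_size)
val_steps = steps_for(n_images, 0.2, batch_size)
print(train_steps, val_steps)
```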

In [None]:
train.label = train.label.astype('str')

train_datagen = ImageDataGenerator(validation_split = 0.2,
                                     rotation_range = 45,
                                     zoom_range = 0.3,
                                     horizontal_flip = True,
                                     vertical_flip = True,
                                     fill_mode = 'nearest',
                                     shear_range = 0.1,
                                     height_shift_range = 0.1,
                                     width_shift_range = 0.1,
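                                     # Note: the featurewise_* options only take effect after
                                     # train_datagen.fit(...) has been called on sample data;
                                     # without it, Keras warns and skips this normalization.
                                     featurewise_center = True,
                                     featurewise_std_normalization = True)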

train_generator = train_datagen.flow_from_dataframe(train,
                         directory = train_images_path,
                         subset = "training",
                         x_col = "image_id",
                         y_col = "label",
                         target_size = (TARGET_SIZE, TARGET_SIZE),
                         batch_size = BATCH_SIZE,
                         class_mode = "sparse",
                         shuffle= True)


validation_datagen = ImageDataGenerator(validation_split = 0.2)

validation_generator = validation_datagen.flow_from_dataframe(train,
                         directory = train_images_path,
                         subset = "validation",
                         x_col = "image_id",
                         y_col = "label",
                         target_size = (TARGET_SIZE, TARGET_SIZE),
                         batch_size = BATCH_SIZE,
                         class_mode = "sparse")

In [None]:
img_path = os.path.join(train_images_path, '1003442061.jpg')
img = image.load_img(img_path, target_size = (TARGET_SIZE, TARGET_SIZE))
img_tensor = image.img_to_array(img)
img_tensor = np.expand_dims(img_tensor, axis = 0)
img_tensor /= 255.

plt.imshow(img_tensor[0])
plt.show()

In [None]:
generator = train_datagen.flow_from_dataframe(train.iloc[17:18],
                         directory = train_images_path,
                         x_col = "image_id",
                         y_col = "label",
                         target_size = (TARGET_SIZE, TARGET_SIZE),
                         batch_size = BATCH_SIZE,
                         class_mode = "sparse")

aug_images = [generator[0][0][0]/255 for i in range(10)]
fig, axes = plt.subplots(2, 5, figsize = (20, 10))
axes = axes.flatten()
for img, ax in zip(aug_images, axes):
    ax.imshow(img)
plt.tight_layout()
plt.show()

In [None]:
def create_model():
    conv_base = Xception(include_top=False, input_tensor=None,
    pooling=None, input_shape=(TARGET_SIZE, TARGET_SIZE, 3), classifier_activation='softmax')
                               
    model = conv_base.output
    model = layers.GlobalAveragePooling2D()(model)
    model = layers.Dense(5, activation = "softmax")(model)
    model = models.Model(conv_base.input, model)

    # `lr` is a deprecated alias; recent Keras versions expect `learning_rate`
    model.compile(optimizer = Adam(learning_rate = 0.001),
                  loss = "sparse_categorical_crossentropy",
                  metrics = ["acc"])
    return model

In [None]:
model = create_model()
model.summary()

In [None]:
# model_save = ModelCheckpoint('./Xception_best_weights2.h5', 
#                              save_best_only = True, 
#                              save_weights_only = True,
#                              monitor = 'val_loss', 
#                              mode = 'min', verbose = 1)
# early_stop = EarlyStopping(monitor = 'val_loss', min_delta = 0.001, 
#                            patience = 5, mode = 'min', verbose = 1,
#                            restore_best_weights = True)
# reduce_lr = ReduceLROnPlateau(monitor = 'val_loss', factor = 0.3, 
#                               patience = 2, min_delta = 0.001, 
#                               mode = 'min', verbose = 1) #reduced learning rate


# history = model.fit(
#     train_generator,
#     steps_per_epoch = STEPS_PER_EPOCH,
#     epochs = EPOCHS,
#     validation_data = validation_generator,
#     validation_steps = VALIDATION_STEPS,
#     callbacks = [model_save, early_stop, reduce_lr])
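
The `reduce_lr` callback above multiplies the learning rate by `factor = 0.3` whenever `val_loss` has not improved by at least `min_delta` for `patience = 2` consecutive epochs. The following is a simplified pure-Python re-implementation of that rule, not the Keras code itself (it ignores options such as `cooldown` and `min_lr`). The function name and the loss values are made up for illustration.

```python
def simulate_reduce_lr(val_losses, lr=0.001, factor=0.3, patience=2, min_delta=0.001):
    """Return the learning rate in effect after each epoch's validation loss."""
    best = float("inf")
    wait = 0  # epochs since the last meaningful improvement
    lrs = []
    for loss in val_losses:
        if loss < best - min_delta:
            # Meaningful improvement: remember it and reset the counter.
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                # Plateau detected: shrink the learning rate and start counting again.
                lr *= factor
                wait = 0
        lrs.append(lr)
    return lrs

# Made-up validation losses: improvement stalls after the second epoch.
lrs = simulate_reduce_lr([1.0, 0.8, 0.8, 0.8, 0.7])
print(lrs)
```

With these values the rate drops once, from 0.001 to about 0.0003, at the fourth epoch.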

In [None]:
# model.save('./Xception_best_weights.h5')

# Load Model

In [None]:
model = keras.models.load_model('../input/xception-best-weights/Xception_best_weights.h5')

# Submission

In [None]:
submission_file = pd.read_csv(os.path.join(input_dir, 'sample_submission.csv'))
submission_file

In [None]:
preds = []

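for image_id in submission_file.image_id:
    # Load each test image and resize it to the model's input size.
    # Named `img` so it does not shadow the imported Keras `image` module.
    img = Image.open(os.path.join(test_images_path, image_id))
    img = img.resize((TARGET_SIZE, TARGET_SIZE))
    # Add a batch dimension: (H, W, 3) -> (1, H, W, 3)
    batch = np.expand_dims(np.asarray(img), axis = 0)
    preds.append(np.argmax(model.predict(batch)))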

submission_file['label'] = preds
submission_file
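
Assuming the loaded model ends in the five-way softmax layer defined in `create_model`, `model.predict` returns five class probabilities per image, and `np.argmax` picks the index of the largest one as the predicted label. Here is a sketch of that step in plain Python, using made-up probabilities and the class names listed earlier in this notebook:

```python
class_names = {
    0: "Cassava Bacterial Blight (CBB)",
    1: "Cassava Brown Streak Disease (CBSD)",
    2: "Cassava Green Mottle (CGM)",
    3: "Cassava Mosaic Disease (CMD)",
    4: "Healthy",
}

# Made-up softmax output for a single image (values sum to 1).
probs = [0.05, 0.10, 0.05, 0.70, 0.10]

# argmax: index of the highest probability.
pred_label = max(range(len(probs)), key=probs.__getitem__)
print(pred_label, class_names[pred_label])
```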

In [None]:
submission_file.to_csv('submission.csv', index = False)

### PS: 

While creating this notebook, I was inspired by the notebook of a Kaggle member whose name is Maksym Shkliarevskyi. This was my first attempt at computer vision, so his work was a good resource for me.

Thank you to him.


Resource: https://www.kaggle.com/maksymshkliarevskyi/cassava-leaf-disease-best-keras-cnn

My Base Model: https://www.kaggle.com/eceifter/xception-cassava-leaf-disease-classification?scriptVersionId=48693427