_Hello and welcome to my notebook!\
Feel free to leave any feedback or observation in the comment section_

![](https://jgi.doe.gov/wp-content/uploads/2016/04/w1-IMG_1213_jessenmottled.jpg)

The **Cassava** plant is the second biggest source of carbohydrates for the African population. The diseases affecting it are a distressing problem that modern techniques, such as Machine Learning, can help solve. Make sure to watch this inspiring [video](https://www.youtube.com/watch?v=NlpS-DhayQA) from the TensorFlow team that tackles this specific problem.

## What can you find in this notebook
- a Deep Neural Network classifier using TensorFlow
- for some simple **EDA**, make sure to check out [my other notebook](https://www.kaggle.com/grigorelucian/simple-eda-cassava-leaf-disease)

## Why use EfficientNets
### and not just another classic Convolutional Neural Network?

![](https://1.bp.blogspot.com/-oNSfIOzO8ko/XO3BtHnUx0I/AAAAAAAAEKk/rJ2tHovGkzsyZnCbwVad-Q3ZBnwQmCFsgCEwYBhgL/s1600/image3.png)

Well, the answer is simple and consists of two parts.\
**Firstly**, in the benchmarks reported by the EfficientNet paper, classic CNN architectures are overtaken by EfficientNets on most of the metrics compared. According to the paper, this new type of network:
* has fewer parameters
* achieves better accuracy on 5 out of 8 widely used transfer-learning datasets
* needs less computation (and has lower inference latency)

_Make sure to refer to [this article](https://arxiv.org/pdf/1905.11946.pdf)_

**Secondly**, I have personally observed that classic CNNs tend to plateau at some point on large datasets, at least without access to something like 12 GPUs and 64-core CPUs.
A previous model of mine reached no more than ~61% accuracy on the training set of this particular dataset, a level it had already hit before the first epoch ended. It was not able to improve on that by the end of training.

_You can find that model [here](https://www.kaggle.com/grigorelucian/classic-cnn-cassava-leaf-disease)_

## Let's get to work!

In [1]:
!pip show tensorflow

Name: tensorflow
Version: 2.3.1
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /opt/conda/lib/python3.7/site-packages
Requires: astunparse, tensorflow-estimator, google-pasta, numpy, six, opt-einsum, keras-preprocessing, wrapt, wheel, absl-py, tensorboard, grpcio, gast, termcolor, protobuf, h5py
Required-by: tensorflow-cloud, fancyimpute


In [2]:
# libraries
import os
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.data import TFRecordDataset
from tensorflow.keras import Model
from tensorflow.keras import Sequential
from tensorflow.keras.applications import EfficientNetB5
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.activations import swish
from tensorflow.keras.layers import GlobalAveragePooling2D
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.callbacks import LearningRateScheduler
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.callbacks import EarlyStopping

In [3]:
# constants
IMAGE_WIDTH = IMAGE_HEIGHT = 300
no_batches = 32
no_classes = 5
decay = 0.9
momentum = 0.9
batch_norm_momentum = 0.99
weight_decay = 1e-5
initial_lr = 0.256
lr_decay = 0.97
lr_decay_freq = 2.4 # epochs
dropout_rate = 0.4

In [4]:
# storing the tfrecords filenames
tfrecs_path = '../input/cassava-leaf-disease-classification/train_tfrecords'

# os.listdir() returns bare file names, so each one is joined with the folder path;
# a list comprehension gives a real list that can be reused later
records = [os.path.join(tfrecs_path, name) for name in os.listdir(tfrecs_path)]
for record in records:
    print(record)

../input/cassava-leaf-disease-classification/train_tfrecords/ld_train00-1338.tfrec
../input/cassava-leaf-disease-classification/train_tfrecords/ld_train12-1338.tfrec
../input/cassava-leaf-disease-classification/train_tfrecords/ld_train09-1338.tfrec
../input/cassava-leaf-disease-classification/train_tfrecords/ld_train05-1338.tfrec
../input/cassava-leaf-disease-classification/train_tfrecords/ld_train13-1338.tfrec
../input/cassava-leaf-disease-classification/train_tfrecords/ld_train07-1338.tfrec
../input/cassava-leaf-disease-classification/train_tfrecords/ld_train06-1338.tfrec
../input/cassava-leaf-disease-classification/train_tfrecords/ld_train01-1338.tfrec
../input/cassava-leaf-disease-classification/train_tfrecords/ld_train11-1338.tfrec
../input/cassava-leaf-disease-classification/train_tfrecords/ld_train14-1338.tfrec
../input/cassava-leaf-disease-classification/train_tfrecords/ld_train08-1338.tfrec
../input/cassava-leaf-disease-classification/train_tfrecords/ld_train03-1338.tfrec
../i
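
Every shard name in the listing ends in a number (`-1338`). Assuming that suffix is the number of examples stored in the shard (a common naming convention, but not something the listing itself confirms), the size of the dataset can be estimated from the names alone. A minimal sketch, where `count_examples` is a hypothetical helper:

```python
import re

def count_examples(filenames):
    """Sum the numeric suffix of shard names such as 'ld_train00-1338.tfrec'.

    Assumption: the number between '-' and '.tfrec' is the shard's example count.
    """
    total = 0
    for name in filenames:
        match = re.search(r"-(\d+)\.tfrec$", name)
        if match is None:
            raise ValueError(f"unexpected shard name: {name}")
        total += int(match.group(1))
    return total

shards = [
    "../input/cassava-leaf-disease-classification/train_tfrecords/ld_train00-1338.tfrec",
    "../input/cassava-leaf-disease-classification/train_tfrecords/ld_train12-1338.tfrec",
]
print(count_examples(shards))  # 2676
```

In the notebook the same function could be called on `records` to get a total for all shards.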

In [5]:
# # creating raw train dataset
# raw_dataset = TFRecordDataset(records)
# raw_dataset

Let's see what one entry looks like (only the beginning of the first entry is shown here)
> for entry in raw_dataset.take(1):\
>     print(repr(entry))

> <tf.Tensor: shape=(), dtype=string, numpy=b'\n\xc5\xef\x06\n\x1f\n\nimage_name\x12\x11\n\x0f\n\r499934842.jpg\n\x0f\n\x06target\x12\x05\x1a\x03\n\x01\x04\n\x8f\xef\x06\n\x05image\x12\x84\xef\x06\n\x80\xef\x06\n\xfc\xee\x06\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x01\x01,\x01,\x00\x00\xff\xdb\x00C\x00\x02\x01\x01\x01\x01\x01\x02\x01\x01\x01\x02\x02\x02\x02\x02\x04\x03\x02\x02\x02\x02\x05\x04\x04\x03\x04\x06\x05\x06\x06\x06\x05\x06\x06\x06\x07\t\x08\x06\x07\t\x07\x06\x06\x08\x0b\x08\t\n\n\n\n\n\x06\x08\x0b\x0c\x0b\n\x0c\t\n\n\n\xff\xdb\x00C\x01\x02\x02\x02\x02\x02\x02\x05\x03\x03\x05\n\x07\x06\x07\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\xff\xc0\x00\x11\x08\x02\x00\x02\x00\x03\x01\x11\x00\x02\x11\x01\x03\x11\x01\xff\xc4\x00\x1e\x00\x00\x02\x02\x03\x01\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x04\x05\x03\x06\x02\x07\x08\x01\x00\t\n\xff\xc4\x00O\x10\x00\x01\x03\x02\x04\x04\x04\x04\x04\x03\x06\x04\x04\x04\x02\x0b\x01\x02\x03\x04\x05\x11\x00\x06\x12!\x07\x131A\x08"Qa\x142q\x81\x15#B\x91R\xa1\xb1\t\x16$3b\xc1Cr\x82\xd14\xe1\xf0\xf1\x17%S\x92\xa25s\x18&Dc\xb2\'\x83\xa3\xc2\xd2\xff\xc4\x00\x1c\x01\x00\x02\x03\x01\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x03\x04\x01\x02\x05\x00\x06\x07\x08\xff\xc4\x009\x11\x00\x02\x02\x02\x02\x02\x02\x01\x04\x00\x05\x03\x02\x06\x02\x03\x01\x02\x00\x03\x04\x11\x12!\x051\x13"A\x06\x142Q#3Baq\x07\x15\x81$R\x16\x174\xa1\xc1\xf0%\x91C\xb1\xe1\xff\xda\x00\x0c\x03\x01\x00\x02\x11\x03\x11\x00?\x00\xef\xd4\xf8|r:\xd0\xec\xfa\xfcB\xa4u\xfc\xe4\xd8\x7f<|\xc5\xef\x13\xc3\xff\x00\xd8\xf0\x7f\xb8MK\x87T8\xd1\xd3\x19\x9c\xcb\r\xa1n\xef$\x8b\xf7\xef\x8c\xdb\xef\xb46\xd7\xd4\xb2\xf8\xcf\x1bWD\xc54\xac\x97\x95\xe2\xcaP\x9b\x9ei\xae%F\xe0%\xd1\xb7\xf3\xc2/\xe4\xb2\x14\xea2\x98>/\xff\x00tx\xceE\xcb\x08\x90\x87\xe2q\x02\x9c\x95\x84$\x84\x19(\xb0 
\xfdp\x0e\x18\xc8\xdb\xdcg\xf6\x9e7\xfb\x93\xd48=\x94\xb3J\xd0\x99\\H\x80\xa0\x8d\xd4\x94\xbe\x92\t\xfd\xf1\xb7\x8d\x9fEi\xf52\x1b\x07\xc6\xd85\xb8\xd6\x99\xe1\xc3#<\xd8z6y\x82\xee\x825$<\x91\xb5\xfe\xb8c\xfe\xe4%\xab\xf0\x9e1\xbd\x99t\xa0p\x92\x87L\xa6=O\xa6\xd7\xe2\xa3\x9c\x82\x95\x10\xe87\xf7\xc4\x8c\xca-;x\xdax\n5\xaa\x1c\x01\x11O\xf0\xa1^\x9b\xf9\xf034r\x95\x1d\x88#\x0fU\x99\x88\xbf\x99C\xfaJ\xcb\xbb\xf9\x04\xc1\xbf\x0c\x9cK\xa1\xb6]\x89Si\xe2>Ts6#\xd7\x14\xbb"\xab\x07F$\x7fJ\xdb[\x1f\xcc\x90\xf0\x97\x89pc\xf2\x1a\xa1\x95\xb8\xb4\xeaR\xf5\x02\x95c\x1e\xfc5\xb3\xd4\xb3x,\x85B$QrF}\xa58\x99K\xcbn4\xe2Up\xa4\xa4\xd8\x9cgY\x81r\xfa\x81\xab\xc6\xe4\xd3\xf8\x96\xac\xbfU\x

As we can see, one entry in these tfrecords is described through:
* target
* image_name
* image
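
Each of these entries is stored in the `.tfrec` file as one length-prefixed record. As a rough sketch of what `TFRecordDataset` reads under the hood, the raw serialized entries can be pulled out with plain Python. `iter_tfrecord` is a hypothetical helper; it assumes an uncompressed file and skips the checksums instead of verifying them.

```python
import struct

def iter_tfrecord(path):
    """Yield the raw serialized bytes of every record in a TFRecord file.

    Layout per record (little-endian): uint64 length, uint32 CRC of the length,
    `length` payload bytes, uint32 CRC of the payload.
    The CRCs are skipped, not verified, to keep the sketch short.
    """
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte length + 4-byte length CRC
            if not header:
                return  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            payload = f.read(length)
            footer = f.read(4)  # payload CRC, ignored here
            if len(payload) < length or len(footer) < 4:
                raise ValueError("truncated record")
            yield payload
```

Counting the entries of one shard is then `sum(1 for _ in iter_tfrecord(path))`, which gives a way to check how many images a file really holds.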

We now need to parse these entries.

In [6]:
# # let's parse these records
# features = {
#     'target': tf.io.FixedLenFeature([],
#                                     tf.int64,
#                                     default_value = 0),
#     'image_name': tf.io.FixedLenFeature([],
#                                         tf.string,
#                                         default_value = ''),
#     'image': tf.io.FixedLenFeature([],
#                                    tf.string,
#                                    default_value = '')}

# def _parse_function(proto):    
#     return tf.io.parse_single_example(proto, features)

# parsed_dataset = raw_dataset.map(_parse_function,
#                                  num_parallel_calls = 4)
# # parsed_dataset.repeat()
# # parsed_dataset.batch(no_batches)

Let's see what a parsed entry looks like
> for entry in parsed_dataset.take(1):\
>     print(repr(entry))

> {'image': <tf.Tensor: shape=(), dtype=string, numpy=b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x01\x01,\x01,\x00\x00\xff\xdb\x00C\x00\x02\x01\x01\x01\x01\x01\x02\x01\x01\x01\x02\x02\x02\x02\x02\x04\x03\x02\x02\x02\x02\x05\x04\x04\x03\x04\x06\x05\x06\x06\x06\x05\x06\x06\x06\x07\t\x08\x06\x07\t\x07\x06\x06\x08\x0b\x08\t\n\n\n\n\n\x06\x08\x0b\x0c\x0b\n\x0c\t\n\n\n\xff\xdb\x00C\x01\x02\x02\x02\x02\x02\x02\x05\x03\x03\x05\n\x07\x06\x07\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\xff\xc0\x00\x11\x08\x02\x00\x02\x00\x03\x01\x11\x00\x02\x11\x01\x03\x11\x01\xff\xc4\x00\x1e\x00\x00\x02\x02\x03\x01\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x04\x05\x03\x06\x02\x07\x08\x01\x00\t\n\xff\xc4\x00O\x10\x00\x01\x03\x02\x04\x04\x04\x04\x04\x03\x06\x04\x04\x04\x02\x0b\x01\x02\x03\x04\x05\x11\x00\x06\x12!\x07\x131A\x08"Qa\x142q\x81\x15#B\x91R\xa1\xb1\t\x16$3b\xc1Cr\x82\xd14\xe1\xf0\xf1\x17%S\x92\xa25s\x18&Dc\xb2\'\x83\xa3\xc2\xd2\xff\xc4\x00\x1c\x01\x00\x02\x03\x01\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x03\x04\x01\x02\x05\x00\x06\x07\x08\xff\xc4\x009\x11\x00\x02\x02\x02\x02\x02\x02\x01\x04\x00\x05\x03\x02\x06\x02\x03\x01\x02\x00\x03\x04\x11\x12!\x051\x13"A\x06\x142Q#3Baq\x07\x15\x81$R\x16\x174\xa1\xc1\xf0%\x91C\xb1\xe1\xff\xda\x00\x0c\x03\x01\x00\x02\x11\x03\x11\x00?\x00\xef\xd4\xf8|r:\xd0\xec\xfa\xfcB\xa4u\xfc\xe4\xd8\x7f<|\xc5\xef\x13\xc3\xff\x00\xd8\xf0\x7f\xb8MK\x87T8\xd1\xd3\x19\x9c\xcb\r\xa1n\xef$\x8b\xf7\xef\x8c\xdb\xef\xb46\xd7\xd4\xb2\xf8\xcf\x1bWD\xc54\xac\x97\x95\xe2\xcaP\x9b\x9ei\xae%F\xe0%\xd1\xb7\xf3\xc2/\xe4\xb2\x14\xea2\x98>/\xff\x00tx\xceE\xcb\x08\x90\x87\xe2q\x02\x9c\x95\x84$\x84\x19(\xb0 
\xfdp\x0e\x18\xc8\xdb\xdcg\xf6\x9e7\xfb\x93\xd48=\x94\xb3J\xd0\x99\\H\x80\xa0\x8d\xd4\x94\xbe\x92\t\xfd\xf1\xb7\x8d\x9fEi\xf52\x1b\x07\xc6\xd85\xb8\xd6\x99\xe1\xc3#<\xd8z6y\x82\xee\x825$<\x91\xb5\xfe\xb8c\xfe\xe4%\xab\xf0\x9e1\xbd\x99t\xa0p\x92\x87L\xa6=O\xa6\xd7\xe2\xa3\x9c\x82\x95\x10\xe87\xf7\xc4\x8c\xca-;x\xdax\n5\xaa\x1c\x01\x11O\xf0\xa1^\x9b\xf9\xf034r\x95\x1d\x88#\x0fU\x99\x88\xbf\x99C\xfaJ\xcb\xbb\xf9\x04\xc1\xbf\x0c\x9cK\xa1\xb6]\x89Si\xe2>Ts6#\xd7\x14\xbb"\xab\x07F$\x7fJ\xdb[\x1f\xcc\x90\xf0\x97\x89pc\xf2\x1a\xa1\x95\xb8\xb4\xeaR\xf5\x02\x95c\x1e\xfc5\xb3\xd4\xb3x,\x85B$QrF}\xa58\x99K\xcbn4\xe2Up\xa4\xa4\xd8\x9cgY\x81r\xfa\x81\xab\xc6\xe4\xd3\xf8\x96\xac\xbfU\xcf\xd1\x0f&\xa7\x05\xc7\x12~f\xe47\xb7\xf3\xc2\x8e\xdeC\x10\xee\xa1\xeeiQ~v7J=\xc7i\xe1\xfeI\xce\xcb13\x16Xm\x97Kw\x1f\x93a{\xfa\xfa\xe1\xca<\x8eH\x1f\xe2\x89\xa0q\xd3\xc

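Now we have, for each image, a dictionary with key:value pairs for each of the features! Awesome!\
Every value is a **Tensor**, but the `image` one still holds the raw JPEG-encoded bytes as a single string. Unfortunately, it cannot be fed into EfficientNets like that, as these expect decoded images of a fixed height and width, so it has to be decoded and resized first.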
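
That the `image` value is an encoded JPEG can be read off the dump itself: it starts with `\xff\xd8` followed by a `JFIF` segment, and further along the bytes `\xff\xc0\x00\x11\x08\x02\x00\x02\x00` open the frame header, which stores a height and a width of `0x0200` = 512 pixels. The sketch below walks the header segments to extract those dimensions without decoding the picture. `jpeg_size` is a hypothetical helper, and it assumes a well-formed header with no padding bytes between segments.

```python
import struct

# Start-of-frame markers (0xC0-0xCF) minus the three that mean something else
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}

def jpeg_size(data):
    """Return (height, width) read from the header of JPEG-encoded bytes."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing JPEG start-of-image marker")
    pos = 2
    while pos + 9 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a segment marker")
        marker = data[pos + 1]
        (segment_length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            # after the length field: 1 byte precision, 2 bytes height, 2 bytes width
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return height, width
        pos += 2 + segment_length  # 2 marker bytes + segment (length counts itself)
    raise ValueError("no start-of-frame segment found")

# First bytes of the entry shown above, with the two quantization tables left out
header = (b"\xff\xd8"
          b"\xff\xe0\x00\x10JFIF\x00\x01\x01\x01\x01,\x01,\x00\x00"
          b"\xff\xc0\x00\x11\x08\x02\x00\x02\x00\x03\x01\x11\x00\x02\x11\x01\x03\x11\x01")
print(jpeg_size(header))  # (512, 512)
```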

In [7]:
# in_image_tensor = []
# in_label_tensor = []
# for elem in parsed_dataset.take(-1):
# #     img = tf.io.decode_and_crop_jpeg(elem['image'],
# #                                      crop_window = 
# #                                      channels = 3)
#     #img = tf.image.resize(img, (IMAGE_HEIGHT, IMAGE_WIDTH, 3))
# #     img = elem['image']
#     img = tf.io.decode_jpeg(elem['image'], channels = 3)
#     img = tf.image.resize_with_pad(img,
#                                    target_width = IMAGE_WIDTH,
#                                    target_height = IMAGE_HEIGHT,
#                                    method = 'bicubic')
#     in_image_tensor.append(img)
#     lbl = elem['target']
#     in_label_tensor.append(lbl)
    
# print(len(in_label_tensor))
# print(len(in_image_tensor))
# print(in_label_tensor[0])
# print(in_image_tensor[0])

# # with tf.compat.v1.Session():
# #     print(in_image_tensor[0].numpy())

In [8]:
# function that parses one single serialized entry into an (image, label) pair
def _fparse(proto):
    features = {
        'target': tf.io.FixedLenFeature([],
                                        tf.int64,
                                        default_value = 0),
        'image_name': tf.io.FixedLenFeature([],
                                            tf.string,
                                            default_value = ''),
        'image': tf.io.FixedLenFeature([],
                                       tf.string,
                                       default_value = '')}
    
    parsed_entry = tf.io.parse_single_example(proto, features)
    image = tf.io.decode_jpeg(parsed_entry['image'], channels = 3)
    image = tf.image.resize_with_pad(image, target_width = IMAGE_WIDTH, target_height = IMAGE_HEIGHT)
    return image, parsed_entry['target']

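def gen_dataset(filenames, batch_size = no_batches):
    # note: no_batches holds the batch size (images per batch), not a number of batches
    dataset = TFRecordDataset(filenames)
    dataset = dataset.map(_fparse, num_parallel_calls = 4)
    dataset = dataset.repeat()
    dataset = dataset.batch(batch_size)
    # the one-shot iterator is advanced only once, so exactly one batch
    # (batch_size images) is pulled out of the dataset and returned as tensors
    iterator = tf.compat.v1.data.make_one_shot_iterator(dataset)
    image, label = iterator.get_next()
    image = tf.reshape(image, [batch_size, IMAGE_HEIGHT, IMAGE_WIDTH, 3])
    label = tf.one_hot(label, no_classes)
    return image, label

# in_image_tensor has shape (batch_size, IMAGE_HEIGHT, IMAGE_WIDTH, 3)
in_image_tensor, in_label_tensor = gen_dataset(records)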
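
`gen_dataset` advances its iterator once, so what it returns is a single batch of images as tensors. A possible alternative, sketched here under assumptions, keeps the whole pipeline as a `tf.data.Dataset` that Keras can iterate batch by batch. `make_dataset`, the `FEATURES` spec and the shuffle buffer of 512 are illustrative choices, not part of the original notebook.

```python
import tensorflow as tf

# Only the two features needed for training are parsed; image_name is ignored
FEATURES = {
    "target": tf.io.FixedLenFeature([], tf.int64),
    "image": tf.io.FixedLenFeature([], tf.string),
}

def make_dataset(filenames, batch_size=32, image_size=300, num_classes=5):
    """Build an endlessly repeating dataset of (image, one-hot label) batches."""

    def parse(proto):
        entry = tf.io.parse_single_example(proto, FEATURES)
        image = tf.io.decode_jpeg(entry["image"], channels=3)
        image = tf.image.resize_with_pad(image, image_size, image_size)
        label = tf.one_hot(entry["target"], num_classes)
        return image, label

    dataset = tf.data.TFRecordDataset(filenames)
    dataset = dataset.map(parse, num_parallel_calls=4)
    dataset = dataset.shuffle(512)  # hypothetical buffer size
    dataset = dataset.repeat()
    dataset = dataset.batch(batch_size)
    return dataset.prefetch(1)
```

Such a dataset would be passed to `fit` as `x` on its own, without `y` or `batch_size`, together with an integer `steps_per_epoch`, because the dataset repeats forever.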

In [9]:
# importing baseline network
base_model = EfficientNetB5(include_top = False,
                            weights = 'imagenet',
                            pooling = 'avg',
                            input_shape = (IMAGE_HEIGHT, IMAGE_WIDTH, 3))
for layer in base_model.layers:
    layer.trainable = True

# building model
classifier = Sequential()
classifier.add(base_model)
classifier.add(Dense(units = 256, activation = 'swish'))
classifier.add(Dropout(dropout_rate))
classifier.add(Dense(no_classes, activation = 'softmax'))

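# let's compile the model
# note: epsilon is RMSprop's numerical-stability constant; reusing weight_decay
# for it only supplies a small value and does not apply any weight decay
classifier.compile(optimizer = RMSprop(learning_rate = initial_lr,
                                       momentum = momentum,
                                       rho = decay,
                                       centered = True,
                                       epsilon = weight_decay),
                   loss = 'categorical_crossentropy',
                   metrics = ['accuracy'])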

classifier.summary()
# base_model = EfficientNetB5(include_top=False, weights='imagenet')
# x = base_model.output
# x = GlobalAveragePooling2D()(x)
# predictions = Dense(no_classes, activation='softmax')(x)
# model = Model(inputs=base_model.input, outputs=predictions)
# model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.01), loss='categorical_crossentropy', metrics=['accuracy'])

# model.fit(x = in_image_tensor,
#           y = in_label_tensor,
#         epochs=10,
#         steps_per_epoch=no_batches,    
#         verbose=1
#     )

Downloading data from https://storage.googleapis.com/keras-applications/efficientnetb5_notop.h5
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
efficientnetb5 (Functional)  (None, 2048)              28513527  
_________________________________________________________________
dense (Dense)                (None, 128)               262272    
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 645       
Total params: 28,776,444
Trainable params: 28,603,701
Non-trainable params: 172,743
_________________________________________________________________
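
The parameter counts in the summary can be checked by hand: a `Dense` layer has one weight per input-output pair plus one bias per unit. A small sketch of that arithmetic, using the numbers printed in the table (`dense_params` is a hypothetical helper):

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units

hidden = dense_params(2048, 128)  # EfficientNetB5 pooled output -> hidden layer
output = dense_params(128, 5)     # hidden layer -> 5 classes
total = 28_513_527 + hidden + output

print(hidden, output, total)  # 262272 645 28776444
```

The summary lists a 128-unit hidden layer; the same formula gives 524,544 parameters for a 256-unit one.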


In [10]:
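def _schedule(epoch, lr):
    '''
    We multiply the learning rate by 0.97 every 2.4 epochs, as suggested in the article.
    `epoch` is zero-based; the rate drops whenever a new 2.4-epoch interval begins.
    '''
    if epoch > 0 and epoch // lr_decay_freq > (epoch - 1) // lr_decay_freq:
        lr *= lr_decay
    return lr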

learning_rate_scheduler = LearningRateScheduler(_schedule,
                                                verbose = 1)

callbacks_used = [
    learning_rate_scheduler,
    EarlyStopping(monitor = 'accuracy',
                  patience = 3),
    ModelCheckpoint(filepath = 'cassava_model.h5',
                    monitor = 'accuracy',
                    save_best_only = True)]
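
The rule in the docstring, a decay of 0.97 every 2.4 epochs, can also be written in closed form. The sketch below uses a staircase reading of it (the rate drops once each time a full 2.4-epoch interval has elapsed), which is an assumption about how the rule is meant; `decayed_lr` is a hypothetical helper with the notebook's constants as defaults.

```python
import math

def decayed_lr(epoch, initial_lr=0.256, decay=0.97, every=2.4):
    """Staircase exponential decay: one multiplication by `decay` per `every` epochs."""
    return initial_lr * decay ** math.floor(epoch / every)

# Zero-based epochs: the first drop happens at epoch 3, the second at epoch 5
for epoch in (0, 2, 3, 5):
    print(epoch, round(decayed_lr(epoch), 6))
```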

In [None]:
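# in_image_tensor holds one batch of images, so this gives a single step per epoch;
# steps_per_epoch has to be an integer, hence the integer division
no_steps = max(1, len(in_image_tensor) // no_batches)
print(len(in_image_tensor))
# training our model
classifier.fit(x = in_image_tensor,
               y = in_label_tensor,
               epochs = 25,
               verbose = 1,
               callbacks = callbacks_used,
               batch_size = no_batches,
               steps_per_epoch = no_steps)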

32

Epoch 00001: LearningRateScheduler reducing learning rate to 0.25600001215934753.
Epoch 1/25
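
`steps_per_epoch` has to be a whole number of batches. For a dataset of known size it is usually the example count divided by the batch size, rounded up so that the final partial batch is included. A minimal sketch, where `steps_per_epoch` is a hypothetical helper and 1338 is used only as an example count:

```python
import math

def steps_per_epoch(num_examples, batch_size):
    """Number of batches needed to see every example once (last batch may be partial)."""
    if num_examples <= 0 or batch_size <= 0:
        raise ValueError("num_examples and batch_size must be positive")
    return math.ceil(num_examples / batch_size)

print(steps_per_epoch(1338, 32))  # 42
```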


*References:*
* http://digital-thinking.de/tensorflow-vs-keras-or-how-to-speed-up-your-training-for-image-data-sets-by-factor-10/
* https://arxiv.org/pdf/1905.11946.pdf