# Bengali.AI Handwritten Grapheme Classification
---
Author: Bhaveshkumar Thaker

---
Public Score: 0.9490

---

![](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F1095143%2Fa9a48686e3f385d9456b59bf2035594c%2Fdesc.png?generation=1576531903599785&alt=media)

Bengali is the 5th most spoken language in the world, with hundreds of millions of speakers. It’s the official language of Bangladesh and the second most spoken language in India. Given its reach, there’s significant business and educational interest in developing AI that can optically recognize images of handwritten Bengali. This challenge hopes to improve on approaches to Bengali recognition.

Optical character recognition is particularly challenging for Bengali. While Bengali has 49 letters in its alphabet (specifically, 11 vowels and 38 consonants), there are also 18 potential diacritics, or accents. This means there are many more graphemes, the smallest units in a written language. The added complexity results in ~13,000 different grapheme variations (compared to English’s 250 graphemic units).

## Classify the components of handwritten Bengali
This dataset contains images of individual handwritten Bengali characters. Bengali characters (graphemes) are written by combining three components: a `grapheme_root`, a `vowel_diacritic`, and a `consonant_diacritic`. Your challenge is to classify the components of the grapheme in each image. There are roughly 10,000 possible graphemes, of which roughly 1,000 are represented in the training set. The test set includes some graphemes that do not appear in the training set but has no new grapheme components. It takes a lot of volunteers filling out sheets like this to generate a useful amount of real data; focusing the problem on the grapheme components rather than on recognizing whole graphemes should make it possible to assemble a Bengali OCR system without handwriting samples for all 10,000 graphemes.
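
The class counts used later in this notebook (168 grapheme roots, 11 vowel diacritics, 7 consonant diacritics) make the benefit of component-wise classification concrete. The following is a small arithmetic sketch; it assumes every combination of components is possible, which is only an upper bound, and the product lands close to the ~13,000 figure quoted above.

```python
# Class counts taken from the one-hot encoding and output layers
# defined later in this notebook.
N_GRAPHEME_ROOT = 168
N_VOWEL_DIACRITIC = 11
N_CONSONANT_DIACRITIC = 7

# Upper bound on whole graphemes if every combination were valid
# (an assumption made for illustration only).
n_combinations = N_GRAPHEME_ROOT * N_VOWEL_DIACRITIC * N_CONSONANT_DIACRITIC

# Total number of outputs when each component is classified separately.
n_component_outputs = N_GRAPHEME_ROOT + N_VOWEL_DIACRITIC + N_CONSONANT_DIACRITIC

print(n_combinations)       # 12936
print(n_component_outputs)  # 186
```

Three small classifiers with 186 outputs in total therefore cover the same space as a single classifier with roughly 13,000 classes.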

## Files
**train.csv**

*   `image_id`: the foreign key for the parquet files
*   `grapheme_root`: the first of the three target classes
*   `vowel_diacritic`: the second target class
*   `consonant_diacritic`: the third target class
*   `grapheme`: the complete character. Provided for informational purposes only; you should not need to use it.

**(train/test).parquet**
Each parquet file contains tens of thousands of 137x236 grayscale images. The images have been provided in the parquet format for I/O and space efficiency. Each row in the parquet files contains an `image_id` column and the flattened image.
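
To make the row layout concrete, here is a minimal sketch that rebuilds 2-D images from flattened rows. It uses a small synthetic DataFrame instead of a real parquet file; the ids `Example_0` and `Example_1` and the random pixel values are made up for illustration, while the pixel column names (`"0"`, `"1"`, ...) follow the convention used by the readers later in this notebook.

```python
import numpy as np
import pandas as pd

HEIGHT, WIDTH = 137, 236  # image size stated above

# Synthetic stand-in for one parquet file: an image_id column plus
# HEIGHT * WIDTH pixel columns named "0", "1", ...
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(2, HEIGHT * WIDTH), dtype=np.uint8)
df = pd.DataFrame(pixels, columns=[str(i) for i in range(HEIGHT * WIDTH)])
df.insert(0, 'image_id', ['Example_0', 'Example_1'])  # hypothetical ids

# Separate the ids, then restore each flattened row to a 2-D image.
image_ids = df.pop('image_id')
images = df.to_numpy(dtype=np.uint8).reshape(-1, HEIGHT, WIDTH)

print(images.shape)  # (2, 137, 236)
```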

**class_map.csv**
Maps the class labels to the actual Bengali grapheme components.
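
As a sketch of how such a mapping can be queried, the example below builds a tiny stand-in table with the same columns this notebook reads from `class_map.csv` (`component_type`, `label`, `component`). The rows and the placeholder strings such as `root_1` are hypothetical; the real file holds Bengali characters in the `component` column.

```python
import pandas as pd

# Hypothetical miniature class map with placeholder component strings.
class_map = pd.DataFrame({
    'component_type': ['consonant_diacritic', 'grapheme_root',
                       'grapheme_root', 'vowel_diacritic'],
    'label': [0, 0, 1, 0],
    'component': ['consonant_0', 'root_0', 'root_1', 'vowel_0'],
})

# Index by (component_type, label) so each lookup is a single .loc call.
lookup = class_map.set_index(['component_type', 'label'])['component']

print(lookup.loc[('grapheme_root', 1)])  # root_1
```

Indexing once up front avoids repeating a boolean filter over the whole table for every label that needs decoding.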

# Setup notebook

In [None]:
#%%
import time
notebookstart = time.time()

import warnings
warnings.filterwarnings('ignore')

In [None]:
#%%
runningOnColab = False
runningOnKaggle = True
runningLocal = False

## Connect to Google Drive or Kaggle if required

In [None]:
#%%
root_dir = ''
'''
try:
    from google.colab import drive, files
    drive.mount('/content/drive')
    root_dir = '/content/drive/My Drive/Colab Notebooks/tensorflow-2-projects/'

    # To upload kaggle.json to mounted directory
    # files.upload();

    # Create directory for kaggle.json key file and copy file
    get_ipython().system('mkdir -p ~/.kaggle')
    # In my case, 'Colab Notebooks' directory already contains my kaggle.json
    get_ipython().system("cp '{root_dir}../kaggle.json' ~/.kaggle/")

    # Change the permission
    get_ipython().system('chmod 600 ~/.kaggle/kaggle.json')
except:
    print('No GOOGLE DRIVE connection. Using local dataset(s).')
'''

## Install libraries

## Import libraries

In [None]:
#%%
if runningOnColab == True:
    print('Enabling TensorFlow 2.x')
    get_ipython().run_line_magic('tensorflow_version', '2.x')

In [None]:
#%%
import os, sys, random, gc, shutil
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as mimage
import matplotlib.colors as mcolors
import seaborn as sns

import cv2

import tensorflow as tf

from collections import defaultdict
from PIL import Image, ImageFont
from skimage.transform import resize
from zipfile import ZipFile
from matplotlib import font_manager as fm
from tqdm.autonotebook import trange, tqdm

from sklearn.model_selection import train_test_split, KFold, StratifiedKFold

In [None]:
#%%
print(tf.version.VERSION)
print('GPU is', 'available' if
      tf.config.experimental.list_physical_devices('GPU') else 'NOT AVAILABLE')
device_name = tf.test.gpu_device_name()
print('Found GPU at: {}'.format(device_name))

## Setup defaults

In [None]:
#%%
seed = 321
random.seed(seed)
np.random.seed(seed)
tf.random.set_seed(seed)

# Config the matplotlib backend as plotting inline in IPython
get_ipython().run_line_magic('matplotlib', 'inline')

plt.figure(figsize=(14, 11))
plt.style.use('seaborn')

colors = mcolors.TABLEAU_COLORS
sns.set(
    rc={
        'figure.figsize': (14, 11),
        'figure.facecolor': 'whitesmoke',
        'axes.facecolor': 'whitesmoke',
        'grid.color': 'slategrey',
    })
# set_palette applies the palette globally; color_palette only returns it.
sns.set_palette(palette=list(colors.values()))

# Download and load data

In [None]:
#%%
if runningOnKaggle == True:
    root_dir = '/kaggle/input'
    for dirname, _, filenames in os.walk(root_dir):
        for filename in filenames:
            print(os.path.join(dirname, filename))

    root_dir = '/kaggle/input/bengaliai-cv19'
else:
    root_dir = ''

In [None]:
#%%
if runningOnColab == True:
    get_ipython().system('kaggle competitions download -c bengaliai-cv19')

In [None]:
#%%
if runningOnColab == True:
    get_ipython().system('kaggle datasets download -d kaushal2896/bengali-fonts')

In [None]:
#%%
if runningOnColab == True:
    get_ipython().system('kaggle datasets download -d kaushal2896/kalpurush-fonts')

In [None]:
#%%
if runningOnColab == True:
    if os.path.exists('train.csv'):
        print('Zip file is extracted already')
    else:
        print('Extracting the Zip file(s) ...')
        for file in tqdm(os.listdir()):
            if file.endswith('.zip'):
                with ZipFile(file, 'r') as datazipfile:
                    # The context manager closes the archive on exit.
                    datazipfile.extractall()
                print(f'{file} Zip file extracted successfully')

In [None]:
#%%
if runningOnKaggle == True:
    bengalifont = ImageFont.truetype(
        '/kaggle/input/kalpurush-fonts/kalpurush-2.ttf')
    fontproperties = fm.FontProperties(
        fname='/kaggle/input/kalpurush-fonts/kalpurush-2.ttf')
    print(fontproperties.get_name())
else:
    bengalifont = ImageFont.truetype(root_dir + 'kalpurush-2.ttf')
    fontproperties = fm.FontProperties(fname=root_dir + 'kalpurush-2.ttf')
    print(fontproperties.get_name())

### Load the data

In [None]:
#%%
df_traincsv = pd.read_csv(os.path.join(root_dir, 'train.csv'))

In [None]:
#%%
df_classmap = pd.read_csv(os.path.join(root_dir, 'class_map.csv'))

## Data insights

In [None]:
#%%
df_traincsv.shape

In [None]:
#%%
df_traincsv.head(10)

In [None]:
#%%
print(
    f'Number of unique grapheme_root: {df_traincsv["grapheme_root"].nunique()}'
)
print(
    f'Number of unique vowel_diacritic: {df_traincsv["vowel_diacritic"].nunique()}'
)
print(
    f'Number of unique consonant_diacritic: {df_traincsv["consonant_diacritic"].nunique()}'
)

In [None]:
#%%
df_classmap.shape

In [None]:
#%%
df_classmap.head(15)

In [None]:
#%%
df_classmap.tail(15)

In [None]:
#%%
df_classmap.loc[df_classmap['component_type'] == 'grapheme_root',
                'component'].values

In [None]:
#%%
df_classmap.loc[df_classmap['component_type'] == 'vowel_diacritic',
                'component'].values

In [None]:
#%%
df_classmap.loc[df_classmap['component_type'] == 'consonant_diacritic',
                'component'].values

## Plot data insights

In [None]:
#%%
sns.countplot(x=df_classmap['component_type'])
plt.xticks(
    np.arange(3), ('grapheme_root', 'vowel_diacritic', 'consonant_diacritic'))
plt.title('Bengali Language Component Type(s) - Distribution', fontsize=18)
plt.show()

In [None]:
#%%
sns.countplot(
    y=df_traincsv['grapheme_root'],
    order=df_traincsv['grapheme_root'].value_counts().index[:40],
    orient='h',
    palette=sns.color_palette())
plt.xticks(rotation=45)
plt.show()

In [None]:
#%%
sns.countplot(
    y=df_traincsv['vowel_diacritic'],
    order=df_traincsv['vowel_diacritic'].value_counts().index[:15],
    orient='h',
    palette=sns.color_palette())
plt.xticks(rotation=45)
plt.show()

In [None]:
#%%
sns.countplot(
    y=df_traincsv['consonant_diacritic'],
    order=df_traincsv['consonant_diacritic'].value_counts().index[:10],
    orient='h')
plt.xticks(rotation=45)
plt.show()

# Data preprocessing

## Check for missing data

In [None]:
#%%
df_traincsv.isnull().sum(axis=0)

In [None]:
#%%
df_classmap.isnull().sum(axis=0)

## Drop non-required data column(s)

In [None]:
#%%
df_traincsv.drop(labels=['image_id',  'grapheme'], axis=1, inplace=True)
gc.collect()

# Build and train the model

## Define defaults

In [None]:
#%%
ORIGINAL_IMAGE_HEIGHT = 137
ORIGINAL_IMAGE_WIDTH = 236

IMAGE_SIZE = 64

NO_OF_CHANNELS = 1

## Define generic methods

In [None]:
#%%
def sizeof_fmt(num, suffix='B'):
    ''' by Fred Cirera,  https://stackoverflow.com/a/1094933/1870254, modified'''
    for unit in ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei', 'Zi']:
        if abs(num) < 1024.0:
            return "%3.1f %s%s" % (num, unit, suffix)
        num /= 1024.0
    return "%.1f %s%s" % (num, 'Yi', suffix)


def merge_dict(dict_list):
    dd = defaultdict(list)
    for d in dict_list:
        for key, value in d.items():
            if not hasattr(value, '__iter__'):
                value = (value, )
            dd[key].extend(value)
    return dict(dd)

In [None]:
#%%
class AllImageDataReader():
    def __init__(self):
        pass

    def getImages(in_image_height,
                  in_image_width,
                  out_image_size=128,
                  n_channels=3,
                  return_index=False):
        # Allocate using the requested output shape instead of the global
        # IMAGE_SIZE so the function honours its own arguments.
        df_trainimagedata = np.empty(
            (0, out_image_size, out_image_size, n_channels), dtype=np.uint8)
        indexdata = []  # image_id Series collected from every parquet file
        for i in tqdm(range(4), desc='Reading parquet'):
            if return_index == True:
                df_imagedata_tmp = pd.read_parquet(
                    os.path.join(root_dir,
                                 'train_image_data_{}.parquet'.format(i)))
                indexdata.append(df_imagedata_tmp.pop('image_id'))
            else:
                df_imagedata_tmp = pd.read_parquet(
                    os.path.join(root_dir,
                                 'train_image_data_{}.parquet'.format(i)),
                    columns=[str(x) for x in range(32332)])
            # Invert the pixel values and restore the 2-D image shape.
            df_imagedata_tmp = 255 - df_imagedata_tmp.values.reshape(
                -1, in_image_height, in_image_width).astype(np.uint8)
            imgarr_tmp = []
            for row in tqdm(
                    range(df_imagedata_tmp.shape[0]), desc='Loading images'):
                imgarr = cv2.resize(
                    df_imagedata_tmp[row], (out_image_size, out_image_size),
                    interpolation=cv2.INTER_CUBIC)
                # Zero out faint pixels below the threshold.
                imgarr[imgarr < 28] = 0
                imgarr_tmp.append(imgarr)
            df_trainimagedata = np.vstack(
                (df_trainimagedata,
                 np.asarray(imgarr_tmp, dtype=np.uint8).reshape(
                     -1, out_image_size, out_image_size, n_channels)))
            del imgarr_tmp
            del df_imagedata_tmp
            gc.collect()

        print(
            f'Returning images of shape {df_trainimagedata.shape}, type {type(df_trainimagedata)} of {df_trainimagedata.dtype}'
        )
        if return_index == True:
            # Return the ids from all files, not only the last one read.
            return pd.concat(indexdata, ignore_index=True), df_trainimagedata
        else:
            return df_trainimagedata


df_trainimagedata = AllImageDataReader.getImages(
    in_image_height=ORIGINAL_IMAGE_HEIGHT,
    in_image_width=ORIGINAL_IMAGE_WIDTH,
    out_image_size=IMAGE_SIZE,
    n_channels=NO_OF_CHANNELS,
    return_index=False)

Y_consonant_diacritic = tf.keras.utils.to_categorical(
    df_traincsv['consonant_diacritic'], 7)
Y_grapheme_root = tf.keras.utils.to_categorical(df_traincsv['grapheme_root'],
                                                168)
Y_vowel_diacritic = tf.keras.utils.to_categorical(
    df_traincsv['vowel_diacritic'], 11)

*   [ImageDataGenerator for multiple output and single input](https://github.com/keras-team/keras/issues/12639#issuecomment-506338552)
*   [How to use fit_generator with multiple outputs of different type](https://stackoverflow.com/a/41872896)
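
The generator in the next cell works around the single-target `flow` of `ImageDataGenerator` by concatenating the one-hot targets of all outputs into one array and slicing them apart again for each batch. The sketch below shows only that pack/unpack step in NumPy; the output names `head_a` and `head_b` and their values are hypothetical.

```python
import numpy as np

# Hypothetical one-hot targets for a batch of 2 samples.
y = {
    'head_a': np.array([[1, 0, 0], [0, 1, 0]]),  # 3 classes
    'head_b': np.array([[0, 1], [1, 0]]),        # 2 classes
}

# Pack: concatenate along the class axis and remember each head's width.
names = list(y)
widths = [y[name].shape[1] for name in names]
packed = np.concatenate([y[name] for name in names], axis=1)

# Unpack: cut the packed array at the cumulative widths.
pieces = np.split(packed, np.cumsum(widths)[:-1], axis=1)
unpacked = dict(zip(names, pieces))

print(packed.shape)  # (2, 5)
```

Because the rows stay aligned through the concatenation, any shuffling applied to the packed array keeps every output's label attached to the right image.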

In [None]:
#%%
class MultiOutputDataGenerator(
        tf.keras.preprocessing.image.ImageDataGenerator):
    def flow(
            self,
            x,
            y=None,
            batch_size=32,
            shuffle=True,
            sample_weight=None,
            seed=None,
            save_to_dir=None,
            save_prefix='',
            save_format='png',
            subset=None,
    ):

        targets = None
        target_lengths = {}
        ordered_outputs = []

        for (output, target) in y.items():
            if targets is None:
                targets = target
            else:
                targets = np.concatenate((targets, target), axis=1)

            target_lengths[output] = target.shape[1]
            ordered_outputs.append(output)

        # Forward the seed so shuffling and augmentation are reproducible.
        for (flowx, flowy) in super().flow(
                x, targets, batch_size=batch_size, shuffle=shuffle, seed=seed):

            target_dict = {}
            i = 0

            for output in ordered_outputs:
                target_length = target_lengths[output]
                target_dict[output] = flowy[:, i:i + target_length]
                i += target_length

            yield (flowx, target_dict)


datagenerator = MultiOutputDataGenerator(
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    rotation_range=20,
    shear_range=0.1,
    fill_mode="nearest")

In [None]:
#%%
class ConvolutionModelGenerator():
    def __init__(self):
        pass

    def buildModel(image_size=128, n_channels=1):
        base_model_input = tf.keras.layers.Input(
            shape=(image_size, image_size, n_channels))
        base_model = tf.keras.layers.Conv2D(
            filters=64, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model_input)
        base_model = tf.keras.layers.ZeroPadding2D((1, 1))(base_model)
        base_model = tf.keras.layers.Conv2D(
            filters=64, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.BatchNormalization(
            momentum=0.15)(base_model)
        base_model = tf.keras.layers.MaxPool2D(pool_size=(2, 2))(base_model)
        base_model = tf.keras.layers.Conv2D(
            filters=64, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.Dropout(rate=0.35)(base_model)

        base_model = tf.keras.layers.Conv2D(
            filters=128, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.ZeroPadding2D((1, 1))(base_model)
        base_model = tf.keras.layers.Conv2D(
            filters=128, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.BatchNormalization(
            momentum=0.15)(base_model)
        base_model = tf.keras.layers.MaxPool2D(pool_size=(2, 2))(base_model)
        base_model = tf.keras.layers.Conv2D(
            filters=128, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.BatchNormalization(
            momentum=0.15)(base_model)
        base_model = tf.keras.layers.Dropout(rate=0.35)(base_model)

        base_model = tf.keras.layers.Conv2D(
            filters=256, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.ZeroPadding2D((1, 1))(base_model)
        base_model = tf.keras.layers.Conv2D(
            filters=256, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.BatchNormalization(
            momentum=0.15)(base_model)
        base_model = tf.keras.layers.MaxPool2D(pool_size=(2, 2))(base_model)
        base_model = tf.keras.layers.Conv2D(
            filters=256, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.BatchNormalization(
            momentum=0.15)(base_model)
        base_model = tf.keras.layers.Dropout(rate=0.35)(base_model)

        base_model = tf.keras.layers.Conv2D(
            filters=512, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.ZeroPadding2D((1, 1))(base_model)
        base_model = tf.keras.layers.Conv2D(
            filters=512, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.BatchNormalization(
            momentum=0.15)(base_model)
        base_model = tf.keras.layers.MaxPool2D(pool_size=(2, 2))(base_model)
        base_model = tf.keras.layers.Conv2D(
            filters=512, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.BatchNormalization(
            momentum=0.15)(base_model)
        base_model = tf.keras.layers.Dropout(rate=0.35)(base_model)

        base_model = tf.keras.layers.Conv2D(
            filters=512, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.ZeroPadding2D((1, 1))(base_model)
        base_model = tf.keras.layers.Conv2D(
            filters=512, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.BatchNormalization(
            momentum=0.15)(base_model)
        base_model = tf.keras.layers.MaxPool2D(pool_size=(2, 2))(base_model)
        base_model = tf.keras.layers.Conv2D(
            filters=512, kernel_size=(3, 3), padding='same',
            activation='relu')(base_model)
        base_model = tf.keras.layers.BatchNormalization(
            momentum=0.15)(base_model)
        base_model = tf.keras.layers.Dropout(rate=0.35)(base_model)

        base_model = tf.keras.layers.Flatten()(base_model)
        base_model = tf.keras.layers.Dense(2048, activation="relu")(base_model)
        base_model = tf.keras.layers.Dropout(rate=0.35)(base_model)
        base_model = tf.keras.layers.Dense(1024, activation="relu")(base_model)
        base_model = tf.keras.layers.Dropout(rate=0.35)(base_model)
        base_model = tf.keras.layers.Dense(512, activation="relu")(base_model)

        prediction_layer_consonant_diacritic = tf.keras.layers.Dense(
            units=7, activation='softmax',
            name='consonant_diacritic')(base_model)
        prediction_layer_grapheme_root = tf.keras.layers.Dense(
            units=168, activation='softmax', name='grapheme_root')(base_model)
        prediction_layer_vowel_diacritic = tf.keras.layers.Dense(
            units=11, activation='softmax', name='vowel_diacritic')(base_model)

        model = tf.keras.models.Model(
            inputs=base_model_input,
            outputs=[
                prediction_layer_consonant_diacritic,
                prediction_layer_grapheme_root,
                prediction_layer_vowel_diacritic
            ])

        model.compile(
            optimizer='adam',
            loss='categorical_crossentropy',
            metrics=['accuracy'])

        plot_model = tf.keras.utils.plot_model(
            model,
            to_file='model.png',
            show_shapes=True,
            show_layer_names=True,
            rankdir='TB',
            expand_nested=True,
            dpi=96)

        # model.summary() prints the layer table and returns None.
        return model, model.summary(), plot_model


(model, summary, plot) = ConvolutionModelGenerator.buildModel(
    image_size=IMAGE_SIZE, n_channels=NO_OF_CHANNELS)

In [None]:
#%%
plot

## Verify data

In [None]:
#%%
no_of_images = 7
randomint = [
    random.randrange(0, df_trainimagedata.shape[0])
    for _ in range(no_of_images * no_of_images)
]

for i in range(no_of_images * no_of_images):
    plt.subplot(no_of_images, no_of_images, i + 1)
    plt.imshow(X=df_trainimagedata[randomint[i]].reshape((IMAGE_SIZE,
                                                          IMAGE_SIZE)))
    gr = df_classmap.loc[(df_classmap['component_type'] == 'grapheme_root') & (
        df_classmap['label'] == df_traincsv.iloc[randomint[i]]['grapheme_root']
    )]['component'].values[0]
    vd = df_classmap.loc[
        (df_classmap['component_type'] == 'vowel_diacritic')
        & (df_classmap['label'] == df_traincsv.iloc[randomint[i]]
           ['vowel_diacritic'])]['component'].values[0]
    cd = df_classmap.loc[
        (df_classmap['component_type'] == 'consonant_diacritic')
        & (df_classmap['label'] == df_traincsv.iloc[randomint[i]]
           ['consonant_diacritic'])]['component'].values[0]
    plt.title(f'{gr}  {vd}  {cd}', fontproperties=fontproperties, fontsize=18)
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])

plt.tight_layout()
plt.show()

del (no_of_images, randomint, gr, vd, cd, i)
gc.collect()

In [None]:
#%%
for name, size in sorted(
    ((name, sys.getsizeof(value)) for name, value in globals().items()),
        key=lambda x: -x[1])[:10]:
    print("{:>30}: {:>8}".format(name, sizeof_fmt(size)))

## Training the model in KFold
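
K-fold cross-validation partitions the sample indices into `n_splits` folds and uses each fold once for validation while training on the rest. The following is a NumPy-only sketch of that bookkeeping with a toy sample count and no shuffling; it illustrates the idea and is not scikit-learn's implementation.

```python
import numpy as np

n_samples, n_splits = 10, 3  # toy sample count for illustration
indices = np.arange(n_samples)

# Contiguous, unshuffled folds; the earlier folds absorb the remainder.
folds = np.array_split(indices, n_splits)

for k, val_idx in enumerate(folds):
    # Everything outside the current fold is used for training.
    train_idx = np.setdiff1d(indices, val_idx)
    print(k, train_idx, val_idx)
```

Every sample appears in exactly one validation fold, so each image is used for validation once across the splits.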

In [None]:
#%%
n_splits = 3
epochs = 100
batch_size = 128

histories = []

model_name = 'bengaliai_classification_model.h5'

modelcheckpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath=model_name,
    monitor='val_grapheme_root_accuracy',
    save_best_only=True,
    save_weights_only=False,
    verbose=1)
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor='val_grapheme_root_accuracy',
    patience=2,
    restore_best_weights=True)
reduceLR_grapheme_root = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='grapheme_root_accuracy',
    patience=2,
    verbose=1,
    factor=0.15,
    min_lr=1e-5)  #0.00001
reduceLR_vowel_diacritic = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='vowel_diacritic_accuracy',
    patience=2,
    verbose=1,
    factor=0.15,
    min_lr=1e-5)
reduceLR_consonant_diacritic = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='consonant_diacritic_accuracy',
    patience=2,
    verbose=1,
    factor=0.15,
    min_lr=1e-5)

for train_index, test_index in tqdm(
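        # shuffle=True is needed for random_state to take effect; recent
        # scikit-learn versions raise an error when it is set without shuffling.
        KFold(n_splits, shuffle=True, random_state=seed).split(df_trainimagedata),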
        total=n_splits,
        desc="KFold"):
    x_train, x_test = df_trainimagedata[train_index], df_trainimagedata[
        test_index]
    cd_train, cd_test = Y_consonant_diacritic[
        train_index], Y_consonant_diacritic[test_index]
    gr_train, gr_test = Y_grapheme_root[train_index], Y_grapheme_root[
        test_index]
    vd_train, vd_test = Y_vowel_diacritic[train_index], Y_vowel_diacritic[
        test_index]

    history = model.fit(
        datagenerator.flow(
            x_train, {
                'consonant_diacritic': cd_train,
                'grapheme_root': gr_train,
                'vowel_diacritic': vd_train
            },
            batch_size=batch_size,
            seed=seed),
        epochs=epochs,
        steps_per_epoch=x_train.shape[0] // batch_size,
        validation_data=(x_test, [cd_test, gr_test, vd_test]),
        callbacks=[
            modelcheckpoint, early_stopping, reduceLR_grapheme_root,
            reduceLR_vowel_diacritic, reduceLR_consonant_diacritic
        ],
        verbose=2)
    histories.append(history)

    del (x_train, cd_train, gr_train, vd_train, x_test, cd_test, gr_test,
         vd_test)
    gc.collect()

In [None]:
#%%
history_list = [h.history for h in histories]
finalhistory = merge_dict(history_list)
pd.DataFrame(finalhistory).to_csv('history.csv', index=False)

In [None]:
#%%
del (df_trainimagedata)
gc.collect()

## Plot model performances

### Plot model loss performances

In [None]:
#%%
plt.subplot(2, 2, 1)
plt.plot(finalhistory['loss'])
plt.plot(finalhistory['val_loss'])
plt.title('Model Loss', fontsize=18)
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper right')

plt.subplot(2, 2, 2)
plt.plot(finalhistory['grapheme_root_loss'])
plt.plot(finalhistory['val_grapheme_root_loss'])
plt.title('Model Loss (grapheme_root)', fontsize=18)
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper right')

plt.subplot(2, 2, 3)
plt.plot(finalhistory['vowel_diacritic_loss'])
plt.plot(finalhistory['val_vowel_diacritic_loss'])
plt.title('Model Loss (vowel_diacritic)', fontsize=18)
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper right')

plt.subplot(2, 2, 4)
plt.plot(finalhistory['consonant_diacritic_loss'])
plt.plot(finalhistory['val_consonant_diacritic_loss'])
plt.title('Model Loss (consonant_diacritic)', fontsize=18)
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper right')

plt.tight_layout()
plt.show()

### Plot model accuracy performances

In [None]:
#%%
plt.subplot(2, 2, 1)
plt.plot(finalhistory['grapheme_root_accuracy'])
plt.plot(finalhistory['val_grapheme_root_accuracy'])
plt.title('Model Accuracy (grapheme_root)', fontsize=18)
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')

plt.subplot(2, 2, 2)
plt.plot(finalhistory['vowel_diacritic_accuracy'])
plt.plot(finalhistory['val_vowel_diacritic_accuracy'])
plt.title('Model Accuracy (vowel_diacritic)', fontsize=18)
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')

plt.subplot(2, 2, 3)
plt.plot(finalhistory['consonant_diacritic_accuracy'])
plt.plot(finalhistory['val_consonant_diacritic_accuracy'])
plt.title('Model Accuracy (consonant_diacritic)', fontsize=18)
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')

plt.tight_layout()
plt.show()

# Load best model and make predictions

## Load best model

In [None]:
#%%
if runningOnKaggle == True:
    model_name = '/kaggle/input/bengaliai-grapheme-classification-64x64-kfold/bengaliai_classification_model_64_conv_allimgatonce_v10.h5'


model = tf.keras.models.load_model(model_name)

## Load test data
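
The reader defined in the next cell applies the same preprocessing as the training reader: invert the pixel values, resize, and zero out values below 28. The sketch below shows the inversion and thresholding on a tiny made-up patch; it skips the resize step, and the pixel values are hypothetical.

```python
import numpy as np

THRESHOLD = 28  # same cut-off as the readers in this notebook

# Hypothetical 2x3 grayscale patch with high and low pixel values.
raw = np.array([[255, 250, 10],
                [240, 0, 128]], dtype=np.uint8)

inverted = 255 - raw                 # high values become low and vice versa
inverted[inverted < THRESHOLD] = 0   # suppress faint residual values

print(inverted.tolist())  # [[0, 0, 245], [0, 255, 127]]
```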

In [None]:
#%%
class ImageDataReader():
    def __init__(self):
        pass

    def getImages(filepath,
                  in_image_height,
                  in_image_width,
                  out_image_size=128,
                  n_channels=3,
                  return_index=False):
        df_imagedata = np.empty(
            (0, out_image_size, out_image_size, n_channels), dtype=np.uint8)
        if return_index == True:
            df_imagedata_tmp = pd.read_parquet(filepath)
            df_indexdata = df_imagedata_tmp.pop('image_id')
        else:
            df_imagedata_tmp = pd.read_parquet(
                filepath, columns=[str(x) for x in range(32332)])
        df_imagedata_tmp = 255 - df_imagedata_tmp.values.reshape(
            -1, in_image_height, in_image_width).astype(np.uint8)
        imgarr_tmp = []
        for row in tqdm(
                range(df_imagedata_tmp.shape[0]), desc='Loading images'):
            imgarr = cv2.resize(
                df_imagedata_tmp[row], (out_image_size, out_image_size),
                interpolation=cv2.INTER_CUBIC)
            imgarr[imgarr < 28] = 0
            imgarr_tmp.append(imgarr)
        df_imagedata = np.vstack(
            (df_imagedata, np.asarray(imgarr_tmp, dtype=np.uint8).reshape(
                -1, out_image_size, out_image_size, n_channels)))
        del imgarr_tmp
        del df_imagedata_tmp
        gc.collect()

        print(
            f'Returning images of shape {df_imagedata.shape}, type {type(df_imagedata)} of {df_imagedata.dtype}'
        )

        if return_index == True:
            return df_indexdata, df_imagedata
        else:
            return df_imagedata

## Submit predictions
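
The submission built in the next cell has one row per image and component, with a `row_id` of the form `<image_id>_<component>` and the predicted class index as `target`. A minimal sketch of that layout follows; the image ids and predicted class indices are made up for illustration.

```python
components = ['consonant_diacritic', 'grapheme_root', 'vowel_diacritic']

# Hypothetical predictions for two test images (class indices per component).
image_ids = ['Test_0', 'Test_1']
predictions = {
    'consonant_diacritic': [0, 2],
    'grapheme_root': [3, 93],
    'vowel_diacritic': [0, 2],
}

# One (row_id, target) pair per image and component, grouped by image.
rows = [
    (f'{image_id}_{component}', predictions[component][k])
    for k, image_id in enumerate(image_ids)
    for component in components
]

for row_id, target in rows:
    print(row_id, target)
```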

In [None]:
#%%
preds_dict = {
    'consonant_diacritic': [],
    'grapheme_root': [],
    'vowel_diacritic': []
}
components = ['consonant_diacritic', 'grapheme_root', 'vowel_diacritic']

target = []  # model predictions placeholder
row_id = []  # row_id placeholder

for i in trange(4):
    df_testindexdata, df_testimagedata = ImageDataReader.getImages(
        filepath=os.path.join(root_dir,
                              'test_image_data_{}.parquet'.format(i)),
        in_image_height=ORIGINAL_IMAGE_HEIGHT,
        in_image_width=ORIGINAL_IMAGE_WIDTH,
        out_image_size=IMAGE_SIZE,
        n_channels=NO_OF_CHANNELS,
        return_index=True)

    preds = model.predict(df_testimagedata)

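    # The model outputs follow the same order as `components`.
    for j, component in enumerate(components):
        preds_dict[component] = np.argmax(preds[j], axis=1)

    # Emit one row per image and component, avoiding shadowed names.
    for k, image_id in enumerate(df_testindexdata.values):
        for component in components:
            row_id.append(image_id + '_' + component)
            target.append(preds_dict[component][k])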
    del df_testindexdata
    del df_testimagedata
    gc.collect()

df_sample = pd.DataFrame(
    {
        'row_id': row_id,
        'target': target
    }, columns=['row_id', 'target'])
df_sample.to_csv('submission.csv', index=False)
df_sample.head(12)

In [None]:
#%%
print("Notebook Runtime: %0.2f Minutes" % ((time.time() - notebookstart) / 60))