> **Problem overview**

"Quick, Draw!" was released as an experimental game to educate the public in a playful way about how AI works. The game prompts users to draw an image depicting a certain category, such as ”banana,” “table,” etc. The game generated more than 1B drawings, of which a subset was publicly released as the basis for this competition’s training set. That subset contains 50M drawings encompassing 340 label categories.

Sounds fun, right? Here's the challenge: since the training data comes from the game itself, drawings can be incomplete or may not match the label. You’ll need to build a recognizer that can effectively learn from this noisy data and perform well on a manually-labeled test set from a different distribution.

Your task is to build a better classifier for the existing Quick, Draw! dataset. By advancing models on this dataset, Kagglers can improve pattern recognition solutions more broadly. This will have an immediate impact on handwriting recognition and its robust applications in areas including OCR (Optical Character Recognition), ASR (Automatic Speech Recognition) & NLP (Natural Language Processing).

In [None]:
# import python standard library
import json, gc, os

# import data manipulation library
import numpy as np
import pandas as pd

# import data visualization library
import matplotlib.pyplot as plt
from tqdm import tqdm

# import image processing library
import cv2

# import tensorflow model class
from tensorflow import keras
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPooling2D
from tensorflow.keras.models import load_model, Sequential

# import sklearn model selection
from sklearn.model_selection import train_test_split

# import tensorflow model evaluation classification metrics
from tensorflow.keras.metrics import top_k_categorical_accuracy

In [None]:
# numpy options
np.random.seed(seed=58)

> **Acquiring training and testing data**

We start by loading the training and testing datasets into pandas DataFrames.

In [None]:
# list training and testing data directory
os.listdir('../input/')

In [None]:
# acquiring training and testing data
df_train = pd.read_csv('../input/quick-draw-doodle-recognition-challenge-shufflecsv/train_k0.csv.gz', nrows=100)
df_test = pd.read_csv('../input/quickdraw-doodle-recognition/test_simplified.csv', nrows=2)

In [None]:
# visualize head of the training data
df_train.head(n=5)

In [None]:
# visualize tail of the testing data
df_test.tail(n=5)

In [None]:
# dataframe columns name
names = ['countrycode', 'drawing', 'key_id', 'recognized', 'timestamp', 'word']

# class files and dictionary
files = sorted(os.listdir('../input/quickdraw-doodle-recognition/train_simplified/'), reverse=False)
class_dict = {file[:-4].replace(" ", "_"): i for i, file in enumerate(files)}
classreverse_dict = {v: k for k, v in class_dict.items()}

# combine training and testing dataframe
df_train = df_train.drop(['shuffle'], axis=1)
df_train['datatype'], df_test['datatype'] = 'training', 'testing'
df_train = df_train[['key_id', 'countrycode', 'drawing', 'datatype', 'word', 'recognized']]
df_test['word'], df_test['recognized'] = '', True
df_data = pd.concat([df_train, df_test], ignore_index=True)

In [None]:
# data dimensions
chunksize = 680
img_size = 64
num_channels = 1
num_classes = 340
num_shuffles = 50

# flat dimensions
img_size_flat = img_size * img_size * num_channels

> **Feature exploration, engineering and cleansing**

Here we generate descriptive statistics that summarize the central tendency, dispersion and shape of the dataset’s distribution, and we explore a few sample drawings.
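
As a minimal sketch of the kind of summary meant here, the snippet below parses a few drawings and reports stroke and point counts. The two sample drawings are made up for illustration. It assumes each drawing is a JSON list of strokes of the form `[[x0, x1, ...], [y0, y1, ...]]`, which is the layout the plotting and rasterizing functions in this notebook rely on.

```python
import json
import numpy as np

# hypothetical sample: two drawings in the simplified stroke format
samples = [
    '[[[0, 10, 20], [5, 5, 5]], [[20, 20], [5, 30]]]',  # two strokes, 3 + 2 points
    '[[[3, 7, 9, 12], [1, 4, 4, 8]]]',                   # one stroke, 4 points
]
drawings = [json.loads(s) for s in samples]

# number of strokes and total number of points per drawing
num_strokes = np.array([len(d) for d in drawings])
num_points = np.array([sum(len(stroke[0]) for stroke in d) for d in drawings])

# central tendency and dispersion of both counts
for name, values in [('strokes', num_strokes), ('points', num_points)]:
    print('%s: mean=%.2f, std=%.2f, min=%d, max=%d'
          % (name, values.mean(), values.std(), values.min(), values.max()))
```

The same counts could be stored as extra DataFrame columns and passed to `describe()`.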

In [None]:
def drawplot(draw: pd.Series, label: pd.Series, figsize: tuple = (4, 3), ncols: int = 5, nrows: int = None) -> plt.figure:
    """ Return a draw image plot applied for an image data in vector format.
    
    Args:
        draw (pd.Series): The draw image data, one list of strokes per row.
        label (pd.Series): The labels of the image data, sharing the index of `draw`.
        figsize (tuple): The matplotlib figure size width and height in inches. Default to (4, 3).
        ncols (int): The number of columns for axis in the figure. Default to 5.
        nrows (int): The number of rows for axis in the figure. Default to None.
    
    Returns:
        plt.figure: The plot figure.
    """
    
    if nrows is None: nrows = (len(label) - 1) // ncols + 1
    
    fig, axes = plt.subplots(figsize=(figsize[0]*ncols , figsize[1]*nrows), ncols=ncols, nrows=nrows)
    axes = axes.flatten()
    for i in label.index:
        for j in range(len(draw[i])): _ = axes[i - label.index[0]].plot(draw[i][j][0], draw[i][j][1])
        axes[i - label.index[0]].invert_yaxis()
        axes[i - label.index[0]].set_title(label[i])
    return fig

In [None]:
def imageplot(image: list, label: list, size: tuple, figsize: tuple = (4, 3), ncols: int = 5, nrows: int = None) -> plt.figure:
    """ Return an image plot applied for an image data in grayscale picture (m, n) format, RGB picture (m, n, 3) format and RGBA picture (m, n, 4) format.
    
    Args:
        image (list): The image data.
        label (list): The label of an image data.
        size (tuple): The tuple of an image size.
        figsize (tuple): The matplotlib figure size width and height in inches. Default to (4, 3).
        ncols (int): The number of columns for axis in the figure. Default to 5.
        nrows (int): The number of rows for axis in the figure. Default to None.
    
    Returns:
        plt.figure: The plot figure.
    """
    
    if nrows is None: nrows = (len(label) - 1) // ncols + 1
    
    fig, axes = plt.subplots(figsize=(figsize[0]*ncols , figsize[1]*nrows), ncols=ncols, nrows=nrows)
    axes = axes.flatten()
    _ = [axes[i].imshow(image[i].reshape(size), interpolation='spline16') for i in range(len(label))]
    return fig

In [None]:
def draw2pixel(draw: list) -> np.ndarray:
    """ Return a draw image to pixel image data.
    
    Args:
        draw (list): The draw image data.
    
    Returns:
        np.ndarray: The draw image to pixel image data.
    """
    
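    # draw on a 256 x 256 canvas and track the bounding box of all line segments
    image, xmin, xmax, ymin, ymax = np.zeros((256, 256)), 255, 0, 255, 0
    for k, stroke in enumerate(draw):
        # later strokes get a lower intensity, so the stroke order is encoded in the image
        color = (255.0 - min(k, 10) * 13) / 255.0
        for i in range(len(stroke[0]) - 1):
            xmin, xmax = min(xmin, stroke[0][i], stroke[0][i + 1]), max(xmax, stroke[0][i], stroke[0][i + 1])
            ymin, ymax = min(ymin, stroke[1][i], stroke[1][i + 1]), max(ymax, stroke[1][i], stroke[1][i + 1])
            _ = cv2.line(image, (stroke[0][i], stroke[1][i]), (stroke[0][i + 1], stroke[1][i + 1]), color=color, thickness=5)
    # fall back to the full canvas when the box is empty or degenerate (for example,
    # a drawing made only of single-point strokes leaves xmin > xmax)
    if xmin >= xmax: xmin, xmax = 0, 255
    if ymin >= ymax: ymin, ymax = 0, 255
    # crop to the bounding box and rescale to the model input size
    return cv2.resize(image[ymin:ymax, xmin:xmax], (img_size, img_size))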

In [None]:
def feature_extraction(df_data: pd.DataFrame) -> pd.DataFrame:
    """ Return the feature exploration, engineering and cleansing.
    
    Args:
        df_data (pd.DataFrame): The data to extract features.
    
    Returns:
        pd.DataFrame: The extracted features dataframe.
    """
    
    # feature extraction: drawing
    df_data['drawing'] = df_data['drawing'].apply(lambda x: json.loads(x))
    
    # feature extraction: word
    df_data['word'] = df_data['word'].apply(lambda x: -1 if x == '' else class_dict[x.replace(' ', '_')])
    
    # feature extraction: drawing to pixel
    df_data['pixel'] = df_data['drawing'].apply(lambda x: draw2pixel(x))
    
    return df_data

In [None]:
def feature_extraction2(df_data: pd.DataFrame) -> pd.DataFrame:
    """ Return the feature exploration, engineering and cleansing.
    
    Args:
        df_data (pd.DataFrame): The data to extract features.
    
    Returns:
        pd.DataFrame: The extracted features dataframe.
    """
    
    # feature extraction: remove countrycode, drawing and datatype
    df_data = df_data.drop(['countrycode', 'drawing', 'datatype'], axis=1)
    
    return df_data

In [None]:
# feature extraction: step 1
df_data = feature_extraction(df_data)

In [None]:
# feature exploration: image
_ = drawplot(df_data.loc[:19, 'drawing'], df_data.loc[:19, 'word'])

In [None]:
# feature exploration: image
_ = imageplot(df_data.loc[:19, 'pixel'], df_data.loc[:19, 'word'], (img_size, img_size))

After extracting all features, we need to convert the categorical features to numeric features, a format suitable for feeding into our machine learning models. The columns that are no longer needed are dropped.
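
A minimal sketch of that conversion, assuming three hypothetical class names instead of the full 340: each label is mapped to an integer index (as `class_dict` does) and then one-hot encoded, which is the form `keras.utils.to_categorical` produces for the model.

```python
import numpy as np

# hypothetical class names; the notebook derives the real ones from the file names
classes = sorted(['banana', 'table', 'hot dog'])
class_to_index = {name.replace(' ', '_'): i for i, name in enumerate(classes)}

# labels as they would appear in the 'word' column
words = ['table', 'banana', 'hot dog', 'table']
indices = np.array([class_to_index[w.replace(' ', '_')] for w in words])

# one-hot encoding: row i has a single 1 in column indices[i]
one_hot = np.eye(len(classes))[indices]
print(indices)
print(one_hot)
```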

In [None]:
# feature extraction: step 2
df_data = feature_extraction2(df_data)

In [None]:
# describe data dataframe
df_data.describe(include='all')

In [None]:
# verify dtypes object
df_data.info()

In [None]:
# memory clean-up
del df_data, df_train, df_test
gc.collect()

> **Model, predict and solve the problem**

Now, it is time to feed the features to Machine Learning models.
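
With roughly 50M drawings, loading everything at once is impractical, so the model is fed from a generator that yields one batch at a time, indefinitely. The sketch below shows that contract on in-memory arrays. `batch_generator` and the array shapes are illustrative assumptions, not the notebook's actual loader, which reads CSV chunks from disk.

```python
import numpy as np

def batch_generator(x, y, batch_size, seed=58):
    """Yield (x_batch, y_batch) tuples indefinitely, reshuffling after each pass."""
    rng = np.random.RandomState(seed)
    n = x.shape[0]
    while True:
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            yield x[idx], y[idx]

# illustrative data: 10 blank 64 x 64 single-channel images with 3 one-hot classes
x = np.zeros((10, 64, 64, 1))
y = np.eye(3)[np.arange(10) % 3]

gen = batch_generator(x, y, batch_size=4)
x_batch, y_batch = next(gen)
print(x_batch.shape, y_batch.shape)
```

Because the generator never stops, the training call has to say how many batches make up one epoch, which is the role of `steps_per_epoch`.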

In [None]:
def train_generator() -> tuple:
    """ Return training data generator.
    
    Returns:
        tuple: The training data tuple.
    """
    
    while True:
        for k in np.random.permutation(range(num_shuffles - 1)):
            for df_data in pd.read_csv('../input/quick-draw-doodle-recognition-challenge-shufflecsv/train_k%d.csv.gz' %k, chunksize=chunksize):
                # feature extraction: drawing
                df_data['drawing'] = df_data['drawing'].apply(lambda x: json.loads(x))
                
                # feature extraction: word
                df_data['word'] = df_data['word'].apply(lambda x: -1 if x == '' else class_dict[x.replace(' ', '_')])
                
                # feature extraction: drawing to pixel
                x = np.zeros((df_data.shape[0], img_size, img_size, 1))
                for i, drawing in enumerate(df_data['drawing'].values): x[i, :, :, 0] = draw2pixel(drawing)
                y = keras.utils.to_categorical(df_data['word'], num_classes=num_classes)
                yield x, y

# training data generator
gen_train = train_generator()

In [None]:
# testing (validating) data
df_data = pd.read_csv('../input/quick-draw-doodle-recognition-challenge-shufflecsv/train_k%d.csv.gz' %(num_shuffles - 1), nrows=34000)

# feature extraction: drawing
df_data['drawing'] = df_data['drawing'].apply(lambda x: json.loads(x))

# feature extraction: word
df_data['word'] = df_data['word'].apply(lambda x: -1 if x == '' else class_dict[x.replace(' ', '_')])

# feature extraction: drawing to pixel
x_validate = np.zeros((df_data.shape[0], img_size, img_size, 1))
for i, drawing in enumerate(df_data['drawing'].values): x_validate[i, :, :, 0] = draw2pixel(drawing)
y_validate = keras.utils.to_categorical(df_data['word'], num_classes=num_classes)

In [None]:
# memory clean-up
del df_data
gc.collect()

Conceptually, a TensorFlow graph consists of the following parts. Keras sets them up for us when the model is built and compiled in the cells below:

* Placeholder variables used for inputting data to the graph.
* Variables that are going to be optimized so as to make the convolutional network perform better.
* The mathematical formulas for the convolutional network.
* A loss measure that can be used to guide the optimization of the variables.
* An optimization method which updates the variables.
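
The five parts can be made concrete with a tiny softmax classifier written in plain NumPy. This is only an illustrative sketch on random data, not the MobileNet used in this notebook. There, Keras builds the inputs, variables and formulas inside `MobileNet(...)`, and `compile(...)` supplies the loss and the optimizer.

```python
import numpy as np

rng = np.random.RandomState(58)

# 1. inputs: a batch of flattened "images" and their one-hot labels
x = rng.rand(32, 16)
labels = rng.randint(0, 4, size=32)
y = np.eye(4)[labels]

# 2. variables to optimize: weights and biases
w = np.zeros((16, 4))
b = np.zeros(4)

def forward(x, w, b):
    # 3. the model formula: a linear layer followed by softmax
    logits = x @ w + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def loss_fn(p, y):
    # 4. the loss measure: categorical cross-entropy
    return -np.mean(np.sum(y * np.log(p + 1e-12), axis=1))

losses = []
for step in range(100):
    p = forward(x, w, b)
    losses.append(loss_fn(p, y))
    # 5. the optimization method: one plain gradient-descent update
    grad = (p - y) / x.shape[0]
    w -= 0.1 * (x.T @ grad)
    b -= 0.1 * grad.sum(axis=0)

print('loss: %.4f -> %.4f' % (losses[0], losses[-1]))
```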

In [None]:
def top_3_categorical_accuracy(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """ Return top 3 categorical accuracy.
    
    Args:
        y_true (np.ndarray): The ground truth (correct) labels.
        y_pred (np.ndarray): The predicted labels.
    
    Returns:
        float: The top 3 categorical accuracy.
    """
    
    return top_k_categorical_accuracy(y_true, y_pred, k=3)

In [None]:
# mobilenet model setup
model_mobilenet = MobileNet(input_shape=(img_size, img_size, 1), alpha=1.0, dropout=1e-3, weights=None, classes=num_classes)
model_mobilenet.summary()

In [None]:
# mobilenet model setup
model_mobilenet.compile(optimizer='adam', loss='categorical_crossentropy', metrics=[top_3_categorical_accuracy])

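# mobilenet model fit
# note: `fit_generator` is deprecated since TensorFlow 2.1; on newer versions call `fit` with the same arguments
hist = model_mobilenet.fit_generator(gen_train, steps_per_epoch=800, epochs=96, verbose=2, validation_data=(x_validate, y_validate))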
model_hist = pd.DataFrame(hist.history)

# mobilenet model metrics
model_mobilenet_score = model_mobilenet.evaluate(x_validate, y_validate, verbose=1)
print('mobilenet\n  top 3 categorical accuracy score: %0.4f' %model_mobilenet_score[1])

In [None]:
# plot the model history
fig, axes = plt.subplots(figsize=(20, 10), ncols=1, nrows=2)
axes = axes.flatten()
model_hist.plot(y='top_3_categorical_accuracy', kind='line', ax=axes[0])
model_hist.plot(y='val_top_3_categorical_accuracy', kind='line', ax=axes[0])
model_hist.plot(y='loss', kind='line', ax=axes[1])
model_hist.plot(y='val_loss', kind='line', ax=axes[1])
for axis in axes: axis.set_xlabel('epoch')

In [None]:
# mobilenet model save
model_mobilenet.save('model_mobilenet.h5')

In [None]:
# memory clean-up
del x_validate, y_validate
gc.collect()

> **Supply or submit the results**

We now prepare our submission for the competition site Kaggle: for each test drawing we predict the three most likely classes. Any suggestions to improve our score are welcome.
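
The competition scores submissions with mean average precision at 3 (MAP@3), so each row carries up to three guesses, best first, separated by spaces, with spaces inside a label replaced by underscores. The sketch below shows the format and the metric on made-up probabilities. `map_at_3` and the four class names are illustrative and not part of the notebook.

```python
import numpy as np

def map_at_3(y_true, top3):
    """Mean average precision at 3 with a single correct label per sample."""
    # score 1, 1/2 or 1/3 when the true label is at rank 1, 2 or 3, else 0
    scores = [1.0 / (list(row).index(t) + 1) if t in row else 0.0
              for t, row in zip(y_true, top3)]
    return float(np.mean(scores))

# hypothetical class names and predicted probabilities for three drawings
names = np.array(['banana', 'hot_dog', 'table', 'The_Eiffel_Tower'])
probs = np.array([
    [0.60, 0.25, 0.10, 0.05],
    [0.10, 0.20, 0.30, 0.40],
    [0.30, 0.40, 0.20, 0.10],
])

# indices of the three most probable classes, best first
top3 = np.argsort(-probs, axis=1)[:, :3]

# one space-separated string of three labels per drawing
submission = [' '.join(names[row]) for row in top3]
print(submission)
print(map_at_3([0, 2, 3], top3))
```

On the validation split, the same function can complement the top 3 accuracy, since it also rewards putting the correct class first.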

In [None]:
# acquiring testing data
df_test = pd.read_csv('../input/quickdraw-doodle-recognition/test_simplified.csv')

# feature extraction: drawing
df_test['drawing'] = df_test['drawing'].apply(lambda x: json.loads(x))

In [None]:
# prepare testing data and compute the top 3 predicted classes
x_test = np.zeros((df_test.shape[0], img_size, img_size, 1))
for i, drawing in enumerate(df_test['drawing'].values): x_test[i, :, :, 0] = draw2pixel(drawing)
y_test = np.argsort(-model_mobilenet.predict(x_test, verbose=1))[:, 0:3]
df_word = pd.DataFrame({'top 1': y_test[:, 0], 'top 2': y_test[:, 1], 'top 3': y_test[:, 2]})
df_word = df_word.replace(classreverse_dict)
df_word['submission'] = df_word['top 1'] + ' ' + df_word['top 2'] + ' ' + df_word['top 3']

In [None]:
# submit the results
out = pd.DataFrame({'key_id': df_test['key_id'], 'word': df_word['submission']})
out.to_csv('submission.csv', index=False)

In [None]:
# visualize head of the submitted results
out.head(n=5)