# MNIST classification with KerasTuner HyperModels
## Table of Contents
- Summary
- Importing necessary Libraries
- Loading the data
- Exploratory Data Analysis
- Model Development
- Submission

## Summary
In this notebook I build an MNIST classifier based on KerasTuner HyperModels. KerasTuner ships two ready-made HyperModels:
- HyperResNet
- HyperXception

You can also write your own HyperModel. See the [KerasTuner HyperModels documentation](https://keras.io/api/keras_tuner/hypermodels/) for details.
## Importing necessary Libraries

In [None]:
!pip install keras-tuner --upgrade

In [None]:
import numpy as np
import pandas as pd
import os
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split, KFold
import keras_tuner as kt

## Loading the data

In [None]:
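# Each CSV row is one flattened 28x28 image; train.csv also carries the label in column 0.
train_pd = pd.read_csv('/kaggle/input/digit-recognizer/train.csv')
test_pd = pd.read_csv('/kaggle/input/digit-recognizer/test.csv')
train = np.array(train_pd)
test = np.array(test_pd)
# Split off the label column and reshape the 784 pixels to (height, width, channels).
train_images = train[:, 1:].reshape((train.shape[0], 28, 28, 1))
train_labels = train[:, 0].astype(np.uint8)
# test.csv has no label column, so every column is a pixel.
test_images = test.reshape((test.shape[0], 28, 28, 1))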
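
To make the slicing and reshape in the loading cell easier to follow, here is a small sketch on synthetic data. The array `table` and its random values are made up and only stand in for the real CSV. The final scaling step is an optional extra that this notebook does not perform.

```python
import numpy as np

# Synthetic stand-in for train.csv: 5 rows, column 0 is the label and
# columns 1..784 are pixel intensities in 0..255 (values are made up).
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=(5, 1))
pixels = rng.integers(0, 256, size=(5, 784))
table = np.hstack([labels, pixels])

# Same slicing and reshape as the loading cell.
images = table[:, 1:].reshape((table.shape[0], 28, 28, 1))
targets = table[:, 0].astype(np.uint8)

# Pixel k of the flat row lands at image row k // 28, column k % 28.
k = 300
assert images[0, k // 28, k % 28, 0] == table[0, 1 + k]

# Optional step that the notebook does not perform: scale to [0, 1].
scaled = images.astype(np.float32) / 255.0
print(images.shape, targets.shape)
```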

## Exploratory Data Analysis

In [None]:
train_pd.head()

In [None]:
train_pd.describe()

**Correlation Score**

In [None]:
correlation_score = train_pd.corr()

In [None]:
correlated_features = correlation_score["label"].sort_values(ascending=False).dropna()
correlated_columns = list(correlated_features[correlated_features.abs() > 0.2].index)
correlated_columns.remove("label")
print(correlated_columns)
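
A note on the `dropna()` call: a pixel that has the same value in every image (for example an always-blank border pixel) has zero variance, so its correlation with the label is undefined and pandas reports `NaN`. The toy frame below, with made-up column values, sketches that behaviour together with the 0.2 threshold used above.

```python
import pandas as pd

# Toy data with made-up values: one constant pixel, one pixel that rises
# with the label, and one pixel that is only weakly related to it.
toy = pd.DataFrame({
    "label":  [0, 1, 2, 3, 4, 5],
    "pixel0": [0, 0, 0, 0, 0, 0],
    "pixel1": [0, 10, 20, 30, 40, 50],
    "pixel2": [5, 1, 4, 2, 6, 3],
})

scores = toy.corr()["label"]

# Same filtering steps as the cell above.
kept = scores.sort_values(ascending=False).dropna()
selected = list(kept[kept.abs() > 0.2].index)
selected.remove("label")
print(scores)
print(selected)
```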

**Label Distribution**

In [None]:
train_pd.groupby("label")["label"].count().plot(kind="pie")

**Mean image for different labels**

Calculate the mean image for each label. Each mean image clearly resembles the digit it represents.

In [None]:
mean_images = [np.mean(train_images[train_labels == i].reshape(-1, 28, 28), axis=0) for i in range(10)]
concat_image = np.concatenate(mean_images, axis=1)
print(concat_image.shape)
plt.imshow(concat_image)
plt.show()

## Model Development

In [None]:
# Hold out a validation set; with no test_size given, scikit-learn uses 25% of the rows.
train_images, val_images, train_labels, val_labels = train_test_split(train_images, train_labels)

In [None]:
train_labels = keras.utils.to_categorical(train_labels, num_classes=10)
val_labels = keras.utils.to_categorical(val_labels, num_classes=10)
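
`to_categorical` turns each integer label into a one-hot row of length 10. This is presumably needed because the built-in HyperModel is compiled with a categorical cross-entropy loss, which is an assumption about KerasTuner internals rather than something shown in this notebook. The encoding itself can be sketched with plain NumPy; the label values below are made up.

```python
import numpy as np

labels = np.array([3, 0, 9, 3], dtype=np.uint8)

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the label array encodes every label at once.
one_hot = np.eye(10, dtype=np.float32)[labels]

# argmax along the class axis recovers the original integer labels.
decoded = one_hot.argmax(axis=-1)
print(one_hot.shape)
print(decoded)
```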

In [None]:
tuner = kt.RandomSearch(
    kt.applications.HyperResNet(input_shape=(28, 28, 1), classes=10),
    objective='val_loss',
    max_trials=5)

In [None]:
tuner.search(train_images, train_labels, epochs=5, validation_data=(val_images, val_labels))

In [None]:
best_model = tuner.get_best_models()[0]

In [None]:
best_model.summary()

## Submission

In [None]:
test_labels = np.argmax(best_model.predict(test_images), axis=-1)
print(test_labels.shape)
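
The prediction cell picks, for every image, the class with the highest predicted probability. Here is a small sketch of that decoding step and of the `ImageId` / `Label` layout used for the submission file. The probability matrix is made up and uses three classes instead of ten to keep it readable.

```python
import numpy as np
import pandas as pd

# Made-up model output: one row per image, one column per class.
probs = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
])

# Index of the largest probability in each row is the predicted class.
predicted = np.argmax(probs, axis=-1)

# ImageId is 1-based and follows the row order of the test set.
submission = pd.DataFrame({
    "ImageId": np.arange(1, len(predicted) + 1),
    "Label": predicted,
})
print(submission)
```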

In [None]:
sample_submission = pd.read_csv("/kaggle/input/digit-recognizer/sample_submission.csv")
sample_submission["Label"] = test_labels
sample_submission.to_csv("submission.csv", index=False)

In [None]:
# Alternative that does not need sample_submission.csv. It overwrites the
# submission.csv written by the previous cell with the same content.
image_ids = np.arange(1, test_labels.shape[0] + 1)
df = pd.DataFrame({"ImageId": image_ids, "Label": test_labels})
df.to_csv("submission.csv", index=False)