## What is Classification?
* Classification is a supervised machine learning task in which a model learns to assign a class label to examples from the problem domain. Example: classifying an email as 'spam' or 'not spam'.
* Many ML algorithms can be used for classification; this notebook uses one of them, the MLP classifier.

## What is an MLP Classifier?
* Full form: Multi-layer Perceptron
* Library: Scikit-Learn (`sklearn.neural_network.MLPClassifier`)
* Easy to learn and implement
* Uses a feed-forward neural network to perform classification
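
To make the "underlying neural network" a little more concrete, here is a minimal NumPy sketch of a forward pass through a multi-layer perceptron with ReLU hidden layers and a softmax output. The helper names (`relu`, `softmax`, `mlp_forward`), the random weights, and the layer sizes are illustrative assumptions, not scikit-learn internals; in practice `MLPClassifier` learns the weights during `fit`.

```python
import numpy as np

def relu(z):
    # Element-wise ReLU activation: negative values become 0
    return np.maximum(z, 0.0)

def softmax(z):
    # Row-wise softmax; subtracting the row max keeps exp() numerically stable
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mlp_forward(X, weights, biases):
    # Hypothetical helper: hidden layers use ReLU, the last layer uses softmax
    a = X
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(a @ W + b)
    return softmax(a @ weights[-1] + biases[-1])

rng = np.random.default_rng(0)
sizes = [784, 64, 64, 10]  # input pixels, two hidden layers, ten digit classes
weights = [rng.normal(0, 0.1, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

X_demo = rng.random((5, 784))         # 5 fake flattened images
probs = mlp_forward(X_demo, weights, biases)
print(probs.shape)                    # (5, 10): one probability per class per image
print(probs.argmax(axis=1))           # predicted class index for each fake image
```

Because the weights here are random, the predicted classes are meaningless; the point is only the shape of the computation.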

In [None]:
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

### Preprocessing

In [None]:
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix,classification_report

In [None]:
# Load training and test sets
df_train = pd.read_csv('/kaggle/input/digit-recognizer/train.csv')
df_test = pd.read_csv('/kaggle/input/digit-recognizer/test.csv')

In [None]:
df_train.head()

In [None]:
#Split images & labels as X & y
X_train = df_train.drop('label', axis=1)
y_train = df_train['label']

X_test = df_test

In [None]:
#Convert from Pandas DataFrame to Numpy Array to be able to perform reshape operations in the next step
X_train = X_train.to_numpy()
y_train = y_train.to_numpy()

X_test = X_test.to_numpy()

In [None]:
X_train = X_train.reshape((-1,28,28))
X_test = X_test.reshape((-1,28,28))

In [None]:
import seaborn as sns
sns.countplot(x=y_train)

* `numpy.reshape(array, newshape, order=...)` (the `order` argument is optional)

* A picture has a height, a width, and a number of channels. Each MNIST image is a monochrome (single-channel) picture of 28x28 pixels, stored in the CSV as a flat row of 784 values.
* Here -1 in the shape argument tells NumPy to compute that dimension from the total number of values, holding all other dimensions fixed. Since every image has 28x28 = 784 pixels, the first dimension becomes the number of images, so the same call works for both the training set and the test set.
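
The role of -1 in `reshape` can be checked on a small synthetic array. This is a sketch using made-up data (three fake "images"), not the Kaggle files:

```python
import numpy as np

# Three fake flattened images with 784 values each (synthetic stand-in data)
flat = np.arange(3 * 784).reshape(3, 784)

# -1 lets NumPy infer the first dimension: 3 * 784 / (28 * 28) = 3
images = flat.reshape((-1, 28, 28))
print(images.shape)   # (3, 28, 28)

# Flattening again recovers the original layout
back = images.reshape((-1, 28 * 28))
print(back.shape)     # (3, 784)

# If the total size is not a multiple of 28 * 28, NumPy raises a ValueError
try:
    np.arange(1000).reshape((-1, 28, 28))
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```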

In [None]:
for i in range(16):
    plt.subplot(4,4,i+1)
    plt.imshow(X_train[i])
plt.show()

In [None]:
X_train = X_train.reshape((-1,28*28))
X_test = X_test.reshape((-1,28*28))

### Build Model

In [None]:
model = MLPClassifier(solver='adam', activation='relu', hidden_layer_sizes = (64,64), early_stopping=True, verbose=True)
model.fit(X_train, y_train)
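
The Kaggle test set has no labels, so the `confusion_matrix` and `classification_report` imports can only be used on data held out from the training set. The sketch below shows one way to do that. It is an assumption-laden stand-in: it uses scikit-learn's small built-in `load_digits` dataset (8x8 images, pixel values 0 to 16) so that it runs anywhere, scales pixels to [0, 1], and drops `early_stopping` because the dataset is tiny. The names `clf`, `X_val`, and `val_pred` are illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix, classification_report

# Stand-in data: 8x8 digit images with pixel values in 0..16
digits = load_digits()
X = digits.data / 16.0   # scale pixels to [0, 1]; MLPs usually train better on scaled inputs
y = digits.target

# Hold out 20% of the labelled data for validation, keeping class proportions
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

clf = MLPClassifier(solver='adam', activation='relu',
                    hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)

val_pred = clf.predict(X_val)
cm = confusion_matrix(y_val, val_pred)   # rows: true class, columns: predicted class
print(cm)
print(classification_report(y_val, val_pred))
```

On the Kaggle data, the same pattern would presumably apply with `X_train / 255.0` and `y_train` in place of `X` and `y`.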

### Prediction

In [None]:
pred = model.predict(X_test)

### View Prediction

In [None]:
plt.figure(figsize=(15,6))
for i in range(40):  
    plt.subplot(4, 10, i+1)
    plt.imshow(X_test[i].reshape((28,28)),cmap=plt.cm.binary)
    plt.title("predict=%d" % pred[i],y=0.9)
    plt.xticks([])
    plt.yticks([])
plt.subplots_adjust(wspace=0.3, hspace=-0.1)
plt.show()

### Submission

In [None]:
sample = pd.read_csv('/kaggle/input/digit-recognizer/sample_submission.csv')
submit = pd.DataFrame()
submit['ImageId'] = sample['ImageId']
# The competition's sample submission names this column 'Label' (capital L)
submit['Label'] = pred

submit.to_csv('Submission.csv', index=False)

Feel free to upvote if this notebook helped you!