# Perceptron
Load the `mnist` dataset. Split it into training and test sets. Train and test a perceptron model using scikit-learn. Check the documentation to identify the most important hyperparameters, attributes, and methods of the model. Use them in practice.
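
As a quick orientation before working with the real data, the sketch below fits a `Perceptron` on a small synthetic dataset and prints the attributes and methods that tend to matter most. The synthetic data (`X_demo`, `y_demo`) and the chosen hyperparameter values are assumptions made for illustration only; they are not part of the exercise.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron

# Synthetic stand-in for MNIST (assumption: 3 classes, 20 features).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=10,
    n_classes=3, random_state=0,
)

# Commonly tuned hyperparameters: penalty, alpha, max_iter, tol.
clf = Perceptron(penalty=None, alpha=0.0001, max_iter=1000, tol=1e-3, random_state=0)
clf.fit(X_demo, y_demo)

# Learned attributes.
print("classes_:", clf.classes_)                  # class labels seen during fit
print("coef_ shape:", clf.coef_.shape)            # (n_classes, n_features)
print("intercept_ shape:", clf.intercept_.shape)  # (n_classes,)
print("n_iter_:", clf.n_iter_)                    # epochs actually run

# Frequently used methods.
scores = clf.decision_function(X_demo[:2])        # one score per class and sample
labels = clf.predict(X_demo[:2])
train_accuracy = clf.score(X_demo, y_demo)
print(scores.shape, labels, train_accuracy)
```

For a multiclass problem the model trains one linear separator per class (one-vs-rest), which is why `coef_` has one row per class.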

## Importing Modules

In [4]:
import pandas as pd
import sklearn.metrics
import sklearn.linear_model
import sklearn.model_selection
import plotly.express as px

## Loading the Dataset

In [2]:
df = pd.read_csv("../../datasets/mnist.csv")
df = df.set_index("id")
df.head(3)

id,class,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,pixel9,...,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783,pixel784
31953,5,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
34452,8,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
60897,5,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


## Splitting the Data into Training and Test Sets

In [3]:
x = df.drop(["class"], axis=1)
y = df["class"]
x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(x, y)
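
The call above relies on the defaults of `train_test_split`: 25% of the rows go to the test set and the shuffle is not seeded, so every run produces a different split. A possible variant, sketched here on a small made-up frame (`demo`, with hypothetical pixel columns), fixes `random_state` for reproducibility and uses `stratify` to keep the class proportions equal in both sets.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up stand-in for the MNIST frame: 10 classes, 20 rows each (assumption).
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.integers(0, 256, size=(200, 4)),
    columns=["pixel1", "pixel2", "pixel3", "pixel4"],
)
demo["class"] = np.repeat(np.arange(10), 20)

x_demo = demo.drop(columns=["class"])
y_demo = demo["class"]

# Seeded, stratified split: same result on every run, same class mix in both sets.
x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=42
)

print(x_tr.shape, x_te.shape)
print(y_te.value_counts().sort_index().tolist())  # test rows per class
```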

## Model Selection and Hyperparameter Tuning

In [6]:
parameters_grid = {
    "penalty": ["l1", "l2", "elasticnet"],
    "alpha": [0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1, 10],
    "l1_ratio": [0.0, 0.25, 0.5, 0.75, 1.0],
    "max_iter": [10, 100, 1000, 10000, 100000]
}
model_1 = sklearn.model_selection.GridSearchCV(sklearn.linear_model.Perceptron(), 
                                               parameters_grid, scoring="accuracy", cv=5, n_jobs=-1)
model_1.fit(x_train, y_train)
print("Cross-validated accuracy of the best Perceptron classifier = {:.2f}".format(model_1.best_score_))
print("Best hyperparameters found for the Perceptron classifier = {}".format(model_1.best_params_))

Cross-validated accuracy of the best Perceptron classifier = 0.85
Best hyperparameters found for the Perceptron classifier = {'alpha': 1e-05, 'l1_ratio': 0.25, 'max_iter': 100, 'penalty': 'elasticnet'}
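
One thing worth noting about the grid: in scikit-learn's `Perceptron`, `l1_ratio` only has an effect when `penalty="elasticnet"`, so a single dictionary re-fits identical `l1` and `l2` models once per `l1_ratio` value. A possible alternative, sketched below, passes a list of two dictionaries so that `l1_ratio` is varied only for the elastic-net penalty. The names `full_grid` and `split_grid` are introduced here for illustration; `ParameterGrid` is used only to count the candidates without training anything.

```python
from sklearn.model_selection import ParameterGrid

alphas = [0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1, 10]
max_iters = [10, 100, 1000, 10000, 100000]
l1_ratios = [0.0, 0.25, 0.5, 0.75, 1.0]

# The grid as written in the notebook: every penalty is crossed with every l1_ratio.
full_grid = {
    "penalty": ["l1", "l2", "elasticnet"],
    "alpha": alphas,
    "l1_ratio": l1_ratios,
    "max_iter": max_iters,
}

# Split grid: l1_ratio is only varied where it matters.
split_grid = [
    {"penalty": ["l1", "l2"], "alpha": alphas, "max_iter": max_iters},
    {"penalty": ["elasticnet"], "alpha": alphas,
     "l1_ratio": l1_ratios, "max_iter": max_iters},
]

n_full = len(ParameterGrid(full_grid))
n_split = len(ParameterGrid(split_grid))
print(n_full, n_split)  # 600 280
```

With `cv=5` this would reduce the search from 3000 model fits to 1400. The list of dictionaries can be passed to `GridSearchCV` in place of `parameters_grid`.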


## Testing the Best Model

In [7]:
y_predicted = model_1.predict(x_test)
accuracy = sklearn.metrics.accuracy_score(y_test, y_predicted)
cm = sklearn.metrics.confusion_matrix(y_test, y_predicted)
precision, recall, f1, support = sklearn.metrics.precision_recall_fscore_support(y_test, y_predicted)

print("Accuracy =", accuracy)
print("Precision =", precision)
print("Recall =", recall)
print("F1-Score =", f1)
print("Confusion Matrix:\n", cm)

Accuracy = 0.864
Precision = [0.90909091 0.94630872 0.94666667 0.66153846 0.84090909 0.95081967
 0.90740741 0.85964912 0.79120879 0.89411765]
Recall = [0.97826087 0.97916667 0.78888889 0.91489362 0.84090909 0.66666667
 0.96078431 0.9245283  0.8        0.71028037]
F1-Score = [0.94240838 0.96245734 0.86060606 0.76785714 0.84090909 0.78378378
 0.93333333 0.89090909 0.79558011 0.79166667]
Confusion Matrix:
 [[ 90   0   0   1   0   0   0   0   1   0]
 [  0 141   0   2   0   0   0   0   1   0]
 [  1   5  71   6   1   0   5   0   1   0]
 [  1   1   2  86   0   1   0   1   2   0]
 [  0   0   0   2  74   0   1   2   5   4]
 [  5   0   0  13   2  58   1   1   7   0]
 [  1   0   0   0   1   1  98   1   0   0]
 [  0   0   0   2   1   0   0  98   0   5]
 [  1   2   1  10   1   0   3   0  72   0]
 [  0   0   1   8   8   1   0  11   2  76]]
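
To make the link between these numbers explicit, the following sketch recomputes accuracy and the per-class precision, recall, and F1-score directly from the confusion matrix printed above (rows are true classes, columns are predicted classes, as in scikit-learn). The variable names with the `_cm` suffix are introduced here for illustration.

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true, columns: predicted).
cm = np.array([
    [90,   0,  0,  1,  0,  0,  0,  0,  1,  0],
    [ 0, 141,  0,  2,  0,  0,  0,  0,  1,  0],
    [ 1,   5, 71,  6,  1,  0,  5,  0,  1,  0],
    [ 1,   1,  2, 86,  0,  1,  0,  1,  2,  0],
    [ 0,   0,  0,  2, 74,  0,  1,  2,  5,  4],
    [ 5,   0,  0, 13,  2, 58,  1,  1,  7,  0],
    [ 1,   0,  0,  0,  1,  1, 98,  1,  0,  0],
    [ 0,   0,  0,  2,  1,  0,  0, 98,  0,  5],
    [ 1,   2,  1, 10,  1,  0,  3,  0, 72,  0],
    [ 0,   0,  1,  8,  8,  1,  0, 11,  2, 76],
])

true_positives = np.diag(cm)
recall_cm = true_positives / cm.sum(axis=1)      # correct / all samples of the true class
precision_cm = true_positives / cm.sum(axis=0)   # correct / all samples predicted as the class
f1_cm = 2 * precision_cm * recall_cm / (precision_cm + recall_cm)
accuracy_cm = true_positives.sum() / cm.sum()

print("Accuracy =", accuracy_cm)
print("Precision =", precision_cm.round(4))
print("Recall =", recall_cm.round(4))
print("F1-Score =", f1_cm.round(4))
```

Read this way, the weak spots are visible in the matrix itself: column 3 collects many misclassified samples (13 fives, 10 eights, 8 nines), which explains the low precision for digit 3, and row 5 shows that a third of the fives are assigned to other classes, which explains the low recall for digit 5.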
