# Multiclass Perceptron
**Name: Akshay Reddy Akkati**

In [1]:
import pandas as pd
import numpy as np

In [2]:
# loading data file to dataframe
df = pd.read_csv("mnist_data.txt", header=None, delimiter=r"\s+")

In [3]:
# inserting bias term
# df['']=1
df.insert(loc=784, column=784,value=1,allow_duplicates=False)
# print(df)

In [4]:
# loading labels
labels = pd.read_csv("mnist_labels.txt", header=None)
# print(labels)

In [5]:
# splitting data to train and test (into halves)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df, labels, test_size=0.5)

In [6]:
# constructing weight vector with default values as zeros
# building it directly from the scalar 0 gives an integer dtype instead of an object-dtype frame
weight_vector = pd.DataFrame(0, index=np.arange(10), columns=np.arange(785))
# print(weight_vector)

In [7]:
from sklearn.metrics import accuracy_score, classification_report
# Number of iterations for outer loop
num_of_iterations=10
for itr in range(num_of_iterations):
    for i in X_train.index:
        activation = weight_vector.dot(X_train.loc[i])
        max_value_index = activation.idxmax()
        # When a prediction is wrong, the weights for the correct class are increased by x
        # and the weights for the incorrectly predicted class are decreased by x
        if(max_value_index != y_train.loc[i,0]):
            weight_vector.loc[max_value_index] = weight_vector.loc[max_value_index] - X_train.loc[i]
            weight_vector.loc[y_train.loc[i,0]] = weight_vector.loc[y_train.loc[i,0]] + X_train.loc[i]
            
    # defining dataframes for train and test predictions (integer zeros, same index as the labels)
    y_train_predict = pd.DataFrame(0, index=y_train.index, columns=[0])
    y_test_predict = pd.DataFrame(0, index=y_test.index, columns=[0])
    
    # Filling the prediction dataframes of training and testing
    for i in X_train.index:
        y_train_predict.loc[i,0]=(weight_vector.dot(X_train.loc[i])).idxmax()    

    for i in X_test.index:
        y_test_predict.loc[i,0]=(weight_vector.dot(X_test.loc[i])).idxmax()

    # Calculating accuracies
    train_acc=accuracy_score(y_train,y_train_predict)
    test_acc=accuracy_score(y_test,y_test_predict)
    print('Accuracies after iteration: ' + str(itr))
    print('training:' +str(train_acc))
    print('testing:' + str(test_acc))
print('The accuracy of the classifier on the training set: ' + str(train_acc*100) + '%')
print('The accuracy of the classifier on the test set: ' + str(test_acc*100) + '%')

Accuracies after iteration: 0
training:0.8278
testing:0.8144
Accuracies after iteration: 1
training:0.8332
testing:0.8152
Accuracies after iteration: 2
training:0.8332
testing:0.809
Accuracies after iteration: 3
training:0.9128
testing:0.881
Accuracies after iteration: 4
training:0.8806
testing:0.8452
Accuracies after iteration: 5
training:0.8868
testing:0.8446
Accuracies after iteration: 6
training:0.9094
testing:0.8698
Accuracies after iteration: 7
training:0.9024
testing:0.8606
Accuracies after iteration: 8
training:0.8964
testing:0.8508
Accuracies after iteration: 9
training:0.921
testing:0.879
The accuracy of the classifier on the training set: 92.10000000000001%
The accuracy of the classifier on the test set: 87.9%


# Implementation and Convergence criterion
**Loading data**
* We use the MNIST handwritten digits dataset.
* Each row in the data file is 784 integers that represent the grayscale values of a 28x28 handwritten digit. Each row in the labels file is a single digit in the range 0-9 and is in a 1-to-1 correspondence with rows in the data file.
* The data and labels are first loaded from the given text files into DataFrames.
* I used the Python pandas library for this.\
**Including bias term**
* I added a bias term to the data DataFrame as an extra column (column 784) with the constant value 1.\
**Splitting the data**
* I split the data and labels randomly into two halves: X_train, X_test, y_train, y_test.\
**Weight vector**
* Construct a weight matrix with 10 rows (one weight vector per class label) and 785 columns (784 pixels plus the bias), with all values initialized to zero.\
**Running the Perceptron**
* For each training sample x, calculate the dot product of the weight matrix with x. This gives one activation per class, and the predicted class is the index of the largest activation.
* Compare the predicted class with the corresponding value in the labels.
* When a prediction is wrong, the weights for the correct class are increased by x and the weights for the incorrectly predicted class are decreased by x.
* We keep training until a convergence criterion is met. The criterion can be chosen in different ways: one is to stop once the accuracy is high enough, and another is to run a fixed number of iterations over the training data.
* I used a fixed, arbitrarily chosen number of iterations as the stopping criterion.\
**Testing accuracy**
* I defined two DataFrames for the predicted train and test labels.
* These are compared with the actual labels, and the accuracy is printed.
* The code shown above displays the accuracies for 10 iterations.
* My final accuracies obtained after 100 iterations are: train: **98.84%** test: **87.08%**
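
The bias and splitting steps described above can also be sketched with plain NumPy. This is only an illustration under assumptions: the helper name `add_bias_and_split`, the `seed` parameter, and the toy array are not part of the notebook, which uses `DataFrame.insert` and `train_test_split` instead. A fixed seed is used here so the random halves are repeatable between runs.

```python
import numpy as np

def add_bias_and_split(X, y, test_fraction=0.5, seed=0):
    """Append a constant-1 bias column, then shuffle and split the rows.

    Hypothetical helper for illustration; not the notebook's own code.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Bias column: one extra feature whose value is always 1.
    X_bias = np.hstack([X, np.ones((X.shape[0], 1))])
    # A seeded permutation makes the random split repeatable.
    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])
    n_test = int(round(test_fraction * X.shape[0]))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X_bias[train_idx], X_bias[test_idx], y[train_idx], y[test_idx]

# Toy stand-in for the MNIST files: 6 samples with 4 features each.
X_toy = np.arange(24).reshape(6, 4)
y_toy = np.array([0, 1, 2, 0, 1, 2])
X_train, X_test, y_train, y_test = add_bias_and_split(X_toy, y_toy)
print(X_train.shape, X_test.shape)  # (3, 5) (3, 5)
```

Each row keeps its own label because the same index arrays are applied to both the data and the labels.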

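The update rule and the stopping criterion from the "Running the Perceptron" step can be sketched as follows. This is a minimal NumPy version written under assumptions, not the notebook's pandas code: the function name `train_perceptron` and the `max_epochs` cap are made up for illustration, and the extra stopping rule (stop after a pass with no mistakes) is one possible convergence criterion alongside the fixed iteration count used above.

```python
import numpy as np

def train_perceptron(X, y, num_classes, max_epochs=100):
    """Multiclass perceptron; X must already contain the bias column.

    Returns the weight matrix and the number of passes actually run.
    Hypothetical helper for illustration.
    """
    W = np.zeros((num_classes, X.shape[1]))
    for epoch in range(max_epochs):
        mistakes = 0
        for x, target in zip(X, y):
            # One activation per class; the largest one is the prediction.
            predicted = int(np.argmax(W @ x))
            if predicted != target:
                W[predicted] -= x   # push the wrongly predicted class away from x
                W[target] += x      # pull the correct class towards x
                mistakes += 1
        if mistakes == 0:
            # A full pass without any update: the weights can no longer change.
            return W, epoch + 1
    return W, max_epochs

# Tiny linearly separable toy problem (last column is the bias term).
X_toy = np.array([[2.0, 0.0, 1.0],
                  [0.0, 2.0, 1.0],
                  [-2.0, -2.0, 1.0]])
y_toy = np.array([0, 1, 2])
W, epochs_run = train_perceptron(X_toy, y_toy, num_classes=3)
print(epochs_run, np.argmax(X_toy @ W.T, axis=1))
```

A mistake-free pass is only guaranteed to occur when the training data is linearly separable, so the cap on the number of passes is still needed as a fallback.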

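The accuracy check from the "Testing accuracy" step can be written without a per-row loop. The sketch below is an assumption-laden illustration: `predict` and `accuracy` are hypothetical helper names, and the weight matrix and samples are invented toy values rather than anything learned from MNIST.

```python
import numpy as np

def predict(W, X):
    """Predicted class per row: index of the largest activation."""
    return np.argmax(X @ W.T, axis=1)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

# Made-up weights (3 classes, 3 features) and four toy samples.
W_toy = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
X_toy = np.array([[3.0, 1.0, 0.0],
                  [0.0, 2.0, 1.0],
                  [1.0, 0.0, 5.0],
                  [4.0, 0.0, 1.0]])
y_toy = np.array([0, 1, 2, 1])

y_pred = predict(W_toy, X_toy)
print(y_pred, accuracy(y_toy, y_pred))  # [0 1 2 0] 0.75
```

With the notebook's variables, the same idea would apply to `X_train.to_numpy()` and `weight_vector.to_numpy()`, which computes all activations in a single matrix product.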