# Linear Classifier

This notebook uses scikit-learn's SGDClassifier to build a linear classifier for two-dimensional data with 3 classes.

In [3]:
import csv

# Feature rows [x, y] and integer class labels read from the CSV file
Xv = []
Yv = []

with open('spiral.csv', newline='') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    next(csv_reader, None)  # skip the header row
    for row in csv_reader:
        Xv.append([float(row[0]), float(row[1])])
        Yv.append(int(row[2]))

In [4]:
import numpy as np
from sklearn import linear_model

# Build the SGDClassifier
X = np.array(Xv)
Y = np.array(Yv)
clf = linear_model.SGDClassifier(max_iter=1000, tol=1e-3)
clf.fit(X, Y)

print(clf.predict([[0.115,0.554]]))
print(clf.predict([[0, 0.022]]))
print(clf.predict([[-0.115,-0.554]]))
print(clf.predict([[0.215,0.554]]))

[1]
[2]
[2]
[1]
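
For a multi-class problem, SGDClassifier fits one linear decision function per class (one-vs-rest) and predicts the class with the highest score. The sketch below illustrates this on synthetic data, since `spiral.csv` is not reproduced here; `make_blobs`, the `_demo` names, and `random_state=0` are assumptions made only for this illustration.

```python
import numpy as np
from sklearn import linear_model
from sklearn.datasets import make_blobs

# Synthetic stand-in for spiral.csv: three 2-D clusters (an assumption)
X_demo, y_demo = make_blobs(n_samples=150, centers=3, n_features=2, random_state=0)

clf_demo = linear_model.SGDClassifier(max_iter=1000, tol=1e-3, random_state=0)
clf_demo.fit(X_demo, y_demo)

# One weight vector and one intercept per class
print(clf_demo.coef_.shape)       # (3, 2)
print(clf_demo.intercept_.shape)  # (3,)

# decision_function returns one score per class for each sample;
# predict returns the class whose score is highest
scores = clf_demo.decision_function(X_demo[:5])
pred_from_scores = clf_demo.classes_[np.argmax(scores, axis=1)]
print(np.array_equal(pred_from_scores, clf_demo.predict(X_demo[:5])))
```

Each row of `coef_` together with the matching entry of `intercept_` defines a straight line in the plane, which is why the classifier can only separate classes with linear boundaries.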


## Confusion Matrix


In [66]:
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
import random

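# Draw 8 random samples (with replacement) from the training data
# and compare their true labels with the classifier's predictions
y_res = []
x_sample = []

for _ in range(8):
    ri = random.randrange(Y.size)
    y_res.append(Y[ri])
    x_sample.append(X[ri])

y_true = np.array(y_res)
y_pred = clf.predict(np.array(x_sample))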

print(f'y_true: {y_true}')
print(f'y_pred: {y_pred}')

print('\nConfusion Matrix (y_true, y_pred):')
print(confusion_matrix(y_true, y_pred))


print('\nAccuracy Score:')
print(accuracy_score(y_true, y_pred))


y_true: [1 2 2 1 0 0 2 2]
y_pred: [1 0 2 1 0 0 0 0]

Confusion Matrix (y_true, y_pred):
[[2 0 0]
 [0 2 0]
 [3 0 1]]

Accuracy Score:
0.625


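The confusion matrix shows which classes the classifier identifies well and which it confuses. Rows are the true classes and columns are the predicted classes. In this example, every true 0 and 1 is classified correctly, but three of the four true 2's are predicted as 0, so a prediction of 0 is correct only two times out of five.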
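
One way to quantify how well each class is handled is to compute per-class recall and precision from the matrix. The sketch below hard-codes the matrix printed above; the variable names are chosen only for this illustration.

```python
import numpy as np

# Confusion matrix from the output above: rows = true class, columns = predicted class
cm = np.array([[2, 0, 0],
               [0, 2, 0],
               [3, 0, 1]])

correct = cm.diagonal()

# Recall: fraction of each true class that was predicted correctly (row-wise)
recall = correct / cm.sum(axis=1)

# Precision: fraction of each predicted class that was actually correct (column-wise)
precision = correct / cm.sum(axis=0)

for label in range(3):
    print(f'class {label}: recall={recall[label]:.2f}, precision={precision[label]:.2f}')
# class 0: recall=1.00, precision=0.40
# class 1: recall=1.00, precision=1.00
# class 2: recall=0.25, precision=1.00
```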

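We can collapse the matrix into a one-vs-rest table for class 0 (0 is the positive class, 1 and 2 are the negative class) and compute the accuracy score for 0's.

```
+-------------+---------+--------------+-----------+
|             |         |          Actual          |
|             |         +--------------+-----------+
|             |         |      0       |   not 0   |
+=============+=========+==============+===========+
|             |    0    |     TP=2     |    FP=3   |
| Predictions +---------+--------------+-----------+
|             |  not 0  |     FN=0     |    TN=3   |
+-------------+---------+--------------+-----------+
```

The three false positives are the 2's that were predicted as 0. No true 0 was missed, so there are no false negatives.

Accuracy Score:

\begin{equation*}
ACC = \frac{TP + TN}{TP + TN + FP + FN}
\end{equation*}

\begin{equation*}
ACC = \frac{2 + 3}{2 + 3 + 3 + 0} = \frac{5}{8} = 0.625
\end{equation*}

Here the one-vs-rest accuracy for class 0 happens to equal the overall accuracy score of 0.625, because every misclassification in this sample involves class 0. The two quantities differ in general.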
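
The four counts can also be recomputed directly from the sampled labels. This is a minimal sketch in plain Python that reuses the `y_true` and `y_pred` values printed above; `positive` and the count variables are names introduced only for this illustration.

```python
# Labels copied from the output above
y_true = [1, 2, 2, 1, 0, 0, 2, 2]
y_pred = [1, 0, 2, 1, 0, 0, 0, 0]
positive = 0  # treat class 0 as positive, classes 1 and 2 as negative

pairs = list(zip(y_true, y_pred))
tp = sum(t == positive and p == positive for t, p in pairs)  # true 0, predicted 0
fp = sum(t != positive and p == positive for t, p in pairs)  # not 0, predicted 0
fn = sum(t == positive and p != positive for t, p in pairs)  # true 0, predicted not 0
tn = sum(t != positive and p != positive for t, p in pairs)  # not 0, predicted not 0

acc = (tp + tn) / (tp + tn + fp + fn)

print(tp, fp, fn, tn)  # 2 3 0 3
print(acc)             # 0.625
```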