# Brandon Vasquez

### The following work was done without collaboration with other peers in the class, but with aid from the scikit-learn documentation pages for the various methods, variables, and functions used.

## Sources used (in addition to the documentation of the various libraries included):
* [Plot multi-class SGD on the iris dataset, from the scikit-learn documentation](https://scikit-learn.org/stable/auto_examples/linear_model/plot_sgd_iris.html)
* [Basically what we have to do...but for the SVM classifier](https://scikit-learn.org/stable/auto_examples/svm/plot_iris_svc.html)



In [111]:
from sklearn import datasets, svm
from sklearn.linear_model import SGDClassifier, LogisticRegression
from sklearn.inspection import DecisionBoundaryDisplay
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pathlib import Path

In [112]:
iris_data = datasets.load_iris()
features = ["Sepal Length", "Sepal Width", "Petal Length", "Petal Width"]

match_targets = np.array([iris_data.target_names[target] for target in iris_data.target]) # match each target name to the respective target index
iris_df = pd.DataFrame(iris_data.data, columns=features, index=match_targets)

compiled_accurracies = []
print(type(iris_df), iris_df.head(10))
print(match_targets, len(match_targets))

<class 'pandas.core.frame.DataFrame'>         Sepal Length  Sepal Width  Petal Length  Petal Width
setosa           5.1          3.5           1.4          0.2
setosa           4.9          3.0           1.4          0.2
setosa           4.7          3.2           1.3          0.2
setosa           4.6          3.1           1.5          0.2
setosa           5.0          3.6           1.4          0.2
setosa           5.4          3.9           1.7          0.4
setosa           4.6          3.4           1.4          0.3
setosa           5.0          3.4           1.5          0.2
setosa           4.4          2.9           1.4          0.2
setosa           4.9          3.1           1.5          0.1
['setosa' 'setosa' 'setosa' 'setosa' 'setosa' 'setosa' 'setosa' 'setosa'
 'setosa' 'setosa' 'setosa' 'setosa' 'setosa' 'setosa' 'setosa' 'setosa'
 'setosa' 'setosa' 'setosa' 'setosa' 'setosa' 'setosa' 'setosa' 'setosa'
 'setosa' 'setosa' 'setosa' 'setosa' 'setosa' 'setosa' 'setosa' 'setosa'

In [113]:
# helper functions here....
def addScore(clf: str, score: float) -> None:
    compiled_accurracies.append({
        'clf': clf,
        'score': score
    })
    return None

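def implementation_of_score_func(*arrays) -> float:
    # Just for fun: re-implement the score function (accuracy) for SGD using predict.
    # arrays[0] holds the predictions, arrays[-1] holds the true labels.
    y_pred, y_true = arrays[0], arrays[-1]
    matches = [1 if pred == true else 0 for pred, true in zip(y_pred, y_true)]
    return sum(matches) / len(matches)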
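
As a cross-check on the hand-written score helper, the sketch below compares a manual accuracy computation against `sklearn.metrics.accuracy_score`. The name `manual_accuracy` and the toy labels are made up for illustration and are not part of the notebook's results.

```python
import numpy as np
from sklearn.metrics import accuracy_score


def manual_accuracy(y_pred, y_true) -> float:
    """Fraction of predictions that match the true labels."""
    y_pred = np.asarray(y_pred)
    y_true = np.asarray(y_true)
    return float(np.mean(y_pred == y_true))


# Toy labels (hypothetical, only to exercise the two functions).
toy_true = np.array(["setosa", "versicolor", "virginica", "virginica"])
toy_pred = np.array(["setosa", "versicolor", "versicolor", "virginica"])

manual = manual_accuracy(toy_pred, toy_true)
builtin = accuracy_score(toy_true, toy_pred)
print(manual, builtin)
```

Three of the four toy predictions match, so both values should come out as 0.75.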

In [114]:
sep_feats = iris_df[['Sepal Length', 'Sepal Width']] # plot this
print(type(sep_feats), sep_feats[::10])

<class 'pandas.core.frame.DataFrame'>             Sepal Length  Sepal Width
setosa               5.1          3.5
setosa               5.4          3.7
setosa               5.4          3.4
setosa               4.8          3.1
setosa               5.0          3.5
versicolor           7.0          3.2
versicolor           5.0          2.0
versicolor           5.9          3.2
versicolor           5.5          2.4
versicolor           5.5          2.6
virginica            6.3          3.3
virginica            6.5          3.2
virginica            6.9          3.2
virginica            7.4          2.8
virginica            6.7          3.1


In [115]:
X_train,X_test,y_train,y_test = train_test_split(sep_feats, match_targets, test_size=0.2, random_state=805)
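
The split above is a plain random split. As a possible variation (an assumption, not something the notebook does), passing `stratify` to `train_test_split` keeps the three species in equal proportion in the test set. The sketch uses the integer targets and the first two columns of `iris.data` (sepal length and sepal width) directly.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X = iris.data[:, :2]  # sepal length and sepal width
y = iris.target       # integer class labels 0, 1, 2

# Same test_size and random_state as the notebook; stratify=y is the added assumption.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=805, stratify=y
)

# Number of test samples per class.
test_counts = np.bincount(y_te)
print(X_tr.shape, X_te.shape, test_counts)
```

Since the dataset has 50 samples per class, a stratified 20% test split should hold 10 samples of each species.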

In [116]:
print(X_train.shape, X_train[::5], '\n\nTraining features\n\n') # plot this
print(y_train.shape, y_train[::5], '\n\nTraining labels')

(120, 2)             Sepal Length  Sepal Width
versicolor           4.9          2.4
virginica            6.8          3.2
setosa               5.1          3.5
setosa               5.3          3.7
virginica            6.0          3.0
setosa               4.9          3.6
setosa               4.7          3.2
virginica            6.7          3.0
versicolor           5.6          3.0
setosa               5.1          3.8
versicolor           6.8          2.8
virginica            4.9          2.5
versicolor           6.4          2.9
virginica            6.5          3.0
virginica            5.7          2.5
virginica            7.1          3.0
setosa               5.2          3.4
versicolor           6.3          2.3
versicolor           5.8          2.7
virginica            6.9          3.1
virginica            7.3          2.9
versicolor           5.0          2.0
setosa               4.6          3.1
versicolor           5.7          2.9 

Training features


(120,) ['versicolor

In [117]:
print(X_test.shape, X_test[::2], '\n\nTest features\n\n') # plot this
print(y_test.shape, y_test[::2], '\n\nTest labels')

(30, 2)             Sepal Length  Sepal Width
versicolor           5.6          2.7
versicolor           6.6          3.0
setosa               4.4          2.9
versicolor           5.5          2.6
virginica            7.7          3.8
setosa               5.4          3.9
setosa               5.1          3.8
virginica            7.2          3.2
setosa               5.7          4.4
setosa               5.1          3.7
versicolor           5.8          2.7
virginica            7.2          3.6
virginica            6.4          2.7
setosa               5.0          3.6
virginica            7.4          2.8 

Test features


(30,) ['versicolor' 'versicolor' 'setosa' 'versicolor' 'virginica' 'setosa'
 'setosa' 'virginica' 'setosa' 'setosa' 'versicolor' 'virginica'
 'virginica' 'setosa' 'virginica'] 

Test labels


In [118]:
#SGD with hinge loss function
clf_hinge = SGDClassifier(loss="hinge", tol=1e-4, random_state=0)
clf_hinge.fit(X_train, y_train)
score = clf_hinge.score(X_test, y_test)
addScore('SGD_hinge', score)
print(score)


predict = clf_hinge.predict(X_test) # store this classifier's predictions so both checks below use them
print(implementation_of_score_func(predict, y_test))
print(f'It is {implementation_of_score_func(predict, y_test) == score} that the built-in score and implemented version are equal')

0.7
0.7
It is True that the built-in score and implemented version are equal


In [119]:
#SGD with modified huber loss function
clf_mod_huber = SGDClassifier(loss="modified_huber", tol=1e-4, random_state=0)
clf_mod_huber.fit(X_train, y_train)
score = clf_mod_huber.score(X_test, y_test)
addScore('SGD_mod_huber', score)
print(score)

predict = clf_mod_huber.predict(X_test) # store this classifier's predictions so both checks below use them
print(implementation_of_score_func(predict, y_test))
print(f'It is {implementation_of_score_func(predict, y_test) == score} that the built-in score and implemented version are equal')

0.6666666666666666
0.6666666666666666
It is True that the built-in score and implemented version are equal


In [120]:
print(compiled_accurracies)

[{'clf': 'SGD_hinge', 'score': 0.7}, {'clf': 'SGD_mod_huber', 'score': 0.6666666666666666}]
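
One way to compare the collected scores is to load the list of dictionaries into a DataFrame and sort it. This is a minimal sketch: the list below copies the two entries printed above, and the names `scores` and `scores_df` are illustrative.

```python
import pandas as pd

# Same structure as the notebook's list of {'clf', 'score'} dicts,
# with the two values copied from the printed output.
scores = [
    {"clf": "SGD_hinge", "score": 0.7},
    {"clf": "SGD_mod_huber", "score": 0.6666666666666666},
]

# Highest score first, with a clean 0..n-1 index.
scores_df = (
    pd.DataFrame(scores)
    .sort_values("score", ascending=False)
    .reset_index(drop=True)
)
print(scores_df)
```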


# Discussion

## Large delta between the two-feature classifications when using the perceptron loss function
* Across the two distinct two-feature sets, the effect of the loss function on classification changed drastically. In the plot 'Scores across the various # of features', the accuracy of the perceptron loss function shows the greatest disparity between the two sets. In the plot of the decision regions for the 'Petal Width, Sepal Width' set, the versicolor class is grossly misclassified: its plotted training points extend into the decision regions (not to be confused with the hyperplanes) of both setosa and virginica. In the corresponding plot for the 'Petal Width, Petal Length' set, the higher score comes from a more accurate split of the decision regions, even though the training data is still not completely linearly separable. This delta between the two sets further shows the effect that linearly inseparable data has on the perceptron loss function, and thus on the classifier as a whole.

## Accuracy across the chosen combinations of features
* Looking across all of the features in the plot 'Scores across the various combination of features', the perceptron loss function gave the lowest accuracy score on the three-feature combination and on both two-feature combinations; on the four-feature combination, however, it gave the second-highest accuracy score. Of the four loss functions, the hinge loss recorded the highest accuracy score for the 'Sepal Width, Petal Width' two-feature combination and for the four-feature combination, while the squared hinge loss recorded the highest accuracy score for the 'Petal Width, Petal Length' two-feature combination and for the three-feature combination.

* In addition, the 'Petal Width, Sepal Width' two-feature combination and the three-feature combination exhibited similar deltas in accuracy and even converged on the same loss function, squared hinge, for the highest accuracy score. This may be due to the similarity in the distribution of the training data: comparing the 3D plot 'loss function: squared_hinge with score 0.933' for the three-feature combination with the 2D plot 'loss function: squared_hinge with score 0.933' for the 'Petal Width, Sepal Width' two-feature combination, the distribution of the three species is quite similar, with the setosa class linearly separable from the other two classes, while versicolor and virginica are not linearly separable from each other.
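
A comparison like the one discussed above can be sketched as a loop over loss functions and feature combinations. The four loss names are the ones mentioned in this notebook; which three features make up the three-feature combination is not stated here, so the choice below (all but sepal length) is an assumption, as are the reused split and classifier settings. The scores it prints are not guaranteed to match the ones quoted in the discussion.

```python
from sklearn import datasets
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
losses = ["hinge", "modified_huber", "perceptron", "squared_hinge"]

# Column indices into iris.data:
# 0 sepal length, 1 sepal width, 2 petal length, 3 petal width.
combos = {
    "Sepal Width, Petal Width": [1, 3],
    "Petal Length, Petal Width": [2, 3],
    "three features (assumed choice)": [1, 2, 3],
    "all four features": [0, 1, 2, 3],
}

results = []
for combo_name, cols in combos.items():
    X_tr, X_te, y_tr, y_te = train_test_split(
        iris.data[:, cols], iris.target, test_size=0.2, random_state=805
    )
    for loss in losses:
        clf = SGDClassifier(loss=loss, tol=1e-4, random_state=0).fit(X_tr, y_tr)
        results.append(
            {"features": combo_name, "loss": loss, "score": clf.score(X_te, y_te)}
        )

for row in results:
    print(f"{row['features']:<35} {row['loss']:<15} {row['score']:.3f}")
```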







