### Single-layer perceptron implemented with scikit-learn

In this notebook, we will implement a simple perceptron classifier using the scikit-learn library and train it on the Iris dataset.

We will assign the petal length and petal width of the 150 flower examples to the feature matrix, `X`, and the corresponding class labels of the flower species to the vector array, `y`:

In [2]:
from sklearn import datasets
import numpy as np 

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target

print('class labels :', np.unique(y))

class labels : [0 1 2]


The output of `np.unique(y)` shows that there are three classes in our dataset, stored as the integer labels 0, 1, and 2.

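To evaluate how well a trained model performs on unseen data, we will further split the dataset into separate training and test datasets. With `test_size=0.3`, 30% of the examples (45) go to the test set and the remaining 70% (105) to the training set, while `stratify=y` keeps the class proportions the same in both subsets.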

In [3]:
from sklearn.model_selection import train_test_split 
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1, stratify=y)

In [4]:
print(np.bincount(y))
print(np.bincount(y_train))
print(np.bincount(y_test))

[50 50 50]
[35 35 35]
[15 15 15]
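
To see what `stratify=y` contributes, the sketch below repeats the split with and without it and compares the class counts. The variable names (`y_train_strat`, `y_train_plain`, and so on) are illustrative and not part of the notebook; the unstratified counts depend on the random shuffle, so no particular values are claimed for them.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target

# Stratified split: class proportions are preserved in both subsets.
_, _, y_train_strat, y_test_strat = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Unstratified split: class counts depend on the random shuffle.
_, _, y_train_plain, y_test_plain = train_test_split(
    X, y, test_size=0.3, random_state=1)

print('stratified   train/test:', np.bincount(y_train_strat), np.bincount(y_test_strat))
print('unstratified train/test:', np.bincount(y_train_plain), np.bincount(y_test_plain))
```

With only 150 examples, an unstratified split can leave one class noticeably under-represented in the test set, which makes the accuracy estimate less reliable.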


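Many machine learning and optimization algorithms require feature scaling for optimal performance. For this purpose, we can use the `StandardScaler` class from the `preprocessing` module of scikit-learn. The scaler is fit on the training data only, and the same parameters are then used to transform both the training and test datasets.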

In [5]:
from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
sc.fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

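After standardizing `X_train`, each of its features has a *mean* of *0* and a *standard deviation* of *1*. Standardization only shifts and rescales the features; it does not make them normally distributed. Because `X_test` is transformed with the statistics of the training set, its mean and standard deviation are only approximately 0 and 1, which is why the values below are rounded.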

In [22]:
print("X_train_std standard deviation =", round(X_train_std.std()),", and X_train_std mean is equal to :", round(X_train_std.mean()))
print("X_test_std standard deviation =", round(X_test_std.std()),", and X_test_std mean is equal to :", round(X_test_std.mean()))


X_train_std standard deviation = 1 , and X_train_std mean is equal to : 0
X_test_std standard deviation = 1 , and X_test_std mean is equal to : 0
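
The rounded, whole-array statistics above hide what happens per feature. The following sketch, which assumes the same split and scaler settings as the cells above, reproduces `sc.transform` by hand from the fitted `mean_` and `scale_` attributes and prints the unrounded per-feature values. `X_train_manual` is an illustrative name introduced here.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

# Manual standardization using the statistics learned from the training set.
X_train_manual = (X_train - sc.mean_) / sc.scale_

print('matches sc.transform  :', np.allclose(X_train_std, X_train_manual))
print('train mean per feature:', X_train_std.mean(axis=0))
print('train std per feature :', X_train_std.std(axis=0))
print('test mean per feature :', X_test_std.mean(axis=0))
print('test std per feature  :', X_test_std.std(axis=0))
```

The training statistics should come out as 0 and 1 up to floating-point error, while the test statistics are expected to deviate slightly because the test set was not used to fit the scaler.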


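Having standardized the training data, we can now train a perceptron model. Here `eta0` is the learning rate, and `random_state` fixes the shuffling of the training data so that the result is reproducible.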

In [40]:
from sklearn.linear_model import Perceptron

ppn = Perceptron(eta0=0.1, random_state=1)
ppn.fit(X_train_std, y_train)

Perceptron(eta0=0.1, random_state=1)
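
Since the dataset has three classes, it may help to look at what the fitted model contains. As a minimal sketch under the same settings as above: scikit-learn's `Perceptron` handles multiclass problems with a one-vs-rest scheme, so it learns one weight vector and one bias per class, and the predicted class is the one with the highest score. The names `scores` and `manual_pred` are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)

ppn = Perceptron(eta0=0.1, random_state=1)
ppn.fit(X_train_std, y_train)

# One row of weights and one bias per class (3 classes, 2 features).
print('coef_ shape     :', ppn.coef_.shape)
print('intercept_ shape:', ppn.intercept_.shape)

# Linear score per class; the prediction is the class with the largest score.
scores = X_train_std @ ppn.coef_.T + ppn.intercept_
manual_pred = ppn.classes_[np.argmax(scores, axis=1)]
print('matches predict :', np.array_equal(manual_pred, ppn.predict(X_train_std)))
```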

Having trained a model in scikit-learn, we can make predictions via the `predict` method.

In [41]:
y_pred = ppn.predict(X_test_std)
print('misclassified examples: %d' % (y_test != y_pred).sum())

misclassified examples: 1


One misclassification out of 45 test examples corresponds to an error of about 2.2%. Another way to measure performance is to use the built-in scikit-learn metrics from the `metrics` module.

In [42]:
from sklearn.metrics import accuracy_score

print('Accuracy: %.3f' % accuracy_score(y_test, y_pred))

Accuracy: 0.978
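
The relationship between the misclassification count and the accuracy can be checked directly. The sketch below assumes the same pipeline as the cells above and computes the accuracy by hand as one minus the error rate; `n_errors` and `manual_acc` are illustrative names. The exact number of errors may vary between scikit-learn versions, so no output is shown.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

ppn = Perceptron(eta0=0.1, random_state=1)
ppn.fit(X_train_std, y_train)
y_pred = ppn.predict(X_test_std)

# Accuracy is the fraction of correct predictions: 1 - errors / total.
n_errors = (y_test != y_pred).sum()
manual_acc = 1 - n_errors / len(y_test)

print('misclassified examples: %d of %d' % (n_errors, len(y_test)))
print('manual accuracy       : %.3f' % manual_acc)
print('accuracy_score        : %.3f' % accuracy_score(y_test, y_pred))
```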


Alternatively, each classifier in scikit-learn has a `score` method, which computes the classifier's prediction accuracy by combining the `predict` call with `accuracy_score`.

In [43]:
print("accuracy: %.3f" % ppn.score(X_test_std, y_test))

accuracy: 0.978
