#### Choosing a classification algorithm

In practice, it is always recommended that you compare the performance of at least a handful of different learning algorithms to select the best model for the particular problem. 

The five main steps involved in training a supervised machine learning algorithm can be summarized as follows:

1. Selecting features and collecting labeled training examples.
1. Choosing a performance metric.
1. Choosing a classifier and optimization algorithm.
1. Evaluating the performance of the model.
1. Tuning the algorithm.
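
As a rough illustration of comparing a handful of algorithms, the sketch below scores three classifiers with the same metric on the same data. The choice of candidates (`LogisticRegression`, `KNeighborsClassifier`), the 5-fold cross-validation, and the hyperparameter values are assumptions made for this example, not recommendations from the text.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression, Perceptron
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Step 1: features and labeled examples (Iris is used here only as a stand-in)
X, y = load_iris(return_X_y=True)

# Step 3: a few candidate classifiers (illustrative choices)
candidates = {
    'perceptron': Perceptron(eta0=0.1, random_state=1),
    'logistic regression': LogisticRegression(max_iter=1000),
    'k-nearest neighbors': KNeighborsClassifier(n_neighbors=5),
}

# Steps 2 and 4: evaluate every candidate with the same metric (accuracy)
scores = {}
for name, clf in candidates.items():
    # Scaling lives inside the pipeline so it is refit on each training fold
    model = make_pipeline(StandardScaler(), clf)
    scores[name] = cross_val_score(model, X, y, cv=5, scoring='accuracy').mean()
    print('%s: %.3f' % (name, scores[name]))
```

Step 5, tuning, would then focus on the most promising candidates rather than on all of them.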


Implementation of Perceptron using scikit-learn

In [1]:
from sklearn import datasets
import numpy as np

iris = datasets.load_iris()
X = iris.data[:, [2,3]]
y = iris.target
print('Class labels: ', np.unique(y))

Class labels:  [0 1 2]


To evaluate how well a trained model performs on unseen data, we will further split the dataset into separate training and test datasets.

In [5]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1, stratify=y)

We took advantage of the built-in support for stratification via `stratify=y`. In this context, stratification means that the `train_test_split` function returns training and test subsets that have the same proportions of class labels as the input dataset.

To verify this, we can use NumPy's `bincount` function, which counts the number of occurrences of each value in an array.

In [11]:
print('Labels counts in y: ', np.bincount(y))
print('Labels counts in y_train: ', np.bincount(y_train))
print('Labels counts in y_test:', np.bincount(y_test))

Labels counts in y:  [50 50 50]
Labels counts in y_train:  [35 35 35]
Labels counts in y_test: [15 15 15]
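
For contrast, the sketch below repeats the split with and without `stratify=y` and compares the class counts in the test subset. It reloads the data so that it runs on its own; the variable names `y_test_strat` and `y_test_plain` are introduced here only for illustration. Without stratification the counts depend on the random shuffle and are not guaranteed to be balanced.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target

# Same split as above, with stratification
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Same split size, but without stratification
_, _, _, y_test_plain = train_test_split(
    X, y, test_size=0.3, random_state=1)

strat_counts = np.bincount(y_test_strat, minlength=3)
plain_counts = np.bincount(y_test_plain, minlength=3)

print('Stratified test counts:  ', strat_counts)
print('Unstratified test counts:', plain_counts)
```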


Many machine learning and optimization algorithms also require feature scaling for optimal performance. Here, we will standardize the features using the `StandardScaler` class from scikit-learn's `preprocessing` module.

In [12]:
from sklearn.preprocessing import StandardScaler

sc = StandardScaler()

sc.fit(X_train)

X_train_std = sc.transform(X_train)

X_test_std = sc.transform(X_test)

Using the `fit` method, `StandardScaler` estimated the parameters $\mu$ (sample mean) and $\sigma$ (standard deviation) for each feature dimension from the training data. By calling the `transform` method, we then standardized the training data using those estimated parameters. We standardized the test dataset with the same $\mu$ and $\sigma$ so that the values in the training and test datasets are comparable to each other.
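
To make the role of $\mu$ and $\sigma$ concrete, the following sketch reproduces `StandardScaler` by hand as $(x - \mu) / \sigma$. The small arrays are made-up toy data standing in for `X_train` and `X_test`, and the names `X_train_manual` and `X_test_manual` are used only in this example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data standing in for the training and test features (hypothetical values)
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
X_test = np.array([[2.5, 25.0], [5.0, 50.0]])

sc = StandardScaler().fit(X_train)

# Per-feature parameters, estimated from the training data only
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)  # ddof=0 (population std), matching StandardScaler

# The same mu and sigma are applied to both subsets
X_train_manual = (X_train - mu) / sigma
X_test_manual = (X_test - mu) / sigma

print(np.allclose(sc.transform(X_train), X_train_manual))
print(np.allclose(sc.transform(X_test), X_test_manual))
```

The fitted parameters are also available on the scaler itself as `sc.mean_` and `sc.scale_`.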

We now train the Perceptron model. Most algorithms in scikit-learn already support multiclass classification by default via the **one-vs.-rest (OvR)** method, which allows us to feed all three flower classes to the perceptron at once.

In [14]:
from sklearn.linear_model import Perceptron

ppn = Perceptron(eta0=0.1)
ppn.fit(X_train_std, y_train)
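
One way to see OvR at work is to inspect the fitted parameters: the model holds one weight vector and one bias per class, and the predicted label is the class with the largest decision score. The sketch below repeats the preprocessing so that it runs on its own; `random_state=1` on the perceptron is an assumption added here for reproducibility and is not part of the cell above.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

sc = StandardScaler().fit(X_train)
X_train_std, X_test_std = sc.transform(X_train), sc.transform(X_test)

# random_state is an assumption for this sketch
ppn = Perceptron(eta0=0.1, random_state=1).fit(X_train_std, y_train)

# One row of weights and one bias per class: 3 classes, 2 features
print(ppn.coef_.shape)
print(ppn.intercept_.shape)

# One score per class and example; the highest score wins
scores = ppn.decision_function(X_test_std)
y_pred_manual = ppn.classes_[np.argmax(scores, axis=1)]
print(np.array_equal(y_pred_manual, ppn.predict(X_test_std)))
```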

In [15]:
y_pred = ppn.predict(X_test_std)
print('Misclassified examples: %d' % (y_test != y_pred).sum())

Misclassified examples: 3


In [16]:
from sklearn.metrics import accuracy_score
print('Accuracy: %.3f' % accuracy_score(y_test, y_pred))

Accuracy: 0.933
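
The accuracy is simply one minus the misclassification rate: with 3 errors among the 45 test examples, $1 - 3/45 \approx 0.933$. The sketch below checks this relationship on a small, made-up pair of label arrays (`y_true` and `y_hat` are hypothetical and unrelated to the Iris predictions above).

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical true and predicted labels, for illustration only
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1, 0, 2])
y_hat = np.array([0, 0, 1, 2, 2, 2, 1, 1, 0, 2])

n_errors = (y_true != y_hat).sum()      # count of misclassified examples
error_rate = n_errors / len(y_true)     # fraction misclassified
accuracy = accuracy_score(y_true, y_hat)

print('Misclassified examples: %d' % n_errors)
print('Accuracy: %.3f' % accuracy)
print('1 - error: %.3f' % (1 - error_rate))
```

A fitted scikit-learn classifier also exposes a `score` method that combines `predict` with `accuracy_score` in a single call.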
