# Choosing a classification algorithm

When choosing a classifier, it is important to consider the problem at hand. There is always a trade-off between ***computational performance*** and ***predictive performance***. The five main steps in training a supervised machine learning model are:

1. Selecting features and collecting labeled training examples

2. Choosing a performance metric

3. Choosing a learning algorithm and training a model

4. Evaluating the performance of a model

5. Changing the settings of the algorithm and tuning the model
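
The five steps can be sketched end to end as follows. This is a minimal illustration, not a prescribed workflow: the candidate `eta0` values, the use of 5-fold cross-validation on the training set, and the names `results`, `best_eta0`, and `final_model` are assumptions made for this example.

```python
from sklearn import datasets
from sklearn.linear_model import Perceptron
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Step 1: select features (petal length and width) and collect labels
iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Step 2: choose a performance metric (accuracy here)
# Steps 3-5: train the model under several settings and compare them
# with cross-validation on the training set only (assumed candidate values)
results = {}
for eta0 in (0.001, 0.01, 0.1):
    model = make_pipeline(StandardScaler(),
                          Perceptron(eta0=eta0, random_state=1))
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
    results[eta0] = scores.mean()

# Refit the best setting on the full training set, then evaluate once on the test set
best_eta0 = max(results, key=results.get)
final_model = make_pipeline(StandardScaler(),
                            Perceptron(eta0=best_eta0, random_state=1))
final_model.fit(X_train, y_train)
test_accuracy = final_model.score(X_test, y_test)

print("Cross-validated accuracy per eta0:", results)
print("Test accuracy:", test_accuracy)
```

Keeping the test set out of the tuning loop means the final accuracy is measured on data that played no part in choosing the settings.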

In [5]:
from sklearn import datasets
import numpy as np

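iris = datasets.load_iris()
# Use only petal length and petal width (columns 2 and 3) as features
X = iris.data[:, [2, 3]]
# Class labels are already encoded as integers
y = iris.target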

print("Class labels: ", np.unique(y))

Class labels:  [0 1 2]


In [12]:
from sklearn.model_selection import train_test_split
# stratify ensures that training and test sets contain the same proportion of class labels as the input dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1, stratify=y)

In [13]:
print("Label counts in y: ", np.bincount(y))
print("Label counts in y_train: ", np.bincount(y_train))
print("Label counts in y_test: ", np.bincount(y_test))

Label counts in y:  [50 50 50]
Label counts in y_train:  [35 35 35]
Label counts in y_test:  [15 15 15]
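
To see what `stratify=y` changes, the same split can be drawn with and without it. This is a small sketch that reuses the settings from the cells above; the names `y_test_plain` and `y_test_strat` are introduced only for this comparison. The unstratified counts depend on the random seed and are not guaranteed to be balanced.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target

# Same test size and seed, once without and once with stratification
_, _, _, y_test_plain = train_test_split(
    X, y, test_size=0.3, random_state=1)
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

print("Test label counts without stratify:", np.bincount(y_test_plain, minlength=3))
print("Test label counts with stratify:   ", np.bincount(y_test_strat, minlength=3))
```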


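#### Standardizing using scikit-learn
scikit-learn's `preprocessing` module provides `StandardScaler`, which rescales each feature to zero mean and unit variance. Many learning algorithms perform better when the features are on the same scale. The scaler is fit on the training set only, and the same parameters are then used to transform both the training and the test set.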


In [17]:
from sklearn.preprocessing import StandardScaler
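sc = StandardScaler()
# Estimate each feature's mean and standard deviation from the training set only
sc.fit(X_train)
# Apply the same training-set parameters to both splits
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)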
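
Standardization computes $x' = (x - \mu) / \sigma$ per feature, where $\mu$ and $\sigma$ are the mean and standard deviation of the training set. The sketch below reproduces the transform by hand to make that explicit; `mu`, `sigma`, and `X_test_manual` are names introduced here for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

sc = StandardScaler().fit(X_train)

# Per-feature statistics taken from the training set only
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)  # population standard deviation (ddof=0)

# Manual standardization of the test set using the training-set statistics
X_test_manual = (X_test - mu) / sigma

matches_scaler = np.allclose(X_test_manual, sc.transform(X_test))
train_mean_is_zero = np.allclose(sc.transform(X_train).mean(axis=0), 0.0)
print("Manual transform matches StandardScaler:", matches_scaler)
print("Standardized training features have zero mean:", train_mean_is_zero)
```

Only the training features are guaranteed to end up with exactly zero mean and unit variance; the test features are shifted and scaled by the same training-set values.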

#### Training the model
scikit-learn's built-in `Perceptron` handles multiclass classification with the one-vs-rest (OvR) method, so we can pass all three classes to the model at once.
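
To make one-vs-rest concrete, the sketch below fits the same kind of model and inspects it: there is one weight vector per class, and the predicted label is the class whose decision score is highest. It assumes the same split and settings used in this notebook; `scores` and `manual_pred` are names introduced for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

ppn = Perceptron(eta0=0.01, random_state=1).fit(X_train_std, y_train)

# One row of weights per class, one column per feature
print("coef_ shape:", ppn.coef_.shape)

# One decision score per class for every test example
scores = ppn.decision_function(X_test_std)

# Picking the class with the highest score reproduces predict()
manual_pred = ppn.classes_[np.argmax(scores, axis=1)]
same_as_predict = np.array_equal(manual_pred, ppn.predict(X_test_std))
print("argmax over class scores matches predict():", same_as_predict)
```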

In [20]:
from sklearn.linear_model import Perceptron
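ppn = Perceptron(eta0=0.01, random_state=1)
ppn.fit(X_train_std, y_train)
# Predict on the standardized test set: the model was fit on standardized features.
# Passing the raw X_test here misclassifies 30 of the 45 test examples.
y_pred = ppn.predict(X_test_std)
print('Misclassified examples: %d' % (y_test != y_pred).sum())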
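
The misclassification count can also be expressed as classification accuracy, which corresponds to step 4 in the list at the top. The following is a minimal sketch, assuming the same petal-feature split and perceptron settings as in the cells above; `accuracy_score` and the estimator's `score` method both report the fraction of correctly classified test examples.

```python
from sklearn import datasets
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

ppn = Perceptron(eta0=0.01, random_state=1).fit(X_train_std, y_train)
y_pred = ppn.predict(X_test_std)

# Accuracy = 1 - (misclassified examples / total test examples)
errors = int((y_test != y_pred).sum())
acc = accuracy_score(y_test, y_pred)

print("Misclassified examples: %d" % errors)
print("Accuracy (accuracy_score): %.3f" % acc)
print("Accuracy (ppn.score):      %.3f" % ppn.score(X_test_std, y_test))
```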
