# Machine Learning Classifiers Using scikit-learn

The "No Free Lunch Theorem" by David H. Wolpert: no single classifier works best across all possible scenarios.

## The five main steps in training a supervised machine learning algorithm

- Selecting features and collecting labeled training examples.
- Choosing a performance metric.
- Choosing a classifier and an optimization algorithm.
- Evaluating the performance of the model.
- Tuning the algorithm.

## First steps with scikit-learn - training a perceptron

In [2]:
from sklearn import datasets
import numpy as np

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
print('Class labels:', np.unique(y))

Class labels: [0 1 2]


In [3]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.3,
    random_state=13,
    stratify=y,
)

### Important Note

Note that passing `stratify=y` makes `train_test_split` return training and test subsets that have the same proportion of class labels as the input dataset.

We can use NumPy's `bincount` function, which counts the number of occurrences of each value in an array, to verify that this is indeed the case:

In [5]:
print('Labels counts in y:', np.bincount(y))
print('Labels counts in y_train:', np.bincount(y_train))
print('Labels counts in y_test:', np.bincount(y_test))

Labels counts in y: [50 50 50]
Labels counts in y_train: [35 35 35]
Labels counts in y_test: [15 15 15]
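
For contrast, here is a minimal sketch that repeats the split with and without `stratify`. The variable names (`y_train_strat`, `y_train_plain`, and so on) are chosen for illustration only. Without stratification the per-class counts are not guaranteed to be balanced, so no specific output is claimed for that case.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target

# Stratified split: class proportions of y are preserved in both subsets.
_, _, y_train_strat, y_test_strat = train_test_split(
    X, y, test_size=0.3, random_state=13, stratify=y
)

# Plain random split: class proportions may drift from those of y.
_, _, y_train_plain, y_test_plain = train_test_split(
    X, y, test_size=0.3, random_state=13
)

print('Stratified train/test:', np.bincount(y_train_strat), np.bincount(y_test_strat))
print('Plain train/test:     ',
      np.bincount(y_train_plain, minlength=3),
      np.bincount(y_test_plain, minlength=3))
```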


## Feature Scaling using `StandardScaler`

### Procedure:

1. Create a `StandardScaler` object.
2. Call its `fit` method to estimate the following parameters for each feature dimension from the training data:
    - sample mean
    - standard deviation
3. Call its `transform` method to standardize the training data using those estimated parameters. The same parameters are then used to standardize the test data, so that both sets are on the same scale.

In [6]:
from sklearn.preprocessing import StandardScaler

sc = StandardScaler() # StandardScaler object
sc.fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)
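
To make the procedure concrete, the sketch below recomputes the standardization by hand and compares it with the result from `StandardScaler`. It relies on `StandardScaler` using the population standard deviation (`ddof=0`), which is also NumPy's default for `std`. The names `mu`, `sigma`, and `X_test_manual` are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=13, stratify=y
)

sc = StandardScaler().fit(X_train)

# Parameters estimated from the training data only.
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)  # ddof=0, i.e. population standard deviation

# Apply the training-set parameters to the test set by hand.
X_test_manual = (X_test - mu) / sigma

matches_scaler = np.allclose(X_test_manual, sc.transform(X_test))
print('Manual standardization matches StandardScaler:', matches_scaler)
```

The fitted parameters are also available on the scaler itself as `sc.mean_` and `sc.scale_`.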

## Training a perceptron model

Most algorithms in scikit-learn already support multiclass classification by default via the **one-vs.-rest (OvR)** method.

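The `Perceptron` class is instantiated with the following parameters:

- `eta0`: the learning rate
- `random_state`: the seed used when shuffling the training data after each epoch, which makes the results reproducible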

Note that finding a suitable learning rate requires some experimentation. If the learning rate is too large, the algorithm will overshoot the global cost minimum. If it is too small, the algorithm will require more epochs to converge, which can make learning slow (especially for large datasets).
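
The effect of the learning rate can be illustrated with a toy example. The sketch below is not the perceptron's update rule: it runs plain gradient descent on the hypothetical cost function J(w) = w², whose minimum is at w = 0, and the helper `gradient_descent` is introduced purely for illustration.

```python
def gradient_descent(eta, n_steps=50, w0=1.0):
    """Minimize the toy cost J(w) = w**2 (gradient 2*w), starting from w0."""
    w = w0
    for _ in range(n_steps):
        w -= eta * 2.0 * w  # step against the gradient, scaled by eta
    return w

# Too small: barely moves. Moderate: converges. Too large: overshoots and diverges.
for eta in (0.001, 0.1, 1.1):
    print(f'eta={eta}: w after 50 steps = {gradient_descent(eta):.4g}')
```

With the small rate, `w` is still far from zero after 50 steps. With the large rate, each step overshoots the minimum by more than the previous one, so `|w|` grows instead of shrinking.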


In [7]:
from sklearn.linear_model import Perceptron

ppn = Perceptron(
    eta0=0.1,
    random_state=13,
)

ppn.fit(X_train_std, y_train)

Perceptron(alpha=0.0001, class_weight=None, early_stopping=False, eta0=0.1,
           fit_intercept=True, max_iter=1000, n_iter_no_change=5, n_jobs=None,
           penalty=None, random_state=13, shuffle=True, tol=0.001,
           validation_fraction=0.1, verbose=0, warm_start=False)
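
The one-vs.-rest behaviour mentioned above can be inspected on a fitted model. The following sketch refits the same perceptron from scratch and checks that it holds one weight vector per class, and that the predicted label is the class with the highest decision score. The names `scores`, `manual_pred`, and `agree` are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=13, stratify=y
)
sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

ppn = Perceptron(eta0=0.1, random_state=13).fit(X_train_std, y_train)

# One binary classifier per class: shape is (n_classes, n_features).
print('coef_ shape:', ppn.coef_.shape)

# Each column holds the score of one "class vs. rest" classifier.
scores = ppn.decision_function(X_test_std)
manual_pred = ppn.classes_[np.argmax(scores, axis=1)]

agree = np.array_equal(manual_pred, ppn.predict(X_test_std))
print('argmax of scores matches predict():', agree)
```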