# Machine Learning Classifiers Using scikit-learn

The "No Free Lunch" theorem, formulated by David H. Wolpert, implies that no single classifier works best across all possible scenarios.

## The five main steps in training a supervised machine learning algorithm

- Selecting features and collecting labeled training examples.
- Choosing a performance metric.
- Choosing a classifier and an optimization algorithm.
- Evaluating the performance of the model.
- Tuning the algorithm.
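
The five steps above can be sketched end to end in a few lines. This is only an illustrative outline, not a recommended recipe: the choice of `LogisticRegression`, the accuracy metric, the candidate values for `C`, and the 5-fold cross-validation are all assumptions made for the example.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split

# Step 1: select features (petal length and width) and collect labeled examples.
iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=13, stratify=y
)

# Step 2: choose a performance metric (accuracy is an assumption here).
scoring = "accuracy"

# Steps 3 and 5: choose a classifier and tune it. The candidate values of C
# are illustrative; cross-validation only ever sees the training set.
candidates = [0.01, 1.0, 100.0]
cv_scores = {
    C: cross_val_score(
        LogisticRegression(C=C, max_iter=1000),
        X_train, y_train, cv=5, scoring=scoring,
    ).mean()
    for C in candidates
}
best_C = max(cv_scores, key=cv_scores.get)

# Step 4: evaluate the tuned model once on the held-out test set.
model = LogisticRegression(C=best_C, max_iter=1000).fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"best C: {best_C}, test accuracy: {test_accuracy:.3f}")
```

Tuning on cross-validation folds of the training data, rather than on the test set, keeps the final test score an honest estimate of performance on unseen examples.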

## First steps with scikit-learn - training a perceptron

In [2]:
from sklearn import datasets
import numpy as np

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
print('Class labels:', np.unique(y))

Class labels: [0 1 2]


In [3]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.3,
    random_state=13,
    stratify=y
)

### Important Note

Note that passing `stratify=y` makes `train_test_split` return training and test subsets that have the same proportion of class labels as the input dataset.

We can use NumPy's `bincount` function, which counts the number of occurrences of each value in an array, to verify that this is indeed the case.

In [4]:
print('Labels counts in y:', np.bincount(y))

Labels counts in y: [50 50 50]
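
The count above covers only the full label array. To confirm that the class proportions are preserved, the same check can be applied to the two subsets. The sketch below repeats the split from the earlier cell so that it runs on its own; the names `toy`, `counts_train`, and `counts_test` are introduced here for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

# np.bincount returns, at index i, how often the integer i occurs.
toy = np.array([0, 1, 1, 2, 2, 2])
print('Toy counts:', np.bincount(toy))  # [1 2 3]

# Same data and split settings as in the cells above.
iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=13, stratify=y
)

counts_train = np.bincount(y_train)
counts_test = np.bincount(y_test)
print('Labels counts in y_train:', counts_train)
print('Labels counts in y_test:', counts_test)
```

With 150 samples, a 70/30 split, and three equally sized classes, stratification should yield 35 examples per class in the training set and 15 per class in the test set.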
