In [None]:
%matplotlib inline

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn import neighbors, datasets

# import some data to play with
iris = datasets.load_iris()

# Index

* [About classification algorithms](#About-classification-algorithms)
* [K nearest neighbors](#KNN)
* [Decision Trees](#Decision-Trees)

## About classification algorithms
Some classification algorithms can only distinguish between two classes. How can we use them in multiclass problems? There are two approaches:
    
* **One vs one:** is the approach where we evaluate the classes in pairs. Say we have three classes, A, B and C. The OVO ensemble will be composed of 3 (= 3 * (3 - 1) / 2) binary classifiers. The first will discriminate A from B, the second A from C, and the third B from C. At prediction time, each binary classifier votes for one of its two classes, and the class with the most votes is our winner. Notice that the number of classifiers grows as $O(n^2)$ with the number of classes $n$.
      
* **One vs rest:** (aka one-vs-all) is the strategy that trains one classifier (estimator) per class, each separating that class from all the others, and then picks the class whose classifier gives the highest confidence.

[wiki](https://en.wikipedia.org/wiki/Multiclass_classification#Transformation_to_binary)
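Both strategies can be sketched with the wrappers in `sklearn.multiclass`. The base estimator (`LogisticRegression`), the variable names and the helper `n_ovo_classifiers` are assumptions made for this illustration; any binary classifier could be wrapped the same way.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

# Iris has 3 classes, like the A/B/C example above
X_demo, y_demo = load_iris(return_X_y=True)

# Assumption: logistic regression as the underlying binary classifier
base = LogisticRegression(max_iter=1000)

ovo_clf = OneVsOneClassifier(base).fit(X_demo, y_demo)
ovr_clf = OneVsRestClassifier(base).fit(X_demo, y_demo)


def n_ovo_classifiers(n_classes):
    """Hypothetical helper: number of pairwise classifiers for n classes."""
    return n_classes * (n_classes - 1) // 2


# One fitted binary estimator per pair (OvO) or per class (OvR)
print("OvO estimators:", len(ovo_clf.estimators_))
print("OvR estimators:", len(ovr_clf.estimators_))
print("OvO estimators needed for 10 classes:", n_ovo_classifiers(10))
```

With three classes both ensembles happen to contain three binary classifiers; the difference only shows up with more classes, where one vs one grows quadratically and one vs rest grows linearly.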

## KNN
K nearest neighbors is a *lazy* algorithm: it does not build a model during training but stores the training samples and defers the computation to classification time. To classify a new point, it finds a predefined number (k) of training samples closest in distance to that point and predicts the label from them.
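The idea fits in a few lines of NumPy. This is a minimal sketch, assuming Euclidean distance and a plain majority vote; the function `knn_predict` and the toy data are made up for the illustration and are not how scikit-learn implements it.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k=3):
    """Illustrative KNN: no training step, all the work happens here."""
    # Euclidean distance from the new point to every stored sample
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest training samples
    nearest = np.argsort(distances)[:k]
    # Majority vote among their labels
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Toy data: two samples of class 0 near the origin, three of class 1 near (1, 1)
X_toy = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1], [1.2, 0.8]])
y_toy = np.array([0, 0, 1, 1, 1])

print(knn_predict(X_toy, y_toy, np.array([0.2, 0.1]), k=3))  # → 0
print(knn_predict(X_toy, y_toy, np.array([1.0, 0.9]), k=3))  # → 1
```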

Notice that by default KNN gives the k closest samples an equal vote regardless of how far away they are. To mitigate this effect, a weight parameter can be added so that closer neighbors count more (in scikit-learn, `weights='distance'`).
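A tiny one-dimensional example shows how the weighting can change the prediction. The data is invented for the illustration: one class-0 sample sits right next to the query point, while two class-1 samples are far away.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# One nearby sample of class 0 and two distant samples of class 1
X_w = np.array([[0.0], [5.0], [6.0]])
y_w = np.array([0, 1, 1])
query = np.array([[0.5]])

# Uniform weights: every neighbor gets one vote, so class 1 wins 2 to 1
uniform_knn = KNeighborsClassifier(n_neighbors=3, weights='uniform').fit(X_w, y_w)
pred_uniform = uniform_knn.predict(query)[0]

# Distance weights: votes are scaled by 1 / distance, so the close sample dominates
distance_knn = KNeighborsClassifier(n_neighbors=3, weights='distance').fit(X_w, y_w)
pred_distance = distance_knn.predict(query)[0]

print(pred_uniform)   # → 1
print(pred_distance)  # → 0
```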

KNN can also be applied to time series, but those are mostly regression problems; we'll see them in due time.

**Key features of KNN:**
* Easy to understand and implement.
* Computationally efficient with small datasets (the prediction cost grows with the size of the training set).
* Requires defining a similarity (distance) measure between samples.
* Often the first thing worth trying when approaching an ML problem.
* It suffers especially from the [curse of dimensionality](./../Glossary.ipynb/#C).
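One way to see why high dimensionality hurts KNN is to check how the nearest and the farthest neighbor compare as the number of features grows. The sketch below uses uniformly random points, which is an assumption made only for the illustration; `nearest_to_farthest_ratio` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)


def nearest_to_farthest_ratio(n_dims, n_samples=500):
    """Ratio between the smallest and largest distance to a random query point."""
    points = rng.random((n_samples, n_dims))
    query = rng.random(n_dims)
    distances = np.linalg.norm(points - query, axis=1)
    return distances.min() / distances.max()


# A ratio close to 1 means that "near" and "far" neighbors are barely distinguishable
ratios = {d: nearest_to_farthest_ratio(d) for d in (2, 10, 100, 1000)}
for n_dims, ratio in ratios.items():
    print(f"{n_dims:>4} dimensions: nearest/farthest = {ratio:.3f}")
```

As the ratio approaches 1, the k nearest samples are hardly closer than any other sample, so the vote carries less information.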

In [None]:
##### PART #1, load and preprocess the data #####

# we only take the first two features in the dataset
X = iris.data[:, :2]
cols = iris['feature_names'][:2]
y = iris.target


##### PART #2, create the model #####

# Number of neighbors and weight
k, w = 30, 'distance'

# we create an instance of KNeighborsClassifier and fit the data.
clf = neighbors.KNeighborsClassifier(n_neighbors=k, weights=w)
clf.fit(X, y)


##### PART #3, plot the outcome #####

# Create a mesh of points covering the feature space; the model will predict a class for each of them
h = .01  # step size in the mesh
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# Once the mesh is created, feed all of its points to the model and predict their classes
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

# Plot the prediction areas (background)
cmap_light = ListedColormap(['#FFAAAA', '#AAFFAA', '#AAAAFF'])
Z = Z.reshape(xx.shape)  # reshape to match the grid, same as yy.shape
plt.figure()
plt.pcolormesh(xx, yy, Z, cmap=cmap_light)

# Plot also the training points (real points)
cmap_bold = ListedColormap(['#FF0000', '#00FF00', '#0000FF'])
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold, edgecolor='k', s=20)
plt.xlabel(cols[0])
plt.ylabel(cols[1])
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.title("3-Class classification (k = %i)" % k)

plt.show()

## Decision Trees