![@mikegchambers](../../images/header.png)

# K-Nearest Neighbor

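In this notebook, we explore the K-Nearest Neighbors (KNN) classifier using scikit-learn. KNN classifies a new point by a majority vote among the K training points closest to it.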

![Clusters](clusters.png)

In [None]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import make_blobs

import numpy as np
import matplotlib.pyplot as plt

# Use the ggplot look for all plots
plt.style.use('ggplot')

# Data

In [None]:
# Generate 100 two-dimensional points grouped around 4 cluster centers
X, y = make_blobs(n_samples=100, centers=4, cluster_std=1.5)

Here we map each class label to a color by hand. This helps later when we plot a prediction on the graph.

In [None]:
colors = ['red', 'green', 'blue', 'yellow']
mapped_colors = []
for yc in y:
    mapped_colors.append(colors[yc])
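
The loop above can also be written as a single NumPy indexing step. The sketch below is only an illustration: `y_demo` and `mapped_demo` are hypothetical names, and the labels are made up rather than taken from `make_blobs`.

```python
import numpy as np

colors = ['red', 'green', 'blue', 'yellow']

# Hypothetical labels standing in for the y array returned by make_blobs
y_demo = np.array([2, 0, 3, 1, 0])

# Indexing an array of color names with an array of integer labels
# looks up one color per label, replacing the explicit loop
mapped_demo = np.array(colors)[y_demo]

print(mapped_demo.tolist())
```

Either form gives one color name per data point, which is what `scatter` expects for its `c` argument.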

Let's plot our data:

In [None]:
axes = plt.axes()
axes.scatter(X[:,0], X[:,1], c=mapped_colors)

plt.show()

# Model

https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html

In [None]:
# Number of neighbors that vote on each prediction
K = 10
model = KNeighborsClassifier(n_neighbors=K)

In [None]:
model.fit(X, y)
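
To make the idea behind the classifier concrete, here is a minimal sketch of a KNN prediction written by hand with NumPy. It assumes Euclidean distance and an unweighted majority vote, which match the defaults of `KNeighborsClassifier`. The function `knn_predict` and the tiny dataset are hypothetical, and the way ties are broken may differ from scikit-learn.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, point, k):
    """Classify `point` by majority vote among its k nearest training points."""
    # Euclidean distance from the query point to every training point
    distances = np.linalg.norm(X_train - np.asarray(point), axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Most common label among those neighbors
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Hypothetical dataset: two well-separated groups of three points each
X_demo = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                   [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_demo = np.array([0, 0, 0, 1, 1, 1])

label_near_origin = knn_predict(X_demo, y_demo, [0.3, 0.3], k=3)
label_near_five = knn_predict(X_demo, y_demo, [4.8, 5.0], k=3)
print(label_near_origin, label_near_five)
```

Note that "fitting" a KNN model mostly means storing the training data; the real work happens at prediction time, when distances to the query point are computed.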

# Testing

Let's make a prediction for a new data point:

In [None]:
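# The test point we want to classify
t = [0,0]

# predict expects a 2-D array of samples, so wrap the single point in a list
p = model.predict([t])
# Look up the plot color for the predicted class
p_mapped_color = colors[p[0]]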
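
Besides the predicted label, it can be useful to see how the vote was split. The sketch below uses `predict_proba`, which, with the default uniform weights, reports the share of the K neighbors that belong to each class. The dataset and the names `X_demo`, `y_demo` and `demo_model` are hypothetical and independent of the notebook's `model`.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical dataset: three points near the origin, three near (5, 5)
X_demo = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                   [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_demo = np.array([0, 0, 0, 1, 1, 1])

demo_model = KNeighborsClassifier(n_neighbors=5)
demo_model.fit(X_demo, y_demo)

# One row per query point, one column per class in demo_model.classes_
proba = demo_model.predict_proba([[0.3, 0.3]])

print(demo_model.classes_)
print(proba)
```

Here the 5 nearest neighbors of the query point are the three class-0 points and two of the class-1 points, so the vote shares are 3/5 and 2/5.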

Let's get the nearest neighbors that were used to make that classification:

In [None]:
neighbors = model.kneighbors([t])
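
`kneighbors` returns a tuple of two arrays: the distances to the neighbors and their row indices in the training data, each with one row per query point. The sketch below shows that structure on a small hypothetical dataset; `X_demo`, `y_demo`, `demo_model` and `neighbor_points` are illustrative names only.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training data
X_demo = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
y_demo = np.array([0, 0, 1, 1])

demo_model = KNeighborsClassifier(n_neighbors=2).fit(X_demo, y_demo)

# Unpack the tuple: both arrays have shape (n_queries, n_neighbors)
distances, indices = demo_model.kneighbors([[0.0, 0.0]])

# Use the indices of the first (and only) query to fetch the neighbor coordinates
neighbor_points = X_demo[indices[0]]

print(distances.shape, indices.shape)
print(indices[0], distances[0])
```

In the notebook's own code, `neighbors[0]` therefore holds the distances and `neighbors[1]` the indices.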

And let's plot that on a graph:

In [None]:
axes = plt.axes()

# Plot the original points
axes.scatter(X[:,0], X[:,1], c=mapped_colors)

# Plot the test point
axes.scatter(t[0], t[1], s=300, linewidth=3, facecolors='none', edgecolors=p_mapped_color)
axes.scatter(t[0], t[1], c='black', marker="x", s=200)

# Circle each of the K nearest neighbors (neighbors[1] holds their indices in X)
nearest = X[neighbors[1][0]]
axes.scatter(nearest[:,0], nearest[:,1], s=150, linewidth=1, facecolors='none', edgecolors='black')

plt.show()