### Classification Example
K nearest neighbors (kNN) is one of the simplest learning strategies: given a new, unknown observation, find the k samples in your reference database whose features are closest to it and assign the class that is most common among them.
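
The lookup-and-vote idea can be sketched in a few lines of NumPy. This is only a minimal illustration, not how scikit-learn implements it: the function name `knn_predict`, the choice of Euclidean distance, and the toy data are all assumptions made for the example.

```python
import numpy as np

def knn_predict(X_ref, y_ref, query, k=5):
    """Classify one sample by majority vote among its k nearest references.

    Hypothetical helper for illustration; assumes integer class labels >= 0.
    """
    X_ref = np.asarray(X_ref, dtype=float)
    y_ref = np.asarray(y_ref)
    # Euclidean distance from the query to every reference sample
    distances = np.linalg.norm(X_ref - np.asarray(query, dtype=float), axis=1)
    # Indices of the k closest reference samples
    nearest = np.argsort(distances)[:k]
    # Most frequent class label among those neighbors
    return int(np.bincount(y_ref[nearest]).argmax())

# Toy reference set (made-up data): two well-separated clusters
X_toy = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_toy = [0, 0, 0, 1, 1, 1]

print(knn_predict(X_toy, y_toy, [0.5, 0.5], k=3))  # prints 0
print(knn_predict(X_toy, y_toy, [5.5, 5.5], k=3))  # prints 1
```

In this sketch a tied vote goes to the smallest class label, because `argmax` returns the first maximum; a library implementation may break ties differently.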

Let's try it out on our iris classification problem:

In [1]:
from sklearn import neighbors, datasets
iris = datasets.load_iris()  # load the iris dataset
X, y = iris.data, iris.target  # feature matrix (150 x 4) and class labels

# create the model
knn = neighbors.KNeighborsClassifier(n_neighbors=5)

# fit the model
knn.fit(X, y)

# What kind of iris has a 3cm x 5cm sepal and a 4cm x 2cm petal?
# predict() expects a 2D array of shape (n_samples, n_features), so even a
# single sample must be wrapped in an outer list: [[3, 5, 4, 2]].
result = knn.predict([[3, 5, 4, 2]])

# Equivalent with NumPy (requires `import numpy as np`):
# result = knn.predict(np.array([3, 5, 4, 2]).reshape(1, -1))
# reshape(1, -1) gives a single row (one sample); reshape(-1, 1) gives a
# single column (one feature). -1 tells NumPy to infer that dimension, and
# at most one -1 is allowed per reshape.

In [2]:
print(iris.target_names[result])

['versicolor']


In [3]:
#Illustration of reshape
import numpy as np
print(np.array([3, 5, 4, 2]).reshape(1, -1))  # 1D array -> 2D array with 1 row
print(np.array([[3, 5, 4, 2], [1, 2, 3, 4]]).reshape(2, -1))  # already 2 rows x 4 columns, so unchanged

[[3 5 4 2]]
[[3 5 4 2]
 [1 2 3 4]]


In [4]:
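# The same two samples as a NumPy array and as a nested list give identical results
# (the array is already 2D, so reshape(2, -1) is a no-op)
results2 = knn.predict(np.array([[3, 5, 4, 2], [1, 2, 3, 4]]).reshape(2, -1))
results3 = knn.predict([[3, 5, 4, 2], [1, 2, 3, 4]])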
print(results2)
print(iris.target_names[results2])
print(results3)
print(iris.target_names[results3])

[1 1]
['versicolor' 'versicolor']
[1 1]
['versicolor' 'versicolor']


You can also get probabilistic predictions, i.e. the estimated probability of each class for a sample:

In [5]:
knn.predict_proba([[3, 5, 4, 2]])

array([[0. , 0.8, 0.2]])
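
To see where these probabilities come from, the sketch below looks up the five nearest training samples with `kneighbors` and counts their classes by hand. It assumes the classifier's default uniform weighting, under which each class probability is simply the fraction of neighbors carrying that label; the names `query`, `neighbor_labels`, and `manual_proba` are introduced here for illustration only.

```python
import numpy as np
from sklearn import datasets, neighbors

iris = datasets.load_iris()
X, y = iris.data, iris.target
knn = neighbors.KNeighborsClassifier(n_neighbors=5).fit(X, y)

query = [[3, 5, 4, 2]]

# Distances to, and indices of, the 5 nearest training samples
distances, indices = knn.kneighbors(query)
neighbor_labels = y[indices[0]]

# Fraction of neighbors falling in each class (uniform weights assumed)
counts = np.bincount(neighbor_labels, minlength=len(iris.target_names))
manual_proba = counts / len(neighbor_labels)

print(neighbor_labels)              # class label of each neighbor
print(manual_proba)                 # hand-computed class fractions
print(knn.predict_proba(query)[0])  # should agree with the line above
```

With a different `weights` setting (for example distance weighting), the neighbors would no longer count equally and this simple fraction would not match.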

In [10]:
from fig_code import plot_iris_knn
plot_iris_knn()