# K-Nearest Neighbour

The KNN algorithm assumes that similar data points lie close to each other in feature space, so a new point can be labelled by looking at the points nearest to it.

<b>How does KNN work?</b>

First, we choose the number of neighbours, k. In this example we use k = 5.

Next, we calculate the Euclidean distance between the new data point and each existing data point. The Euclidean distance is the straight-line distance between two points, familiar from geometry. It is calculated as:

![title](knn.png)
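As a minimal sketch of the distance calculation, the snippet below computes the Euclidean distance between two points with NumPy. The function name `euclidean_distance` and the sample points are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Square the per-coordinate differences, sum them, take the square root
    return float(np.sqrt(np.sum((p - q) ** 2)))

# Illustrative points forming a 3-4-5 right triangle
d = euclidean_distance([0, 0], [3, 4])
print(d)  # 5.0
```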

Consider the wine example below. Two chemical components, Rutine and Myricetin, are measured in red and white wines. Each wine is plotted on the graph according to how much Rutine and how much Myricetin it contains.

![title](knn1.png)

The ‘k’ in KNN is a parameter that sets the number of nearest neighbours included in the majority vote.

Suppose we add a new glass of wine to the dataset and want to know whether it is red or white.

![title](knn2.png)

To decide, we find the new point's nearest neighbours. With k = 5, the new data point is classified by majority vote among its five nearest neighbours. Here it is classified as red, since four of the five neighbours are red.

![title](knn3.png)
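The voting procedure described above can be sketched from scratch as follows. This is only an illustration: the function `knn_predict` and the (Rutine, Myricetin) values are hypothetical and were chosen so that four of the five nearest neighbours are red, mirroring the example in the text.

```python
from collections import Counter
import numpy as np

def knn_predict(X_train, y_train, x_new, k=5):
    """Classify x_new by majority vote among its k nearest training points."""
    X_train = np.asarray(X_train, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    # Euclidean distance from the new point to every training point
    distances = np.sqrt(np.sum((X_train - x_new) ** 2, axis=1))
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # Count the labels of those neighbours and return the most common one
    votes = Counter(y_train[int(i)] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical (Rutine, Myricetin) measurements, not taken from a real dataset
X = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.5], [2.5, 2.5],
     [3.0, 3.0], [6.0, 6.0], [7.0, 7.5], [8.0, 7.0]]
y = ["red", "red", "red", "red", "white", "white", "white", "white"]

new_wine = [2.0, 2.0]
print(knn_predict(X, y, new_wine, k=5))  # red
```

With k = 5 the five closest points are the four red wines and one white wine, so the vote is 4 to 1 in favour of red.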

In [1]:
#importing libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, accuracy_score


In [2]:
#importing dataset
df = pd.read_csv('Social_Network_Ads.csv')


In [3]:
#Separating the features (all columns except the last) from the target (last column)
X = df.iloc[:, :-1].values
y = df.iloc[:, -1].values

In [4]:
#Splitting the dataset into training set and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

In [5]:
#Feature scaling (fit on the training set only, then apply the same transform to the test set)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
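Scaling matters for KNN because distances are otherwise dominated by whichever feature has the largest numeric range. The sketch below reproduces standardisation by hand with NumPy (subtract the column mean, divide by the population standard deviation, which is what `StandardScaler` does by default). The age and salary values are hypothetical and chosen only to make the effect visible.

```python
import numpy as np

# Hypothetical (age, salary) rows, not taken from the dataset
X_demo = np.array([[25, 30000],
                   [35, 60000],
                   [45, 90000],
                   [55, 120000]], dtype=float)

# Standardise each column: zero mean, unit (population) standard deviation
mean = X_demo.mean(axis=0)
std = X_demo.std(axis=0)
X_scaled = (X_demo - mean) / std

# Per-feature gap between the first two rows, before and after scaling
raw_gap = np.abs(X_demo[0] - X_demo[1])
scaled_gap = np.abs(X_scaled[0] - X_scaled[1])
print(raw_gap)     # the salary gap is thousands of times the age gap
print(scaled_gap)  # both features now contribute equally
```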

In [6]:
#Training the KNN model on the training set
classifier = KNeighborsClassifier(n_neighbors = 5, metric = 'minkowski', p = 2)
classifier.fit(X_train, y_train)

KNeighborsClassifier()
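The classifier above uses `metric = 'minkowski'` with `p = 2`, which is the Euclidean distance described earlier. As a small check, the hand-written helper below (`minkowski_distance` is an illustrative name, not a scikit-learn function) shows that order 2 gives the Euclidean distance and order 1 gives the Manhattan distance.

```python
import numpy as np

def minkowski_distance(u, v, p=2):
    """Minkowski distance of order p between two points."""
    diff = np.abs(np.asarray(u, dtype=float) - np.asarray(v, dtype=float))
    return float(np.sum(diff ** p) ** (1.0 / p))

a, b = [0.0, 0.0], [3.0, 4.0]
euclidean = minkowski_distance(a, b, p=2)  # sqrt(3^2 + 4^2)
manhattan = minkowski_distance(a, b, p=1)  # |3| + |4|
print(euclidean, manhattan)
```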

In [7]:
# Predicting the test set results and computing the accuracy
y_pred = classifier.predict(X_test)
accuracy_score(y_test, y_pred)


0.95

In [8]:
# Making the confusion matrix (rows: true classes, columns: predicted classes)
cm = confusion_matrix(y_test, y_pred)
print(cm)

[[55  3]
 [ 1 21]]
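The accuracy reported earlier can be recovered from this confusion matrix: the diagonal holds the correctly classified samples, and the off-diagonal entries are the errors. The sketch below re-enters the printed matrix by hand rather than recomputing it from the data.

```python
import numpy as np

# Confusion matrix copied from the output above
cm_printed = np.array([[55, 3],
                       [1, 21]])

correct = np.trace(cm_printed)   # 55 + 21 correctly classified samples
total = cm_printed.sum()         # 80 test samples in total
accuracy = correct / total
print(accuracy)  # 0.95
```

In scikit-learn's convention rows are true classes and columns are predicted classes, so the 3 samples in the first row were wrongly predicted as the second class and the 1 sample in the second row was wrongly predicted as the first.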
