<center><font size=6 color="#00416d">K-Nearest Neighbors (KNN)</font></center>

### Importing Dependencies

In [None]:
import pandas as pd
import numpy as np

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
# StandardScaler standardizes the feature values; the other imports cover splitting, the KNN model, and evaluation
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix

### Reading Data

In [None]:
df = pd.read_csv("Classified Data", index_col=[0])
df.head()

### Performing Transformation on DataFrame

A KNN classifier predicts the class of a test observation from the training observations nearest to it, so the scale of the variables matters a lot. A variable measured on a much larger scale than the others dominates the distance between observations. To avoid that, we standardize the features before fitting the model.
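
As a rough illustration of the point above, the sketch below uses a small made-up array (not the notebook's data) in which the first column is in the thousands and the second lies between 0 and 1. The helper `nearest_to_first` is a hypothetical name introduced only for this example. The columns are standardized by hand with the population standard deviation, which is the same centering and scaling that `StandardScaler` applies.

```python
import numpy as np

# Hypothetical toy data: column 0 is in the thousands, column 1 lies between 0 and 1
points = np.array([
    [1000.0, 0.1],
    [1010.0, 0.9],
    [1200.0, 0.1],
    [3000.0, 0.5],
])

def nearest_to_first(data):
    """Return the row index (1, 2, ...) closest to row 0 in Euclidean distance."""
    distances = np.linalg.norm(data[1:] - data[0], axis=1)
    return int(np.argmin(distances)) + 1

# Standardize each column to zero mean and unit variance
scaled = (points - points.mean(axis=0)) / points.std(axis=0)

raw_neighbor = nearest_to_first(points)     # decided almost entirely by column 0
scaled_neighbor = nearest_to_first(scaled)  # both columns now contribute
print(raw_neighbor, scaled_neighbor)
```

On the raw values, row 1 is the nearest neighbor of row 0 because only the large column matters. After standardization the nearest neighbor becomes row 2, which matches row 0 exactly in the small column.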

In [None]:
# Create the scaler and fit it on the feature columns (everything except the target)
scalar = StandardScaler()
scalar.fit(df.drop("TARGET CLASS", axis=1))

In [None]:
# Perform standardization by centering and scaling.
scaled_features = scalar.transform(df.drop("TARGET CLASS", axis=1))
scaled_features

In [None]:
# Build a DataFrame from the scaled features (assumes "TARGET CLASS" is the last column of df)
df_scaled_feat = pd.DataFrame(scaled_features, columns=df.columns[:-1])

In [None]:
df_scaled_feat

### Building KNN Model

In [None]:
X = df_scaled_feat
y = df["TARGET CLASS"]
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)

In [None]:
knn = KNeighborsClassifier(n_neighbors=1)

In [None]:
knn.fit(x_train, y_train)

In [None]:
predicted = knn.predict(x_test)

### Evaluating the Model

In [None]:
print(confusion_matrix(y_test, predicted))

In [None]:
print(classification_report(y_test, predicted))

### Elbow Method

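A key step in K-Nearest Neighbors, a supervised machine learning algorithm, is choosing the value of K, that is, the number of neighbors that vote on each prediction (KNN is a classifier here, so K is not a number of clusters). The elbow method helps with that choice: compute the error rate on the test set for a range of K values, plot it, and pick a K beyond which the error stops improving noticeably.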

In [None]:
error_rate = []
for i in range(1, 40):
    knn = KNeighborsClassifier(n_neighbors=i)
    knn.fit(x_train, y_train)
    predicted = knn.predict(x_test)
    # Error rate = fraction of test points whose prediction differs from the true label; lower is better
    error_rate.append(np.mean(y_test != predicted))

In [None]:
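fig, ax = plt.subplots(figsize=(20, 7))
# Use a dedicated name so the target Series `y` defined earlier is not overwritten
k_values = list(range(1, 40))
ax.plot(k_values, error_rate, marker='o',
        color='green', linestyle='dashed', linewidth=2,
        markersize=10, markerfacecolor="red")
# Label each point with its K value
for k, err in zip(k_values, error_rate):
    ax.text(k, err, str(k), size=12)
ax.set_xlabel("K")
ax.set_ylabel("Error rate")
ax.set_title("Error rate vs. K value")
plt.show()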
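
Reading K off the plot can be cross-checked by looking up the lowest error directly. The sketch below is a minimal version of that idea: `best_k` is a hypothetical helper, and the error values are made up for illustration rather than taken from this dataset. Ties go to the smallest K, assuming the K values are listed in ascending order.

```python
import numpy as np

def best_k(error_rate, k_values):
    """Return the k with the lowest error; the first minimum wins on ties."""
    errors = np.asarray(error_rate, dtype=float)
    return k_values[int(np.argmin(errors))]

# Hypothetical error rates for k = 1..6, for illustration only
example_errors = [0.08, 0.09, 0.05, 0.06, 0.05, 0.07]
example_k = best_k(example_errors, list(range(1, 7)))
print(example_k)
```

In this notebook the equivalent call would be `best_k(error_rate, list(range(1, 40)))`. Because K is selected on the same test set that is later used for evaluation, the reported scores are likely to be somewhat optimistic.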

In [None]:
# Pick a K value from the graph above (34 here) and refit the model
knn = KNeighborsClassifier(n_neighbors=34)
knn.fit(x_train, y_train)
predicted = knn.predict(x_test)
print("Confusion Matrix:")
print(confusion_matrix(y_test, predicted))
print("\nClassification Report:")
print(classification_report(y_test, predicted))