<a href="https://colab.research.google.com/github/theclassofai/DataScience_Nuggets/blob/main/KNN_Classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### K-Nearest Neighbors
K-Nearest Neighbors (KNN) is a simple supervised machine learning algorithm used for classification and regression. For classification, a new data point is assigned the majority class among its k nearest neighbors in the feature space; for regression, the prediction is typically the average of the neighbors' values. The method relies on the assumption that nearby points tend to belong to the same class or have similar output values.
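
As a small illustration of the majority vote, here is a minimal sketch on hand-made data. The arrays `points` and `labels` and the helper `toy_knn_vote` are hypothetical names chosen for this example and are not used elsewhere in the notebook.

```python
import numpy as np
from collections import Counter

# Hypothetical toy data: two well-separated clusters with labels 0 and 1
points = np.array([[1, 1], [1, 2], [2, 1], [6, 6], [7, 6], [6, 7]], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])

def toy_knn_vote(query, k=3):
    """Return the majority label among the k points closest to query."""
    dists = np.sqrt(((points - query) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    return Counter(labels[nearest].tolist()).most_common(1)[0][0]

print(toy_knn_vote(np.array([2.0, 2.0])))  # 0: the three nearest points are in the lower-left cluster
print(toy_knn_vote(np.array([5.0, 5.0])))  # 1: the three nearest points are in the upper-right cluster
```

Each query takes the label of the cluster it sits closest to, which is the similarity assumption in action.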

<img title="a title" alt="Alt text" src="https://miro.medium.com/v2/resize:fit:1358/0*ItVKiyx2F3ZU8zV5" width="700">

In [2]:
import numpy as np
import matplotlib.pyplot as plt
from collections import Counter
from sklearn.metrics import classification_report

In [6]:
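# Generate a synthetic dataset: 100 points in [0, 10) x [0, 10) with binary labels.
# Note: the labels are random and independent of the features.
np.random.seed(0)
x = np.random.rand(100, 2) * 10
y = np.random.choice([0, 1], size=100)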


In [8]:
# Split into training (first 80 rows) and test (last 20 rows) sets
X_train, X_test = x[:80], x[80:]
y_train, y_test = y[:80], y[80:]

In [9]:
# Define the Euclidean distance between two points
def edu_dist(x1, x2):
  """Return the Euclidean (L2) distance between vectors x1 and x2."""
  return np.sqrt(np.sum((x1 - x2)**2))
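
The same distance can be computed against every training row at once instead of one pair at a time. The sketch below assumes a hypothetical helper `distances_to_all` and small demo arrays (`X_demo`, `x_demo`) that are not part of the notebook; it relies on NumPy broadcasting and `np.linalg.norm`.

```python
import numpy as np

def distances_to_all(X, x):
    """Euclidean distance from point x to every row of X, in one call."""
    # X has shape (n, d) and x has shape (d,); broadcasting subtracts x from each row.
    return np.linalg.norm(X - x, axis=1)

# Hypothetical demo data, used only to compare against the pairwise formula
rng = np.random.default_rng(0)
X_demo = rng.random((5, 2)) * 10
x_demo = np.array([5.0, 5.0])

loop_version = np.array([np.sqrt(np.sum((x_demo - row) ** 2)) for row in X_demo])
print(np.allclose(distances_to_all(X_demo, x_demo), loop_version))  # True
```

For larger datasets this form avoids a Python-level loop over training points, at the cost of being slightly less explicit.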

In [11]:
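# KNN algorithm: predict the label of a single query point

def KNN_pred(X_train, y_train, x_query, k):
  # Distance from the query point to every training point
  distance = [edu_dist(x_query, x_train) for x_train in X_train]
  # Indices of the k smallest distances
  k_indices = np.argsort(distance)[:k]
  k_nearest_labels = [y_train[i] for i in k_indices]
  # Majority vote among the k nearest labels
  most_common = Counter(k_nearest_labels).most_common(1)
  return most_common[0][0]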
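
One way to sanity-check a from-scratch predictor like this is to compare it with scikit-learn's `KNeighborsClassifier`. The sketch below is self-contained: it regenerates the same synthetic data and uses a compact restatement of the predictor under the hypothetical name `knn_from_scratch`. The two can in principle differ when distances or votes tie, so the agreement is measured and printed instead of being assumed.

```python
import numpy as np
from collections import Counter
from sklearn.neighbors import KNeighborsClassifier

# Same synthetic data and split as in the notebook
np.random.seed(0)
x = np.random.rand(100, 2) * 10
y = np.random.choice([0, 1], size=100)
X_train, X_test = x[:80], x[80:]
y_train, y_test = y[:80], y[80:]

def knn_from_scratch(X_tr, y_tr, query, k):
    """Compact restatement of the from-scratch predictor for one query point."""
    dists = np.sqrt(((X_tr - query) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    return Counter(y_tr[nearest].tolist()).most_common(1)[0][0]

scratch_pred = np.array([knn_from_scratch(X_train, y_train, q, 3) for q in X_test])

# Library implementation: Euclidean distance and uniform votes are the defaults
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)
sk_pred = clf.predict(X_test)

agreement = np.mean(scratch_pred == sk_pred)
print(f"Agreement with scikit-learn: {agreement:.2f}")
```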


In [13]:
# Predict a label for each test point
y_pred = [KNN_pred(X_train, y_train, x_test, k=3) for x_test in X_test]
y_pred

[0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1]

In [14]:
# Classification report
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       0.36      0.40      0.38        10
           1       0.33      0.30      0.32        10

    accuracy                           0.35        20
   macro avg       0.35      0.35      0.35        20
weighted avg       0.35      0.35      0.35        20
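
The labels in this notebook come from `np.random.choice` and do not depend on the features, so scores near chance level are expected here and do not by themselves indicate a bug. As a rough check, the sketch below reuses the same points but swaps in an assumed labeling rule (class 1 when the two coordinates sum to more than 10), which is not part of the original notebook. The names `y_structured` and `knn_predict_one` are hypothetical.

```python
import numpy as np
from collections import Counter

np.random.seed(0)
x = np.random.rand(100, 2) * 10
# Assumed labeling rule (not from the notebook): class 1 when the coordinates sum to more than 10
y_structured = (x[:, 0] + x[:, 1] > 10).astype(int)

X_train, X_test = x[:80], x[80:]
y_train, y_test = y_structured[:80], y_structured[80:]

def knn_predict_one(X_tr, y_tr, query, k=3):
    """Majority label among the k training points nearest to query."""
    dists = np.sqrt(((X_tr - query) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    return Counter(y_tr[nearest].tolist()).most_common(1)[0][0]

y_pred_structured = np.array([knn_predict_one(X_train, y_train, q) for q in X_test])
accuracy = np.mean(y_pred_structured == y_test)
print(f"Accuracy with feature-dependent labels: {accuracy:.2f}")
```

When nearby points share a label, as they do under this rule, the same algorithm should score well above chance; errors, if any, would be expected near the boundary between the two classes.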

