# K-Nearest Neighbors (KNN)

### KNN Algorithm Overview

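KNN is a lazy learning algorithm used for classification and regression: it builds no explicit model during training and simply stores the training data, deferring all computation to prediction time. To make a prediction, it finds the K training points nearest to a query point in the feature space and combines their labels (for classification) or values (for regression).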

### Steps in KNN:

1) Calculate Distance: Compute the distance between the query point and every point in the training set.

2) Find Nearest Neighbors: Select the K points with the smallest distances.

3) Make Prediction:

   i) For classification: Use majority voting among the neighbors.

   ii) For regression: Use the average value of the neighbors.

In [1]:
import numpy as np 
from collections import Counter

In [2]:
# Step 1: Define a distance metric (Euclidean distance)
def euclidean_distance(x1, x2):
    # Square each coordinate difference, sum the squares, then take the root
    return np.sqrt(np.sum((x1 - x2) ** 2))
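
As a quick check of the metric, the sketch below computes the distance across a 3-4-5 right triangle and then the distances from one query point to several points at once. The names `euclidean` and `distances_to_all` and the sample points are assumptions made for illustration; calling `np.linalg.norm` with `axis=1` is one possible way to vectorize the computation, not the approach this notebook's class uses.

```python
import numpy as np

def euclidean(x1, x2):
    """Euclidean distance: root of the summed squared coordinate differences."""
    return np.sqrt(np.sum((x1 - x2) ** 2))

def distances_to_all(x, X):
    """Distance from query x to every row of X (hypothetical helper)."""
    return np.linalg.norm(X - x, axis=1)

a = np.array([1.0, 2.0])
b = np.array([4.0, 6.0])
single = euclidean(a, b)  # differences are (-3, -4), so sqrt(9 + 16) = 5.0

X = np.array([[1.0, 2.0], [4.0, 6.0], [1.0, 5.0]])
batch = distances_to_all(a, X)  # distances 0, 5 and 3
print(single, batch)
```

Note that the squaring has to happen per coordinate, before the sum. Summing first lets positive and negative differences cancel.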

In [6]:
# Step 2: Implement the KNN algorithm 

class KNN:
    def __init__(self, k=3):
        self.k = k  # number of neighbours

    def fit(self, X_train, y_train):
        # store the training data
        self.X_train = X_train 
        self.y_train = y_train 

    def predict(self, X_test):
        predictions = [self._predict(x) for x in X_test]
        return np.array(predictions)

    def _predict(self, x):
        # Step 2a: compute distance between x and all examples in the training set
        distances = [euclidean_distance(x, x_train) for x_train in self.X_train]

        # Step 2b: Sort by distance and get the indices of the k nearest neighbours
        k_indices = np.argsort(distances)[:self.k]

        # Step 2c: Extract the labels of the k nearest neighbours 
        k_nearest_labels = [self.y_train[i] for i in k_indices]

        # Step 2d: Majority voting for classification 
        most_common = Counter(k_nearest_labels).most_common(1)
        return most_common[0][0] 
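
The class above covers only the classification branch. For the regression case described in the steps, a minimal sketch could replace the majority vote with the mean of the neighbours' target values. The class name `KNNRegressor` and the toy data are assumptions for illustration, not part of the original notebook.

```python
import numpy as np

class KNNRegressor:
    """Hypothetical regression counterpart of the KNN classifier."""

    def __init__(self, k=3):
        self.k = k  # number of neighbours

    def fit(self, X_train, y_train):
        # Lazy learner: just store the training data
        self.X_train = np.asarray(X_train, dtype=float)
        self.y_train = np.asarray(y_train, dtype=float)

    def predict(self, X_test):
        X_test = np.asarray(X_test, dtype=float)
        return np.array([self._predict(x) for x in X_test])

    def _predict(self, x):
        # Euclidean distance from x to every training row
        distances = np.sqrt(np.sum((self.X_train - x) ** 2, axis=1))
        k_indices = np.argsort(distances)[:self.k]
        # Average the target values of the k nearest neighbours
        return self.y_train[k_indices].mean()

# Toy 1-D data where the target equals the feature
X_reg = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_reg = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])

reg = KNNRegressor(k=3)
reg.fit(X_reg, y_reg)
reg_pred = reg.predict(np.array([[2.2]]))
print(reg_pred)
```

For the query 2.2, the three nearest training points are 2, 3 and 1, so the prediction is their mean, 2.0.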


In [7]:
# Step 3: Test the KNN implementation
if __name__ == "__main__":
    # Example dataset (features and labels)
    X_train = np.array([[1, 2], [2, 3], [3, 4], [6, 7], [7, 8], [8, 9]])
    y_train = np.array([0, 0, 0, 1, 1, 1])  # Binary classification

    # Create a KNN classifier
    knn = KNN(k=3)
    knn.fit(X_train, y_train)

    # New data point to classify
    X_test = np.array([[5, 5]])
    prediction = knn.predict(X_test)

    print(f"Predicted class for {X_test}: {prediction}")

Predicted class for [[5 5]]: [0]
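
To see where this prediction comes from, the sketch below lists the distance from the query point to each training point. It reuses the dataset from the test cell. The variable names and the `kind="stable"` argument, which makes `np.argsort` keep equal distances in their original order, are choices made for this illustration.

```python
import numpy as np
from collections import Counter

X_train = np.array([[1, 2], [2, 3], [3, 4], [6, 7], [7, 8], [8, 9]])
y_train = np.array([0, 0, 0, 1, 1, 1])
query = np.array([5, 5])

# Euclidean distance from the query to every training point
dists = np.sqrt(np.sum((X_train - query) ** 2, axis=1))

# Stable sort: equal distances keep their original row order
order = np.argsort(dists, kind="stable")
for i in order:
    print(X_train[i], y_train[i], round(float(dists[i]), 3))

# Labels of the three nearest neighbours and the resulting vote
top3_labels = [int(y_train[i]) for i in order[:3]]
vote = Counter(top3_labels).most_common(1)[0][0]
print(top3_labels, vote)
```

The two closest points, [3, 4] and [6, 7], belong to different classes, and the third neighbour is a tie between [2, 3] (class 0) and [7, 8] (class 1), both at distance √13 ≈ 3.606. The vote therefore hinges on how equal distances are ordered. With a stable sort the earlier row wins and the result is class 0. An odd K avoids tied votes in binary classification, but it does not rule out tied distances like this one.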
