![title](KNN.png)

Although k-nearest neighbors (KNN) can be used for both regression and classification, it is most widely used for classification. The algorithm is a lazy learner: instead of fitting a model to the training set, it memorizes the data and defers all computation until a prediction is requested. **K** is the number of neighbors consulted. To predict for a new point, the distance from that point to every training point is computed, usually with the Euclidean distance, and the k closest training points are selected. The Manhattan and Minkowski distances can also be used.
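
The three distance functions named above are closely related: the Minkowski distance of order `p` reduces to the Manhattan distance for `p = 1` and to the Euclidean distance for `p = 2`. A minimal sketch, assuming two points given as 1-D arrays (the helper name `minkowski_distance` and the sample points are illustrative, not part of the notebook):

```python
import numpy as np

def minkowski_distance(a, b, p=2):
    """Minkowski distance of order p between two 1-D points (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)

a = [0.0, 0.0]
b = [3.0, 4.0]
euclidean = minkowski_distance(a, b, p=2)  # straight-line distance
manhattan = minkowski_distance(a, b, p=1)  # sum of absolute coordinate differences
print(euclidean, manhattan)  # 5.0 7.0
```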

> 1. In k-NN classification, the output is a class membership. An object is classified by a plurality vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor.

> 2. In k-NN regression, the output is the property value for the object. This value is the average of the values of k nearest neighbors.
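
The two output rules quoted above differ only in the final step: classification takes a plurality vote over the neighbors' labels, while regression averages their values. A small sketch on made-up one-dimensional data, assuming Euclidean distance (the function names and the toy arrays are illustrative):

```python
import numpy as np

def k_nearest_indices(X_train, query, k):
    """Indices of the k training points closest to `query` (Euclidean)."""
    distances = np.sqrt(np.sum((X_train - query) ** 2, axis=1))
    return np.argsort(distances)[:k]

def knn_classify(X_train, y_train, query, k):
    """Plurality vote; on a tie the smallest label wins (np.unique sorts labels)."""
    nearest = k_nearest_indices(X_train, query, k)
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

def knn_regress(X_train, y_train, query, k):
    """Unweighted average of the k nearest target values."""
    nearest = k_nearest_indices(X_train, query, k)
    return float(np.mean(y_train[nearest]))

# Toy data, invented for illustration only
X_toy = np.array([[0.0], [1.0], [2.0], [10.0], [11.0]])
class_labels = np.array([0, 0, 1, 1, 1])
target_values = np.array([1.0, 2.0, 3.0, 20.0, 22.0])
query = np.array([1.2])

# The three nearest points to 1.2 are 1.0, 2.0 and 0.0
predicted_class = knn_classify(X_toy, class_labels, query, k=3)   # labels 0, 1, 0
predicted_value = knn_regress(X_toy, target_values, query, k=3)   # values 2, 3, 1
print(predicted_class, predicted_value)  # 0 2.0
```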

![image](https://www.saedsayad.com/images/KNN_similarity.png)
<a href = "https://www.google.com/url?sa=i&url=https%3A%2F%2Fwww.saedsayad.com%2Fk_nearest_neighbors.htm&psig=AOvVaw0oM7BHmV-PgxsO4TpoAhlN&ust=1623933152509000&source=images&cd=vfe&ved=0CA0QjhxqFwoTCJCpjciUnPECFQAAAAAdAAAAABAD">Image Source</a>

In [2]:
import numpy as np

class KNearestNeighbor:
    """Brute-force k-nearest-neighbor classifier using Euclidean distance."""

    def __init__(self, k):
        self.k = k
        
    def train(self, X, y):
        # Lazy learning: nothing is fitted, the training data is only stored
        self.X_train = X
        self.y_train = y
        
    def predict(self, X_test):
        distances = self.compute_distance(X_test)
        return self.predict_labels(distances)
        
    def compute_distance(self, X_test):
        """Return a (num_test, num_train) matrix of Euclidean distances."""
        num_test = X_test.shape[0]
        num_train = self.X_train.shape[0]
        distances = np.zeros((num_test, num_train))

        for i in range(num_test):
            for j in range(num_train):
                # Euclidean distance between test point i and training point j
                diff = X_test[i, :] - self.X_train[j, :]
                distances[i, j] = np.sqrt(np.sum(diff ** 2))

        return distances
        
    def predict_labels(self, distances):
        """Assign each test point the most common label among its k nearest neighbors."""
        num_test = distances.shape[0]
        y_pred = np.zeros(num_test)

        for i in range(num_test):
            # Indices of the training points, ordered from nearest to farthest
            y_indices = np.argsort(distances[i, :])
            # np.bincount needs non-negative integer labels, hence the cast
            k_closest_classes = self.y_train[y_indices[:self.k]].astype(int)
            # Plurality vote; on a tie, np.argmax returns the smallest label
            y_pred[i] = np.argmax(np.bincount(k_closest_classes))
        return y_pred
        
if __name__ == "__main__":
    X = np.loadtxt("data.txt", delimiter=",")
    y = np.loadtxt("targets.txt", delimiter=",")
    knn = KNearestNeighbor(k=3)
    knn.train(X, y)
    # Predictions are made on the training data itself, so this is training
    # accuracy, not an estimate of performance on unseen data.
    y_pred = knn.predict(X)
    print(f"Accuracy: {np.mean(y_pred == y)}")

Accuracy: 0.9333333333333333
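
The accuracy above is measured on the same points the classifier memorized, so each point counts itself among its own neighbors. A rough sketch of evaluating on held-out data instead follows. Since `data.txt` and `targets.txt` are not reproduced here, it uses synthetic two-blob data, and it swaps the double loop for NumPy broadcasting; the function `knn_predict`, the 80/20 split, and the random seed are all assumptions made for illustration.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k):
    """Vectorized brute-force KNN classification (hypothetical helper)."""
    # Broadcasting gives all pairwise differences: (num_query, num_train, num_features)
    diffs = X_query[:, None, :] - X_train[None, :, :]
    distances = np.sqrt(np.sum(diffs ** 2, axis=2))
    # Column indices of the k smallest distances in each row
    nearest = np.argsort(distances, axis=1)[:, :k]
    neighbor_labels = y_train[nearest].astype(int)
    # Plurality vote per query point
    return np.array([np.argmax(np.bincount(row)) for row in neighbor_labels])

rng = np.random.default_rng(0)
# Synthetic stand-in data: two well-separated Gaussian blobs, 50 points each
X_demo = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
                    rng.normal(6.0, 1.0, size=(50, 2))])
y_demo = np.array([0] * 50 + [1] * 50)

# Shuffle, then hold out the last 20% of the points for testing
order = rng.permutation(len(X_demo))
split = int(0.8 * len(X_demo))
train_idx, test_idx = order[:split], order[split:]

y_pred_demo = knn_predict(X_demo[train_idx], y_demo[train_idx], X_demo[test_idx], k=3)
test_accuracy = np.mean(y_pred_demo == y_demo[test_idx])
print(f"Held-out accuracy: {test_accuracy}")
```

Broadcasting builds an intermediate array of size `num_query × num_train × num_features`, so for large data sets the loop version or a chunked computation may be the more memory-friendly choice.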


### References
1. <a href = "https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm#:~:text=In%20statistics%2C%20the%20k%2Dnearest,training%20examples%20in%20data%20set.">Wikipedia</a>
2. <a href = "https://www.analyticsvidhya.com/blog/2018/03/introduction-k-neighbours-algorithm-clustering/">Analytics Vidhya</a>
3. <a href = "https://medium.com/@ekrem.hatipoglu/machine-learning-classification-k-nn-k-en-yak%C4%B1n-kom%C5%9Fu-part-9-6f18cd6185d">Medium</a>
4. <a href = "https://www.youtube.com/watch?v=QzAaRuDskyc">Aladdin Persson</a>