## K-Nearest Neighbors (KNN) from Scratch using Python

Overview:
This notebook demonstrates how to implement the K-Nearest Neighbors (KNN) algorithm manually (without using built-in scikit-learn functions). It covers dataset handling, distance calculation, neighbor selection, and final classification.

#### Import Libraries and Dataset

In [1]:
import numpy as np
import pandas as pd

In [2]:
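# Load the dataset; the CSV file is expected in the working directory
data = pd.read_csv('Ice-Cream-Dataset for K-NN.csv')
# Convert to a NumPy array so rows and columns can be indexed by position
df = data.values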

#### Example

In [3]:
Temperature = 26
Humidity = 72

In [4]:
example = [Temperature,Humidity]

#### Calculate Distance
Compute the Euclidean distance between the example instance and every row in the dataset.
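
For reference, the distance between an example $x$ and a training row $t$ over $n$ features is the usual Euclidean formula. The worked line uses the example instance (26, 72) defined earlier and a hypothetical row (30, 75) chosen for easy arithmetic; that row is not taken from the dataset.

```latex
d(x, t) = \sqrt{\sum_{j=1}^{n} (x_j - t_j)^2}
\qquad
d\big((26, 72), (30, 75)\big) = \sqrt{(26-30)^2 + (72-75)^2} = \sqrt{16 + 9} = 5
```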

In [5]:
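def CalcualateDistance(trainingSet, ExampleInstance):
    # Euclidean distance from ExampleInstance to every row of trainingSet.
    # Only the first len(ExampleInstance) columns of each row are used as features.
    Distance = []
    for i in range(len(trainingSet)):
        DistPerTrain = []
        for j in range(len(ExampleInstance)):
            # Squared difference for feature j
            l = (ExampleInstance[j] - trainingSet[i][j]) ** 2
            DistPerTrain.append(l)
        # Square root of the summed squared differences
        k = np.sqrt(np.sum(DistPerTrain))
        Distance.append(k)
    return Distance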
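
The same distances can be computed without explicit loops. The following is a minimal sketch, not part of the original notebook: the function name `euclidean_distances` and the `toy` rows are made up for illustration, and it assumes the feature columns come first in each row, as in the loop version.

```python
import numpy as np

def euclidean_distances(training_set, example):
    """Distance from `example` to each row, using only the leading feature columns."""
    # Slice the feature columns first, then convert, so a non-numeric label column is ignored
    X = np.asarray(training_set)[:, :len(example)].astype(float)
    x = np.asarray(example, dtype=float)
    # Broadcasting subtracts x from every row; the norm over axis=1 gives one distance per row
    return np.linalg.norm(X - x, axis=1)

# Made-up rows: two feature columns followed by a label column
toy = [[0, 0, 1], [3, 4, 0], [6, 8, 1]]
dists = euclidean_distances(toy, [0, 0])
```

Here `dists` holds one distance per row of `toy`. The loop version is easier to follow step by step, while the broadcast version is usually faster on larger arrays.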

#### Smallest Distance
Identify the k smallest distances and their corresponding indices.

In [6]:
def NearestNeighbors(k,Distance):
    # Rank the row indices by distance (ascending) and keep the first k
    Distance = np.asarray(Distance)
    arr = np.argsort(Distance)[:k]
    # Distances of the k nearest rows, nearest first
    minimum = Distance[arr]
    return minimum,arr

In [7]:
distance = CalcualateDistance(df,example)
distance

[48.877397639399746,
 52.773099207835045,
 49.64876634922564,
 56.462376853972415,
 44.294469180700204,
 57.584720195551874,
 45.18849411078001,
 53.14132102234569,
 48.104053883222775,
 55.362442142665635]

In [8]:
minimumDistances,Indices = NearestNeighbors(3,distance)
minimumDistances

array([44.29446918, 45.18849411, 48.10405388])

In [9]:
Indices

array([4, 6, 8])
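
One way to select the k smallest distances together with their indices, using only the standard library, is `heapq.nsmallest`. This is a hedged sketch rather than the notebook's method: the helper name `k_smallest` and the `sample` distances are made up for illustration.

```python
import heapq

def k_smallest(distances, k):
    """Return (distances, indices) of the k smallest entries, nearest first."""
    # Pair each distance with its position; tuples compare by distance, then by index
    pairs = heapq.nsmallest(k, ((d, i) for i, d in enumerate(distances)))
    return [d for d, _ in pairs], [i for _, i in pairs]

sample = [4.0, 1.5, 3.2, 0.7, 2.9]   # made-up distances
nearest, idx = k_smallest(sample, 3)
```

Because the pairs are compared as tuples, two equal distances are ordered by their row index, which makes the selection deterministic.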

#### Predict Class Label

Predict the class label by taking the majority vote among the nearest neighbors.

In [10]:
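def MajorityVoteOfNeighbors(indices,df,OutputColIndex):
    ballotBox = []
    for i in range(len(indices)):
        # Look up the label of the i-th nearest neighbor by its row index
        ballotBox.append(df[int(indices[i])][OutputColIndex])
    # Count each label and return the most frequent one
    values, counts = np.unique(ballotBox, return_counts=True)
    Winner = values[np.argmax(counts)]
    return Winner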

In [11]:
Prediction = MajorityVoteOfNeighbors(Indices,df,3)
Prediction

1
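
The voting step can also be sketched with `collections.Counter` from the standard library. The function name `majority_vote` and the `votes` list are illustrative assumptions, not part of the notebook.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label among the neighbors."""
    counts = Counter(labels)
    # most_common(1) returns a list with one (label, count) pair
    return counts.most_common(1)[0][0]

votes = [1, 0, 1]   # made-up neighbor labels
winner = majority_vote(votes)
```

Tie handling differs between the two approaches: `np.unique` returns labels in sorted order, so `np.argmax` favors the smallest label on a tie, whereas `Counter.most_common` favors the label encountered first. With two classes, choosing an odd k avoids ties altogether.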

#### K-Nearest Neighbors function
Combine the previous steps into a single reusable function, knn(), that runs the full KNN process: distance calculation, neighbor selection, and majority vote.

In [12]:
def knn(data,k,OutputColIndex,ExampleInstance):
    # Calculate Distance between example instance and training instances
    distance = CalcualateDistance(data, ExampleInstance)
    # Find nearest neighbors to example instance
    minimumDistances,Indices=NearestNeighbors(k,distance)
    # Class of majority of nearest neighbor instances, read from the data passed in
    output = MajorityVoteOfNeighbors(Indices,data,OutputColIndex)

    return output

In [13]:
newExample=[23,72]
output=knn(df,3,3,newExample)

In [14]:
output

1
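
A quick way to sanity-check a KNN pipeline is to run it on a tiny dataset where the right answer is obvious. The sketch below is a compact, standard-library-only version of the three steps. Everything in it is an assumption for illustration: the name `knn_predict`, the made-up `toy` rows (two features followed by a label), and the use of `math.dist`, which requires Python 3.8 or later.

```python
import math
from collections import Counter

def knn_predict(rows, k, label_col, example):
    """Predict the label of `example` from the k nearest rows."""
    n = len(example)
    # Distance from the example to each row, paired with the row index, nearest first
    ranked = sorted((math.dist(row[:n], example), i) for i, row in enumerate(rows))
    # Labels of the k nearest rows
    labels = [rows[i][label_col] for _, i in ranked[:k]]
    return Counter(labels).most_common(1)[0][0]

# Made-up data: two well-separated clusters, label in column 2
toy = [
    [1.0, 1.0, 0],
    [1.5, 2.0, 0],
    [2.0, 1.0, 0],
    [8.0, 8.0, 1],
    [9.0, 8.5, 1],
    [8.5, 9.5, 1],
]
pred_low = knn_predict(toy, 3, 2, [1.2, 1.4])
pred_high = knn_predict(toy, 3, 2, [8.4, 8.8])
```

A point near the first cluster should be labeled 0 and a point near the second should be labeled 1; if an implementation disagrees on data this simple, the neighbor lookup or the vote is the first place to check.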