# KNN from Scratch: Module Demo & Testing


## 1. Introduction
K-Nearest Neighbors (KNN) is a simple yet effective supervised learning algorithm that can be used for both classification and regression. In the regression setting, KNN predicts the value of a data point by averaging the values of its k nearest neighbors in the feature space.
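
To make the averaging rule concrete, here is a minimal sketch for a single query point. It assumes Euclidean distance and an unweighted mean; the function name `knn_predict_one` and the toy data are made up for illustration and are not part of the custom module used later.

```python
import numpy as np

def knn_predict_one(X_train, y_train, x_query, k):
    """Predict one target as the mean of the k nearest training targets."""
    # Euclidean distance from the query to every training point
    distances = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Unweighted average of the neighbours' target values
    return y_train[nearest].mean()

# Toy 1-D data (hypothetical, for illustration only)
X_toy = np.array([[1.0], [2.0], [3.0], [10.0]])
y_toy = np.array([10.0, 20.0, 30.0, 100.0])

# The three points nearest to 2.1 are 2.0, 3.0 and 1.0,
# so the prediction is the mean of 20, 30 and 10.
print(knn_predict_one(X_toy, y_toy, np.array([2.1]), k=3))  # 20.0
```

The outlier at 10.0 has no influence here because it is not among the three nearest neighbors.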

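KNN is intuitive and non-parametric: it assumes no fixed functional form for the relationship between features and target, only that points close together in feature space have similar target values. It is best suited to low-dimensional data, since distances become less informative as the number of features grows, and to problems where the relationship between features and the target is complex or non-linear.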

In this notebook, we implement a custom KNN regressor from scratch and apply it to the Diabetes dataset to evaluate its performance in predicting disease progression based on clinical features.

## 2. Import Libraries

In [18]:
import numpy as np
from sklearn import datasets
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

In [2]:
# Mount Google Drive, which holds the custom KNN module
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [4]:
import sys
sys.path.append('/content/drive/MyDrive/scratch/')
from models.knn import KNearestNeighbor

## 3. Load Dataset

In [13]:
# Load the diabetes dataset: 10 clinical features, target is disease progression
diabetes = datasets.load_diabetes()
X = diabetes.data
y = diabetes.target

## 4. Train-Test Split


In [14]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape)

(353, 10) (89, 10)


## 5. Train the Model

In [32]:
model = KNearestNeighbor(k=5)
model.fit(X_train, y_train)
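
The source of `models.knn` is not shown in this notebook. As a rough idea of what a class with this `fit`/`predict` interface might look like, here is a hypothetical stand-in. The class name `SketchKNNRegressor`, the Euclidean metric, and the unweighted mean are all assumptions, and the real `KNearestNeighbor` may differ.

```python
import numpy as np

class SketchKNNRegressor:
    """Hypothetical stand-in; NOT the actual models.knn.KNearestNeighbor."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        # KNN is a lazy learner: "training" only stores the data
        self.X_train = np.asarray(X, dtype=float)
        self.y_train = np.asarray(y, dtype=float)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Pairwise squared Euclidean distances, shape (n_query, n_train)
        diffs = X[:, None, :] - self.X_train[None, :, :]
        dists = (diffs ** 2).sum(axis=2)
        # For each query row, indices of the k closest training points
        nearest = np.argsort(dists, axis=1)[:, :self.k]
        # Average the neighbours' targets to get one prediction per row
        return self.y_train[nearest].mean(axis=1)

# Toy data (made up for illustration)
X_demo = np.array([[0.0], [1.0], [2.0], [10.0]])
y_demo = np.array([0.0, 1.0, 2.0, 10.0])

demo = SketchKNNRegressor(k=2).fit(X_demo, y_demo)
# 0.4 is nearest to 0 and 1; 9.0 is nearest to 10 and 2
print(demo.predict(np.array([[0.4], [9.0]])))  # averages: 0.5 and 6.0
```

Squared distances are enough for ranking neighbors, so the sketch skips the square root. The broadcasted difference array has shape `(n_query, n_train, n_features)`, which is fine for a dataset of this size but would need chunking for much larger ones.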

## 6. Predict

In [33]:
y_pred = model.predict(X_test)
y_pred

array([125.6, 160.2, 153. , 238. , 153.4, 150.4, 246.2, 170. , 106.6,
       104.6,  93.2, 151.6, 104. , 166.6,  61.4, 105.4, 263.8, 252. ,
       173.6, 215.4, 184.6,  86.6, 106.6, 174. , 158.2, 168. , 196.6,
       145.6,  67.6, 117.2, 158. , 166.6,  89. , 158.4, 166.4, 243.6,
        73.8, 136.8, 148.6, 106.6,  90. , 100.6, 134.2, 147.6, 193.4,
        81.6,  83.6,  90.8,  84.4, 119.2, 117.8,  88.2, 150.6, 112. ,
       210.2, 130.8,  79.4, 175.8, 120.4,  73. , 155. , 117.6,  58. ,
        94.6, 169.4, 187.4, 214.4, 146.4, 145.2, 118.8, 119.2, 195. ,
       204.4,  98.6, 107.8, 209.4, 151. , 163.6, 210.4, 218.2, 150.8,
       139. ,  67.4,  82.4, 105.4, 113.2, 110.4,  84.2, 159.2])

## 7. Evaluate

In [29]:
# Compare predictions with the held-out test targets
print("MAE:", mean_absolute_error(y_test, y_pred))
print("MSE:", mean_squared_error(y_test, y_pred))
print("R² Score:", r2_score(y_test, y_pred))

MAE: 42.648475120385235
MSE: 2986.97500573263
R² Score: 0.436223206221125
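
To show what these three numbers measure, the sketch below recomputes them with plain NumPy on a small made-up example. The helper name `regression_metrics` is hypothetical; the formulas are the standard definitions of MAE, MSE, and R² (one minus the ratio of the residual sum of squares to the total sum of squares).

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (MAE, MSE, R^2) computed from their definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred

    mae = np.abs(errors).mean()   # mean absolute error
    mse = (errors ** 2).mean()    # mean squared error

    # R^2 = 1 - SS_res / SS_tot, where SS_tot is measured
    # around the mean of the true targets
    ss_res = (errors ** 2).sum()
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()
    r2 = 1.0 - ss_res / ss_tot
    return mae, mse, r2

# Toy values (made up): three perfect predictions and one off by 2
mae, mse, r2 = regression_metrics([1, 2, 3, 4], [1, 2, 3, 6])
print(mae, mse, round(r2, 3))  # 0.5 1.0 0.2
```

Read this way, the R² of about 0.44 reported above means the model accounts for roughly 44% of the variance in the test targets. Taking the square root of the MSE gives an RMSE of about 54.7, which is in the same units as the target.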
