## K Nearest Neighbour Classifier
K-nearest neighbors (KNN) is a simple supervised machine learning algorithm that can be used for both classification and regression. To make a prediction for a new instance, it finds the k training instances that are closest to it. For classification, it assigns the majority label among those neighbors; for regression, it averages their target values.

In this post, I will walk through the steps of implementing KNN in Python with scikit-learn, first for classification and then for regression.
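
Before turning to scikit-learn, here is a minimal from-scratch sketch of the majority-vote idea described above. It assumes Euclidean distance and breaks no ties in any special way; the function name `knn_predict` and the toy arrays are made up for illustration and are not part of any library.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k=3):
    """Predict the label of x_new by majority vote among its k nearest training points."""
    # Euclidean distance from x_new to every training point
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Most common label among those neighbours
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two small clusters labelled 0 and 1
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                  [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_toy, y_toy, np.array([0.5, 0.5]), k=3))  # point near the first cluster
print(knn_predict(X_toy, y_toy, np.array([5.5, 5.5]), k=3))  # point near the second cluster
```

There is no training step beyond storing the data: all the work happens at prediction time, which is why KNN is often called a lazy learner.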



## First, we import the necessary libraries:

In [1]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline



## Next, we create a classification dataset:



In [2]:
from sklearn.datasets import make_classification
X, y = make_classification(
    n_samples=1000,    # 1000 observations
    n_features=3,      # 3 features in total
    n_redundant=1,     # 1 redundant feature: a linear combination of the informative ones, so it adds no new information
    n_classes=2,       # binary target/label
    random_state=999   # fixed seed so the dataset is reproducible
)


**The X variable contains the features of the dataset, and the y variable contains the labels.**


In [3]:
X

array([[-0.33504974,  0.02852654,  1.16193084],
       [-1.37746253, -0.4058213 ,  0.44359618],
       [-1.04520026, -0.72334759, -3.10470423],
       ...,
       [-0.75602574, -0.51816111, -2.20382324],
       [ 0.56066316, -0.07335845, -2.15660348],
       [-1.87521902, -1.11380394, -4.04620773]])
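
A quick way to confirm what the two arrays hold is to look at their shapes and at how many instances fall into each class. The sketch below regenerates the same dataset so that it runs on its own; `np.bincount` is just one convenient way to count the labels.

```python
import numpy as np
from sklearn.datasets import make_classification

# Same settings as the dataset created above
X, y = make_classification(
    n_samples=1000,
    n_features=3,
    n_redundant=1,
    n_classes=2,
    random_state=999,
)

print(X.shape)          # (number of observations, number of features)
print(y.shape)          # one label per observation
print(np.bincount(y))   # number of instances in class 0 and class 1
```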

## We can now split the data into a training set and a test set:

 

In [4]:
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(
    X,y, test_size=0.33,random_state=42)

The training set will be used to train the KNN model, and the test set will be used to evaluate the model's performance.
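
With `test_size=0.33`, 330 of the 1000 observations end up in the test set and 670 in the training set. The sketch below checks this and also shows the optional `stratify=y` argument, which is an addition of mine and not used in the rest of this post: it keeps the class proportions nearly identical in both sets, so the resulting split differs from the one used above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=1000, n_features=3, n_redundant=1, n_classes=2, random_state=999
)

# stratify=y is an optional extra (not used elsewhere in this post)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)

# Share of each class in the training and test sets
train_share = np.bincount(y_train) / len(y_train)
test_share = np.bincount(y_test) / len(y_test)
print(train_share)
print(test_share)
```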

## Now, we can create a KNN classifier and train it:

In [5]:
from sklearn.neighbors import KNeighborsClassifier
classifier=KNeighborsClassifier(n_neighbors=5,algorithm='auto')
classifier.fit(X_train,y_train)

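The n_neighbors parameter specifies how many neighbors vote on each prediction. With algorithm='auto', scikit-learn picks the neighbor search method (brute force, KD-tree, or ball tree) based on the training data.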
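
To see which neighbors actually take part in a vote, the fitted classifier exposes a `kneighbors` method that returns the distances to, and the training-set indices of, the nearest neighbors. The sketch below rebuilds the same data and model so it runs on its own and inspects the first test instance.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(
    n_samples=1000, n_features=3, n_redundant=1, n_classes=2, random_state=999
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

classifier = KNeighborsClassifier(n_neighbors=5, algorithm='auto')
classifier.fit(X_train, y_train)

# Distances to and indices of the 5 nearest training points for one test instance
distances, indices = classifier.kneighbors(X_test[:1])
neighbor_labels = y_train[indices[0]]

print(distances)                        # sorted from nearest to farthest
print(indices)                          # positions in X_train
print(neighbor_labels)                  # labels that take part in the vote
print(classifier.predict(X_test[:1]))   # majority label among them
```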

## We can now make predictions with the KNN model:

In [6]:
y_pred=classifier.predict(X_test)
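
Besides hard labels, the classifier can report how the vote was split. With the default uniform weights, `predict_proba` returns the fraction of the 5 neighbors that belong to each class, so every value is a multiple of 0.2. The sketch below rebuilds the model so that it is self-contained.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(
    n_samples=1000, n_features=3, n_redundant=1, n_classes=2, random_state=999
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)
classifier = KNeighborsClassifier(n_neighbors=5, algorithm='auto')
classifier.fit(X_train, y_train)

# One row per test instance, one column per class (in the order of classifier.classes_)
proba = classifier.predict_proba(X_test[:5])
labels = classifier.predict(X_test[:5])

print(classifier.classes_)
print(proba)
print(labels)
```

A row such as `[0.8, 0.2]` would mean that four of the five neighbors belong to class 0, which makes it easy to spot predictions that were decided by a narrow 3-to-2 vote.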

## Finally, we can evaluate the model's performance on the test set:

In [7]:
from sklearn.metrics import confusion_matrix,accuracy_score,classification_report
# scikit-learn metrics take the true labels first, then the predictions
print(confusion_matrix(y_test,y_pred))
print(accuracy_score(y_test,y_pred))
print(classification_report(y_test,y_pred))

[[158  11]
 [ 20 141]]
0.906060606060606
              precision    recall  f1-score   support

           0       0.89      0.93      0.91       169
           1       0.93      0.88      0.90       161

    accuracy                           0.91       330
   macro avg       0.91      0.91      0.91       330
weighted avg       0.91      0.91      0.91       330



**The confusion matrix has one row per true class and one column per predicted class, so the diagonal counts the correctly classified instances (158 and 141) and the off-diagonal entries count the errors (11 and 20).**

**The accuracy score is the fraction of instances that were correctly classified: (158 + 141) / 330 ≈ 0.906.**

**Precision is the share of predictions for a class that are correct, recall is the share of true instances of a class that the model finds, and the f1-score is the harmonic mean of precision and recall.**

**In this case, the KNN model has an accuracy of 0.906. For class 0 it reaches a precision of 0.89, a recall of 0.93, and an f1-score of 0.91; for class 1, a precision of 0.93, a recall of 0.88, and an f1-score of 0.90. On this synthetic dataset, the model classifies about nine out of ten test instances correctly.**
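
To make the definitions concrete, the sketch below derives accuracy, precision, recall, and f1 from the four cells of a confusion matrix and compares them with scikit-learn's own functions. The ten labels are made up for illustration. Note that scikit-learn expects the true labels as the first argument; passing them in the other order transposes the matrix and exchanges precision and recall.

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

# Hypothetical labels for ten instances
y_true_toy = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred_toy = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]

# True labels first: rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true_toy, y_pred_toy)
tn, fp, fn, tp = cm.ravel()  # class 1 is treated as the positive class

accuracy = (tp + tn) / cm.sum()
precision = tp / (tp + fp)   # of everything predicted as 1, how much is really 1
recall = tp / (tp + fn)      # of everything that is really 1, how much was found
f1 = 2 * precision * recall / (precision + recall)

print(cm)
print(accuracy, accuracy_score(y_true_toy, y_pred_toy))
print(precision, precision_score(y_true_toy, y_pred_toy))
print(recall, recall_score(y_true_toy, y_pred_toy))
print(f1, f1_score(y_true_toy, y_pred_toy))

# Swapping the arguments transposes the matrix
print(confusion_matrix(y_pred_toy, y_true_toy))
```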

## Perform a GridSearchCV to find the best parameters

In [8]:
# Grid search over n_neighbors with 5-fold cross-validation
from sklearn.model_selection import GridSearchCV
kclassifier1=KNeighborsClassifier(algorithm='auto')
param_grid = {
    'n_neighbors': [1,2,3,4,5,6,7,8,9,10],
}

# Create an instance of GridSearchCV
grid_search = GridSearchCV(estimator=kclassifier1, param_grid=param_grid, cv=5)

# Fit the grid search to the data
grid_search.fit(X_train,y_train)



In [9]:
grid_search.best_params_

{'n_neighbors': 9}
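
`best_params_` only reports the winner. To see how close the other candidates came, the fitted search also stores the mean cross-validated accuracy of every value in `cv_results_`. The sketch below repeats the search on the same data so that it runs on its own; the exact scores you get may depend on your scikit-learn version.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(
    n_samples=1000, n_features=3, n_redundant=1, n_classes=2, random_state=999
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

grid_search = GridSearchCV(
    estimator=KNeighborsClassifier(algorithm='auto'),
    param_grid={'n_neighbors': list(range(1, 11))},
    cv=5,
)
grid_search.fit(X_train, y_train)

# Mean accuracy over the 5 folds for each candidate value of n_neighbors
params = grid_search.cv_results_['params']
mean_scores = grid_search.cv_results_['mean_test_score']
for candidate, score in zip(params, mean_scores):
    print(candidate['n_neighbors'], round(float(score), 3))

print(grid_search.best_params_, grid_search.best_score_)
```

If several values of n_neighbors score within a fraction of a percentage point of each other, the choice between them matters little in practice.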

## Make predictions and evaluate the performance of the model with the best parameters:

In [10]:
y_pred=grid_search.predict(X_test)
from sklearn.metrics import confusion_matrix,accuracy_score,classification_report
# scikit-learn metrics take the true labels first, then the predictions
print(confusion_matrix(y_test,y_pred))
print(accuracy_score(y_test,y_pred))
print(classification_report(y_test,y_pred))

[[156  13]
 [ 16 145]]
0.9121212121212121
              precision    recall  f1-score   support

           0       0.91      0.92      0.91       169
           1       0.92      0.90      0.91       161

    accuracy                           0.91       330
   macro avg       0.91      0.91      0.91       330
weighted avg       0.91      0.91      0.91       330



**The best value found by the grid search is n_neighbors=9. With it, the model has an accuracy of 0.912 and, for class 0, a precision of 0.91, a recall of 0.92, and an f1-score of 0.91. Compared with n_neighbors=5 (accuracy 0.906), that is a gain of two correctly classified test instances out of 330, so the two settings perform about equally well on this dataset.**

## K Nearest Neighbour Regression

In [11]:
# Synthetic regression data: 1000 samples, 2 features, Gaussian noise with standard deviation 10
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=1000, n_features=2, noise=10, random_state=42)

In [12]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42)

In [13]:
from sklearn.neighbors import KNeighborsRegressor
regressor=KNeighborsRegressor(n_neighbors=6,algorithm='auto')
regressor.fit(X_train,y_train)
y_pred=regressor.predict(X_test)
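
For regression, KNN replaces the majority vote with an average: with the default uniform weights, the prediction is the mean target value of the k nearest training points. The sketch below rebuilds the same data and model and reproduces the first three predictions by hand using `kneighbors`.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

X, y = make_regression(n_samples=1000, n_features=2, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

regressor = KNeighborsRegressor(n_neighbors=6, algorithm='auto')
regressor.fit(X_train, y_train)

# Indices of the 6 nearest training points for the first three test instances
_, indices = regressor.kneighbors(X_test[:3])

# Average the targets of those neighbours by hand
manual_pred = y_train[indices].mean(axis=1)
model_pred = regressor.predict(X_test[:3])

print(manual_pred)
print(model_pred)
```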

In [14]:
from sklearn.metrics import r2_score,mean_absolute_error,mean_squared_error
print(r2_score(y_test,y_pred))             # R² (coefficient of determination)
print(mean_absolute_error(y_test,y_pred))  # mean absolute error (MAE)
print(mean_squared_error(y_test,y_pred))   # mean squared error (MSE)

0.9189275159979495
9.009462452972217
127.45860414317289
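
The three numbers are, in order, R², the mean absolute error, and the mean squared error. The sketch below computes each of them by hand with NumPy on four made-up values, to show what the scikit-learn functions measure. It also computes the root mean squared error, which is on the same scale as the target; for the model above, the square root of the reported MSE is about 11.3, close to the noise level of 10 used when generating the data.

```python
import numpy as np

# Hypothetical true values and predictions
y_true_toy = np.array([3.0, -0.5, 2.0, 7.0])
y_pred_toy = np.array([2.5, 0.0, 2.0, 8.0])

residuals = y_true_toy - y_pred_toy

mae = np.mean(np.abs(residuals))   # average size of the errors
mse = np.mean(residuals ** 2)      # average squared error, penalises large errors more
rmse = np.sqrt(mse)                # back on the scale of the target

# R² compares the model's squared error with that of always predicting the mean
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true_toy - y_true_toy.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(r2)
print(mae)
print(mse)
print(rmse)
```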
