# K-Nearest Neighbors

K-Nearest Neighbors (KNN) is one of the simplest machine learning algorithms. It is non-parametric and lazy. Non-parametric means that it makes no assumption about the underlying data distribution, i.e. the model structure is determined by the dataset itself. Lazy, or instance-based, learning means that there is no explicit training phase that builds a model: the training points are simply stored, and the whole training set is used at prediction time.
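To make the "lazy" idea concrete, here is a minimal from-scratch sketch of KNN classification. It assumes Euclidean distance and a simple majority vote; the function name `knn_predict` and the toy data are made up for illustration and are not part of scikit-learn.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3):
    """Classify x_query by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every stored training point
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(dists)[:k]
    # Majority vote over the labels of those neighbors
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two small, well-separated clusters
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                  [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

pred_a = knn_predict(X_toy, y_toy, np.array([0.5, 0.5]), k=3)
pred_b = knn_predict(X_toy, y_toy, np.array([5.5, 5.5]), k=3)
print(pred_a, pred_b)
```

Note that nothing is learned ahead of time: all the work (distance computation, sorting, voting) happens when a prediction is requested.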

In [None]:
from sklearn.neighbors import NearestNeighbors
from sklearn.neighbors import KNeighborsRegressor

import numpy as np

# Unsupervised KNN

## Data

In [None]:
data = np.array([[-1, 1], [-2, 2], [-3, 3], [1, 2], [2, 3], [3, 4], [4, 5]])

## KNN on data

In [None]:
# Index the points with a ball tree; each query returns the 3 nearest points
nrst_neigh = NearestNeighbors(n_neighbors=3, algorithm='ball_tree')
nrst_neigh.fit(data)

## Distances and Indices of data points

In [None]:
# The query set is the training data itself, so each point's first neighbor is itself (distance 0)
distances, indices = nrst_neigh.kneighbors(data)
print(indices, '\n\n', distances)
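As a sanity check on what `kneighbors` returns, the sketch below recomputes the distances by hand from the returned indices, assuming the default Euclidean metric. It also calls `kneighbors()` with no argument, in which case scikit-learn does not count a point as its own neighbor. The variable names (`nn`, `manual`, `idx_no_self`) are illustrative only.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

data = np.array([[-1, 1], [-2, 2], [-3, 3], [1, 2], [2, 3], [3, 4], [4, 5]])
nn = NearestNeighbors(n_neighbors=3, algorithm='ball_tree').fit(data)
distances, indices = nn.kneighbors(data)

# data[indices] has shape (n_points, 3, 2): the coordinates of each point's neighbors.
# Subtracting from each point and taking the norm recovers the Euclidean distances.
manual = np.linalg.norm(data[:, None, :] - data[indices], axis=2)
print(np.allclose(manual, distances))

# With no query argument, each point is excluded from its own neighbor list
dist_no_self, idx_no_self = nn.kneighbors()
print(idx_no_self)
```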

## K-Neighbors graphs

In [None]:
# Adjacency matrix: entry (i, j) is 1 when point j is among the 3 nearest neighbors of point i
nrst_neigh.kneighbors_graph(data).toarray()
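The graph is just the neighbor indices written as an adjacency matrix. The sketch below rebuilds it from `indices` and compares the two; it assumes the default `mode='connectivity'`, and the names `graph` and `rebuilt` are illustrative.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

data = np.array([[-1, 1], [-2, 2], [-3, 3], [1, 2], [2, 3], [3, 4], [4, 5]])
nn = NearestNeighbors(n_neighbors=3, algorithm='ball_tree').fit(data)
_, indices = nn.kneighbors(data)

graph = nn.kneighbors_graph(data).toarray()

# Rebuild the adjacency matrix: row i gets a 1 in every column listed in indices[i]
n = len(data)
rebuilt = np.zeros((n, n))
rebuilt[np.arange(n)[:, None], indices] = 1

print(np.array_equal(graph, rebuilt))
# The relation is not symmetric: j can be a neighbor of i without i being a neighbor of j
print(np.array_equal(graph, graph.T))
```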

# Supervised KNN

### Data

In [None]:
# dataset
from sklearn.datasets import load_iris

# loading dataset
X, Y = load_iris(return_X_y = True)

### Creating a train-test split

In [None]:
# importing train test split
from sklearn.model_selection import train_test_split

# creating a train-test split
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.4, random_state=1)

### Scaling

In [None]:
from sklearn.preprocessing import StandardScaler

# Fit the scaler on the training split only, then apply the same transform to both splits
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
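For reference, the standardization can be reproduced by hand. The sketch below assumes the same split as above and checks that subtracting the training mean and dividing by the training standard deviation matches `StandardScaler`; the names `mu`, `sigma`, `manual_train` and `manual_test` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, Y = load_iris(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.4, random_state=1)

scaler = StandardScaler().fit(X_train)

# Per-feature statistics taken from the training split only
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)  # population std (ddof=0), as StandardScaler uses

# The test split is scaled with the training statistics, not its own
manual_train = (X_train - mu) / sigma
manual_test = (X_test - mu) / sigma

print(np.allclose(manual_train, scaler.transform(X_train)))
print(np.allclose(manual_test, scaler.transform(X_test)))
```

Scaling matters for KNN because distances are dominated by whichever feature has the largest numeric range.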

### KNN training

In [None]:
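# Iris targets are class labels, so use a classifier; its score() is mean accuracy
from sklearn.neighbors import KNeighborsClassifier
knnr = KNeighborsClassifier(n_neighbors=8)
knnr.fit(X_train, Y_train)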

### Accuracy


In [None]:
# accuracy 
print('Accuracy on Train : ', round(knnr.score(X_train, Y_train)*100, 2))
print('Accuracy on Test : ', round(knnr.score(X_test, Y_test)*100, 2))
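The number of neighbors is a choice, not something the model learns. One possible way to compare candidates is cross-validation on the training split, sketched below. This assumes a `KNeighborsClassifier` wrapped in a pipeline with `StandardScaler`, 5 folds, and an arbitrary list of `k` values; none of these choices come from the notebook above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, Y = load_iris(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.4, random_state=1)

k_values = [1, 3, 5, 8, 15]  # arbitrary candidates for illustration
cv_means = []
for k in k_values:
    # The pipeline refits the scaler inside each fold, so no fold sees the others' statistics
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    scores = cross_val_score(model, X_train, Y_train, cv=5)
    cv_means.append(scores.mean())
    print(f'k = {k:2d}  mean CV accuracy = {scores.mean():.3f}')

best_k = k_values[int(np.argmax(cv_means))]
print('Best k by cross-validation:', best_k)
```

The test split is left untouched during the comparison, so it can still give an unbiased estimate for the chosen `k`.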