<a href="https://colab.research.google.com/github/MohammadrezaPourreza/Scikit-learn-tutorial/blob/main/NearestNeighbor.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

sklearn.neighbors.NearestNeighbors is the class used to implement unsupervised nearest neighbor learning.

**PARAMETERS**

n_neighbors − int, optional

The number of neighbors that kneighbors queries return by default. The default value is 5.

radius − float, optional

It is the maximum distance within which radius_neighbors queries return neighbors. The default value is 1.0.
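To make the difference between n_neighbors and radius concrete, here is a minimal sketch that queries the same fitted model both ways. The one-dimensional points and the query values are hypothetical, chosen only so the distances are easy to check by hand.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical toy points on a line (not from the tutorial data).
points = np.array([[0.0], [0.5], [2.0], [2.4], [10.0]])

nn = NearestNeighbors(n_neighbors=2, radius=1.0).fit(points)

# kneighbors returns exactly n_neighbors results for every query.
k_dist, k_idx = nn.kneighbors([[0.1]])

# radius_neighbors returns every training point within `radius`,
# so the number of results can differ from query to query.
r_dist, r_idx = nn.radius_neighbors([[0.1], [8.0]])

print(k_idx)               # indices of the two points closest to 0.1
print(sorted(r_idx[0]))    # indices of points within 1.0 of 0.1
print(len(r_idx[1]))       # number of points within 1.0 of 8.0
```

The second query has no training point within the radius, so its result is an empty array rather than a fixed-size row.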

algorithm − {‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’}, optional

This parameter selects the algorithm (BallTree, KDTree or brute-force search) used to compute the nearest neighbors. With ‘auto’, the estimator attempts to choose the most appropriate algorithm based on the values passed to the fit method.
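The algorithm choice changes how the search is carried out, not what it finds. As a rough check, the sketch below fits the same data with each option and compares the returned distances. The random data, the seed, and the tolerance are assumptions made for illustration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)          # fixed seed, only for reproducibility
X = rng.normal(size=(200, 3))           # hypothetical training data
queries = rng.normal(size=(10, 3))      # hypothetical query points

results = {}
for algo in ("auto", "ball_tree", "kd_tree", "brute"):
    nn = NearestNeighbors(n_neighbors=4, algorithm=algo).fit(X)
    dist, _ = nn.kneighbors(queries)
    results[algo] = dist

# All four strategies should agree up to floating-point error.
same = all(np.allclose(results["brute"], d, atol=1e-6) for d in results.values())
print(same)
```

In practice the options differ in build time, query time, and memory use, which is why ‘auto’ is usually a reasonable starting point.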

leaf_size − int, optional

It is the leaf size passed to BallTree or KDTree. It affects the speed of construction and querying as well as the memory required to store the tree. The optimal value depends on the nature of the problem; the default value is 30.

metric − string or callable

It is the metric used to compute distances between points. We can pass it as a string or as a callable function. A callable is called on each pair of rows and the resulting value is recorded, which is less efficient than passing the metric name as a string.

We can choose a metric from scikit-learn or scipy.spatial.distance. The valid values are as follows −

Scikit-learn − [‘cosine’, ‘manhattan’, ‘euclidean’, ‘l1’, ‘l2’, ‘cityblock’]

scipy.spatial.distance −

[‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘dice’, ‘hamming’, ‘jaccard’, ‘correlation’, ‘kulsinski’, ‘mahalanobis’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘sokalmichener’, ‘sokalsneath’, ‘seuclidean’, ‘sqeuclidean’, ‘yule’].

The default metric is ‘minkowski’.
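As an illustration of the string versus callable options, the sketch below passes the Manhattan distance once by name and once as a Python function. The helper `manhattan` and the four sample points are hypothetical and exist only for this comparison.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def manhattan(a, b):
    """Hypothetical user-defined metric: sum of absolute differences."""
    return np.sum(np.abs(a - b))

# Hypothetical points, chosen so that no two neighbors are tied.
X = np.array([[0, 0], [1, 1], [4, 1], [6, 6]], dtype=float)

by_name = NearestNeighbors(n_neighbors=2, metric="manhattan").fit(X)
by_callable = NearestNeighbors(n_neighbors=2, metric=manhattan).fit(X)

d_name, i_name = by_name.kneighbors(X)
d_call, i_call = by_callable.kneighbors(X)

# Both routes should give the same neighbors and distances.
print(np.allclose(d_name, d_call), (i_name == i_call).all())
```

The callable version invokes Python code for every pair of rows it compares, which is where the efficiency cost comes from.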

p − integer, optional

It is the power parameter for the Minkowski metric. The default value is 2, which is equivalent to using the Euclidean distance (l2). A value of 1 is equivalent to using the Manhattan distance (l1).
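The Minkowski distance between two points is the p-th root of the sum of |x_i - y_i|^p over all coordinates. A small sketch of how p changes the result, using two hypothetical points whose distances are easy to verify by hand (3 + 4 for p = 1, and the 3-4-5 triangle for p = 2):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Two hypothetical points: the origin and (3, 4).
X = np.array([[0.0, 0.0], [3.0, 4.0]])
query = [[0.0, 0.0]]

distance_by_p = {}
for p in (1, 2):
    nn = NearestNeighbors(n_neighbors=2, metric="minkowski", p=p).fit(X)
    dist, _ = nn.kneighbors(query)
    # dist[0, 0] is the distance to the origin itself; dist[0, 1] is to (3, 4).
    distance_by_p[p] = dist[0, 1]

print(distance_by_p)
```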

In [1]:
from sklearn.neighbors import NearestNeighbors
import numpy as np

In [2]:
Input_data = np.array([[-1, 1], [-2, 2], [-3, 3], [1, 2], [2, 3], [3, 4],[4, 5]])

In [3]:
nrst_neigh = NearestNeighbors(n_neighbors = 3, algorithm = 'ball_tree')

In [4]:
nrst_neigh.fit(Input_data)

NearestNeighbors(algorithm='ball_tree', n_neighbors=3)

In [7]:
distances, indices = nrst_neigh.kneighbors(Input_data)
indices

array([[0, 1, 3],
       [1, 2, 0],
       [2, 1, 0],
       [3, 4, 0],
       [4, 5, 3],
       [5, 6, 4],
       [6, 5, 4]])

In [8]:
distances

array([[0.        , 1.41421356, 2.23606798],
       [0.        , 1.41421356, 1.41421356],
       [0.        , 1.41421356, 2.82842712],
       [0.        , 1.41421356, 2.23606798],
       [0.        , 1.41421356, 1.41421356],
       [0.        , 1.41421356, 1.41421356],
       [0.        , 1.41421356, 2.82842712]])
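The first column of both arrays is the query point itself at distance 0, because the training data was also passed as the query. The sketch below relies on the behavior of kneighbors when it is called without an argument: it then returns the neighbors of each training point and does not count the point as its own neighbor.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

Input_data = np.array([[-1, 1], [-2, 2], [-3, 3], [1, 2], [2, 3], [3, 4], [4, 5]])
nrst_neigh = NearestNeighbors(n_neighbors=3, algorithm="ball_tree").fit(Input_data)

# Training data passed explicitly: each point is its own nearest neighbor.
_, with_self = nrst_neigh.kneighbors(Input_data)

# No argument: neighbors of each training point, excluding the point itself.
_, without_self = nrst_neigh.kneighbors()

print(with_self[0])
print(without_self[0])
```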

In [9]:
nrst_neigh.kneighbors_graph(Input_data).toarray()

array([[1., 1., 0., 1., 0., 0., 0.],
       [1., 1., 1., 0., 0., 0., 0.],
       [1., 1., 1., 0., 0., 0., 0.],
       [1., 0., 0., 1., 1., 0., 0.],
       [0., 0., 0., 1., 1., 1., 0.],
       [0., 0., 0., 0., 1., 1., 1.],
       [0., 0., 0., 0., 1., 1., 1.]])
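The matrix above only records which points are connected. As a sketch of the alternative, kneighbors_graph also accepts mode="distance", in which case each entry holds the distance to the neighbor instead of a 1.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

Input_data = np.array([[-1, 1], [-2, 2], [-3, 3], [1, 2], [2, 3], [3, 4], [4, 5]])
nrst_neigh = NearestNeighbors(n_neighbors=3, algorithm="ball_tree").fit(Input_data)

# Sparse matrix whose entries are distances rather than 0/1 flags.
graph = nrst_neigh.kneighbors_graph(Input_data, mode="distance")
dense = graph.toarray()

# Row 0 holds the distances from point 0 to its neighbors (points 1 and 3).
print(np.round(dense[0], 3))
```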

In [10]:
from sklearn.datasets import load_iris
iris = load_iris()

In [11]:
X = iris.data[:, :4]
y = iris.target
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20)

In [12]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

In [13]:
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
knnr = KNeighborsRegressor(n_neighbors = 8)
knnr.fit(X_train, y_train)

KNeighborsRegressor(n_neighbors=8)

In [15]:
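# Evaluate on the held-out test split, which was scaled with the same scaler as the training data.
# The exact value varies from run to run because train_test_split shuffles randomly.
print("The MSE is:", np.power(y_test - knnr.predict(X_test), 2).mean())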
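Because the regressor is fit on standardized features, it expects standardized inputs at prediction time as well. One way to keep the scaling and the model together is a pipeline, sketched below and evaluated on the held-out split with mean_squared_error. The random_state value is an arbitrary assumption added only to make the split reproducible.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0  # arbitrary seed for reproducibility
)

# The pipeline fits the scaler on the training data only and
# applies the same transformation whenever predict is called.
model = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=8))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print("Test MSE:", mse)
```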


In [16]:
import numpy as np
from sklearn.neighbors import RadiusNeighborsRegressor
knnr_r = RadiusNeighborsRegressor(radius=1)
knnr_r.fit(X_train, y_train)

RadiusNeighborsRegressor(radius=1)

In [17]:
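# Evaluate on the held-out test split, which was scaled with the same scaler as the training data.
# The exact value varies from run to run because train_test_split shuffles randomly.
print("The MSE is:", np.power(y_test - knnr_r.predict(X_test), 2).mean())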
