# Nearest Neighbors

The principle behind nearest neighbor methods is to find a predefined number of training samples closest in distance to the new point, and predict the label from these. The number of samples can be a user-defined constant (k-nearest neighbor learning), or vary based on the local density of points (radius-based neighbor learning). The distance can, in general, be any metric measure: standard Euclidean distance is the most common choice. Neighbors-based methods are known as non-generalizing machine learning methods, since they simply “remember” all of their training data.

Despite its simplicity, nearest neighbors has been successful in a large number of classification and regression problems, including handwritten digits and satellite image scenes. Being a non-parametric method, it is often successful in classification situations where the decision boundary is very irregular.

Scikit-learn provides a number of nearest neighbors algorithms in the module `sklearn.neighbors`.

## Nearest Neighbor Algorithms

`NearestNeighbors` implements unsupervised nearest neighbors learning. It acts as a uniform interface to three different nearest neighbors algorithms: `BallTree`, `KDTree`, and a brute-force algorithm based on routines in `sklearn.metrics.pairwise`. The choice of neighbors search algorithm is controlled through the keyword `algorithm`, which must be one of `['auto', 'ball_tree', 'kd_tree', 'brute']`. When the default value `'auto'` is passed, the algorithm attempts to determine the best approach from the training data.
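
As a minimal sketch of the interface described above, the following fits `NearestNeighbors` on a small toy array (the data values are made up for illustration) and queries the two nearest neighbors of every training point.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Toy data (an assumption for illustration): six 2-D points in two clusters.
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])

# algorithm="auto" lets scikit-learn choose between the ball tree,
# the KD tree, and the brute-force search.
nn = NearestNeighbors(n_neighbors=2, algorithm="auto").fit(X)

# Querying the training set itself: the first neighbor of each point
# is the point itself, at distance 0.
distances, indices = nn.kneighbors(X)
print(indices)
print(distances)
```

Because the query set is the training set, the first column of `indices` is simply `0..5` and the first column of `distances` is all zeros; the second column holds the closest other point.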

### Brute Force

Fast computation of nearest neighbors is an active area of research in machine learning. The most naive neighbor search implementation involves the brute-force computation of distances between all pairs of points in the dataset: for $N$ samples in $D$ dimensions, this approach scales as $O[D N^2]$. Efficient brute-force neighbors searches can be very competitive for small data samples. However, as the number of samples $N$ grows, the brute-force approach quickly becomes infeasible. In the classes within `sklearn.neighbors`, brute-force neighbors searches are specified using the keyword `algorithm = 'brute'`.
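
To make the $O[D N^2]$ cost concrete, here is a rough NumPy-only sketch of a brute-force search. The helper name `brute_force_knn` is hypothetical and this is not how scikit-learn implements it internally; it only illustrates the all-pairs idea.

```python
import numpy as np

def brute_force_knn(X, k):
    """Return (distances, indices) of the k nearest neighbors of every row of X.

    Hypothetical helper for illustration: it builds the full N x N
    distance matrix, which costs O(D * N^2) time and O(N^2) memory.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise differences, shape (N, N, D), then squared Euclidean distances.
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Sort every row and keep the k smallest entries.
    idx = np.argsort(sq_dist, axis=1, kind="stable")[:, :k]
    dist = np.sqrt(np.take_along_axis(sq_dist, idx, axis=1))
    return dist, idx

# Toy data, made up for illustration.
X_toy = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
bf_dist, bf_idx = brute_force_knn(X_toy, k=2)
print(bf_idx)
```

In practice the same search is requested with `NearestNeighbors(algorithm='brute')`, which avoids materializing the three-dimensional difference array.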

### KDTree

To address the computational inefficiencies of the brute-force approach, a variety of tree-based data structures have been invented. In general, these structures attempt to reduce the required number of distance calculations by efficiently encoding aggregate distance information for the sample. The basic idea is that if point $A$ is very distant from point $B$, and point $B$ is very close to point $C$, then we know that points $A$ and $C$ are very distant, without having to explicitly calculate their distance. In this way, the computational cost of a nearest neighbors search can be reduced to $O[D N \log(N)]$ or better. This is a significant improvement over brute-force for large $N$.
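
The pruning argument above rests on the triangle inequality, which holds for any metric. Writing $d(\cdot, \cdot)$ for the distance (a notation assumed here, not used elsewhere in this text), a short derivation is:

$$
d(A, B) \le d(A, C) + d(C, B)
\quad\Longrightarrow\quad
d(A, C) \ge d(A, B) - d(B, C).
$$

So when $d(A, B)$ is large and $d(B, C)$ is small, $d(A, C)$ is bounded below by a large value, and $C$ can be ruled out as a near neighbor of $A$ without computing $d(A, C)$ at all.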

Though the KD tree approach is very fast for low-dimensional ($D < 20$) neighbors searches, it becomes inefficient as $D$ grows very large: this is one manifestation of the so-called “curse of dimensionality”. In scikit-learn, KD tree neighbors searches are specified using the keyword `algorithm = 'kd_tree'`, and are computed using the class `KDTree`.
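
The `KDTree` class can also be used directly. The sketch below builds a tree on random low-dimensional data (the sizes and the seed are arbitrary choices for illustration) and queries the three nearest neighbors of the first few points.

```python
import numpy as np
from sklearn.neighbors import KDTree

# Random low-dimensional data; sizes and seed are arbitrary.
rng = np.random.RandomState(0)
X = rng.random_sample((1000, 3))  # 1000 points in 3 dimensions

# Build the tree once, then reuse it for many queries.
tree = KDTree(X, leaf_size=30)

# Distances to and indices of the 3 nearest neighbors of the first 5 points.
dist, ind = tree.query(X[:5], k=3)
print(ind)
```

Passing `algorithm='kd_tree'` to `NearestNeighbors` or to one of the supervised estimators should give the same neighbors through the uniform interface.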

See the [paper](https://dl.acm.org/citation.cfm?doid=361002.361007) for more details.

### Ball Tree

To address the inefficiencies of KD Trees in higher dimensions, the ball tree data structure was developed. Where KD trees partition data along Cartesian axes, ball trees partition data in a series of nesting hyper-spheres. This makes tree construction more costly than that of the KD tree, but results in a data structure which can be very efficient on highly structured data, even in very high dimensions.

With this setup, a single distance calculation between a test point and the centroid is sufficient to determine a lower and upper bound on the distance to all points within the node. Because of the spherical geometry of the ball tree nodes, it can out-perform a KD-tree in high dimensions, though the actual performance is highly dependent on the structure of the training data. In scikit-learn, ball-tree-based neighbors searches are specified using the keyword `algorithm = 'ball_tree'`, and are computed using the class `BallTree`.
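
A comparable sketch for `BallTree`, this time on higher-dimensional random data (again, the sizes, seed, and radius are arbitrary illustrative choices). It shows both a fixed-`k` query and a radius query.

```python
import numpy as np
from sklearn.neighbors import BallTree

# Random data in 30 dimensions; sizes and seed are arbitrary.
rng = np.random.RandomState(0)
X = rng.random_sample((200, 30))

tree = BallTree(X, leaf_size=30)

# k-nearest-neighbor query for the first two points.
dist, ind = tree.query(X[:2], k=4)

# Radius query: count how many points lie within r of each query point.
counts = tree.query_radius(X[:2], r=1.5, count_only=True)
print(ind)
print(counts)
```

Each query point is itself in the tree, so every count is at least 1. Whether the ball tree actually beats the KD tree on a given dataset is best checked by timing both.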

See the [paper](https://arxiv.org/abs/1511.00628) for more details.

## Supervised Nearest Neighbors

### Nearest Neighbors Classification

scikit-learn implements two different nearest neighbors classifiers: `KNeighborsClassifier` implements learning based on the $k$ nearest neighbors of each query point, where $k$ is an integer value specified by the user. `RadiusNeighborsClassifier` implements learning based on the number of neighbors within a fixed radius $r$ of each query point, where $r$ is a floating-point value specified by the user.

The $k$-neighbors classification in `KNeighborsClassifier` is the most commonly used technique. The optimal choice of the value $k$ is highly data-dependent: in general a larger $k$ suppresses the effects of noise, but makes the classification boundaries less distinct.

In cases where the data is not uniformly sampled, radius-based neighbors classification in `RadiusNeighborsClassifier` can be a better choice. The user specifies a fixed radius $r$, such that points in sparser neighborhoods use fewer nearest neighbors for the classification. For high-dimensional parameter spaces, this method becomes less effective due to the so-called “curse of dimensionality”.

The basic nearest neighbors classification uses uniform weights: that is, the value assigned to a query point is computed from a simple majority vote of the nearest neighbors. Under some circumstances, it is better to weight the neighbors such that nearer neighbors contribute more to the fit. This can be accomplished through the `weights` keyword. The default value, `weights = 'uniform'`, assigns uniform weights to each neighbor. `weights = 'distance'` assigns weights proportional to the inverse of the distance from the query point. Alternatively, a user-defined function of the distance can be supplied to compute the weights.
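
The effect of the `weights` keyword can be seen on a tiny one-dimensional example. The data below is made up so that a uniform vote and a distance-weighted vote disagree; the callable `exp_decay` is a hypothetical user-defined weight function.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Made-up data: two class-0 points far from the query, one class-1 point close to it.
X = np.array([[0.0], [1.0], [3.0]])
y = np.array([0, 0, 1])
query = np.array([[2.6]])

def exp_decay(distances):
    # Hypothetical weight function: takes an array of distances and
    # returns an array of weights with the same shape.
    return np.exp(-distances)

uniform_pred = KNeighborsClassifier(n_neighbors=3, weights="uniform").fit(X, y).predict(query)
distance_pred = KNeighborsClassifier(n_neighbors=3, weights="distance").fit(X, y).predict(query)
custom_pred = KNeighborsClassifier(n_neighbors=3, weights=exp_decay).fit(X, y).predict(query)

print(uniform_pred, distance_pred, custom_pred)
```

With uniform weights the two distant class-0 points outvote the single nearby class-1 point. With inverse-distance weights the nearby point has weight $1/0.4 = 2.5$ against roughly $1/1.6 + 1/2.6 \approx 1.01$, so the prediction flips to class 1.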

![](https://scikit-learn.org/stable/_images/sphx_glr_plot_classification_001.png)

![](https://scikit-learn.org/stable/_images/sphx_glr_plot_classification_002.png)

### Nearest Neighbors Regression

Neighbors-based regression can be used in cases where the data labels are continuous rather than discrete variables. The label assigned to a query point is computed based on the mean of the labels of its nearest neighbors.

As with classification, scikit-learn implements two different nearest neighbors regressors: `KNeighborsRegressor` implements learning based on the $k$ nearest neighbors of each query point, where $k$ is an integer value specified by the user. `RadiusNeighborsRegressor` implements learning based on the neighbors within a fixed radius $r$ of each query point, where $r$ is a floating-point value specified by the user.

Also, as with the classification algorithms, the user can provide the `weights` keyword to control the weights assigned to each neighbor.
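
A small sketch of neighbors-based regression, using made-up one-dimensional data. It compares the plain mean of the neighbors' labels with the inverse-distance-weighted mean.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Made-up training data: four points on a line with a step in the target.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
query = np.array([[1.2]])

# The two nearest neighbors of 1.2 are x=1 (target 0) and x=2 (target 1).
uniform_pred = KNeighborsRegressor(n_neighbors=2, weights="uniform").fit(X, y).predict(query)
distance_pred = KNeighborsRegressor(n_neighbors=2, weights="distance").fit(X, y).predict(query)

print(uniform_pred)   # plain mean of the two targets
print(distance_pred)  # mean weighted by 1 / distance
```

The uniform prediction is the mean $(0 + 1)/2 = 0.5$. With `weights='distance'` the closer neighbor at $x = 1$ dominates: $(0 \cdot 5 + 1 \cdot 1.25) / 6.25 = 0.2$.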

![](https://scikit-learn.org/stable/_images/sphx_glr_plot_regression_001.png)

### KNeighborsClassifier

#### Parameters of `KNeighborsClassifier`

Here are some of the parameters of the `KNeighborsClassifier` class:

* **n_neighbors**: int, optional (default=5)
  
    The number of neighbors to use for classification
* **weights**: {'uniform', 'distance'} or callable, default='uniform'
  
    Weight function used in prediction. Possible values:
    
    * 'uniform' : uniform weights. All points in each neighborhood are weighted equally.
    * 'distance' : weight points by the inverse of their distance. In this case, closer neighbors of a query point will have a greater influence over the predicted outcome.
    * [callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights.
* **algorithm**: {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'
  
    Algorithm used to compute the nearest neighbors:
    
    * 'ball_tree' will use `BallTree`
    * 'kd_tree' will use `KDTree`
    * 'brute' will use a brute-force search.
    * 'auto' will attempt to decide the most appropriate algorithm based on the values passed to the `fit` method.
* **leaf_size**: int, optional (default=30)
  
    Leaf size passed to BallTree or KDTree. This can affect the speed of the construction and query, as well as the memory required to store the tree. The optimal value depends on the nature of the problem.
* **p**: int, optional (default=2)
  
    Power parameter for the Minkowski metric. When p = 1, this is equivalent to using manhattan_distance (l1), and euclidean_distance (l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.
* **metric**: string or callable, default='minkowski'

    The distance metric to use for the tree. The default metric is minkowski, and with p=2 is equivalent to the standard Euclidean metric. See the documentation of the `DistanceMetric` class for a list of available metrics.
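
To illustrate how these parameters interact, the sketch below sets several of them explicitly and shows that changing `p` can change which training point counts as nearest. The two training points are chosen by hand so that the Manhattan and Euclidean metrics disagree.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hand-picked points: seen from the origin, (2, 2) is closer in Euclidean
# distance (about 2.83 vs 3), while (3, 0) is closer in Manhattan
# distance (3 vs 4).
X = np.array([[2.0, 2.0], [3.0, 0.0]])
y = np.array([0, 1])
query = np.array([[0.0, 0.0]])

common = dict(n_neighbors=1, weights="uniform", algorithm="auto", leaf_size=30)

manhattan_pred = KNeighborsClassifier(p=1, **common).fit(X, y).predict(query)
euclidean_pred = KNeighborsClassifier(p=2, **common).fit(X, y).predict(query)

print(manhattan_pred, euclidean_pred)
```

With `p=1` the nearest neighbor is the class-1 point, with `p=2` it is the class-0 point, so the two predictions differ.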

#### Attributes of `KNeighborsClassifier`

Here are some of the attributes of the `KNeighborsClassifier` class:

* **classes_**: array of shape (n_classes,)
  
    The class labels known to the classifier.
* **effective_metric_**: string
  
    The effective distance metric used for the training data.

The class also has a method called `kneighbors`, which finds the $k$ nearest neighbors of each query point. It returns the distances to and the indices of those neighbors.
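
A brief sketch of inspecting these attributes and calling `kneighbors` on a fitted classifier. The string labels and data values are made up for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Made-up data with string class labels.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array(["a", "a", "b", "b"])

clf = KNeighborsClassifier(n_neighbors=2).fit(X, y)

print(clf.classes_)            # the distinct labels seen during fit
print(clf.effective_metric_)   # name of the metric actually used

# Distances to and indices of the 2 nearest training points of x = 1.1.
dist, ind = clf.kneighbors(np.array([[1.1]]))
print(dist, ind)
```

The indices refer to rows of the training array, so here the neighbors of $1.1$ are the training points at $x = 1$ and $x = 2$.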

### RadiusNeighborsClassifier

#### Parameters of `RadiusNeighborsClassifier`

The parameters of the `RadiusNeighborsClassifier` class are similar to those of the `KNeighborsClassifier` class, with the exception that `RadiusNeighborsClassifier` uses the keyword `radius` instead of `n_neighbors`.
The `radius` keyword accepts a floating-point value representing the radius of the neighborhood. The default value is 1.0.
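
The following sketch shows the `radius` keyword in use on made-up data with one dense region and one isolated point. It also counts how many neighbors fall inside the radius for each query, which differs between dense and sparse regions.

```python
import numpy as np
from sklearn.neighbors import RadiusNeighborsClassifier

# Made-up data: three points packed near 0 and one isolated point at 5.
X = np.array([[0.0], [0.2], [0.4], [5.0]])
y = np.array([0, 0, 1, 1])
queries = np.array([[0.1], [5.2]])

clf = RadiusNeighborsClassifier(radius=1.0).fit(X, y)

# For each query, the training points that lie within the radius.
dist, ind = clf.radius_neighbors(queries)
counts = [len(i) for i in ind]

pred = clf.predict(queries)
print(counts, pred)
```

The query in the dense region votes over three neighbors, while the query near the isolated point uses only one. A query with no training point inside the radius has nothing to vote on; the `outlier_label` parameter can be set to assign such queries a label.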

#### Attributes of `RadiusNeighborsClassifier`

Same as with the `KNeighborsClassifier`.

### KNeighborsRegressor

#### Parameters of `KNeighborsRegressor`

Same as with the `KNeighborsClassifier`.

#### Attributes of `KNeighborsRegressor`

Same as with the `KNeighborsClassifier`, except that it does not have the `classes_` attribute.

### RadiusNeighborsRegressor

#### Parameters of `RadiusNeighborsRegressor`

Same as with the `RadiusNeighborsClassifier`.

#### Attributes of `RadiusNeighborsRegressor`

Same as with the `RadiusNeighborsClassifier` but it does not have the `classes_` attribute.