KNN classifies a data point according to the majority of labels in its nearest neighbourhood, as measured by some underlying distance function d(x, x′).
For k = 1, the label of a test point x* is predicted to be the same as that of its closest training point x_k, i.e. y_k, where

    k = argmin_j d(x*, x_j).
See Chapter 14 in barber2012bayesian for a detailed introduction. See issue 2996 for known issues.
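To make the 1-NN rule above concrete, here is a minimal NumPy sketch, independent of the Shogun listings below; the helper name one_nn_predict and the toy data are made up for illustration:

    import numpy as np

    def one_nn_predict(X_train, y_train, x_star):
        # distances d(x*, x_j) to every training point (Euclidean here)
        dists = np.linalg.norm(X_train - x_star, axis=1)
        # k = argmin_j d(x*, x_j): index of the closest training point
        k = np.argmin(dists)
        # predict the label y_k of that closest point
        return y_train[k]

    # toy example: two 2D classes
    X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
    y_train = np.array([0, 0, 1, 1])
    print(one_nn_predict(X_train, y_train, np.array([0.95, 1.0])))  # -> 1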
Imagine we have files with training and test data. We create CDenseFeatures (here 64-bit floats aka RealFeatures) and CMulticlassLabels as follows.
knn.sg:create_features
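As a rough illustration of what this step can look like in the Python bindings, here is a hedged sketch; the shogun import path, the CSVFile loader, and the file names are assumptions rather than the actual listing:

    # a sketch assuming the Shogun Python bindings; file names are hypothetical
    from shogun import CSVFile, RealFeatures, MulticlassLabels

    features_train = RealFeatures(CSVFile("features_train.dat"))
    features_test = RealFeatures(CSVFile("features_test.dat"))
    labels_train = MulticlassLabels(CSVFile("labels_train.dat"))
    labels_test = MulticlassLabels(CSVFile("labels_test.dat"))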
In order to run CKNN, we need to choose a distance, for example CEuclideanDistance, or another sub-class of CDistance. The distance is initialized with the data we want to classify.
knn.sg:choose_distance
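Continuing the same hypothetical Python sketch, the Euclidean distance over the training points could be set up as:

    from shogun import EuclideanDistance

    # distance initialized with the training data (a sketch, not the listing itself)
    distance = EuclideanDistance(features_train, features_train)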
Once we have chosen a distance, we create an instance of the CKNN classifier, passing it the number of neighbours k, the distance, and the training labels.
knn.sg:create_instance
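In the hypothetical Python sketch this step might read as follows, where k = 3 is an arbitrary choice:

    from shogun import KNN

    k = 3  # number of nearest neighbours, chosen arbitrarily here
    knn = KNN(k, distance, labels_train)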
Then we train the KNN model, apply it to the test data, and print the predictions.
knn.sg:train_and_apply
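A corresponding, still hypothetical, Python sketch:

    # train on the data held by the distance, then classify the test points
    knn.train()
    labels_predict = knn.apply_multiclass(features_test)
    print(labels_predict.get_labels())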
Wikipedia: K-nearest_neighbors_algorithm
../../references.bib