# k-Nearest Neighbors Classifier (k-NN)

k-NN is one of the simplest classifiers in machine learning. Unlike most other common supervised techniques, it has no explicit **learning** phase; instead, the algorithm computes the distance between the instance to be classified and the feature vectors stored from the dataset, then assigns the class that is most common among the k closest ones. Because of its simplicity, it is often used as a baseline when benchmarking more complex classifiers, such as Artificial Neural Networks (**ANN**) and Support Vector Machines (**SVM**).
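
To make the distance-and-vote idea concrete, here is a minimal sketch in plain Python. It is not the implementation used in this notebook (which relies on `sklearn`); the function name `knn_predict` and the toy points are hypothetical, and Euclidean distance is assumed as the metric.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every stored training vector
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep the k closest neighbors (smallest distances first)
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbors' labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(train_X, train_y, query=(1.1, 1.0), k=3)
```

Note that nothing is fitted here: all the work happens at prediction time, which is why k-NN gets slower as the stored dataset grows.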

This notebook shows how the dimensionality reduction algorithms required for this work affect the
accuracy of the k-NN classifier on the classification task posed by the Covertype dataset. The sections
are organized as follows: first, we import the libraries needed to run k-NN from
`sklearn` and load the dataset, separating data from targets; then we load the resulting
datasets and define a function that runs k-NN over them, using cross-validation
to validate the results; finally, we compare the performance of the classifier on
each reduced or extracted set of attributes.

## Implementation

### Import libraries 
Let's first import the libraries, mainly `pandas`, `numpy` and the k-NN implementation from `sklearn`:

In [1]:
# Import libraries
import pandas as pd
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
import random

### Loading the datasets

Now, load the train and test datasets, separating the targets from the features:

In [4]:
# Train datasets
original_train = pd.read_csv('../datasets/covertype_train.csv')
original_norm_train = pd.read_csv('../datasets/covertype_norm_train.csv')
lda_train = pd.read_csv('../datasets/covertype_lda_train_raw.csv')
lda_norm_train = pd.read_csv('../datasets/covertype_lda_train.csv')

# Targets
target_original_train = original_train.iloc[:,-1]
target_original_norm_train = original_norm_train.iloc[:,-1]
target_lda_train = lda_train.iloc[:,-1]
target_lda_norm_train = lda_norm_train.iloc[:,-1]

# Dataset without classes
data_original_train = original_train.iloc[:,:-1]
data_original_norm_train = original_norm_train.iloc[:,:-1]
data_lda_train = lda_train.iloc[:,:-1]
data_lda_norm_train = lda_norm_train.iloc[:,:-1]
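
The split above assumes the class label is the last column of every CSV. The sketch below shows the same `iloc` pattern on a small illustrative frame; the column names and values are hypothetical and are not read from the project files.

```python
import pandas as pd

# Hypothetical toy frame; the class label sits in the last column
toy = pd.DataFrame({
    "elevation": [2596, 2590, 2804],
    "slope": [3, 2, 9],
    "cover_type": [5, 5, 2],
})

toy_target = toy.iloc[:, -1]   # last column -> Series of labels
toy_data = toy.iloc[:, :-1]    # every column but the last -> feature DataFrame

# Sanity checks of this kind can also be run on the real frames
assert toy_data.shape == (3, 2)
assert toy_target.name == "cover_type"
```

If a CSV was saved with its index as an extra column, or with the label in a different position, the positional split would silently pick the wrong column, so a quick check of the column names after loading is worthwhile.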