# KNN (K-Nearest-Neighbors)
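KNN is a data-driven (not model-driven) algorithm: it stores the training data and predicts from the nearest stored points instead of fitting an explicit model. It works as follows for classification and regression: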
- **Classification**: If *unweighted*, outputs the most common class among the k nearest neighbors. If *weighted*, sums the weights of the k nearest neighbors for each class and outputs the class with the highest total weight.
- **Regression**: Works the same way, but outputs the average of the k nearest neighbors' target values. If *weighted*, each neighbor's value is weighted by the inverse of its distance.
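
The two voting rules for classification can be sketched in plain Python. This is a minimal illustration, not scikit-learn's implementation: the helper `knn_vote` and the toy `(distance, label)` pairs are hypothetical, and it assumes inverse-distance weights for the weighted case.

```python
from collections import Counter, defaultdict

def knn_vote(neighbors, k, weighted=False):
    """Predict a class from (distance, label) pairs using the k nearest ones."""
    nearest = sorted(neighbors, key=lambda pair: pair[0])[:k]
    if not weighted:
        # Unweighted: every neighbor counts once, the majority class wins.
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    # Weighted (assumed inverse-distance): closer neighbors count for more.
    totals = defaultdict(float)
    for distance, label in nearest:
        totals[label] += 1.0 / (distance + 1e-12)  # small constant avoids division by zero
    return max(totals, key=totals.get)

# Hypothetical toy data: one very close "a", three moderately close "b".
neighbors = [(0.1, "a"), (1.0, "b"), (1.1, "b"), (1.2, "b"), (5.0, "a")]
unweighted = knn_vote(neighbors, k=4)
weighted = knn_vote(neighbors, k=4, weighted=True)
print(unweighted, weighted)
```

With `k=4`, the unweighted vote picks `"b"` (three of the four neighbors), while the weighted vote picks `"a"` because its single neighbor is much closer than any `"b"` neighbor.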

In [4]:
'''Loading the Dataset & Making imports'''
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

dataset = load_wine()
X = dataset['data']
y = dataset['target']
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state=42)

## Importance of parameter k
The parameter k sets how many nearest neighbors are consulted for each prediction. **Low k values** capture the local structure of the data (but also its noise), while **high k values** smooth out noise but may miss local structure.
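
One way to see this trade-off is to fit the classifier with several values of k and compare training accuracy with held-out accuracy. The sketch below assumes the same wine dataset and split as the cell above; the candidate k values are arbitrary choices for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_wine(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state=42)

scores = {}
for k in (1, 5, 15, 51):  # arbitrary values, from very local to very smooth
    model = KNeighborsClassifier(n_neighbors=k).fit(x_train, y_train)
    # Store (training accuracy, test accuracy) for each k
    scores[k] = (model.score(x_train, y_train), model.score(x_test, y_test))
    print(f"k={k:>2}  train={scores[k][0]:.3f}  test={scores[k][1]:.3f}")
```

A large gap between the training and test columns at small k would suggest the model is fitting noise; both columns dropping at large k would suggest over-smoothing.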

In [5]:
from sklearn.neighbors import KNeighborsClassifier

# Create & fit the model
knn = KNeighborsClassifier(n_neighbors=5)  # Default argument
knn.fit(x_train, y_train)

# Predict on the training data & get the training accuracy (not a held-out test score)
y_pred = knn.predict(x_train)
knn.score(x_train, y_train)

0.7741935483870968

## Parameter Tuning for KNN
The most important parameter is k.
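
A common way to choose k is a cross-validated grid search over candidate values. The following is a minimal sketch using scikit-learn's `GridSearchCV`; the search range of 1 to 30 and the 5-fold setting are assumptions made for illustration, not values taken from this notebook.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_wine(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state=42)

# Assumed search space: k from 1 to 30, scored with 5-fold cross-validation
param_grid = {"n_neighbors": list(range(1, 31))}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(x_train, y_train)  # the test set is never seen during the search

best_k = search.best_params_["n_neighbors"]
print("best k:", best_k)
print("mean cross-validated accuracy:", search.best_score_)
print("held-out test accuracy:", search.score(x_test, y_test))
```

Searching on the training split only keeps the test set available for an unbiased final check of the chosen k.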