# K-Nearest Neighbors (KNN) Algorithm

K-Nearest Neighbors (KNN) is a simple, non-parametric, instance-based learning algorithm. It stores the training examples and classifies a new example by majority vote among the `k` stored examples closest to it under a similarity measure (e.g., a distance function).

### Key Concepts:
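- **Instance-based learning**: No explicit model is fitted; the stored training examples are the model.
- **Lazy learning**: Generalization is deferred until a query is made, so training is cheap and prediction does the work.
- **Distance metric**: Euclidean distance is the most common choice. Because it is sensitive to feature scale, features are usually standardized first.
- **k-value**: Number of neighbors that vote. A small `k` follows the training data closely and is sensitive to noise; a large `k` gives smoother decision boundaries.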
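
The concepts above can be sketched in a few lines of NumPy. This is a minimal illustration only, not how scikit-learn implements KNN; the function name `knn_predict` and the toy data are made up for this example.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, query, k=3):
    """Classify one query point by majority vote among its k nearest neighbors."""
    X_train = np.asarray(X_train, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every stored training point.
    distances = np.sqrt(((X_train - query) ** 2).sum(axis=1))
    # Indices of the k smallest distances, nearest first.
    nearest = np.argsort(distances)[:k]
    # Majority vote over the neighbors' labels.
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated clusters.
X_toy = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                  [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_toy, y_toy, [1.1, 1.0], k=3))  # query near the first cluster
print(knn_predict(X_toy, y_toy, [4.9, 5.0], k=3))  # query near the second cluster
```

Note that nothing happens at "training" time: all the work (distance computation, sorting, voting) is done when a query arrives, which is what lazy learning means in practice.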

### KNN Algorithm Diagram
![KNN Diagram](https://upload.wikimedia.org/wikipedia/commons/e/e7/KnnClassification.svg)


In [None]:
# Importing Required Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix


## Load Dataset
We'll use the Iris dataset for demonstration purposes: 150 flowers, four numeric features, and three classes.

In [None]:
from sklearn.datasets import load_iris
iris = load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['target'] = iris.target
df.head()

## Exploratory Data Analysis (EDA)

In [None]:
sns.pairplot(df, hue='target')
plt.show()

## Data Preprocessing
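- Split the dataset into training (80%) and test (20%) sets
- Standardize the features, fitting the scaler on the training set only so that no test-set statistics leak into training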

In [None]:
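# Separate the features from the class label
X = df.drop('target', axis=1)
y = df['target']

# Hold out 20% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training data only, then apply the same transform to the test data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)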
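
To see why scaling matters for a distance-based method, here is a small sketch on made-up data (the arrays and the helper `nearest_index` are hypothetical, not part of the Iris workflow). One feature has values in the thousands, so it dominates the Euclidean distance until both features are standardized.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: feature 0 spans roughly 1-5, feature 1 spans roughly 1000-3000.
X_demo = np.array([[1.0, 1000.0],
                   [1.1, 3000.0],
                   [5.0, 1100.0],
                   [5.2, 2900.0]])
query = np.array([[1.0, 2800.0]])


def nearest_index(points, q):
    """Return the row index of the point closest to q (Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(points - q, axis=1)))


# Nearest neighbor on the raw values.
raw_nearest = nearest_index(X_demo, query)

# Nearest neighbor after standardizing both features with the same scaler.
demo_scaler = StandardScaler().fit(X_demo)
scaled_nearest = nearest_index(demo_scaler.transform(X_demo),
                               demo_scaler.transform(query))

print("Nearest row without scaling:", raw_nearest)
print("Nearest row with scaling:   ", scaled_nearest)
```

Without scaling the nearest neighbor is row 3, which is close on feature 1 but far away on feature 0. After standardization it is row 1, which is close on both features.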

## Model Training using KNN

In [None]:
# k=5 is also scikit-learn's default; it is set explicitly here for clarity
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

## Model Evaluation

In [None]:
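y_pred = knn.predict(X_test)
# Rows of the confusion matrix are true classes, columns are predicted classes
print(confusion_matrix(y_test, y_pred))
# target_names labels the report with species names instead of 0/1/2
print(classification_report(y_test, y_pred, target_names=iris.target_names))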

## Choosing the Optimal K
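We use the elbow method to find a good value of `k`: plot the error rate over a range of `k` values and look for the point where it stops improving. Note that the error here is measured on the test set, so the chosen `k` is tuned to that set and the reported test error is an optimistic estimate. Selecting `k` on a validation set or by cross-validation avoids this.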

In [None]:
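error_rate = []

# Fit one classifier per k and record its misclassification rate on the test set
for k in range(1, 21):
    # Use a separate name so the k=5 model stored in `knn` is not overwritten
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train, y_train)
    pred_k = model.predict(X_test)
    error_rate.append(np.mean(pred_k != y_test))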

plt.figure(figsize=(10, 6))
plt.plot(range(1, 21), error_rate, marker='o', linestyle='--', color='b')
plt.title('Error Rate vs. K Value')
plt.xlabel('K')
plt.ylabel('Error Rate')
plt.show()
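
As an alternative to reading `k` off the test-set curve, `k` can be chosen by cross-validation on the training set, leaving the test set for a single final check. The sketch below is one possible way to do this, not the only one; the names `cv_error`, `best_k`, and `final_model`, and the choice of 5 folds, are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

k_values = list(range(1, 21))
cv_error = []
for k in k_values:
    # The scaler sits inside the pipeline, so each fold is scaled using
    # statistics from that fold's training part only.
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    scores = cross_val_score(model, X_train, y_train, cv=5)  # accuracy per fold
    cv_error.append(1.0 - scores.mean())

# Pick the k with the lowest mean cross-validated error (smallest k on ties).
best_k = k_values[int(np.argmin(cv_error))]
print(f"Lowest cross-validated error at k={best_k}")

# Refit on the full training set and evaluate once on the held-out test set.
final_model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=best_k))
final_model.fit(X_train, y_train)
print(f"Test accuracy: {final_model.score(X_test, y_test):.3f}")
```

Because the test set plays no part in choosing `k`, the final accuracy is a fairer estimate of how the model performs on unseen data.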