# 🧠 K-Nearest Neighbors (KNN) — Iris Flower Classification

This notebook demonstrates a **practical implementation** of the **K-Nearest Neighbors (KNN)** algorithm using the famous **Iris dataset**.  
KNN classifies a sample by majority vote among the K training samples closest to it, which makes it a simple, effective baseline for **classification** and **pattern recognition** tasks.
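
To make the idea concrete before turning to scikit-learn, here is a minimal from-scratch sketch of the voting rule. The function name `knn_predict` and the toy points are made up for illustration; this is not how `KNeighborsClassifier` is implemented internally.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort (point, label) pairs by Euclidean distance to the query
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    # Keep the labels of the k closest points and take a majority vote
    k_labels = [label for _, label in by_distance[:k]]
    return Counter(k_labels).most_common(1)[0][0]

# Toy data (hypothetical): two well-separated clusters in 2-D
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

label_near_first = knn_predict(points, labels, (1.1, 1.0), k=3)
label_near_second = knn_predict(points, labels, (5.1, 5.0), k=3)
print(label_near_first, label_near_second)
```

There is no training step beyond storing the data: all the work happens at prediction time, when distances to every stored point are computed.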

---

## 🎯 Objectives
- Understand how the KNN algorithm works.
- Apply KNN on the Iris dataset to classify flower species.
- Visualize the data and evaluate model accuracy.

---


In [None]:
# Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')

## 🌸 Step 1: Load the Iris Dataset

In [None]:
# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target

# Create DataFrame for visualization
df = pd.DataFrame(X, columns=iris.feature_names)
df['species'] = iris.target_names[y]

df.head()

## 📊 Step 2: Explore and Visualize the Data

In [None]:
# Pairplot visualization
sns.pairplot(df, hue='species', diag_kind='hist')
plt.suptitle("Iris Data Visualization", y=1.02)
plt.show()

# Correlation Heatmap
plt.figure(figsize=(6, 4))
sns.heatmap(df.drop(columns='species').corr(), annot=True, cmap='coolwarm')
plt.title("Feature Correlation Heatmap")
plt.show()

## ⚙️ Step 3: Prepare the Data for Training

In [None]:
# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

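# Standardize features to zero mean and unit variance
# (the scaler is fit on the training set only, so no test-set information leaks in)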
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("Training Data Shape:", X_train_scaled.shape)
print("Testing Data Shape:", X_test_scaled.shape)
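
Scaling matters for KNN because the algorithm compares samples by distance, so a feature measured in large units can dominate the result. The sketch below uses made-up two-feature data (`X_demo` and the helper `nearest_to_first` are hypothetical names, not part of the notebook) to show the nearest neighbor of a point changing once the features are standardized.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: feature 0 spans a few units, feature 1 spans about a thousand
X_demo = np.array([[1.0, 1000.0],   # reference point
                   [5.0, 1010.0],   # far in feature 0, close in feature 1
                   [1.1, 1100.0],   # close in feature 0, moderately far in feature 1
                   [5.2, 2000.0]])  # far in both features

def nearest_to_first(X):
    """Return the row index (1-based among the rest) closest to row 0."""
    distances = np.linalg.norm(X[1:] - X[0], axis=1)
    return int(np.argmin(distances)) + 1

raw_nearest = nearest_to_first(X_demo)
scaled_nearest = nearest_to_first(StandardScaler().fit_transform(X_demo))
print("Nearest row before scaling:", raw_nearest)
print("Nearest row after scaling:", scaled_nearest)
```

Before scaling, the large-valued feature alone decides which row is closest; after scaling, both features contribute on a comparable footing. The Iris features share the same unit (cm), so the effect there is milder than in this toy case.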

## 🤖 Step 4: Train KNN Classifier

In [None]:
# Train the model
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train_scaled, y_train)

# Predictions
y_pred = knn.predict(X_test_scaled)

# Evaluate performance
accuracy = accuracy_score(y_test, y_pred)
print(f"✅ Model Accuracy: {accuracy*100:.2f}%")

## 📉 Step 5: Evaluate the Model

In [None]:
# Confusion Matrix and Classification Report
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, cmap='Greens', fmt='d',
            xticklabels=iris.target_names, yticklabels=iris.target_names)
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix — KNN Classifier")
plt.show()

print("\nClassification Report:")
print(classification_report(y_test, y_pred))
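
The precision and recall columns in the classification report can be read directly off the confusion matrix. The sketch below uses a made-up 3×3 matrix (not the notebook's actual result) and follows scikit-learn's convention that rows are actual classes and columns are predicted classes.

```python
import numpy as np

# Hypothetical confusion matrix: rows = actual class, columns = predicted class
cm_demo = np.array([[10, 0, 0],
                    [0, 8, 1],
                    [0, 2, 9]])

true_positives = np.diag(cm_demo)

# Precision: of everything predicted as a class, the fraction that was correct
precision = true_positives / cm_demo.sum(axis=0)

# Recall: of everything that truly belongs to a class, the fraction that was found
recall = true_positives / cm_demo.sum(axis=1)

# Accuracy: correct predictions over all predictions
accuracy_demo = true_positives.sum() / cm_demo.sum()

print("Precision per class:", np.round(precision, 3))
print("Recall per class:", np.round(recall, 3))
print("Accuracy:", round(float(accuracy_demo), 3))
```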

## 🔍 Step 6: Choosing Optimal K Value

In [None]:
# Test K values from 1 to 14
# Note: K is scored on the test set here, so the reported best accuracy is optimistic
k_values = range(1, 15)
accuracies = []

for k in k_values:
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train_scaled, y_train)
    y_pred_k = model.predict(X_test_scaled)
    accuracies.append(accuracy_score(y_test, y_pred_k))

plt.plot(k_values, accuracies, marker='o', color='blue')
plt.title("Accuracy vs K Value")
plt.xlabel("Number of Neighbors (K)")
plt.ylabel("Accuracy")
plt.grid(True)
plt.show()

best_k = k_values[np.argmax(accuracies)]
print(f"🏆 Best K value: {best_k} with Accuracy: {max(accuracies)*100:.2f}%")
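
Because the loop above picks K by its score on the test set, the winning accuracy is no longer an unbiased estimate. One common alternative, sketched here under the assumption that the same split as Step 3 is used, is to choose K by cross-validation on the training data only and touch the test set once at the end. The names `cv_scores`, `best_k_cv`, and `final_model` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Mean 5-fold cross-validated accuracy for each K, using training data only.
# The pipeline refits the scaler inside every fold, so no fold sees
# statistics computed from its own validation part.
cv_scores = {}
for k in range(1, 15):
    pipe = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    cv_scores[k] = cross_val_score(pipe, X_train, y_train, cv=5).mean()

best_k_cv = max(cv_scores, key=cv_scores.get)

# Refit on the full training set with the chosen K, then score once on the test set
final_model = make_pipeline(
    StandardScaler(), KNeighborsClassifier(n_neighbors=best_k_cv)
)
final_model.fit(X_train, y_train)
test_accuracy = final_model.score(X_test, y_test)

print(f"Best K by cross-validation: {best_k_cv}")
print(f"Test accuracy with that K: {test_accuracy * 100:.2f}%")
```

On a dataset as small as Iris, several K values often tie or differ by a single sample, so the chosen K should be read as one reasonable setting rather than a definitive optimum.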

## 🌼 Step 7: Predict on New Sample

In [None]:
# Example flower: [sepal length, sepal width, petal length, petal width] in cm
# The prediction uses the K=5 model from Step 4 and the scaler fitted in Step 3
new_sample = np.array([[5.1, 3.5, 1.4, 0.2]])
new_sample_scaled = scaler.transform(new_sample)
prediction = knn.predict(new_sample_scaled)

print(f"🌸 Predicted species: {iris.target_names[prediction][0]}")
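
To see why the model made a given prediction, it can help to look at the neighbors that voted. The sketch below is self-contained and, for brevity, fits on the full dataset rather than the Step 3 training split (an assumption made only for this illustration). It uses `kneighbors` to retrieve the closest training samples and `predict_proba` to show the vote shares.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# Assumption for this sketch: fit on all 150 samples to keep the example short
scaler = StandardScaler().fit(iris.data)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(scaler.transform(iris.data), iris.target)

sample = scaler.transform(np.array([[5.1, 3.5, 1.4, 0.2]]))

# Distances to, and row indices of, the 5 nearest training samples
distances, indices = knn.kneighbors(sample)
neighbor_species = iris.target_names[iris.target[indices[0]]]

# Fraction of neighbors voting for each class
probabilities = knn.predict_proba(sample)[0]

print("Neighbor species:", list(neighbor_species))
print("Neighbor distances:", np.round(distances[0], 3))
print("Class vote shares:", dict(zip(iris.target_names, probabilities)))
```

With the default uniform weights, each vote share is simply the count of neighbors from that class divided by K.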

---
### ✅ Summary
In this notebook, you learned:
- How to apply **K-Nearest Neighbors (KNN)** for classification.
- How to split and standardize data so that no single feature dominates the distance calculation.
- How to tune the hyperparameter K and compare accuracy across values.
- How to make predictions for new samples.

---
