## **Project Description:**
**Implementing and Evaluating a Custom K-Nearest Neighbors Classifier**

This project builds and evaluates a custom K-Nearest Neighbors (KNN) classifier from scratch in Python with NumPy; scikit-learn supplies only the dataset, the train/test split, and the evaluation metrics. Here's a breakdown of the approach:

**Implementation:**

- **Distance Function:** A `distance` function is defined to calculate the Euclidean distance between two data points.
- **K-Nearest Neighbors Class:** A `K_Near` class is created to represent the KNN classifier:
    - **Constructor (`__init__`):** Initializes the `k` value (number of neighbors) used for classification.
    - **`fit` method:** Stores the training data (`X_train`) and target labels (`y_train`) for future reference.
    - **`predict` method:** Predicts a label for each new data point in `X`:
        1. Calculates the Euclidean distance from the new point to every point in the training set.
        2. Sorts the distances in ascending order (`np.argsort`) and keeps the indices of the `k` smallest.
        3. Finds the most frequent class label (the mode) among those `k` neighbors.
        4. Assigns that label as the prediction for the new point.
- **Model Evaluation:** The script evaluates the classifier on the Iris flower dataset (`load_iris`).
    - The data is split into training and testing sets with `train_test_split`, holding out 25% for testing (`test_size=0.25`).
    - The custom KNN model (`k_near`, with `k=5`) is fitted to the training data (`fit`).
    - Predictions are made on the testing data (`predict`).
    - The accuracy score (`accuracy_score`) and confusion matrix (`confusion_matrix`) are computed to assess model performance.
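
The four steps of `predict` listed above can be traced by hand on a tiny example. The sketch below is illustrative only: the toy points, labels, and `k=3` are made up for the demonstration, and it uses `collections.Counter` for the majority vote rather than the `max(set(...))` idiom in the project code.

```python
import numpy as np
from collections import Counter

# Hypothetical toy data: four 2-D training points belonging to two classes.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
y_train = np.array([0, 0, 1, 1])
x_new = np.array([1.0, 0.0])  # the point to classify
k = 3

# Step 1: Euclidean distance from x_new to every training point.
distances = np.sqrt(np.sum((X_train - x_new) ** 2, axis=1))

# Step 2: indices of the k smallest distances.
nearest = np.argsort(distances)[:k]

# Step 3: most frequent label among those k neighbors.
neighbor_labels = y_train[nearest].tolist()
predicted = Counter(neighbor_labels).most_common(1)[0][0]

# Step 4: the mode becomes the prediction.
print("distances:", np.round(distances, 3))
print("neighbor labels:", neighbor_labels)
print("predicted class:", predicted)
```

Two of the three nearest neighbors carry label `0`, so the new point is assigned class `0`.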

**Outcomes:**

- This project demonstrates the implementation of a KNN classifier from scratch, allowing for a deeper understanding of the algorithm's core principles.
- The evaluation on the Iris dataset shows how the model performs on a small, well-known benchmark: the run recorded below reaches about 89.5% accuracy on 38 test samples.
- The confusion matrix shows how well the model separates the three flower classes; in the recorded run, the only errors are four class-1 samples predicted as class 2.

**Further Exploration:**

- Experiment with different values of `k` to see how it affects KNN performance.
- Implement distance metrics beyond Euclidean distance (e.g., Manhattan distance).
- Compare the custom KNN model with scikit-learn's KNN implementation.
- Explore more complex datasets and classification problems.
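
As a starting point for the first three ideas, the sketch below sweeps several values of `k`, switches between Euclidean and Manhattan distance, and compares the results with scikit-learn's `KNeighborsClassifier`. The helper `knn_predict` is a hypothetical, vectorized stand-in for the `K_Near` class (not the project's own code), and `random_state=0` is an assumption added only to make the sketch repeatable. The two implementations may break ties differently, so small disagreements are possible.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def knn_predict(X_train, y_train, X_test, k=5, metric="euclidean"):
    """Hypothetical helper: majority-vote KNN with a selectable metric."""
    # Pairwise differences, shape (n_test, n_train, n_features).
    diff = X_test[:, None, :] - X_train[None, :, :]
    if metric == "euclidean":
        dists = np.sqrt((diff ** 2).sum(axis=2))
    elif metric == "manhattan":
        dists = np.abs(diff).sum(axis=2)
    else:
        raise ValueError(f"unknown metric: {metric}")
    # Labels of the k closest training points for every test point.
    nearest = np.argsort(dists, axis=1)[:, :k]
    neighbor_labels = y_train[nearest]
    # Majority vote per row; argmax returns the smallest label on ties.
    return np.array([np.bincount(row).argmax() for row in neighbor_labels])


iris = load_iris()
# random_state is fixed here only for repeatability (an assumption).
X_tr, X_te, y_tr, y_te = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0
)

results = {}
for metric in ("euclidean", "manhattan"):
    for k in (1, 3, 5, 7, 9):
        custom_acc = accuracy_score(y_te, knn_predict(X_tr, y_tr, X_te, k, metric))
        reference = KNeighborsClassifier(n_neighbors=k, metric=metric).fit(X_tr, y_tr)
        sklearn_acc = accuracy_score(y_te, reference.predict(X_te))
        results[(metric, k)] = (custom_acc, sklearn_acc)
        print(f"{metric:<10} k={k}: custom={custom_acc:.3f} sklearn={sklearn_acc:.3f}")
```

Odd values of `k` are used here to reduce the chance of tied votes, although with three classes ties can still occur.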

By building and evaluating a custom KNN classifier, this project provides valuable hands-on experience with KNN and lays the foundation for further exploration of machine learning algorithms.


In [None]:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

def distance(x1, x2):
    """Return the Euclidean distance between two NumPy vectors."""
    return np.sqrt(np.sum((x1 - x2) ** 2))

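class K_Near:
    """Minimal k-nearest neighbors classifier using majority voting."""

    def __init__(self, k=5):
        self.k = k  # number of neighbors that vote on each prediction

    def fit(self, X, y):
        # KNN is a lazy learner: fitting only stores the training data.
        self.X_train = X
        self.y_train = y

    def predict(self, X):
        y_predict = []
        for x_new in X:
            # Distance from the new point to every training point.
            distances = [distance(x_new, x) for x in self.X_train]
            # Indices of the k smallest distances.
            index = np.argsort(distances)[:self.k]
            labels = [self.y_train[j] for j in index]
            # Majority vote; ties are resolved by set iteration order.
            common = max(set(labels), key=labels.count)
            y_predict.append(common)
        return np.array(y_predict)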

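# Load the Iris data and hold out 25% for testing.
# No random_state is set, so the split (and the scores) change from run to run.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.25)

# Train the custom classifier and predict the held-out labels.
k_near = K_Near(k=5)
k_near.fit(X_train, y_train)
y_predict = k_near.predict(X_test)

# Rows of the confusion matrix are true classes; columns are predicted classes.
accuracy = accuracy_score(y_test, y_predict)
matrix = confusion_matrix(y_test, y_predict)

print(f"Accuracy:{accuracy*100}%\n")
print("Confusion Matrix:")
print(matrix)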

Accuracy:89.47368421052632%

Confusion Matrix:
[[10  0  0]
 [ 0 14  4]
 [ 0  0 10]]
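
The reported accuracy can be checked directly against the confusion matrix printed above. The sketch below copies that matrix by hand and relies on scikit-learn's convention that rows are true classes and columns are predicted classes. The per-class recall and precision are extra quantities derived here for illustration; the original script does not compute them.

```python
import numpy as np

# Confusion matrix copied from the output above
# (rows: true class, columns: predicted class).
matrix = np.array([[10, 0, 0],
                   [0, 14, 4],
                   [0, 0, 10]])

correct = np.trace(matrix)  # correctly classified samples (the diagonal)
total = matrix.sum()        # all test samples
accuracy = correct / total

# Recall: fraction of each true class that was predicted correctly.
recall = np.diag(matrix) / matrix.sum(axis=1)
# Precision: fraction of each predicted class that was actually correct.
precision = np.diag(matrix) / matrix.sum(axis=0)

print(f"Accuracy: {correct}/{total} = {accuracy:.4f}")
print("Recall per class:   ", np.round(recall, 3))
print("Precision per class:", np.round(precision, 3))
```

The diagonal sums to 34 of 38 samples, which matches the printed 89.47%. All four errors are class-1 samples (versicolor) predicted as class 2 (virginica), so class 1 has a recall of 14/18 and class 2 has a precision of 10/14.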
