# K-Nearest Neighbors

In [1]:
# Import necessary libraries
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report

## Load Dataset

The Iris dataset is a classic dataset in machine learning and statistics. It was introduced by the British biologist and statistician Ronald A. Fisher in 1936 as an example of discriminant analysis. It contains four measurements (sepal length, sepal width, petal length, and petal width, all in centimetres) for 150 iris flowers from three species: setosa, versicolor, and virginica. Each species has 50 samples.

### Features

*   Sepal Length: Length of the iris flower's sepal (the outermost whorl of a flower).
*   Sepal Width: Width of the iris flower's sepal.
*   Petal Length: Length of the iris flower's petal (the inner whorl of a flower).
*   Petal Width: Width of the iris flower's petal.

### Species
* Setosa: Has the smallest petals of the three species, in both length and width. In this dataset it is linearly separable from the other two species.
* Versicolor: Intermediate in size, with petal measurements that overlap those of virginica.
* Virginica: Generally the largest of the three, with longer sepals and petals than the other two species.



In [2]:
iris = load_iris()
X = iris.data
y = iris.target
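
As a quick sanity check on the description above, the loaded object can be inspected directly. This is a minimal sketch that only uses attributes exposed by `load_iris()`; the printed comments reflect the dataset as described in the text.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)             # (150, 4): 150 flowers, 4 measurements each
print(iris.feature_names)  # the four measurements, in centimetres
print(iris.target_names)   # setosa, versicolor, virginica
print(np.bincount(y))      # [50 50 50]: 50 samples per species
```

The integer labels in `y` (0, 1, 2) index into `iris.target_names`, which is why the classification report later lists classes as 0, 1 and 2.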

## Data Preprocessing

In [3]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
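
The split above samples rows at random, so the three species are not guaranteed to appear in equal numbers in the test set. The sketch below counts the classes in the test set and, as an assumption not used in this notebook, shows the `stratify=y` option of `train_test_split`, which preserves class proportions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Same call as in the notebook: a purely random 80/20 split.
_, _, _, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(np.bincount(y_test, minlength=3))

# Assumption (not in the original notebook): stratify on the labels
# so that each species keeps its one-third share of the test set.
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(np.bincount(y_test_strat, minlength=3))  # [10 10 10]
```

The `support` column of the classification report at the end of the notebook shows the per-class test counts that result from the unstratified split.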

Standardization is a preprocessing technique that rescales each feature to have a mean of 0 and a standard deviation of 1. It is also known as z-score normalization. Standardizing features is particularly important for algorithms that rely on distances between data points, such as K-Nearest Neighbors (KNN), because a feature with a larger numeric range would otherwise dominate the distance calculation. The scaler is fitted on the training set only and then applied to the test set, so that no information from the test data leaks into preprocessing.

In [4]:
# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
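
To make the standardization step concrete, the z-score `z = (x - mean) / std` can be computed by hand with NumPy and compared with the output of `StandardScaler`. This is a minimal sketch; the names `mu`, `sigma`, `X_train_manual` and `X_test_manual` are illustrative and do not appear elsewhere in the notebook.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

scaler = StandardScaler().fit(X_train)

# Per-feature statistics, computed from the training set only.
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)  # population std (ddof=0), as used by StandardScaler

# Apply the same training statistics to both sets.
X_train_manual = (X_train - mu) / sigma
X_test_manual = (X_test - mu) / sigma

print(np.allclose(X_train_manual, scaler.transform(X_train)))  # True
print(np.allclose(X_test_manual, scaler.transform(X_test)))    # True
```

Only the training set ends up with a mean of exactly 0 and a standard deviation of exactly 1 per feature; the test set is shifted and scaled by the training statistics, so its values are close to, but not exactly, standardized.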

## Model Building

In [5]:
# Initialize the KNN classifier with k = 3 neighbors
knn = KNeighborsClassifier(n_neighbors=3)

## Model Training

In [7]:
# Fit the classifier to the training data
knn.fit(X_train, y_train)

In [8]:
# Make predictions on the test set
predictions = knn.predict(X_test)
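
To illustrate what the classifier does when it predicts, here is a minimal from-scratch sketch of KNN classification: compute the Euclidean distance from each test point to every training point, take the `k` closest, and return the majority label. The helper `knn_predict` is hypothetical and written for illustration; it is not how scikit-learn implements `KNeighborsClassifier`, and it may break ties differently.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def knn_predict(X_train, y_train, X_query, k=3):
    """Hypothetical helper: majority vote among the k nearest training rows."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    diffs = X_query[:, None, :] - X_train[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Indices of the k closest training points for each query point.
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Majority vote; argmax resolves ties in favour of the smallest label.
    return np.array([np.bincount(y_train[idx]).argmax() for idx in nearest])


X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

manual_predictions = knn_predict(X_train, y_train, X_test, k=3)
print(manual_predictions[:10])
print("Manual accuracy:", (manual_predictions == y_test).mean())
```

There is no real training phase in this sketch: KNN simply stores the training data, and all the work happens at prediction time.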

## Model Evaluation

In [9]:
# Evaluate the model
accuracy = accuracy_score(y_test, predictions)
report = classification_report(y_test, predictions)

In [10]:
print(f"Accuracy: {accuracy:.2f}")
print("Classification Report:\n", report)

Accuracy: 1.00
Classification Report:
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
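
A perfect score on a 30-sample test set should be read with some caution, since each misclassified flower would change the accuracy by about 3.3 percentage points. One way to get a more stable estimate is k-fold cross-validation. The sketch below is an assumption layered on top of this notebook: the use of `make_pipeline`, `cross_val_score`, 5 folds, and the candidate values of `k` are illustrative choices, not part of the original workflow.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

scores_by_k = {}
for k in (1, 3, 5, 7, 9):  # candidate neighbor counts (illustrative)
    # The pipeline refits the scaler inside each fold, which avoids
    # leaking statistics from the held-out fold into preprocessing.
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    scores = cross_val_score(model, X, y, cv=5)
    scores_by_k[k] = scores.mean()
    print(f"k={k}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")
```

Comparing the mean accuracy across values of `k` gives a rough basis for choosing `n_neighbors` instead of fixing it at 3.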

