
# KNN Classification Tutorial with Python

## Step 1: Setup

Ensure you have the necessary Python libraries. We'll use `scikit-learn` for KNN and `numpy` for data handling. Install them using pip if you don't have them:

```python
!pip install scikit-learn numpy
```

## Step 2: Import Libraries

In [1]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris, load_diabetes
from sklearn.metrics import accuracy_score


## Step 3: Load the Dataset

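We will use the Iris dataset, a classic in machine learning. It contains 150 samples, each with four measurements of an iris flower (sepal length, sepal width, petal length, petal width) and a species label encoded as 0, 1, or 2.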

In [69]:
iris = load_iris()
X = iris.data
y = iris.target
X.shape

(150, 4)
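Before modeling, it helps to see what the columns and labels mean. The following is a minimal sketch that reads the feature and class names from the object returned by `load_iris()` and counts the samples per class; the variable name `class_counts` is illustrative and not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

# Column names and class names ship with the dataset object
print(iris.feature_names)
print(list(iris.target_names))

# Number of samples per class (labels are 0, 1, 2)
class_counts = np.bincount(y)
print(class_counts)  # [50 50 50]
```

Balanced classes like these make plain accuracy a reasonable first metric.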

In [61]:
print(X[0:5,:], y[:5])

[[ 0.03807591  0.05068012  0.06169621  0.02187235 -0.0442235  -0.03482076
  -0.04340085 -0.00259226  0.01990842 -0.01764613]
 [-0.00188202 -0.04464164 -0.05147406 -0.02632783 -0.00844872 -0.01916334
   0.07441156 -0.03949338 -0.06832974 -0.09220405]
 [ 0.08529891  0.05068012  0.04445121 -0.00567061 -0.04559945 -0.03419447
  -0.03235593 -0.00259226  0.00286377 -0.02593034]
 [-0.08906294 -0.04464164 -0.01159501 -0.03665645  0.01219057  0.02499059
  -0.03603757  0.03430886  0.02269202 -0.00936191]
 [ 0.00538306 -0.04464164 -0.03638469  0.02187235  0.00393485  0.01559614
   0.00814208 -0.00259226 -0.03199144 -0.04664087]] [151.  75. 141. 206. 135.]

Note that these rows have 10 features and continuous targets, while Iris has 4 features and integer labels. The cell order (`In [61]` ran before `In [69]`) suggests this output was printed while `X` and `y` still held the `load_diabetes` data.


## Step 4: Split the Data

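Split the data into a training set and a test set. Here 20% of the samples are held out for testing, and `random_state=42` makes the split reproducible.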

In [62]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
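A plain random split can leave the classes slightly unbalanced between the two subsets. As a hedged variation on the call above, the sketch below adds `stratify=y` (an addition, not something the original notebook uses) so each class keeps the same proportion in both subsets.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y is an addition to the tutorial's call: it preserves class proportions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)
# Per-class sample counts in each subset
print(np.bincount(y_train), np.bincount(y_test))
```

Stratification matters more on small or imbalanced datasets than it does on Iris.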


## Step 5: Create and Train the KNN Model

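Create our KNN model. Let's start with `k=3`, so each prediction is a majority vote among the three nearest training samples. For KNN, `fit` mainly stores (and possibly indexes) the training data; the neighbor search happens at prediction time.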

In [63]:
k = 3
knn = KNeighborsClassifier(n_neighbors=k)
knn.fit(X_train, y_train)

KNeighborsClassifier(n_neighbors=3)
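To make the idea behind the classifier concrete, here is a from-scratch sketch of a single KNN prediction using Euclidean distance and a majority vote. The function `knn_predict` and the toy arrays are hypothetical illustrations; scikit-learn's implementation is more efficient and may break ties differently.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_query, k=3):
    """Predict one label by majority vote among the k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Most common label among those neighbors
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters
X_toy = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                  [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_toy, y_toy, np.array([0.05, 0.05])))  # 0
print(knn_predict(X_toy, y_toy, np.array([5.0, 5.1])))    # 1
```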

## Step 6: Make Predictions

Use our trained model to make predictions on our test set.

In [64]:
predictions = knn.predict(X_test)
print(X_test.shape)
print(predictions)

(89, 10)
[ 67. 141.  69. 220.  51.  95. 178. 139.  74.  53.  59. 131.  85. 129.
  39.  60. 180. 180. 151. 163.  78.  59.  59. 163. 103. 127. 127.  91.
  59.  51.  91.  69.  51. 129. 115. 178.  53.  51.  81.  75.  49.  66.
  47.  89. 154.  60.  55.  64.  59.  93. 121.  69.  55.  49. 138. 141.
  49. 127. 115.  59.  67.  78.  49.  65.  93.  67. 163.  53. 142.  58.
  58. 122. 155.  58.  65. 138. 139. 147. 178. 151. 113. 125.  51.  60.
  49.  84. 114.  42.  97.]
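
Raw integer labels are hard to read on their own. The sketch below, which assumes the model is fitted on Iris as described in Steps 3 to 5, maps predicted labels to species names and uses `predict_proba` to show how the neighbors voted. The names `species` and `probabilities` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

predictions = knn.predict(X_test)
# Integer labels index into target_names to give species names
species = iris.target_names[predictions]
# With the default uniform weights, each row holds the fraction of
# the 3 neighbors that voted for each class
probabilities = knn.predict_proba(X_test)

print(species[:5])
print(probabilities[:5])
```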



## Step 7: Evaluate the Model

Use accuracy as our evaluation metric: the fraction of test instances whose predicted class matches the true class.

In [65]:
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy * 100:.2f}%")

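Accuracy: 0.00%

An accuracy of 0% points to a data mismatch rather than a weak model. The `(89, 10)` test shape and float predictions such as `141.` show that this run was fitted on `load_diabetes`, a regression dataset with continuous targets. `KNeighborsClassifier` treats every distinct target value as its own class, so exact matches with `y_test` are rare. On Iris, the same split yields a 30-row test set with labels in {0, 1, 2}, for which accuracy is a meaningful metric.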
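
Accuracy is a single number and hides which classes get confused with each other. As a minimal sketch, assuming the Iris data and the same split parameters as Step 4, a confusion matrix can be computed alongside it; passing `labels=[0, 1, 2]` is an assumption made here to fix the matrix layout.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
predictions = knn.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, predictions, labels=[0, 1, 2])

print(f"Accuracy: {accuracy * 100:.2f}%")
print(cm)
```

The diagonal of the matrix counts correct predictions, so its sum divided by the total equals the accuracy.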


In [70]:
print(set(y))

{0, 1, 2}


In [66]:
import numpy as np
predictions = knn.predict(np.array([[0, 0, 55, 56],[0,0, 0, 0]]))
print(predictions)

ValueError: X has 4 features, but KNeighborsClassifier is expecting 10 features as input.
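
This error occurs because the number of columns passed to `predict` must equal the number of features seen during `fit`. The sketch below assumes a model fitted on the four Iris features and checks the fitted attribute `n_features_in_` before predicting; the sample values in `new_samples` are illustrative measurements in centimeters.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)

# Number of features the model saw during fit
print(knn.n_features_in_)  # 4

# Columns: sepal length, sepal width, petal length, petal width
new_samples = np.array([[5.1, 3.5, 1.4, 0.2],
                        [6.7, 3.0, 5.2, 2.3]])
if new_samples.shape[1] != knn.n_features_in_:
    raise ValueError("feature count does not match the fitted model")
predictions = knn.predict(new_samples)
print(predictions)

# A query with the wrong number of columns is rejected
try:
    knn.predict(np.zeros((1, 10)))
    mismatch_rejected = False
except ValueError as err:
    mismatch_rejected = True
    print(err)
```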

## Step 8: Cross-Validation

A single train/test split gives one noisy estimate of performance. Cross-validation averages the score over several splits; the cell below applies it to the handwritten digits dataset.

In [67]:
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_digits

# Load the handwritten digits dataset (10 classes, labels 0-9)
digits = load_digits()
X, y = digits.data, digits.target

# Create KNN classifier
knn = KNeighborsClassifier(n_neighbors=3)

# Perform 10-fold cross-validation
cv_scores = cross_val_score(knn, X, y, cv=10)

# Output the mean and standard deviation of the scores
print(f"CV Scores: {cv_scores}")
print(f"Mean score: {cv_scores.mean()}")
print(f"Standard Deviation: {cv_scores.std()}")
# Show the first five labels
print(y[:5])

CV Scores: [0.93888889 1.         0.98888889 0.97222222 0.96666667 0.97777778
 0.98333333 0.98324022 0.98324022 0.97206704]
Mean score: 0.9766325263811299
Standard Deviation: 0.015472517471692416
[0 1 2 3 4]
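
Cross-validation is also a common way to choose `k` instead of fixing it at 3. The following is a minimal sketch, assuming a small hand-picked candidate list and 5 folds to keep it quick; `mean_scores` and `best_k` are illustrative names.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

# Mean cross-validated accuracy for each candidate k (candidates are an assumption)
mean_scores = {}
for k in (1, 3, 5, 7):
    knn = KNeighborsClassifier(n_neighbors=k)
    scores = cross_val_score(knn, X, y, cv=5)
    mean_scores[k] = scores.mean()

# Pick the k with the highest mean score
best_k = max(mean_scores, key=mean_scores.get)
print(mean_scores)
print(f"Best k: {best_k}")
```

For a larger search, `GridSearchCV` from `sklearn.model_selection` wraps the same loop and refits the best model.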


