# Program 9 - K Nearest Neighbors

---

This program uses the `KNeighborsClassifier` class from the `sklearn` package.

The data is the iris dataset that ships with `sklearn`.

In [1]:
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris

### Load data set

`load_iris()` is used to load the iris dataset.

In [2]:
iris = load_iris()

### Split data into training and testing data sets

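The split is done with `train_test_split()`. With `test_size=0.1`, 10% of the samples are held out for testing and the remaining 90% are used for training.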

In [3]:
x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.1)

print("Size of training data and its label", x_train.shape, y_train.shape)
print("Size of testing data and its label", x_test.shape, y_test.shape)

Size of training data and its label (135, 4) (135,)
Size of testing data and its label (15, 4) (15,)
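
To make the split less of a black box, here is a minimal pure-Python sketch of the idea behind a shuffled hold-out split. The helper `simple_train_test_split`, its `seed` parameter, and the placeholder data are assumptions made for illustration; this is not how `train_test_split()` is implemented, only the same concept in a few lines.

```python
import random


def simple_train_test_split(samples, labels, test_size=0.1, seed=0):
    """Shuffle the indices, then hold out a `test_size` fraction for testing.

    Hypothetical helper for illustration; not scikit-learn's implementation.
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # a fixed seed makes the split repeatable
    n_test = round(len(samples) * test_size)  # rounding rule chosen for this sketch
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [samples[i] for i in train_idx]
    x_test = [samples[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Placeholder data with the same size as iris: 150 samples, 3 classes of 50.
samples = [[float(i)] for i in range(150)]
labels = [i // 50 for i in range(150)]

x_train, x_test, y_train, y_test = simple_train_test_split(samples, labels, test_size=0.1)
print(len(x_train), len(x_test))  # 135 15
```

Because the notebook call does not fix a seed, a rerun can select different test samples and produce different results from the ones shown here.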


### Print labels and their names

In [4]:
for i, name in enumerate(iris.target_names):
    print("Label", i, "-", name)

Label 0 - setosa
Label 1 - versicolor
Label 2 - virginica


In [5]:
iris.target_names

array(['setosa', 'versicolor', 'virginica'], dtype='<U10')

## Create object of KNN classifier

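`n_neighbors` is the parameter that sets how many of the nearest training points are consulted when a new point is classified. With `n_neighbors=1`, each test point simply takes the label of its single closest training point.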
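
The effect of `n_neighbors` can be sketched with a small from-scratch version of the voting step. The function `knn_predict` and the toy 2-D points below are hypothetical and exist only for illustration; this is a brute-force sketch using Euclidean distance and a majority vote, not scikit-learn's internal implementation.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, n_neighbors=1):
    """Label `query` by majority vote among its `n_neighbors` closest points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:n_neighbors]]
    # Most frequent label among the nearest neighbors.
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data: a cluster of class 0 that contains one noisy point labeled 1,
# plus a separate cluster of class 1.
train_points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1), (3.0, 3.0), (3.1, 2.8)]
train_labels = [0, 0, 0, 1, 1, 1]
query = (1.12, 1.12)

print(knn_predict(train_points, train_labels, query, n_neighbors=1))  # 1
print(knn_predict(train_points, train_labels, query, n_neighbors=3))  # 0
```

With one neighbor the noisy point decides the label on its own; with three neighbors it is outvoted by the surrounding class 0 points. This is why a larger `n_neighbors` tends to be less sensitive to individual outliers.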

In [6]:
classifier = KNeighborsClassifier(n_neighbors=1)

## Perform Training


In [7]:
classifier.fit(x_train, y_train)

KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
           metric_params=None, n_jobs=1, n_neighbors=1, p=2,
           weights='uniform')

## Predict values for test data set

In [8]:
y_pred = classifier.predict(x_test)

In [9]:
print("Results of Classification using K-nn with K=1 ")
for sample, actual, predicted in zip(x_test, y_test, y_pred):
    print(" Sample:", sample, " Actual-label:", actual, " Predicted-label:", predicted)

Results of Classification using K-nn with K=1 
 Sample: [6.3 3.3 6.  2.5]  Actual-label: 2  Predicted-label: 2
 Sample: [6.9 3.2 5.7 2.3]  Actual-label: 2  Predicted-label: 2
 Sample: [5.4 3.9 1.7 0.4]  Actual-label: 0  Predicted-label: 0
 Sample: [4.8 3.4 1.6 0.2]  Actual-label: 0  Predicted-label: 0
 Sample: [5.6 2.5 3.9 1.1]  Actual-label: 1  Predicted-label: 1
 Sample: [5.8 2.7 5.1 1.9]  Actual-label: 2  Predicted-label: 2
 Sample: [5.5 2.3 4.  1.3]  Actual-label: 1  Predicted-label: 1
 Sample: [7.7 2.6 6.9 2.3]  Actual-label: 2  Predicted-label: 2
 Sample: [5.2 3.5 1.5 0.2]  Actual-label: 0  Predicted-label: 0
 Sample: [5.6 2.8 4.9 2. ]  Actual-label: 2  Predicted-label: 2
 Sample: [6.  2.9 4.5 1.5]  Actual-label: 1  Predicted-label: 1
 Sample: [5.8 2.7 4.1 1. ]  Actual-label: 1  Predicted-label: 1
 Sample: [7.2 3.  5.8 1.6]  Actual-label: 2  Predicted-label: 2
 Sample: [6.  3.  4.8 1.8]  Actual-label: 2  Predicted-label: 2
 Sample: [6.3 2.7 4.9 1.8]  Actual-label: 2  Predicted-label: 2

## Classification Accuracy
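
`classifier.score()` returns the mean accuracy on the given test data and labels, that is, the fraction of test samples whose predicted label matches the actual label.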

In [10]:
print("Classification Accuracy :", classifier.score(x_test, y_test))

Classification Accuracy : 1.0
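
As a sanity check on what this number means, accuracy can be computed by hand as the share of matching labels. The helper `accuracy` and the two short label lists are hypothetical values chosen for illustration, not data from this run.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the actual label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty list")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels: one of the five predictions is wrong.
actual = [2, 2, 0, 0, 1]
predicted = [2, 1, 0, 0, 1]

print(accuracy(actual, predicted))  # 0.8
```

An accuracy of 1.0, as in the run above, means every one of the 15 test samples was labeled correctly. With such a small test set, a single mistake would already lower the score to about 0.93.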


## Output Metrics of the classification

In [11]:
from sklearn.metrics import classification_report, confusion_matrix

In [12]:
print('Confusion Matrix')
print(confusion_matrix(y_test,y_pred))
print('Accuracy Metrics')
print(classification_report(y_test,y_pred))

Confusion Matrix
[[3 0 0]
 [0 4 0]
 [0 0 8]]
Accuracy Metrics
             precision    recall  f1-score   support

          0       1.00      1.00      1.00         3
          1       1.00      1.00      1.00         4
          2       1.00      1.00      1.00         8

avg / total       1.00      1.00      1.00        15
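
The confusion matrix and the per-class precision and recall can be reproduced by hand, which helps when reading the report. In the sketch below, the functions `confusion_counts` and `precision_recall` are hypothetical helpers. `y_true` copies the actual labels from the results listing above, and `y_pred` equals `y_true` because the reported accuracy is 1.0. The single misprediction in `y_pred_one_error` is invented to show how the metrics change.

```python
def confusion_counts(y_true, y_pred, n_classes):
    """Rows are actual labels, columns are predicted labels."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


def precision_recall(matrix, label):
    """Precision and recall for one class, read off the confusion matrix."""
    true_positive = matrix[label][label]
    predicted_as_label = sum(row[label] for row in matrix)  # column sum
    actually_label = sum(matrix[label])                     # row sum
    precision = true_positive / predicted_as_label if predicted_as_label else 0.0
    recall = true_positive / actually_label if actually_label else 0.0
    return precision, recall


# Actual labels in the order printed in the results listing.
y_true = [2, 2, 0, 0, 1, 2, 1, 2, 0, 2, 1, 1, 2, 2, 2]
y_pred = list(y_true)  # accuracy 1.0: every prediction matches

matrix = confusion_counts(y_true, y_pred, n_classes=3)
print(matrix)  # [[3, 0, 0], [0, 4, 0], [0, 0, 8]]

# Invented variation: the first versicolor sample is predicted as virginica.
y_pred_one_error = list(y_true)
y_pred_one_error[4] = 2
noisy_matrix = confusion_counts(y_true, y_pred_one_error, n_classes=3)
print(noisy_matrix)                       # [[3, 0, 0], [0, 3, 1], [0, 0, 8]]
print(precision_recall(noisy_matrix, 1))  # (1.0, 0.75)
```

All counts sit on the diagonal in the actual run, so precision and recall are 1.00 for every class. In the invented variation, the off-diagonal entry lowers the recall of class 1 and the precision of class 2.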

