Importing the required libraries. 

In [None]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.model_selection import cross_val_score
import seaborn as sns

import warnings
warnings.filterwarnings("ignore")

Importing the dataset and understanding it.

In [None]:
data = pd.read_csv('../input/zoo-animal-classification/zoo.csv')
data.head()

In [None]:
data.shape

In [None]:
data.info()

The data is split into two arrays:
* X - Features 
* y - Targets

In [None]:
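# Target: the class label of each animal
y = data['class_type'].values
# Features: every column except the label and the animal's name
X = data.drop(['class_type', 'animal_name'], axis=1).values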

The data is now split into training (70%) and testing (30%) sets.

In [None]:
X_train,X_test,y_train,y_test = train_test_split(X, y, test_size=0.3, random_state=42)
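
When some classes have only a few members, a purely random split can leave a class under-represented in one of the sets. One option, not used in this notebook, is the `stratify` argument of `train_test_split`, which keeps the class proportions similar in both sets (it needs at least two samples per class). A minimal sketch on made-up arrays; `X_demo` and `y_demo` are illustrative and are not the zoo data:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Illustrative data only: 20 samples, 2 features, 3 imbalanced classes
X_demo = np.arange(40).reshape(20, 2)
y_demo = np.array([1] * 10 + [2] * 6 + [3] * 4)

# stratify=y_demo asks for roughly the same class mix in both splits
X_tr_demo, X_te_demo, y_tr_demo, y_te_demo = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

# Count how many samples of each class ended up in each split
train_classes, train_counts = np.unique(y_tr_demo, return_counts=True)
test_classes, test_counts = np.unique(y_te_demo, return_counts=True)
print(dict(zip(train_classes.tolist(), train_counts.tolist())))
print(dict(zip(test_classes.tolist(), test_counts.tolist())))
```

Applied to the notebook's data, this would amount to passing `stratify=y` in the call above; the scores reported further down were obtained without it.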

In [None]:
sns.set_style('whitegrid')

A KNN classifier is trained for a range of neighbour counts (K), keeping all other parameters of the classifier at their defaults.

For each value of K, the mean 10-fold cross-validation accuracy on the training set and the error rate on the test set are appended to lists.

In [None]:
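k_list = np.arange(1, 50, 2)  # odd values of K from 1 to 49
mean_scores = []  # mean cross-validation accuracy for each K
error_rate = []   # fraction of misclassified test samples for each K

for i in k_list:
    knn = KNeighborsClassifier(n_neighbors=i)
    knn.fit(X_train, y_train)
    pred_i = knn.predict(X_test)
    # cross_val_score refits clones of knn on folds of the training set
    score = cross_val_score(knn, X_train, y_train, cv=10)
    mean_scores.append(np.mean(score))
    error_rate.append(np.mean(pred_i != y_test))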
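
The two quantities collected in the loop can be read off programmatically as well as from the plots. The sketch below shows one way to do that; `pick_best_k` and `error_rate_of` are hypothetical helper names, and the demo values are made up for illustration, not results from the zoo data:

```python
import numpy as np

def pick_best_k(k_values, cv_scores):
    """Return the K with the highest mean CV score.

    np.argmax returns the first maximum, so with K sorted in
    ascending order a tie goes to the smallest K.
    """
    k_values = np.asarray(k_values)
    cv_scores = np.asarray(cv_scores, dtype=float)
    return int(k_values[np.argmax(cv_scores)])

def error_rate_of(y_true, y_pred):
    """Fraction of predictions that differ from the true labels."""
    return float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))

# Illustrative values only
demo_k = np.arange(1, 10, 2)                  # K = 1, 3, 5, 7, 9
demo_scores = [0.90, 0.92, 0.92, 0.85, 0.80]  # made-up mean CV accuracies
best = pick_best_k(demo_k, demo_scores)
demo_error = error_rate_of([1, 2, 2, 4], [1, 2, 3, 4])  # one of four is wrong
print(best, demo_error)
```

In the notebook, `pick_best_k(k_list, mean_scores)` would give the K favoured by cross-validation, which need not match the K with the lowest test error.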

Plotting the mean cross-validation accuracy against K.

In [None]:
# Set the size before plotting; changing rcParams after plt.plot
# would only affect figures created later
plt.figure(figsize=(12, 12))
plt.plot(k_list, mean_scores, marker='o')

plt.title('Accuracy of Model for Varying Values of K')
plt.xlabel("Values of K")
plt.ylabel("Mean Accuracy Score")
plt.xticks(k_list)

plt.show()

Plotting the test-set error rate against K.

In [None]:
# Set the size before plotting so it applies to this figure
plt.figure(figsize=(12, 12))
plt.plot(k_list, error_rate, color='r', marker='o')

plt.title('Error Rate for Model for Varying Values of K')
plt.xlabel("Values of K")
plt.ylabel("Error Rate")
plt.xticks(k_list)

plt.show()

Taking K=5, the classifier model is trained and tested. 

In [None]:
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train,y_train)

In [None]:
knn.score(X_train,y_train)

In [None]:
knn.score(X_test,y_test)

A test accuracy of 87% is obtained with 5 neighbours for the KNN classifier.
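
A single accuracy figure does not show which classes are being confused. The notebook imports `confusion_matrix` and `accuracy_score` but does not call them; the sketch below shows how they could be used, on small made-up label arrays (`y_true_demo` and `y_pred_demo` are illustrative, not the model's predictions):

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Illustrative labels only: three classes, six samples
class_ids = [1, 2, 3]
y_true_demo = np.array([1, 1, 2, 2, 3, 3])
y_pred_demo = np.array([1, 1, 2, 3, 3, 1])

# Rows are true classes, columns are predicted classes,
# both in the order given by `labels`
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=class_ids)
acc = accuracy_score(y_true_demo, y_pred_demo)

# Per-class recall: correct predictions divided by samples of that class
recall = cm.diagonal() / cm.sum(axis=1)

print(cm)
print(acc)
print(recall)
```

For the notebook's model the equivalent call would be `confusion_matrix(y_test, knn.predict(X_test))`; off-diagonal entries then show which animal classes are mistaken for which.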

A histogram is plotted to compare the counts of actual and predicted classes in the test set for this model.

In [None]:
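y_pred = knn.predict(X_test)
class_ids = np.arange(1, 8)  # class_type takes the values 1 to 7
labels = ["Mammal", "Bird", "Reptile", "Fish", "Amphibian", "Bug", "Invertebrate"]
bins = np.arange(0.5, 8.5)   # one bin centred on each class id, shared by both histograms

_, ax = plt.subplots(figsize=(9, 9))
ax.hist(y_test, bins=bins, color='m', alpha=0.5, label='actual')
ax.hist(y_pred, bins=bins, color='c', alpha=0.5, label='prediction')
ax.set_xticks(class_ids)     # pin tick positions so each label sits under its bin
ax.set_xticklabels(labels)
ax.yaxis.set_ticks(np.arange(0, 16))
ax.legend(loc='best')
plt.show()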