Aim: Demonstrate and analyse the results of classification using the k-Nearest Neighbour (k-NN) algorithm.
Program: Write a program that implements the k-Nearest Neighbour algorithm to classify the Iris data set. Print both the correct and the wrong predictions. Java/Python ML library classes may be used for this problem.

In [19]:
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

In [20]:
# Load the Iris dataset
iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Series(iris.target, name='target')

# Print the column names
print("Column names of X:", X.columns.tolist())
print("Column name of y:", y.name)

Column names of X: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
Column name of y: target


In [21]:
iris.target_names

array(['setosa', 'versicolor', 'virginica'], dtype='<U10')

In [22]:
X.shape, y.shape

((150, 4), (150,))

In [23]:
X.head()

   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2


In [24]:
y.head()

0    0
1    0
2    0
3    0
4    0
Name: target, dtype: int64

In [25]:
y.value_counts()

target
0    50
1    50
2    50
Name: count, dtype: int64

In [26]:
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [27]:
# Initialize the k-NN classifier
knn = KNeighborsClassifier(n_neighbors=3)
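
To make explicit what a 3-nearest-neighbour vote does, here is a minimal pure-Python sketch. It is not how scikit-learn implements `KNeighborsClassifier` internally: the function name `knn_predict` and the small two-feature sample are illustrative assumptions. It uses Euclidean distance and an unweighted majority vote, which correspond to the scikit-learn defaults.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Unweighted majority vote over the neighbours' labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy sample: (petal length, petal width) pairs in cm
train_points = [(1.4, 0.2), (1.3, 0.2), (4.7, 1.4), (4.5, 1.5), (6.0, 2.5), (5.1, 1.9)]
train_labels = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]

print(knn_predict(train_points, train_labels, (1.5, 0.3), k=3))  # setosa
print(knn_predict(train_points, train_labels, (5.8, 2.2), k=3))  # virginica
```

When the vote is tied, this sketch returns whichever tied label belongs to the nearer neighbour; scikit-learn's tie handling may differ.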

In [28]:
# Fit the classifier to the training data
knn.fit(X_train, y_train)

In [29]:
# Predict the classes for the test set
y_pred = knn.predict(X_test)

In [35]:
# Print correct predictions
print("Correct Predictions:")
for i in range(len(y_pred)):
    if y_pred[i] == y_test.iloc[i]:
        print(f"Predicted: {iris.target_names[y_pred[i]]}, Actual: {iris.target_names[y_test.iloc[i]]}, Features: {X_test.iloc[i].values}")


Correct Predictions:
Predicted: versicolor, Actual: versicolor, Features: [6.1 2.8 4.7 1.2]
Predicted: setosa, Actual: setosa, Features: [5.7 3.8 1.7 0.3]
Predicted: virginica, Actual: virginica, Features: [7.7 2.6 6.9 2.3]
Predicted: versicolor, Actual: versicolor, Features: [6.  2.9 4.5 1.5]
Predicted: versicolor, Actual: versicolor, Features: [6.8 2.8 4.8 1.4]
Predicted: setosa, Actual: setosa, Features: [5.4 3.4 1.5 0.4]
Predicted: versicolor, Actual: versicolor, Features: [5.6 2.9 3.6 1.3]
Predicted: virginica, Actual: virginica, Features: [6.9 3.1 5.1 2.3]
Predicted: versicolor, Actual: versicolor, Features: [6.2 2.2 4.5 1.5]
Predicted: versicolor, Actual: versicolor, Features: [5.8 2.7 3.9 1.2]
Predicted: virginica, Actual: virginica, Features: [6.5 3.2 5.1 2. ]
Predicted: setosa, Actual: setosa, Features: [4.8 3.  1.4 0.1]
Predicted: setosa, Actual: setosa, Features: [5.5 3.5 1.3 0.2]
Predicted: setosa, Actual: setosa, Features: [4.9 3.1 1.5 0.1]
Predicted: setosa, Actual: seto

In [36]:
# Print wrong predictions
print("\nWrong Predictions:")
for i in range(len(y_pred)):
    if y_pred[i] != y_test.iloc[i]:
        print(f"Predicted: {iris.target_names[y_pred[i]]}, Actual: {iris.target_names[y_test.iloc[i]]}, Features: {X_test.iloc[i].values}")



Wrong Predictions:
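
Since the list of wrong predictions is empty for this split, a confusion matrix is a compact way to confirm that and to see per-class counts. The sketch below rebuilds the same split and model so that it runs on its own; using `confusion_matrix` here is an addition for illustration, not part of the original exercise.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Rebuild the same split and model as above so the block is self-contained
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
print(iris.target_names)
print(cm)

# Off-diagonal entries count the wrong predictions
n_wrong = cm.sum() - cm.trace()
print("Wrong predictions:", n_wrong)
```

A nonzero off-diagonal entry at row `i`, column `j` would mean that samples of class `i` were predicted as class `j`.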


In [37]:
# Print overall accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"\nAccuracy: {accuracy:.2f}")


Accuracy: 1.00
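
An accuracy of 1.00 here is measured on a single 30-sample test split, so it says little about how sensitive the result is to the choice of `k` or of split. One hedged way to probe this is 5-fold cross-validation over a few values of `k`. The candidate values and the fold count below are arbitrary choices for illustration, not part of the original exercise.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Candidate neighbour counts (arbitrary illustrative choices)
k_values = [1, 3, 5, 7, 9]
mean_scores = {}

for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k)
    # 5-fold cross-validation returns one accuracy score per fold
    scores = cross_val_score(knn, X, y, cv=5)
    mean_scores[k] = scores.mean()
    print(f"k={k}: mean accuracy = {mean_scores[k]:.3f}")
```

Comparing the mean scores across `k` gives a steadier picture than one train/test split, because every sample is used for testing exactly once per value of `k`.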
