#Iris Flowers Classification ML Project

#About the Dataset


The data set consists of 50 samples from each of three species of Iris flowers (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters. The Iris flower data set, also known as Fisher's Iris data set, is a multivariate data set introduced by Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems" as an example of linear discriminant analysis. Based on the combination of these four features, Fisher developed a linear discriminant model to distinguish the species from each other.

#Importing Dependencies

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

#Exploring the dataset

In [None]:
iris = pd.read_csv("/content/iris_data.csv")

In [None]:
print(iris.head())

   SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
0            5.1           3.5            1.4           0.2  Iris-setosa
1            4.9           3.0            1.4           0.2  Iris-setosa
2            4.7           3.2            1.3           0.2  Iris-setosa
3            4.6           3.1            1.5           0.2  Iris-setosa
4            5.0           3.6            1.4           0.2  Iris-setosa


In [None]:
iris.describe()

       SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
count     150.000000    150.000000     150.000000    150.000000
mean        5.843333      3.054000       3.758667      1.198667
std         0.828066      0.433594       1.764420      0.763161
min         4.300000      2.000000       1.000000      0.100000
25%         5.100000      2.800000       1.600000      0.300000
50%         5.800000      3.000000       4.350000      1.300000
75%         6.400000      3.300000       5.100000      1.800000
max         7.900000      4.400000       6.900000      2.500000


In [None]:
iris["Species"].value_counts()

Iris-setosa        50
Iris-versicolor    50
Iris-virginica     50
Name: Species, dtype: int64

In [None]:
print("Target Labels", iris["Species"].unique())

Target Labels ['Iris-setosa' 'Iris-versicolor' 'Iris-virginica']


#Plotting the dataset
Plotting a dataset is a great way to explore its distribution.

In [None]:
import plotly.express as px
fig = px.scatter(iris, x="SepalWidthCm", y="SepalLengthCm", color="Species")
fig.show()

In [None]:
x = iris.drop("Species", axis=1)
y = iris["Species"]

#Performing Classification

Look at the scatter plot above, which shows the sepal measurements of the three species. Iris-setosa forms a cluster of its own, while Iris-versicolor and Iris-virginica overlap. The petal measurements separate the species even more clearly: Iris-virginica has larger petals than Iris-versicolor and Iris-setosa. Such patterns are easy for a human to spot, but a computer cannot read them off a plot; it needs an algorithm. To automate the task, we need a model that assigns each iris flower to its corresponding class based on its four measurements.

#Splitting the dataset into training and test sets with sklearn's train_test_split


In [None]:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
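
To make the effect of `test_size=0.2` concrete, here is a minimal standard-library sketch of a shuffled hold-out split. It only mimics the idea behind `train_test_split`; it is not scikit-learn's implementation and will not select the same rows that `random_state=0` selects there. The helper name `simple_train_test_split` is hypothetical. With 150 rows, 30 are held out for testing and 120 remain for training.

```python
import random


def simple_train_test_split(n_rows, test_size=0.2, seed=0):
    """Return (train_indices, test_indices) for a shuffled hold-out split.

    Illustrative only: this mimics the idea behind train_test_split,
    not scikit-learn's exact shuffling.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_test = round(n_rows * test_size)    # 150 * 0.2 -> 30
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = simple_train_test_split(150, test_size=0.2, seed=0)
print(len(train_idx), len(test_idx))  # prints: 120 30
```

Every row lands in exactly one of the two sets, so the model is later evaluated only on flowers it has not seen during fitting.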

In [None]:
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(x_train, y_train)
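
With `n_neighbors=1`, the classifier labels a new flower with the species of the single closest training sample. The sketch below illustrates that idea in plain Python using Euclidean distance; it is an illustration, not scikit-learn's implementation. The reference rows are a tiny hand-picked subset: the three setosa rows appear in the `iris.head()` output above, while the versicolor and virginica rows are assumed to match rows of the classic Fisher data set. The names `REFERENCE` and `nearest_neighbor_predict` are hypothetical.

```python
import math
from collections import Counter

# (sepal length, sepal width, petal length, petal width) in cm, plus label.
REFERENCE = [
    ((5.1, 3.5, 1.4, 0.2), "Iris-setosa"),
    ((4.9, 3.0, 1.4, 0.2), "Iris-setosa"),
    ((4.7, 3.2, 1.3, 0.2), "Iris-setosa"),
    # Assumed rows from the classic data set (not shown in this notebook).
    ((7.0, 3.2, 4.7, 1.4), "Iris-versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "Iris-versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "Iris-virginica"),
    ((7.1, 3.0, 5.9, 2.1), "Iris-virginica"),
]


def nearest_neighbor_predict(sample, reference=REFERENCE, k=1):
    """Return the majority label among the k closest reference rows."""
    by_distance = sorted(reference, key=lambda row: math.dist(sample, row[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(nearest_neighbor_predict((5.0, 2.9, 1.0, 0.2)))  # prints: Iris-setosa
```

Raising `k` turns the rule into a majority vote among the `k` closest rows, which is the role `n_neighbors` plays in `KNeighborsClassifier`. Because the reference set here is so small, the sketch should not be expected to reproduce the fitted model's predictions for every sample.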

In [None]:
model_prediction = knn.predict(x_test)
model_prediction

array(['Iris-virginica', 'Iris-versicolor', 'Iris-setosa',
       'Iris-virginica', 'Iris-setosa', 'Iris-virginica', 'Iris-setosa',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-virginica', 'Iris-versicolor', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-setosa',
       'Iris-versicolor', 'Iris-versicolor', 'Iris-setosa', 'Iris-setosa',
       'Iris-virginica', 'Iris-versicolor', 'Iris-setosa', 'Iris-setosa',
       'Iris-virginica', 'Iris-setosa', 'Iris-setosa', 'Iris-versicolor',
       'Iris-versicolor', 'Iris-setosa'], dtype=object)

#Performance Measure

In [None]:
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, model_prediction) * 100
print("Accuracy of the model is {:.2f}".format(accuracy))

Accuracy of the model is 100.00
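
`accuracy_score` returns the fraction of predictions that match the true labels, and multiplying by 100 turns it into a percentage. The sketch below computes the same quantity by hand. The helper name `accuracy_percent` and the example labels are hypothetical, chosen only to illustrate the arithmetic.

```python
def accuracy_percent(y_true, y_pred):
    """Percentage of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true) * 100


# Hypothetical labels: 4 test samples, 3 of them predicted correctly.
y_true = ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa"]
y_pred = ["Iris-setosa", "Iris-versicolor", "Iris-versicolor", "Iris-setosa"]
print("Accuracy of the model is {:.2f}".format(accuracy_percent(y_true, y_pred)))
# prints: Accuracy of the model is 75.00
```

In the notebook's case, all 30 test predictions match their labels, which gives 30 / 30 * 100 = 100.00.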


# Sample Predictions

In [None]:
# Wrap the sample in a DataFrame with the training column names so it matches the fitted feature names
x_new = pd.DataFrame([[5, 2.9, 1.0, 0.2]], columns=x.columns)
prediction = knn.predict(x_new)
print("Prediction: {}".format(prediction))

Prediction: ['Iris-setosa']



In [None]:
# Wrap the sample in a DataFrame with the training column names so it matches the fitted feature names
x_new = pd.DataFrame([[5.5, 2.3, 4.0, 1.3]], columns=x.columns)
prediction = knn.predict(x_new)
print("Prediction: {}".format(prediction))

Prediction: ['Iris-versicolor']



In [None]:
# Wrap the sample in a DataFrame with the training column names so it matches the fitted feature names
x_new = pd.DataFrame([[6.5, 3.2, 5.1, 2.0]], columns=x.columns)
prediction = knn.predict(x_new)
print("Prediction: {}".format(prediction))

Prediction: ['Iris-virginica']

