CLASSIFY IRIS FLOWERS

Build a classification model on the Iris dataset that assigns each flower to one of three species (Setosa, Versicolor, Virginica) based on its sepal and petal dimensions. Preprocess the dataset by splitting it into training and testing sets. Select a suitable classification algorithm such as Logistic Regression, Decision Trees, or SVM and train it on the training data. Evaluate the model on the test set using accuracy, precision, recall, and F1-score to determine how effectively it classifies iris species.

The following steps build and evaluate such a model, using Logistic Regression as the classifier:

Step 1: Import necessary libraries and load the Iris dataset

In [1]:
# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Load the Iris dataset
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
data = pd.read_csv(url, names=names)
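
If the UCI URL is unreachable, the copy of the dataset bundled with scikit-learn can stand in. The sketch below is an assumption, not part of the original notebook: it rebuilds a DataFrame with the same column names and "Iris-" class labels used above. Note that scikit-learn's copy corrects two data points relative to the UCI file, so results may differ slightly.

```python
from sklearn.datasets import load_iris

# Load the bundled Iris data as pandas objects (no network access needed)
iris = load_iris(as_frame=True)

# Rename the columns to match the names used with the UCI file
data = iris.data.copy()
data.columns = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width']

# Map the integer targets (0, 1, 2) to UCI-style labels such as "Iris-setosa"
data['class'] = ["Iris-" + iris.target_names[t] for t in iris.target]

print(data.shape)
print(data['class'].value_counts())
```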

Step 2: Preprocess the dataset by splitting it into training and testing sets

In [2]:
# Split the dataset into features (X) and target (y)
X = data.drop('class', axis=1)
y = data['class']

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
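
A plain random split does not guarantee equal class counts in the test set. One possible variation, sketched here on scikit-learn's bundled copy of the data (an assumption, used to keep the example self-contained), passes `stratify=y` so that each species keeps its share in both subsets.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Bundled copy of the data: 150 samples, 50 per species
X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions identical in train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Count how many samples of each class landed in the test set
test_counts = Counter(y_test.tolist())
print(test_counts)
```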

Step 3: Standardize the features using StandardScaler

In [3]:
# Standardize the features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
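
To make the scaling step concrete, here is a minimal pure-Python sketch of what standardization does to one feature: subtract the training mean and divide by the training (population) standard deviation. The sample values and helper names are hypothetical. The key point is that the test data reuses the statistics learned from the training data.

```python
from statistics import fmean, pstdev

def fit_standardizer(column):
    """Learn the mean and population standard deviation of one feature."""
    return fmean(column), pstdev(column)

def standardize(column, mean, std):
    """Apply z = (x - mean) / std using previously learned statistics."""
    return [(value - mean) / std for value in column]

# Hypothetical petal lengths (cm), not taken from the dataset
train_petal_length = [1.4, 4.7, 6.0, 5.1]
test_petal_length = [1.3, 4.5]

# Fit on the training values only, then transform both sets
mean, std = fit_standardizer(train_petal_length)
train_scaled = standardize(train_petal_length, mean, std)
test_scaled = standardize(test_petal_length, mean, std)

print(train_scaled)
print(test_scaled)
```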

Step 4: Select a suitable classification algorithm (Logistic Regression) and train the model

In [4]:
# Train a Logistic Regression model
log_reg = LogisticRegression(max_iter=1000)
log_reg.fit(X_train_scaled, y_train)
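
A single train/test split gives only one estimate of performance. As an optional check that is not part of the original notebook, the sketch below wraps the scaler and the classifier in a pipeline and scores it with 5-fold cross-validation, so the scaler is refit on each training fold. It assumes scikit-learn's bundled copy of the data.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# The pipeline scales and then classifies; scaling is fit inside each fold
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# One accuracy value per fold
scores = cross_val_score(model, X, y, cv=5)
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
```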

Step 5: Evaluate the model on the test set using accuracy, precision, recall, and F1-score

In [5]:
# Make predictions on the test set
y_pred = log_reg.predict(X_test_scaled)

# Evaluate the model's performance
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

print("Classification Report:")
print(classification_report(y_test, y_pred))

print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))

Accuracy: 1.0
Classification Report:
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00        10
Iris-versicolor       1.00      1.00      1.00         9
 Iris-virginica       1.00      1.00      1.00        11

       accuracy                           1.00        30
      macro avg       1.00      1.00      1.00        30
   weighted avg       1.00      1.00      1.00        30

Confusion Matrix:
[[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]


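The output shows that the Logistic Regression model achieves an accuracy of 1.0 on the 30-sample test set, with precision, recall, and F1-score of 1.00 for every species. The confusion matrix is purely diagonal: all 10 Setosa, 9 Versicolor, and 11 Virginica test samples are classified correctly, with no misclassifications. Because the test set is small, a perfect score here does not guarantee the same accuracy on other data.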
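
The per-class metrics in the report can be derived directly from a confusion matrix whose rows are true classes and whose columns are predicted classes. The helper below is an illustrative sketch (the function name and the second matrix are hypothetical), applied first to the matrix printed above and then to a made-up matrix with two errors.

```python
def per_class_metrics(matrix, labels):
    """Return {label: (precision, recall, f1)} from a confusion matrix
    with true classes as rows and predicted classes as columns."""
    metrics = {}
    for i, label in enumerate(labels):
        true_positives = matrix[i][i]
        predicted_total = sum(row[i] for row in matrix)  # column sum
        actual_total = sum(matrix[i])                    # row sum
        precision = true_positives / predicted_total if predicted_total else 0.0
        recall = true_positives / actual_total if actual_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics

labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# Confusion matrix reported in the output above
reported = [[10, 0, 0], [0, 9, 0], [0, 0, 11]]

# Hypothetical matrix: one versicolor and one virginica sample swapped
hypothetical = [[10, 0, 0], [0, 8, 1], [0, 1, 10]]

reported_metrics = per_class_metrics(reported, labels)
hypothetical_metrics = per_class_metrics(hypothetical, labels)
print(reported_metrics)
print(hypothetical_metrics)
```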

This model can be used to classify new, unseen iris flowers into one of the three species based on their sepal and petal dimensions, provided the new measurements are first transformed with the same fitted scaler.
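
As a rough sketch of classifying a new flower, the example below bundles scaling and Logistic Regression into one pipeline so that new measurements are scaled automatically before prediction. It makes several assumptions: it uses scikit-learn's bundled copy of the data, trains on all 150 samples rather than the 80% split above, and the `new_flower` measurements are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# Scaling and classification in one object, fit on the full dataset
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(iris.data, iris.target)

# Hypothetical measurements in cm:
# sepal length, sepal width, petal length, petal width
new_flower = [[5.1, 3.5, 1.4, 0.2]]

# predict returns the class index; target_names maps it to a species name
predicted_index = model.predict(new_flower)[0]
predicted_name = iris.target_names[predicted_index]

# predict_proba returns one probability per species
probabilities = model.predict_proba(new_flower)[0]

print("Predicted species:", predicted_name)
print("Class probabilities:", probabilities)
```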