## **Task:**
The iris flower has three species: setosa, versicolor, and virginica, which differ in their measurements. Assume you have the measurements of iris flowers labeled by species; your task is to train a machine learning model that learns from these measurements and classifies the species. Although the Scikit-learn library provides a dataset for iris flower classification, you can also download the same dataset from here for the task of iris flower classification with machine learning.
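
As a minimal sketch of the Scikit-learn route mentioned above, the bundled copy of the dataset can be loaded with `load_iris`. Note that its column names (for example `sepal length (cm)`) and species names (`setosa` rather than `Iris-setosa`) differ from those in `Iris.csv`; the variable names `iris` and `iris_df` are illustrative only.

```python
from sklearn.datasets import load_iris

# Load the bundled iris data as a pandas DataFrame (4 features + integer target)
iris = load_iris(as_frame=True)
iris_df = iris.frame

# Map the integer target back to species names for readability
iris_df["species"] = iris_df["target"].map(dict(enumerate(iris.target_names)))

print(iris_df.shape)
print(list(iris.target_names))
```

The rest of this notebook uses the CSV file instead, so the column names below follow `Iris.csv`.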

# **Step 1: Loading the Dataset**

In [1]:
from google.colab import files
import pandas as pd

# Upload files
uploaded = files.upload()


Saving Iris.csv to Iris.csv


In [4]:
df = pd.read_csv("Iris.csv")

# Display the first few rows
print(df.head())

   Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
0   1            5.1           3.5            1.4           0.2  Iris-setosa
1   2            4.9           3.0            1.4           0.2  Iris-setosa
2   3            4.7           3.2            1.3           0.2  Iris-setosa
3   4            4.6           3.1            1.5           0.2  Iris-setosa
4   5            5.0           3.6            1.4           0.2  Iris-setosa


# **Step 2: Exploring and Preprocessing the Data**

* Checking for missing values and handling them if any exist.
* Inspecting the dataset columns and types.

In [5]:
# Checking for missing values
print(df.isnull().sum())


Id               0
SepalLengthCm    0
SepalWidthCm     0
PetalLengthCm    0
PetalWidthCm     0
Species          0
dtype: int64
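
The counts above are all zero, so nothing needs handling in this dataset. If some values had been missing, one possible approach (an assumption, not something this notebook requires) is to fill numeric gaps with the column median and drop rows without a label. The toy frame below uses hypothetical values to illustrate:

```python
import numpy as np
import pandas as pd

# Toy frame with one missing measurement (hypothetical values, not from Iris.csv)
toy = pd.DataFrame({
    "SepalLengthCm": [5.1, np.nan, 4.7],
    "SepalWidthCm": [3.5, 3.0, 3.2],
    "Species": ["Iris-setosa", "Iris-setosa", "Iris-setosa"],
})

# Count missing values per column
missing = toy.isnull().sum()

# Fill numeric gaps with the column median; drop rows with a missing label
numeric_cols = toy.select_dtypes(include="number").columns
toy[numeric_cols] = toy[numeric_cols].fillna(toy[numeric_cols].median())
toy = toy.dropna(subset=["Species"])

print(missing)
print(toy)
```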


In [6]:
# Data overview
print(df.info())
print(df.describe())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Id             150 non-null    int64  
 1   SepalLengthCm  150 non-null    float64
 2   SepalWidthCm   150 non-null    float64
 3   PetalLengthCm  150 non-null    float64
 4   PetalWidthCm   150 non-null    float64
 5   Species        150 non-null    object 
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB
None
               Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
count  150.000000     150.000000    150.000000     150.000000    150.000000
mean    75.500000       5.843333      3.054000       3.758667      1.198667
std     43.445368       0.828066      0.433594       1.764420      0.763161
min      1.000000       4.300000      2.000000       1.000000      0.100000
25%     38.250000       5.100000      2.800000       1.600000      0.300000
50%     75.500000    

In [7]:
# Inspecting unique classes in the target column (species)
print(df['Species'].unique())

['Iris-setosa' 'Iris-versicolor' 'Iris-virginica']


# **Step 3: Preparing the Data**

## **1. Separating features and target labels:**
* Features: Sepal and petal measurements.
* Target: Species.

## **2. Encoding the target labels**

In [28]:
from sklearn.preprocessing import LabelEncoder

# Separating features and target
X = df.iloc[:, 1:-1]  # Skip the Id column (first) and the Species column (last)
y = df.iloc[:, -1].values  # Species is the last column


In [31]:
# Encoding the species labels
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(y)
print("Encoded labels:", y)

Encoded labels: [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]
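
To see which integer stands for which species, the fitted encoder's `classes_` attribute can be inspected. The sketch below fits a separate encoder (named `encoder` here purely for illustration) on the three class names so that it runs on its own:

```python
from sklearn.preprocessing import LabelEncoder

# The three class names as they appear in the Species column
species_names = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

encoder = LabelEncoder()
encoder.fit(species_names)

# classes_ is sorted; the index of each name is its integer code
mapping = {name: int(code) for code, name in enumerate(encoder.classes_)}
print(mapping)

# inverse_transform recovers the names from integer codes
print(encoder.inverse_transform([2, 0]))
```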


# **Step 4: Splitting the Dataset**
Splitting the data into training (80%) and testing (20%) sets so the model can be evaluated on data it has not seen.

In [32]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print("Training set size:", X_train.shape)
print("Testing set size:", X_test.shape)

Training set size: (120, 4)
Testing set size: (30, 4)
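
A plain random split does not guarantee that each species is equally represented in the test set. As an optional variation (not what the cell above does), passing `stratify` keeps the class proportions the same in both subsets. The sketch uses Scikit-learn's bundled copy of the data only to stay self-contained; the `_demo` names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Bundled copy of the data, used here only to keep the sketch self-contained
X_demo, y_demo = load_iris(return_X_y=True)

# stratify=y_demo keeps the class proportions the same in both subsets
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print("Test class counts:", np.bincount(y_te))
```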


# **Step 5: Training a Machine Learning Model**
## Using a classification algorithm: Random Forest Classifier

In [33]:
from sklearn.ensemble import RandomForestClassifier

# Initializing the classifier
classifier = RandomForestClassifier(random_state=42)

# Training the classifier
classifier.fit(X_train, y_train)
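
Once a random forest is fitted, its `feature_importances_` attribute gives a rough indication of which measurements the trees relied on. The following is a sketch on Scikit-learn's bundled copy of the data (so the feature names differ from the CSV columns); `forest` and `data` are illustrative names.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
forest = RandomForestClassifier(random_state=42)
forest.fit(data.data, data.target)

# feature_importances_ holds one impurity-based score per feature; they sum to 1
for name, score in zip(data.feature_names, forest.feature_importances_):
    print(f"{name}: {score:.3f}")
```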

# **Step 6: Evaluating the Model**
## 1. Predicting on the test set.
## 2. Calculating metrics such as accuracy, precision, recall, and the confusion matrix.

In [34]:
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Make predictions
y_pred = classifier.predict(X_test)

# Evaluate
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))

Accuracy: 1.0
Classification Report:
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30

Confusion Matrix:
 [[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]
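
A perfect score on a 30-sample test set is encouraging but rests on a single split. One way to get a steadier estimate, sketched here as an optional extra rather than part of the original workflow, is k-fold cross-validation with `cross_val_score`. The bundled copy of the data is used so the block runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X_all, y_all = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=42)

# 5-fold cross-validation: each sample is used for testing exactly once
scores = cross_val_score(model, X_all, y_all, cv=5)
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
```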


# **Step 7: Saving the Model**


In [35]:
import joblib

# Save the model
joblib.dump(classifier, 'iris_classifier.pkl')

# To load the model later:
# classifier = joblib.load('iris_classifier.pkl')

['iris_classifier.pkl']
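
The saved classifier predicts integer codes, so the label encoder is also needed to turn predictions back into species names after loading. One option, shown as a sketch with a hypothetical file name `iris_bundle.pkl` and a tiny made-up training set, is to store both objects together in a single file:

```python
import os
import tempfile

import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder

# Tiny hypothetical training set, only to have fitted objects to save
X_small = [[5.1, 3.5, 1.4, 0.2], [7.0, 3.2, 4.7, 1.4], [6.3, 3.3, 6.0, 2.5]]
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

encoder = LabelEncoder()
y_small = encoder.fit_transform(labels)
model = RandomForestClassifier(n_estimators=10, random_state=42).fit(X_small, y_small)

# Save both objects in one file so predictions can be decoded after loading
path = os.path.join(tempfile.mkdtemp(), "iris_bundle.pkl")
joblib.dump({"model": model, "label_encoder": encoder}, path)

# Load the bundle and decode predictions back to species names
bundle = joblib.load(path)
restored = bundle["label_encoder"].inverse_transform(bundle["model"].predict(X_small))
print(restored)
```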

# **Step 8: Testing the Model**


In [37]:
import pandas as pd

# Defining the column names used during training
feature_names = ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']

# Creating a DataFrame for the sample
sample = [[5.1, 3.5, 1.4, 0.2]]  # Input with 4 features
sample_df = pd.DataFrame(sample, columns=feature_names)

# Predicting using the model
species = classifier.predict(sample_df)
print("Predicted Species:", label_encoder.inverse_transform(species))

Predicted Species: ['Iris-setosa']
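
Besides the predicted label, `predict_proba` reports how the forest's votes are distributed across the classes for a sample. The sketch below retrains a model on Scikit-learn's bundled copy of the data, with columns renamed to match the CSV, so that it is self-contained; `clf` and `X_named` are illustrative names.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

feature_names = ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']

# Train on the bundled copy of the data, renamed to match the CSV column names
data = load_iris()
X_named = pd.DataFrame(data.data, columns=feature_names)
clf = RandomForestClassifier(random_state=42).fit(X_named, data.target)

sample_df = pd.DataFrame([[5.1, 3.5, 1.4, 0.2]], columns=feature_names)

# predict_proba returns one row per sample and one column per class
proba = clf.predict_proba(sample_df)
for name, p in zip(data.target_names, proba[0]):
    print(f"{name}: {p:.2f}")
```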
