# 🚀 Flower Category Analysis

## 📝 Introduction & Objectives

<details>
<summary><strong>🔍 Iris Classification Model Overview</strong></summary>

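An **iris classification model** is trained on the open-source Iris dataset bundled with scikit-learn. It predicts one of three iris species (setosa, versicolor, virginica) from four measured features: sepal length, sepal width, petal length, and petal width.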

</details>

<details>
<summary><strong>🎯 Benefits of Using the Iris Dataset</strong></summary>

- 📂 **Demonstrates basic scikit-learn workflow**  
  - _e.g., simple training and prediction process_  
- 📊 **Shows essential data preprocessing steps**  
  - _e.g., handling numerical features before model training_  
- 🌸 **Introduces multi-class classification concepts**  
  - _e.g., distinguishing between three iris species_  
- 🗂 **Uses a small, manageable dataset suitable for beginners**  
  - _e.g., only 150 samples for quick experimentation_  

</details>

## 🛠️ Data Loading and Initial Exploration

In [1]:
import warnings
warnings.filterwarnings("ignore")

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Load the Iris dataset bundled with scikit-learn
from sklearn.datasets import load_iris

iris = load_iris()

# Build a DataFrame from the four features, then append the integer class label
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['target'] = iris.target

df.head()

| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | target |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | 0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | 0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | 0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | 0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | 0 |


In [2]:
df.shape

(150, 5)
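Before splitting, it can help to confirm that the three classes are balanced and that no values are missing. The following is a minimal sketch of such a check. The names `df_check`, `class_counts`, and `missing_total` are illustrative and do not appear in the original notebook.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()

# Rebuild the feature table so this sketch is self-contained
df_check = pd.DataFrame(iris.data, columns=iris.feature_names)
df_check["target"] = iris.target

# Map the integer labels (0, 1, 2) to readable species names
label_to_name = {i: str(name) for i, name in enumerate(iris.target_names)}
df_check["species"] = df_check["target"].map(label_to_name)

# Count samples per species and count missing cells across the whole table
class_counts = df_check["species"].value_counts().sort_index()
missing_total = int(df_check.isna().sum().sum())

print(class_counts)
print("Missing values:", missing_total)
```

A balanced dataset like this one is part of why plain accuracy is a reasonable first metric in the cells that follow.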

## ⚙️ Data Preprocessing & Feature Engineering

In [3]:
# Import necessary libraries for splitting the dataset
from sklearn.model_selection import train_test_split

# Split features & target
X = df.drop(columns=['target'])
y = df['target']

# Train-test split (75% / 25%), stratified so each species keeps its share in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

print("Train shape:", X_train.shape)

Train shape: (112, 4)
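The `stratify=y` argument is meant to keep the class proportions roughly equal in the training and test sets. A small sketch of how that could be checked is shown below. It reloads the data as NumPy arrays, and the `*_demo` variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X_demo, y_demo = load_iris(return_X_y=True)

# Same split settings as the cell above: 25% test, fixed seed, stratified by label
X_tr_demo, X_te_demo, y_tr_demo, y_te_demo = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42, stratify=y_demo
)

# Count how many samples of each class (0, 1, 2) landed in each partition
train_counts = np.bincount(y_tr_demo)
test_counts = np.bincount(y_te_demo)

print("Train class counts:", train_counts)
print("Test class counts: ", test_counts)
```

With 150 samples and `test_size=0.25`, scikit-learn rounds the test set up to 38 samples, which leaves the 112 training rows reported above.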


## 🧠 Model Training

In [4]:
# Import the classifiers used in this notebook
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

In [5]:
# Create a Logistic Regression model instance
lr = LogisticRegression()

# Train the model using the training data
lr.fit(X_train, y_train)

# Display the accuracy
print("The accuracy of the Logistic Regression is ", lr.score(X_test, y_test))

The accuracy of the Logistic Regression is  0.9473684210526315
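A single accuracy figure does not show which species are being confused. One possible way to look closer is a confusion matrix and a per-class report, sketched below. The sketch refits the model on NumPy arrays with the same split settings. `max_iter=1000` is an assumption added here to avoid convergence warnings and is not part of the original cell, and the `*_eval` names are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

iris = load_iris()
X_eval, y_eval = iris.data, iris.target

# Same split settings as the notebook's first split
X_tr_eval, X_te_eval, y_tr_eval, y_te_eval = train_test_split(
    X_eval, y_eval, test_size=0.25, random_state=42, stratify=y_eval
)

# Assumption: a higher max_iter than the default, only to avoid convergence warnings
lr_eval = LogisticRegression(max_iter=1000)
lr_eval.fit(X_tr_eval, y_tr_eval)
y_pred_eval = lr_eval.predict(X_te_eval)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_te_eval, y_pred_eval)
print(cm)

# Precision, recall, and F1 for each species
print(classification_report(
    y_te_eval, y_pred_eval, target_names=list(iris.target_names)
))
```

Off-diagonal entries in the matrix indicate which pairs of species the model mixes up.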


In [6]:
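# Train an RBF-kernel SVM classifier to predict the flower species.
# Note: this cell re-splits the data with random_state=12, so its test set differs
# from the one used for Logistic Regression and the two accuracies are not directly comparable.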

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=12, stratify=y
)

clf = SVC(kernel="rbf", C=1.0, gamma="scale") 
clf.fit(X_tr, y_tr)
print("SVM accuracy:", clf.score(X_te, y_te))

SVM accuracy: 0.8947368421052632
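Because the two models above were scored on different random splits, and each test set holds only 38 samples, the gap between their accuracies may partly reflect the split rather than the model. One common way to get a steadier comparison is k-fold cross-validation on identical folds. The sketch below assumes 5 stratified folds and `max_iter=1000` for logistic regression. Both are choices made here for illustration, not settings from the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

X_cv, y_cv = load_iris(return_X_y=True)

# A fixed seed makes both models see exactly the same five folds
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM (RBF)": SVC(kernel="rbf", C=1.0, gamma="scale"),
}

# One accuracy score per fold for each model
cv_scores = {
    name: cross_val_score(model, X_cv, y_cv, cv=cv)
    for name, model in models.items()
}

for name, scores in cv_scores.items():
    print(f"{name}: mean={scores.mean():.3f}, std={scores.std():.3f}")
```

Reporting the mean together with the spread across folds gives a sense of how much of any difference is noise.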


## 📈 Model Evaluation

In [7]:
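# Data standardization
from sklearn.preprocessing import StandardScaler

# Fit the scaler on the training data only, then apply the same transform to the test data
scaler = StandardScaler()
X_tr_std = scaler.fit_transform(X_train)
X_te_std = scaler.transform(X_test)

# Confirm each training feature now has unit standard deviation and near-zero mean
print("after", X_tr_std.std(axis=0), X_tr_std.mean(axis=0))

# Train an SVM with default settings on the standardized features
# (named svm_std so it does not shadow the sklearn.svm module)
svm_std = SVC()
svm_std.fit(X_tr_std, y_train)
score = svm_std.score(X_te_std, y_test)

print("\nSVM:", score)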

after [1. 1. 1. 1.] [-2.25018417e-15 -9.68471335e-16 -4.89687656e-16 -1.03092138e-16]

SVM: 0.9473684210526315
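The scale-then-fit steps above can also be bundled into a single estimator with scikit-learn's `make_pipeline`, which refits the scaler on the training data whenever the pipeline is fitted. The sketch below is one way this might look. It reloads the data as NumPy arrays, uses illustrative `*_p` names, and compares the pipeline against the manual approach to show they agree.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_p, y_p = load_iris(return_X_y=True)
X_tr_p, X_te_p, y_tr_p, y_te_p = train_test_split(
    X_p, y_p, test_size=0.25, random_state=42, stratify=y_p
)

# Pipeline: the scaler is fitted on the training data only, then the SVM is trained
pipe = make_pipeline(StandardScaler(), SVC())
pipe.fit(X_tr_p, y_tr_p)
pipe_score = pipe.score(X_te_p, y_te_p)

# Manual equivalent, mirroring the cell above
scaler_p = StandardScaler()
manual_svm = SVC().fit(scaler_p.fit_transform(X_tr_p), y_tr_p)
manual_score = manual_svm.score(scaler_p.transform(X_te_p), y_te_p)

print("Pipeline accuracy:", pipe_score)
print("Manual accuracy:  ", manual_score)
```

Keeping the scaler inside the pipeline makes it harder to fit it on test data by accident, and the pipeline can be passed directly to utilities such as `cross_val_score`.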
