
# 🌸 Iris Flower Classification: Beginner Project (CodeAlpha)
This notebook walks you **step by step** through building a simple machine learning model that predicts the **species of an Iris flower** from four measurements: sepal length, sepal width, petal length, and petal width (all in cm).
- **Skills you learn:** Python, pandas, matplotlib, scikit-learn, model training & evaluation.
- **Dataset:** The classic Iris dataset bundled with scikit-learn (no download needed).


## 1. Setup: Import libraries

In [None]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import joblib

# Show floats with 3 decimal places in pandas output
pd.set_option('display.precision', 3)
print("Libraries imported ✅")


## 2. Load the Iris dataset

In [None]:

# Load iris as a pandas DataFrame
iris = load_iris(as_frame=True)
df = iris.frame.copy()

# Rename 'target' to 'species_id' and add a readable 'species' column
df = df.rename(columns={'target': 'species_id'})
id_to_name = {i: name for i, name in enumerate(iris.target_names)}
df['species'] = df['species_id'].map(id_to_name)

print("Shape:", df.shape)
df.head()


## 3. Quick data check (EDA)

In [None]:

print("Info:")
df.info()  # info() prints its report directly and returns None, so it is not wrapped in print()
print("\nClass balance:")
print(df['species'].value_counts())
print("\nSummary stats:")
df.describe()

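The summary statistics above pool all three species together. As a minimal sketch of one further check, the block below averages each measurement per species; it rebuilds the DataFrame so it runs on its own, and the name `species_means` is illustrative rather than part of the original notebook.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Rebuild the same DataFrame used in this notebook so the block is self-contained
iris = load_iris(as_frame=True)
df = iris.frame.rename(columns={'target': 'species_id'})
df['species'] = df['species_id'].map(dict(enumerate(iris.target_names)))

# Mean of each measurement within each species (species_means is an illustrative name)
species_means = df.groupby('species')[iris.feature_names].mean()
print(species_means.round(2))
```

If the petal columns differ much more between species than the sepal columns, that suggests the petal measurements will be the more useful features for classification.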

### 3.1 Feature distributions

In [None]:

# Histograms for each numeric feature
features = iris.feature_names
for col in features:
    plt.figure()
    df[col].hist(bins=15)
    plt.title(f"Distribution: {col}")
    plt.xlabel(col)
    plt.ylabel("Count")
    plt.show()


### 3.2 Simple scatter plot (two features)

In [None]:

# Scatter plot: petal length vs petal width, colored by species (default colors)
plt.figure()
for name, group in df.groupby('species'):
    plt.scatter(group['petal length (cm)'], group['petal width (cm)'], label=name)
plt.xlabel("Petal length (cm)")
plt.ylabel("Petal width (cm)")
plt.title("Petal length vs Petal width by species")
plt.legend()
plt.show()


## 4. Split into train and test sets

In [None]:

X = df[iris.feature_names]           # features
y = df['species']                    # target labels as names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print("Train size:", X_train.shape, " Test size:", X_test.shape)

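The `stratify=y` argument asks `train_test_split` to keep the class proportions the same in both subsets. A small sketch of how one might confirm that, repeating the split so the block runs independently (`train_counts` and `test_counts` are illustrative names):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Recreate features and name labels as in this notebook
iris = load_iris(as_frame=True)
X = iris.data
y = iris.target.map(dict(enumerate(iris.target_names)))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Count how many flowers of each species landed in each subset
train_counts = y_train.value_counts().sort_index()
test_counts = y_test.value_counts().sort_index()
print("Train:\n", train_counts)
print("Test:\n", test_counts)
```

With 50 flowers per species and a 20% test share, a stratified split places 40 of each species in the training set and 10 of each in the test set.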

## 5. Train a model (RandomForestClassifier)

In [None]:

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print("Model trained ✅")

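Once a random forest is fitted, its `feature_importances_` attribute gives a rough indication of which measurements the trees relied on. The sketch below refits the same model so it can run on its own; `importances` is an illustrative name, and the ranking should be read as a hint rather than a rigorous result.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Same data, split, and model settings as in this notebook
iris = load_iris(as_frame=True)
X = iris.data
y = iris.target.map(dict(enumerate(iris.target_names)))
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Impurity-based importances: one value per feature, summing to 1
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False).round(3))
```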

## 6. Evaluate the model

In [None]:

y_pred = model.predict(X_test)

acc = accuracy_score(y_test, y_pred)
print(f"Accuracy: {acc:.3f}\n")
print("Classification report:")
print(classification_report(y_test, y_pred))

# Confusion matrix plot with matplotlib (one chart)
labels = sorted(y_test.unique())  # class names in a fixed order, reused for the matrix and tick labels
cm = confusion_matrix(y_test, y_pred, labels=labels)
plt.figure()
plt.imshow(cm, interpolation='nearest')
plt.title("Confusion Matrix")
plt.xticks(ticks=range(len(labels)), labels=labels, rotation=45, ha='right')
plt.yticks(ticks=range(len(labels)), labels=labels)
for i in range(cm.shape[0]):
    for j in range(cm.shape[1]):
        plt.text(j, i, cm[i, j], ha="center", va="center")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.tight_layout()
plt.show()

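The test set here holds only 30 flowers, so a single accuracy number can shift noticeably with a different split. One optional way to get a steadier estimate is 5-fold cross-validation, sketched below with scikit-learn's `cross_val_score`. This goes slightly beyond the original notebook; the fold count and the names `cv` and `scores` are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

# 5 stratified folds: each fold keeps the three species balanced
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(
    RandomForestClassifier(n_estimators=200, random_state=42),
    X, y, cv=cv, scoring='accuracy'
)

print("Fold accuracies:", scores.round(3))
print(f"Mean accuracy: {scores.mean():.3f} (std {scores.std():.3f})")
```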

## 7. Save the trained model

In [None]:

joblib.dump(model, "iris_model.pkl")
print("Saved: iris_model.pkl ✅")

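A saved model is only useful if it can be loaded again. As a minimal sketch, the block below writes a model with `joblib.dump`, reads it back with `joblib.load`, and checks that both copies predict the same labels. It writes to a temporary directory (an assumption made here to avoid overwriting your real `iris_model.pkl`), and `loaded_model` is an illustrative name.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Train the same model as in this notebook
iris = load_iris(as_frame=True)
X = iris.data
y = iris.target.map(dict(enumerate(iris.target_names)))
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Save to a temporary folder, then load the file back into a new object
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "iris_model.pkl")
    joblib.dump(model, path)
    loaded_model = joblib.load(path)

# The reloaded model should reproduce the original predictions exactly
same_predictions = np.array_equal(model.predict(X_test), loaded_model.predict(X_test))
print("Reloaded model gives identical predictions:", same_predictions)
```

Pickled models are generally meant to be loaded with the same scikit-learn version that saved them, and only from sources you trust.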

## 8. Try predictions with your own numbers

In [None]:

def predict_species(sepal_length, sepal_width, petal_length, petal_width):
    sample = pd.DataFrame([{
        'sepal length (cm)': sepal_length,
        'sepal width (cm)': sepal_width,
        'petal length (cm)': petal_length,
        'petal width (cm)': petal_width
    }])
    return model.predict(sample)[0]

# Example:
print("Example prediction:", predict_species(5.1, 3.5, 1.4, 0.2))

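`predict` returns only the winning label. If you also want to see how confident the forest is, `predict_proba` returns one probability per class, ordered as in `model.classes_`. The helper below, `predict_species_proba`, is a hypothetical companion to `predict_species` and is not part of the original notebook; the model is retrained so the block runs on its own.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Same data, split, and model settings as in this notebook
iris = load_iris(as_frame=True)
X = iris.data
y = iris.target.map(dict(enumerate(iris.target_names)))
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

def predict_species_proba(sepal_length, sepal_width, petal_length, petal_width):
    """Return the predicted probability of each species for one flower."""
    sample = pd.DataFrame(
        [[sepal_length, sepal_width, petal_length, petal_width]],
        columns=X.columns,  # keep the column names the model was trained with
    )
    probs = model.predict_proba(sample)[0]
    return pd.Series(probs, index=model.classes_)

probs = predict_species_proba(5.1, 3.5, 1.4, 0.2)
print(probs.round(3))
print("Most likely species:", probs.idxmax())
```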


## 9. What to submit
- This notebook (`Iris_Classification.ipynb`) with outputs.
- `iris_model.pkl` (saved model).
- A short README and a LinkedIn post (tag **@CodeAlpha**) that briefly explain your approach and results.
