<a href="https://colab.research.google.com/github/Youssef-Hossam5/DL-AI46-SV/blob/main/IRIS_NN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Iris Flower Classification using Neural Network (PyTorch)

## Problem Description
The **Iris dataset** is a classic ML benchmark: the goal is to classify iris flowers into **three species** (Setosa, Versicolor, Virginica) based on **four features**: Sepal Length, Sepal Width, Petal Length, Petal Width.  
It is traditionally solved with **Logistic Regression, KNN, or SVM**; here we approach it from a Deep Learning perspective, using a simple feed-forward neural network.

## DL Perspective
In Deep Learning, we model the classification function as a neural network:
$y = f(x; \theta)$

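Where:

- $x$ is the vector of four input features and $y$ is the network output (one score per class)
- $f$ is a neural network with trainable parameters $\theta$, optimized via backpropagation and gradient descent

Even though Iris is small, it demonstrates the full DL perspective and pipeline: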
- Forward propagation  
- Loss computation with **CrossEntropyLoss**  
- Backpropagation and optimization  
- Train/validation/test pipeline  
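
The first three steps in the list above can be written out compactly. This is a sketch under standard assumptions rather than a statement of the exact optimizer used later: the network output $y$ is treated as the vector of logits, $K = 3$ is the number of classes, $c$ is the index of the true class, and $\eta$ is a learning rate. Plain gradient descent is shown for simplicity; adaptive optimizers rescale the same gradient.

$$
y = f(x; \theta), \qquad
p_k = \frac{e^{y_k}}{\sum_{j=1}^{K} e^{y_j}}, \qquad
\mathcal{L}(\theta) = -\log p_c, \qquad
\theta \leftarrow \theta - \eta \, \nabla_\theta \mathcal{L}(\theta)
$$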



## Architecture Explanation (Iris Dataset)

The network uses a **4 → 16 → 8 → 3** small feed-forward architecture, suitable for tiny datasets like Iris.

| Layer          | Why                                                                             |
| -------------- | ------------------------------------------------------------------------------- |
| `Linear(4,16)` | First hidden layer; each of 16 neurons sees all 4 features → learn combinations |
| `ReLU`         | Add non-linearity → network can model complex decision boundaries               |
| `Linear(16,8)` | Second hidden layer → compress learned features                                 |
| `ReLU`         | Non-linearity again                                                             |
| `Linear(8,3)`  | Output layer → one neuron per class, logits for CrossEntropyLoss                |

## Rule of Thumb for Small Datasets

- **First hidden layer**: wider than the input (here 16 > 4)  
- **Second hidden layer**: smaller (here 8 < 16) → compress features  
- **Output layer**: one neuron per class (3 classes)  

### Why not bigger or more layers?

- The Iris dataset is tiny (150 samples) → a network that is too large may **overfit**  
- Too few neurons → the network may **underfit** (it cannot learn the patterns)  

This architecture balances **capacity vs dataset size**, providing enough neurons to learn the patterns without overfitting.
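
To make "capacity" concrete, the number of trainable parameters of the 4 → 16 → 8 → 3 network can be counted by hand. The sketch below uses plain Python; `count_parameters` is a hypothetical helper introduced only for this illustration, and it assumes every layer is fully connected with a bias term.

```python
# Layer sizes of the 4 -> 16 -> 8 -> 3 network described above.
layer_sizes = [4, 16, 8, 3]

def count_parameters(sizes):
    """Count weights and biases of a fully connected network (hypothetical helper)."""
    total = 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix + bias vector
    return total

total_params = count_parameters(layer_sizes)
print(total_params)  # 243  (80 + 136 + 27)
```

A few hundred parameters is the same order of magnitude as the 150 samples, which is why making the layers much wider would quickly tip the balance towards overfitting.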

## Implementation


In [2]:
# ============================
# 1. Import Libraries
# ============================
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report
from torch.utils.data import DataLoader, TensorDataset
import numpy as np

In [3]:
# ============================
# 2. Load Dataset
# ============================
iris = load_iris()
X = iris.data           # Features (150,4)
y = iris.target         # Labels (150,)
print(X.shape)
print(y.shape)

(150, 4)
(150,)


In [4]:
# ============================
# 3. Preprocessing
# ============================

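# Standardization (zero mean, unit variance per feature) is important for NN stability.
# Caveat: fitting the scaler on the full dataset, as done here, lets validation/test
# statistics leak into training; a stricter pipeline fits it on the training split only.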
scaler = StandardScaler()
X = scaler.fit_transform(X)
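
For intuition, standardization subtracts each column's mean and divides by its standard deviation. The sketch below reproduces that arithmetic with NumPy on a made-up toy matrix (the values are hypothetical, not Iris data) and assumes the population standard deviation (`ddof=0`), which is what `StandardScaler` uses.

```python
import numpy as np

# Hypothetical toy matrix: 4 samples, 2 features on very different scales.
X_toy = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 30.0],
                  [4.0, 40.0]])

col_mean = X_toy.mean(axis=0)
col_std = X_toy.std(axis=0)          # population std (ddof=0)
X_toy_scaled = (X_toy - col_mean) / col_std

print(X_toy_scaled.mean(axis=0))     # approximately 0 for every column
print(X_toy_scaled.std(axis=0))      # approximately 1 for every column
```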

In [5]:
# ============================
# 4. Train / Validation / Test Split
# ============================

# First split train+val and test
X_train_val, X_test, y_train_val, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Split train and validation
X_train, X_val, y_train, y_val = train_test_split(
    X_train_val, y_train_val, test_size=0.2, random_state=42, stratify=y_train_val
)
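
The two-stage split yields 96 training, 24 validation, and 30 test samples. As an alternative to scaling before splitting, the sketch below splits the raw features first and fits the scaler on the training portion only. It is a variant, not the pipeline used in this notebook; the names with `_raw`, `_tr`, `_va`, `_te` suffixes are introduced here so they do not overwrite the notebook's variables.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X_raw, y_raw = load_iris(return_X_y=True)

# Same two-stage stratified split as above, applied to the unscaled features.
X_tv, X_te, y_tv, y_te = train_test_split(
    X_raw, y_raw, test_size=0.2, random_state=42, stratify=y_raw
)
X_tr, X_va, y_tr, y_va = train_test_split(
    X_tv, y_tv, test_size=0.2, random_state=42, stratify=y_tv
)

# Fit on the training split only, then reuse its statistics for the other splits.
scaler_tr = StandardScaler().fit(X_tr)
X_tr_s = scaler_tr.transform(X_tr)
X_va_s = scaler_tr.transform(X_va)
X_te_s = scaler_tr.transform(X_te)

print(X_tr_s.shape, X_va_s.shape, X_te_s.shape)  # (96, 4) (24, 4) (30, 4)
```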


In [6]:
# ============================
# 5. Convert to PyTorch Tensors
# ============================
# Explanation:
# X and y are NumPy arrays, but PyTorch layers expect tensors, because tensors:
#   - support automatic differentiation (autograd) and track gradients
#   - can be moved to a GPU
# Features are cast to float32 (the default dtype of nn.Linear weights);
# labels are cast to long (int64), since CrossEntropyLoss expects class indices.

X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)

X_val = torch.tensor(X_val, dtype=torch.float32)
y_val = torch.tensor(y_val, dtype=torch.long)

X_test = torch.tensor(X_test, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.long)
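
The explicit `dtype` arguments matter because NumPy defaults to 64-bit floats. The short sketch below, using a single illustrative row rather than the notebook's arrays, shows what `torch.tensor` does with and without the cast.

```python
import numpy as np
import torch

features_np = np.array([[5.1, 3.5, 1.4, 0.2]])  # illustrative row, float64 by default

as_is = torch.tensor(features_np)                         # keeps float64
as_f32 = torch.tensor(features_np, dtype=torch.float32)   # matches nn.Linear weights

print(as_is.dtype, as_f32.dtype)  # torch.float64 torch.float32
```

Passing a float64 tensor to a default `nn.Linear` layer raises a dtype mismatch error, which is why the cast is done once, up front.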

In [7]:
# ============================
# 6. Create DataLoaders
# ============================

train_dataset = TensorDataset(X_train, y_train)
val_dataset = TensorDataset(X_val, y_val)
test_dataset = TensorDataset(X_test, y_test)

train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=16)
test_loader = DataLoader(test_dataset, batch_size=16)
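
With `batch_size=16`, the loaders do not all produce the same number of batches, and the last batch may be smaller. The sketch below checks this on dummy tensors that only mimic the split sizes (96 and 24); the `dummy_*` names are hypothetical stand-ins.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy tensors with the same sizes as the training (96) and validation (24) splits.
dummy_train = TensorDataset(torch.zeros(96, 4), torch.zeros(96, dtype=torch.long))
dummy_val = TensorDataset(torch.zeros(24, 4), torch.zeros(24, dtype=torch.long))

train_batches = [xb.shape[0] for xb, _ in DataLoader(dummy_train, batch_size=16, shuffle=True)]
val_batches = [xb.shape[0] for xb, _ in DataLoader(dummy_val, batch_size=16)]

print(train_batches)  # [16, 16, 16, 16, 16, 16]
print(val_batches)    # [16, 8]
```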


In [8]:
# ============================
# 7. Define Neural Network Model
# ============================

class IrisNN(nn.Module):
    def __init__(self):
        super().__init__()

        self.model = nn.Sequential(
            nn.Linear(4, 16),      # First hidden layer (4 input features -> 16 units)
            nn.ReLU(),
            nn.Linear(16, 8),
            nn.ReLU(),
            nn.Linear(8, 3)        # Output layer (3 classes)
        )

    def forward(self, x):
        return self.model(x)

model = IrisNN()
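
A quick sanity check before training is to push a random batch through the network and confirm the output shape. The sketch below uses a stand-in `nn.Sequential` with the same layer layout (named `net` here so it does not overwrite `model`) and random inputs; the softmax is applied only for inspection, since the loss used later works on raw logits.

```python
import torch
import torch.nn as nn

# Stand-in with the same 4 -> 16 -> 8 -> 3 layout as the model above.
net = nn.Sequential(
    nn.Linear(4, 16), nn.ReLU(),
    nn.Linear(16, 8), nn.ReLU(),
    nn.Linear(8, 3),
)

batch = torch.randn(5, 4)              # 5 hypothetical samples, 4 features each
logits = net(batch)                    # raw, unnormalized class scores
probs = torch.softmax(logits, dim=1)   # probabilities, for inspection only

print(logits.shape)       # torch.Size([5, 3])
print(probs.sum(dim=1))   # each row sums to 1 (up to floating-point error)
```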

In [9]:
# ============================
# 8. Define Loss and Optimizer
# ============================

criterion = nn.CrossEntropyLoss()  # Multi-class, single-label classification; expects raw logits
optimizer = optim.Adam(model.parameters(), lr=0.01)
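
`CrossEntropyLoss` combines a log-softmax with a negative log-likelihood, which is why the model has no softmax layer of its own. The sketch below verifies that equivalence on two made-up logit rows (the numbers are hypothetical).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

demo_logits = torch.tensor([[2.0, 0.5, -1.0],
                            [0.1, 0.2, 3.0]])   # hypothetical scores for 2 samples
demo_targets = torch.tensor([0, 2])             # true class indices

loss_builtin = nn.CrossEntropyLoss()(demo_logits, demo_targets)

# Manual version: log-softmax, pick the true-class entry, negate, average.
log_probs = F.log_softmax(demo_logits, dim=1)
loss_manual = -log_probs[torch.arange(2), demo_targets].mean()

print(torch.allclose(loss_builtin, loss_manual))  # True
```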


In [10]:
# ============================
# 9. Training Loop
# ============================

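num_epochs = 100

# Note: train_loss and val_loss below are sums of per-batch losses. With 6 training
# batches and 2 validation batches per epoch, the two printed values are on different
# scales and should not be compared directly.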

for epoch in range(num_epochs):
    model.train()
    train_loss = 0

    for inputs, labels in train_loader:

        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, labels)

        loss.backward()       # Backpropagation
        optimizer.step()      # Update weights

        train_loss += loss.item()

    # Validation
    model.eval()
    val_loss = 0

    with torch.no_grad():
        for inputs, labels in val_loader:
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            val_loss += loss.item()

    if (epoch+1) % 10 == 0:
        print(f"Epoch [{epoch+1}/{num_epochs}] "
              f"Train Loss: {train_loss:.4f} "
              f"Val Loss: {val_loss:.4f}")



Epoch [10/100] Train Loss: 0.6925 Val Loss: 0.3717
Epoch [20/100] Train Loss: 0.4462 Val Loss: 0.4289
Epoch [30/100] Train Loss: 0.2905 Val Loss: 0.2693
Epoch [40/100] Train Loss: 0.3291 Val Loss: 0.3027
Epoch [50/100] Train Loss: 0.2608 Val Loss: 0.2319
Epoch [60/100] Train Loss: 0.2419 Val Loss: 0.1840
Epoch [70/100] Train Loss: 0.3056 Val Loss: 0.1235
Epoch [80/100] Train Loss: 0.2329 Val Loss: 0.1134
Epoch [90/100] Train Loss: 0.2516 Val Loss: 0.1080
Epoch [100/100] Train Loss: 0.2330 Val Loss: 0.1200
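
Because the printed losses are summed over a different number of batches for each split, a per-sample average is easier to compare. The sketch below shows one way to compute it; `mean_loss` is a hypothetical helper, and the usage runs on random stand-in data with a single linear layer rather than on the trained model.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def mean_loss(model, loader, criterion):
    """Average loss per sample over a DataLoader (hypothetical helper)."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for inputs, labels in loader:
            batch_loss = criterion(model(inputs), labels)
            total += batch_loss.item() * inputs.size(0)  # undo the per-batch mean
            count += inputs.size(0)
    return total / count

# Usage on random stand-in data (not the Iris tensors above).
torch.manual_seed(0)
demo_model = nn.Linear(4, 3)
demo_x, demo_y = torch.randn(24, 4), torch.randint(0, 3, (24,))
demo_loader = DataLoader(TensorDataset(demo_x, demo_y), batch_size=16)

per_sample = mean_loss(demo_model, demo_loader, nn.CrossEntropyLoss())
full_batch = nn.CrossEntropyLoss()(demo_model(demo_x), demo_y).item()
print(abs(per_sample - full_batch) < 1e-5)  # True: independent of batch layout
```

Weighting each batch by its size keeps the result correct even when the last batch is smaller, as it is for the 24-sample validation split.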


In [11]:
# ============================
# 10. Testing & Evaluation
# ============================

model.eval()
all_preds = []
all_labels = []

with torch.no_grad():
    for inputs, labels in test_loader:
        outputs = model(inputs)
        predicted = outputs.argmax(dim=1)  # index of the largest logit = predicted class

        all_preds.extend(predicted.numpy())
        all_labels.extend(labels.numpy())

print("\nTest Accuracy:", accuracy_score(all_labels, all_preds))
print("\nClassification Report:\n")
print(classification_report(all_labels, all_preds))


Test Accuracy: 0.9666666666666667

Classification Report:

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      0.90      0.95        10
           2       0.91      1.00      0.95        10

    accuracy                           0.97        30
   macro avg       0.97      0.97      0.97        30
weighted avg       0.97      0.97      0.97        30
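
The report can be cross-checked by hand. The confusion matrix below was not printed in the original run; it is reconstructed from the reported support, precision, and recall (one Versicolor sample predicted as Virginica is the only arrangement consistent with those numbers), so treat it as an inference rather than recorded output.

```python
import numpy as np

# Rows = true class, columns = predicted class (0=Setosa, 1=Versicolor, 2=Virginica).
# Reconstructed from the classification report above, not taken from a logged run.
cm = np.array([[10, 0,  0],
               [ 0, 9,  1],
               [ 0, 0, 10]])

true_pos = np.diag(cm)
precision = true_pos / cm.sum(axis=0)   # correct / everything predicted as that class
recall = true_pos / cm.sum(axis=1)      # correct / everything truly in that class
accuracy = true_pos.sum() / cm.sum()

print(np.round(precision, 2).tolist())  # [1.0, 1.0, 0.91]
print(np.round(recall, 2).tolist())     # [1.0, 0.9, 1.0]
print(round(float(accuracy), 4))        # 0.9667
```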

