# 🌻 Practical 2: Flower Classification with Activation Functions
This is **Practical 2** of the **Hands-On Computer Vision Series: A Practical Learning Journey**.

In this notebook, we enhance our neural network model by adding **activation functions**, specifically the **ReLU** activation function. This allows the model to learn non-linear patterns and improves its classification capability.

We also explain **loss functions**, **learning rate**, and important **hyperparameters** you can tune.

In [None]:
# Import required libraries
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

In [None]:
# Define image size and class names
IMG_HEIGHT = 224
IMG_WIDTH = 224
IMG_CHANNELS = 3
CLASS_NAMES = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]

## ☁️ Using Google Cloud-hosted Flower Dataset
This notebook reads flower images hosted on Google Cloud Storage. Two **CSV files** list each image path together with its label:

- `train_set.csv` for training
- `eval_set.csv` for evaluation

If you are running locally:
- Download the dataset and CSVs
- Replace the paths with your local file paths
- Ensure image paths in the CSV match your local folder structure

In [None]:
# Load dataset from CSV and parse image-label pairs
def read_and_decode(filename, resize_dims):
    img_bytes = tf.io.read_file(filename)
    img = tf.image.decode_jpeg(img_bytes, channels=IMG_CHANNELS)
    img = tf.image.convert_image_dtype(img, tf.float32)
    img = tf.image.resize(img, resize_dims)
    return img

def parse_csvline(csv_line):
    record_default = ["", ""]
    filename, label_string = tf.io.decode_csv(csv_line, record_default)
    img = read_and_decode(filename, [IMG_HEIGHT, IMG_WIDTH])
    # Index of the class whose name matches the label string (e.g. "roses" -> 2)
    label = tf.argmax(tf.math.equal(CLASS_NAMES, label_string))
    return img, label


In [None]:
# You can change the batch size here to 16, 32, 64, etc.
# Load training and evaluation datasets from Google Cloud
train_dataset = (
    tf.data.TextLineDataset("gs://cloud-ml-data/img/flower_photos/train_set.csv")
    .map(parse_csvline, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(16)
    .prefetch(tf.data.AUTOTUNE)
)

eval_dataset = (
    tf.data.TextLineDataset("gs://cloud-ml-data/img/flower_photos/eval_set.csv")
    .map(parse_csvline, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(16)
    .prefetch(tf.data.AUTOTUNE)
)

## 🔧 What is an Activation Function?
Activation functions are essential in neural networks because they introduce **non-linearity**. Without them, a model made of multiple layers would behave just like a single-layer linear model.

**Why we need them:**
- They allow the network to **learn complex patterns** in the data
- They help us solve problems that are **not linearly separable**
- They make extra layers worthwhile, because each layer can build on the non-linear features computed by the one before it
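
To make the "single-layer linear model" claim concrete, here is a minimal NumPy sketch. The shapes and random values are illustrative assumptions, not part of the flower model. It shows that two stacked linear layers can be merged into one, and that inserting ReLU between them breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                       # 4 toy samples, 3 features
W1, b1 = rng.normal(size=(3, 5)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 2)), rng.normal(size=2)

# Two stacked linear layers with no activation in between
two_layer = (x @ W1 + b1) @ W2 + b2

# The same mapping expressed as a single linear layer
W_merged = W1 @ W2
b_merged = b1 @ W2 + b2
one_layer = x @ W_merged + b_merged

layers_collapse = np.allclose(two_layer, one_layer)

# With ReLU between the layers, the merged layer no longer matches
with_relu = np.maximum(0, x @ W1 + b1) @ W2 + b2
relu_differs = not np.allclose(with_relu, one_layer)

print("Linear layers collapse into one:", layers_collapse)
print("ReLU output differs from merged layer:", relu_differs)
```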

### 🚀 Common Activation Functions:
- 🔸 **ReLU (Rectified Linear Unit):**
  - Formula: $f(x) = \max(0, x)$
  - Most widely used due to its simplicity and efficiency
  - Helps mitigate vanishing gradients, since its gradient is 1 for every positive input

- 🔸 **Sigmoid:**
  - Formula: $f(x) = \frac{1}{1 + e^{-x}}$
  - Squashes values into the range (0, 1)
  - Commonly used in binary classification (output layer)

- 🔸 **Tanh:**
  - Formula: $f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$
  - Range: (-1, 1), centered at 0
  - Often works better than sigmoid in hidden layers because its outputs are zero-centered

- 🔸 **Softmax:**
  - Used in the output layer for **multi-class classification**
  - Converts raw scores into probabilities that sum to 1

➡️ In this notebook, we use **ReLU** for hidden layers and **Softmax** in the output layer.
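
The formulas above translate almost directly into NumPy. The sketch below is for intuition only: Keras ships its own implementations, and the sample `scores` vector is an assumption chosen for illustration.

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x), applied element-wise
    return np.maximum(0.0, x)

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # f(x) = (e^x - e^(-x)) / (e^x + e^(-x))
    return np.tanh(x)

def softmax(x):
    # Subtracting the max leaves the result unchanged but avoids overflow
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

scores = np.array([-2.0, 0.0, 3.0])
print("ReLU:   ", relu(scores))
print("Sigmoid:", sigmoid(scores))
print("Tanh:   ", tanh(scores))
print("Softmax:", softmax(scores), "sum =", softmax(scores).sum())
```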

In [None]:
# Feel free to adjust the number of hidden units (e.g., 64, 256)
# Build the model with activation functions
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS)),
    tf.keras.layers.Dense(128, activation='relu'),  # Hidden layer 1
    tf.keras.layers.Dense(128, activation='relu'),  # Hidden layer 2
    tf.keras.layers.Dense(len(CLASS_NAMES), activation='softmax')  # Output layer: one unit per class
])
model.summary()

## 📊 Understanding Parameters in a Neural Network
**Parameters** in a neural network include **weights** and **biases** — the values the model learns during training.

**Each Dense (fully connected) layer has:**
- 🔹 **Weights**: Connect each input to each output neuron
- 🔹 **Biases**: One per output neuron

### 🧮 Formula to Calculate Trainable Parameters (per Dense layer):
`Layer Parameters = (input_features × output_units) + output_units`

---
### 📐 Parameters in This Model:
- Input size = `224 × 224 × 3 = 150,528`
- Hidden Layer 1 = 128 units
- Hidden Layer 2 = 128 units
- Output Layer = 5 units

**Layer 1 (Input → Hidden 1)**:
- Weights: `150,528 × 128 = 19,267,584`
- Biases: `128`
- Total: `19,267,712`

**Layer 2 (Hidden 1 → Hidden 2)**:
- Weights: `128 × 128 = 16,384`
- Biases: `128`
- Total: `16,512`

**Layer 3 (Hidden 2 → Output)**:
- Weights: `128 × 5 = 640`
- Biases: `5`
- Total: `645`

**Total Parameters = 19,267,712 + 16,512 + 645 = 19,284,869**

✅ These are all **trainable** parameters. You can confirm them using the `model.summary()` output.
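
The same arithmetic can be checked in a few lines of plain Python. The helper name `dense_params` is hypothetical and introduced only for this sketch; the layer sizes are the ones used in this notebook.

```python
def dense_params(input_features, output_units):
    # Weights (one per input-output pair) plus one bias per output unit
    return input_features * output_units + output_units

input_size = 224 * 224 * 3        # flattened image
layer_units = [128, 128, 5]       # hidden 1, hidden 2, output

total = 0
fan_in = input_size
for units in layer_units:
    count = dense_params(fan_in, units)
    print(f"{fan_in:>7,} -> {units:>3}: {count:,} parameters")
    total += count
    fan_in = units                # this layer's outputs feed the next layer

print(f"Total: {total:,}")
```

The printed total should match the 19,284,869 computed by hand above.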

> 📌 More parameters mean more capacity to learn, but also more risk of **overfitting** — which we'll address in the next practical!

## 📉 What is a Loss Function?
A **loss function** tells the model how far off its predictions are from the actual labels. It is a critical part of training, as it drives how the model learns by adjusting weights to minimize this loss.

### 🧪 How it works:
1. The model makes a prediction
2. The loss function compares this prediction to the true label
3. It computes an error (loss)
4. Backpropagation computes the gradient of this loss with respect to each weight, and the optimizer uses those gradients to update the weights
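
The four steps can be sketched by hand for a tiny single-layer softmax classifier. Everything here is an illustrative assumption (random toy data, three classes, plain gradient descent rather than Adam). For one layer, backpropagation reduces to the closed-form gradient used below.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))            # 8 toy samples, 4 features
y = rng.integers(0, 3, size=8)         # integer labels for 3 classes
W = np.zeros((4, 3))
b = np.zeros(3)

def forward(x, W, b):
    logits = x @ W + b
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)   # softmax probabilities

def loss_fn(probs, y):
    # Mean cross-entropy with integer labels
    return -np.log(probs[np.arange(len(y)), y]).mean()

probs = forward(x, W, b)               # 1. make a prediction
loss_before = loss_fn(probs, y)        # 2-3. compare with labels, compute loss

# 4. gradient of the mean loss w.r.t. the logits is (probs - one_hot) / N
grad_logits = probs.copy()
grad_logits[np.arange(len(y)), y] -= 1.0
grad_logits /= len(y)
grad_W = x.T @ grad_logits
grad_b = grad_logits.sum(axis=0)

learning_rate = 0.1
W -= learning_rate * grad_W            # gradient descent update
b -= learning_rate * grad_b

loss_after = loss_fn(forward(x, W, b), y)
print(f"loss before: {loss_before:.4f}, loss after one step: {loss_after:.4f}")
```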

### 🔍 Types of Loss Functions:
- 🔹 **SparseCategoricalCrossentropy**:
  - Used when labels are integers (not one-hot encoded)
  - Perfect for multi-class classification (like our flower dataset)

- 🔹 **CategoricalCrossentropy**:
  - Use this when your labels are **one-hot encoded** (e.g., [0, 0, 1, 0, 0])

- 🔹 **BinaryCrossentropy**:
  - Used for binary classification problems (e.g., cat vs. dog)
  - Often paired with a sigmoid activation

- 🔹 **Mean Squared Error (MSE)**:
  - Common in regression tasks

➡️ In this practical, we use **SparseCategoricalCrossentropy** because the task has five classes and `parse_csvline` returns each label as an integer class index.
The goal during training is to minimize this loss as much as possible.
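
To see that the sparse and one-hot variants measure the same thing, the NumPy sketch below computes both by hand. The probability rows are made-up softmax outputs for two images, assumed purely for illustration.

```python
import numpy as np

# Hypothetical softmax outputs for two images over the five flower classes
probs = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.10, 0.20, 0.60, 0.05, 0.05],
])
int_labels = np.array([0, 2])          # integer labels: daisy, roses

# Sparse form: pick the predicted probability of the true class directly
sparse_loss = -np.log(probs[np.arange(len(int_labels)), int_labels])

# One-hot form: same result, but the labels are expanded to vectors first
one_hot = np.eye(probs.shape[1])[int_labels]
categorical_loss = -np.sum(one_hot * np.log(probs), axis=1)

mean_loss = sparse_loss.mean()
print("per-example loss (sparse): ", sparse_loss)
print("per-example loss (one-hot):", categorical_loss)
print("mean loss:", mean_loss)
```

The sparse form simply skips building the one-hot vectors, which saves memory when there are many classes.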

## 🛠️ Understanding Hyperparameters
**Hyperparameters** are external configuration values set before training. They’re not learned from the data, but they directly influence how well your model learns.

### 🔑 Common Hyperparameters:
- **Learning Rate**: Controls how big each weight update step is.
  - Too high → training may diverge
  - Too low → training is slow or stuck
- **Batch Size**: Number of samples processed before weights are updated.
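  - Smaller batch size → noisier gradient estimates and more updates per epoch; the noise can sometimes help generalization
  - Larger batch size → smoother gradients and faster epochs on parallel hardware, but needs more memory and may generalize slightly worse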
- **Number of Epochs**: How many times the model sees the entire dataset.
  - More epochs → potentially better training (until overfitting begins)
- **Number of Hidden Units**: Controls the capacity of the network to learn patterns.
  - Too few → underfitting
  - Too many → overfitting if regularization isn’t applied
- **Image Size**: Larger images preserve more detail but increase the input size and, for this model, the parameter count of the first Dense layer
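
The effect of the learning rate is easiest to see on a toy problem. The sketch below is plain Python and is not the Adam optimizer used later. It runs gradient descent on the assumed function f(w) = w², whose minimum is at w = 0, with three different step sizes.

```python
def gradient_descent(learning_rate, steps=20, start=5.0):
    """Minimize f(w) = w**2 (gradient 2*w) from a fixed starting point."""
    w = start
    for _ in range(steps):
        w -= learning_rate * 2 * w     # step against the gradient
    return w

for lr in (0.001, 0.1, 1.1):
    print(f"learning_rate={lr}: final w = {gradient_descent(lr):.4f}")
```

With the smallest rate, `w` barely moves in 20 steps; with the middle rate it ends close to 0; with the largest rate each step overshoots the minimum and `w` grows in magnitude instead of shrinking.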

### 🧪 Try This:
- Change `learning_rate=0.001` to `0.0001` or `0.01`
- Try different batch sizes (the value passed to `.batch(...)` when building the datasets), such as `16`, `32`, `64`
- Increase or decrease the number of neurons in hidden layers
- Add more epochs to see performance trends over time

🔁 **By experimenting with these values, you’ll gain a deeper intuition for how neural networks learn!**

In [None]:
# You can experiment with different learning rates here
# Try: 0.0001, 0.001 (default), 0.01
# Compile the model
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    metrics=['accuracy']
)

In [None]:
# Try increasing the number of epochs to see if accuracy improves or overfits
# Train the model
history = model.fit(
    train_dataset,
    validation_data=eval_dataset,
    epochs=10
)

In [None]:
# Plot accuracy and loss curves
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(len(acc))  # follows the number of epochs actually trained
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Train Acc')
plt.plot(epochs_range, val_acc, label='Val Acc')
plt.title('Accuracy')
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Train Loss')
plt.plot(epochs_range, val_loss, label='Val Loss')
plt.title('Loss')
plt.legend()
plt.show()

## ✅ Summary: What We Learned
In this practical, you unlocked the power of **non-linear learning** by using activation functions!

- 🧠 **ReLU** enabled the model to learn complex, non-linear patterns from image data.
- 📈 You saw how to build a deeper model with **two hidden layers**.
- 📉 We explained **loss functions** and how models learn by minimizing error.
- 🔧 We introduced **hyperparameters** like learning rate and hidden units—feel free to tweak them!

🎓 By experimenting with different values, you’ll start to see how model behavior changes in real time.

👉 **Up next in Practical 3**:
We’ll tackle **overfitting** and how to fight it using techniques like **Dropout**, **Regularization**, and even deeper networks to build more robust models.

> The journey continues — keep building, keep tweaking, and most of all, keep learning! 🚀