# HW5-1: Iris Classification Problem (using tf.keras, PyTorch, and PyTorch Lightning, with TensorBoard visualization)

- The code in this notebook was generated using [GitHub Copilot](https://github.com/features/copilot).

## Prompt

Generate a TensorFlow program for a classification task using the Iris dataset. The program should include dataset preprocessing, model creation, training, and evaluation. Incorporate advanced features such as learning rate scheduling, TensorBoard logging, and custom callback implementations.

- Dataset: Use the Iris dataset from sklearn.datasets.
- Preprocessing: 
    - Perform one-hot encoding on target labels.
    - Split data into training and testing sets (80-20 split).
- Model Architecture:
    - Sequential model with 2 hidden layers (10 neurons each, ReLU activation).
    - Output layer with 3 classes and softmax activation.
- Compile: Use Adam optimizer with a learning rate of 0.01, categorical crossentropy as the loss, and accuracy as a metric.
- Learning Rate Scheduler: Reduce learning rate after 20 epochs using exponential decay.
- TensorBoard: Set up TensorBoard for logging training metrics, including histograms.
- Custom Callback: Print the loss at the end of each epoch.
- Training: Train the model for 100 epochs with validation.


In [1]:
%load_ext tensorboard

In [2]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
import numpy as np
import os
from datetime import datetime

# Load and preprocess the Iris dataset
iris_data = load_iris()
X = iris_data.data  # Features
y = iris_data.target.reshape(-1, 1)  # Target reshaped for one-hot encoding

# One-hot encoding of target values
encoder = OneHotEncoder(sparse_output=False)
y_encoded = encoder.fit_transform(y)

# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.2, random_state=42)

# Define the model
model = Sequential([
    Dense(10, input_shape=(X_train.shape[1],), activation='relu', name='hidden_layer_1'),
    Dense(10, activation='relu', name='hidden_layer_2'),
    Dense(3, activation='softmax', name='output_layer')
])

# Compile the model
model.compile(
    optimizer=Adam(learning_rate=0.01),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Define learning rate scheduler
def scheduler(epoch, lr):
    if epoch < 20:
        return lr
    else:
        return (lr * tf.math.exp(-0.1)).numpy()

lr_scheduler = tf.keras.callbacks.LearningRateScheduler(scheduler)

# Setup TensorBoard logging directory
log_dir = "logs5-1/iris_classification_tf/" + datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

# Custom callback to log loss
class LossLoggerCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        print(f"Epoch {epoch+1}: Loss = {logs['loss']}")

loss_logger = LossLoggerCallback()

# Train the model
model.fit(
    X_train, y_train,
    validation_data=(X_test, y_test),
    epochs=100,
    callbacks=[tensorboard_callback, lr_scheduler, loss_logger]
)

# Evaluate the model
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=2)
print(f"Test Accuracy: {test_accuracy}")

2024-12-16 23:46:11.216183: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-16 23:46:11.227454: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-16 23:46:11.230959: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-16 23:46:11.239253: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


Epoch 1/100


  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
I0000 00:00:1734363972.918026  138807 cuda_executor.cc:1015] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-12-16 23:46:12.945435: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2343] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...


[1m1/4[0m [32m━━━━━[0m[37m━━━━━━━━━━━━━━━[0m [1m1s[0m 624ms/step - accuracy: 0.3438 - loss: 2.1720Epoch 1: Loss = 1.5765020847320557
[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 70ms/step - accuracy: 0.3565 - loss: 1.7645 - val_accuracy: 0.3000 - val_loss: 1.0225 - learning_rate: 0.0100
Epoch 2/100
[1m1/4[0m [32m━━━━━[0m[37m━━━━━━━━━━━━━━━[0m [1m0s[0m 12ms/step - accuracy: 0.3438 - loss: 0.9923Epoch 2: Loss = 0.9117698669433594
[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 22ms/step - accuracy: 0.5017 - loss: 0.9419 - val_accuracy: 0.7000 - val_loss: 0.8124 - learning_rate: 0.0100
Epoch 3/100
[1m1/4[0m [32m━━━━━[0m[37m━━━━━━━━━━━━━━━[0m [1m0s[0m 12ms/step - accuracy: 0.7188 - loss: 0.7910Epoch 3: Loss = 0.8060577511787415
[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 22ms/step - accuracy: 0.6956 - loss: 0.7892 - val_accuracy: 0.7000 - val_loss: 0.7238 - learning_rate: 0.0100
Epoch 4/100
[1m1/4[0m [32m━━━━━[0m

## Prompt

Convert the code to PyTorch.

- Log loss and learning rate to TensorBoard 

In [3]:
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from torch.utils.tensorboard import SummaryWriter
from datetime import datetime
import numpy as np

# Load and preprocess the Iris dataset
iris_data = load_iris()
X = iris_data.data  # Features
y = iris_data.target.reshape(-1, 1)  # Target reshaped for one-hot encoding

# One-hot encoding of target values
encoder = OneHotEncoder(sparse_output=False)
y_encoded = encoder.fit_transform(y)

# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.2, random_state=42)

# Convert data to PyTorch tensors
X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.float32)

# Define the model
class IrisModel(nn.Module):
    def __init__(self):
        super(IrisModel, self).__init__()
        self.hidden_layer_1 = nn.Linear(X_train.shape[1], 10)
        self.hidden_layer_2 = nn.Linear(10, 10)
        self.output_layer = nn.Linear(10, 3)

    def forward(self, x):
        x = torch.relu(self.hidden_layer_1(x))
        x = torch.relu(self.hidden_layer_2(x))
        x = torch.softmax(self.output_layer(x), dim=1)
        return x

model = IrisModel()

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.1)

# Learning rate scheduler
def scheduler(epoch, lr):
    if epoch < 20:
        return lr
    else:
        return lr * np.exp(-0.1)

# Setup TensorBoard logging
date_time = datetime.now().strftime("%Y%m%d-%H%M%S")
log_dir = f"logs5-1/iris_classification_torch/{date_time}"
writer = SummaryWriter(log_dir=log_dir)

# Training loop
epochs = 100
for epoch in range(epochs):
    # Set model to training mode
    model.train()

    # Forward pass
    outputs = model(X_train)
    loss = criterion(outputs, torch.argmax(y_train, dim=1))

    # Backward pass
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Log loss and learning rate to TensorBoard
    writer.add_scalar('Loss/train', loss.item(), epoch)
    current_lr = optimizer.param_groups[0]['lr']
    writer.add_scalar('Learning Rate', current_lr, epoch)

    # Adjust learning rate
    for param_group in optimizer.param_groups:
        param_group['lr'] = scheduler(epoch, param_group['lr'])

    # Print loss
    print(f"Epoch {epoch + 1}: Loss = {loss.item()} | Learning Rate = {current_lr}")

    # Evaluate on validation set
    model.eval()
    with torch.no_grad():
        val_outputs = model(X_test)
        val_loss = criterion(val_outputs, torch.argmax(y_test, dim=1))
        val_correct = (torch.argmax(val_outputs, dim=1) == torch.argmax(y_test, dim=1)).sum().item()
        val_accuracy = val_correct / y_test.size(0)
        writer.add_scalar('Loss/validation', val_loss.item(), epoch)
        writer.add_scalar('Accuracy/validation', val_accuracy, epoch)

# Final evaluation
model.eval()
with torch.no_grad():
    test_outputs = model(X_test)
    correct = (torch.argmax(test_outputs, dim=1) == torch.argmax(y_test, dim=1)).sum().item()
    accuracy = correct / y_test.size(0)

print(f"Test Accuracy: {accuracy}")

# Close TensorBoard writer
writer.close()

Epoch 1: Loss = 1.104034423828125 | Learning Rate = 0.1
Epoch 2: Loss = 1.0930980443954468 | Learning Rate = 0.1
Epoch 3: Loss = 1.0499930381774902 | Learning Rate = 0.1
Epoch 4: Loss = 0.953551173210144 | Learning Rate = 0.1
Epoch 5: Loss = 0.8758890628814697 | Learning Rate = 0.1
Epoch 6: Loss = 0.8722573518753052 | Learning Rate = 0.1
Epoch 7: Loss = 0.8551760315895081 | Learning Rate = 0.1
Epoch 8: Loss = 0.8269813060760498 | Learning Rate = 0.1
Epoch 9: Loss = 0.8106176257133484 | Learning Rate = 0.1
Epoch 10: Loss = 0.820892870426178 | Learning Rate = 0.1
Epoch 11: Loss = 0.7938375473022461 | Learning Rate = 0.1
Epoch 12: Loss = 0.7787870764732361 | Learning Rate = 0.1
Epoch 13: Loss = 0.7670819759368896 | Learning Rate = 0.1
Epoch 14: Loss = 0.7550526261329651 | Learning Rate = 0.1
Epoch 15: Loss = 0.7484996914863586 | Learning Rate = 0.1
Epoch 16: Loss = 0.7183910012245178 | Learning Rate = 0.1
Epoch 17: Loss = 0.7145587205886841 | Learning Rate = 0.1
Epoch 18: Loss = 0.6817573

## Prompt

Convert the code to PyTorch Lightning.

- Change all logs to be recorded at the end of each epoch.

In [4]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Load and preprocess the Iris dataset
iris_data = load_iris()
X = iris_data.data  # Features
y = iris_data.target.reshape(-1, 1)  # Target reshaped for one-hot encoding

# One-hot encoding of target values
encoder = OneHotEncoder(sparse_output=False)
y_encoded = encoder.fit_transform(y)

# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y_encoded, test_size=0.2, random_state=42
)

# Convert data to PyTorch tensors
X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.float32)

# Create DataLoaders
train_dataset = TensorDataset(X_train, y_train)
val_dataset = TensorDataset(X_test, y_test)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)


# Define the LightningModule
class IrisModel(pl.LightningModule):
    def __init__(self, input_dim, hidden_dim, output_dim, learning_rate=0.1):
        super(IrisModel, self).__init__()
        self.save_hyperparameters()
        self.hidden_layer_1 = nn.Linear(input_dim, hidden_dim)
        self.hidden_layer_2 = nn.Linear(hidden_dim, hidden_dim)
        self.output_layer = nn.Linear(hidden_dim, output_dim)
        self.learning_rate = learning_rate

    def forward(self, x):
        x = F.relu(self.hidden_layer_1(x))
        x = F.relu(self.hidden_layer_2(x))
        x = F.softmax(self.output_layer(x), dim=1)
        return x

    def training_step(self, batch, batch_idx):
        X_batch, y_batch = batch
        outputs = self(X_batch)
        loss = F.cross_entropy(outputs, torch.argmax(y_batch, dim=1))
        self.log("train_loss", loss, on_epoch=True, on_step=False)

        # Log the learning rate for the current step
        lr = self.optimizers().param_groups[0]["lr"]
        self.log("learning_rate", lr, on_epoch=True, on_step=False)

        return loss

    def validation_step(self, batch, batch_idx):
        X_batch, y_batch = batch
        outputs = self(X_batch)
        loss = F.cross_entropy(outputs, torch.argmax(y_batch, dim=1))
        val_correct = (
            (torch.argmax(outputs, dim=1) == torch.argmax(y_batch, dim=1)).sum().item()
        )
        val_accuracy = val_correct / y_batch.size(0)
        self.log("val_loss", loss, prog_bar=True, on_epoch=True, on_step=False)
        self.log("val_accuracy", val_accuracy, prog_bar=True, on_epoch=True, on_step=False)

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.learning_rate)
        scheduler = torch.optim.lr_scheduler.LambdaLR(
            optimizer,
            lr_lambda=lambda epoch: 1.0 if epoch < 20 else 0.1 ** (epoch - 19),
        )
        return [optimizer], [scheduler]

    def test_step(self, batch, batch_idx):
        X_batch, y_batch = batch
        outputs = self(X_batch)
        loss = F.cross_entropy(outputs, torch.argmax(y_batch, dim=1))
        test_correct = (
            (torch.argmax(outputs, dim=1) == torch.argmax(y_batch, dim=1)).sum().item()
        )
        test_accuracy = test_correct / y_batch.size(0)
        self.log("test_loss", loss, prog_bar=True, on_epoch=True, on_step=False)
        self.log("test_accuracy", test_accuracy, prog_bar=True, on_epoch=True, on_step=False)


# Instantiate the model
model = IrisModel(input_dim=4, hidden_dim=10, output_dim=3, learning_rate=0.1)

# Define the trainer
trainer = pl.Trainer(
    max_epochs=100, log_every_n_steps=10, logger=pl.loggers.TensorBoardLogger("logs5-1/")
)

# Train the model
trainer.fit(model, train_dataloaders=train_loader, val_dataloaders=val_loader)

# Evaluate the model on the test set
trainer.test(dataloaders=val_loader)

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name           | Type   | Params | Mode 
--------------------------------------------------
0 | hidden_layer_1 | Linear | 50     | train
1 | hidden_layer_2 | Linear | 110    | train
2 | output_layer   | Linear | 33     | train
--------------------------------------------------
193       Trainable params
0         Non-trainable params
193       Total params
0.001     Total estimated model params size (MB)
3         Modules in train mode
0         Modules in eval mode


Sanity Checking DataLoader 0:   0%|          | 0/1 [00:00<?, ?it/s]

/home/g113056077/.pyenv/versions/aiot-hw5/lib/python3.11/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=15` in the `DataLoader` to improve performance.


                                                                           

/home/g113056077/.pyenv/versions/aiot-hw5/lib/python3.11/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=15` in the `DataLoader` to improve performance.
/home/g113056077/.pyenv/versions/aiot-hw5/lib/python3.11/site-packages/pytorch_lightning/loops/fit_loop.py:298: The number of training batches (4) is smaller than the logging interval Trainer(log_every_n_steps=10). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.


Epoch 99: 100%|██████████| 4/4 [00:00<00:00, 172.03it/s, v_num=0, val_loss=0.570, val_accuracy=0.967]

`Trainer.fit` stopped: `max_epochs=100` reached.


Epoch 99: 100%|██████████| 4/4 [00:00<00:00, 148.66it/s, v_num=0, val_loss=0.570, val_accuracy=0.967]


Restoring states from the checkpoint path at logs5-1/lightning_logs/version_0/checkpoints/epoch=99-step=400.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from the checkpoint at logs5-1/lightning_logs/version_0/checkpoints/epoch=99-step=400.ckpt
/home/g113056077/.pyenv/versions/aiot-hw5/lib/python3.11/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:424: The 'test_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=15` in the `DataLoader` to improve performance.


Testing DataLoader 0: 100%|██████████| 1/1 [00:00<00:00, 282.71it/s]


[{'test_loss': 0.5700286626815796, 'test_accuracy': 0.9666666388511658}]

In [5]:
%tensorboard --logdir "logs5-1"