# MLflow Exercise Tasks
MLflow Tracking is a component for logging and organizing machine learning experiments. It provides a central place to record parameters, metrics, artifacts, and code versions. Here are some key concepts:

- **Experiment**: A named collection of runs, typically representing one machine learning task or workflow.
- **Run**: A single execution of a script or piece of code within an experiment.
- **Parameters**: Input values to a run, such as hyperparameters.
- **Metrics**: Output values or performance indicators logged during a run.
- **Artifacts**: Output files, such as models or plots, logged during a run.

By using MLflow, teams can effectively track and reproduce experiments, facilitating collaboration and model reproducibility.

## Exercise Overview
Welcome to the MLflow workshop on experiment tracking! In this exercise, we'll explore how to leverage MLflow to log and organize metrics, parameters, and artifacts in the context of machine learning workflows. The exercises are divided into two parts:
1. **Logging Metrics and Parameters with MLflow**: This part focuses on using MLflow in a sklearn-based machine learning workflow, specifically with a RandomForestClassifier on the Iris dataset.

2. **PyTorch Image Classifier with MLflow**: The second part of the exercise involves creating an image classifier using PyTorch. We'll utilize MLflow to track the training process and log important artifacts, such as the trained model and confusion matrix.

## Exercise 1 - Logging Metrics and Parameters with MLflow
> **Objective**: In this exercise, we will practice using MLflow to log metrics and parameters in a machine learning workflow.

Code comments starting with an exclamation mark `#!` represent a TODO. The tracking server can be reached via the URL `http://localhost:5001`.
**MLflow is already installed as a package.**

### Part 1: Setting Up
We begin by importing the necessary libraries and loading a sample dataset (Iris). The dataset is split into training and testing sets.

In [None]:
import mlflow
import mlflow.sklearn
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

In [None]:
# Load a Sample Dataset
from sklearn.datasets import load_iris
data = load_iris()
X, y = data.data, data.target


In [None]:
# Split the Dataset into Training and Testing Sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

### Part 2: Training a RandomForest Classifier
In this section, you will train a RandomForest classifier, log hyperparameters (e.g., the number of trees), and record accuracy as a metric. Additionally, you'll perform tasks like changing hyperparameters, viewing runs on the MLflow UI, and comparing experiments.

In [None]:
#! Set the tracking URI to `http://mlflow:5001`
mlflow.set_tracking_uri("http://mlflow:5001")

In [None]:
try:
    #! Create an experiment with the name `MLflow RandomForest Demo`. Save the experiment_id in the variable `exp_id`.
    exp_id = mlflow.create_experiment(name="MLflow RandomForest Demo")
except mlflow.exceptions.RestException:
    # The experiment already exists, so look up its ID instead.
    exp_id = mlflow.get_experiment_by_name(name="MLflow RandomForest Demo").experiment_id


In [None]:
with mlflow.start_run(experiment_id=exp_id):
    num_trees = 100
    
    #! Log the hyperparameter `num_trees` as a parameter
    mlflow.log_param("num_trees", num_trees)
    #! Log the dataset `X` as an input with the context `input`
    dataset_input = mlflow.data.from_numpy(X, source="Input.csv")
    mlflow.log_input(dataset_input, context="input")
    
    clf = RandomForestClassifier(n_estimators=num_trees)
    clf.fit(X_train, y_train)

    accuracy = clf.score(X_test, y_test)

    #! Log the accuracy (`accuracy`) as a metric
    mlflow.log_metric("accuracy", accuracy)

### Part 3: Additional Tasks
These tasks encourage you to explore further MLflow functionality, such as comparing runs and logging datasets as run inputs.

1. Set the number of trees `num_trees` in the previous code block to 200 and run the code block again.
2. Visit the URL `http://localhost:5001` in your browser. Check both runs of the experiment.
3. On the overview page of the runs of the experiment `MLflow RandomForest Demo`, select both runs and compare them by clicking Compare.
4. Before training, log the dataset `X` as an input with the context `input`.

## Exercise 2 - PyTorch Image Classifier with MLflow
> **Objective**: In this exercise, we'll create an image classifier with PyTorch and use MLflow to monitor the trained model.

### Part 1: Setting Up
The second exercise involves creating a PyTorch-based image classifier using Fashion MNIST. We define a simple Convolutional Neural Network (CNN) architecture.

In [None]:
import mlflow
import mlflow.pytorch
import torch
import torch.nn as nn
import torch.nn.functional as F 
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from tqdm import tqdm
from sklearn.metrics import confusion_matrix
import seaborn as sn
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

In [None]:
# Load the Fashion MNIST Dataset and Apply Necessary Transformations
import pytorch_lightning as pl
from torch.utils.data import DataLoader
from torchvision import transforms, datasets

class FashionMNISTDataModule(pl.LightningDataModule):
    def __init__(self, batch_size: int = 4, data_path: str = './data/FASHIONMNIST'):
        super().__init__()
        self.data_path = data_path
        self.batch_size = batch_size

    def prepare_data(self):
        datasets.FashionMNIST(self.data_path, download=True)

    def setup(self, stage=None):
        transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.5,), (0.5,))
        ])

        if stage == 'fit' or stage is None:
            self.train_dataset = datasets.FashionMNIST(
                self.data_path,
                train=True,
                transform=transform
            )

        if stage == 'test' or stage is None:
            self.test_dataset = datasets.FashionMNIST(
                self.data_path,
                train=False,
                transform=transform
            )

    def train_dataloader(self):
        return DataLoader(self.train_dataset, batch_size=self.batch_size, shuffle=True, num_workers=2)

    def test_dataloader(self):
        return DataLoader(self.test_dataset, batch_size=self.batch_size, shuffle=False, num_workers=2)
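The transform pipeline in `setup` first converts each image to a tensor with values in [0, 1] (`ToTensor`) and then applies `Normalize((0.5,), (0.5,))`, which computes `(value - mean) / std` per channel. A small plain-Python sketch of that arithmetic for single pixel values (the function name `normalize` is only illustrative):

```python
def normalize(pixel: float, mean: float = 0.5, std: float = 0.5) -> float:
    """Apply the same formula as transforms.Normalize to one pixel value."""
    return (pixel - mean) / std


darkest = normalize(0.0)    # black pixel after ToTensor
mid_gray = normalize(0.5)   # mid-gray pixel
brightest = normalize(1.0)  # white pixel after ToTensor

print(darkest, mid_gray, brightest)
```

With mean and standard deviation both set to 0.5, the inputs are rescaled from [0, 1] to [-1, 1].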

In [None]:
batch_size = 64
learning_rate = 0.001
epochs = 2


data_module = FashionMNISTDataModule(batch_size=batch_size)
data_module.prepare_data()
data_module.setup()

### Part 2: Training the Model
You will log parameters (e.g., learning rate, batch size, epochs) and the loss per epoch during the training process. Additionally, you'll log the trained model as an artifact.
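Keep in mind that Lightning calls `training_step` once per batch, so a loss logged there is recorded per step rather than per epoch. One way to obtain a per-epoch value is to average the step losses of each epoch. The sketch below shows that aggregation in plain Python with made-up loss values; the helper `epoch_means` is hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def epoch_means(step_losses):
    """Average (epoch, loss) pairs into one mean loss per epoch."""
    by_epoch = defaultdict(list)
    for epoch, loss in step_losses:
        by_epoch[epoch].append(loss)
    return {epoch: fmean(values) for epoch, values in sorted(by_epoch.items())}


# Placeholder history: two batches in each of two epochs.
history = [(0, 1.0), (0, 0.5), (1, 0.25), (1, 0.25)]
per_epoch = epoch_means(history)

# Each value could then be logged with
# mlflow.log_metric("loss", value, step=epoch).
print(per_epoch)
```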

In [None]:
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl

class FashionMNISTModel(pl.LightningModule):
    def __init__(self, learning_rate=0.001):
        super(FashionMNISTModel, self).__init__()
        self.learning_rate = learning_rate
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = F.cross_entropy(y_hat, y)
        self.log('train_loss', loss)
        #! Log the loss (note: training_step runs once per batch, so this records one value per step)
        mlflow.log_metric("loss", loss.item(), step=self.global_step)

        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.SGD(self.parameters(), lr=self.learning_rate)
        return optimizer


model = FashionMNISTModel(learning_rate=learning_rate)

model.prepare_data = data_module.prepare_data
model.setup = data_module.setup
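The input size of `fc1`, `16 * 4 * 4`, follows from how the convolution and pooling layers shrink a 28x28 Fashion MNIST image. The sketch below retraces that arithmetic with the standard output-size formula for unpadded layers; the helper `out_size` is illustrative and not part of PyTorch.

```python
def out_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                           # Fashion MNIST images are 28x28
size = out_size(size, 5)            # conv1, 5x5 kernel: 28 -> 24
size = out_size(size, 2, stride=2)  # 2x2 max pooling:   24 -> 12
size = out_size(size, 5)            # conv2, 5x5 kernel: 12 -> 8
size = out_size(size, 2, stride=2)  # 2x2 max pooling:   8 -> 4

channels = 16                       # output channels of conv2
flattened = channels * size * size  # features fed into fc1
print(size, flattened)
```

If the input resolution or kernel sizes change, the first argument of `nn.Linear` in `fc1` has to be recomputed the same way.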

In [None]:
mlflow.set_tracking_uri("http://mlflow:5001")
try:
    #! Create an experiment with the name `MLflow Image Classifier Demo`. Save the experiment_id in the variable `exp_id`.
    exp_id = mlflow.create_experiment(name="MLflow Image Classifier Demo")
except mlflow.exceptions.RestException:
    # The experiment already exists, so look up its ID instead.
    exp_id = mlflow.get_experiment_by_name(name="MLflow Image Classifier Demo").experiment_id


In [None]:
# Train the Model
with mlflow.start_run(experiment_id=exp_id):
    #! Log the variables `batch_size`, `learning_rate`, and `epochs` as parameters
    mlflow.log_params({
        "batch_size": batch_size,
        "learning_rate": learning_rate,
        "epochs": epochs
    })

    trainer = pl.Trainer(max_epochs=epochs)
    trainer.fit(model, data_module)
    
    model_path = "model.pth"
    torch.save(model.state_dict(), model_path)
    
    #! Log the saved model as an artifact 
    mlflow.log_artifact(model_path)
    
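    #! Log the trained model with `mlflow.pytorch.log_model`
    mlflow.pytorch.log_model(model, "model")

    #! Create the confusion matrix and log the saved image as an artifact
    create_confusion_matrix("confusion_matrix.png", model, data_module.test_dataloader())
    mlflow.log_artifact("confusion_matrix.png", "Confusion Matrix")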

### Part 3: Additional Tasks
These tasks extend the exercise by logging the trained model with `mlflow.pytorch.log_model` and by creating and logging a confusion matrix.
1. Log the trained model using `mlflow.pytorch.log_model`.
2. Insert a call to the function `create_confusion_matrix`, defined below, into the code block above. Execute the cell that defines the function first. The saved confusion matrix should be logged as an artifact.
3. Copy the cell above. Delete all MLflow logging calls and call `mlflow.pytorch.autolog()` before you start an MLflow run. Look at the results on the MLflow tracking server.

In [None]:
def create_confusion_matrix(filename, model, testloader):
    y_pred = []
    y_true = []
    
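    # Switch to inference mode; gradients are not needed for evaluation
    model.eval()
    with torch.no_grad():
        for inputs, labels in testloader:
            output = model(inputs)

            # The predicted class is the index of the largest logit
            y_pred.extend(torch.argmax(output, dim=1).cpu().numpy())
            y_true.extend(labels.cpu().numpy())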

    classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
        'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')

    cf_matrix = confusion_matrix(y_true, y_pred)
    df_cm = pd.DataFrame(cf_matrix / np.sum(cf_matrix, axis=1)[:, None], index = [i for i in classes],
                         columns = [i for i in classes])
    plt.figure(figsize = (12,7))
    sn.heatmap(df_cm, annot=True)
    plt.savefig(filename)
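The expression `cf_matrix / np.sum(cf_matrix, axis=1)[:, None]` divides every row of the confusion matrix by its row sum, so each cell shows the fraction of a true class that was predicted as a given class. The sketch below reproduces that step in plain Python on a made-up two-class matrix; unlike the function above, it also guards against empty rows.

```python
def row_normalize(matrix):
    """Divide each row by its sum so that every non-empty row adds up to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized


# Placeholder counts: rows are true classes, columns are predicted classes.
counts = [
    [8, 2],  # true class 0: 8 correct, 2 predicted as class 1
    [1, 9],  # true class 1: 1 predicted as class 0, 9 correct
]
rates = row_normalize(counts)
print(rates)
```

The diagonal of the normalized matrix is the per-class recall, which is what the heatmap's diagonal cells display.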

In [None]:
# Train the Model
mlflow.pytorch.autolog()
with mlflow.start_run(experiment_id=exp_id):
    trainer = pl.Trainer(max_epochs=epochs)
    trainer.fit(model, data_module)