# Autologging in MLflow

MLflow's autologging automatically logs parameters, metrics, models, and artifacts for supported machine learning libraries. This makes tracking experiments easier and more efficient, without the need to manually log each item.

In this notebook, we will:
- Load and preprocess the California housing dataset.
- Train different types of models using multiple machine learning libraries.
- Use MLflow's autologging feature to track the experiments for each model.
- Log all models under a single MLflow experiment.

In [1]:
import mlflow
import mlflow.sklearn
import mlflow.statsmodels
import mlflow.keras
import mlflow.pytorch
import mlflow.xgboost
import mlflow.catboost

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, Ridge, Lasso
import statsmodels.api as sm
import xgboost as xgb

from keras.models import Sequential
from keras.layers import Dense
import tensorflow as tf

import pytorch_lightning as pl
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

import warnings
warnings.filterwarnings('ignore')

import logging
logging.getLogger('mlflow').setLevel(logging.ERROR)
logging.getLogger('tensorflow').setLevel(logging.ERROR)

In [2]:
# Load the California housing dataset
housing = fetch_california_housing()
X = housing.data
y = housing.target

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

### Setting up the experiment

We will start by setting up a new MLflow experiment where all runs will be logged. If the experiment does not exist, it will be created.

In [3]:
# Set up the experiment
mlflow.set_experiment("California Housing Regression Autologging")

<Experiment: artifact_location='file:///C:/Users/israe/Documents/Codes/Notebooks/mlruns/328990853911858228', creation_time=1724614767818, experiment_id='328990853911858228', last_update_time=1724614767818, lifecycle_stage='active', name='California Housing Regression Autologging', tags={}>

### Autologging `statsmodels` models

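First, we will demonstrate how to autolog models using MLflow's `statsmodels` autologging. We will train three regression models (OLS, WLS, and GLS) with `statsmodels` and log them with autologging. Because no weights or error covariance are passed, WLS and GLS produce the same fit as OLS here; they serve only to generate three separate runs. To use autologging with `statsmodels`, we simply need to call `mlflow.statsmodels.autolog()`.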

In [4]:
# Enable autologging for statsmodels
mlflow.statsmodels.autolog()

# Define and train models with autologging
model_classes = {'OLS': sm.OLS, 'WLS': sm.WLS, 'GLS': sm.GLS}

for model_type, model_class in model_classes.items():
    # The context manager ends the run even if fitting raises an error
    with mlflow.start_run(run_name=f"Statsmodels-{model_type}"):
        model = model_class(y_train, X_train).fit()

        # Predict on the test set
        y_pred = model.predict(X_test)

**Logged features**

When autologging is enabled for `statsmodels`, MLflow automatically logs the following information:

- **Model parameters**: Logs the parameters passed during model instantiation. If defaults are used, these are also recorded.
- **Model evaluation metrics**: Automatically logs fit statistics such as MSE and R-squared. To track other metrics, we calculate them ourselves and log them with `mlflow.log_metric()`.
- **Artifacts**:
  - **Model file**: The trained model is logged as an artifact. By default, the model is serialized using the `statsmodels` format.
  - **Model summary**: The model's summary is saved as a `.txt` file artifact, which includes key statistics like coefficients, R-squared values, p-values, etc.
- **Tags**: While MLflow adds basic tags like `mlflow.source.type` and `mlflow.source.name`, MLflow does not set additional default tags for `statsmodels`. We can manually add more tags to further categorize or describe the run.
- **Datasets**: The actual datasets (`X_train`, `y_train`) and their shapes are not logged by default. To log them, we save the dataset to a file and log that file with `mlflow.log_artifact()`.


### Autologging `sklearn` models

Next, we'll train three different regression models using Scikit-learn and log them with MLflow's `sklearn` autologging. To use autologging with Scikit-learn, we simply call `mlflow.sklearn.autolog()`.

In [5]:
# Enable autologging for scikit-learn
mlflow.sklearn.autolog()

# Define models
models = {
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(),
    "Lasso": Lasso()
}

# Train models with autologging
for model_name, model in models.items():
    # The context manager ends the run even if training raises an error
    with mlflow.start_run(run_name=f"Sklearn-{model_name}"):
        # Train the model
        model.fit(X_train, y_train)

        # Predict on the test set
        y_pred = model.predict(X_test)

**Logged features**

When autologging is enabled for `scikit-learn` models, MLflow automatically logs the following:

- **Model parameters**: Logs the hyperparameters used for the models, such as `alpha` for Ridge and Lasso. If default values are used, these are also recorded.
- **Model evaluation metrics**: Automatically logs common regression metrics such as MSE, MAE, and R-squared, computed on the training data passed to `fit()`. Metrics on held-out data, or any additional metrics, can be logged manually using `mlflow.log_metric()`.
- **Artifacts**:
  - **Model file**: The trained model is logged as an artifact. By default, it’s pickled (MLflow's scikit-learn flavor uses `cloudpickle`).
- **Tags**: MLflow adds some default tags, such as `mlflow.source.type` and `mlflow.source.name`. It also sets additional default tags for `sklearn`, such as `estimator_class` and `estimator_name`. Custom tags can be added manually to further describe or categorize the run.
- **Datasets**: The datasets themselves (`X_train`, `y_train`) are not logged automatically, but their metadata, such as input shapes and sizes, is captured. If we need to log the datasets, we can do so manually using `mlflow.log_artifact()`.

### Autologging `keras` models
In this section, we will train three regression models using Keras and log them with MLflow's `keras` autologging. To use autologging with Keras, we simply call `mlflow.keras.autolog()`.

In [6]:
# Enable autologging for Keras
mlflow.keras.autolog()

# Define a function to create a Keras model
def create_keras_model(units=64, activation='relu'):
    model = Sequential([
        Dense(units, activation=activation, input_shape=(X_train.shape[1],)),
        Dense(units, activation=activation),
        Dense(1)
    ])
    model.compile(optimizer='adam', loss='mse', metrics=['mae'])
    return model

# Train models with different configurations
for units in [32, 64, 128]:
    # The context manager ends the run even if training raises an error
    with mlflow.start_run(run_name=f"Keras-{units}units"):
        model = create_keras_model(units=units)
        model.fit(X_train, y_train, epochs=10, validation_split=0.2, verbose=0)



**Logged features**

When autologging is enabled for `keras` models, MLflow automatically logs the following:

- **Model parameters**: Logs the parameters used to train the model, such as the number of epochs and the batch size.
- **Model evaluation metrics**: Automatically logs the training and validation metrics that we defined in `model.compile`, such as the loss and any additional metrics. These metrics are logged for each epoch.
- **Artifacts**:
  - **Model file**: The trained model is logged as an artifact. The default format in which Keras models are saved when using TensorFlow 2.x (and MLflow's autologging) is the SavedModel format (`saved_model.pb`). This is the most comprehensive format, capturing all the details necessary for restoring the model completely and deploying it in production. If we explicitly save our model using `model.save('model.h5')`, it is saved in the HDF5 format instead. Newer versions of Keras recommend the native `.keras` format.
  - **Model summary**: The model's summary is saved as a `.txt` file artifact, which includes an overview of the model architecture.
  - **TensorBoard logs**: If TensorBoard is used during training, the logs are automatically captured and stored as artifacts. These logs contain detailed information about the training process, such as loss and metric curves, histograms of weights and biases, and other debugging data.
  
- **Tags**: While MLflow adds basic tags like `mlflow.source.type` and `mlflow.source.name`, MLflow does not set additional default tags for `keras`. We can manually add more tags to further categorize or describe the run.
- **Datasets**: The datasets themselves (`X_train`, `y_train`) are not logged automatically, but metadata such as input shapes and sizes are captured. If we need to log the datasets, we can manually do so using `mlflow.log_artifact()`.


### Autologging `pytorch` models
MLflow's autologging for PyTorch is designed to work with **PyTorch Lightning**, a high-level library that abstracts away much of the boilerplate code involved in training PyTorch models. This means that if we are using plain PyTorch, MLflow's `mlflow.pytorch.autolog()` won't capture the logs automatically. Instead, autologging is triggered when using PyTorch Lightning's `Trainer` class.

For plain PyTorch, we need to manually log model parameters, evaluation metrics, and other artifacts using `mlflow.log_metric()`, `mlflow.log_params()`, and `mlflow.log_artifact()`.

In [7]:
# Define PyTorch Lightning model
class LitModel(pl.LightningModule):
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.fc3 = nn.Linear(hidden_dim, 1)
        self.criterion = nn.MSELoss()

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)

    def training_step(self, batch, batch_idx):
        X_batch, y_batch = batch
        # Drop only the trailing dimension so the shape matches y_batch
        y_pred = self(X_batch).squeeze(-1)
        loss = self.criterion(y_pred, y_batch)
        return loss

    def configure_optimizers(self):
        # Create the optimizer here, where Lightning expects it, rather than in __init__
        return optim.Adam(self.parameters(), lr=0.001)

# Enable autologging for PyTorch Lightning
mlflow.pytorch.autolog()

# Prepare data
train_dataset = TensorDataset(torch.tensor(X_train, dtype=torch.float32), torch.tensor(y_train, dtype=torch.float32))
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_dataset = TensorDataset(torch.tensor(X_test, dtype=torch.float32), torch.tensor(y_test, dtype=torch.float32))
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

# Train models with different hidden layer sizes
for hidden_dim in [32, 64, 128]:
    # The context manager ends the run even if training raises an error
    with mlflow.start_run(run_name=f"Pytorch-{hidden_dim}hidden"):
        model = LitModel(input_dim=X_train.shape[1], hidden_dim=hidden_dim)

        trainer = pl.Trainer(max_epochs=10)
        trainer.fit(model, train_loader)

        # Predict on the test set
        model.eval()  # Set the model to evaluation mode
        all_preds = []
        with torch.no_grad():  # Turn off gradient computation
            for X_batch, _ in test_loader:
                all_preds.append(model(X_batch).squeeze(-1))
        y_pred = torch.cat(all_preds)  # Combine the per-batch predictions

INFO:pytorch_lightning.utilities.rank_zero:GPU available: False, used: False
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.callbacks.model_summary:
  | Name      | Type    | Params | Mode 
----------------------------------------------
0 | fc1       | Linear  | 288    | train
1 | fc2       | Linear  | 1.1 K  | train
2 | fc3       | Linear  | 33     | train
3 | criterion | MSELoss | 0      | train
----------------------------------------------
1.4 K     Trainable params
0         Non-trainable params
1.4 K     Total params
0.006     Total estimated model params size (MB)
4         Modules in train mode
0         Modules in eval mode



INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=10` reached.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: False, used: False
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.callbacks.model_summary:
  | Name      | Type    | Params | Mode 
----------------------------------------------
0 | fc1       | Linear  | 576    | train
1 | fc2       | Linear  | 4.2 K  | train
2 | fc3       | Linear  | 65     | train
3 | criterion | MSELoss | 0      | train
----------------------------------------------
4.8 K     Trainable params
0         Non-trainable params
4.8 K     Total params
0.019     Total estimated model params size (MB)
4         Modules in train mode
0         Modules in eval mode



INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=10` reached.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: False, used: False
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.callbacks.model_summary:
  | Name      | Type    | Params | Mode 
----------------------------------------------
0 | fc1       | Linear  | 1.2 K  | train
1 | fc2       | Linear  | 16.5 K | train
2 | fc3       | Linear  | 129    | train
3 | criterion | MSELoss | 0      | train
----------------------------------------------
17.8 K    Trainable params
0         Non-trainable params
17.8 K    Total params
0.071     Total estimated model params size (MB)
4         Modules in train mode
0         Modules in eval mode



INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=10` reached.
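
The parameter counts in the model summaries above can be checked by hand: a fully connected layer has `in_features * out_features` weights plus `out_features` biases. The short sketch below reproduces the numbers for the three hidden sizes, assuming the 8 input features of the California housing dataset; the helper `linear_params` is a name introduced only for this illustration.

```python
def linear_params(in_features, out_features):
    """Number of weights plus biases in a fully connected layer."""
    return in_features * out_features + out_features


input_dim = 8  # number of features in the California housing data
totals = {}
for hidden_dim in [32, 64, 128]:
    fc1 = linear_params(input_dim, hidden_dim)
    fc2 = linear_params(hidden_dim, hidden_dim)
    fc3 = linear_params(hidden_dim, 1)
    totals[hidden_dim] = fc1 + fc2 + fc3
    print(hidden_dim, fc1, fc2, fc3, totals[hidden_dim])

# prints:
# 32 288 1056 33 1377
# 64 576 4160 65 4801
# 128 1152 16512 129 17793
```

These totals match the rounded values in the summaries (1.4 K, 4.8 K, and 17.8 K trainable parameters).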


**Logged features**

When using MLflow's autologging with PyTorch Lightning, the following features are automatically logged:

- **Model parameters**: Logs hyperparameters used for training the model, such as `lr`, `epochs`, etc.
- **Model evaluation metrics**: `mlflow.pytorch.autolog()` only records metrics that the `LightningModule` reports during training, for example through `self.log()`. The model above never reports its loss, so no metrics are logged for these runs. Test-set metrics have to be calculated after prediction and logged manually with `mlflow.log_metric()`.
- **Artifacts**:
  - **Model file**: The trained PyTorch model is saved using PyTorch's native format (`.pt` or `.pth`).
  - **Model summary**: The model's summary is saved as a `.txt` file artifact, which includes an overview of the model architecture.
- **Tags**: MLflow automatically logs basic tags and additional tags related to PyTorch Lightning such as `Mode`.
- **Datasets**: The actual datasets (`X_train`, `y_train`) and their shapes are not logged by default. To log them, we save the dataset to a file and log that file with `mlflow.log_artifact()`.


### Autologging `XGBoost` model

In this section, we will train a regression model using XGBoost and log the process using `mlflow.xgboost`. Autologging for XGBoost is enabled by calling `mlflow.xgboost.autolog()`.

In [8]:
# Enable autologging for XGBoost
mlflow.xgboost.autolog()

# Train XGBoost model
# The context manager ends the run even if training raises an error
with mlflow.start_run(run_name="XGBoost"):
    xgb_model = xgb.XGBRegressor()
    xgb_model.fit(X_train, y_train)

    # Predict on the test set
    y_pred = xgb_model.predict(X_test)

**Logged features**

When autologging is enabled for `xgboost` models, MLflow automatically logs the following:

- **Model parameters**: Logs all the hyperparameters used for the XGBoost model, such as `learning_rate`, `n_jobs`, `max_depth`, etc.
- **Model evaluation metrics**: Metrics are logged automatically only for evaluation sets supplied during training (for example through `eval_set`), once per boosting round. Because the model above is fitted without an evaluation set, no metrics are recorded for this run. Metrics such as RMSE or MAE that we compute after training can be logged manually using `mlflow.log_metric()`.
- **Artifacts**:
  - **Model file**: The model is saved in the XGBoost-specific binary format (`.xgb`), which includes the entire model (structure and trained parameters). This format is specific to XGBoost and is optimized for quick loading and efficient storage. 
  - **Feature importance**: A plot and text files showing the importance of each feature used in the model.
- **Tags**: MLflow adds standard tags like `mlflow.source.type` and `mlflow.source.name`. Additional tags specific to XGBoost can be logged manually if needed.
- **Datasets**: Similar to other models, the datasets themselves are not logged automatically, but metadata such as input shapes and sizes are captured. We would need to log the datasets manually if required using `mlflow.log_artifact()`.

### Recommendations
Autologging is a powerful feature that simplifies tracking experiments, but manual logging provides more control over what and how data is recorded, especially for custom needs or unsupported scenarios.

- **Use autologging when**:
  - We need automatic and comprehensive logging of model training, parameters, and metrics.
  - We are working with supported libraries (e.g., `statsmodels`, `scikit-learn`, `keras` with TensorFlow, `xgboost`, PyTorch Lightning).

- **Log manually when**:
  - We need to log custom metrics, datasets, or specific model details not covered by autologging.
  - We are using libraries or frameworks not fully supported by MLflow's autologging (e.g., plain PyTorch).