## Compare runs, choose a model, and deploy it to a REST API

### Reference: https://mlflow.org/docs/latest/getting-started/quickstart-2/index.html

![MLflow Tracking overview](https://mlflow.org/docs/latest/_images/quickstart_tracking_overview.png)

* Run a hyperparameter sweep on a training script
* Compare the results of the runs in the MLflow UI
* Choose the best run and register it as a model
* Deploy the model to a REST API
* Build a container image suitable for deployment to a cloud platform

As an ML Engineer or MLOps professional, you can use MLflow to compare, share, and deploy the best models produced by the team. In this quickstart, you will use the MLflow Tracking UI to compare the results of a hyperparameter sweep, choose the best run, and register it as a model. Then, you will deploy the model to a REST API. Finally, you will create a Docker container image suitable for deployment to a cloud platform.

In [1]:
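# Set up the tracking URI. `!export` runs in a temporary subshell, so the variable
# never reaches the kernel (and `export` does not exist on Windows).
# Setting it through os.environ works on every platform.
import os

os.environ["MLFLOW_TRACKING_URI"] = "http://localhost:5000"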


In [2]:
# install the required packages
!pip install -q hyperopt

In [2]:
import keras
import numpy as np
import pandas as pd
from hyperopt import STATUS_OK, Trials, fmin, hp, tpe
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

import mlflow
from mlflow.models import infer_signature




#### Now load the dataset and split it into training, validation, and test sets.

In [3]:
mlflow.set_tracking_uri("http://127.0.0.1:5000")
mlflow.set_experiment("wine-quality")

2024/04/19 15:53:57 INFO mlflow.tracking.fluent: Experiment with name 'wine-quality' does not exist. Creating a new experiment.


<Experiment: artifact_location='mlflow-artifacts:/726349895697917117', creation_time=1713522237995, experiment_id='726349895697917117', last_update_time=1713522237995, lifecycle_stage='active', name='wine-quality', tags={}>

In [4]:
last_run = mlflow.last_active_run()
last_run

In [5]:
# Retrieve the run, including dataset information
if last_run is not None:
    run = mlflow.get_run(last_run.info.run_id)
    dataset_info = run.inputs.dataset_inputs[0].dataset
    print(f"Dataset name: {dataset_info.name}")
    print(f"Dataset digest: {dataset_info.digest}")
    print(f"Dataset profile: {dataset_info.profile}")
    print(f"Dataset schema: {dataset_info.schema}")

In [6]:
if last_run is None:
    print("Reading from URI.")
    source_uri = "https://raw.githubusercontent.com/mlflow/mlflow/master/tests/datasets/winequality-white.csv"
    # Load the dataset directly from the source URL
    data = pd.read_csv(source_uri, sep=";")
    mlflow_ds = mlflow.data.from_pandas(data, source=source_uri)
else:
    print("Reading the dataset used by the previous run.")
    # Load the dataset's source, which downloads the content from the source URL to the local filesystem
    dataset_source = mlflow.data.get_source(dataset_info)
    data = pd.read_csv(dataset_source.load(), sep=";")
    mlflow_ds = mlflow.data.from_pandas(data, source=dataset_source.url)

Reading from URI.




In [7]:

# Split the data into training, validation, and test sets
train, test = train_test_split(data, test_size=0.25, random_state=42)
train_x = train.drop(["quality"], axis=1).values
train_y = train[["quality"]].values.ravel()
test_x = test.drop(["quality"], axis=1).values
test_y = test[["quality"]].values.ravel()
train_x, valid_x, train_y, valid_y = train_test_split(
    train_x, train_y, test_size=0.2, random_state=42
)
signature = infer_signature(train_x, train_y)
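
The two `train_test_split` calls above leave roughly 60% of the rows for training, 15% for validation, and 25% for testing. The sketch below reproduces the arithmetic with the standard library only. It assumes the CSV has 4898 rows and that a fractional test size is rounded up, with the remainder going to training; both are assumptions, not values read from the notebook. Under those assumptions the batch counts agree with the `46` and `12` steps shown in the Keras progress output of the sweep.

```python
import math


def split_sizes(n_samples: int, test_fraction: float) -> tuple[int, int]:
    """Return (n_train, n_test), rounding the test count up (assumed rule)."""
    n_test = math.ceil(n_samples * test_fraction)
    return n_samples - n_test, n_test


N_ROWS = 4898  # assumed row count of winequality-white.csv
BATCH_SIZE = 64  # batch size used by train_model

# First split: hold out 25% for testing.
n_trainval, n_test = split_sizes(N_ROWS, 0.25)
# Second split: hold out 20% of the remainder for validation.
n_train, n_valid = split_sizes(n_trainval, 0.2)

train_steps = math.ceil(n_train / BATCH_SIZE)
valid_steps = math.ceil(n_valid / BATCH_SIZE)

print(f"train={n_train}, valid={n_valid}, test={n_test}")
print(f"steps per epoch: train={train_steps}, valid={valid_steps}")
```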

In [8]:
signature

inputs: 
  [Tensor('float64', (-1, 11))]
outputs: 
  [Tensor('int64', (-1,))]
params: 
  None
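
In the signature above, `-1` marks a dimension of variable size, so `(-1, 11)` reads as "any number of rows, each with 11 features". The helper below is a hypothetical illustration of that convention, not part of MLflow, and only shows which input shapes would fit the spec.

```python
def shape_matches(actual: tuple[int, ...], spec: tuple[int, ...]) -> bool:
    """Return True if `actual` fits `spec`; a -1 in `spec` matches any size."""
    if len(actual) != len(spec):
        return False
    return all(s in (-1, a) for a, s in zip(actual, spec))


INPUT_SPEC = (-1, 11)  # taken from the inferred signature's input tensor

print(shape_matches((1, 11), INPUT_SPEC))   # a single row of 11 features
print(shape_matches((64, 11), INPUT_SPEC))  # a batch of 64 rows
print(shape_matches((64, 10), INPUT_SPEC))  # one feature missing
```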

### Then let’s define the model architecture and train the model. The `train_model` function uses MLflow to track the parameters, the results, and the model itself for each trial as a child run.

In [10]:
def train_model(params, epochs, train_x, train_y, valid_x, valid_y, test_x, test_y):
    # Define model architecture
    model = keras.Sequential(
        [
            keras.Input([train_x.shape[1]]),
            keras.layers.Normalization(mean=np.mean(train_x), variance=np.var(train_x)),
            keras.layers.Dense(64, activation="relu"),
            keras.layers.Dense(1),
        ]
    )

    # Compile model
    model.compile(
        optimizer=keras.optimizers.SGD(
            learning_rate=params["lr"], momentum=params["momentum"]
        ),
        loss="mean_squared_error",
        metrics=[keras.metrics.RootMeanSquaredError()],
    )

    # Train model with MLflow tracking
    with mlflow.start_run(nested=True):
        model.fit(
            train_x,
            train_y,
            validation_data=(valid_x, valid_y),
            epochs=epochs,
            batch_size=64,
        )
        # Evaluate the model
        eval_result = model.evaluate(valid_x, valid_y, batch_size=64)
        eval_rmse = eval_result[1]

        # Log parameters and results
        mlflow.log_params(params)
        mlflow.log_metric("eval_rmse", eval_rmse)

        # Log model
        mlflow.tensorflow.log_model(model, "model", signature=signature)

        return {"loss": eval_rmse, "status": STATUS_OK, "model": model}
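
Because the model is compiled with `loss="mean_squared_error"` and a single `RootMeanSquaredError` metric, `model.evaluate` returns the loss first and the RMSE second, which is why `eval_result[1]` is logged as `eval_rmse`. As a quick sanity check, the sketch below takes two `(loss, root_mean_squared_error)` pairs copied from single-batch progress lines in this notebook's output and confirms that the metric is the square root of the loss. The helper name is only for illustration.

```python
import math


def rmse_from_mse(mse: float) -> float:
    """RMSE is the square root of the mean squared error."""
    return math.sqrt(mse)


# (loss, root_mean_squared_error) pairs copied from progress lines in the output
logged_pairs = [(36.7620, 6.0632), (0.7472, 0.8644)]

for mse, logged_rmse in logged_pairs:
    computed = rmse_from_mse(mse)
    print(f"mse={mse:.4f} -> rmse={computed:.4f} (logged {logged_rmse:.4f})")
```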


# TASKS 1, 2, 3

*Task1 and Task2 are done inline with other tasks*

## Creating entries in MLflow


1. Modify the code to log metrics (loss, accuracy, etc.). You may use `mlflow.autolog()` and compare it with `mlflow.log_metrics()`. [10 pts]
2. Modify the code to log parameters (network configuration, learning rate, optimizer, regularization, etc.). You may use `mlflow.autolog()` and compare it with `mlflow.log_param()`. [10 pts]
3. Use the `with mlflow.start_run()` construct to run the model build for each configuration. This creates an entry in MLflow. Each time you rerun the code block, a new entry is created in MLflow. [15 pts]

### The `objective` function takes in the hyperparameters and returns the results of the train_model function for that set of hyperparameters.

In [11]:
def objective(params):
    # MLflow will track the parameters and results for each run
    result = train_model(
        params,
        epochs=3,
        train_x=train_x,
        train_y=train_y,
        valid_x=valid_x,
        valid_y=valid_y,
        test_x=test_x,
        test_y=test_y,
    )
    return result


Let's define the search space for `Hyperopt`. Here we search over the learning rate (`lr`) and `momentum`. The search space defines the range and distribution of possible values for each hyperparameter. Hyperopt starts by sampling an initial set of hyperparameters from this space, typically at random. After evaluating those, it uses the results to update its probabilistic model, which guides the selection of later hyperparameter sets toward better-performing values.

In [12]:
space = {
    "lr": hp.loguniform("lr", np.log(1e-6), np.log(1e-1)),
    "momentum": hp.uniform("momentum", 0.0, 1.0),
}
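
`hp.loguniform` draws values whose logarithm is uniformly distributed, so every order of magnitude between `1e-6` and `1e-1` is about equally likely to be tried. The sketch below imitates that behavior with the standard library only. `sample_loguniform` is a hypothetical helper for illustration and is not how Hyperopt is implemented internally.

```python
import math
import random
from collections import Counter


def sample_loguniform(low: float, high: float, rng: random.Random) -> float:
    """Draw x such that log(x) is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
learning_rates = [sample_loguniform(1e-6, 1e-1, rng) for _ in range(1000)]

# Count how many samples fall into each decade (1e-6..1e-5, 1e-5..1e-4, ...).
per_decade = Counter(math.floor(math.log10(lr)) for lr in learning_rates)
for exponent in sorted(per_decade):
    print(f"1e{exponent} to 1e{exponent + 1}: {per_decade[exponent]} samples")
```

A plain uniform draw over the same interval would place about 90% of the samples between `1e-2` and `1e-1`, which is why a log scale is the usual choice for learning rates.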


Finally, we will run the hyperparameter sweep using Hyperopt, passing in the objective function and search space. Hyperopt will try different hyperparameter combinations and return the results of the best one. We will store the best parameters, model, and evaluation metrics in MLflow.

In [13]:
with mlflow.start_run():
    # log the input artifact
    mlflow.log_input(mlflow_ds, "training")
        
    # Conduct the hyperparameter search using Hyperopt
    trials = Trials()
    best = fmin(
        fn=objective,
        space=space,
        algo=tpe.suggest,
        max_evals=8,
        trials=trials,
    )

    # Fetch the details of the best run
    best_run = sorted(trials.results, key=lambda x: x["loss"])[0]

    # Log the best parameters, loss, and model
    mlflow.log_params(best)
    mlflow.log_metric("eval_rmse", best_run["loss"])
    mlflow.tensorflow.log_model(best_run["model"], "model", signature=signature)

    # Print out the best parameters and corresponding loss
    print(f"Best parameters: {best}")
    print(f"Best eval rmse: {best_run['loss']}")
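
The best trial is simply the entry in `trials.results` with the smallest `loss`. The sketch below shows that selection on a small list of hypothetical result dictionaries. Two of the loss values are the "best loss" figures from this notebook's output, the third is made up, and the `"ok"` status string is assumed to match `STATUS_OK`. Using `min` gives the same entry as sorting and taking the first element, without sorting the whole list.

```python
# Hypothetical trial results shaped like the dictionaries train_model returns
# (the "model" entry is omitted to keep the sketch self-contained).
results = [
    {"loss": 0.8881410956382751, "status": "ok"},
    {"loss": 5.7, "status": "ok"},  # made-up value for a poor trial
    {"loss": 0.8580485582351685, "status": "ok"},
]

# Approach used in the notebook: sort by loss and take the first entry.
best_by_sort = sorted(results, key=lambda r: r["loss"])[0]

# Equivalent single pass over the results.
best_by_min = min(results, key=lambda r: r["loss"])

print(f"Best eval rmse: {best_by_min['loss']}")
```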


  0%|          | 0/8 [00:00<?, ?trial/s, best loss=?]




Epoch 1/3                                            


 1/46 [..............................] - ETA: 1:13 - loss: 36.7620 - root_mean_squared_error: 6.0632

Epoch 2/3                                            

 1/46 [..............................] - ETA: 0s - loss: 1.3344 - root_mean_squared_error: 1.1551

Epoch 3/3                                            

 1/46 [..............................] - ETA: 0s - loss: 0.7585 - root_mean_squared_error: 0.8709

 1/12 [=>............................] - ETA: 0s - loss: 0.7472 - root_mean_squared_error: 0.8644

  0%|          | 0/8 [00:06<?, ?trial/s, best loss=?]INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmp59l3k0pl\model\data\model\assets


INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmp59l3k0pl\model\data\model\assets




Epoch 1/3                                                                      

 1/46 [..............................] - ETA: 58s - loss: 40.1112 - root_mean_squared_error: 6.3333
 9/46 [====>.........................] - ETA: 0s - loss: 17.9638 - root_mean_squared_error: 4.2384 

Epoch 2/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 2.6824 - root_mean_squared_error: 1.6378

Epoch 3/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 1.2126 - root_mean_squared_error: 1.1012

 1/12 [=>............................] - ETA: 0s - loss: 1.0527 - root_mean_squared_error: 1.0260

 12%|█▎        | 1/8 [00:46<04:54, 42.12s/trial, best loss: 0.8881410956382751]INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmp_k26yfe7\model\data\model\assets


INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmp_k26yfe7\model\data\model\assets



Epoch 1/3                                                                      

 1/46 [..............................] - ETA: 1:00 - loss: 34.6053 - root_mean_squared_error: 5.8826

Epoch 2/3                                                                      

 1/46 [..............................] - ETA: 1s - loss: 2.9523 - root_mean_squared_error: 1.7182
 6/46 [==>...........................] - ETA: 0s - loss: 3.6334 - root_mean_squared_error: 1.9062
 7/46 [===>..........................] - ETA: 0s - loss: 3.6192 - root_mean_squared_error: 1.9024

Epoch 3/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 3.3673 - root_mean_squared_error: 1.8350
 4/46 [=>............................] - ETA: 0s - loss: 3.0722 - root_mean_squared_error: 1.7528
 8/46 [====>.........................] - ETA: 0s - loss: 3.1067 - root_mean_squared_error: 1.7626

 1/12 [=>............................] - ETA: 0s - loss: 2.5540 -

INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmphf98dl2n\model\data\model\assets



Epoch 1/3                                                                      

 1/46 [..............................] - ETA: 45s - loss: 31.9179 - root_mean_squared_error: 5.6496

Epoch 2/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 31.9427 - root_mean_squared_error: 5.6518

Epoch 3/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 29.9748 - root_mean_squared_error: 5.4749

 1/12 [=>............................] - ETA: 0s - loss: 32.5349 - root_mean_squared_error: 5.7039

 38%|███▊      | 3/8 [01:44<02:39, 31.96s/trial, best loss: 0.8881410956382751]INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmpx1hhg5im\model\data\model\assets


INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmpx1hhg5im\model\data\model\assets



Epoch 1/3                                                                      

 1/46 [..............................] - ETA: 39s - loss: 37.7177 - root_mean_squared_error: 6.1415

Epoch 2/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 32.8692 - root_mean_squared_error: 5.7332

Epoch 3/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 22.9761 - root_mean_squared_error: 4.7933

 1/12 [=>............................] - ETA: 0s - loss: 19.7697 - root_mean_squared_error: 4.4463

 50%|█████     | 4/8 [02:07<01:54, 28.56s/trial, best loss: 0.8881410956382751]INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmpnnn6lwqy\model\data\model\assets


INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmpnnn6lwqy\model\data\model\assets



Epoch 1/3                                                                      

 1/46 [..............................] - ETA: 39s - loss: 36.0115 - root_mean_squared_error: 6.0010

Epoch 2/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 2.6582 - root_mean_squared_error: 1.6304

Epoch 3/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 1.9613 - root_mean_squared_error: 1.4005

 1/12 [=>............................] - ETA: 0s - loss: 2.2294 - root_mean_squared_error: 1.4931

 62%|██████▎   | 5/8 [02:28<01:17, 25.93s/trial, best loss: 0.8881410956382751]INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmpqz386tyj\model\data\model\assets


INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmpqz386tyj\model\data\model\assets



Epoch 1/3                                                                      

 1/46 [..............................] - ETA: 36s - loss: 32.0079 - root_mean_squared_error: 5.6576

Epoch 2/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 2.0355 - root_mean_squared_error: 1.4267

Epoch 3/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 1.6968 - root_mean_squared_error: 1.3026
 4/46 [=>............................] - ETA: 0s - loss: 1.5285 - root_mean_squared_error: 1.2363

 1/12 [=>............................] - ETA: 0s - loss: 0.8745 - root_mean_squared_error: 0.9351

 75%|███████▌  | 6/8 [02:50<00:49, 24.57s/trial, best loss: 0.8881410956382751]INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmpbupjzj4t\model\data\model\assets


INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmpbupjzj4t\model\data\model\assets



Epoch 1/3                                                                      

 1/46 [..............................] - ETA: 39s - loss: 33.8419 - root_mean_squared_error: 5.8174

Epoch 2/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 0.9453 - root_mean_squared_error: 0.9723

Epoch 3/3                                                                      

 1/46 [..............................] - ETA: 0s - loss: 0.4924 - root_mean_squared_error: 0.7017

 1/12 [=>............................] - ETA: 0s - loss: 0.7197 - root_mean_squared_error: 0.8483

 88%|████████▊ | 7/8 [03:12<00:23, 23.67s/trial, best loss: 0.8881410956382751]INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmp0rx9q7ak\model\data\model\assets


INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmp0rx9q7ak\model\data\model\assets



100%|██████████| 8/8 [03:30<00:00, 26.37s/trial, best loss: 0.8580485582351685]
INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmpeccplph6\model\data\model\assets


INFO:tensorflow:Assets written to: C:\Users\Admin\AppData\Local\Temp\tmpeccplph6\model\data\model\assets



Best parameters: {'lr': 0.06466402471522194, 'momentum': 0.04412290425378629}
Best eval rmse: 0.8580485582351685


### Serve the model locally


In [17]:
# Install pyenv from https://github.com/pyenv/pyenv#installation
# Install libffi-dev, liblzma-dev, and libbz2-dev before starting the model server.
# If you come across a "_ctypes not found" error, install libffi-dev and reinstall Python:
# sudo apt-get install libffi-dev
# pyenv uninstall 3.10.12
# pyenv install 3.10.12

#### MLflow allows you to easily serve models produced by any run or model version. Once the best run has been registered as a model, you can serve it by running:

In [19]:
#!mlflow models serve -m "models:/wine-quality/1" --port 5002 --host 0.0.0.0

#### To test the model, you can send a request to the REST API using the `curl` command:

In [None]:
#!curl -d '{"dataframe_split": {"columns": ["fixed acidity","volatile acidity","citric acid","residual sugar","chlorides","free sulfur dioxide","total sulfur dioxide","density","pH","sulphates","alcohol"], "data": [[7,0.27,0.36,20.7,0.045,45,170,1.001,3,0.45,8.8]]}}' -H 'Content-Type: application/json' -X POST localhost:5002/invocations
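
The same request can be built from Python, which is handy on systems where the shell quoting in the `curl` command is awkward. This is a minimal sketch using only the standard library. `build_invocation_request` is a hypothetical helper, and the URL assumes the server from the previous cell is listening on port 5002. The actual network call is left commented out so the cell runs without a server.

```python
import json
import urllib.request

COLUMNS = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]


def build_invocation_request(rows, url="http://localhost:5002/invocations"):
    """Build a POST request carrying the rows in `dataframe_split` format."""
    payload = {"dataframe_split": {"columns": COLUMNS, "data": rows}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_invocation_request(
    [[7, 0.27, 0.36, 20.7, 0.045, 45, 170, 1.001, 3, 0.45, 8.8]]
)
print(request.get_method(), request.full_url)

# With the model server running, send the request like this:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```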

### Build a container image for your model

Most routes toward deployment will use a container to package your model, its dependencies, and relevant portions of the runtime environment. You can use MLflow to build a Docker image for your model.

In [20]:
#!mlflow models build-docker --model-uri "models:/wine-quality/1" --name "qs_mlops"

This command builds a Docker image named `qs_mlops` that contains your model and its dependencies.

In [21]:
#!docker run -p 5002:8080 qs_mlops

This `docker run` command runs the image you just built and maps port 5002 on your local machine to port 8080 in the container. You can now send requests to the model using the same `curl` command as before:

In [22]:
#!curl -d '{"dataframe_split": {"columns": ["fixed acidity","volatile acidity","citric acid","residual sugar","chlorides","free sulfur dioxide","total sulfur dioxide","density","pH","sulphates","alcohol"], "data": [[7,0.27,0.36,20.7,0.045,45,170,1.001,3,0.45,8.8]]}}' -H 'Content-Type: application/json' -X POST localhost:5002/invocations