In [None]:
# https://mlflow.org/docs/latest/getting-started/intro-quickstart/index.html

Step 1 - Get MLflow
MLflow is available on PyPI. If you don’t already have it installed on your system, you can install it with:

In [6]:
!pip install mlflow


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.3.2[0m[39;49m -> [0m[32;49m24.0[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython3.11 -m pip install --upgrade pip[0m


Step 2 - Start a Tracking Server

In [None]:
# TERMINAL
# Run a local tracking server (optional)

#mlflow server --host 127.0.0.1 --port 8080
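
Before pointing a client at the tracking server, it can help to confirm that something is listening on the chosen host and port. The snippet below is a minimal sketch using only the standard library. `is_port_open` is a hypothetical helper (not part of MLflow), and the host and port are the ones from the command above. It only checks that a TCP connection succeeds, not that the listener is actually MLflow.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


# Host and port taken from the `mlflow server` command above.
if is_port_open("127.0.0.1", 8080):
    print("Something is listening on 127.0.0.1:8080")
else:
    print("Nothing is listening on 127.0.0.1:8080; start the server first")
```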

In [None]:
# Set the tracking server uri

# section to dig into further
#import mlflow

#mlflow.set_tracking_uri(uri = "http://<host>:<port>")
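
One way to fill in the `<host>:<port>` placeholder without hard-coding it is to read the address from the `MLFLOW_TRACKING_URI` environment variable and fall back to the local server otherwise. This is only a sketch: `resolve_tracking_uri` is a hypothetical helper, and the default address is the local server from Step 2.

```python
import os


def resolve_tracking_uri(default: str = "http://127.0.0.1:8080") -> str:
    """Return the URI from MLFLOW_TRACKING_URI, or the local default."""
    uri = os.environ.get("MLFLOW_TRACKING_URI", "").strip()
    return uri or default


tracking_uri = resolve_tracking_uri()
print(tracking_uri)
# The result could then be passed to mlflow.set_tracking_uri(uri=tracking_uri).
```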

Step 3 - Train a model and prepare metadata for logging
In this section, we’re going to log a model with MLflow. A quick overview of the steps:

Load and prepare the Iris dataset for modeling.

Train a Logistic Regression model and evaluate its performance.

Prepare the model hyperparameters and calculate metrics for logging.

In [7]:
# Imports

import mlflow
from mlflow.models import infer_signature

import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

In [8]:
# Test with the Iris dataset to see how it works

In [9]:
# Load the Iris dataset
X, y = datasets.load_iris(return_X_y = True)

In [10]:
# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 42)

In [11]:
# Define the model hyperparameters
params = {
    "solver": "lbfgs",
    "max_iter": 1000,
    "multi_class": "auto",
    "random_state": 8888,
}

# Train the model
lr = LogisticRegression(**params)
lr.fit(X_train, y_train)

# Predict on the test set
y_pred = lr.predict(X_test)

# Calculate metrics
accuracy = accuracy_score(y_test, y_pred)
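
The imports above also bring in `precision_score`, `recall_score` and `f1_score`, which the cell does not use. The sketch below shows one way they could be computed next to accuracy. Because Iris has three classes, these scores need an averaging strategy; `average="macro"` is an assumption made here for illustration, as are the dictionary keys. The block repeats the data loading and training so that it runs on its own, and it leaves `multi_class` at its default.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Same data and split as above.
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Same solver settings as above.
lr = LogisticRegression(solver="lbfgs", max_iter=1000, random_state=8888)
lr.fit(X_train, y_train)
y_pred = lr.predict(X_test)

# "macro" averaging weights each of the three classes equally.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision_macro": precision_score(y_test, y_pred, average="macro"),
    "recall_macro": recall_score(y_test, y_pred, average="macro"),
    "f1_macro": f1_score(y_test, y_pred, average="macro"),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A dictionary like this could be passed to `mlflow.log_metrics` inside a run to log all values in one call.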

Step 4 - Log the model and its metadata to MLflow
In this next step, we’re going to log to MLflow the model that we trained, the hyperparameters that we specified for the model’s fit, and the accuracy metric that was calculated by evaluating the model’s performance on the test data.

The steps that we will take are:

Initiate an MLflow run context to start a new run that we will log the model and metadata to.

Log model parameters and performance metrics.

Tag the run for easy retrieval.

Register the model in the MLflow Model Registry while logging (saving) the model.

In [12]:
# Set our tracking server uri for logging
mlflow.set_tracking_uri(uri = "http://127.0.0.1:8080")

# Create a new MLflow Experiment
mlflow.set_experiment("MLflow Quickstart")

# Start an MLflow run
with mlflow.start_run():
    # Log the hyperparameters
    mlflow.log_params(params)

    # Log the accuracy metric
    mlflow.log_metric("accuracy", accuracy)

    # Set a tag that we can use to remind ourselves what this run was for
    mlflow.set_tag("Training Info", "Basic LR model for iris data")

    # Infer the model signature
    signature = infer_signature(X_train, lr.predict(X_train))

    # Log the model
    model_info = mlflow.sklearn.log_model(
        sk_model = lr,
        artifact_path = "iris_model",
        signature = signature,
        input_example = X_train,
        registered_model_name = "tracking-quickstart",
    )


2024/04/02 13:03:20 INFO mlflow.tracking.fluent: Experiment with name 'MLflow Quickstart' does not exist. Creating a new experiment.


Successfully registered model 'tracking-quickstart'.
2024/04/02 13:03:22 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: tracking-quickstart, version 1
Created version '1' of model 'tracking-quickstart'.


Step 5 - Load the model as a Python Function (pyfunc) and use it for inference
After logging the model, we can perform inference by:

Loading the model using MLflow’s pyfunc flavor.

Running predict on new data using the loaded model.

In [13]:
# Load the model back for predictions as a generic Python Function model
loaded_model = mlflow.pyfunc.load_model(model_info.model_uri)

predictions = loaded_model.predict(X_test)

iris_feature_names = datasets.load_iris().feature_names

result = pd.DataFrame(X_test, columns = iris_feature_names)
result["actual_class"] = y_test
result["predicted_class"] = predictions

result[:4]


2024/04/02 13:03:28 INFO mlflow.store.artifact.artifact_repo: The progress bar can be disabled by setting the environment variable MLFLOW_ENABLE_ARTIFACTS_PROGRESS_BAR to false


   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  actual_class  predicted_class
0                6.1               2.8                4.7               1.2             1                1
1                5.7               3.8                1.7               0.3             0                0
2                7.7               2.6                6.9               2.3             2                2
3                6.0               2.9                4.5               1.5             1                1
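
With actual and predicted classes side by side, a quick check is to flag the rows where they differ. The sketch below rebuilds the four rows shown above by hand so that it runs on its own; in the notebook the same two lines would apply directly to `result`. The `correct` column name is an assumption introduced for illustration.

```python
import pandas as pd

# The four rows shown in the output above, rebuilt by hand for illustration.
result = pd.DataFrame(
    {
        "sepal length (cm)": [6.1, 5.7, 7.7, 6.0],
        "sepal width (cm)": [2.8, 3.8, 2.6, 2.9],
        "petal length (cm)": [4.7, 1.7, 6.9, 4.5],
        "petal width (cm)": [1.2, 0.3, 2.3, 1.5],
        "actual_class": [1, 0, 2, 1],
        "predicted_class": [1, 0, 2, 1],
    }
)

# Flag rows where the prediction matches the label.
result["correct"] = result["actual_class"] == result["predicted_class"]

# Keep only the rows the model got wrong (none among these four).
misclassified = result[~result["correct"]]

print(f"accuracy on these rows: {result['correct'].mean():.2f}")
print(f"misclassified rows: {len(misclassified)}")
```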


Step 6 - View the Run in the MLflow UI
To see the results of our run, we can open the MLflow UI. Since the Tracking Server is already running at http://localhost:8080, we can simply navigate to that URL in a browser.

In [None]:
# go to "http://127.0.0.1:8080"