## Scenario 2: A cross-functional team with one data scientist working on an ML model

This scenario demonstrates how a data scientist can use MLflow to track machine learning experiments in a team setting, using a centralized tracking server. This setup is common in organizations where multiple people need to access experiment results, models, and artifacts.

### MLflow setup overview
- **Tracking server:** Yes (an `mlflow server` process that every team member points their client at)
- **Backend store:** SQLite database (stores experiment metadata in `backend.db`)
- **Artifacts store:** Local filesystem on the server machine (stores model files and other artifacts)

With this setup, all experiment runs, parameters, metrics, and artifacts are saved in a central location. Team members can explore and compare experiments in the MLflow UI, including from other machines, provided the server listens on a network-reachable interface (by default it binds only to `127.0.0.1`; see the `--host` option).

### How to use the MLflow tracking server and UI
- **First, launch the MLflow tracking server** by running the following command in your terminal before executing the cells below:
  ```bash
  mlflow server --backend-store-uri sqlite:///backend.db
  ```
- The UI will be available at the address printed in your terminal (by default, [http://localhost:5000](http://localhost:5000)).
- If you run the server on a remote machine or a different port, use the appropriate address (e.g., `http://<your-server>:<port>`).
- Use the UI to browse experiments, compare runs, and inspect logged models and artifacts.
- You can also interact with the model registry for collaborative model management.

> **Tip:** This setup suits small teams and collaborative projects. For larger teams or production use, consider a remote database (for example PostgreSQL or MySQL) as the backend store and cloud object storage (for example Amazon S3) for artifacts.
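
To make the difference between the local setup and a shared one more concrete, the following sketch assembles the two `mlflow server` command lines in Python. The `--host`, `--port`, and `--artifacts-destination` flags are existing `mlflow server` options; the helper name `mlflow_server_command`, the PostgreSQL URI, the credentials, the hostname, and the bucket name are hypothetical placeholders, not values from this tutorial.

```python
import shlex


def mlflow_server_command(backend_store_uri, artifacts_destination=None,
                          host="127.0.0.1", port=5000):
    """Assemble an `mlflow server` command line as a list of arguments."""
    cmd = [
        "mlflow", "server",
        "--backend-store-uri", backend_store_uri,
        "--host", host,
        "--port", str(port),
    ]
    if artifacts_destination is not None:
        cmd += ["--artifacts-destination", artifacts_destination]
    return cmd


# The setup used in this scenario: SQLite metadata, reachable from this machine only
local_cmd = mlflow_server_command("sqlite:///backend.db")

# A hypothetical shared setup: placeholder database URI and bucket name
shared_cmd = mlflow_server_command(
    "postgresql://mlflow_user:change-me@db.example.internal:5432/mlflow",
    artifacts_destination="s3://example-mlflow-artifacts",
    host="0.0.0.0",  # listen on all interfaces so teammates can connect
)

print(shlex.join(local_cmd))
print(shlex.join(shared_cmd))
```

Binding to `0.0.0.0` exposes the server on every network the machine is attached to, so a shared deployment is normally placed behind a firewall or reverse proxy.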

In [1]:
import mlflow

mlflow.set_tracking_uri("http://127.0.0.1:5000")

In [2]:
print(f"tracking URI: '{mlflow.get_tracking_uri()}'")

tracking URI: 'http://127.0.0.1:5000'
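
Setting the tracking URI only records the address; a connection problem would surface later, at the first call that talks to the server. A quick reachability check can catch that early. This is a minimal sketch using only the standard library, assuming the server exposes the `/health` endpoint that the open-source MLflow server provides; the function name `tracking_server_is_up` is made up for this example.

```python
import urllib.error
import urllib.request


def tracking_server_is_up(base_url, timeout=2.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status
        return False


print(tracking_server_is_up("http://127.0.0.1:5000"))
```

If this prints `False`, check that the `mlflow server` process is still running and that the host and port match the tracking URI.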


In [3]:
mlflow.search_experiments()

[<Experiment: artifact_location='mlflow-artifacts:/0', creation_time=1751191587046, experiment_id='0', last_update_time=1751191587046, lifecycle_stage='active', name='Default', tags={}>]

In [4]:
import datetime

import matplotlib.pyplot as plt
import mlflow
import numpy as np
import pandas as pd
import seaborn as sns
import sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

mlflow.set_experiment("my-experiment-1")

# Load the dataset once and reuse it for the features, labels, and class names
iris = load_iris()
X, y = iris.data, iris.target
class_names = iris.target_names

# Try several values of C to demonstrate experiment tracking
for C in [0.01, 0.1, 1, 10]:
    params = {"C": C, "random_state": 42, "max_iter": 1000, "solver": "lbfgs"}
    with mlflow.start_run() as run:
        mlflow.log_params(params)
        mlflow.log_param("sklearn_version", sklearn.__version__)
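        lr = LogisticRegression(**params).fit(X, y)
        y_pred = lr.predict(X)
        # Note: accuracy is computed on the training data, so it is an optimistic estimate
        acc = accuracy_score(y, y_pred)
        mlflow.log_metric("accuracy", acc)
        # Log the fitted coefficients, flattened row by row (3 classes x 4 features).
        # They are logged as params here, so MLflow stores their values as strings.
        for i, coef in enumerate(lr.coef_.flatten()):
            mlflow.log_param(f"coef_{i}", coef)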
        # Labeled confusion matrix as DataFrame
        cm = confusion_matrix(y, y_pred)
        cm_df = pd.DataFrame(cm, index=class_names, columns=class_names)
        cm_df.to_csv("confusion_matrix_labeled.csv")
        mlflow.log_artifact("confusion_matrix_labeled.csv")
        # Confusion matrix heatmap as image
        plt.figure(figsize=(5,4))
        sns.heatmap(cm_df, annot=True, fmt="d", cmap="Blues")
        plt.title(f"Confusion Matrix (C={C})")
        plt.ylabel("True label")
        plt.xlabel("Predicted label")
        plt.tight_layout()
        plt.savefig("confusion_matrix_heatmap.png")
        plt.close()
        mlflow.log_artifact("confusion_matrix_heatmap.png")
        # Provide input_example and use 'name' instead of deprecated 'artifact_path'
        input_example = np.expand_dims(X[0], axis=0)
        mlflow.sklearn.log_model(lr, name="models", input_example=input_example)
        # Log model type, number of classes, and timestamp as tags
        mlflow.set_tag("model_type", type(lr).__name__)
        mlflow.set_tag("n_classes", len(np.unique(y)))
        mlflow.set_tag("run_time", datetime.datetime.now().isoformat())
        mlflow.set_tag("description", "Logistic regression on Iris dataset with varying C")
        print(f"Logged run for C={C}, accuracy={acc:.3f}")

2025/06/29 10:06:36 INFO mlflow.tracking.fluent: Experiment with name 'my-experiment-1' does not exist. Creating a new experiment.


Logged run for C=0.01, accuracy=0.873
🏃 View run melodic-zebra-89 at: http://127.0.0.1:5000/#/experiments/1/runs/60350aa0096443459d01e870998b062d
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/1
Logged run for C=0.1, accuracy=0.960
🏃 View run likeable-zebra-336 at: http://127.0.0.1:5000/#/experiments/1/runs/7c60cc80407b449ea25e8c373300d049
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/1
Logged run for C=1, accuracy=0.973
🏃 View run masked-dog-634 at: http://127.0.0.1:5000/#/experiments/1/runs/8723a6a2a1784627ae29552ba64befff
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/1
Logged run for C=10, accuracy=0.980
🏃 View run respected-duck-790 at: http://127.0.0.1:5000/#/experiments/1/runs/bf94a0c925c44f69ac6fd1cb26d3046f
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/1
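
The loop above logs the coefficients under flattened names such as `coef_0` to `coef_11`, which are hard to map back to a class and a feature in the UI. One alternative, sketched here, is to build descriptive names first. The helper names `safe_name` and `named_coefficients` are hypothetical, and the numbers in the example are toy values, not fitted coefficients.

```python
import re


def safe_name(text):
    """Keep only letters and digits, joining the remaining pieces with underscores."""
    return re.sub(r"[^A-Za-z0-9]+", "_", str(text)).strip("_")


def named_coefficients(coef, class_names, feature_names):
    """Map a (n_classes, n_features) coefficient matrix to {name: float}."""
    named = {}
    for cls, row in zip(class_names, coef):
        for feat, value in zip(feature_names, row):
            named[f"coef_{safe_name(cls)}_{safe_name(feat)}"] = float(value)
    return named


# Toy 2x2 matrix, only to show the resulting keys
example = named_coefficients(
    coef=[[0.5, -1.25], [0.0, 2.0]],
    class_names=["setosa", "versicolor"],
    feature_names=["sepal length (cm)", "sepal width (cm)"],
)
for key, value in example.items():
    print(key, value)
```

Inside the run, the result could be passed to `mlflow.log_metrics(...)`, for instance with `lr.coef_`, `class_names`, and `load_iris().feature_names` as arguments. Logging the values as metrics keeps them numeric, whereas params are stored as strings.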


In [5]:
mlflow.search_experiments()

[<Experiment: artifact_location='mlflow-artifacts:/1', creation_time=1751191596295, experiment_id='1', last_update_time=1751191596295, lifecycle_stage='active', name='my-experiment-1', tags={}>,
 <Experiment: artifact_location='mlflow-artifacts:/0', creation_time=1751191587046, experiment_id='0', last_update_time=1751191587046, lifecycle_stage='active', name='Default', tags={}>]
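
Besides listing experiments, the runs themselves can be compared in code rather than in the UI. The sketch below wraps `mlflow.search_runs`, which by default returns a pandas DataFrame with columns named `params.<name>` and `metrics.<name>`. The helper names `order_clause` and `compare_runs` are made up for this example, and MLflow is imported inside the function so that the snippet can be loaded without contacting a server.

```python
def order_clause(metric, descending=True):
    """Build an order_by entry such as 'metrics.accuracy DESC'."""
    direction = "DESC" if descending else "ASC"
    return f"metrics.{metric} {direction}"


def compare_runs(experiment_name, metric="accuracy", max_results=10):
    """Return the best runs of an experiment as a small DataFrame."""
    import mlflow  # imported lazily; requires a reachable tracking server when called

    runs = mlflow.search_runs(
        experiment_names=[experiment_name],
        order_by=[order_clause(metric)],
        max_results=max_results,
    )
    # Keep only the columns of interest, skipping any that were never logged
    wanted = ["run_id", "params.C", f"metrics.{metric}"]
    return runs[[column for column in wanted if column in runs.columns]]


print(order_clause("accuracy"))
```

After `mlflow.set_tracking_uri(...)` has been called, `compare_runs("my-experiment-1")` would list the four runs logged above, sorted from the highest to the lowest accuracy.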

### Interacting with the model registry

In [6]:
from mlflow.tracking import MlflowClient

client = MlflowClient(tracking_uri="http://127.0.0.1:5000")

In [7]:
client.search_registered_models()

[]

In [8]:
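# Register the run with the highest accuracy rather than relying on the default ordering
experiment_id = client.get_experiment_by_name("my-experiment-1").experiment_id
best_run = client.search_runs(
    experiment_ids=[experiment_id], order_by=["metrics.accuracy DESC"], max_results=1
)[0]
run_id = best_run.info.run_id
mlflow.register_model(
    model_uri=f"runs:/{run_id}/models",
    name="iris-classifier",
)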

Successfully registered model 'iris-classifier'.


2025/06/29 10:06:48 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: iris-classifier, version 1
Created version '1' of model 'iris-classifier'.


<ModelVersion: aliases=[], creation_timestamp=1751191608717, current_stage='None', deployment_job_state=<ModelVersionDeploymentJobState: current_task_name='', job_id='', job_state='DEPLOYMENT_JOB_CONNECTION_STATE_UNSPECIFIED', run_id='', run_state='DEPLOYMENT_JOB_RUN_STATE_UNSPECIFIED'>, description='', last_updated_timestamp=1751191608717, metrics=None, model_id=None, name='iris-classifier', params=None, run_id='bf94a0c925c44f69ac6fd1cb26d3046f', run_link='', source='models:/m-e4c3fea03d0a48dc9a99e585f54eeff5', status='READY', status_message=None, tags={}, user_id='', version='1'>
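
Once a version is registered, teammates can refer to it by a `models:/` URI instead of a run ID. The sketch below shows the two URI forms (by version number and by alias) and how an alias might be assigned before loading the model. The helper names `registry_uri` and `promote_and_load` and the alias `champion` are assumptions chosen for illustration; `set_registered_model_alias` and `mlflow.pyfunc.load_model` are existing MLflow calls.

```python
def registry_uri(name, version=None, alias=None):
    """Build a model registry URI from either a version number or an alias."""
    if (version is None) == (alias is None):
        raise ValueError("pass exactly one of 'version' or 'alias'")
    if alias is not None:
        return f"models:/{name}@{alias}"
    return f"models:/{name}/{version}"


def promote_and_load(name, version, alias="champion",
                     tracking_uri="http://127.0.0.1:5000"):
    """Point an alias at a model version, then load the model through that alias."""
    import mlflow  # imported lazily; requires a reachable tracking server when called
    import mlflow.pyfunc
    from mlflow import MlflowClient

    mlflow.set_tracking_uri(tracking_uri)
    client = MlflowClient(tracking_uri=tracking_uri)
    client.set_registered_model_alias(name, alias, str(version))
    return mlflow.pyfunc.load_model(registry_uri(name, alias=alias))


print(registry_uri("iris-classifier", version=1))
print(registry_uri("iris-classifier", alias="champion"))
```

With the server from this scenario running, `promote_and_load("iris-classifier", 1)` would return a model whose `predict` method accepts rows shaped like the `input_example` logged earlier. Aliases are the mechanism MLflow recommends in place of the older stage labels, which is why `current_stage` shows `'None'` in the output above.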