# Different scenarios for using MLflow

__Scenario 1.__ A single data scientist participating in an ML competition

__Scenario 2.__ A cross-functional team working with a single data scientist on an ML model

__Scenario 3.__ Multiple data scientists working on multiple ML models



## Scenario 1: A single data scientist participating in an ML competition

The data scientist does not need to share her runs with anyone else, so:

- Local tracking is enough; a remote tracking server is not needed.
- The model registry is not needed, because she is not interested in deploying the model to production.

MLflow setup:

- Tracking server: no
- Backend store: local filesystem
- Artifacts store: local filesystem

The experiments can be explored locally by launching the MLflow UI.


In [1]:
import mlflow

In [2]:
# Because we did not specify a tracking URI, MLflow defaults to the local filesystem,
# i.e. the 'mlruns' folder, to store artifacts and metadata about the experiment.
# Each run's data is saved under: mlruns/<exp_id>/<run_id>
# The following prints the current tracking URI.
print(f"tracking URI: '{mlflow.get_tracking_uri()}'")

tracking URI: 'file:///workspaces/mlops-zoomcamp/02-experiment%20tracking/mlruns'
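The `file://` URI printed above is URL-encoded, which is why the space in the folder name shows up as `%20`. A minimal sketch of turning such a URI back into a plain folder path, assuming a POSIX-style path (the helper name `tracking_uri_to_path` is made up for this illustration):

```python
from urllib.parse import unquote, urlparse


def tracking_uri_to_path(uri: str) -> str:
    """Convert a local 'file://' tracking URI into a filesystem path (POSIX assumed)."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError(f"not a local file store URI: {uri!r}")
    # unquote() decodes percent-escapes such as %20 back into a space
    return unquote(parsed.path)


print(tracking_uri_to_path(
    "file:///workspaces/mlops-zoomcamp/02-experiment%20tracking/mlruns"
))
```

This is only a convenience for locating the `mlruns` folder on disk; MLflow itself accepts the URI form directly.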


##### Creating an experiment and logging a new run

In [4]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

# No SQLite backend here: runs are stored on the local filesystem
mlflow.set_experiment("scenario-1")

with mlflow.start_run():

    X, y = load_iris(return_X_y=True)

    params = {"C": 0.1, "random_state": 42}
    mlflow.log_params(params)

    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"default artifacts URI: '{mlflow.get_artifact_uri()}'")

Traceback (most recent call last):
  File "/home/codespace/.python/current/lib/python3.12/site-packages/mlflow/store/tracking/file_store.py", line 329, in search_experiments
    exp = self._get_experiment(exp_id, view_type)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/codespace/.python/current/lib/python3.12/site-packages/mlflow/store/tracking/file_store.py", line 427, in _get_experiment
    meta = FileStore._read_yaml(experiment_dir, FileStore.META_DATA_FILE_NAME)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/codespace/.python/current/lib/python3.12/site-packages/mlflow/store/tracking/file_store.py", line 1373, in _read_yaml
    return _read_helper(root, file_name, attempts_remaining=retries)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/codespace/.python/current/lib/python3.12/site-packages/mlflow/store/tracking/file_store.py", line 1366, in _read_helper
    result = read_yaml(root,



default artifacts URI: 'file:///workspaces/mlops-zoomcamp/02-experiment%20tracking/mlruns/517005957011971498/ecaac0ca2b8c4439a5c0fc68038682e2/artifacts'


In [None]:
# mlflow.search_experiments()  # lists the experiments in the current tracking store

To explore the runs and models above in a browser, run the following in your terminal from the folder that contains `mlruns`:
- `mlflow ui`

##### Interacting with the model registry

In [6]:
from mlflow.tracking import MlflowClient
from mlflow.exceptions import MlflowException

client = MlflowClient()

# Without a database backend the registry may be unavailable;
# catch the error instead of letting the cell crash.
try:
    client.search_registered_models()
except MlflowException:
    print("It's not possible to access the model registry :(")

## Scenario 2: A cross-functional team with one data scientist working on an ML model

MLflow setup:

- Tracking server: yes, local server
- Backend store: SQLite database
- Artifacts store: local filesystem

The experiments can be explored locally by accessing the local tracking server.

To run this example you need to launch the mlflow server locally by running the following command in your terminal:

`mlflow server --backend-store-uri sqlite:///backend.db --default-artifact-root ./artifacts_local`

- `mlflow server` launches an actual tracking server process; it is not just a tracking URI setting.
- The last parameter, `--default-artifact-root`, specifies the folder where artifacts will be saved.
- Note that `artifacts_local` only stores artifacts such as the model files.
- The metadata, parameters, metrics, and so on are saved in `backend.db`, a SQLite database.

__Before running the next cell, run the above command in your terminal.__
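If the server is not running yet, the next cells fail with connection errors. A small sketch for checking that the server is reachable first, assuming it listens on `http://127.0.0.1:5000` and exposes the `/health` endpoint (the helper name `is_server_up` is made up for this illustration):

```python
import urllib.request


def is_server_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error responses
        return False


print(is_server_up("http://127.0.0.1:5000"))
```

A result of `False` most likely means the `mlflow server` command has not been started or is listening on a different port.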

In [7]:
import mlflow

mlflow.set_tracking_uri("http://127.0.0.1:5000")

print(f"tracking URI: '{mlflow.get_tracking_uri()}'")

tracking URI: 'http://127.0.0.1:5000'


In [8]:
mlflow.search_experiments()

[<Experiment: artifact_location='/workspaces/mlops-zoomcamp/02-experiment tracking/scenarios/artifacts_local/0', creation_time=1747386695839, experiment_id='0', last_update_time=1747386695839, lifecycle_stage='active', name='Default', tags={}>]

In [9]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

mlflow.set_experiment("scenario-2")
# mlflow.set_tracking_uri("sqlite:///mlflow.db")

with mlflow.start_run():

    X, y = load_iris(return_X_y=True)

    params = {"C": 0.1, "random_state": 42}
    mlflow.log_params(params)

    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"default artifacts URI: '{mlflow.get_artifact_uri()}'")

2025/05/16 09:12:08 INFO mlflow.tracking.fluent: Experiment with name 'scenario-2' does not exist. Creating a new experiment.


default artifacts URI: '/workspaces/mlops-zoomcamp/02-experiment tracking/scenarios/artifacts_local/1/c241e5e16eff444196d4d0d403014dec/artifacts'
🏃 View run persistent-trout-574 at: http://127.0.0.1:5000/#/experiments/1/runs/c241e5e16eff444196d4d0d403014dec
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/1


In [10]:
mlflow.search_experiments()

[<Experiment: artifact_location='/workspaces/mlops-zoomcamp/02-experiment tracking/scenarios/artifacts_local/1', creation_time=1747386728932, experiment_id='1', last_update_time=1747386728932, lifecycle_stage='active', name='scenario-2', tags={}>,
 <Experiment: artifact_location='/workspaces/mlops-zoomcamp/02-experiment tracking/scenarios/artifacts_local/0', creation_time=1747386695839, experiment_id='0', last_update_time=1747386695839, lifecycle_stage='active', name='Default', tags={}>]

##### Interacting with the model registry

In [13]:
from mlflow.tracking import MlflowClient


client = MlflowClient("http://127.0.0.1:5000")

In [14]:
client.search_registered_models()

[]

In [15]:
# Search runs in a specific experiment (by ID)
runs = client.search_runs(
    experiment_ids=["1"],             # Can be a list of one or more experiment IDs
    filter_string="",                 # Optional: filter by metrics, params, etc.
    order_by=["attributes.start_time DESC"],  # Sort by most recent
    max_results=5                     # Limit number of runs returned
)

# Print run IDs
for run in runs:
    print(f"Run ID: {run.info.run_id}")

Run ID: c241e5e16eff444196d4d0d403014dec


In [16]:
# Run ID printed by the previous cell; replace it with your own
run_id = 'c241e5e16eff444196d4d0d403014dec'
mlflow.register_model(
    model_uri=f"runs:/{run_id}/models",
    name='iris-classifier'
)

Successfully registered model 'iris-classifier'.
2025/05/16 09:16:10 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: iris-classifier, version 1
Created version '1' of model 'iris-classifier'.


<ModelVersion: aliases=[], creation_timestamp=1747386970635, current_stage='None', description='', last_updated_timestamp=1747386970635, name='iris-classifier', run_id='c241e5e16eff444196d4d0d403014dec', run_link='', source=('/workspaces/mlops-zoomcamp/02-experiment '
 'tracking/scenarios/artifacts_local/1/c241e5e16eff444196d4d0d403014dec/artifacts/models'), status='READY', status_message=None, tags={}, user_id='', version='1'>

In [17]:

model_name = 'iris-classifier'

latest_versions = client.get_latest_versions(name=model_name)

for version in latest_versions:
    print(f"version: {version.version}, stage: {version.current_stage}")

version: 1, stage: None




In [18]:
# to label the version as 'Staging' stage
new_stage = 'Staging'
client.transition_model_version_stage(
    name=model_name,
    version=1,
    stage=new_stage,
    archive_existing_versions=True
)



<ModelVersion: aliases=[], creation_timestamp=1747386970635, current_stage='Staging', description='', last_updated_timestamp=1747387163698, name='iris-classifier', run_id='c241e5e16eff444196d4d0d403014dec', run_link='', source=('/workspaces/mlops-zoomcamp/02-experiment '
 'tracking/scenarios/artifacts_local/1/c241e5e16eff444196d4d0d403014dec/artifacts/models'), status='READY', status_message=None, tags={}, user_id='', version='1'>

In [19]:
latest_versions = client.get_latest_versions(name=model_name)

for version in latest_versions:
    print(f"version: {version.version}, stage: {version.current_stage}")

version: 1, stage: Staging
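As far as I know, recent MLflow 2.x releases mark registry stages as deprecated and suggest model version aliases instead (the `aliases=[]` field in the `ModelVersion` output above belongs to that mechanism). A hedged sketch of the alias-based equivalent of the stage transition; the helper name `promote_with_alias` and the alias `staging` are made up for this illustration, and `client` is expected to be an `MlflowClient` connected to a server with a model registry:

```python
def promote_with_alias(client, model_name, version, alias="staging"):
    """Point `alias` at the given model version and return the version it resolves to.

    `client` is expected to be an mlflow.tracking.MlflowClient instance.
    """
    # Aliases are unique per registered model: setting one moves it
    # away from whichever version held it before.
    client.set_registered_model_alias(
        name=model_name, alias=alias, version=str(version)
    )
    return client.get_model_version_by_alias(name=model_name, alias=alias)
```

With the server from this scenario running, `promote_with_alias(MlflowClient("http://127.0.0.1:5000"), "iris-classifier", 1)` would tag version 1 as `staging`, and the model could then be referenced with the URI `models:/iris-classifier@staging`.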




## Scenario 3: Multiple data scientists working on multiple ML models

MLflow setup:

- Tracking server: yes, remote server (EC2). A shared server lets data scientists and deployment engineers see the same runs and registry state (registered models, Staging, Production, and so on).
- Backend store: PostgreSQL database (stores metadata, metrics, parameters, etc.).
- Artifacts store: S3 bucket (stores models).

The experiments can be explored by accessing the remote server.

The example uses AWS to host a remote server. To run it you need an AWS account. Follow the steps described in the file `mlflow_on_aws.md` to create a new AWS account and launch the tracking server. Note that the code cells below connect to an Azure ML workspace rather than to the AWS server described here.
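For the AWS setup listed above, the server on the EC2 machine has to be started with a PostgreSQL backend store and an S3 artifact root. The sketch below only assembles that command line. All concrete values (user, password, database endpoint, database name, bucket) are hypothetical placeholders, and the helper name `mlflow_server_command` is made up. The one detail worth noting is that special characters in the password are percent-encoded before they go into the database URI.

```python
import shlex
from urllib.parse import quote


def mlflow_server_command(db_user, db_password, db_endpoint, db_name, bucket, port=5000):
    """Build the argument list for `mlflow server` with PostgreSQL + S3 (sketch)."""
    # Percent-encode credentials so characters like '@' or ' ' do not break the URI
    backend_uri = (
        f"postgresql://{quote(db_user, safe='')}:{quote(db_password, safe='')}"
        f"@{db_endpoint}:5432/{db_name}"
    )
    return [
        "mlflow", "server",
        "-h", "0.0.0.0",           # listen on all interfaces so other machines can connect
        "-p", str(port),
        "--backend-store-uri", backend_uri,
        "--default-artifact-root", f"s3://{bucket}",
    ]


# Hypothetical placeholder values, not real resources
cmd = mlflow_server_command(
    "mlflow", "p@ss word", "db.example.internal", "mlflow_db", "my-mlflow-artifacts"
)
print(shlex.join(cmd))
```

On the client side, the notebook would then call `mlflow.set_tracking_uri` with `http://<EC2 public DNS>:5000`, in the same way Scenario 2 points at `http://127.0.0.1:5000`.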

In [None]:
import mlflow
from azureml.core import Workspace
from azureml.core.authentication import InteractiveLoginAuthentication
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from mlflow.tracking import MlflowClient


In [None]:
# 1. Authenticate to Azure
interactive_auth = InteractiveLoginAuthentication()

# 2. Connect to your Azure ML Workspace
ws = Workspace(
    subscription_id="your-subscription-id",
    resource_group="your-resource-group",
    workspace_name="your-workspace-name",
    auth=interactive_auth
)

# 3. Set the MLflow tracking URI to Azure ML
mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
print(f"Tracking URI: {mlflow.get_tracking_uri()}")


In [None]:
# 4. Set experiment
mlflow.set_experiment("my-experiment-1")

# 5. Train and log a model
with mlflow.start_run() as run:
    X, y = load_iris(return_X_y=True)

    params = {"C": 0.1, "random_state": 42}
    mlflow.log_params(params)

    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    # Log model to MLflow
    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"Artifacts URI: {mlflow.get_artifact_uri()}")

    run_id = run.info.run_id

In [None]:
# 6. Register the model in Azure ML Model Registry
result = mlflow.register_model(
    model_uri=f"runs:/{run_id}/models",
    name="iris-classifier"
)

In [None]:
# 7. (Optional) Transition the model to "Production"
client = MlflowClient()
client.transition_model_version_stage(
    name="iris-classifier",
    version=result.version,
    stage="Production",
    archive_existing_versions=True
)

print(f"Model 'iris-classifier' version {result.version} is now in Production.")