# MLflow in Practice

## Scenarios

Let's consider these 3 scenarios:
1. A single data scientist participating in an ML competition.
2. A cross-functional team with one data scientist working on an ML model.
3. Multiple data scientists working on multiple ML models.

## Configuring MLflow

- Backend store: where MLflow saves the metadata of your experiments (parameters, metrics, tags)
    - Local filesystem
    - SQLAlchemy compatible DB (e.g. SQLite)
- Artifacts store: where MLflow saves the files produced by a run (e.g. models)
    - Local filesystem
    - Remote (e.g. s3 bucket)
- Tracking server: whether runs are logged through an MLflow server, and where it runs
    - No tracking server
    - Localhost
    - Remote
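
The backend store and artifacts store choices above map fairly directly onto flags of the `mlflow server` command. The helper below is a minimal sketch, not part of MLflow: `build_server_command` is a hypothetical name and `s3://my-bucket/mlflow` is a placeholder location. It only assembles the argument list and does not start a server.

```python
import shlex


def build_server_command(backend_store_uri, artifact_root=None,
                         host="127.0.0.1", port=5000):
    """Assemble an `mlflow server` argument list (hypothetical helper)."""
    args = ["mlflow", "server", "--backend-store-uri", backend_store_uri]
    if artifact_root is not None:
        # Where artifacts of newly created experiments are stored
        args += ["--default-artifact-root", artifact_root]
    args += ["--host", host, "--port", str(port)]
    return args


# Local server with a SQLite backend store and the default artifact location
local_cmd = build_server_command("sqlite:///backend.db")

# Same backend store, artifacts in a remote bucket (placeholder name)
remote_cmd = build_server_command("sqlite:///backend.db",
                                  artifact_root="s3://my-bucket/mlflow")

print(shlex.join(local_cmd))
print(shlex.join(remote_cmd))
```

Returning a list rather than a single string keeps the result usable with `subprocess.run` without extra shell quoting.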

## Scenario 2: A cross-functional team with one data scientist working on an ML model
MLflow setup:
- Backend store: SQLite database
- Artifacts store: Local filesystem
- Tracking server: Yes, local server

The experiments can be explored locally through the local tracking server. To run this example, first launch the MLflow server by running the following command in your terminal:

`mlflow server --backend-store-uri sqlite:///backend.db`

With the server started this way, the metadata of any experiment we run is saved in the `backend.db` file rather than in an `mlruns` folder. Artifacts are still written to the local filesystem.
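
One way to confirm where the metadata ends up is to open `backend.db` with Python's built-in `sqlite3` module and list its tables. This is a rough sketch that assumes the file sits in the current working directory; `list_tables` is a hypothetical helper, and the exact table names depend on the MLflow version.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the table names in a SQLite file, or [] if it does not exist."""
    path = Path(db_path)
    if not path.exists():
        # Avoid sqlite3.connect creating an empty file as a side effect
        return []
    conn = sqlite3.connect(str(path))
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Assumes the server was started from this directory
print(list_tables("backend.db"))
```

An empty list here would suggest that the server was started from a different directory or has not created the database yet.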

In [31]:
# Before you begin, run this in a terminal:
# mlflow server --backend-store-uri sqlite:///backend.db

# Import libraries
import mlflow

# Set tracking URI
mlflow.set_tracking_uri("http://127.0.0.1:5000")

In [5]:
print(f"Tracking URI: {mlflow.get_tracking_uri()}")

Tracking URI: http://127.0.0.1:5000
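
Setting the tracking URI only records it, so a server that is not running tends to surface later, when the first experiment or run is created. A quick reachability check can catch that earlier. The sketch below assumes the server exposes a `/health` endpoint under the tracking URI; `server_is_up` is a hypothetical helper, not an MLflow function.

```python
import http.client
import urllib.request


def server_is_up(tracking_uri, timeout=2.0):
    """Return True if GET <tracking_uri>/health answers with HTTP 200."""
    url = tracking_uri.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, HTTP errors, and bad replies
        return False


print(server_is_up("http://127.0.0.1:5000"))
```

If this prints `False`, check that the `mlflow server` command is still running in the other terminal and that the port matches.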


In [6]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

mlflow.set_experiment("my-experiment-1")

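with mlflow.start_run():
    # Load the iris dataset as a feature matrix X and a target vector y
    X, y = load_iris(return_X_y=True)

    # Log the hyperparameters before training
    params = {"C": 0.1, "random_state": 42}
    mlflow.log_params(params)

    # Train the model and predict on the training data
    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)

    # Track the training accuracy with log_metric
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    # Log the fitted model as an artifact under "models"
    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"Default artifacts URI: {mlflow.get_artifact_uri()}")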

2025/02/21 20:46:44 INFO mlflow.tracking.fluent: Experiment with name 'my-experiment-1' does not exist. Creating a new experiment.
2025/02/21 20:46:46 INFO mlflow.tracking._tracking_service.client: 🏃 View run resilient-smelt-354 at: http://127.0.0.1:5000/#/experiments/1/runs/79a64f86a9a042468e6056a0a5054ee5.
2025/02/21 20:46:46 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: http://127.0.0.1:5000/#/experiments/1.


Default artifacts URI: mlflow-artifacts:/1/79a64f86a9a042468e6056a0a5054ee5/artifacts


In [6]:
mlflow.search_experiments()

[<Experiment: artifact_location='file:///home/arief/Desktop/projects/mlops_zoomcamp/module_2/mlflow_in_practice/mlruns/226973183148448093', creation_time=1740144395286, experiment_id='226973183148448093', last_update_time=1740144395286, lifecycle_stage='active', name='my-experiment-1', tags={}>,
 <Experiment: artifact_location='file:///home/arief/Desktop/projects/mlops_zoomcamp/module_2/mlflow_in_practice/mlruns/0', creation_time=1740143943304, experiment_id='0', last_update_time=1740143943304, lifecycle_stage='active', name='Default', tags={}>]

## Interacting with the model registry

In [7]:
from mlflow.tracking import MlflowClient

client = MlflowClient()

In [8]:
from mlflow.exceptions import MlflowException

try:
    print(client.search_registered_models())
except MlflowException:
    print("It is not possible to access the model registry :(")


[]


In [29]:
# Register the model logged by the first run returned for experiment "1"

runs = client.search_runs(experiment_ids=["1"])
run_id = runs[0].info.run_id

mlflow.register_model(
    model_uri=f"runs:/{run_id}/models",
    name="iris_classifier"
)

Registered model 'iris_classifier' already exists. Creating a new version of this model...
2025/02/21 21:00:57 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: iris_classifier, version 2
Created version '2' of model 'iris_classifier'.


<ModelVersion: aliases=[], creation_timestamp=1740146457903, current_stage='None', description='', last_updated_timestamp=1740146457903, name='iris_classifier', run_id='79a64f86a9a042468e6056a0a5054ee5', run_link='', source='mlflow-artifacts:/1/79a64f86a9a042468e6056a0a5054ee5/artifacts/models', status='READY', status_message='', tags={}, user_id='', version='2'>

In [30]:
from mlflow.exceptions import MlflowException

try:
    print(client.search_registered_models())
except MlflowException:
    print("It is not possible to access the model registry :(")


[<RegisteredModel: aliases={}, creation_timestamp=1740146425970, description='', last_updated_timestamp=1740146457903, latest_versions=[<ModelVersion: aliases=[], creation_timestamp=1740146457903, current_stage='None', description='', last_updated_timestamp=1740146457903, name='iris_classifier', run_id='79a64f86a9a042468e6056a0a5054ee5', run_link='', source='mlflow-artifacts:/1/79a64f86a9a042468e6056a0a5054ee5/artifacts/models', status='READY', status_message='', tags={}, user_id='', version='2'>], name='iris_classifier', tags={}>]
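
Once a version is registered, it can be loaded back by name and version through a `models:/` URI. The snippet below is a sketch that assumes the tracking server from this section is still running and that version 2 of `iris_classifier` exists, as in the output above; `registered_model_uri` and `load_registered_model` are hypothetical helpers.

```python
def registered_model_uri(name, version):
    """Build a model registry URI of the form models:/<name>/<version>."""
    return f"models:/{name}/{version}"


def load_registered_model(name, version,
                          tracking_uri="http://127.0.0.1:5000"):
    """Load a registered model as a generic pyfunc model."""
    # Imported here so the URI helper works without MLflow installed
    import mlflow
    import mlflow.pyfunc

    mlflow.set_tracking_uri(tracking_uri)
    return mlflow.pyfunc.load_model(registered_model_uri(name, version))


print(registered_model_uri("iris_classifier", 2))  # models:/iris_classifier/2
```

With the server running, `load_registered_model("iris_classifier", 2).predict(X)` should return predictions for a feature matrix `X` such as the iris data used earlier.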
