# MLflow in Practice

## Scenarios

Let's consider these 3 scenarios:
1. A single data scientist participating in an ML competition.
2. A cross-functional team with one data scientist working on an ML model.
3. Multiple data scientists working on multiple ML models.

## Configuring MLflow

- Backend store: where MLflow saves the metadata of your experiments (parameters, metrics, tags)
    - Local filesystem
    - SQLAlchemy-compatible DB (e.g. SQLite)
- Artifacts store: where MLflow saves the files produced by a run (e.g. models)
    - Local filesystem
    - Remote (e.g. S3 bucket)
- Tracking server:
    - No tracking server
    - Localhost
    - Remote
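
To make these choices concrete, here is a minimal sketch that assembles an `mlflow server` command line from a backend store URI and an artifact root. The helper `build_server_command`, the database file `backend.db`, and the `./artifacts_local` folder are illustrative assumptions, not part of this notebook; only the `--backend-store-uri`, `--default-artifact-root`, `--host`, and `--port` flags belong to the MLflow CLI.

```python
import shlex


def build_server_command(backend_store_uri, artifact_root, host="127.0.0.1", port=5000):
    """Return an `mlflow server` command line for the given stores (hypothetical helper)."""
    args = [
        "mlflow", "server",
        "--backend-store-uri", backend_store_uri,
        "--default-artifact-root", artifact_root,
        "--host", host,
        "--port", str(port),
    ]
    # shlex.join quotes any argument that would need it in a POSIX shell
    return shlex.join(args)


# Assumed example: SQLite backend store, local artifact folder, server on localhost
local_server_cmd = build_server_command("sqlite:///backend.db", "./artifacts_local")
print(local_server_cmd)
```

Swapping the two URIs (for instance to a remote database and an S3 bucket) is all that should change between a localhost and a remote setup; with no tracking server at all, no command is needed and MLflow writes straight to the local filesystem.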

# Scenario 1: A single data scientist participating in an ML competition

MLflow setup:
- Backend store: Local filesystem
- Artifacts store: Local filesystem
- Tracking server: No

In [1]:
# Import libraries
import mlflow

# Default tracking URI
print(f"tracking URI: {mlflow.get_tracking_uri()}")

tracking URI: file:///home/arief/Desktop/projects/mlops_zoomcamp/module_2/mlflow_in_practice/mlruns


In [2]:
# List all experiments
mlflow.search_experiments()

[<Experiment: artifact_location='file:///home/arief/Desktop/projects/mlops_zoomcamp/module_2/mlflow_in_practice/mlruns/0', creation_time=1740143943304, experiment_id='0', last_update_time=1740143943304, lifecycle_stage='active', name='Default', tags={}>]

In [5]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

mlflow.set_experiment("my-experiment-1")

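with mlflow.start_run():
    # Load the Iris dataset as feature matrix X and target vector y
    X, y = load_iris(return_X_y=True)

    # Log the hyperparameters used for this run
    params = {"C": 0.1, "random_state": 42}
    mlflow.log_params(params)

    # Train and evaluate on the same data (no hold-out split in this demo)
    lr = LogisticRegression(**params).fit(X, y)
    y_pred = lr.predict(X)

    # Track metric with log_metric
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred))

    # Save the fitted model as a run artifact under "models"
    mlflow.sklearn.log_model(lr, artifact_path="models")
    print(f"Default artifacts URI: {mlflow.get_artifact_uri()}")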

Default artifacts URI: file:///home/arief/Desktop/projects/mlops_zoomcamp/module_2/mlflow_in_practice/mlruns/226973183148448093/806d9235a3b14ec996f270b559776b1b/artifacts
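
The artifact URI printed above follows the layout `mlruns/<experiment_id>/<run_id>/artifacts`. As a small sketch using only the standard library, the URI can be split into those parts; the variable names here are illustrative, and the URI string is copied from the output above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Artifact URI copied from the output above
artifact_uri = (
    "file:///home/arief/Desktop/projects/mlops_zoomcamp/module_2/"
    "mlflow_in_practice/mlruns/226973183148448093/"
    "806d9235a3b14ec996f270b559776b1b/artifacts"
)

parsed = urlparse(artifact_uri)

# The last four path components are: mlruns / experiment_id / run_id / artifacts
store_dir, experiment_id, run_id, leaf = PurePosixPath(parsed.path).parts[-4:]

print(f"scheme:        {parsed.scheme}")
print(f"store folder:  {store_dir}")
print(f"experiment ID: {experiment_id}")
print(f"run ID:        {run_id}")
print(f"leaf folder:   {leaf}")
```

The `file` scheme confirms that, in this scenario, artifacts stay on the local filesystem next to the run metadata.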


In [6]:
mlflow.search_experiments()

[<Experiment: artifact_location='file:///home/arief/Desktop/projects/mlops_zoomcamp/module_2/mlflow_in_practice/mlruns/226973183148448093', creation_time=1740144395286, experiment_id='226973183148448093', last_update_time=1740144395286, lifecycle_stage='active', name='my-experiment-1', tags={}>,
 <Experiment: artifact_location='file:///home/arief/Desktop/projects/mlops_zoomcamp/module_2/mlflow_in_practice/mlruns/0', creation_time=1740143943304, experiment_id='0', last_update_time=1740143943304, lifecycle_stage='active', name='Default', tags={}>]
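
The `creation_time` and `last_update_time` fields in this output are Unix timestamps in milliseconds. A minimal sketch for making them readable, using only the standard library (the helper name `ms_to_utc` is an assumption for illustration):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(ms):
    """Convert a millisecond Unix timestamp to a timezone-aware UTC datetime."""
    # Integer millisecond arithmetic avoids float rounding
    return EPOCH + timedelta(milliseconds=ms)


# Values copied from the experiment listing above
default_created = ms_to_utc(1740143943304)      # 'Default'
experiment_created = ms_to_utc(1740144395286)   # 'my-experiment-1'

print(default_created.isoformat())
print(experiment_created.isoformat())
print(f"Time between the two experiments: {experiment_created - default_created}")
```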

## Interacting with the model registry

In [7]:
from mlflow.tracking import MlflowClient

client = MlflowClient()

In [9]:
from mlflow.exceptions import MlflowException

try:
    print(client.search_registered_models())
except MlflowException:
    print("It is not possible to access the model registry :(")


[]
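
The empty list means the registry is reachable but no model has been registered yet. Registering the model logged earlier would start from a `runs:/` URI that points at the run and the artifact path. The sketch below only builds that URI; the helper `runs_model_uri` is hypothetical, and the run ID and artifact path are taken from the run above.

```python
def runs_model_uri(run_id, artifact_path):
    """Build a `runs:/<run_id>/<artifact_path>` model URI (hypothetical helper)."""
    return f"runs:/{run_id}/{artifact_path.strip('/')}"


# Run ID from the artifact URI printed earlier; "models" is the artifact_path used in log_model
model_uri = runs_model_uri("806d9235a3b14ec996f270b559776b1b", "models")
print(model_uri)

# With mlflow installed, this URI could then be passed to
# mlflow.register_model(model_uri, name), where the registered name is yours to choose.
```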
