# MLflow Tracking

MLflow Tracking is an API and UI for logging parameters, code versions, metrics, and output files when you run machine learning code, and for visualizing the results later.

## Concepts

![Taken from MLflow Docs](https://mlflow.org/docs/latest/_images/tracking-basics.png)

**Runs**

MLflow Tracking is organized around the concept of runs. A run is one execution of a piece of data science code, for example a single `python train.py` execution.


**Experiments** 

An experiment groups together runs for a specific task. 




In [1]:
import mlflow
mlflow.login()

2024/05/25 23:00:00 INFO mlflow.utils.credentials: Successfully connected to MLflow hosted tracking server! Host: https://adb-3088650010345545.5.azuredatabricks.net.


In [2]:
mlflow.get_tracking_uri()

'databricks'

# Create an experiment

To create an experiment in Databricks, the name must be a path in the Workspace, for example `/Shared/Users/...`

In [3]:
experiment_name = "/Shared/Experiments/01 - Introduction to MLflow - 1"

The folder part of this path must already exist in the Workspace before the experiment is created.
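As a minimal sketch of that rule, the helper below extracts the Workspace folder that has to exist for a given experiment name. `parent_folder` is a hypothetical name introduced only for illustration; it does not call MLflow or Databricks, it only splits the path.

```python
import posixpath


def parent_folder(experiment_name: str) -> str:
    """Return the Workspace folder that must exist for this experiment name.

    Hypothetical helper: it only inspects the string, it does not
    check or create anything in the Workspace.
    """
    if not experiment_name.startswith("/"):
        raise ValueError("expected an absolute Workspace path such as /Shared/...")
    return posixpath.dirname(experiment_name)


# The folder below has to be created (for example through the Workspace UI)
# before the experiment itself is created.
print(parent_folder("/Shared/Experiments/01 - Introduction to MLflow - 1"))
```

For the name used in this notebook, the folder that has to exist is `/Shared/Experiments`.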

In [4]:
from mlflow_for_ml_dev.experiments.exp_utils import print_experiment_info

In [5]:
experiment_id = mlflow.create_experiment(name=experiment_name)
print(experiment_id)

2631591499706434


In [6]:
# Set the experiment "/Shared/Experiments/01 - Introduction to MLflow - 1" as the active experiment
experiment = mlflow.set_experiment(experiment_name)

In [7]:
# get the artifact location
experiment.artifact_location

'dbfs:/databricks/mlflow-tracking/2631591499706434'

In [8]:
# demo run

from sklearn.ensemble import RandomForestClassifier

rfc = RandomForestClassifier(n_estimators=10)

with mlflow.start_run(run_name="first-run") as run:
    mlflow.log_param("param1", 5)
    mlflow.sklearn.log_model(sk_model=rfc, artifact_path="sklearn-model")



An experiment can also be created with `mlflow.set_experiment(experiment_name)`. If the experiment does not exist, MLflow creates it using the provided name. Because `experiment_name` is a path in the Workspace, the folder structure must exist before the experiment is created.
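When tags or an artifact location are needed, `create_experiment` has to be called explicitly, and calling it a second time with an existing name raises an error. One possible way to make a notebook cell safe to re-run is a get-or-create helper. This is a sketch, not part of MLflow: `get_or_create_experiment` is a hypothetical name, and `client` is assumed to behave like `mlflow.MlflowClient`.

```python
def get_or_create_experiment(client, name, tags=None):
    """Return the ID of the experiment called `name`, creating it if missing.

    `client` is assumed to expose `get_experiment_by_name` and
    `create_experiment` the way `mlflow.MlflowClient` does.
    """
    existing = client.get_experiment_by_name(name)
    if existing is not None:
        # The experiment is already there: reuse it instead of failing.
        return existing.experiment_id
    return client.create_experiment(name=name, tags=tags)


# Usage against a configured tracking server (not executed here):
#   import mlflow
#   experiment_id = get_or_create_experiment(mlflow.MlflowClient(), experiment_name)
```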

## Specifying Artifact Location

In [9]:
experiment_name = "/Shared/Experiments/01 - Introduction to MLflow - 2"
experiment_id = mlflow.create_experiment(name=experiment_name, artifact_location="dbfs:/FileStore/mlflow-experiments")

In [10]:
# Set the experiment  "/Shared/Experiments/01 - Introduction to MLflow - 2" as active experiment
experiment = mlflow.set_experiment(experiment_name)

In [11]:
# get the artifact location
experiment.artifact_location

'dbfs:/FileStore/mlflow-experiments'

In [12]:
# demo run
from sklearn.ensemble import RandomForestClassifier

rfc = RandomForestClassifier(n_estimators=10)

with mlflow.start_run(run_name="first-run") as run:
    mlflow.log_param("param1", 5)
    mlflow.sklearn.log_model(sk_model=rfc, artifact_path="sklearn-model")
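To confirm that a run really stored its artifacts under the custom location, the run's artifact URI can be compared with the experiment's `artifact_location`. The sketch below does only the string comparison; `is_under_location` is a hypothetical helper, and the example URI (including the run ID `abc123` and the `/artifacts` suffix) is an assumed layout used for illustration, not output from this notebook.

```python
def is_under_location(artifact_uri: str, artifact_location: str) -> bool:
    """Return True if `artifact_uri` sits inside `artifact_location`."""
    # Append a trailing slash so that a sibling folder with a longer name
    # (for example ".../mlflow-experiments-old") is not accepted by mistake.
    root = artifact_location.rstrip("/") + "/"
    return artifact_uri.startswith(root)


# Assumed, illustrative URI; with a real run one would pass
# run.info.artifact_uri and experiment.artifact_location instead.
example_uri = "dbfs:/FileStore/mlflow-experiments/abc123/artifacts"
print(is_under_location(example_uri, "dbfs:/FileStore/mlflow-experiments"))
```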



## Adding tags

In [13]:
experiment_name = "/Shared/Experiments/01 - Introduction to MLflow - 3"
experiment_id = mlflow.create_experiment(
    name=experiment_name,
    tags={"topic":"experiment_management", "project_name":"UNKNOWN"}
)

# Set the experiment  "/Shared/Experiments/01 - Introduction to MLflow - 3" as active experiment
experiment = mlflow.set_experiment(experiment_name)

In [15]:
# get the experiment tags
experiment.tags


## Adding a description

In [19]:
experiment_name = "/Shared/Experiments/01 - Introduction to MLflow - 4"
experiment_id = mlflow.create_experiment(
    name=experiment_name,
    tags={
        "topic":"experiment_management",
        "project_name":"UNKNOWN",
        "mlflow.note.content":"This is a test experiment"})

In [20]:
# Set the experiment  "/Shared/Experiments/01 - Introduction to MLflow - 4" as active experiment
experiment = mlflow.set_experiment(experiment_name)

In [None]:
# get the experiment tags
experiment.tags
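The description is an ordinary tag stored under the key `mlflow.note.content`, so it can be merged into the other tags before the experiment is created. A minimal sketch, where `with_description` and `NOTE_TAG` are hypothetical names introduced here:

```python
NOTE_TAG = "mlflow.note.content"  # tag key used for the description above


def with_description(tags: dict, description: str) -> dict:
    """Return a copy of `tags` that also carries the experiment description."""
    merged = dict(tags)  # copy, so the caller's dict is left untouched
    merged[NOTE_TAG] = description
    return merged


base_tags = {"topic": "experiment_management", "project_name": "UNKNOWN"}
all_tags = with_description(base_tags, "This is a test experiment")
print(all_tags[NOTE_TAG])

# The result could then be passed as `tags=all_tags` to mlflow.create_experiment.
```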

## Update Tags

In [None]:
experiment_name = "/Shared/Experiments/01 - Introduction to MLflow - 4"
experiment = mlflow.set_experiment(experiment_name)

In [21]:
tags = {
    "tag1": "value1",
    "tag2": "value2"
}
mlflow.set_experiment_tags(tags=tags)


In [None]:
# get the updated experiment object
experiment = mlflow.set_experiment(experiment_name)

# get the experiment tags
experiment.tags

In [23]:
# Update Value of tag1
mlflow.set_experiment_tag(key="tag1", value="new_value1")

# get the updated experiment object
experiment = mlflow.set_experiment(experiment_name)

In [None]:
# get the experiment tags
experiment.tags

## Using the client to set a tag

In [25]:
client = mlflow.MlflowClient()

In [None]:
experiment.name

In [26]:
client.set_experiment_tag(experiment_id = experiment.experiment_id, key="tag3", value="value3")

In [None]:
experiment = mlflow.set_experiment(experiment_name)

# get the experiment tags
experiment.tags
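The cells above refresh the `experiment` object by calling `mlflow.set_experiment` again, which also re-selects the active experiment. If only the current tags are needed, one alternative is to fetch the experiment by ID through the client. This is a sketch: `read_tags` is a hypothetical helper, and `client` is assumed to behave like `mlflow.MlflowClient`.

```python
def read_tags(client, experiment_id):
    """Fetch a fresh copy of the experiment and return its tags as a dict.

    `client` is assumed to expose `get_experiment(experiment_id)` the way
    `mlflow.MlflowClient` does.
    """
    experiment = client.get_experiment(experiment_id)
    return dict(experiment.tags)


# Usage against a configured tracking server (not executed here):
#   read_tags(mlflow.MlflowClient(), experiment.experiment_id)
```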

## Rename Experiment

In [28]:
new_name = "/Shared/Experiments/01 - Introduction to MLflow - 4 - Renamed"
client.rename_experiment(experiment_id = experiment.experiment_id, new_name=new_name)

In [None]:
experiment = mlflow.set_experiment(new_name)

experiment.name

## Clean Up

In [30]:
experiments = mlflow.search_experiments(filter_string="name LIKE '/Shared/Experiments%'")
# experiments = mlflow.search_experiments()
for experiment in experiments:
    print(f"Deleting: {experiment.name}")
    mlflow.delete_experiment(experiment.experiment_id)


Deleting: /Shared/Experiments/01 - Introduction to MLflow - 4 - Renamed
Deleting: /Shared/Experiments/01 - Introduction to MLflow - 3
Deleting: /Shared/Experiments/01 - Introduction to MLflow - 2
Deleting: /Shared/Experiments/01 - Introduction to MLflow - 1
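Because the filter matches every experiment under `/Shared/Experiments`, it can be worth listing the matches before deleting anything. The sketch below wraps the same search-and-delete loop with a dry-run switch. `delete_experiments_with_prefix` is a hypothetical helper, and `client` is assumed to behave like `mlflow.MlflowClient`.

```python
def delete_experiments_with_prefix(client, prefix, dry_run=True):
    """Find experiments whose name starts with `prefix` and optionally delete them.

    Returns the matched names. With dry_run=True (the default) nothing is deleted.
    """
    matches = client.search_experiments(filter_string=f"name LIKE '{prefix}%'")
    names = [exp.name for exp in matches]
    if not dry_run:
        for exp in matches:
            client.delete_experiment(exp.experiment_id)
    return names


# Usage against a configured tracking server (not executed here):
#   client = mlflow.MlflowClient()
#   print(delete_experiments_with_prefix(client, "/Shared/Experiments"))  # preview
#   delete_experiments_with_prefix(client, "/Shared/Experiments", dry_run=False)
```

Running the preview first shows exactly which experiments the second call would remove.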
