# ✍️ Exercise: Intro to MLFlow - Part I

In this exercise, we will cover the basics of MLFlow. MLFlow is an open-source platform for the complete machine learning lifecycle. It is designed to work with any machine learning library and to be agnostic to the execution environment. It is also designed to be scalable and to support the complete machine learning lifecycle, including experimentation, reproducibility, and deployment.

In this first part, we will cover the following topics:
- How to Install MLFlow.
- How to launch the MLFlow Server.
- How to create a new MLFlow Experiment.
- How to create a new MLFlow Run.
- How to log parameters, metrics, and artifacts.

## How to Install MLFlow  

💡 Remember: We can simply install MLFlow **using pip** 🎉

```bash
pip install mlflow
```

## How to launch the MLFlow Server

💡 Remember: After installing MLFlow, we can launch the MLFlow server using the following command **in the terminal**:

```bash
mlflow server
```

You will see the following output:

```bash
[2024-02-21 23:29:52 +0100] [725738] [INFO] Starting gunicorn 21.2.0
[2024-02-21 23:29:52 +0100] [725738] [INFO] Listening at: http://127.0.0.1:5000 (725738)
[2024-02-21 23:29:52 +0100] [725738] [INFO] Using worker: sync
[2024-02-21 23:29:52 +0100] [725739] [INFO] Booting worker with pid: 725739
[2024-02-21 23:29:52 +0100] [725740] [INFO] Booting worker with pid: 725740
[2024-02-21 23:29:53 +0100] [725741] [INFO] Booting worker with pid: 725741
```

👉 Then, we can **access the mlflow server by opening the following URL in a web browser**: http://localhost:5000.


## Exercise I: Connecting to the MLFlow Server

1. 👉 Connect to MLFlow using `mlflow.set_tracking_uri()` and set the URI to `http://localhost:5000`.
2. 👉 Use `mlflow.search_experiments()` to list all the experiments.

In [3]:
import mlflow


# set the experiment id
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment(experiment_id="268156047926477072")

mlflow.autolog()



## Exercise II: Creating a New MLFlow Experiment

1. 👉 Create a new MLFlow Experiment using `mlflow.create_experiment()` and set the name to `intro-to-mlflow`.
2. 👉 Check if the experiment was created by using `mlflow.get_experiment_by_name()`.
3. 👉 Print the experiment ID.

In [9]:
experiment_id = mlflow.create_experiment("intro-to-mlflow")
experiment = mlflow.get_experiment_by_name("intro_to_mlflow")
print(f"El ID del experimento es: {experiment.experiment_id}")

El ID del experimento es: 354971876388856999


## Exercise III: Creating a New MLFlow Run

1. 👉 Create a new MLFlow Run using `mlflow.start_run()` and set the experiment_id to the ID of the `intro-to-mlflow` experiment.
2. 👉 Check if the run was created by using `run.info.run_id`.
3. 👉 Print the run_id.

In [10]:
import mlflow

experiment_id = mlflow.get_experiment_by_name("intro-to-mlflow").experiment_id

with mlflow.start_run(experiment_id=experiment_id) as run:
    run_id = run.info.run_id

2024/12/09 20:29:45 INFO mlflow.tracking._tracking_service.client: 🏃 View run bedecked-pig-33 at: http://localhost:5000/#/experiments/821024274977858993/runs/7e4c8d10bf8d4bf6a911925cbab5328b.
2024/12/09 20:29:45 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: http://localhost:5000/#/experiments/821024274977858993.


In [11]:
print(f"El ID de la ejecución es: {run_id}")

El ID de la ejecución es: 7e4c8d10bf8d4bf6a911925cbab5328b


## Exercise IV: Logging Tags, Parameters and Metrics

Imagine you have the following information about the run:

- model_type: "RandomForest"
- accuracy: 0.85
- max_depth: 10
- precision: 0.90
- learning_rate: 0.01
- recall: 0.80

1. 👉 Think. What should you log as a tag, parameter, and metric?
2. 👉 Create a new MLFlow Run using `mlflow.start_run()` and set the experiment_id to the ID of the `intro-to-mlflow` experiment.
3. 👉 Log the tags using `mlflow.set_tags()`.
4. 👉 Log the parameters using `mlflow.log_param()`.
5. 👉 Log the metrics using `mlflow.log_metric()`.

In [12]:
experiment_id = mlflow.get_experiment_by_name("intro-to-mlflow").experiment_id

with mlflow.start_run(experiment_id=experiment_id) as run:
    # 3. Loguear los tags
    mlflow.set_tags({"model_type": "RandomForest"})
    
    # 4. Loguear los parámetros
    mlflow.log_param("max_depth", 10)
    mlflow.log_param("learning_rate", 0.01)
    
    # 5. Loguear las métricas
    mlflow.log_metric("accuracy", 0.85)
    mlflow.log_metric("precision", 0.90)
    mlflow.log_metric("recall", 0.80)
    
    # Imprimir el ID de la ejecución
    run_id = run.info.run_id
    print(f"El ID de la ejecución es: {run_id}")

2024/12/09 20:30:57 INFO mlflow.tracking._tracking_service.client: 🏃 View run vaunted-trout-782 at: http://localhost:5000/#/experiments/821024274977858993/runs/4a8be0f2b9c644f7a126ae192ab4fd82.
2024/12/09 20:30:57 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: http://localhost:5000/#/experiments/821024274977858993.


El ID de la ejecución es: 4a8be0f2b9c644f7a126ae192ab4fd82
