 **Module 1.5: Organizing Experiments in MLflow**


## 🎯 **Learning Objectives Expanded**

### 1️⃣ **Create and Name Experiments Using `mlflow.set_experiment()`**

* **What it means:**
  Organizing your MLflow runs by assigning them to clearly named experiments, making it easier to group related runs logically.

* **Detailed Steps:**

  * Set the experiment name:

    ```python
    import mlflow
    mlflow.set_experiment("model-selection-experiment")
    ```
  * If the experiment doesn’t exist, MLflow automatically creates it.

* **Why it matters:**
  Clear experiment naming helps you easily manage, identify, and compare related model runs over time.

---

### 2️⃣ **Use Tags to Track Metadata (e.g., stage, author, purpose)**

* **What it means:**
  Adding custom labels or "tags" to your MLflow runs to capture extra details beyond just parameters and metrics (e.g., author, project phase).

* **Detailed Steps:**

  * Define tags explicitly in your run:

    ```python
    with mlflow.start_run():
        mlflow.set_tag("stage", "development")
        mlflow.set_tag("author", "John Doe")
        mlflow.set_tag("purpose", "Baseline model test")
    ```

* **Why it matters:**
  Tags enhance your ability to filter, search, and organize experiment runs based on specific metadata or context, improving team collaboration and understanding.

---

### 3️⃣ **Log and Organize Multiple Model Types Within a Single Experiment**

* **What it means:**
  Using a single named experiment to manage runs of various model types (e.g., Linear Regression, Random Forest, XGBoost), simplifying comparison and management.

* **Detailed Steps:**

  ```python
  models = ["linear_regression", "random_forest", "xgboost"]
  for model_type in models:
      with mlflow.start_run():
          mlflow.log_param("model_type", model_type)
          mlflow.set_tag("stage", "baseline")

          # Train and evaluate the model for this model_type here.
          accuracy = 0.0  # placeholder: replace with the evaluated accuracy
          mlflow.log_metric("accuracy", accuracy)
  ```

* **Why it matters:**
  Centralizing diverse model runs within one experiment allows easy comparative analysis, enabling better model selection decisions.

---

### 4️⃣ **View Structured Experiment Data Using `search_runs()`**

* **What it means:**
  Retrieving and reviewing experiment results (parameters, metrics, tags, etc.) programmatically using MLflow’s Python API.

* **Detailed Steps:**

  * Retrieve structured data as a DataFrame:

    ```python
    runs_df = mlflow.search_runs(experiment_names=["model-selection-experiment"])
    runs_df[["run_id", "params.model_type", "metrics.accuracy", "tags.stage"]]
    ```
  * Sort and analyze runs:

    ```python
    runs_df.sort_values(by="metrics.accuracy", ascending=False)
    ```

* **Why it matters:**
  Quick access to structured experiment data helps in rapidly comparing results, identifying top-performing models, and understanding experimental outcomes in detail.






```python
# 📓 Module 1.5: Organizing Experiments in MLflow
# Goal: Understand how to create, retrieve, and use named experiments effectively in MLflow.

# ✅ Step 1: Install MLflow
!pip install -q mlflow

# ✅ Step 2: Import necessary libraries
import mlflow
import os

# ✅ Step 3: Create or set an experiment
# This experiment name will group all runs under a single label for easy comparison
experiment_name = "model-selection-experiment"
mlflow.set_experiment(experiment_name)

# Retrieve the experiment to confirm creation or access
experiment = mlflow.get_experiment_by_name(experiment_name)
print(f"Experiment '{experiment.name}' created with ID: {experiment.experiment_id}")

# ✅ Step 4: Log dummy runs under this experiment
# Demonstrates how to use tags and structure runs clearly
for model_type in ["linear_regression", "random_forest"]:
    with mlflow.start_run():
        # Log a dummy model type as a parameter
        mlflow.log_param("model_type", model_type)

        # Use tags to add metadata (e.g., user, purpose)
        mlflow.set_tag("stage", "dev")
        mlflow.set_tag("author", "mlflow_course")

        # Log fake metric for demo purposes
        mlflow.log_metric("accuracy", 0.8 if model_type == "linear_regression" else 0.9)

        print(f"Logged run for model: {model_type}, Run ID: {mlflow.active_run().info.run_id}")

# ✅ Step 5: Retrieve and inspect all runs in the experiment
runs_df = mlflow.search_runs(experiment_ids=[experiment.experiment_id])
print("\nAll runs for the experiment:")
display_cols = ["run_id", "params.model_type", "metrics.accuracy", "tags.stage"]
print(runs_df[display_cols])
```


```text
2025/08/02 22:02:37 INFO mlflow.tracking.fluent: Experiment with name 'model-selection-experiment' does not exist. Creating a new experiment.

Experiment 'model-selection-experiment' created with ID: 222412334690476895
Logged run for model: linear_regression, Run ID: 921812ecce1f45b0b5e49a107d27bba1
Logged run for model: random_forest, Run ID: 8e9c8f88cd30491795a0eb61957cce20

All runs for the experiment:
                             run_id  params.model_type  metrics.accuracy  \
0  8e9c8f88cd30491795a0eb61957cce20      random_forest               0.9   
1  921812ecce1f45b0b5e49a107d27bba1  linear_regression               0.8   

  tags.stage  
0        dev  
1        dev  
```


## 📝 Assessment: Organizing Experiments

### 📘 Multiple Choice (Choose the best answer)

**1. What does `mlflow.set_experiment("my-exp")` do?**   
A. Starts a new run with the given name   
**B. Sets or creates an experiment to group related runs** ✅   
C. Tags a model as experimental   
D. Initializes the experiment UI

---

**2. Why would you use `mlflow.set_tag()`?**   
A. To assign a unique ID to a model   
B. To deploy a model to production   
**C. To attach metadata (e.g., author, stage) to a run** ✅   
D. To search for artifacts   

---

**3. What is the correct method to retrieve all runs from a specific experiment?**   
A. `mlflow.get_runs()`   
**B. `mlflow.search_runs()`** ✅   
C. `mlflow.find_experiment()`   
D. `mlflow.list_models()`   

---

**4. What happens if you call `mlflow.set_experiment()` with a name that doesn’t exist yet?**   
A. It raises an error   
**B. It automatically creates the new experiment** ✅   
C. It uses a default experiment   
D. It disables tracking   

---
   
### ✏️ Short Answer
**5. Why is it helpful to use tags when logging MLflow runs?**   
*Tags help identify purpose, environment (dev/prod), authorship, or versioning for better tracking and collaboration.*   

---

**6. What’s the difference between an experiment and a run in MLflow?**   
*An experiment is a container for grouping multiple runs; each run represents one model training execution.*   

---

### 🧪 Mini Project   

**7. Task:**   
You’re comparing 3 models: Linear Regression, Decision Tree, and XGBoost.

* Use `mlflow.set_experiment("model-comparison")`   
* For each model:   

  * Log the model type as a parameter   
  * Set a `stage` tag (e.g., “baseline”)   
  * Log an example metric (e.g., accuracy or RMSE)   
* Retrieve all runs using `search_runs()`   
* Export a table with columns: run ID, model type, metric, stage   

