In [3]:
import mlflow


Q1. Install MLflow

In [5]:
!mlflow --version

mlflow, version 3.1.1


Q2. Download and preprocess the data

In [8]:
!python preprocess_data.py --raw_data_path ./data --dest_path ./output


How many files were saved to OUTPUT_FOLDER? 4 files.

In [9]:
!ls -lh ./output


total 6.9M
-rw-rw-rw- 1 codespace codespace 128K Jul 15 10:37 dv.pkl
-rw-rw-rw- 1 codespace codespace 2.4M Jul 15 10:37 test.pkl
-rw-rw-rw- 1 codespace codespace 2.3M Jul 15 10:37 train.pkl
-rw-rw-rw- 1 codespace codespace 2.2M Jul 15 10:37 val.pkl
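
The same count can be checked from Python instead of reading it off the `ls` listing. This is a minimal sketch, assuming the folder is `./output` as in the command above; `count_files` is a hypothetical helper, not part of the homework scripts.

```python
from pathlib import Path


def count_files(folder: str) -> int:
    """Count the regular files directly inside `folder` (subfolders are not descended into)."""
    return sum(1 for entry in Path(folder).iterdir() if entry.is_file())


if __name__ == "__main__":
    output_dir = Path("./output")  # assumed location, matching --dest_path above
    if output_dir.is_dir():
        print(f"{count_files(str(output_dir))} files in {output_dir}")
```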


Q3. Train a model with autolog

In [14]:
!python train.py --data_path ./output


2025/07/15 12:11:45 INFO mlflow.store.db.utils: Creating initial MLflow database tables...
2025/07/15 12:11:45 INFO mlflow.store.db.utils: Updating database tables
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
✅ RMSE: 5.4312
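
For reference, RMSE is the square root of the mean squared difference between targets and predictions. The sketch below shows the metric in plain Python. How `train.py` computes it is not shown in this notebook, so the function name and the sample values are illustrative only.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Made-up values for illustration only, not taken from the taxi data
    print(rmse([10.0, 12.0, 8.0], [11.0, 10.0, 8.0]))
```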


In [5]:
import mlflow
from mlflow.tracking import MlflowClient


In [6]:
# Point the client at the local SQLite tracking store
mlflow.set_tracking_uri("sqlite:///mlflow.db")

# Get the experiment by name
experiment_name = "nyc-taxi-experiment-module-2"
client = MlflowClient()
experiment = client.get_experiment_by_name(experiment_name)
if experiment is None:
    raise ValueError(f"Experiment {experiment_name!r} not found")

# Get the latest run from that experiment
runs = client.search_runs(experiment_ids=[experiment.experiment_id],
                          order_by=["start_time DESC"],
                          max_results=1)
latest_run = runs[0]


What is the value of the min_samples_split parameter?

In [7]:
min_samples_split = latest_run.data.params.get("min_samples_split")
print(f"min_samples_split: {min_samples_split}")


min_samples_split: 2
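
MLflow stores run parameters as strings, so the value read from `latest_run.data.params` is the string `"2"` rather than an integer. The sketch below converts it before further use. `get_int_param` is a hypothetical helper, and the plain dict stands in for `latest_run.data.params`.

```python
from typing import Mapping, Optional


def get_int_param(params: Mapping[str, str], name: str) -> Optional[int]:
    """Return params[name] as an int, or None when it is missing or logged as "None"."""
    raw = params.get(name)
    if raw is None or raw == "None":
        # A parameter whose Python value was None is stored as the string "None"
        return None
    return int(raw)


if __name__ == "__main__":
    params = {"min_samples_split": "2"}  # stand-in for latest_run.data.params
    value = get_int_param(params, "min_samples_split")
    print(value, type(value).__name__)
```

The helper assumes an integer-valued parameter. scikit-learn also accepts a float fraction for `min_samples_split`, which would need `float` instead of `int`.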


Q5. Tune model hyperparameters

In [8]:
!python hpo.py

2025/07/15 14:21:10 INFO mlflow.store.db.utils: Creating initial MLflow database tables...
2025/07/15 14:21:10 INFO mlflow.store.db.utils: Updating database tables
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
2025/07/15 14:21:10 INFO mlflow.tracking.fluent: Experiment with name 'random-forest-hyperopt' does not exist. Creating a new experiment.
100%|██████████| 15/15 [01:10<00:00,  4.69s/trial, best loss: 5.335419588556921]


Q6. Promote the best model to the model registry

In [10]:
!python register_model.py

2025/07/15 14:39:12 INFO mlflow.tracking.fluent: Experiment with name 'random-forest-best-models' does not exist. Creating a new experiment.
🏃 View run respected-sheep-530 at: http://127.0.0.1:5000/#/experiments/3/runs/a086f4848b2f49aa84db8535091e9681
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/3
🏃 View run industrious-owl-73 at: http://127.0.0.1:5000/#/experiments/3/runs/afbd74be893148bf8d44854281a37909
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/3
🏃 View run righteous-cow-147 at: http://127.0.0.1:5000/#/experiments/3/runs/05ac197766c74704ae589d0aa10d7a67
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/3
🏃 View run serious-mink-7 at: http://127.0.0.1:5000/#/experiments/3/runs/5ef8970435ee4d8689dd94d07f52b599
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/3
🏃 View run suave-bear-329 at: http://127.0.0.1:5000/#/experiments/3/runs/e113497a21b3446aa301446a2a4d3e7a
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/3
✅ Best run: a086f
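
Promoting a model comes down to comparing the candidate runs on a held-out metric and keeping the one with the lowest value. The sketch below shows that selection step on plain dictionaries. The record layout, the `test_rmse` key, and the run IDs and values are placeholders. They are not the actual output or logic of `register_model.py`.

```python
from typing import Dict, List


def select_best_run(runs: List[Dict], metric: str = "test_rmse") -> Dict:
    """Return the run record with the lowest value for `metric`."""
    if not runs:
        raise ValueError("no runs to choose from")
    return min(runs, key=lambda run: run["metrics"][metric])


if __name__ == "__main__":
    # Placeholder records; real run IDs and metrics come from the tracking server
    candidates = [
        {"run_id": "run-a", "metrics": {"test_rmse": 5.9}},
        {"run_id": "run-b", "metrics": {"test_rmse": 5.6}},
        {"run_id": "run-c", "metrics": {"test_rmse": 5.7}},
    ]
    best = select_best_run(candidates)
    print(f"Best run: {best['run_id']}, test RMSE: {best['metrics']['test_rmse']}")
```

Against a live tracking server, the same ordering can be delegated to MLflow by passing an `order_by` clause of the form `metrics.<name> ASC` to `MlflowClient.search_runs`.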

In [11]:
!python register_model.py --data_path ./output


🏃 View run flawless-perch-299 at: http://127.0.0.1:5000/#/experiments/3/runs/58fd7f12dbe243e4a59f89c96a14747b
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/3
🏃 View run languid-shad-595 at: http://127.0.0.1:5000/#/experiments/3/runs/70dc65200aa84610880418ffbf1ad54f
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/3
🏃 View run serious-lamb-23 at: http://127.0.0.1:5000/#/experiments/3/runs/5933a92fa8d644e8a58625295c2f937b
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/3
🏃 View run clumsy-goat-15 at: http://127.0.0.1:5000/#/experiments/3/runs/5253cf2b47a24ef2a3dcd9ebcfc013d3
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/3
🏃 View run stylish-roo-578 at: http://127.0.0.1:5000/#/experiments/3/runs/7300b82de32c4254bcdf1dad1cbdf1cb
🧪 View experiment at: http://127.0.0.1:5000/#/experiments/3
✅ Best run: 58fd7f12dbe243e4a59f89c96a14747b, test RMSE: 5.5674
Registered model 'random-forest-best' already exists. Creating a new version of this model...
2025