- tracking server: yes, local server
- backend store: sqlite database
- artifacts store: local filesystem

##### To start the MLflow server locally:
(Run one of the following commands in a terminal, from the directory where the database and artifacts should live)

    mlflow server --backend-store-uri sqlite:///backend.db
    mlflow server --backend-store-uri sqlite:///backend.db --default-artifact-root ./artifacts_local

- --default-artifact-root is optional. If you set it, give it a local directory in which to store the artifacts. You can point it at the mlruns folder, but a separate folder keeps the artifacts easier to find.
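The two commands above differ only in the optional artifact flag. Here is a minimal sketch of assembling either one from Python, for example to launch the server from a script. `build_server_command` is a hypothetical helper, not part of MLflow, and it uses only the two flags shown above.

```python
import shlex


def build_server_command(backend_store_uri, artifact_root=None):
    """Return the argument list for `mlflow server` (hypothetical helper)."""
    cmd = ["mlflow", "server", "--backend-store-uri", backend_store_uri]
    if artifact_root is not None:
        # Optional flag: only added when a separate artifact folder is wanted.
        cmd += ["--default-artifact-root", artifact_root]
    return cmd


# Print the command so it can be pasted into a terminal.
print(shlex.join(build_server_command("sqlite:///backend.db", "./artifacts_local")))
```

The returned list can be passed to `subprocess.Popen` to start the server in the background, assuming `mlflow` is on the PATH.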
   
    

In [7]:
import os
import pickle
import click
import mlflow
import numpy as np
from hyperopt import STATUS_OK, Trials, fmin, hp, tpe
from hyperopt.pyll import scope
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import root_mean_squared_error

In [None]:
import mlflow

# Set the tracking URI to the address printed by the `mlflow server` command above.
mlflow.set_tracking_uri("http://127.0.0.1:5000")


In [4]:
print(f"tracking URI: '{mlflow.get_tracking_uri()}'")

tracking URI: 'http://127.0.0.1:5000'
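
Setting the tracking URI does not check that a server is listening there. Below is a small sketch of such a check using only the standard library. It assumes the server exposes a `/health` endpoint at that address, and `server_is_up` is a hypothetical helper name.

```python
import urllib.request


def server_is_up(tracking_uri, timeout=2.0):
    """Return True if GET <tracking_uri>/health answers with HTTP 200."""
    url = tracking_uri.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts and HTTP error responses.
        return False


print(server_is_up("http://127.0.0.1:5000"))
```

If this prints `False`, the `mlflow server` command is probably not running, or it is bound to a different host or port.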


In [6]:
mlflow.search_experiments()

[<Experiment: artifact_location='mlflow-artifacts:/0', creation_time=1748614943845, experiment_id='0', last_update_time=1748614943845, lifecycle_stage='active', name='Default', tags={}>]

In [8]:
mlflow.set_experiment("random-forest-hyperopt")

2025/05/30 14:38:56 INFO mlflow.tracking.fluent: Experiment with name 'random-forest-hyperopt' does not exist. Creating a new experiment.


<Experiment: artifact_location='mlflow-artifacts:/1', creation_time=1748615936206, experiment_id='1', last_update_time=1748615936206, lifecycle_stage='active', name='random-forest-hyperopt', tags={}>

In [10]:
def load_pickle(filename: str):
    with open(filename, "rb") as f_in:
        return pickle.load(f_in)

In [11]:
def run_optimization(data_path: str, num_trials: int):
    # Load the preprocessed train/validation splits from data_path.
    X_train, y_train = load_pickle(os.path.join(data_path, "train.pkl"))
    X_val, y_val = load_pickle(os.path.join(data_path, "val.pkl"))

    def objective(params):
        # Each hyperopt trial is logged as its own MLflow run.
        with mlflow.start_run():
            mlflow.set_tag("model", "rfr")
            mlflow.set_tag("type", "hyperparameter tuning")
            mlflow.log_params(params)

            rf = RandomForestRegressor(**params)
            rf.fit(X_train, y_train)
            y_pred = rf.predict(X_val)
            rmse = root_mean_squared_error(y_val, y_pred)

            mlflow.log_metric("rmse", rmse)
        # hyperopt minimizes the value stored under 'loss'.
        return {'loss': rmse, 'status': STATUS_OK}

    # Integer-valued ranges for each hyperparameter; random_state stays fixed.
    search_space = {
        'max_depth': scope.int(hp.quniform('max_depth', 1, 20, 1)),
        'n_estimators': scope.int(hp.quniform('n_estimators', 10, 50, 1)),
        'min_samples_split': scope.int(hp.quniform('min_samples_split', 2, 10, 1)),
        'min_samples_leaf': scope.int(hp.quniform('min_samples_leaf', 1, 4, 1)),
        'random_state': 42
    }

    rstate = np.random.default_rng(42)  # for reproducible results
    # Run num_trials evaluations of objective with the TPE algorithm.
    best_results = fmin(
        fn=objective,
        space=search_space,
        algo=tpe.suggest,
        max_evals=num_trials,
        trials=Trials(),
        rstate=rstate
    )
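
Each `scope.int(hp.quniform(label, low, high, q))` entry in the search space draws a uniform value, rounds it to a multiple of `q`, and casts it to an integer. Here is a rough standard-library imitation of that sampling rule, which hyperopt describes as `round(uniform(low, high) / q) * q`. `sample_quniform_int` is a hypothetical name, and the sketch only mimics the prior distribution, not the adaptive proposals that TPE makes.

```python
import random


def sample_quniform_int(rng, low, high, q=1):
    """Imitate scope.int(hp.quniform(label, low, high, q)) with the stdlib."""
    value = round(rng.uniform(low, high) / q) * q
    return int(value)


rng = random.Random(42)
# One draw per hyperparameter, using the same ranges as search_space.
params = {
    "max_depth": sample_quniform_int(rng, 1, 20),
    "n_estimators": sample_quniform_int(rng, 10, 50),
    "min_samples_split": sample_quniform_int(rng, 2, 10),
    "min_samples_leaf": sample_quniform_int(rng, 1, 4),
    "random_state": 42,
}
print(params)
```

A dictionary of this shape is what `objective` receives as `params` on each trial and passes to `RandomForestRegressor(**params)`.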

In [14]:
run_optimization(data_path="/workspaces/MLOps/02-experiment_tracking/processed_green_trip_data", num_trials=20)

  0%|          | 0/20 [00:00<?, ?trial/s, best loss=?]

🏃 View run omniscient-zebra-686 at: http://127.0.0.1:5000/#/experiments/1/runs/c76aaa44777447a1988ada719cb968b6

🧪 View experiment at: http://127.0.0.1:5000/#/experiments/1

🏃 View run skillful-mink-698 at: http://127.0.0.1:5000/#/experiments/1/runs/c204cf84684b448d869e56e146b65dbd

🧪 View experiment at: http://127.0.0.1:5000/#/experiments/1                   

🏃 View run hilarious-hare-200 at: http://127.0.0.1:5000/#/experiments/1/runs/118b8cc8575f4406a909f7b185875358

🧪 View experiment at: http://127.0.0.1:5000/#/experiments/1                   

🏃 View run learned-kite-371 at: http://127.0.0.1:5000/#/experiments/1/runs/54ebbc0881c3434983267c6016ffe7ca

🧪 View experiment at: http://127.0.0.1:5000/#/experiments/1                   

🏃 View run brawny-gnat-30 at: http://127.0.0.1:5000/#/experiments/1/runs/58512c7d60104081941a963c92fb6e25

🧪 View experiment at: http://127.0.0.1:5000/#/experiments/1                   

🏃 View run respected-shrew-130 at: http://127.0.0.1:5000/#/experiment