https://mlflow.org/docs/latest/ml/tracking/#tracking_server

# Mimic running an MLflow server
* experiment data in PostgreSQL
* artifacts in an S3 bucket (mimicked by MinIO)

1. Install the necessary libraries

In [1]:
%%bash
pip install mlflow psycopg2 boto3




2. Set up environment variables so MLflow uses the S3 mimic built with Docker

* Key differences: MinIO (fake S3) / AWS S3

| Component       | MinIO (Local)                                | Real S3 (AWS)                  |
|-----------------|----------------------------------------------|--------------------------------|
| **Endpoint**    | `MLFLOW_S3_ENDPOINT_URL=http://localhost:9000` | Not needed (uses AWS default)  |
| **Access Key**  | `AWS_ACCESS_KEY_ID=minio_user`               | `AWS_ACCESS_KEY_ID=your_aws_key` |
| **Secret Key**  | `AWS_SECRET_ACCESS_KEY=minio_password`       | `AWS_SECRET_ACCESS_KEY=your_aws_secret` |
| **Region**      | Any (MinIO ignores it)                       | Must be your actual AWS region |
| **Buckets**     | `s3://s3mimick` (created by `mc` command)    | `s3://your-existing-bucket`    |
| **Docker Services** | PostgreSQL + MinIO + `mc`                | PostgreSQL only                |
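
The differences in the table can be captured in a small helper that builds the environment for either backend. This is only a sketch: the function name `s3_env` and its parameters are hypothetical, the MinIO values are the ones used in this notebook, and the AWS values are placeholders.

```python
def s3_env(use_minio: bool, access_key: str, secret_key: str,
           region: str = "us-east-1",
           minio_endpoint: str = "http://localhost:9000") -> dict:
    """Return the environment variables for MinIO or for real AWS S3."""
    env = {
        "AWS_ACCESS_KEY_ID": access_key,
        "AWS_SECRET_ACCESS_KEY": secret_key,
        "AWS_DEFAULT_REGION": region,
    }
    if use_minio:
        # Only MinIO needs an explicit endpoint; AWS resolves its own.
        env["MLFLOW_S3_ENDPOINT_URL"] = minio_endpoint
    return env


# Placeholder credentials, matching the table above.
local = s3_env(True, "minio_user", "minio_password")
cloud = s3_env(False, "your_aws_key", "your_aws_secret", region="eu-west-1")
print(sorted(local))
print(sorted(cloud))
```

The returned dictionary could be applied with `os.environ.update(...)` before creating the MLflow client.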


In [None]:
%%bash
# needed because bash does not necessarily know where the .env file is, so MLflow could not find the AWS credentials
export AWS_ACCESS_KEY_ID=minio_user
export AWS_SECRET_ACCESS_KEY=minio_password
export AWS_DEFAULT_REGION=us-east-1
# only needed with MinIO; real AWS resolves its endpoint automatically
export AWS_ENDPOINT_URL=http://localhost:9000
# for security, restrict registered model sources to specific S3 buckets with a regular expression
export MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX="^s3://(production-models|staging-models)/.*$"
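
To see which sources the validation pattern above would accept, it can be tried against a few URIs with Python's `re` module. This is a sketch: the helper name `is_allowed_source` and the sample URIs are made up for illustration, and how MLflow applies the pattern internally is not shown here.

```python
import re

# Same pattern as MLFLOW_CREATE_MODEL_VERSION_SOURCE_VALIDATION_REGEX above.
SOURCE_REGEX = re.compile(r"^s3://(production-models|staging-models)/.*$")


def is_allowed_source(uri: str) -> bool:
    """True if `uri` points into one of the two allowed buckets."""
    return SOURCE_REGEX.match(uri) is not None


# Illustrative URIs only.
candidates = [
    "s3://production-models/1/abc/artifacts/model",
    "s3://staging-models/model",
    "s3://s3mimick/1/abc/artifacts/model",
    "mlflow-artifacts:/1/abc/artifacts/model",
]
for uri in candidates:
    print(uri, "->", is_allowed_source(uri))
```

Only the first two URIs match. Sources in any other bucket, or using another scheme, do not.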



In [3]:
%%bash 

echo ${AWS_ACCESS_KEY_ID}
echo ${AWS_SECRET_ACCESS_KEY}
echo ${AWS_DEFAULT_REGION}
echo ${AWS_ENDPOINT_URL}

minio_user
minio_password
us-east-1
http://localhost:9000


3. Set up the remote data stores (configured in the .yaml file; PostgreSQL for metadata and MinIO for artifacts)



In [4]:
%%bash 
# docker compose is able to read the .env file
docker compose -f compose_mimick_S3.yaml --env-file .env up -d

 Container end-to-end-ml-pipeline-minio-1  Running
 Container end-to-end-ml-pipeline-postgres-1  Running
 Container end-to-end-ml-pipeline-minio-create-s3_mimick-1  Recreate
 Container end-to-end-ml-pipeline-minio-create-s3_mimick-1  Recreated
 Container end-to-end-ml-pipeline-minio-1  Waiting
 Container end-to-end-ml-pipeline-minio-1  Healthy
 Container end-to-end-ml-pipeline-minio-create-s3_mimick-1  Starting
 Container end-to-end-ml-pipeline-minio-create-s3_mimick-1  Started


4. Start the MLflow server with the bucket location as artifact destination


In [6]:
%%bash 

# free the port for mlflow (a "port already in use" error is not visible from the notebook)
lsof -ti:5005 | xargs kill -9

# start the mlflow server
mlflow server \
    --backend-store-uri postgresql://${MIMICK_POSTGRES_USER}:${MIMICK_POSTGRES_PASSWORD}@localhost:5432/${MIMICK_POSTGRES_DATABASE} \
    --artifacts-destination ${MIMICK_S3_BUCKET} \
    --host 0.0.0.0 --port 5005 \
    --gunicorn-opts "--daemon"
# the gunicorn --daemon option runs the server in the background so the following notebook cells can run

2025/09/28 18:32:13 INFO mlflow.store.db.utils: Creating initial MLflow database tables...
2025/09/28 18:32:13 INFO mlflow.store.db.utils: Updating database tables
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
2025/09/28 18:32:13 INFO mlflow.store.db.utils: Creating initial MLflow database tables...
2025/09/28 18:32:13 INFO mlflow.store.db.utils: Updating database tables
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
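
The `--backend-store-uri` above is assembled from the `MIMICK_POSTGRES_*` variables by plain string substitution, which breaks if the password contains characters such as `@`, `:` or `/`. A minimal sketch of building the URI with percent-encoding follows; the helper name `postgres_uri` and the sample credentials are hypothetical.

```python
from urllib.parse import quote_plus


def postgres_uri(user: str, password: str, database: str,
                 host: str = "localhost", port: int = 5432) -> str:
    """Build a postgresql:// URI, percent-encoding user and password."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Made-up credentials with characters that would otherwise confuse URI parsing.
uri = postgres_uri("mlflow", "p@ss:word/1", "mlflow_db")
print(uri)
```

The resulting string can be passed to `--backend-store-uri` as is.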


In [None]:
%%bash
echo ${MIMICK_S3_BUCKET}
echo ${MIMICK_POSTGRES_USER}
echo ${MIMICK_POSTGRES_PASSWORD}
echo ${MIMICK_POSTGRES_DATABASE}

In [4]:
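# verify mlflow is up and running
# (the port must match the one given to `mlflow server --port`; the outputs shown here come from a server on 5002)
import requests
requests.get("http://localhost:5002/health")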

<Response [200]>
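
Because the server is started as a daemon, it may not be ready when the next cell runs. A small retry loop around the health check, sketched below, avoids a spurious connection error. The helper names `wait_until_healthy` and `mlflow_is_up` are hypothetical, and the base URL is the one used in the cell above.

```python
import time
from typing import Callable


def wait_until_healthy(check: Callable[[], bool], retries: int = 10,
                       delay: float = 0.5) -> bool:
    """Call `check` until it returns True or `retries` attempts are used up."""
    for _ in range(retries):
        try:
            if check():
                return True
        except Exception:
            # Typically a refused connection while the server is starting.
            pass
        time.sleep(delay)
    return False


def mlflow_is_up(base_url: str = "http://localhost:5002") -> bool:
    """Probe the /health endpoint used above (requires `requests`)."""
    import requests
    return requests.get(f"{base_url}/health", timeout=2).status_code == 200
```

Calling `wait_until_healthy(mlflow_is_up)` returns `True` as soon as the endpoint answers with status 200, and `False` if it never does within the allowed attempts.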

5. Log to the MLflow server

In [5]:
# if starting from here, make sure the AWS environment variables are set so MLflow is granted access
# import os
# os.environ['AWS_ACCESS_KEY_ID'] = "minio_user"
# os.environ['AWS_SECRET_ACCESS_KEY'] = "minio_password" 
# os.environ['AWS_DEFAULT_REGION'] = "us-east-1"
# os.environ['AWS_ENDPOINT_URL'] = "http://localhost:9000"


import mlflow
mlflow.set_tracking_uri("http://localhost:5002")
# a real use case would point to the actual running MLflow server

In [9]:
%%bash

echo ${MLFLOW_TRACKING_URI}

http://localhost:5005
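
The environment variable above shows port 5005 while `mlflow.set_tracking_uri` uses 5002. Reading the URI from a single place keeps the two from drifting apart. A minimal sketch, with a hypothetical helper name `resolve_tracking_uri`:

```python
import os


def resolve_tracking_uri(default: str = "http://localhost:5005") -> str:
    """Prefer MLFLOW_TRACKING_URI from the environment, else `default`."""
    return os.environ.get("MLFLOW_TRACKING_URI") or default


tracking_uri = resolve_tracking_uri()
print(tracking_uri)
# mlflow.set_tracking_uri(tracking_uri)  # then point the client at it
```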


6. Send logs to PostgreSQL and artifacts to the S3 bucket (MinIO)

In [6]:
# test the connection to postgres and the S3 storage
mlflow.set_experiment("Test loads on postgres & s3 bucket")
with mlflow.start_run():
    mlflow.log_params({
        "search_space_max_iter": "arange(100, 1000, 100)"
    })
    mlflow.log_artifact("./requirements.txt")
      

2025/09/28 20:24:59 INFO mlflow.tracking.fluent: Experiment with name 'Test loads on postgres & s3 bucket' does not exist. Creating a new experiment.


🏃 View run nosy-stork-949 at: http://localhost:5002/#/experiments/1/runs/e6c41562e1a446e091ea5448a6abb367
🧪 View experiment at: http://localhost:5002/#/experiments/1


# Load Experiment on Cloud

In [8]:
import numpy as np
import mlflow
from scipy.stats import loguniform, uniform
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


# Load files
X_train = np.loadtxt("src/data/X_train.csv", delimiter=",")
X_test = np.loadtxt("src/data/X_test.csv", delimiter=",")
y_train = np.loadtxt("src/data/y_train.csv", delimiter=",")
y_test = np.loadtxt("src/data/y_test.csv", delimiter=",")


mlflow.set_experiment("load model on cloud #1")

# from create_trained_model import train_and_log_model

# train_and_log_model(n_iter = 50, random_state = 44)

with mlflow.start_run(run_name="Randomized Hyperparameter Search"):
    pipeline = Pipeline([
        ('scaler', StandardScaler()),
        ('classifier', LogisticRegression(solver='saga', penalty='elasticnet'))
            ])
    
    param_distributions = {
            'classifier__C': loguniform(1e-5, 100),
            'classifier__l1_ratio': uniform(0, 1),
            'classifier__max_iter': np.arange(100, 1000, 100)
        }
    mlflow.log_params({
            "search_space_C": f"loguniform({1e-5}, {100})",
            "search_space_l1_ratio": "uniform(0, 1)",
            "search_space_max_iter": "arange(100, 1000, 100)"
        })
    
    # to create A/B testing
    n_iter = 2
    random_state = 44
    mlflow.log_params({
            "n_iter": n_iter,
            "random_state": random_state,
        })
    
    print(f"Running RandomizedSearchCV with n_iter={n_iter}...")
    random_search = RandomizedSearchCV( estimator=pipeline,
                                        param_distributions=param_distributions,
                                        n_iter=n_iter,
                                        cv=8,  # 8-fold cross-validation
                                        scoring='roc_auc',  # Use ROC AUC score for evaluation
                                        random_state=random_state,
                                        n_jobs=-1,  # Use all available CPU cores
                                        )

    random_search.fit(X_train, y_train)

    # --- 4. Log Best Results to MLflow ---
    # Get the best parameters and score from the search
    best_params = random_search.best_params_
    best_score = random_search.best_score_
    best_estimator = random_search.best_estimator_

    # MLflow will log these as a single set of parameters for this run.
    print("Logging best parameters and cross-validation score...")
    mlflow.log_params(best_params)
    mlflow.log_metric("best_cv_roc_auc", best_score)

    # Log the best estimator's details
    print("Logging best estimator model...")
    model_info = mlflow.sklearn.log_model(
        sk_model=best_estimator,
        name="Best_model",
        input_example=X_train[:5],  # a few rows are enough to record the input schema
        registered_model_name="Best-logreg-from-RandomSearch",  # also registers the model in the Model Registry
    )
    mlflow.set_logged_model_tags(
        model_info.model_id, {"Training Info": "model for A/B testing",
                              "random_state": random_state}
    )
    # URI of the logged model, kept as the baseline for the comparison further down
    base_model_uri = model_info.model_uri
    # --- 5. Evaluate the Best Model on the Test Set ---
    print("Evaluating the best model on the test set...")
    y_pred = best_estimator.predict(X_test)
    y_pred_proba = best_estimator.predict_proba(X_test)[:, 1]

    test_accuracy = accuracy_score(y_test, y_pred)
    test_roc_auc = roc_auc_score(y_test, y_pred_proba)

    # Log final metrics on the test set
    print("Logging final test metrics...")
    mlflow.log_metric("test_accuracy", test_accuracy)
    mlflow.log_metric("test_roc_auc", test_roc_auc)

Running RandomizedSearchCV with n_iter=2...




Logging best parameters and cross-validation score...
Logging best estimator model...


Registered model 'Best-logreg-from-RandomSearch' already exists. Creating a new version of this model...
2025/09/28 20:44:56 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: Best-logreg-from-RandomSearch, version 2
Created version '2' of model 'Best-logreg-from-RandomSearch'.


Evaluating the best model on the test set...
Logging final test metrics...
🏃 View run Randomized Hyperparameter Search at: http://localhost:5002/#/experiments/2/runs/3f1c16345ba546189b1cdeca15dcb899
🧪 View experiment at: http://localhost:5002/#/experiments/2
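
The search space above draws `C` from `loguniform(1e-5, 100)`, meaning the logarithm of `C` is uniform, so each decade between 1e-5 and 100 is equally likely. The standard-library sketch below illustrates the idea. It is not SciPy's implementation, and the helper name `sample_loguniform` is hypothetical.

```python
import math
import random


def sample_loguniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(44)
draws = [sample_loguniform(1e-5, 100, rng) for _ in range(1000)]

# About half of the draws fall below the geometric midpoint sqrt(low * high).
midpoint = math.sqrt(1e-5 * 100)
share_below = sum(d < midpoint for d in draws) / len(draws)
print(f"min={min(draws):.2e} max={max(draws):.2e} share below midpoint={share_below:.2f}")
```

A plain uniform distribution on the same interval would almost never propose values below 1, which is why a log-uniform prior is the usual choice for a regularisation strength.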


In [9]:
import numpy as np
import mlflow
import pandas as pd
from scipy.stats import loguniform, uniform
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler



# model name used later to retrieve the model URI
model_name = "Best_model_rs_1"
n_iter=2
random_state=1

# Load files
X_train = np.loadtxt("src/data/X_train.csv", delimiter=",")
X_test = np.loadtxt("src/data/X_test.csv", delimiter=",")
y_train = np.loadtxt("src/data/y_train.csv", delimiter=",")
y_test = np.loadtxt("src/data/y_test.csv", delimiter=",")

# evaluation with mlflow
from sklearn import datasets
d = datasets.load_breast_cancer()
eval_data = X_test.copy()
eval_data = pd.DataFrame(eval_data, columns=d.feature_names)
eval_data["label"] = y_test



mlflow.set_experiment("load model on cloud #2")

# from create_trained_model import train_and_log_model
# train_and_log_model(n_iter = 50, random_state = 44)

with mlflow.start_run(run_name="Randomized Hyperparameter Search"):
    pipeline = Pipeline([
        ('scaler', StandardScaler()),
        ('classifier', LogisticRegression(solver='saga', penalty='elasticnet'))
            ])
    
    param_distributions = {
            'classifier__C': loguniform(1e-5, 100),
            'classifier__l1_ratio': uniform(0, 1),
            'classifier__max_iter': np.arange(100, 1000, 100)
        }
    
    print(f"Running RandomizedSearchCV with n_iter={n_iter}...")
    random_search = RandomizedSearchCV( estimator=pipeline,
                                        param_distributions=param_distributions,
                                        n_iter=n_iter,
                                        cv=8,  # 8-fold cross-validation
                                        scoring='roc_auc',  # Use ROC AUC score for evaluation
                                        random_state=random_state,
                                        n_jobs=-1,  # Use all available CPU cores
                                        )

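    random_search.fit(X_train, y_train)

    # Get the best parameters, score, and estimator from this search
    best_params = random_search.best_params_
    best_score = random_search.best_score_
    best_estimator = random_search.best_estimator_

    # MLflow will log these as a single set of parameters for this run.
    print("Logging best parameters and cross-validation score...")
    mlflow.log_params(best_params)
    mlflow.log_metric("best_cv_roc_auc", best_score)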

    # Infer signature
    print("Creating signature for model...")
    from mlflow.models import infer_signature
    signature = infer_signature(X_test, random_search.best_estimator_.predict(X_test))
    # Log the best estimator's details
    print("Logging best estimator model...")
    model_info = mlflow.sklearn.log_model(
        sk_model=best_estimator,
        name=f"{model_name}",
        input_example=X_train[:5],  # a few rows are enough to record the input schema
        registered_model_name="Best-logreg-from-RandomSearch",  # also registers the model in the Model Registry
        tags={"Training Info": "model for A/B testing",
                              "random_state":random_state},
        signature=signature
    )

    model_uri = mlflow.get_artifact_uri(f"{model_name}")
        
    # --- 5. Evaluate the Best Model on the Test Set ---
    # Comprehensive evaluation with MLflow
    result = mlflow.evaluate(
        model_info.model_uri,
        eval_data,
        targets="label",
        model_type="classifier", 
        evaluators=["default"],
    )

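    # Compute and log final metrics on the test set for this run's model
    print("Logging final test metrics...")
    y_pred = random_search.best_estimator_.predict(X_test)
    y_pred_proba = random_search.best_estimator_.predict_proba(X_test)[:, 1]
    mlflow.log_metric("test_accuracy", accuracy_score(y_test, y_pred))
    mlflow.log_metric("test_roc_auc", roc_auc_score(y_test, y_pred_proba))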

2025/09/28 20:45:37 INFO mlflow.tracking.fluent: Experiment with name 'load model on cloud #2' does not exist. Creating a new experiment.


Running RandomizedSearchCV with n_iter=2...
Logging best parameters and cross-validation score...
Creature signature for model...
Logging best estimator model...


Registered model 'Best-logreg-from-RandomSearch' already exists. Creating a new version of this model...
2025/09/28 20:45:40 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: Best-logreg-from-RandomSearch, version 3
Created version '3' of model 'Best-logreg-from-RandomSearch'.
Downloading artifacts: 100%|██████████| 7/7 [00:00<00:00, 14.30it/s]
2025/09/28 20:45:45 INFO mlflow.tracking.fluent: Active model is set to the logged model with ID: m-7946d37476964a01bbcb0d6b984249d8
2025/09/28 20:45:45 INFO mlflow.tracking.fluent: Use `mlflow.set_active_model` to set the active model to a different one if needed.
2025/09/28 20:45:46 INFO mlflow.models.evaluation.evaluators.classifier: The evaluation dataset is inferred as binary dataset, positive label is 1.0, negative label is 0.0.
2025/09/28 20:45:46 INFO mlflow.models.evaluation.default_evaluator: Testing metrics on f

Logging final test metrics...
🏃 View run Randomized Hyperparameter Search at: http://localhost:5002/#/experiments/3/runs/da5ac2607b294d9c910df48854c4c48a
🧪 View experiment at: http://localhost:5002/#/experiments/3


<Figure size 1050x700 with 0 Axes>

In [11]:
model_uri

'mlflow-artifacts:/3/da5ac2607b294d9c910df48854c4c48a/artifacts/Best_model_rs_1'

In [12]:
from mlflow.models import MetricThreshold

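# First, evaluate your scikit-learn model.
# Use the logged-model URI (model_info.model_uri), as in the evaluation call of the training cell above;
# the run-artifact URI stored in `model_uri` may be what stalled the download (see the output below).
result = mlflow.evaluate(model_info.model_uri, eval_data, targets="label", model_type="classifier")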

# Define quality thresholds for classification models
quality_thresholds = {
    "accuracy_score": MetricThreshold(threshold=0.85, greater_is_better=True),
    "f1_score": MetricThreshold(threshold=0.80, greater_is_better=True),
    "roc_auc": MetricThreshold(threshold=0.75, greater_is_better=True),
}

# Validate model meets quality standards
try:
    mlflow.validate_evaluation_results(
        candidate_result=result,
        validation_thresholds=quality_thresholds,
    )
    print("✅ Scikit-learn model meets all quality thresholds")
except mlflow.exceptions.ModelValidationFailedException as e:
    print(f"❌ Model failed validation: {e}")

# Compare against baseline model (e.g., previous model version)
baseline_result = mlflow.evaluate(
    base_model_uri, eval_data, targets="label", model_type="classifier"
)

# Validate improvement over baseline
improvement_thresholds = {
    "f1_score": MetricThreshold(
        min_absolute_change=0.02, greater_is_better=True  # must beat the baseline F1 by at least 0.02 (absolute)
    ),
}

try:
    mlflow.validate_evaluation_results(
        candidate_result=result,
        baseline_result=baseline_result,
        validation_thresholds=improvement_thresholds,
    )
    print("✅ New model improves over baseline")
except mlflow.exceptions.ModelValidationFailedException as e:
    print(f"❌ Model doesn't improve sufficiently: {e}")



Downloading artifacts:   0%|          | 0/1 [02:15<?, ?it/s]


KeyboardInterrupt: 
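
The cell above mixes two kinds of checks: an absolute quality bar, and a required gain over a baseline. The sketch below spells out that distinction in plain Python for a metric where greater is better. It is a hypothetical illustration (the function `passes` and the sample scores are made up), not MLflow's implementation.

```python
from typing import Optional


def passes(candidate: float, baseline: Optional[float] = None,
           threshold: Optional[float] = None,
           min_absolute_change: Optional[float] = None) -> bool:
    """Hypothetical check for a metric where greater is better."""
    # Absolute bar: the candidate alone must reach `threshold`.
    if threshold is not None and candidate < threshold:
        return False
    # Relative bar: the candidate must beat the baseline by a margin.
    if min_absolute_change is not None:
        if baseline is None:
            raise ValueError("min_absolute_change needs a baseline")
        if candidate - baseline < min_absolute_change:
            return False
    return True


# Made-up scores for illustration.
print(passes(0.91, threshold=0.85))
print(passes(0.91, baseline=0.90, min_absolute_change=0.02))
print(passes(0.93, baseline=0.90, min_absolute_change=0.02))
```

A candidate can clear the absolute bar and still fail the comparison, as the second call shows: 0.91 is above 0.85 but only 0.01 better than the baseline.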

# DEBUGGING

## debug minio access

In [13]:
import os

# Set the environment variables
os.environ['AWS_ACCESS_KEY_ID'] = os.getenv('MIMICK_S3_USER')
os.environ['AWS_SECRET_ACCESS_KEY'] = os.getenv('MIMICK_S3_PASSWORD')
os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'  # MinIO default
os.environ['MLFLOW_S3_ENDPOINT_URL'] = "http://172.19.0.2:9000"  # MinIO container IP on the Docker network; from the host, http://localhost:9000 is the published endpoint


In [14]:
import boto3
try:
    s3 = boto3.client('s3')
    s3.head_bucket(Bucket='s3mimick')
    print("S3 connection successful")
except Exception as e:
    print(f"S3 connection failed: {e}")

S3 connection successful


In [18]:
import os
print("Current environment variables:")
print(f"AWS_ACCESS_KEY_ID: {os.environ.get('AWS_ACCESS_KEY_ID', 'NOT SET')}")
print(f"AWS_SECRET_ACCESS_KEY: {os.environ.get('AWS_SECRET_ACCESS_KEY', 'NOT SET')}")
print(f"MLFLOW_S3_ENDPOINT_URL: {os.environ.get('MLFLOW_S3_ENDPOINT_URL', 'NOT SET')}")

# Check what your Docker environment variables are
print(f"MIMICK_S3_USER: {os.environ.get('MIMICK_S3_USER', 'NOT SET')}")
print(f"MIMICK_S3_PASSWORD: {os.environ.get('MIMICK_S3_PASSWORD', 'NOT SET')}")

Current environment variables:
AWS_ACCESS_KEY_ID: minio_user
AWS_SECRET_ACCESS_KEY: minio_password
MLFLOW_S3_ENDPOINT_URL: http://172.19.0.2:9000
MIMICK_S3_USER: minio_user
MIMICK_S3_PASSWORD: minio_password
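
Printing secrets in full is harmless with the MinIO test credentials, but the same cell run against real AWS keys would leave them in the notebook output. A minimal sketch of a masked variant, with a hypothetical helper name `masked`:

```python
import os
from typing import Optional


def masked(value: Optional[str], visible: int = 2) -> str:
    """Show only the first `visible` characters of a secret."""
    if value is None:
        return "NOT SET"
    return value[:visible] + "*" * max(len(value) - visible, 0)


for name in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"):
    print(f"{name}: {masked(os.environ.get(name))}")
```

This still shows whether a variable is set and roughly which value it holds, which is usually all the debugging step needs.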


## debug credential connections

In [16]:
%%bash
# verify the container is up and running
echo "Checking Docker containers:"
docker compose -f compose_mimick_S3.yaml ps

echo -e "\nChecking container logs for bucket creation:"
docker compose -f compose_mimick_S3.yaml logs minio-create-s3_mimick

Checking Docker containers:
NAME                                IMAGE             COMMAND                  SERVICE    CREATED       STATUS                 PORTS
end-to-end-ml-pipeline-minio-1      minio/minio       "/usr/bin/docker-ent…"   minio      3 hours ago   Up 3 hours (healthy)   0.0.0.0:9000-9001->9000-9001/tcp, [::]:9000-9001->9000-9001/tcp
end-to-end-ml-pipeline-postgres-1   postgres:latest   "docker-entrypoint.s…"   postgres   3 hours ago   Up 3 hours             0.0.0.0:5432->5432/tcp, [::]:5432->5432/tcp

Checking container logs for bucket creation:
minio-create-s3_mimick-1  | Added `minio` successfully.
minio-create-s3_mimick-1  | mc: <ERROR> Unable to list folder. Bucket `staging-models` does not exist.
minio-create-s3_mimick-1  | Bucket created successfully `minio/staging-models`.
minio-create-s3_mimick-1  | Bucket created successfully `minio/production-models`.


Verify that a connection can be made to the bucket

In [17]:
import os
import boto3
from botocore.client import Config

try:
    # Create S3 client with MinIO endpoint
    s3 = boto3.client(
        's3',
        endpoint_url=os.environ.get('MLFLOW_S3_ENDPOINT_URL'),
        aws_access_key_id=os.environ.get('MIMICK_S3_USER'),
        aws_secret_access_key=os.environ.get('MIMICK_S3_PASSWORD'),
        config=Config(signature_version='s3v4'),
        region_name='us-east-1'
    )
    
    # Test connection
    response = s3.list_buckets()
    print("S3 connection successful!")
    print("Available buckets:", [bucket['Name'] for bucket in response['Buckets']])
    
    # Test specific bucket
    s3.head_bucket(Bucket='s3mimick')
    print("Bucket 's3mimick' is accessible!")
    
except Exception as e:
    print(f"S3 connection failed: {e}")
    import traceback
    traceback.print_exc()

S3 connection failed: Connect timeout on endpoint URL: "http://172.19.0.2:9000/"


Traceback (most recent call last):
  File "/Applications/anaconda3/envs/ML_Flow/lib/python3.10/site-packages/urllib3/connection.py", line 199, in _new_conn
    sock = connection.create_connection(
  File "/Applications/anaconda3/envs/ML_Flow/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/Applications/anaconda3/envs/ML_Flow/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
TimeoutError: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Applications/anaconda3/envs/ML_Flow/lib/python3.10/site-packages/botocore/httpsession.py", line 465, in send
    urllib_response = conn.urlopen(
  File "/Applications/anaconda3/envs/ML_Flow/lib/python3.10/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
  File "/Applications/anaconda3/envs/ML_Flow/lib/python3.10/si
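
The timeout above comes from the network layer, before any S3 credentials are checked. A plain TCP probe separates "endpoint unreachable" from "credentials rejected" much faster than waiting for the boto3 retries. The sketch below uses only the standard library; the helper name `can_connect` is hypothetical, and the two URLs are the endpoints that appear in this notebook.

```python
import socket
from urllib.parse import urlparse


def can_connect(endpoint_url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the endpoint's host:port succeeds."""
    parsed = urlparse(endpoint_url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


for url in ("http://localhost:9000", "http://172.19.0.2:9000"):
    print(url, "reachable:", can_connect(url))
```

If the published `localhost` endpoint is reachable and the container IP is not, the fix is in `MLFLOW_S3_ENDPOINT_URL` rather than in the credentials.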

## debug environment variables

```bash
✗ conda activate ML_Flow
✗ source .env
✗ docker compose -f compose_mimick_S3.yaml down
✗ mlflow server --backend-store-uri postgresql://${MIMICK_POSTGRES_USER}:${MIMICK_POSTGRES_PASSWORD}@localhost:5432/${MIMICK_POSTGRES_DATABASE} --artifacts-destination ${MIMICK_S3_BUCKET} --host 0.0.0.0 --port 5003
```
==> unable to locate credentials


```
✗ conda activate ML_Flow
✗ export AWS_ACCESS_KEY_ID=minio_user
✗ export AWS_SECRET_ACCESS_KEY=minio_password
✗ source .env
✗ docker compose -f compose_mimick_S3.yaml down
✗ mlflow server --backend-store-uri postgresql://${MIMICK_POSTGRES_USER}:${MIMICK_POSTGRES_PASSWORD}@localhost:5432/${MIMICK_POSTGRES_DATABASE} --artifacts-destination ${MIMICK_S3_BUCKET} --host 0.0.0.0 --port 5003
```
==> The AWS Access Key Id you provided does not exist in our records.


* In the VS Code terminal, the .env file can't be accessed by MLflow
```
✗ conda activate ML_Flow
✗ docker compose -f compose_mimick_S3.yaml up -d
✗ source .env
✗ mlflow server --backend-store-uri postgresql://${MIMICK_POSTGRES_USER}:${MIMICK_POSTGRES_PASSWORD}@localhost:5432/${MIMICK_POSTGRES_DATABASE} --artifacts-destination ${MIMICK_S3_BUCKET} --host 0.0.0.0 --port 5003
```
==> no credentials
* It is necessary to export the variables manually in the VS Code terminal (it struggles to find the .env path)
```
✗ export AWS_ENDPOINT_URL=http://localhost:9000
✗ export AWS_ACCESS_KEY_ID=minio_user
✗ export AWS_SECRET_ACCESS_KEY=minio_password
✗ mlflow server --backend-store-uri postgresql://${MIMICK_POSTGRES_USER}:${MIMICK_POSTGRES_PASSWORD}@localhost:5432/${MIMICK_POSTGRES_DATABASE} --artifacts-destination ${MIMICK_S3_BUCKET} --host 0.0.0.0 --port 5003
```
==> ok
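
When working from Python rather than a terminal, the manual exports can be replaced by loading the `.env` file into `os.environ` before MLflow is used. The parser below is a deliberately simplified sketch: it handles only `KEY=VALUE` lines with an optional `export` prefix, and does no variable interpolation or multi-line values. The function name `load_env_file` is hypothetical, and the layout of the actual `.env` file is an assumption.

```python
import os


def load_env_file(path: str, override: bool = False) -> dict:
    """Parse KEY=VALUE lines from `path` and export them via os.environ."""
    loaded = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blanks, comments, and anything that is not an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            if line.startswith("export "):
                line = line[len("export "):]
            key, _, value = line.partition("=")
            key, value = key.strip(), value.strip().strip("'\"")
            # Existing variables win unless override is requested.
            if override or key not in os.environ:
                os.environ[key] = value
            loaded[key] = value
    return loaded
```

After `load_env_file(".env")`, the `MIMICK_*` values are visible to `os.getenv`, so they can be copied into the `AWS_*` variables as in the MinIO debugging cell above.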