#### Locking model dependencies in MLflow means freezing the exact software environment used to train a model so that it can be reproduced later. By default, MLflow records only high-level dependencies, whose resolved versions can drift over time and cause reproducibility issues. When locking is enabled, MLflow pins every dependency, including transitive ones, to the exact version used during training, so the model is loaded or served later against the same package versions. Dependency locking matters most for teaching, reproducible research, models stored in the model registry, and production deployment; it is less critical during early experimentation. To use it, both MLflow and the uv package (pip install uv) must be installed. Locking is enabled by setting MLFLOW_LOCK_MODEL_DEPENDENCIES=true before the model is logged. In notebooks or IDEs, setting this variable inside Python with os.environ is usually the safest approach, because a variable exported in a separate shell may not reach the running kernel.
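
#### Before logging a model, it can help to confirm that both prerequisites are in place. The sketch below is only an illustration: the helper name locking_prerequisites is hypothetical (it is not part of MLflow), and it assumes that uv is discovered through the PATH of the current process.

```python
import os
import shutil


def locking_prerequisites() -> dict[str, bool]:
    """Report whether the two prerequisites described above appear to be met.

    Hypothetical helper, not an MLflow API. Assumes uv is looked up on PATH.
    """
    flag = os.environ.get("MLFLOW_LOCK_MODEL_DEPENDENCIES", "")
    return {
        # True if a `uv` executable is visible to this Python process.
        "uv_on_path": shutil.which("uv") is not None,
        # True if the variable is set to "true" (case-insensitive).
        "lock_flag_set": flag.lower() == "true",
    }


# Set the variable inside Python so the running kernel definitely sees it.
os.environ["MLFLOW_LOCK_MODEL_DEPENDENCIES"] = "true"
print(locking_prerequisites())
```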

#### A simple example of a transitive dependency is numpy when you use scikit-learn. You explicitly install and import scikit-learn in your code, but scikit-learn itself depends on other packages such as numpy, scipy, joblib, and threadpoolctl. Even though your code never imports joblib or threadpoolctl, they are installed automatically because scikit-learn needs them to function. Those indirectly installed packages are transitive dependencies.
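
#### To see this in your own environment, the standard library's importlib.metadata can list the requirements a package declares. The sketch below uses a hypothetical helper, direct_requirements, and a simplified name parser; it returns an empty list when the package is not installed, and the exact output depends on the installed scikit-learn version.

```python
import re
from importlib.metadata import PackageNotFoundError, requires


def direct_requirements(package: str) -> list[str]:
    """Return the names of the packages that `package` declares as required.

    Hypothetical helper for illustration; optional extras are skipped.
    """
    try:
        declared = requires(package) or []
    except PackageNotFoundError:
        return []  # the package is not installed in this environment
    names = []
    for spec in declared:
        # Skip optional extras, e.g. 'pytest>=7; extra == "tests"'.
        if re.search(r"extra\s*==", spec):
            continue
        # Keep only the leading distribution name, dropping version specifiers.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", spec)
        if match:
            names.append(match.group(0))
    return names


# From your code's point of view, these packages are transitive dependencies.
print(direct_requirements("scikit-learn"))
```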

In [2]:
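import os

# Enable dependency locking before any model is logged.
os.environ["MLFLOW_LOCK_MODEL_DEPENDENCIES"] = "true"

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Confirm that the running Python process actually sees the variable.
print("MLFLOW_LOCK_MODEL_DEPENDENCIES in Python:", os.environ.get("MLFLOW_LOCK_MODEL_DEPENDENCIES"))

mlflow.set_experiment("MLflow Dependencies Quickstart")
# With autologging on, calling fit() logs parameters, metrics, and the model.
mlflow.sklearn.autolog()

# Load the Iris dataset and hold out 20% of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train the classifier; the model is logged during this call.
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)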
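
#### One way to check that locking took effect is to open the requirements.txt stored with the logged model and look for entries that are not pinned with ==. The sketch below works on plain text so that it runs without MLflow; the helper name unpinned_requirements is hypothetical, the parsing is deliberately simplified, and the sample versions are made up for illustration.

```python
def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return the requirement lines that are not pinned to an exact version.

    Hypothetical helper with simplified parsing: comment lines, blank lines,
    and pip options (lines starting with "-") are ignored.
    """
    unpinned = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split(" #", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith(("#", "-")):
            continue
        # Ignore environment markers such as '; python_version == "3.11"'.
        requirement = line.split(";", 1)[0]
        if "==" not in requirement:
            unpinned.append(line)
    return unpinned


# Made-up sample content; real files list the versions from your environment.
sample = """\
mlflow==3.1.0
scikit-learn==1.5.0
numpy
"""
print(unpinned_requirements(sample))  # → ['numpy']
```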