# Managed MLflow and Databricks Machine Learning

Thursday, May 15, 2025

[Invitation on Luma](https://lu.ma/fr1zsk86), [LinkedIn](https://www.linkedin.com/groups/9307761/), [Meetup](https://www.meetup.com/warsaw-data-engineering/events/307771412/)


# 📚 Agenda

1. MLflow Client API
1. Demo: Train and Register Model
1. Demo: Serve Model
1. Demo: Running MLflow's `examples/databricks` (and Editable Install in Python)
1. Demo: MLflow's `dev/pyproject.py` and uv

⏰ Total meetup duration: **1h 15min**


# LinkedIn Poll

[Will you attend the in-person meetup?](https://www.linkedin.com/feed/update/urn:li:groupPost:9307761-7327398491633156096/)


# Event Question

[What would you like to hear about at the meetup? Pitch an interesting idea for future editions](https://www.meetup.com/warsaw-data-engineering/events/307771412/attendees/) 🙏



# 📢 News

Things worth watching out for...


## 🎉 New members joined Warsaw Data Engineering

[You now have 594 members!](https://www.meetup.com/warsaw-data-engineering/)

What interested you in Warsaw Data Engineering Meetup that made you decide to join?

1. Data Engineering
1. Interest in the topic
1. The group's subject matter


## New Versions

What has changed in the tooling space since we last met? These releases are worth scanning for features to learn more about.

* [Databricks CLI 0.252.0](https://github.com/databricks/cli/releases/tag/v0.252.0)
* [MLflow 3.0.0rc2](https://github.com/mlflow/mlflow/releases/tag/v3.0.0rc2)
* [uv 0.7.3](https://github.com/astral-sh/uv/releases/tag/0.7.3)
* [PydanticAI 0.2.4](https://github.com/pydantic/pydantic-ai/releases/tag/v0.2.4)
* [ZenML 0.82.1](https://github.com/zenml-io/zenml/releases/tag/0.82.1)


# MLflow Client API

## Installation


MLflow's [Using the MLflow Client API](https://mlflow.org/docs/latest/getting-started/logging-first-model/step2-mlflow-client/)

Run `%pip install mlflow-skinny[databricks]` or use [Databricks Runtime for Machine Learning](https://docs.databricks.com/aws/en/machine-learning/databricks-runtime-ml) (**Databricks Runtime ML**): compute with pre-built machine learning and deep learning infrastructure, including the most common ML and DL libraries.

[MLflow-Databricks Runtime compatibility matrix](https://docs.databricks.com/aws/en/release-notes/runtime/#mlflow-databricks-runtime-compatibility-matrix)

In [0]:
%pip install mlflow-skinny[databricks]

In [0]:
%restart_python

In [0]:
import mlflow

In [0]:
assert mlflow.VERSION == "2.22.0"


```
>>> mlflow.set_
mlflow.set_experiment(                             mlflow.set_registry_uri(                           mlflow.set_tag(
mlflow.set_experiment_tag(                         mlflow.set_system_metrics_node_id(                 mlflow.set_tags(
mlflow.set_experiment_tags(                        mlflow.set_system_metrics_samples_before_logging(  mlflow.set_tracking_uri(
mlflow.set_prompt_alias(                           mlflow.set_system_metrics_sampling_interval(
```


From [Configure MLflow client to access models in Unity Catalog](https://docs.databricks.com/aws/en/machine-learning/manage-model-lifecycle/#configure-mlflow-client-to-access-models-in-unity-catalog):

1. IF your workspace's default catalog is in Unity Catalog (rather than `hive_metastore`)
1. AND you are running a cluster using Databricks Runtime 13.3 LTS or above
1. THEN models are automatically created in and loaded from the default catalog.

Otherwise, the MLflow Python client creates models in **Databricks Model Registry** (a Databricks workspace's model registry).
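
The IF/AND/THEN rule above can be read as a small decision function. The sketch below only illustrates that rule and is not how MLflow implements it; `resolve_model_registry` and its parameters are hypothetical names, and the runtime version is assumed to be passed as a `(major, minor)` tuple.

```python
def resolve_model_registry(default_catalog: str, runtime_version: tuple[int, int]) -> str:
    """Return which registry the MLflow client targets by default (illustrative only)."""
    uses_unity_catalog = default_catalog != "hive_metastore"
    runtime_is_recent = runtime_version >= (13, 3)
    if uses_unity_catalog and runtime_is_recent:
        return "Unity Catalog"
    return "Workspace Model Registry"


# Catalog names and runtime versions below are made-up inputs.
print(resolve_model_registry("main", (15, 4)))
print(resolve_model_registry("hive_metastore", (15, 4)))
print(resolve_model_registry("main", (12, 2)))
```

Only the first call satisfies both conditions, so only it resolves to Unity Catalog.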


![Registered Models](./databricks_ml_registered_models.png)

👉 [Registered Models](https://curriculum-dev.cloud.databricks.com/ml/models?o=3551974319838082)


## set_registry_uri


1. Sets the registry server URI.
1. Especially useful if you have a registry server that's different from the tracking server.
1. [mlflow/tracking/_model_registry/utils.py](https://github.com/mlflow/mlflow/blob/v2.22.0/mlflow/tracking/_model_registry/utils.py#L42-L52)
1. [MLflow Model Registry](https://mlflow.org/docs/latest/model-registry/)

In [0]:
import mlflow
assert mlflow.get_registry_uri() == "databricks-uc"

In [0]:
help(mlflow.set_registry_uri)

## 🧑‍💻 Demo: Train and Register Model


* [mlflow/tracking/_model_registry/fluent.py](https://github.com/mlflow/mlflow/blob/v2.22.0/mlflow/tracking/_model_registry/fluent.py#L20-L49)
    * scikit-learn's `RandomForestRegressor`
* [Register a model to Unity Catalog using autologging](https://docs.databricks.com/aws/en/machine-learning/manage-model-lifecycle/#register-a-model-to-unity-catalog-using-autologging)
    * scikit-learn's `RandomForestClassifier`
* scikit-learn's [1.11.2. Random forests and other randomized tree ensembles](https://scikit-learn.org/stable/modules/ensemble.html#forest)
* Databricks Machine Learning's [Example notebook](https://docs.databricks.com/aws/en/machine-learning/manage-model-lifecycle/#example-notebook)

In [0]:
# Generate a random regression problem.
from sklearn.datasets import make_regression

X, y = make_regression(n_features=4, n_informative=2, random_state=0, shuffle=False)

In [0]:
from sklearn.ensemble import RandomForestRegressor

params = {"n_estimators": 3, "random_state": 42}
rfr = RandomForestRegressor(**params).fit(X, y)
rfr

In [0]:
rfr.predict(X)


Up to this cell, there was nothing MLflow-specific. It was all scikit-learn-specific.


Review [Registered Models](https://curriculum-dev.cloud.databricks.com/ml/models?o=3551974319838082) (Owned by me)

In [0]:
%sql

CREATE SCHEMA IF NOT EXISTS jacek_laskowski.mlflow


👉 [jacek_laskowski.mlflow](https://curriculum-dev.cloud.databricks.com/explore/data/jacek_laskowski/mlflow)
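
The schema is the middle part of the three-level Unity Catalog name (`catalog.schema.model`) that the registered model uses in this demo. As a minimal sketch, the helper below splits such a name and builds a `models:/` URI; `parse_uc_model_name` is a hypothetical helper, version `1` is an assumption, and names quoted with backticks that contain dots are not handled.

```python
from typing import NamedTuple


class UcModelName(NamedTuple):
    catalog: str
    schema: str
    model: str


def parse_uc_model_name(full_name: str) -> UcModelName:
    """Split a three-level Unity Catalog name into its parts."""
    parts = full_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected catalog.schema.model, got {full_name!r}")
    return UcModelName(*parts)


name = parse_uc_model_name("jacek_laskowski.mlflow.sklearn_model")
print(name.catalog, name.schema, name.model)

# A registered model version is commonly addressed with a models:/ URI
# (version 1 is an assumption here).
print(f"models:/{'.'.join(name)}/1")
```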

In [0]:
from mlflow.models import infer_signature

with mlflow.start_run() as run:
    rfr = RandomForestRegressor(**params).fit(X, y)
    signature = infer_signature(X, rfr.predict(X))
    mlflow.log_params(params)
    mlflow.sklearn.log_model(
        sk_model=rfr,
        artifact_path="sklearn-model",
        signature=signature,
        registered_model_name="jacek_laskowski.mlflow.sklearn_model",
    )

In [0]:
from mlflow.models import infer_signature

signature = infer_signature(X, rfr.predict(X))
signature

## 🧑‍💻 Demo: Serve Model


👉 [jacek_laskowski_demo](https://curriculum-dev.cloud.databricks.com/ml/endpoints/jacek_laskowski_demo?o=3551974319838082)


`databricks serving-endpoints list | grep jacek_laskowski`


Once up and running, get the query schema of the serving endpoint in OpenAPI format.

`databricks serving-endpoints get-open-api jacek_laskowski_demo`
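
With the schema at hand, a scoring request can be assembled. The sketch below only builds the request and does not send it. The workspace URL and token are placeholders, `build_invocation_request` is a hypothetical helper, and the `inputs` payload shape is an assumption that should be checked against the OpenAPI schema of the endpoint.

```python
import json
import urllib.request

# Placeholders (assumptions): replace with your workspace URL and a real token.
WORKSPACE_URL = "https://example.cloud.databricks.com"
ENDPOINT_NAME = "jacek_laskowski_demo"
TOKEN = "dapi-REDACTED"


def build_invocation_request(rows: list[list[float]]) -> urllib.request.Request:
    """Build (but do not send) a scoring request for the serving endpoint."""
    payload = {"inputs": rows}  # one inner list per record, four features each
    return urllib.request.Request(
        url=f"{WORKSPACE_URL}/serving-endpoints/{ENDPOINT_NAME}/invocations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_invocation_request([[0.1, 0.2, 0.3, 0.4]])
print(request.get_method(), request.full_url)
# Sending is left out on purpose: urllib.request.urlopen(request) would call the endpoint.
```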


# 🧑‍💻 Demo: Running MLflow's examples/databricks (and Editable Install in Python)

[examples/databricks](https://github.com/mlflow/mlflow/tree/master/examples/databricks)


## Step 0. Clone MLflow Repo

`git clone` https://github.com/mlflow/mlflow


## Step 1. Install Dependencies


```
uv pip install databricks-connect
uv pip install scikit-learn
```


## Step 2. Run Experiment


```
❯ python examples/databricks/dbconnect.py --cluster-id xxx
2025/05/08 17:51:04 INFO mlflow.tracking.fluent: Experiment with name '/Users/jacek@japila.pl/dbconnect' does not exist. Creating a new experiment.
🏃 View run smiling-ox-667 at: https://curriculum-dev.cloud.databricks.com/ml/experiments/1275781889574864/runs/b88fd8406e7d410bac8992258093ef5d
🧪 View experiment at: https://curriculum-dev.cloud.databricks.com/ml/experiments/1275781889574864
Traceback (most recent call last):
  File "/Users/jacek/oss/mlflow/examples/databricks/dbconnect.py", line 56, in <module>
    main()
    ~~~~^^
  File "/Users/jacek/oss/mlflow/examples/databricks/dbconnect.py", line 37, in main
    model_info = mlflow.sklearn.log_model(model, name="model", signature=signature)
TypeError: log_model() got an unexpected keyword argument 'name'
```
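
The `TypeError` suggests that the installed MLflow release does not know the `name` keyword that the example on `master` passes. One way to check such a mismatch up front is to inspect the function signature. This is a minimal sketch: `accepts_keyword` is a hypothetical helper, and the two `log_model_*` functions are stand-ins, not MLflow's real signatures.

```python
import inspect


def accepts_keyword(func, keyword: str) -> bool:
    """Return True if func can be called with the given keyword argument."""
    params = inspect.signature(func).parameters
    if keyword in params:
        return params[keyword].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for an older and a newer log_model flavor.
def log_model_old(sk_model, artifact_path, signature=None):
    return f"logged under {artifact_path}"


def log_model_new(sk_model, artifact_path=None, name=None, signature=None):
    return f"logged under {name or artifact_path}"


for func in (log_model_old, log_model_new):
    key = "name" if accepts_keyword(func, "name") else "artifact_path"
    print(func.__name__, "uses", key, "->", func(object(), **{key: "model"}))
```

Against a real installation, the same check could be pointed at `mlflow.sklearn.log_model`.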


## Step 3. Editable Install

[Development Mode (a.k.a. “Editable Installs”)](https://setuptools.pypa.io/en/latest/userguide/development_mode.html)


```
uv pip install -e .
```


```
❯ python examples/databricks/dbconnect.py --cluster-id xxx
🏃 View run carefree-duck-680 at: https://curriculum-dev.cloud.databricks.com/ml/experiments/1275781889574864/runs/89a774fb54da4a5c844764d3e40ad638
🧪 View experiment at: https://curriculum-dev.cloud.databricks.com/ml/experiments/1275781889574864
Traceback (most recent call last):
  File "/Users/jacek/oss/mlflow/examples/databricks/dbconnect.py", line 56, in <module>
    main()
    ~~~~^^
  File "/Users/jacek/oss/mlflow/examples/databricks/dbconnect.py", line 37, in main
    model_info = mlflow.sklearn.log_model(model, name="model", signature=signature)
  File "/Users/jacek/oss/mlflow/mlflow/sklearn/__init__.py", line 426, in log_model
    return Model.log(
           ~~~~~~~~~^
        artifact_path=artifact_path,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<18 lines>...
        model_id=model_id,
        ^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/jacek/oss/mlflow/mlflow/models/model.py", line 928, in log
    model = mlflow.initialize_logged_model(
        # TODO: Update model name
    ...<6 lines>...
        else None,
    )
  File "/Users/jacek/oss/mlflow/mlflow/tracking/fluent.py", line 2122, in initialize_logged_model
    model = _create_logged_model(
        name=name,
    ...<4 lines>...
        experiment_id=experiment_id,
    )
  File "/Users/jacek/oss/mlflow/mlflow/tracking/fluent.py", line 2232, in _create_logged_model
    return MlflowClient().create_logged_model(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        experiment_id=experiment_id,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<4 lines>...
        model_type=model_type,
        ^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/jacek/oss/mlflow/mlflow/tracking/client.py", line 5218, in create_logged_model
    return self._tracking_client.create_logged_model(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        experiment_id, name, source_run_id, tags, params, model_type
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/jacek/oss/mlflow/mlflow/tracking/_tracking_service/client.py", line 815, in create_logged_model
    return self.store.create_logged_model(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        experiment_id=experiment_id,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<8 lines>...
        model_type=model_type,
        ^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/jacek/oss/mlflow/mlflow/store/tracking/rest_store.py", line 904, in create_logged_model
    response_proto = self._call_endpoint(CreateLoggedModel, req_body)
  File "/Users/jacek/oss/mlflow/mlflow/store/tracking/rest_store.py", line 129, in _call_endpoint
    return call_endpoint(
        self.get_host_creds(),
    ...<4 lines>...
        retry_timeout_seconds=retry_timeout_seconds,
    )
  File "/Users/jacek/oss/mlflow/mlflow/utils/rest_utils.py", line 474, in call_endpoint
    response = verify_rest_response(response, endpoint)
  File "/Users/jacek/oss/mlflow/mlflow/utils/rest_utils.py", line 261, in verify_rest_response
    raise RestException(json.loads(response.text))
mlflow.exceptions.RestException: BAD_REQUEST: This API is not enabled.
```

## Step 4. BAD_REQUEST: This API is not enabled.

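Hunting down the root cause of the exception.


Modify `verify_rest_response` in `mlflow/utils/rest_utils.py` (around line 261) to print the `endpoint` of every REST call, then rerun the example: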


```
❯ python examples/databricks/dbconnect.py --cluster-id xxx
>>> endpoint /api/2.0/mlflow/experiments/get-by-name
>>> endpoint /api/2.0/mlflow/runs/create
>>> endpoint /api/2.0/mlflow/runs/get
>>> endpoint /api/2.0/mlflow/logged-models
>>> endpoint /api/2.0/mlflow/runs/get
🏃 View run zealous-worm-360 at: https://curriculum-dev.cloud.databricks.com/ml/experiments/1275781889574864/runs/8cad690b9bdd45ab96658987f4039180
🧪 View experiment at: https://curriculum-dev.cloud.databricks.com/ml/experiments/1275781889574864
>>> endpoint /api/2.0/mlflow/runs/update
Traceback (most recent call last):
  File "/Users/jacek/oss/mlflow/examples/databricks/dbconnect.py", line 56, in <module>
    main()
    ~~~~^^
...
  File "/Users/jacek/oss/mlflow/mlflow/utils/rest_utils.py", line 262, in verify_rest_response
    raise RestException(json.loads(response.text))
mlflow.exceptions.RestException: BAD_REQUEST: This API is not enabled.
```
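
An alternative to editing the file in place is to wrap the function with a tracing decorator. The sketch below shows the idea on a hypothetical stand-in that takes the same `(response, endpoint)` parameters seen in the traceback; `trace_endpoint` is an assumed name, and the stand-in body is not MLflow's code.

```python
import functools


def trace_endpoint(func):
    """Print the endpoint argument before delegating to the wrapped function."""
    @functools.wraps(func)
    def wrapper(response, endpoint):
        print(f">>> endpoint {endpoint}")
        return func(response, endpoint)
    return wrapper


# Hypothetical stand-in, not the real mlflow.utils.rest_utils.verify_rest_response.
@trace_endpoint
def verify_rest_response(response, endpoint):
    if response["status_code"] != 200:
        raise RuntimeError(f"{endpoint} failed: {response['text']}")
    return response


verify_rest_response({"status_code": 200, "text": "{}"}, "/api/2.0/mlflow/runs/get")
```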


## Step 5. MLflow API reference

[MLflow API reference](https://docs.databricks.com/aws/en/reference/mlflow-api)


### Experiments

[Experiments](https://docs.databricks.com/api/workspace/experiments)

1. **Experiments** are the primary unit of organization in MLflow.
1. All **MLflow runs** belong to an experiment.
1. Each experiment lets you visualize, search, and compare runs, as well as download run artifacts or metadata for analysis in other tools.
1. Experiments are maintained in a Databricks-hosted MLflow tracking server.
1. Experiments are located in the workspace file tree.
1. You manage experiments using the same tools you use to manage other workspace objects such as folders, notebooks, and libraries.

### Databricks CLI


```
❯ databricks | more
...
Machine Learning
  experiments                            Experiments are the primary unit of organization in MLflow; all MLflow runs belong to an experiment.
  model-registry                         Note: This API reference documents APIs for the Workspace Model Registry.
Real-time Serving
  serving-endpoints                      The Serving Endpoints API allows you to create, update, and delete model serving endpoints.
Unity Catalog
  model-versions                         Databricks provides a hosted version of MLflow Model Registry in Unity Catalog.
  registered-models                      Databricks provides a hosted version of MLflow Model Registry in Unity Catalog.
...
```


```
❯ databricks experiments list-experiments | jq '.[].name' | grep 'jacek@japila.pl'
"/Users/jacek@japila.pl/dbconnect"
"/Users/jacek@japila.pl/demo-experiment"
```
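
The same filtering can be done in Python when `jq` is not available. This is a sketch over sample data: the first two names come from the output above, the third entry is made up for contrast, and the sample assumes the CLI prints a JSON array of objects with a `name` field (as the `jq` filter implies).

```python
import json

# Sample shaped like the CLI's JSON output (assumption); the last entry is made up.
cli_output = """
[
  {"name": "/Users/jacek@japila.pl/dbconnect"},
  {"name": "/Users/jacek@japila.pl/demo-experiment"},
  {"name": "/Users/someone.else@example.com/other"}
]
"""


def experiment_names(raw_json: str, owner: str) -> list[str]:
    """Return experiment names that contain the owner's user name."""
    return [e["name"] for e in json.loads(raw_json) if owner in e["name"]]


for name in experiment_names(cli_output, "jacek@japila.pl"):
    print(name)
```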


```
databricks registered-models list | jq '.[].full_name'
```

# 🧑‍💻 Demo: MLflow's dev/pyproject.py and uv

1. What I learnt while reviewing the MLflow source code, where I found [dev/pyproject.py](https://github.com/mlflow/mlflow/blob/master/dev/pyproject.py) and tried to execute it locally.
1. And how uv helped.

Why does it even matter?! 🤨


## Step 0. Clone MLflow Repo

`git clone` https://github.com/mlflow/mlflow


## Step 1. uvx python dev/pyproject.py


```
❯ uvx python dev/pyproject.py
Traceback (most recent call last):
  File "/Users/jacek/oss/mlflow/./dev/pyproject.py", line 10, in <module>
    import toml
ModuleNotFoundError: No module named 'toml'
```
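
Instead of discovering missing imports one traceback at a time, the environment can be probed up front. This is a minimal sketch using only the standard library. `missing_modules` is a hypothetical helper, and the module list reflects the packages installed in this walkthrough (`yaml` being the import name of `pyyaml`).

```python
import importlib.util


def missing_modules(module_names: list[str]) -> list[str]:
    """Return the top-level modules that cannot be found in the current environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Modules the script turned out to need in this walkthrough.
print(missing_modules(["toml", "yaml", "packaging"]))
```

An empty list means all three modules are importable from the active interpreter.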


## Step 2. Set Up Dev Env


`uv venv .dev_pyproject_py_deep_dive`

`source .dev_pyproject_py_deep_dive/bin/activate`


## Step 3. Virtual Envs in Python

Please note that I'm a JVM dev (and only very recently switched to Python).


`uv pip install toml`

`python ./dev/pyproject.py`

`type python` shows that `python` now resolves to the virtual environment's `bin` directory, and it finally clicked how virtual envs work 🔥

[venv — Creation of virtual environments](https://docs.python.org/3/library/venv.html)
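
The same check can be done from inside Python. A minimal sketch: in a `venv`-style environment `sys.prefix` differs from `sys.base_prefix`, and `shutil.which` roughly mirrors what `type python` resolves (it ignores shell aliases and functions). `in_virtualenv` is a hypothetical helper name.

```python
import shutil
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix


print("interpreter:   ", sys.executable)
print("in virtualenv: ", in_virtualenv())
# First `python` found on PATH, or None if there is none.
print("python on PATH:", shutil.which("python"))
```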

## Step 4. It Works 🥳


`uv pip install pyyaml`

> ⚠️ NOTE
>
> All the dev deps are in [dev/requirements.txt](https://github.com/mlflow/mlflow/blob/master/dev/requirements.txt)

`uv pip install packaging`

`brew install taplo`

`python ./dev/pyproject.py` seems to change nothing, huh?! 🤨

💎 Think about what the script does and you will know why nothing seems to have changed 😉

# That's all Folks! 👋

![Warner Bros., Public domain, via Wikimedia Commons](https://upload.wikimedia.org/wikipedia/commons/e/ea/Thats_all_folks.svg)


# 🙋 Questions and Answers


# 💡 Ideas for Future Events

1. [Delta Live Tables](https://docs.databricks.com/en/delta-live-tables/index.html) with uv and pydantic
1. Explore more [Pydantic](https://docs.pydantic.dev/latest/) features
1. Create a new DAB template with `uv` as the project management tool (based on `default-python` template). Start from `databricks bundle init --help`.


## Databricks ML's Model training examples

Review [Model training examples](https://docs.databricks.com/aws/en/machine-learning/train-model/training-examples)


## Managed MLflow on Databricks

It all started with [Manage model lifecycle in Unity Catalog](https://docs.databricks.com/aws/en/machine-learning/manage-model-lifecycle/) and [Tutorials: Get started with AI and machine learning](https://docs.databricks.com/aws/en/machine-learning/ml-tutorials)


A Data Engineer's take on the matter:

> The key is to think about a **model training workload** as Python code and an **ML model** as a directory with a bunch of files.


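```py
import mlflow

# Start a run, grab a handle to it, and end it right away.
mlflow.start_run()

model_run = mlflow.active_run()

mlflow.end_run()

# The run object fetched before end_run() still holds its metadata.
print(model_run.info)
```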


## MLflow Prompt Registry

In [MLflow 2.21.0](https://github.com/mlflow/mlflow/releases/tag/v2.21.0):

>  **Prompt Registry**: MLflow Prompt Registry is a powerful tool that streamlines prompt engineering and management in your GenAI applications. It enables you to version, track, and reuse prompts across your organization.

[MLflow Prompt Registry](https://mlflow.org/docs/latest/prompts/)

## MLflow Tracing

In [MLflow 2.21.0](https://github.com/mlflow/mlflow/releases/tag/v2.21.0):

>  **Enhanced Tracing Capabilities**: MLflow Tracing now supports synchronous/asynchronous generators and auto-tracing for Async OpenAI, providing more flexible and comprehensive tracing options.

[MLflow Tracing for LLM Observability](https://mlflow.org/docs/latest/tracing/)


## Databricks Asset Bundles and Library Dependencies

[PyPI package](https://docs.databricks.com/aws/en/dev-tools/bundles/library-dependencies#pypi-package)

Databricks CLI v0.244.0: [Support all version identifiers as per PEP440 in environment deps](https://github.com/databricks/cli/releases/tag/v0.244.0)


## Databricks Asset Bundles and Set the target catalog and schema

Databricks CLI v0.243.0: [Use schema field for pipeline in builtin template](https://github.com/databricks/cli/releases/tag/v0.243.0):

> The schema field implies the lifecycle of tables is no longer tied to the lifecycle of the pipeline, as was the case with the target field.

[Set the target catalog and schema](https://docs.databricks.com/aws/en/dlt/target-schema)

## uv with PyTorch

uv 0.6.9: [Add experimental --torch-backend to the PyTorch guide](https://github.com/astral-sh/uv/releases/tag/0.6.9)

[Using uv with PyTorch](https://docs.astral.sh/uv/guides/integration/pytorch/)