MLflow Container Setup

This setup focuses on experiment and artifact tracking using MLflow.

Quick start

Requires poetry, docker and docker compose.

poetry install

Build the image, set environment variables, and start the containers (run within the docker folder):

cd docker && \
./build_image.sh \
--repository localhost/mlflow \
--tag latest && \
\
echo '#!/bin/bash

# mlflow settings
export MLFLOW_PORT=5000

export POSTGRES_DATA=$(pwd)/data/pgdata
export STORAGE_DATA=$(pwd)/data/storage

# db settings
export POSTGRES_USER=mlflow
export POSTGRES_PASSWORD=mlflow123

# (optional) mlflow s3 storage backend settings (e.g. can be minio)
# export MLFLOW_ARTIFACTS_DESTINATION=s3://yourbucketname/yourfolder
# export AWS_ACCESS_KEY_ID=youraccesskey
# export AWS_SECRET_ACCESS_KEY=yoursecretaccesskey
# export MLFLOW_S3_ENDPOINT_URL=https://minio.yourdomain.com
# export MLFLOW_S3_IGNORE_TLS=true' > .env.sh && \
\
source .env.sh && \
\
if [ ! -d "./data/pgdata" ] ; then mkdir -p $POSTGRES_DATA; fi && \
if [ ! -d "./data/storage" ] ; then mkdir -p $STORAGE_DATA; fi && \
\
docker compose up -d

Now check out http://localhost:5000.

Samples

Run sample tracking script

poetry run python samples/tracking.py
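
A minimal sketch of what such a tracking script does (the actual contents of samples/tracking.py may differ; the experiment name is hypothetical):

import random

import mlflow

# point the client at the tracking server started above
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("sample-experiment")  # hypothetical name

with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)  # log a hyperparameter
    for step in range(10):
        # log a metric series over training steps
        mlflow.log_metric("loss", random.random(), step=step)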

Run sample artifact script

poetry run python samples/artifacts.py
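
Again as a sketch (the real samples/artifacts.py may differ): the script logs a file to the artifact store via the tracking server.

import mlflow

mlflow.set_tracking_uri("http://localhost:5000")

with mlflow.start_run():
    # write a small local file and upload it as a run artifact
    with open("hello.txt", "w") as f:
        f.write("hello artifacts")
    mlflow.log_artifact("hello.txt")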

Navigate to http://localhost:5000 to see the MLflow UI and the tracked experiments.

Local Setup

Using plain Python and the mlflow server.

Basic

Using poetry. Runs and artifacts are stored in the mlruns and mlartifacts directories.

poetry install && \
poetry run mlflow server --host 0.0.0.0

Backends

Database

Using postgres as the backend store.

docker run -d --name ml-postgres -p 5432:5432 \
-e POSTGRES_USER=postgres \
-e POSTGRES_PASSWORD=postgres_password \
-e POSTGRES_DB=mlflow \
postgres:latest
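
As a quick sanity check that the database is up, a sketch using psycopg2 (connection values match the docker run command above):

import psycopg2

# connect to the postgres container started above
conn = psycopg2.connect(
    host="localhost",
    port=5432,
    user="postgres",
    password="postgres_password",
    dbname="mlflow",
)
print(conn.server_version)  # e.g. 160002 for postgres 16.x
conn.close()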

Run mlflow server with the postgres backend (only psycopg2 is supported):

poetry run mlflow server --backend-store-uri postgresql+psycopg2://postgres:postgres_password@localhost:5432/mlflow --host 0.0.0.0

Run sample tracking script

poetry run python samples/tracking.py

Artifacts Store

s3

Set S3 credentials and endpoint URL

echo '
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export MLFLOW_S3_ENDPOINT_URL=...
' > .env.sh

Start mlflow server with an s3 artifact store (via --default-artifact-root):

source .env.sh && \
poetry run mlflow server \
--backend-store-uri postgresql+psycopg2://postgres:postgres_password@localhost:5432/mlflow \
--default-artifact-root s3://my-bucket/mlflow/test \
--host 0.0.0.0

Run the sample script (the client requires s3 credentials):

source .env.sh && \
poetry run python samples/artifacts.py

Proxied s3 backend for artifacts (clients do not need the s3 credentials):

source .env.sh && \
poetry run mlflow server \
--backend-store-uri postgresql+psycopg2://postgres:postgres_password@localhost:5432/mlflow \
--artifacts-destination s3://my-bucket/mlflow/test \
--host 0.0.0.0

Run the sample script (the client does not need s3 credentials):

poetry run python samples/artifacts.py
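
The difference to the non-proxied setup above: with --artifacts-destination the server itself uploads to s3, so the client only needs the tracking URI. A sketch (the file name is illustrative):

import mlflow

# no AWS_* variables needed on the client; uploads are proxied
# through the tracking server
mlflow.set_tracking_uri("http://localhost:5000")

with mlflow.start_run():
    mlflow.log_artifact("hello.txt")  # assumes this file exists locally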

azure blob storage

Set azure credentials and endpoint URL (see the MLflow artifact store and Azure Blob Storage documentation for more info).

echo "
export AZURE_STORAGE_CONNECTION_STRING='AccountName=<YOUR_ACCOUNT_NAME>;AccountKey=<YOUR_KEY>;EndpointSuffix=core.windows.net;DefaultEndpointsProtocol=https;'
export AZURE_STORAGE_ACCESS_KEY='<YOUR_KEY>'
" > .env_azure.sh

Proxied azure blob storage backend for artifacts (clients do not need the azure credentials):

source .env_azure.sh && \
poetry run mlflow server \
--backend-store-uri postgresql+psycopg2://postgres:postgres_password@localhost:5432/mlflow \
--artifacts-destination wasbs://my-container@my-storage-account.blob.core.windows.net/my-folder \
--host 0.0.0.0

Run the sample script (the client does not need azure credentials):

poetry run python samples/artifacts.py

Metrics

Using prometheus as the metrics backend; --expose-prometheus serves the metrics at the server's /metrics endpoint and stores the exporter files in the given directory (./metrics).

source .env.sh && \
poetry run mlflow server \
--backend-store-uri postgresql+psycopg2://postgres:postgres_password@localhost:5432/mlflow \
--artifacts-destination s3://my-bucket/mlflow/test \
--expose-prometheus ./metrics \
--host 0.0.0.0
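
To verify the exporter, the metrics can be fetched from the /metrics endpoint; a minimal sketch:

import urllib.request

# fetch the prometheus metrics exposed by the mlflow server
with urllib.request.urlopen("http://localhost:5000/metrics") as resp:
    print(resp.read().decode()[:500])  # first few metric lines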

For running mlflow as a tracking service, it is highly recommended to use gunicorn with gevent workers (optimized for non-blocking IO). The following shows a corresponding Kubernetes ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: mlflow-additional-config
data:
  MLFLOW_HOST: "0.0.0.0"
  MLFLOW_PORT: "5000"
  MLFLOW_ADDITIONAL_OPTIONS: "--gunicorn-opts '--worker-class gevent --threads 4 --timeout 300 --keep-alive 300 --log-level INFO'"
