A framework for training machine learning models with ZenML and deploying them to Modal.
This project demonstrates an end-to-end ML workflow:
- Training ML models (scikit-learn and PyTorch)
- Registering them with ZenML's Model Control Plane
- Deploying them to Modal for scalable, serverless inference
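Under the hood, training is expressed as a ZenML pipeline whose output is registered with the Model Control Plane. As a rough, self-contained sketch (the project's actual pipelines live in `src/pipelines/`; step and model names here are illustrative, not the repository's):

```python
# Minimal sketch of a ZenML training pipeline that registers its output
# with the Model Control Plane. Names are illustrative, not this repo's.
from zenml import Model, pipeline, step
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression


@step
def train_sklearn_model() -> LogisticRegression:
    """Train a simple classifier on the iris dataset."""
    X, y = load_iris(return_X_y=True)
    model = LogisticRegression(max_iter=200)
    model.fit(X, y)
    return model


@pipeline(model=Model(name="iris_classifier"))
def training_pipeline() -> None:
    # The returned artifact is versioned and attached to the
    # "iris_classifier" model in ZenML's Model Control Plane.
    train_sklearn_model()


if __name__ == "__main__":
    training_pipeline()
```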
To follow along you'll need:

- Python 3.12+ (recommended)
- A Modal account with the CLI set up
- A ZenML server (if using a remote registry)
- Docker (for local development)
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd modal-deployment
  ```
- Install dependencies:

  ```bash
  # assuming you have uv installed
  uv pip install -r pyproject.toml
  ```
- Set up the Modal CLI:

  ```bash
  modal token new
  ```
- Set up Modal environments:

  ```bash
  modal environment create staging
  modal environment create production
  ```
- Set up Modal secrets for ZenML access:

  ```bash
  # Set your ZenML server details as variables
  ZENML_URL="<your-zenml-server-url>"
  ZENML_API_KEY="<your-zenml-api-key>"

  # Create the secret in the staging environment
  modal secret create modal-deployment-credentials \
    ZENML_STORE_URL=$ZENML_URL \
    ZENML_STORE_API_KEY=$ZENML_API_KEY \
    -e staging

  # Create the secret in the production environment
  modal secret create modal-deployment-credentials \
    ZENML_STORE_URL=$ZENML_URL \
    ZENML_STORE_API_KEY=$ZENML_API_KEY \
    -e production
  ```
- Set up Modal volumes (a sketch of how the deployed app consumes the secret and volume follows these steps):

  ```bash
  # Create the staging volume
  modal volume create iris-staging-models -e staging

  # Create the production volume
  modal volume create iris-prod-models -e production
  ```
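For a sense of how these pieces fit together at runtime, here is a minimal sketch of a Modal function consuming the secret and volume created above, assuming a recent Modal SDK. The real implementation lives in `app/deployment_template.py`; the app name and mount path here are made up:

```python
# Sketch of a Modal function using the secret and volume created above.
# Function body and mount path are illustrative only.
import os

import modal

app = modal.App("iris-predictor-sketch")

# Attach the ZenML credentials secret and the staging model volume.
zenml_secret = modal.Secret.from_name("modal-deployment-credentials")
model_volume = modal.Volume.from_name("iris-staging-models")


@app.function(secrets=[zenml_secret], volumes={"/models": model_volume})
def show_config() -> None:
    # The secret's keys are exposed as environment variables inside Modal.
    print("ZenML server:", os.environ["ZENML_STORE_URL"])
    # Files in the volume appear under the mount path.
    print("Model files:", os.listdir("/models"))
```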
The repository is laid out as follows:

- `run.py`: Entry point for the training and deployment pipeline
- `app/`: Modal deployment application code
  - `deployment_template.py`: The main Modal app implementation with FastAPI integration
  - `schemas.py`: Iris model, prediction, and API endpoint schemas for Modal deployment
- `src/`: Core source code
  - `configs/`: Environment-specific configuration files (staging/production)
  - `pipelines/`: ZenML pipeline definitions
  - `steps/`: ZenML step implementations for training and deployment
  - `schemas/`: Iris model and prediction schemas for training
- `scripts/`: Utility scripts
  - `format.sh`: Code formatting script
  - `shutdown.sh`: Script to stop deployments in staging and production
This project uses YAML anchor keys for efficient configuration management across environments:

```yaml
# In common.yaml, we define shared configuration with an anchor
&COMMON
modal:
  secret_name: "modal-deployment-credentials"
# ... more common configuration
```

```yaml
# In environment-specific configs, we merge the common config
<<: *COMMON
# ... environment-specific overrides and additions
```
The `&COMMON` anchor in `common.yaml` defines shared settings, while `<<: *COMMON` in other config files merges these settings before adding environment-specific configurations. This approach maintains consistent base settings while allowing per-environment customization of parameters like volume names and deployment stages.
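To see the merge behavior in isolation, here is a small PyYAML demo. The inline YAML is illustrative, not the project's actual config; note that plain YAML only resolves anchors within a single document, so the cross-file merging described above relies on the project's config loading:

```python
# Demo of YAML anchors and merge keys using PyYAML. Config values are
# illustrative; in the project the merge happens across config files.
import yaml

DOC = """
common: &COMMON
  modal:
    secret_name: "modal-deployment-credentials"
  deployment_stage: "staging"

production:
  <<: *COMMON
  deployment_stage: "production"   # override one shared value
  volume_name: "iris-prod-models"  # add an environment-specific value
"""

config = yaml.safe_load(DOC)
print(config["production"]["modal"]["secret_name"])  # modal-deployment-credentials
print(config["production"]["deployment_stage"])      # production
```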
To run the pipeline for training models and/or deploying them:

```bash
# Train models only
python run.py --train

# Deploy to Modal (staging environment by default)
python run.py --deploy

# Train models and deploy to Modal (staging environment)
python run.py --train --deploy

# Deploy to production environment
python run.py --deploy -e production

# Train and deploy to production environment
python run.py --train --deploy -e production
```
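For orientation, the flags above map onto an entry point that plausibly looks something like the following argparse sketch (hypothetical; the actual `run.py` may be wired differently):

```python
# Illustrative sketch of run.py's CLI. Flag names match the commands above,
# but the pipeline wiring indicated in comments is hypothetical.
import argparse


def main() -> None:
    parser = argparse.ArgumentParser(description="Train and/or deploy iris models")
    parser.add_argument("--train", action="store_true", help="Run the training pipeline")
    parser.add_argument("--deploy", action="store_true", help="Deploy models to Modal")
    parser.add_argument("-e", "--environment", default="staging",
                        choices=["staging", "production"],
                        help="Target Modal environment")
    args = parser.parse_args()

    if args.train:
        # e.g. run the ZenML training pipeline configured by
        # src/configs/train_<environment>.yaml
        ...
    if args.deploy:
        # e.g. run the deployment pipeline configured by
        # src/configs/deploy_<environment>.yaml
        ...


if __name__ == "__main__":
    main()
```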
Once deployed, each model service (sklearn or pytorch) exposes the following endpoints:

- `GET /`: Welcome message with deployment/model info
- `GET /health`: Health check endpoint
- `GET /url`: Returns the deployment URL
- `POST /predict`: Make predictions using the model
`POST /predict` expects a JSON body with the four iris measurements:

```json
{
  "sepal_length": 5.1,
  "sepal_width": 3.5,
  "petal_length": 1.4,
  "petal_width": 0.2
}
```
The response includes the predicted class index and probabilities:

```json
{
  "prediction": 0,
  "prediction_probabilities": [0.97, 0.02, 0.01],
  "species_name": "setosa"
}
```
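These payloads correspond to the request/response schemas in `app/schemas.py`. A minimal Pydantic sketch using the field names from the JSON above (class names are hypothetical):

```python
# Hypothetical Pydantic schemas matching the JSON payloads above. The real
# definitions live in app/schemas.py and may differ in naming and detail.
from pydantic import BaseModel


class IrisFeatures(BaseModel):
    """Request body for POST /predict."""
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float


class IrisPrediction(BaseModel):
    """Response body for POST /predict."""
    prediction: int
    prediction_probabilities: list[float]
    species_name: str
```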
Here are sample curl commands to interact with the deployed endpoints. You can find the URL of your deployment in the Modal dashboard as well as in the ZenML dashboard; it is also printed to the terminal when you deploy the model. (Note that there are two URLs, one for the PyTorch deployment and one for the scikit-learn deployment.)

First, set your deployment URL as a variable (including the `https://` prefix):

```bash
export MODAL_URL="https://your-modal-deployment-url"  # Replace with your actual URL
# For example: export MODAL_URL="https://someuser-staging--pytorch-iris-predictor-staging.modal.run"

# Health check
curl -X GET $MODAL_URL/health

# Prediction request
curl -X POST $MODAL_URL/predict \
  -H "Content-Type: application/json" \
  -d '{"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}'
```
Response:

```json
{
  "prediction": 0,
  "prediction_probabilities": [0.97, 0.02, 0.01],
  "species_name": "setosa"
}
```
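The same prediction call from Python, as a minimal sketch using the `requests` library (the URL is a placeholder):

```python
# Minimal Python client for the deployed /predict endpoint, assuming the
# requests library is installed. Replace the URL with your deployment's URL.
import requests

MODAL_URL = "https://your-modal-deployment-url"  # placeholder

payload = {
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2,
}

resp = requests.post(f"{MODAL_URL}/predict", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())  # e.g. {"prediction": 0, ..., "species_name": "setosa"}
```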
The project uses environment-specific configuration files located in `src/configs/`:

- `train_staging.yaml`: Configuration for training in the staging environment
- `train_production.yaml`: Configuration for training in the production environment
- `deploy_staging.yaml`: Configuration for deployment in the staging environment
- `deploy_production.yaml`: Configuration for deployment in the production environment
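As an illustration of how a config file might be applied, ZenML pipelines accept a config path via `with_options`; the import path below is hypothetical:

```python
# Illustrative: applying an environment-specific config to a ZenML pipeline.
# `training_pipeline` stands in for the real pipeline in src/pipelines/.
from src.pipelines import training_pipeline  # hypothetical import path

configured = training_pipeline.with_options(
    config_path="src/configs/train_staging.yaml"
)
configured()  # run with the staging configuration
```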
These configuration files control various aspects of the pipelines and deployments. You can modify these files to customize behavior without changing code.
The system integrates with ZenML's model registry and supports environment-specific deployments to Modal.
The deployment takes advantage of Modal features such as:
- Secret management for ZenML credentials
- Python package caching for fast deployments
- Serverless scaling based on demand
- Volume mounts for model storage
To stop all deployments in both staging and production environments:

```bash
./scripts/shutdown.sh
```
This is particularly useful during development or when you need to clean up resources.
- Missing ZenML credentials: Ensure the Modal secret is correctly set up
- Model loading errors: Check the ZenML model registry or the `/health` endpoint
- Deployment failures: Check the logs in the Modal dashboard
- Invalid function call error: Ensure you're using the correct URL format for your deployment