A framework for training machine learning models with ZenML and deploying them to Modal.
This project demonstrates an end-to-end ML workflow:
- Training ML models (scikit-learn and PyTorch)
- Registering them with ZenML's Model Control Plane
- Deploying them to Modal for scalable, serverless inference
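Under the hood, training is expressed as a ZenML pipeline whose output is registered with the Model Control Plane. As a rough, self-contained sketch (the project's actual pipelines live in `src/pipelines/`; step and model names here are illustrative, not the repository's):

```python
# Minimal sketch of a ZenML training pipeline that registers its output
# with the Model Control Plane. Names are illustrative, not this repo's.
from zenml import Model, pipeline, step
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression


@step
def train_sklearn_model() -> LogisticRegression:
    """Train a simple classifier on the iris dataset."""
    X, y = load_iris(return_X_y=True)
    model = LogisticRegression(max_iter=200)
    model.fit(X, y)
    return model


@pipeline(model=Model(name="iris_classifier"))
def training_pipeline() -> None:
    # The returned artifact is versioned and attached to the
    # "iris_classifier" model in ZenML's Model Control Plane.
    train_sklearn_model()


if __name__ == "__main__":
    training_pipeline()
```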
To follow along you'll need:

- Python 3.12+ (recommended)
- A Modal account with the CLI set up
- A ZenML server (if using a remote registry)
- Docker (for local development)
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd modal-deployment
  ```
- Install dependencies:

  ```bash
  # assuming you have uv installed
  uv pip install -r pyproject.toml
  ```
- Set up the Modal CLI:

  ```bash
  modal token new
  ```
- Set up Modal environments:

  ```bash
  modal environment create staging
  modal environment create production
  ```
- Set up Modal secrets for ZenML access:

  ```bash
  # Set your ZenML server details as variables
  ZENML_URL="<your-zenml-server-url>"
  ZENML_API_KEY="<your-zenml-api-key>"

  # Create the secret in the staging environment
  modal secret create modal-deployment-credentials \
    ZENML_STORE_URL=$ZENML_URL \
    ZENML_STORE_API_KEY=$ZENML_API_KEY \
    -e staging

  # Create the secret in the production environment
  modal secret create modal-deployment-credentials \
    ZENML_STORE_URL=$ZENML_URL \
    ZENML_STORE_API_KEY=$ZENML_API_KEY \
    -e production
  ```
- Set up Modal volumes (a sketch of how the deployed app consumes the secret and volume follows these steps):

  ```bash
  # Create the staging volume
  modal volume create iris-staging-models -e staging

  # Create the production volume
  modal volume create iris-prod-models -e production
  ```
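For a sense of how these pieces fit together at runtime, here is a minimal sketch of a Modal function consuming the secret and volume created above, assuming a recent Modal SDK. The real implementation lives in `app/deployment_template.py`; the app name and mount path here are made up:

```python
# Sketch of a Modal function using the secret and volume created above.
# Function body and mount path are illustrative only.
import os

import modal

app = modal.App("iris-predictor-sketch")

# Attach the ZenML credentials secret and the staging model volume.
zenml_secret = modal.Secret.from_name("modal-deployment-credentials")
model_volume = modal.Volume.from_name("iris-staging-models")


@app.function(secrets=[zenml_secret], volumes={"/models": model_volume})
def show_config() -> None:
    # The secret's keys are exposed as environment variables inside Modal.
    print("ZenML server:", os.environ["ZENML_STORE_URL"])
    # Files in the volume appear under the mount path.
    print("Model files:", os.listdir("/models"))
```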
The repository is laid out as follows:

- `run.py`: Entry point for the training and deployment pipeline
- `app/`: Modal deployment application code
  - `deployment_template.py`: The main Modal app implementation with FastAPI integration
  - `schemas.py`: Iris model, prediction, and API endpoint schemas for Modal deployment
- `src/`: Core source code
  - `configs/`: Environment-specific configuration files (staging/production)
  - `pipelines/`: ZenML pipeline definitions
  - `steps/`: ZenML step implementations for training and deployment
  - `schemas/`: Iris model and prediction schemas for training
- `scripts/`: Utility scripts
  - `format.sh`: Code formatting script
  - `shutdown.sh`: Script to stop deployments in staging and production
This project uses YAML anchor keys for efficient configuration management across environments:

```yaml
# In common.yaml, we define shared configuration with an anchor
&COMMON
modal:
  secret_name: "modal-deployment-credentials"
# ... more common configuration
```

```yaml
# In environment-specific configs, we merge the common config
<<: *COMMON
# ... environment-specific overrides and additions
```
The `&COMMON` anchor in `common.yaml` defines shared settings, while `<<: *COMMON` in other config files merges these settings before adding environment-specific configurations. This approach maintains consistent base settings while allowing per-environment customization of parameters like volume names and deployment stages.
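To see the merge behavior in isolation, here is a small PyYAML demo. The inline YAML is illustrative, not the project's actual config; note that plain YAML only resolves anchors within a single document, so the cross-file merging described above relies on the project's config loading:

```python
# Demo of YAML anchors and merge keys using PyYAML. Config values are
# illustrative; in the project the merge happens across config files.
import yaml

DOC = """
common: &COMMON
  modal:
    secret_name: "modal-deployment-credentials"
  deployment_stage: "staging"

production:
  <<: *COMMON
  deployment_stage: "production"   # override one shared value
  volume_name: "iris-prod-models"  # add an environment-specific value
"""

config = yaml.safe_load(DOC)
print(config["production"]["modal"]["secret_name"])  # modal-deployment-credentials
print(config["production"]["deployment_stage"])      # production
```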
To run the pipeline for training models and/or deploying them:

```bash
# Train models only
python run.py --train

# Deploy to Modal (staging environment by default)
python run.py --deploy

# Train models and deploy to Modal (staging environment)
python run.py --train --deploy

# Deploy to production environment
python run.py --deploy -e production

# Train and deploy to production environment
python run.py --train --deploy -e production
```
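For orientation, the flags above map onto an entry point that plausibly looks something like the following argparse sketch (hypothetical; the actual `run.py` may be wired differently):

```python
# Illustrative sketch of run.py's CLI. Flag names match the commands above,
# but the pipeline wiring indicated in comments is hypothetical.
import argparse


def main() -> None:
    parser = argparse.ArgumentParser(description="Train and/or deploy iris models")
    parser.add_argument("--train", action="store_true", help="Run the training pipeline")
    parser.add_argument("--deploy", action="store_true", help="Deploy models to Modal")
    parser.add_argument("-e", "--environment", default="staging",
                        choices=["staging", "production"],
                        help="Target Modal environment")
    args = parser.parse_args()

    if args.train:
        # e.g. run the ZenML training pipeline configured by
        # src/configs/train_<environment>.yaml
        ...
    if args.deploy:
        # e.g. run the deployment pipeline configured by
        # src/configs/deploy_<environment>.yaml
        ...


if __name__ == "__main__":
    main()
```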
Once deployed, each model service (sklearn or pytorch) exposes the following endpoints:

- `GET /`: Welcome message with deployment/model info
- `GET /health`: Health check endpoint
- `GET /url`: Returns the deployment URL
- `POST /predict`: Make predictions using the model
`POST /predict` expects a JSON body with the four iris measurements:

```json
{
  "sepal_length": 5.1,
  "sepal_width": 3.5,
  "petal_length": 1.4,
  "petal_width": 0.2
}
```
The response includes the predicted class index and probabilities:

```json
{
  "prediction": 0,
  "prediction_probabilities": [0.97, 0.02, 0.01],
  "species_name": "setosa"
}
```
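These payloads correspond to the request/response schemas in `app/schemas.py`. A minimal Pydantic sketch using the field names from the JSON above (class names are hypothetical):

```python
# Hypothetical Pydantic schemas matching the JSON payloads above. The real
# definitions live in app/schemas.py and may differ in naming and detail.
from pydantic import BaseModel


class IrisFeatures(BaseModel):
    """Request body for POST /predict."""
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float


class IrisPrediction(BaseModel):
    """Response body for POST /predict."""
    prediction: int
    prediction_probabilities: list[float]
    species_name: str
```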
Here are sample curl commands to interact with the deployed endpoints. You can find the URL of your deployment in the Modal dashboard as well as in the ZenML dashboard; it is also printed to the terminal when you deploy the model. (Note that there are two URLs, one for the PyTorch deployment and one for the scikit-learn deployment.)

First, set your deployment URL as a variable (including the `https://` prefix):

```bash
export MODAL_URL="https://your-modal-deployment-url"  # Replace with your actual URL
# For example: export MODAL_URL="https://someuser-staging--pytorch-iris-predictor-staging.modal.run"

# Health check
curl -X GET $MODAL_URL/health

# Prediction request
curl -X POST $MODAL_URL/predict \
  -H "Content-Type: application/json" \
  -d '{"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}'
```
Response:

```json
{
  "prediction": 0,
  "prediction_probabilities": [0.97, 0.02, 0.01],
  "species_name": "setosa"
}
```
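The same prediction call from Python, as a minimal sketch using the `requests` library (the URL is a placeholder):

```python
# Minimal Python client for the deployed /predict endpoint, assuming the
# requests library is installed. Replace the URL with your deployment's URL.
import requests

MODAL_URL = "https://your-modal-deployment-url"  # placeholder

payload = {
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2,
}

resp = requests.post(f"{MODAL_URL}/predict", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())  # e.g. {"prediction": 0, ..., "species_name": "setosa"}
```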
The project uses environment-specific configuration files located in `src/configs/`:

- `train_staging.yaml`: Configuration for training in the staging environment
- `train_production.yaml`: Configuration for training in the production environment
- `deploy_staging.yaml`: Configuration for deployment in the staging environment
- `deploy_production.yaml`: Configuration for deployment in the production environment
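As an illustration of how a config file might be applied, ZenML pipelines accept a config path via `with_options`; the import path below is hypothetical:

```python
# Illustrative: applying an environment-specific config to a ZenML pipeline.
# `training_pipeline` stands in for the real pipeline in src/pipelines/.
from src.pipelines import training_pipeline  # hypothetical import path

configured = training_pipeline.with_options(
    config_path="src/configs/train_staging.yaml"
)
configured()  # run with the staging configuration
```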
These configuration files control various aspects of the pipelines and deployments. You can modify these files to customize behavior without changing code.
The system integrates with ZenML's model registry and supports environment-specific deployments to Modal.
The deployment takes advantage of Modal features such as:
- Secret management for ZenML credentials
- Python package caching for fast deployments
- Serverless scaling based on demand
- Volume mounts for model storage
To stop all deployments in both staging and production environments:

```bash
./scripts/shutdown.sh
```
This is particularly useful during development or when you need to clean up resources.
- Missing ZenML credentials: Ensure the Modal secret is correctly set up
- Model loading errors: Check the ZenML model registry or the `/health` endpoint
- Deployment failures: Check the logs in the Modal dashboard
- Invalid function call error: Ensure you're using the correct URL format for your deployment