- Watch the main flow demo video here
- Watch the docker-compose demo video here
- Watch the logging and monitoring demo video here
This project implements an end-to-end machine learning pipeline to predict wine quality (dataset source), using Metaflow and MLflow. The pipeline includes the following steps: data loading, exploratory data analysis (EDA), model training, hyperparameter optimization, and model comparison.
- Load wine quality dataset from MLflow's data source (view here)
- Version control using timestamps
- Log metadata and data statistics to MLflow
- Supports both local and remote (internet) data sources
- Generate detailed dataset statistics
- Create interactive charts using VegaChart:
- Wine quality distribution
- Feature-quality relationships
- Correlation matrix
- All charts are shown in Metaflow cards
- Split data into training and testing sets (80/20)
- Save split data for parallel model training
- Optional hyperparameter tuning with Optuna
- Track training time for each model
- Log all metrics and parameters to MLflow
- Display detailed results in Metaflow cards
- Compare performance of all models
- Analyze accuracy and training time
- Automatically select best model based on accuracy
- Visual comparison chart with color-coded training time
- Log best model to MLflow with signature and input example
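For orientation, here is a minimal Metaflow skeleton of this structure (one foreach branch per model, joined by a comparison step). The class, step, and artifact names are illustrative and do not come from the project's actual main.py:

```python
from metaflow import FlowSpec, step


class WineQualityFlow(FlowSpec):
    """Illustrative skeleton only; the real flow lives in main.py."""

    @step
    def start(self):
        # Load and version the dataset here, then fan out one branch per model.
        self.model_names = ["random_forest", "gradient_boosting", "logistic_regression"]
        self.next(self.train_model, foreach="model_names")

    @step
    def train_model(self):
        # One parallel branch per model; metrics and params go to MLflow here.
        self.model_name = self.input
        self.accuracy = 0.0  # placeholder: real code trains and evaluates the model
        self.next(self.compare_models)

    @step
    def compare_models(self, inputs):
        # Join the parallel branches and keep the most accurate model.
        best = max(inputs, key=lambda branch: branch.accuracy)
        self.best_model_name = best.model_name
        self.next(self.end)

    @step
    def end(self):
        print(f"Best model: {self.best_model_name}")


if __name__ == "__main__":
    WineQualityFlow()
```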
- Automatic Hyperparameter Optimization: Uses Optuna for model tuning
- Parallel Model Training: Improves efficiency by training multiple models concurrently
- Performance Tracking: Measures both accuracy and training time
- Interactive Charts: Intuitive visual insights via Metaflow cards
- Reusable Pipeline: Data and models are version-controlled clearly
- Workflow orchestration
- Parallel execution support
- Resource management and scalability
- Interactive chart support (cards)
- Experiment tracking
- Model management
- Data versioning
- Metric logging and visualization
- Automated model selection and hyperparameter tuning
- Supports Bayesian (TPE), Random Search, Annealing, etc.
- Multiple scikit-learn algorithms supported
- Seamless integration with training pipeline and experiment tracking
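As a rough illustration of how such tuning plugs into scikit-learn, the sketch below runs an Optuna study with the TPE (Bayesian) sampler. The dataset loader, search space, and trial count are stand-ins rather than the project's actual settings:

```python
import optuna
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in dataset; the project uses the wine quality dataset instead.
X, y = load_wine(return_X_y=True)


def objective(trial):
    # Illustrative search space for a RandomForest classifier.
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 300),
        "max_depth": trial.suggest_int("max_depth", 2, 16),
        "min_samples_split": trial.suggest_int("min_samples_split", 2, 10),
    }
    model = RandomForestClassifier(random_state=42, **params)
    return cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()


# TPE is Optuna's default (Bayesian) sampler; RandomSampler is also available.
study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=42))
study.optimize(objective, n_trials=20)
print(study.best_params, study.best_value)
```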
- Create a virtual environment:
python -m venv .venv
source .venv/bin/activate # For Linux/Mac
- Install dependencies:
pip install -r requirements.txt
- Start the MLflow server (in a separate terminal):
mlflow server --host 127.0.0.1 --port 5000
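If you also want to log runs to this server from your own scripts, point the MLflow client at it first. A minimal sketch (the experiment name is illustrative):

```python
import mlflow

# Point the MLflow client at the local tracking server started above.
mlflow.set_tracking_uri("http://127.0.0.1:5000")
mlflow.set_experiment("wine-quality")  # illustrative experiment name

with mlflow.start_run():
    mlflow.log_param("example_param", 1)
    mlflow.log_metric("example_metric", 0.5)
```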
python main.py run
python main.py run --use_hyperopt true
Note: Running with all models and tuning may take ~20 minutes.
python main.py run --data-dir /path/to/data
mlflow ui
Access at: http://localhost:5000
In the project directory, run the following in a new terminal:
python main.py card server
Access at: http://localhost:8324
In the project directory, run:
docker-compose up --build
- Service `mlflow-server`: MLflow Tracking Server at http://localhost:5000
- Service `train-pipeline`: automatically trains and registers the best model in the MLflow Model Registry
- Service `model-serving`: serves the best model via a REST API at http://localhost:5050/invocations
Once containers are running, send a prediction request to the model API:
curl -d '{"dataframe_split": {
"columns": ["fixed acidity","volatile acidity","citric acid","residual sugar","chlorides","free sulfur dioxide","total sulfur dioxide","density","pH","sulphates","alcohol"],
"data": [[7,0.27,0.36,20.7,0.045,45,170,1.001,3,0.45,8.8]]}}' \
-H 'Content-Type: application/json' -X POST localhost:5050/invocations
Example response:
{"predictions": [3]}
Press `Ctrl+C` in the running terminal or run:
docker-compose down
- Ensure Docker and Docker Compose are installed.
- First run may take time to build images and download data.
- To view logs for a specific service:
docker-compose logs <service-name>
- Ensure the MLflow server is running before executing the pipeline
- First-time runs will download the dataset
- Hyperparameter tuning may significantly increase run time
- Prometheus: Metrics collection and storage (http://localhost:9090)
- Grafana: Metrics visualization (http://localhost:3000)
- AlertManager: Alert handling (http://localhost:9093)
- Node Exporter: System metrics collection
- cAdvisor: Container metrics collection
- FastAPI Instrumentator: API metrics collection
- CPU usage
- Memory usage
- Disk I/O
- Network I/O
- Container metrics
- Request rate
- Latency
- Error rates
- Status codes
- Inference latency
- Prediction confidence scores
- Model error rates
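As a rough sketch of how such custom model metrics can be exposed alongside the FastAPI Instrumentator defaults, the snippet below registers Prometheus metrics in a FastAPI app. The metric names, endpoint, and placeholder model call are assumptions, not necessarily what the project's serving code uses:

```python
import time

from fastapi import FastAPI
from prometheus_client import Counter, Histogram
from prometheus_fastapi_instrumentator import Instrumentator

app = FastAPI()
# Default API metrics: request rate, latency, status codes.
Instrumentator().instrument(app).expose(app)

# Custom model metrics (names are illustrative).
INFERENCE_LATENCY = Histogram("model_inference_latency_seconds", "Inference latency")
PREDICTION_CONFIDENCE = Histogram("model_prediction_confidence", "Prediction confidence")
MODEL_ERRORS = Counter("model_errors_total", "Prediction errors")


@app.post("/predict")
def predict(payload: dict):
    start = time.perf_counter()
    try:
        prediction, confidence = 3, 0.8  # placeholder for a real model call
        PREDICTION_CONFIDENCE.observe(confidence)
        return {"predictions": [prediction]}
    except Exception:
        MODEL_ERRORS.inc()
        raise
    finally:
        INFERENCE_LATENCY.observe(time.perf_counter() - start)
```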
- Grafana (http://localhost:3000):
  - Default credentials: admin/admin
  - Pre-configured dashboards:
    - ML Model Metrics
    - System Metrics
    - API Metrics
- Prometheus (http://localhost:9090):
  - Query metrics directly
  - View targets and alerts
- AlertManager (http://localhost:9093):
  - View and manage alerts
  - Configure notifications
Use the provided load testing script to generate traffic and test the monitoring stack:
# Run basic load test
python load_test.py
# Run with custom parameters
python load_test.py --duration 600 --workers 20 --rps 10
# Generate some errors for testing alerts
python load_test.py --inject-errors
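load_test.py is included in the repository; the sketch below only illustrates the general idea behind such a script (concurrent workers posting to the serving endpoint for a fixed duration) and is not the actual implementation:

```python
import concurrent.futures
import time

import requests

URL = "http://localhost:5050/invocations"  # model-serving endpoint from above
PAYLOAD = {"dataframe_split": {
    "columns": ["fixed acidity", "volatile acidity", "citric acid", "residual sugar",
                "chlorides", "free sulfur dioxide", "total sulfur dioxide",
                "density", "pH", "sulphates", "alcohol"],
    "data": [[7, 0.27, 0.36, 20.7, 0.045, 45, 170, 1.001, 3, 0.45, 8.8]]}}


def send_one(_):
    # Send one prediction request; return the status code (or None on failure).
    try:
        return requests.post(URL, json=PAYLOAD, timeout=5).status_code
    except requests.RequestException:
        return None


def run(duration=60, workers=10, rps=5):
    # Fire roughly `rps` requests per second using a thread pool.
    end = time.time() + duration
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        while time.time() < end:
            list(pool.map(send_one, range(rps)))
            time.sleep(1)


if __name__ == "__main__":
    run()
```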
The following alerts are configured:
- High Error Rate
  - Triggers when the error rate exceeds 50%
  - 5-minute evaluation window
- Low Model Confidence
  - Triggers when average confidence falls below 0.6
  - 5-minute evaluation window
- High Latency
  - Triggers when average prediction time exceeds 1 second
  - 5-minute evaluation window
- Down Instance
  - Triggers when a service in the stack is down for a period of time
  - 3-minute evaluation window
Alerts can be configured to send notifications to Slack or email.
Use the provided alert-testing script to simulate faulty scenarios:
python test_alerts.py