STACK

MLOps + AutoML approach to stroke prediction

This is a demo MLOps deployment for model training and stream processing, using an AutoGluon AutoML, MLflow and Airflow stack, applied to a stroke prediction use case. The stack shows an approach to processing data in near real time, using Airflow DAGs for the ETL, training and evaluation pipelines, and MLflow as the model storage and management service. Model training is performed automatically with the AutoGluon AutoML framework. The stroke data is fully available in an open Kaggle dataset.

Getting Started:

The architecture is ready to be deployed with Docker and docker-compose. Before following the next steps, make sure both are installed and ready to use. The docker-compose.yml file builds an MLflow Tracking Server with PostgreSQL as the metadata store and MinIO as the artifact repository (MinIO is a standalone, S3-compatible alternative to Amazon S3). An NGINX server is used as a reverse proxy to secure the communications. An Airflow container is also deployed, which likewise uses PostgreSQL as its metadata store. Finally, a Jupyter instance is deployed to check the training results.

Follow these steps to build the MLflow-Airflow-AutoML stack:

  1. Install Docker (both the docker and docker-compose commands must be available).
  2. git clone this repository.
  3. docker-compose up -d
  4. Open the MLflow UI at http://your-docker-machine-ip:80
  5. Open MinIO at http://your-docker-machine-ip:9000
  6. Open the Airflow UI at http://your-docker-machine-ip:8080
  7. Open the JupyterLab UI at http://your-docker-machine-ip:1995
  8. Within MinIO, create a new bucket named mlflow-bucket (a scripted alternative is sketched after this list).
  9. Enjoy!
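
If you prefer to script step 8 instead of using the MinIO console, a minimal sketch (not part of the repository) using boto3 and the demo credentials from docker-compose.yml could look like this:

    import boto3

    # MinIO speaks the S3 API, so a plain S3 client pointed at port 9000 works
    s3 = boto3.client(
        "s3",
        endpoint_url="http://your-docker-machine-ip:9000",
        aws_access_key_id="AKIAIOSFODNN7EXAMPLE",
        aws_secret_access_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    )
    s3.create_bucket(Bucket="mlflow-bucket")  # the artifact bucket MLflow expects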

The following variables are used as environment variables and can be found in the docker-compose.yml file:

  • AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
  • AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
  • MLFLOW_S3_ENDPOINT_URL=http://minio:9000
  • MLFLOW_TRACKING_URI=http://s3server:80

Note: These credentials are only for the demo.
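
As a quick sanity check, a minimal client-side sketch (not part of the repository) that consumes these variables to log a run against the stack might look like this. Note that the minio and s3server hostnames only resolve inside the docker-compose network; from the host, use the UI addresses listed above.

    import os
    import mlflow

    # Demo credentials copied from docker-compose.yml -- do not reuse them in production
    os.environ["AWS_ACCESS_KEY_ID"] = "AKIAIOSFODNN7EXAMPLE"
    os.environ["AWS_SECRET_ACCESS_KEY"] = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://minio:9000"  # MinIO stands in for S3
    mlflow.set_tracking_uri("http://s3server:80")               # NGINX-fronted tracking server

    mlflow.set_experiment("stroke_demo_airflow")
    with mlflow.start_run():
        mlflow.log_param("smoke_test", True)  # params land in the PostgreSQL metadata store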

Starting Airflow DAGs:

The process is orchestrated using Apache Airflow, which contains three DAGs:

  • dags/stroke_insert_data.py: Represents the ETL process.
  • dags/stroke_train_model.py: Trains a classification model from the inserted data using the AutoGluon AutoML framework and uploads it to MLflow automatically. Airflow instantiates AutoGluon during the training process, trying a number of classification models and keeping the best fit; the best fit is then uploaded as an experiment to the MLflow bucket (a rough sketch follows this list).
  • dags/stroke_eval_data.py: Evaluates newly inserted data in a batch process using the best model uploaded to MLflow.
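
The training step could be sketched roughly as follows; this is illustrative, not the repository's actual DAG code, and the CSV path and logged names are assumptions:

    import mlflow
    import pandas as pd
    from autogluon.tabular import TabularPredictor

    train_df = pd.read_csv("stroke_data.csv")  # illustrative path; the DAG uses the inserted rows
    mlflow.set_experiment("stroke_demo_airflow")

    with mlflow.start_run():
        # "stroke" is the target column of the Kaggle stroke dataset
        predictor = TabularPredictor(label="stroke").fit(train_df)
        leaderboard = predictor.leaderboard()  # one row per candidate model, best first
        mlflow.log_metric("best_score_val", float(leaderboard["score_val"].iloc[0]))
        mlflow.log_artifacts(predictor.path, artifact_path="autogluon_model")  # ship the best fit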

Turn on the three DAGs. Each process runs every hour. If you want to speed up the streaming and training processes, you can trigger the DAGs manually or change the cron schedule configuration.
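
For reference, an hourly schedule in an Airflow DAG is declared roughly like this (an illustrative sketch, not the repository's actual DAG definition):

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def train_model():
        ...  # AutoGluon training + MLflow logging, as sketched above

    with DAG(
        dag_id="stroke_train_model_example",  # illustrative id
        start_date=datetime(2023, 1, 1),
        schedule_interval="@hourly",          # edit this to change the cadence
        catchup=False,
    ) as dag:
        PythonOperator(task_id="train", python_callable=train_model)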

Tracking Models:

Open the MLflow UI at http://your-docker-machine-ip:80. If the training process is running correctly, you will find a stroke_demo_airflow experiment group. You can choose the model to be instantiated in the evaluation process by changing its model stage to Staging. If no model is moved to Staging, the evaluation process will fail.
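
The same promotion can also be done from code with the MLflow client; a minimal sketch (the registered-model name and version are illustrative, use the ones shown in the UI):

    from mlflow.tracking import MlflowClient

    client = MlflowClient(tracking_uri="http://your-docker-machine-ip:80")
    client.transition_model_version_stage(
        name="stroke_model",  # illustrative registered-model name
        version=1,
        stage="Staging",      # the evaluation DAG needs at least one version in Staging
    )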

Tracking training metrics:

Open the JupyterLab UI at http://your-docker-machine-ip:1995 and go to the metrick.ipynb notebook. It shows the evolution of the trained models' metrics so that you can evaluate the precision, accuracy and performance of your pipeline.
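
The notebook's queries can be reproduced with the MLflow client; a minimal sketch (assuming the DAG logs a metric named accuracy, adjust to the names you see in the MLflow UI):

    import mlflow
    import matplotlib.pyplot as plt

    mlflow.set_tracking_uri("http://your-docker-machine-ip:80")
    experiment = mlflow.get_experiment_by_name("stroke_demo_airflow")
    runs = mlflow.search_runs([experiment.experiment_id]).sort_values("start_time")

    plt.plot(runs["start_time"], runs["metrics.accuracy"], marker="o")
    plt.xlabel("run start time")
    plt.ylabel("accuracy")
    plt.title("stroke_demo_airflow metric evolution")
    plt.show()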
