The financial sector is rapidly evolving, and Non-Banking Financial Institutions (NBFIs) are at the forefront of innovation. As digital lending grows, accurate loan default prediction has become critical for risk management and financial stability.
This repository presents an End-to-End MLOps Architecture for Loan Default Prediction designed to automate the entire machine learning lifecycle—from data ingestion to model deployment and monitoring—leveraging the power of Azure Cloud.
By implementing MLOps best practices, this solution enhances model reliability, scalability, and reproducibility—making it ready for real-world deployment in financial services.
The project follows the Medallion Architecture implemented using Azure SQL Database for structured data processing. The flow involves three main data layers:
- Data is collected from multiple sources and stored in the Azure SQL Database.
- Data is extracted from CSV files and loaded into the `bronze.Bronze_Customer` table (a minimal load sketch follows this list).
- Azure Data Factory (ADF) is used for incremental loading into Azure SQL.
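As an illustration of the bronze load, here is a minimal pandas sketch. The `bronze.Bronze_Customer` table name comes from this project; the connection string, ODBC driver, and CSV file name are placeholders to adapt to your own environment.

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder Azure SQL connection string -- substitute your own server,
# database, and credentials (ideally via environment variables or Key Vault).
CONN_STR = (
    "mssql+pyodbc://<user>:<password>@<server>.database.windows.net:1433/"
    "<database>?driver=ODBC+Driver+18+for+SQL+Server"
)
engine = create_engine(CONN_STR)

# Append the raw CSV extract to bronze.Bronze_Customer unchanged;
# the bronze layer keeps source data as-is.
raw = pd.read_csv("customer_extract.csv")  # placeholder file name
raw.to_sql("Bronze_Customer", engine, schema="bronze",
           if_exists="append", index=False)
```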
- Data is cleaned and transformed locally using Python (pandas).
- Cleaned data is loaded into the `silver.Silver_Customer` table in Azure SQL (see the cleaning sketch after this list).
- Databricks is used for transformations to ensure high-quality, structured data.
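A minimal sketch of that cleaning step, assuming a few illustrative columns (`client_id`, `income`, `signup_date`) that are not taken from the actual schema; `engine` is the Azure SQL connection from the bronze sketch above.

```python
import pandas as pd

# Read the raw bronze data (engine as in the bronze sketch above).
bronze = pd.read_sql_table("Bronze_Customer", engine, schema="bronze")

clean = (
    bronze.drop_duplicates(subset=["client_id"])  # drop duplicate customers
    # illustrative imputation -- the real rules depend on the dataset
    .assign(income=lambda df: df["income"].fillna(df["income"].median()))
)
clean["signup_date"] = pd.to_datetime(clean["signup_date"], errors="coerce")

# Overwrite the silver table with the cleaned snapshot.
clean.to_sql("Silver_Customer", engine, schema="silver",
             if_exists="replace", index=False)
```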
- Feature engineering is performed on the silver layer data.
- Transformed data is stored in the gold layer (`gold.Gold_Customer` table); a feature-engineering sketch follows this list.
- The model training pipeline uses MLflow for experiment tracking and logs the best models.
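A sketch of the feature-engineering step that builds the gold table; the derived columns (`debt_to_income`, `loan_to_value`) and their input columns are placeholders, not the project's actual features.

```python
import numpy as np
import pandas as pd

silver = pd.read_sql_table("Silver_Customer", engine, schema="silver")

# Placeholder engineered features -- substitute the project's real ones.
gold = silver.assign(
    debt_to_income=silver["total_debt"] / silver["income"].replace(0, np.nan),
    loan_to_value=silver["loan_amount"] / silver["collateral_value"],
)

gold.to_sql("Gold_Customer", engine, schema="gold",
            if_exists="replace", index=False)
```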
- Data Extraction: Load the `gold.Gold_Customer` table for model training.
- Preprocessing: Apply feature scaling, encoding, and missing value handling.
- Model Training: Train multiple candidate models: Logistic Regression, XGBoost, and LightGBM.
- MLflow Integration: Store models as artifacts in Azure ML Studio.
- Model Selection: Choose the best-performing model and register it in MLflow (a condensed training sketch follows this list).
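A condensed sketch of that pipeline with MLflow tracking and registration, shown for Logistic Regression only; the experiment name, the registered model name, and the `default_flag` target column are assumptions.

```python
import mlflow
import mlflow.sklearn
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Gold-layer features; `default_flag` and `client_id` are assumed column names.
df = pd.read_sql_table("Gold_Customer", engine, schema="gold")
X = df.drop(columns=["default_flag", "client_id"])
y = df["default_flag"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

mlflow.set_experiment("loan-default-prediction")  # assumed experiment name

with mlflow.start_run() as run:
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    mlflow.log_metric("test_auc", auc)
    mlflow.sklearn.log_model(model, artifact_path="model")
    # Register so the deployment step can pull the model by name.
    mlflow.register_model(f"runs:/{run.info.run_id}/model", "loan-default-model")
```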
- The best model is retrieved from Azure ML Studio and packaged inside a Flask Web API (a minimal endpoint sketch follows this section).
- The API is containerized using Docker and pushed to Azure Container Registry (ACR).
- The `Dockerfile` ensures that dependencies are installed correctly and the Flask app runs smoothly.
- Azure Service Principal is used to securely access Azure ML & ACR.
- The service principal credentials are stored in a Kubernetes Secret for authentication.
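A minimal sketch of such a Flask endpoint. It assumes the model was registered as `loan-default-model` (as in the training sketch) and that the MLflow tracking URI is injected as an environment variable from the Kubernetes Secret.

```python
import os

import mlflow.sklearn
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)

# Tracking URI comes from an env var populated by the Kubernetes Secret;
# the model name and "Production" stage are assumptions.
mlflow.set_tracking_uri(os.environ["MLFLOW_TRACKING_URI"])
model = mlflow.sklearn.load_model("models:/loan-default-model/Production")

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()        # raw client features as JSON
    features = pd.DataFrame([payload])  # single-row frame for the model
    proba = float(model.predict_proba(features)[0, 1])
    return jsonify({
        "prediction_probability": proba,
        "predicted_class": int(proba >= 0.5),
    })

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```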
- Azure Kubernetes Service (AKS) is used to deploy the containerized model.
- Steps involved:
- Create an AKS Cluster.
- Deploy the Flask API as a Kubernetes pod.
- Use a LoadBalancer service to expose the API.
- Store environment variables securely using Kubernetes Secrets.
- Every model prediction is stored in the `Predictions` table (a logging sketch follows this section). The stored data includes:
  - `run_id` (MLflow Run ID)
  - `client_id` (User Identifier)
  - `input_raw_data` (JSON format)
  - `processed_data` (Final model input)
  - `prediction_probability`
  - `predicted_class`
  - `prediction_timestamp`
- Azure Monitor & Log Analytics track API performance and error logs.
- Application Insights helps in real-time monitoring of API health.
- Prometheus & Grafana are planned for model drift detection.
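A sketch of how each prediction could be written to the `Predictions` table with SQLAlchemy; the function signature and timestamp handling are illustrative, while the columns mirror the list above.

```python
import json
from datetime import datetime, timezone

from sqlalchemy import text

def log_prediction(engine, run_id, client_id, raw, processed, proba, label):
    """Insert one record into the Predictions table (columns as listed above)."""
    with engine.begin() as conn:
        conn.execute(
            text(
                "INSERT INTO Predictions (run_id, client_id, input_raw_data, "
                "processed_data, prediction_probability, predicted_class, "
                "prediction_timestamp) VALUES (:run_id, :client_id, :raw, "
                ":processed, :proba, :label, :ts)"
            ),
            {
                "run_id": run_id,
                "client_id": client_id,
                "raw": json.dumps(raw),              # original request JSON
                "processed": json.dumps(processed),  # final model input
                "proba": proba,
                "label": label,
                "ts": datetime.now(timezone.utc),
            },
        )
```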
The CI/CD pipeline automates model deployment when new code is pushed to GitHub.
- Code Push & Trigger CI/CD: GitHub Actions trigger on `main` branch commits.
- Build & Test: Runs unit tests, linting, and security scans.
- Build Docker Image: Creates a new image with the latest model.
- Push to Azure Container Registry (ACR).
- Deploy to AKS using a Kubernetes YAML configuration.
- Monitor Deployment Status using `kubectl` commands.
- Implement Model Retraining Pipeline with AutoML in Azure ML.
- Enable Drift Detection using Evidently AI & Prometheus (a minimal Evidently sketch follows this list).
- Extend the system to streaming data using Azure Event Hubs.
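For the planned drift detection, a minimal sketch using Evidently's `Report` API (as in Evidently 0.4.x); the parquet file names and the Prometheus hand-off are assumptions.

```python
import pandas as pd
from evidently.metric_preset import DataDriftPreset
from evidently.report import Report

# Assumed inputs: the training-time gold snapshot vs. recently scored
# feature rows with the same columns (file names are placeholders).
reference = pd.read_parquet("gold_training_snapshot.parquet")
current = pd.read_parquet("recent_inference_features.parquet")

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)
report.save_html("drift_report.html")  # human-readable summary

# Dataset-level drift flag -- could be exported as a Prometheus gauge.
drifted = report.as_dict()["metrics"][0]["result"]["dataset_drift"]
print("Dataset drift detected:", drifted)
```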
This project demonstrates a fully automated ML pipeline from data ingestion to model deployment with Azure services. It ensures scalability, security, and automation for real-world financial applications.
Author: Rohit Kosamkar
📌 Repository: GitHub Repo