Deployment Infrastructure

Debanjan Saha edited this page Apr 1, 2024 · 1 revision

Our deployment infrastructure will be hosted on Google Cloud Platform (GCP) using Google Kubernetes Engine (GKE) for efficient container orchestration. This choice provides scalability, ease of management, and integration with various GCP services.

Infrastructure Components:

GCP GKE Cluster:

  • GKE will serve as the foundation for hosting our machine learning model containers. It provides automated scaling and cluster management, and we will use Helm charts to package and deploy our Kubernetes resources, giving us a robust and resilient deployment.

Docker Containers:

  • The machine learning model, along with its dependencies, will be containerized using Docker. This ensures consistent deployment across different environments, facilitating reproducibility.

MLflow for Model Tracking:

  • MLflow will be integrated into our MLOps pipeline for model tracking and management. It provides capabilities for tracking experiments, packaging code, and sharing and deploying models.
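As a rough illustration of experiment tracking, the sketch below logs one training run through MLflow's Python API. The tracking URI, experiment name, and the `flatten_params` helper are hypothetical and not taken from this project. `mlflow` is a third-party package, so it is imported inside the function and the helper stays dependency-free.

```python
from typing import Any, Mapping


def flatten_params(config: Mapping[str, Any], prefix: str = "") -> dict:
    """Flatten a nested config into dotted keys, e.g. {"model": {"lr": 0.1}} -> {"model.lr": 0.1}."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, Mapping):
            flat.update(flatten_params(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def log_training_run(tracking_uri: str, experiment: str,
                     config: Mapping[str, Any], metrics: Mapping[str, float]) -> str:
    """Record parameters and metrics for one run and return the MLflow run ID."""
    import mlflow  # third-party dependency, imported lazily

    mlflow.set_tracking_uri(tracking_uri)  # assumed address of the tracking server
    mlflow.set_experiment(experiment)      # created on first use if it does not exist
    with mlflow.start_run() as run:
        mlflow.log_params(flatten_params(config))
        mlflow.log_metrics(dict(metrics))
        return run.info.run_id


# Example call (hypothetical values; requires a reachable tracking server):
# log_training_run("http://mlflow.example.internal:5000", "baseline",
#                  {"model": {"lr": 0.1, "depth": 3}, "seed": 7}, {"rmse": 0.42})
```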

Airflow for Orchestration:

  • Most of our non-Kubernetes components will be orchestrated using Apache Airflow. Airflow lets us define DAGs (directed acyclic graphs) that schedule periodic tasks such as executing data flows, pre-processing data, running experiments, and executing custom tasks.
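To make the DAG idea concrete, here is a minimal sketch of how task dependencies determine execution order. It deliberately uses Python's standard-library `graphlib` rather than Airflow's own API, and the task names are hypothetical. In Airflow the same dependencies would be declared between operators inside a DAG definition.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "ingest_data": set(),
    "preprocess_data": {"ingest_data"},
    "run_experiment": {"preprocess_data"},
    "publish_report": {"run_experiment"},
}


def execution_order(dependencies):
    """Return a valid run order; raises graphlib.CycleError if the graph contains a cycle."""
    return list(TopologicalSorter(dependencies).static_order())


def run_pipeline(dependencies, tasks):
    """Call each task only after all of its upstream tasks have finished."""
    results = {}
    for name in execution_order(dependencies):
        results[name] = tasks[name]()
    return results
```

A scheduler such as Airflow adds what this sketch leaves out: periodic triggering, retries, and running independent tasks in parallel.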

Deployment Process:

CI/CD Pipeline:

  • A continuous integration and continuous deployment (CI/CD) pipeline built on Cloud Build will automate the deployment process. The pipeline will include steps for testing, building Docker images, deploying to GKE, and managing MLflow experiments.
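The pipeline stages can be sketched as an ordered list of shell commands. This is only an outline under stated assumptions: the image repository, release name, and chart path are hypothetical, `pytest` is assumed as the test runner, and `image.tag` is assumed to be the chart value that selects the image version. In practice these steps would live in the CI service's own build configuration.

```python
import shlex
import subprocess


def pipeline_commands(image_repo, git_sha, release, chart_dir, namespace="default"):
    """Return the test, build, push, and deploy commands in the order they should run."""
    image = f"{image_repo}:{git_sha}"
    return [
        ["python", "-m", "pytest"],             # run the test suite (assumed runner)
        ["docker", "build", "-t", image, "."],  # build the model image
        ["docker", "push", image],              # publish it to the registry
        ["helm", "upgrade", "--install", release, chart_dir,
         "--namespace", namespace,
         "--set", f"image.tag={git_sha}"],      # roll the new tag out to the cluster
    ]


def run(commands, dry_run=True):
    """Print each command; execute it unless dry_run is set, stopping at the first failure."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Tagging the image with the commit SHA keeps each deployment traceable to the exact source revision that produced it.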

Kubernetes Deployments:

  • Helm charts containing Kubernetes manifests will define the deployment specifications for our machine learning model and accompanying services. These manifests will be version-controlled and applied to the GKE cluster as part of the CI/CD process.
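For orientation, the sketch below builds the kind of Deployment manifest a chart template might render, expressed as a Python dictionary and serialized to JSON (which `kubectl apply -f` accepts alongside YAML). The service name, image, replica count, and port are hypothetical placeholders, not this project's actual values.

```python
import json


def deployment_manifest(name, image, replicas=2, port=8080):
    """Build a minimal apps/v1 Deployment for a single-container model service."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = deployment_manifest("ml-model", "gcr.io/my-project/model:abc123")
    print(json.dumps(manifest, indent=2))
```

A Helm chart would parameterize the same fields (image tag, replica count, port) through its values file so that each environment can override them.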

MLflow Integration:

  • MLflow server components will be deployed as part of the GKE cluster. MLflow Tracking will be integrated to log and organize experiments, parameters, metrics, and artifacts. MLflow Models will enable easy model versioning and deployment.
