This project provides a robust, production-ready CI/CD framework for managing Machine Learning workloads on Databricks. It leverages Databricks Asset Bundles (DABs) for resource management, Azure Pipelines for automated testing and deployment, and Apache Airflow for orchestration.
- Compute: Azure Databricks.
- Resource Management: Databricks Asset Bundles (DABs).
- CI/CD: Azure Pipelines with ephemeral Kubernetes-based agents.
- Orchestration: Apache Airflow (running on Kubernetes).
- Quality Gate: SonarQube.
- Artifacts: Python Wheels hosted on Azure Artifacts.
.
├── azure-pipelines.yml # CI/CD pipeline definition
├── databricks.yml # DAB configuration (Dev/Prod targets)
├── setup.py # Python package configuration
├── dags/ # Airflow DAGs
│ └── train_model_dag.py # Model training orchestration
├── infra/ # Kubernetes deployment manifests
│ ├── azure-devops-agents/ # KEDA-scaled ephemeral build agents
│ │ ├── deploy.sh
│ │ ├── 01-namespace.yaml
│ │ ├── 02-secret.yaml
│ │ ├── 03-trigger-auth.yaml
│ │ └── 04-scaledjob.yaml
│ ├── airflow/ # Airflow Helm deployment with git-sync
│ │ ├── deploy.sh
│ │ ├── 01-namespace.yaml
│ │ ├── 02-secrets.yaml
│ │ └── 03-helm-values.yaml
│ └── README.md # Detailed deployment guide
├── src/ # Core logic (Python package)
│ └── example_package/
└── tests/ # Unit and integration tests
- Databricks CLI: Installed and configured.
- Python 3.10+: For local development and testing.
- Azure DevOps: Access to the project and agent pools.
- Clone the repository:
    git clone <repo-url>
    cd databricks-cicd
- Install dependencies:
    pip install -e .
    pip install pytest ruff
- Run tests:
    pytest tests/
- Validate DAB Configuration:
    databricks configure --host "https://<Instance>" --token
    databricks bundle validate -t dev
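The validation step above can also be scripted so that both bundle targets are checked in one go, much as the CI stage does. The following is a minimal sketch, not part of the repository: the helper names `bundle_validate_cmd` and `validate_targets` are hypothetical, and it assumes the `databricks` CLI is on the PATH and already configured.

```python
import subprocess
from typing import Callable, Iterable, List


def bundle_validate_cmd(target: str) -> List[str]:
    """Build the `databricks bundle validate` invocation for one target."""
    return ["databricks", "bundle", "validate", "-t", target]


def validate_targets(
    targets: Iterable[str] = ("dev", "prod"),
    run: Callable[..., object] = subprocess.run,
) -> List[List[str]]:
    """Validate each target in order, stopping at the first failure.

    `run` is injectable so the function can be exercised without the CLI;
    by default it is subprocess.run.
    """
    executed = []
    for target in targets:
        cmd = bundle_validate_cmd(target)
        # With subprocess.run, check=True raises CalledProcessError
        # when the CLI exits with a non-zero status.
        run(cmd, check=True)
        executed.append(cmd)
    return executed
```

Calling `validate_targets()` with no arguments would shell out to the CLI for `dev` and then `prod`; passing a stub as `run` makes it a dry run.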
The project follows a branch-based deployment strategy:
- Continuous Integration (CI):
  - Triggered on the develop and main branches.
  - Runs linting (ruff) and unit tests (pytest).
  - Performs SonarQube code analysis.
  - Validates Databricks Bundles for both dev and prod targets.
  - Packages the code as a Python Wheel and uploads it to Azure Artifacts.
- Continuous Deployment (CD):
  - Development: Automatic deployment to Databricks when code is merged into the develop branch.
  - Production: Automatic deployment to Databricks when code is merged into the main branch. This stage requires manual approval in Azure DevOps Environments.
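The branch rule above amounts to a small lookup from branch name to bundle target. The real logic lives in azure-pipelines.yml; the Python sketch below only illustrates the mapping, and the names `BRANCH_TARGETS`, `deployment_target`, and `requires_approval` are hypothetical.

```python
from typing import Optional

# Branch-to-target mapping as described in the deployment strategy.
BRANCH_TARGETS = {"develop": "dev", "main": "prod"}


def deployment_target(branch: str) -> Optional[str]:
    """Return the bundle target a branch deploys to, or None if it does not deploy."""
    # Azure Pipelines reports branches as full refs such as "refs/heads/main".
    name = branch.removeprefix("refs/heads/")
    return BRANCH_TARGETS.get(name)


def requires_approval(target: Optional[str]) -> bool:
    """Only the production stage is gated by a manual approval."""
    return target == "prod"
```

Any other branch, such as a feature branch, maps to `None` and would run CI checks only.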
The train_model_orchestration DAG in dags/ is environment-aware. It uses an Airflow Variable env (defaulting to dev) to determine which Databricks Job to trigger.
- Dev: Triggers "[DEV] Train Model"
- Prod: Triggers "[PROD] Train Model"
The DAG uses the DatabricksRunNowOperator and references jobs by name rather than by job ID, so it keeps working if a bundle redeployment recreates a job under a new ID.
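The environment-to-job lookup can be sketched as a plain function. This is an illustration under stated assumptions, not the contents of train_model_dag.py: `JOB_NAMES` and `resolve_job_name` are hypothetical names, and only the two job names come from the list above.

```python
# Job names per environment, as listed above.
JOB_NAMES = {
    "dev": "[DEV] Train Model",
    "prod": "[PROD] Train Model",
}


def resolve_job_name(env: str = "dev") -> str:
    """Map the value of the Airflow Variable `env` to a Databricks job name."""
    key = env.strip().lower()
    if key not in JOB_NAMES:
        # Fail loudly rather than silently triggering the wrong job.
        raise ValueError(f"Unknown env {env!r}; expected one of {sorted(JOB_NAMES)}")
    return JOB_NAMES[key]


# In the DAG, the result would presumably be passed to the operator's
# `job_name` argument, with `env` read from the Airflow Variable, e.g.:
#   DatabricksRunNowOperator(task_id="train_model", job_name=resolve_job_name(env))
```

Raising on an unknown value is a design choice of this sketch; the actual DAG may handle a misconfigured variable differently.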
The infrastructure components are designed to run on Kubernetes. See infra/README.md for detailed deployment instructions.
- Airflow: Configured with KubernetesExecutor for scalable task execution. DAGs are synced via git-sync.
- Build Agents: Uses KEDA (Kubernetes Event-driven Autoscaling) to spin up Azure Pipelines agents on demand in the k8s-ephemeral-pool.
Deploy both components:
./infra/azure-devops-agents/deploy.sh
./infra/airflow/deploy.sh