Sample Airflow setup with Kubernetes

Apache Airflow sample setup with Kubernetes.

Spins up a local Kubernetes cluster for Airflow with Kind:

Leverages Airflow's official Helm chart.
Uses a PostgreSQL database for Airflow.
Uses Kubernetes Executor to run each Airflow task on an isolated pod.

This is meant as a sample local setup. In order to run it in a production environment, please refer to the Airflow Helm chart production guide.

Requirements

Kind, Docker and Helm for local Kubernetes cluster.

Instructions

The repo includes a Makefile. You can run make help to see usage.

Basic setup:

Run make k8s-cluster-up to spin up local Kubernetes cluster with Kind.
Run make airflow-k8s-add-helm-chart to add the official Airflow Helm chart to the local repo.
Run make airflow-k8s-create-namespace to create a namespace for the Airflow deployment.
Run make airflow-k8s-up to deploy Airflow on the local Kubernetes cluster.
On a separate terminal, run make airflow-webserver-port-forward to be able to access the Airflow webserver on http://localhost:8080.

The credentials for the webserver are admin/admin.

Configuration

If you need to customize the Airflow deployment you can edit values.yaml accordingly.

if you need to tune Airflow configuration you can add the corresponding environment variables in the env section of values.yaml.

The default values.yaml of the source Helm chart can be seen here.

DAG deployment

DAGs are deployed via GitSync.

GitSync acts as a side car container alongside the other Airflow pods, synchronising the dags/ folder in the pods with the DAGs located in a Git repo of your choice (in this case https://github.com/guidok91/airflow/tree/master/dags).

Custom Docker image for pods

A custom Docker image is provided for the pods. Here we can install the Airflow dependencies we need.

Logs

So that Airflow logs don't get lost every time a task finishes (e.g. the pod gets deleted), the setup provides a PersistentVolume that shares the logs with the host system in the data/ folder.

Dependency management

Dependabot is configured to periodically upgrade repo dependencies. See dependabot.yml.

Name		Name	Last commit message	Last commit date
Latest commit History 60 Commits
.github		.github
dags		dags
data		data
k8s		k8s
tests		tests
.gitignore		.gitignore
Dockerfile		Dockerfile
Makefile		Makefile
README.md		README.md
requirements-dev.txt		requirements-dev.txt
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Sample Airflow setup with Kubernetes

Requirements

Instructions

Configuration

DAG deployment

Custom Docker image for pods

Logs

Dependency management

About

Releases

Packages

Contributors 2

Languages

guidok91/airflow

Folders and files

Latest commit

History

Repository files navigation

Sample Airflow setup with Kubernetes

Requirements

Instructions

Configuration

DAG deployment

Custom Docker image for pods

Logs

Dependency management

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages