diff --git a/stable/airflow/.helmignore b/stable/airflow/.helmignore new file mode 100644 index 000000000000..6b8710a711f3 --- /dev/null +++ b/stable/airflow/.helmignore @@ -0,0 +1 @@ +.git diff --git a/stable/airflow/Chart.yaml b/stable/airflow/Chart.yaml new file mode 100644 index 000000000000..521e598f9361 --- /dev/null +++ b/stable/airflow/Chart.yaml @@ -0,0 +1,14 @@ +description: Airflow is a platform to programmatically author, schedule and monitor workflows +name: airflow +version: 0.9.0 +appVersion: 1.10.0 +icon: https://airflow.apache.org/_images/pin_large.png +home: https://airflow.apache.org/ +maintainers: + - name: gsemet + email: gaetan@xeberon.net +sources: + - https://airflow.apache.org/ +keywords: + - workflow + - dag diff --git a/stable/airflow/OWNERS b/stable/airflow/OWNERS new file mode 100644 index 000000000000..14e08be034e1 --- /dev/null +++ b/stable/airflow/OWNERS @@ -0,0 +1,6 @@ +approvers: +- gsemet +- maver1ck +reviewers: +- gsemet +- maver1ck diff --git a/stable/airflow/README.md b/stable/airflow/README.md new file mode 100644 index 000000000000..0ff5931c8618 --- /dev/null +++ b/stable/airflow/README.md @@ -0,0 +1,264 @@ +# Airflow / Celery + +[Airflow](https://airflow.apache.org/) is a platform to programmatically author, schedule and monitor workflows. + + +## Install Chart + +To install the Airflow Chart into your Kubernetes cluster: + +```bash +helm install --namespace "airflow" --name "airflow" stable/airflow +``` + +After installation succeeds, you can get the status of the Chart: + +```bash +helm status "airflow" +``` + +If you want to delete your Chart, use this command: + +```bash +helm delete --purge "airflow" +``` + +### Helm ingresses + +The Chart provides ingress configuration to allow customizing the installation by adapting the `values.yaml` depending on your setup. +Please read the comments in the `values.yaml` file for more details on how to configure your reverse proxy or load balancer.
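As a sketch of the kind of reverse-proxy customization described above, here is what the ingress section of `values.yaml` might look like. This assumes the NGINX ingress controller and the hostname `airflow.mycompany.com`, both hypothetical — adapt the annotation names to whatever controller fronts your cluster:

```yaml
# Sketch only — annotation names assume the NGINX ingress controller.
ingress:
  enabled: true
  web:
    host: airflow.mycompany.com
    path: /airflow            # served under a sub path (Airflow >= 2.0.x, see below)
    annotations:
      kubernetes.io/ingress.class: nginx
  flower:
    host: airflow.mycompany.com
    path: /airflow/flower
    annotations:
      kubernetes.io/ingress.class: nginx
      # Flower expects to be served from "/", so strip the prefix:
      nginx.ingress.kubernetes.io/rewrite-target: /
```

A traefik equivalent (using the `PathPrefix` / `PathPrefixStrip` frontend rule annotations) is shown in `examples/minikube-values.yaml`.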
+ +### Chart Prefix + +This Helm chart automatically prefixes all names using the release name to avoid collisions. + +### URL prefix + +This chart exposes 2 endpoints: + +- Airflow Web UI +- Flower, a debug UI for Celery + +Both can be placed either at the root of a domain or at a sub path, for example: + +``` +http://mycompany.com/airflow/ +http://mycompany.com/airflow/flower +``` + +NOTE: Mounting the Airflow UI under a sub path requires an Airflow version >= 2.0.x. For the moment (June 2018) this is **not** available in an official package; you will have to use an image where Airflow has been updated to its current HEAD. You can use the following image: `stibbons31/docker-airflow-dev:2.0dev`. It is rebased regularly on top of the `puckel/docker-airflow` image. + +Please also note that the Airflow UI and Flower do not behave the same way: + +- The Airflow Web UI behaves transparently; to configure it, one just needs to specify the `ingress.web.path` value. +- Flower cannot handle this scheme directly and requires a URL rewrite mechanism in front of it. In short, it is able to generate the right URLs in the returned HTML file but cannot respond to these URLs. This is common in software that wasn't intended to work under anything other than a root URL or a localhost port. To use it, see `values.yaml` for how to configure your ingress controller to rewrite the URL (or "strip" the prefix path). + + Note: unreleased Flower (as of June 2018) no longer needs the prefix-strip feature; this is already integrated in the `docker-airflow-dev:2.0dev` image. + +### Airflow configuration + +`airflow.cfg` configuration can be changed by defining environment variables of the form `AIRFLOW__<section>__<key>`. + +See the [Airflow documentation](http://airflow.readthedocs.io/en/latest/configuration.html?highlight=__CORE__#setting-configuration-options) for more information. + +This Helm chart allows you to add these additional settings with the value key `airflow.config`. You can also add generic environment variables such as proxy or private pypi settings: + +```yaml +airflow: + config: + AIRFLOW__CORE__EXPOSE_CONFIG: True + PIP_INDEX_URL: http://pypi.mycompany.com/ + PIP_TRUSTED_HOST: pypi.mycompany.com + HTTP_PROXY: http://proxy.mycompany.com:1234 + HTTPS_PROXY: http://proxy.mycompany.com:1234 +``` + +If you are using a private image for your DAGs (see [Embedded DAGs](#embedded-dags)) or for use with the KubernetesPodOperator (available in version 1.10.0), then add an image pull secret to the airflow config: +```yaml +airflow: + image: + pullSecret: my-docker-repo-secret +``` + +### Worker StatefulSet + +Celery workers use a StatefulSet. It freezes their DNS names using a Kubernetes headless service, which allows the webserver to request the logs from each worker individually. This requires exposing a port (8793) and ensuring each pod's DNS name is resolvable by the web server pod, which is what the StatefulSet provides. + +#### Worker secrets + +You can add Kubernetes secrets which will be mounted as volumes on the worker pods at `secretsDir/`.
+```yaml +workers: + secretsDir: /var/airflow/secrets + secrets: + - redshift-user + - redshift-password + - elasticsearch-user + - elasticsearch-password +``` + +With the above configuration, you could read the `redshift-user` password from within a DAG or other function using: +```python +from pathlib import Path + +def get_secret(secret_name): + secrets_dir = Path('/var/airflow/secrets') + secret_path = secrets_dir / secret_name + assert secret_path.exists(), f'could not find {secret_name} at {secret_path}' + secret_data = secret_path.read_text().strip() + return secret_data + +redshift_user = get_secret('redshift-user') +``` + +To create a secret, you can use: +```bash +$ kubectl create secret generic redshift-user --from-file=redshift-user=~/secrets/redshift-user.txt +``` +Where `redshift-user.txt` contains the user secret as a single text string. + +### Local binaries + +Please note that a folder `~/.local/bin` will be automatically created and added to the PATH, so that Bash operators can use command line tools installed, for instance, by `pip install --user`. + +## DAGs Deployment + +Several options are provided for synchronizing your Airflow DAGs. + + +### Mount a Shared Persistent Volume + +You can store your DAG files on an external volume, and mount this volume into the relevant Pods (scheduler, web, worker). In this scenario, your CI/CD pipeline should update the DAG files in the PV. + +Since all Pods should have the same collection of DAG files, it is recommended to create just one PV that is shared. This ensures that the Pods are always in sync about the DagBag. + +This is controlled by setting `persistence.enabled=true`. You will have to ensure yourself that the PVC is shared properly between your pods: +- If you are on AWS, you can use [Elastic File System (EFS)](https://aws.amazon.com/efs/). +- If you are on Azure, you can use [Azure File Storage (AFS)](https://docs.microsoft.com/en-us/azure/aks/azure-files-dynamic-pv).
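The shared-volume approach can be sketched in `values.yaml` as follows. This assumes a pre-created ReadWriteMany claim named `airflow-dags` (a hypothetical name), backed by something like EFS or AFS:

```yaml
# Sketch only — `airflow-dags` is a hypothetical PVC you create beforehand,
# backed by storage that supports ReadWriteMany (EFS, AFS, NFS, ...).
persistence:
  enabled: true
  existingClaim: airflow-dags
  accessMode: ReadWriteMany       # so the scheduler, web and worker pods can all mount it
dags:
  path: /usr/local/airflow/dags   # where the volume is mounted inside each pod
```

The key names above follow what the chart's templates reference (`.Values.persistence.*`, `.Values.dags.path`).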
+ +To share a PV with multiple Pods, the PV needs to have accessMode 'ReadOnlyMany' or 'ReadWriteMany'. + +### Use init-container + +If you set `dags.initContainer.enabled=true`, the pods will try, upon startup, to fetch the git repository defined by `dags.git.url`, at reference `dags.git.ref`, and use it as the DAG folder. + +You can also add a `requirements.txt` file at the root of your DAG project to have other Python dependencies installed. + +This is the easiest way of deploying your DAGs to Airflow. + +### Embedded DAGs + +If you want more control over the way you deploy your DAGs, you can use embedded DAGs, where the DAGs are burned inside the Docker container deployed as Scheduler and Workers. + +Be aware this requires more tooling than using a shared PVC or an init-container: + +- your CI/CD should be able to build a new Docker image each time your DAGs are updated. +- your CI/CD should be able to control the deployment of this new image in your Kubernetes cluster. + +Example of procedure: + +- Fork the [puckel/docker-airflow](https://github.com/puckel/docker-airflow) repository +- Place your DAGs inside the `dags` folder of the repository, and ensure your Python dependencies are installed (for example by consuming a `requirements.txt` in your `Dockerfile`) +- Update the value of `airflow.image` in your `values.yaml` and deploy on your Kubernetes cluster + +## Helm chart Configuration + +The following table lists the configurable parameters of the Airflow chart and their default values.
+ +| Parameter | Description | Default | +|------------------------------------------|---------------------------------------------------------|---------------------------| +| `airflow.fernetKey` | Fernet key (see `values.yaml` for example) | (auto generated) | +| `airflow.service.type` | service type | `ClusterIP` | +| `airflow.executor` | the executor to run | `Celery` | +| `airflow.initRetryLoop` | max number of retries during container init | | +| `airflow.image.repository` | Airflow docker image | `puckel/docker-airflow` | +| `airflow.image.tag` | Airflow docker tag | `1.10.0-4` | +| `airflow.image.pullPolicy` | Image pull policy | `IfNotPresent` | +| `airflow.image.pullSecret` | Image pull secret | | +| `airflow.schedulerNumRuns` | -1 to loop indefinitely, 1 to restart after each exec | | +| `airflow.webReplicas` | how many replicas for the web server | `1` | +| `airflow.config` | custom airflow configuration env variables | `{}` | +| `airflow.podDisruptionBudget` | control pod disruption budget | `{'maxUnavailable': 1}` | +| `workers.enabled` | enable workers | `true` | +| `workers.replicas` | number of worker pods to launch | `1` | +| `workers.resources` | custom resource configuration for worker pod | `{}` | +| `workers.celery.instances` | number of parallel celery tasks per worker | `1` | +| `workers.pod.annotations` | annotations for the worker pods | `{}` | +| `workers.secretsDir` | directory in which to mount secrets on worker nodes | /var/airflow/secrets | +| `workers.secrets` | secrets to mount as volumes on worker nodes | [] | +| `ingress.enabled` | enable ingress | `false` | +| `ingress.web.host` | hostname for the webserver ui | "" | +| `ingress.web.path` | path of the webserver ui (read `values.yaml`) | `` | +| `ingress.web.annotations` | annotations for the web ui ingress | `{}` | +| `ingress.web.tls.enabled` | enables TLS termination at the ingress | `false` | +| `ingress.web.tls.secretName` | name of the secret containing the TLS certificate & key | `` | +| `ingress.flower.host` | hostname for the flower ui | "" | +| `ingress.flower.path` | path of the flower ui (read `values.yaml`) | `` | +| `ingress.flower.livenessPath` | path to the liveness probe (read `values.yaml`) | `/` | +| `ingress.flower.annotations` | annotations for the flower ui ingress | `{}` | +| `ingress.flower.tls.enabled` | enables TLS termination at the ingress | `false` | +| `ingress.flower.tls.secretName` | name of the secret containing the TLS certificate & key | `` | +| `persistence.enabled` | enable persistent storage for DAGs | `false` | +| `persistence.existingClaim` | if using an existing claim, specify the name here | `nil` | +| `persistence.storageClass` | Persistent Volume Storage Class | (undefined) | +| `persistence.accessMode` | PVC access mode | `ReadWriteOnce` | +| `persistence.size` | Persistent storage size request | `1Gi` | +| `dags.doNotPickle` | should the scheduler disable DAG pickling | `false` | +| `dags.path` | mount path for persistent volume | `/usr/local/airflow/dags` | +| `dags.initContainer.enabled` | fetch the source code when the pods start | `false` | +| `dags.initContainer.installRequirements` | auto install requirements.txt deps | `true` | +| `dags.git.url` | url to clone the git repository | nil | +| `dags.git.ref` | branch name, tag or sha1 to reset to | `master` | +| `rbac.create` | create RBAC resources | `true` | +| `serviceAccount.create` | create a service account | `true` | +| `serviceAccount.name` | the service account name | `` | +| `postgresql.enabled` | create a postgres server | `true` | +| `postgresql.uri` | full URL to custom postgres setup | (undefined) | +| `postgresql.postgresHost` | PostgreSQL Hostname | (undefined) | +| `postgresql.postgresUser` | PostgreSQL User | `postgres` | +| `postgresql.postgresPassword` | PostgreSQL Password | `airflow` | +| `postgresql.postgresDatabase` | PostgreSQL Database name | `airflow` | +| `postgresql.persistence.enabled` | Enable Postgres PVC | `true` | +| 
`postgresql.persistence.storageClass` | Persistent storage class | (undefined) | +| `postgresql.persistence.accessMode` | Access mode | `ReadWriteOnce` | +| `redis.enabled` | Create a Redis cluster | `true` | +| `redis.password` | Redis password | `airflow` | +| `redis.master.persistence.enabled` | Enable Redis PVC | `false` | +| `redis.cluster.enabled` | enable master-slave cluster | `false` | + +Full and up-to-date documentation can be found in the comments of the `values.yaml` file. diff --git a/stable/airflow/examples/minikube-values.yaml b/stable/airflow/examples/minikube-values.yaml new file mode 100644 index 000000000000..d06fd85b0eb6 --- /dev/null +++ b/stable/airflow/examples/minikube-values.yaml @@ -0,0 +1,36 @@ +airflow: + image: + repository: puckel/docker-airflow + tag: 1.10.0-4 + pullPolicy: IfNotPresent + service: + type: NodePort + webReplicas: 1 + config: + AIRFLOW__CORE__LOGGING_LEVEL: DEBUG + AIRFLOW__CORE__LOAD_EXAMPLES: True + +workers: + replicas: 1 + celery: + instances: 1 + +ingress: + enabled: true + web: + path: "/airflow" + host: "minikube" + annotations: + traefik.frontend.rule.type: PathPrefix + kubernetes.io/ingress.class: traefik + flower: + path: "/airflow/flower" + host: "minikube" + annotations: + traefik.frontend.rule.type: PathPrefixStrip + kubernetes.io/ingress.class: traefik + +persistence: + enabled: true + accessMode: ReadWriteOnce + size: 1Gi diff --git a/stable/airflow/requirements.yaml b/stable/airflow/requirements.yaml new file mode 100644 index 000000000000..ae00016bd4b8 --- /dev/null +++ b/stable/airflow/requirements.yaml @@ -0,0 +1,9 @@ +dependencies: +- name: postgresql + version: 0.13.1 + repository: https://kubernetes-charts.storage.googleapis.com/ + condition: postgresql.enabled +- name: redis + version: 3.3.5 + repository: https://kubernetes-charts.storage.googleapis.com/ + condition: redis.enabled diff --git a/stable/airflow/templates/NOTES.txt b/stable/airflow/templates/NOTES.txt new file mode 100644 index 
000000000000..ce5e61b08d11 --- /dev/null +++ b/stable/airflow/templates/NOTES.txt @@ -0,0 +1,30 @@ +Congratulations. You have just deployed Apache Airflow + +{{- if .Values.ingress.enabled }} +URL to Airflow and Flower: + + - Web UI: http://{{ .Values.ingress.web.host }}{{ .Values.ingress.web.path }}/ + - Flower: http://{{ .Values.ingress.flower.host }}{{ .Values.ingress.flower.path }}/ + +{{- else if contains "NodePort" .Values.airflow.service.type }} + +1. Get the Airflow URL by running these commands: + + export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ template "airflow.fullname" . }}) + export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}") + echo http://$NODE_IP:$NODE_PORT/ + + {{- else if contains "LoadBalancer" .Values.airflow.service.type }} + + NOTE: It may take a few minutes for the LoadBalancer IP to be available. + You can watch the status of the service by running 'kubectl get svc -w {{ template "airflow.fullname" . }}' + export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ template "airflow.fullname" . }} -o jsonpath='{.status.loadBalancer.ingress[0].ip}') + echo http://$SERVICE_IP/ + + {{- else if contains "ClusterIP" .Values.airflow.service.type }} + export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "component={{ .Values.airflow.name }}" -o jsonpath="{.items[0].metadata.name}") + echo http://127.0.0.1:{{ .Values.airflow.externalPortHttp }} + kubectl port-forward --namespace {{ .Release.Namespace }} $POD_NAME {{ .Values.airflow.externalPortHttp }}:{{ .Values.airflow.internalPortHttp }} + +2. 
Open Airflow in your web browser +{{- end }} diff --git a/stable/airflow/templates/_helpers.tpl b/stable/airflow/templates/_helpers.tpl new file mode 100644 index 000000000000..2285736db27a --- /dev/null +++ b/stable/airflow/templates/_helpers.tpl @@ -0,0 +1,66 @@ +{{/* vim: set filetype=mustache: */}} +{{/* +Expand the name of the chart. +*/}} +{{- define "airflow.name" -}} +{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}} +{{- end -}} + +{{/* +Create a default fully qualified app name. +We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec). +*/}} +{{- define "airflow.fullname" -}} +{{- printf "%s" .Release.Name | trunc 63 | trimSuffix "-" -}} +{{- end -}} + +{{/* +Create chart name and version as used by the chart label. +*/}} +{{- define "airflow.chart" -}} +{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" -}} +{{- end -}} + +{{/* +Create the name of the service account to use +*/}} +{{- define "airflow.serviceAccountName" -}} +{{- if .Values.serviceAccount.create -}} + {{ default (include "airflow.fullname" .) .Values.serviceAccount.name }} +{{- else -}} + {{ default "default" .Values.serviceAccount.name }} +{{- end -}} +{{- end -}} + +{{/* +Create a default fully qualified postgresql name or use the `postgresHost` value if defined. +We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec). +*/}} +{{- define "airflow.postgresql.fullname" -}} +{{- if .Values.postgresql.postgresHost }} + {{- printf "%s" .Values.postgresql.postgresHost -}} +{{- else }} + {{- $name := default "postgresql" .Values.postgresql.nameOverride -}} + {{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}} +{{- end -}} +{{- end -}} + +{{/* +Create a default fully qualified redis cluster name. +We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec). 
+*/}} +{{- define "airflow.redis.fullname" -}} +{{- $name := default "redis" .Values.redis.nameOverride -}} +{{- printf "%s-%s-master" .Release.Name $name | trunc 63 | trimSuffix "-" -}} +{{- end -}} + +{{/* +Create a random string if the supplied key does not exist +*/}} +{{- define "airflow.defaultsecret" -}} +{{- if . -}} +{{- . | b64enc | quote -}} +{{- else -}} +{{- randAlphaNum 10 | b64enc | quote -}} +{{- end -}} +{{- end -}} diff --git a/stable/airflow/templates/configmap-airflow.yaml b/stable/airflow/templates/configmap-airflow.yaml new file mode 100644 index 000000000000..80c574ece467 --- /dev/null +++ b/stable/airflow/templates/configmap-airflow.yaml @@ -0,0 +1,40 @@ +apiVersion: v1 +kind: ConfigMap +metadata: + name: "{{ template "airflow.fullname" . }}-env" + labels: + app: {{ template "airflow.name" . }} + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +data: + ## Force UTC timezone + TZ: Etc/UTC + ## Postgres DB configuration + POSTGRES_HOST: "{{ template "airflow.postgresql.fullname" . }}" + POSTGRES_PORT: "{{ .Values.postgresql.postgresPort }}" + POSTGRES_DB: "{{ .Values.postgresql.postgresDatabase }}" + ## Redis DB configuration + REDIS_HOST: "{{ template "airflow.redis.fullname" . 
}}" + REDIS_PORT: "{{ .Values.redis.master.port }}" + ## Flower PORT + FLOWER_PORT: "5555" + # Configure puckel's docker-airflow entrypoint + EXECUTOR: "{{ .Values.airflow.executor }}" + FERNET_KEY: "{{ .Values.airflow.fernetKey }}" + DO_WAIT_INITDB: "false" + ## Custom Airflow settings + AIRFLOW__CORE__DONOT_PICKLE: "{{ .Values.dags.doNotPickle }}" + AIRFLOW__CELERY__FLOWER_URL_PREFIX: "{{ .Values.ingress.flower.path }}" + AIRFLOW__CELERY__WORKER_CONCURRENCY: "{{ .Values.workers.celery.instances }}" + AIRFLOW__CORE__DAGS_FOLDER: "{{ .Values.dags.path }}" + AIRFLOW__WEBSERVER__BASE_URL: "http://localhost:8080{{ .Values.ingress.web.path }}" + # Disabling XCom pickling for forward compatibility + AIRFLOW__CODE__ENABLE_XCOM_PICKLING: "false" + # For backwards compat with AF < 1.10, CELERY_CONCURRENCY got renamed to WORKER_CONCURRENCY + AIRFLOW__CELERY__CELERY_CONCURRENCY: "{{ .Values.workers.celery.instances }}" + # Note: changing `Values.airflow.config` won't change the configmap checksum and so won't make + # the pods to restart + {{- range $setting, $option := .Values.airflow.config }} + {{ $setting }}: "{{ $option }}" + {{- end }} diff --git a/stable/airflow/templates/configmap-git-clone.yaml b/stable/airflow/templates/configmap-git-clone.yaml new file mode 100644 index 000000000000..0d2abbcc3cca --- /dev/null +++ b/stable/airflow/templates/configmap-git-clone.yaml @@ -0,0 +1,23 @@ +apiVersion: v1 +kind: ConfigMap +metadata: + name: {{ template "airflow.fullname" . }}-git-clone + labels: + app: {{ template "airflow.name" . }} + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +data: + git-clone.sh: | + #!/bin/sh -e + REPO=$1 + REF=$2 + DIR=$3 + # Init Containers will re-run on Pod restart. Remove the directory's contents + # and reprovision when this happens. 
+ if [ -d "$DIR" ]; then + rm -rf $( find "$DIR" -mindepth 1 ) + fi + git clone "$REPO" "$DIR" + cd "$DIR" + git reset --hard "$REF" diff --git a/stable/airflow/templates/configmap-scripts.yaml b/stable/airflow/templates/configmap-scripts.yaml new file mode 100644 index 000000000000..2be3d7b35fd6 --- /dev/null +++ b/stable/airflow/templates/configmap-scripts.yaml @@ -0,0 +1,22 @@ +apiVersion: v1 +kind: ConfigMap +metadata: + name: {{ template "airflow.fullname" . }}-scripts + labels: + app: {{ template "airflow.name" . }} + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +data: + install-requirements.sh: | + #!/bin/sh -e + if [ ! -d {{ .Values.dags.path }} ]; then + echo "No folder {{ .Values.dags.path }}" + exit 0 + fi + cd {{ .Values.dags.path }} + if [ -f requirements.txt ]; then + pip install --user -r requirements.txt + else + exit 0 + fi diff --git a/stable/airflow/templates/deployments-flower.yaml b/stable/airflow/templates/deployments-flower.yaml new file mode 100644 index 000000000000..6277040b08d9 --- /dev/null +++ b/stable/airflow/templates/deployments-flower.yaml @@ -0,0 +1,77 @@ +{{- if .Values.workers.enabled -}} +apiVersion: apps/v1 +kind: Deployment +metadata: + name: {{ template "airflow.fullname" . }}-flower + labels: + app: {{ template "airflow.name" . }} + component: flower + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +spec: + replicas: 1 + minReadySeconds: 10 + strategy: + type: RollingUpdate + rollingUpdate: + maxSurge: 1 + maxUnavailable: 0 + selector: + matchLabels: + app: {{ template "airflow.name" . }} + component: flower + release: {{ .Release.Name }} + template: + metadata: + annotations: + checksum/config: {{ include (print $.Template.BasePath "/configmap-airflow.yaml") . | sha256sum }} + configmap.fabric8.io/update-on-change: "{{ template "airflow.fullname" . }}-env" + labels: + app: {{ template "airflow.name" . 
}} + component: flower + release: {{ .Release.Name }} + spec: + {{- if .Values.airflow.image.pullSecret }} + imagePullSecrets: + - name: {{ .Values.airflow.image.pullSecret }} + {{- end }} + restartPolicy: Always + containers: + - name: {{ .Chart.Name }}-flower + image: {{ .Values.airflow.image.repository }}:{{ .Values.airflow.image.tag }} + imagePullPolicy: {{ .Values.airflow.image.pullPolicy }} + envFrom: + - configMapRef: + name: "{{ template "airflow.fullname" . }}-env" + env: + - name: POSTGRES_USER + valueFrom: + secretKeyRef: + name: {{ template "airflow.fullname" . }} + key: postgresUser + - name: POSTGRES_PASSWORD + valueFrom: + secretKeyRef: + name: {{ template "airflow.fullname" . }} + key: postgresPassword + - name: REDIS_PASSWORD + valueFrom: + secretKeyRef: + name: {{ template "airflow.fullname" . }} + key: redisPassword + ports: + - name: flower + containerPort: 5555 + protocol: TCP + args: ["flower"] + livenessProbe: + httpGet: + path: "{{ .Values.ingress.flower.livenessPath }}/" + port: flower + initialDelaySeconds: 60 + periodSeconds: 60 + timeoutSeconds: 1 + successThreshold: 1 + failureThreshold: 5 +{{- end }} diff --git a/stable/airflow/templates/deployments-scheduler.yaml b/stable/airflow/templates/deployments-scheduler.yaml new file mode 100644 index 000000000000..50f2c8a60bcc --- /dev/null +++ b/stable/airflow/templates/deployments-scheduler.yaml @@ -0,0 +1,133 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: {{ template "airflow.fullname" . }}-scheduler + labels: + app: {{ template "airflow.name" . }} + component: scheduler + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +spec: + replicas: 1 + strategy: + # Kill the scheduler as soon as possible. It will restart quickly with all the workers, + # minimizing the time they are not synchronized. 
+ type: RollingUpdate + rollingUpdate: + maxSurge: 0 + maxUnavailable: 100% + selector: + matchLabels: + app: {{ template "airflow.name" . }} + component: scheduler + release: {{ .Release.Name }} + template: + metadata: + annotations: + checksum/config: {{ include (print $.Template.BasePath "/configmap-airflow.yaml") . | sha256sum }} + configmap.fabric8.io/update-on-change: "{{ template "airflow.fullname" . }}-env" + labels: + app: {{ template "airflow.name" . }} + component: scheduler + release: {{ .Release.Name }} + spec: + {{- if .Values.airflow.image.pullSecret }} + imagePullSecrets: + - name: {{ .Values.airflow.image.pullSecret }} + {{- end }} + restartPolicy: Always + serviceAccountName: {{ template "airflow.serviceAccountName" . }} + {{- if .Values.dags.initContainer.enabled }} + initContainers: + - name: git-clone + image: alpine/git # Any image with git will do + command: + - /usr/local/git/git-clone.sh + args: + - "{{ .Values.dags.git.url }}" + - "{{ .Values.dags.git.ref }}" + - "/dags" + volumeMounts: + - name: git-clone + mountPath: /usr/local/git + - name: dags-data + mountPath: /dags + {{- end }} + containers: + - name: {{ .Chart.Name }}-scheduler + image: {{ .Values.airflow.image.repository }}:{{ .Values.airflow.image.tag }} + imagePullPolicy: {{ .Values.airflow.image.pullPolicy}} + envFrom: + - configMapRef: + name: "{{ template "airflow.fullname" . }}-env" + env: + - name: POSTGRES_USER + valueFrom: + secretKeyRef: + name: {{ template "airflow.fullname" . }} + key: postgresUser + - name: POSTGRES_PASSWORD + valueFrom: + secretKeyRef: + name: {{ template "airflow.fullname" . }} + key: postgresPassword + - name: REDIS_PASSWORD + valueFrom: + secretKeyRef: + name: {{ template "airflow.fullname" . 
}} + key: redisPassword + volumeMounts: + {{- if .Values.persistence.enabled }} + - name: dags-data + mountPath: {{ .Values.dags.path }} + {{- else if .Values.dags.initContainer.enabled }} + - name: dags-data + mountPath: {{ .Values.dags.path }} + - name: scripts + mountPath: /usr/local/scripts + {{- end }} + args: + - "bash" + - "-c" + {{- if and ( .Values.dags.initContainer.enabled ) ( .Values.dags.initContainer.installRequirements ) }} + - > + echo 'waiting 10s...' && + sleep 10 && + echo 'installing requirements...' && + mkdir -p /usr/local/airflow/.local/bin && + export PATH=/usr/local/airflow/.local/bin:$PATH && + /usr/local/scripts/install-requirements.sh && + echo "executing initdb" && + airflow initdb && + echo "executing scheduler" && + airflow scheduler -n {{ .Values.airflow.schedulerNumRuns }} + {{- else }} + - > + echo 'waiting 10s...' && + sleep 10 && + mkdir -p /usr/local/airflow/.local/bin && + export PATH=/usr/local/airflow/.local/bin:$PATH && + echo "executing initdb" && + airflow initdb && + echo "executing scheduler" && + airflow scheduler -n {{ .Values.airflow.schedulerNumRuns }} + {{- end }} + volumes: + - name: dags-data + {{- if .Values.persistence.enabled }} + persistentVolumeClaim: + claimName: {{ .Values.persistence.existingClaim | default (include "airflow.fullname" .) }} + {{- else }} + emptyDir: {} + {{- end }} + {{- if .Values.dags.initContainer.enabled }} + - name: scripts + configMap: + name: {{ template "airflow.fullname" . }}-scripts + defaultMode: 0755 + - name: git-clone + configMap: + name: {{ template "airflow.fullname" . }}-git-clone + defaultMode: 0755 + {{- end }} diff --git a/stable/airflow/templates/deployments-web.yaml b/stable/airflow/templates/deployments-web.yaml new file mode 100644 index 000000000000..03763c1c3c3e --- /dev/null +++ b/stable/airflow/templates/deployments-web.yaml @@ -0,0 +1,142 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: {{ template "airflow.fullname" . 
}}-web + labels: + app: {{ template "airflow.name" . }} + component: web + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +spec: + replicas: {{ .Values.airflow.webReplicas }} + minReadySeconds: 120 + strategy: + # Smooth rolling update of the Web UI + type: RollingUpdate + rollingUpdate: + maxSurge: 1 + maxUnavailable: 0 + selector: + matchLabels: + app: {{ template "airflow.name" . }} + component: web + release: {{ .Release.Name }} + template: + metadata: + annotations: + checksum/config: {{ include (print $.Template.BasePath "/configmap-airflow.yaml") . | sha256sum }} + configmap.fabric8.io/update-on-change: "{{ template "airflow.fullname" . }}-env" + labels: + app: {{ template "airflow.name" . }} + component: web + release: {{ .Release.Name }} + spec: + {{- if .Values.airflow.image.pullSecret }} + imagePullSecrets: + - name: {{ .Values.airflow.image.pullSecret }} + {{- end }} + restartPolicy: Always + {{- if .Values.dags.initContainer.enabled }} + initContainers: + - name: git-clone + image: alpine/git # Any image with git will do + command: + - /usr/local/git/git-clone.sh + args: + - "{{ .Values.dags.git.url }}" + - "{{ .Values.dags.git.ref }}" + - "/dags" + volumeMounts: + - name: git-clone + mountPath: /usr/local/git + - name: dags-data + mountPath: /dags + {{- end }} + containers: + - name: {{ .Chart.Name }}-web + image: {{ .Values.airflow.image.repository }}:{{ .Values.airflow.image.tag }} + imagePullPolicy: {{ .Values.airflow.image.pullPolicy}} + ports: + - name: web + containerPort: 8080 + protocol: TCP + envFrom: + - configMapRef: + name: "{{ template "airflow.fullname" . }}-env" + env: + - name: POSTGRES_USER + valueFrom: + secretKeyRef: + name: {{ template "airflow.fullname" . }} + key: postgresUser + - name: POSTGRES_PASSWORD + valueFrom: + secretKeyRef: + name: {{ template "airflow.fullname" . 
}} + key: postgresPassword + - name: REDIS_PASSWORD + valueFrom: + secretKeyRef: + name: {{ template "airflow.fullname" . }} + key: redisPassword + volumeMounts: + {{- if .Values.persistence.enabled }} + - name: dags-data + mountPath: {{ .Values.dags.path }} + {{- else if .Values.dags.initContainer.enabled }} + - name: dags-data + mountPath: {{ .Values.dags.path }} + - name: scripts + mountPath: /usr/local/scripts + {{- end }} + args: + - "bash" + - "-c" + {{- if and ( .Values.dags.initContainer.enabled ) ( .Values.dags.initContainer.installRequirements ) }} + - > + echo 'waiting 60s...' && + sleep 60 && + echo 'installing requirements...' && + mkdir -p /usr/local/airflow/.local/bin && + export PATH=/usr/local/airflow/.local/bin:$PATH && + /usr/local/scripts/install-requirements.sh && + echo 'executing webserver...' && + airflow webserver + {{- else }} + - > + echo 'waiting 60s...' && + sleep 60 && + mkdir -p /usr/local/airflow/.local/bin && + export PATH=/usr/local/airflow/.local/bin:$PATH && + echo 'executing webserver...' && + airflow webserver + {{- end }} + livenessProbe: + httpGet: + path: "{{ .Values.ingress.web.path }}/health" + port: web + ## Keep 6 minutes the delay to allow clean wait of postgres and redis containers + initialDelaySeconds: 360 + periodSeconds: 60 + timeoutSeconds: 1 + successThreshold: 1 + failureThreshold: 5 + volumes: + - name: dags-data + {{- if .Values.persistence.enabled }} + persistentVolumeClaim: + claimName: {{ .Values.persistence.existingClaim | default (include "airflow.fullname" .) }} + {{- else }} + emptyDir: {} + {{- end }} + {{- if .Values.dags.initContainer.enabled }} + - name: scripts + configMap: + name: {{ template "airflow.fullname" . }}-scripts + defaultMode: 0755 + - name: git-clone + configMap: + name: {{ template "airflow.fullname" . 
}}-git-clone + defaultMode: 0755 + {{- end }} diff --git a/stable/airflow/templates/ingress-flower.yaml b/stable/airflow/templates/ingress-flower.yaml new file mode 100644 index 000000000000..c5f0e27f7071 --- /dev/null +++ b/stable/airflow/templates/ingress-flower.yaml @@ -0,0 +1,31 @@ +{{- if and (.Values.workers.enabled) (.Values.ingress.enabled) -}} +apiVersion: extensions/v1beta1 +kind: Ingress +metadata: + name: {{ template "airflow.fullname" . }}-flower + labels: + app: {{ template "airflow.name" . }} + component: flower + chart: "{{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}" + release: "{{ .Release.Name }}" + heritage: "{{ .Release.Service }}" + annotations: + {{ range $key, $value := .Values.ingress.flower.annotations }} + {{ $key }}: {{ $value | quote }} + {{- end }} +spec: +{{- if .Values.ingress.flower.tls.enabled }} + tls: + - hosts: + - {{ .Values.ingress.flower.host }} + secretName: {{ .Values.ingress.flower.tls.secretName }} +{{- end }} + rules: + - http: + paths: + - path: {{ .Values.ingress.flower.path }} + backend: + serviceName: {{ template "airflow.fullname" . }}-flower + servicePort: flower + host: {{ .Values.ingress.flower.host }} +{{- end }} diff --git a/stable/airflow/templates/ingress-web.yaml b/stable/airflow/templates/ingress-web.yaml new file mode 100644 index 000000000000..ba61d8730b6e --- /dev/null +++ b/stable/airflow/templates/ingress-web.yaml @@ -0,0 +1,31 @@ +{{- if .Values.ingress.enabled -}} +apiVersion: extensions/v1beta1 +kind: Ingress +metadata: + name: {{ template "airflow.fullname" . }}-web + labels: + app: {{ template "airflow.name" . 
}} + component: web + chart: "{{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}" + release: "{{ .Release.Name }}" + heritage: "{{ .Release.Service }}" + annotations: + {{ range $key, $value := .Values.ingress.web.annotations }} + {{ $key }}: {{ $value | quote }} + {{- end }} +spec: +{{- if .Values.ingress.web.tls.enabled }} + tls: + - hosts: + - {{ .Values.ingress.web.host }} + secretName: {{ .Values.ingress.web.tls.secretName }} +{{- end }} + rules: + - http: + paths: + - path: {{ .Values.ingress.web.path }} + backend: + serviceName: {{ template "airflow.fullname" . }}-web + servicePort: web + host: {{ .Values.ingress.web.host }} +{{- end }} diff --git a/stable/airflow/templates/poddisruptionbudget.yaml b/stable/airflow/templates/poddisruptionbudget.yaml new file mode 100644 index 000000000000..116d96b6281d --- /dev/null +++ b/stable/airflow/templates/poddisruptionbudget.yaml @@ -0,0 +1,17 @@ +apiVersion: policy/v1beta1 +kind: PodDisruptionBudget +metadata: + name: {{ template "airflow.fullname" . }}-pdb + labels: + app: {{ template "airflow.name" . }} + component: scheduler + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +spec: + selector: + matchLabels: + app: {{ template "airflow.name" . }} + component: scheduler + release: {{ .Release.Name }} +{{ toYaml .Values.airflow.podDisruptionBudget | indent 2 }} diff --git a/stable/airflow/templates/pvc.yaml b/stable/airflow/templates/pvc.yaml new file mode 100644 index 000000000000..f13a2a408c8d --- /dev/null +++ b/stable/airflow/templates/pvc.yaml @@ -0,0 +1,24 @@ +{{- if and .Values.persistence.enabled (not .Values.persistence.existingClaim) }} +kind: PersistentVolumeClaim +apiVersion: v1 +metadata: + name: {{ template "airflow.fullname" . }} + labels: + app: {{ template "airflow.fullname" . 
}} + chart: "{{ .Chart.Name }}-{{ .Chart.Version }}" + release: "{{ .Release.Name }}" + heritage: "{{ .Release.Service }}" +spec: + accessModes: + - {{ .Values.persistence.accessMode | quote }} + resources: + requests: + storage: {{ .Values.persistence.size | quote }} +{{- if .Values.persistence.storageClass }} +{{- if (eq "-" .Values.persistence.storageClass) }} + storageClassName: "" +{{- else }} + storageClassName: "{{ .Values.persistence.storageClass }}" +{{- end }} +{{- end }} +{{- end }} diff --git a/stable/airflow/templates/role-binding.yaml b/stable/airflow/templates/role-binding.yaml new file mode 100644 index 000000000000..654f429c32d1 --- /dev/null +++ b/stable/airflow/templates/role-binding.yaml @@ -0,0 +1,19 @@ +{{ if .Values.rbac.create }} +apiVersion: rbac.authorization.k8s.io/v1beta1 +kind: RoleBinding +metadata: + name: {{ template "airflow.fullname" . }} + labels: + app: {{ template "airflow.name" . }} + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: Role + name: {{ template "airflow.fullname" . }} +subjects: +- kind: ServiceAccount + name: {{ template "airflow.serviceAccountName" . }} + namespace: {{ .Release.Namespace }} +{{ end }} \ No newline at end of file diff --git a/stable/airflow/templates/role.yaml b/stable/airflow/templates/role.yaml new file mode 100644 index 000000000000..d049c1f922b2 --- /dev/null +++ b/stable/airflow/templates/role.yaml @@ -0,0 +1,20 @@ +{{ if .Values.rbac.create }} +apiVersion: rbac.authorization.k8s.io/v1beta1 +kind: Role +metadata: + name: {{ template "airflow.fullname" . }} + labels: + app: {{ template "airflow.name" . }} + chart: {{ template "airflow.chart" . 
}} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +rules: +- apiGroups: [""] + resources: + - pods + verbs: ["create", "get", "delete", "list", "watch"] +- apiGroups: [""] + resources: + - "pods/log" + verbs: ["get", "list"] +{{ end }} \ No newline at end of file diff --git a/stable/airflow/templates/secrets.yaml b/stable/airflow/templates/secrets.yaml new file mode 100644 index 000000000000..17e23cdded84 --- /dev/null +++ b/stable/airflow/templates/secrets.yaml @@ -0,0 +1,14 @@ +apiVersion: v1 +kind: Secret +metadata: + name: {{ template "airflow.fullname" . }} + labels: + app: {{ template "airflow.fullname" . }} + chart: "{{ .Chart.Name }}-{{ .Chart.Version }}" + release: "{{ .Release.Name }}" + heritage: "{{ .Release.Service }}" +type: Opaque +data: + postgresUser: {{ .Values.postgresql.postgresUser | b64enc | quote }} + postgresPassword: {{ .Values.postgresql.postgresPassword | b64enc | quote }} + redisPassword: {{ .Values.redis.password | b64enc | quote }} diff --git a/stable/airflow/templates/service-account.yaml b/stable/airflow/templates/service-account.yaml new file mode 100644 index 000000000000..0f6022e6d07d --- /dev/null +++ b/stable/airflow/templates/service-account.yaml @@ -0,0 +1,11 @@ +{{ if .Values.serviceAccount.create }} +apiVersion: v1 +kind: ServiceAccount +metadata: + name: {{ template "airflow.serviceAccountName" . }} + labels: + app: {{ template "airflow.name" . }} + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +{{ end }} \ No newline at end of file diff --git a/stable/airflow/templates/service-flower.yaml b/stable/airflow/templates/service-flower.yaml new file mode 100644 index 000000000000..8da285f093a6 --- /dev/null +++ b/stable/airflow/templates/service-flower.yaml @@ -0,0 +1,22 @@ +{{- if .Values.workers.enabled -}} +apiVersion: v1 +kind: Service +metadata: + name: {{ template "airflow.fullname" . }}-flower + labels: + app: {{ template "airflow.name" . 
}} + component: flower + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +spec: + type: {{ .Values.airflow.service.type }} + selector: + app: {{ template "airflow.name" . }} + component: flower + release: {{ .Release.Name }} + ports: + - name: flower + protocol: TCP + port: 5555 +{{- end }} diff --git a/stable/airflow/templates/service-web.yaml b/stable/airflow/templates/service-web.yaml new file mode 100644 index 000000000000..3aba2ab1c392 --- /dev/null +++ b/stable/airflow/templates/service-web.yaml @@ -0,0 +1,20 @@ +apiVersion: v1 +kind: Service +metadata: + name: {{ template "airflow.fullname" . }}-web + labels: + app: {{ template "airflow.name" . }} + component: web + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +spec: + type: {{ .Values.airflow.service.type }} + selector: + app: {{ template "airflow.name" . }} + component: web + release: {{ .Release.Name }} + ports: + - name: web + protocol: TCP + port: 8080 diff --git a/stable/airflow/templates/service-worker.yaml b/stable/airflow/templates/service-worker.yaml new file mode 100644 index 000000000000..02c7a728016e --- /dev/null +++ b/stable/airflow/templates/service-worker.yaml @@ -0,0 +1,22 @@ +{{- if .Values.workers.enabled -}} +# Headless service for stable DNS entries of StatefulSet members. +apiVersion: v1 +kind: Service +metadata: + name: {{ template "airflow.fullname" . }}-worker + labels: + app: {{ template "airflow.name" . }} + component: worker + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +spec: + ports: + - name: worker + protocol: TCP + port: 8793 + clusterIP: None + selector: + app: {{ template "airflow.name" . 
}} + component: worker +{{- end }} diff --git a/stable/airflow/templates/statefulsets-workers.yaml b/stable/airflow/templates/statefulsets-workers.yaml new file mode 100644 index 000000000000..49dcf2993cc4 --- /dev/null +++ b/stable/airflow/templates/statefulsets-workers.yaml @@ -0,0 +1,156 @@ +{{- if .Values.workers.enabled -}} +## Workers are not in a Deployment but in a StatefulSet, to allow each worker to expose a mini-server +## that only serves logs, which is used by the web server. + +apiVersion: apps/v1 +kind: StatefulSet +metadata: + name: {{ template "airflow.fullname" . }}-worker + labels: + app: {{ template "airflow.name" . }} + component: worker + chart: {{ template "airflow.chart" . }} + release: {{ .Release.Name }} + heritage: {{ .Release.Service }} +spec: + serviceName: "{{ template "airflow.fullname" . }}-worker" + updateStrategy: + ## Kill the workers as soon as possible, the scheduler will restart the failed job later + type: RollingUpdate + ## Use experimental burst mode for faster StatefulSet scaling + ## https://github.com/kubernetes/kubernetes/commit/c2c5051adf096ffd48bf1dcf5b11cb47e464ecdd + podManagementPolicy: Parallel + replicas: {{ .Values.workers.replicas }} + selector: + matchLabels: + app: {{ template "airflow.name" . }} + component: worker + release: {{ .Release.Name }} + template: + metadata: + annotations: + checksum/config: {{ include (print $.Template.BasePath "/configmap-airflow.yaml") . | sha256sum }} + configmap.fabric8.io/update-on-change: "{{ template "airflow.fullname" . }}-env" + {{ range $key, $value := .Values.workers.pod.annotations }} + {{ $key }}: {{ $value | quote }} + {{- end }} + labels: + app: {{ template "airflow.name" . 
}} + component: worker + release: {{ .Release.Name }} + spec: + {{- if .Values.airflow.image.pullSecret }} + imagePullSecrets: + - name: {{ .Values.airflow.image.pullSecret }} + {{- end }} + restartPolicy: Always + terminationGracePeriodSeconds: 30 + serviceAccountName: {{ template "airflow.serviceAccountName" . }} + {{- if .Values.dags.initContainer.enabled }} + initContainers: + - name: git-clone + image: alpine/git # Any image with git will do + command: + - /usr/local/git/git-clone.sh + args: + - "{{ .Values.dags.git.url }}" + - "{{ .Values.dags.git.ref }}" + - "/dags" + volumeMounts: + - name: git-clone + mountPath: /usr/local/git + - name: dags-data + mountPath: /dags + {{- end }} + containers: + - name: {{ .Chart.Name }}-worker + imagePullPolicy: {{ .Values.airflow.image.pullPolicy }} + image: "{{ .Values.airflow.image.repository }}:{{ .Values.airflow.image.tag }}" + envFrom: + - configMapRef: + name: "{{ template "airflow.fullname" . }}-env" + env: + - name: POSTGRES_USER + valueFrom: + secretKeyRef: + name: {{ template "airflow.fullname" . }} + key: postgresUser + - name: POSTGRES_PASSWORD + valueFrom: + secretKeyRef: + name: {{ template "airflow.fullname" . }} + key: postgresPassword + - name: REDIS_PASSWORD + valueFrom: + secretKeyRef: + name: {{ template "airflow.fullname" . }} + key: redisPassword + volumeMounts: + {{- $secretsDir := .Values.workers.secretsDir -}} + {{- range .Values.workers.secrets }} + - name: {{ . }}-volume + readOnly: true + mountPath: {{ $secretsDir }}/{{ . }} + subPath: {{ . }} + {{- end }} + {{- if .Values.persistence.enabled }} + - name: dags-data + mountPath: {{ .Values.dags.path }} + {{- else if .Values.dags.initContainer.enabled }} + - name: dags-data + mountPath: {{ .Values.dags.path }} + - name: scripts + mountPath: /usr/local/scripts + {{- end }} + args: + - "bash" + - "-c" + {{- if and ( .Values.dags.initContainer.enabled ) ( .Values.dags.initContainer.installRequirements ) }} + - > + echo 'waiting 60s...' 
&& + sleep 60 && + echo 'installing requirements...' && + mkdir -p /usr/local/airflow/.local/bin && + export PATH=/usr/local/airflow/.local/bin:$PATH && + /usr/local/scripts/install-requirements.sh && + echo 'executing worker...' && + airflow worker + {{- else }} + - > + echo 'waiting 60s...' && + sleep 60 && + mkdir -p /usr/local/airflow/.local/bin && + export PATH=/usr/local/airflow/.local/bin:$PATH && + echo 'executing worker...' && + airflow worker + {{- end }} + ports: + - name: wlog + containerPort: 8793 + protocol: TCP + resources: +{{ toYaml .Values.workers.resources | indent 12 }} + volumes: + {{- range .Values.workers.secrets }} + - name: {{ . }}-volume + secret: + secretName: {{ . }} + {{- end }} + - name: dags-data + {{- if .Values.persistence.enabled }} + persistentVolumeClaim: + claimName: {{ .Values.persistence.existingClaim | default (include "airflow.fullname" .) }} + {{- else }} + emptyDir: {} + {{- end }} + {{- if .Values.dags.initContainer.enabled }} + - name: scripts + configMap: + name: {{ template "airflow.fullname" . }}-scripts + defaultMode: 0755 + - name: git-clone + configMap: + name: {{ template "airflow.fullname" . }}-git-clone + defaultMode: 0755 + {{- end }} +{{- end }} diff --git a/stable/airflow/values.yaml b/stable/airflow/values.yaml new file mode 100644 index 000000000000..b9d12da45fd1 --- /dev/null +++ b/stable/airflow/values.yaml @@ -0,0 +1,365 @@ +# Duplicate this file and put your customization here + + +## +## common settings and setting for the webserver +airflow: + ## + ## You will need to define your fernet key: + ## Generate fernetKey with: + ## python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)" + ## fernetKey: ABCDABCDABCDABCDABCDABCDABCDABCDABCDABCD + fernetKey: "" + service: + type: ClusterIP + ## + ## The executor to use. 
+ ## + executor: Celery + ## + ## set the max number of retries during container initialization + initRetryLoop: + ## + ## base image for webserver/scheduler/workers + ## Note: If you want to use airflow HEAD (2.0dev), use the following image: + # image: + # repository: stibbons31/docker-airflow-dev + # tag: 2.0dev + ## Airflow 2.0 allows changing the values ingress.web.path and ingress.flower.path (see below). + ## In versions < 2.0, changing these paths won't have any effect. + image: + ## + ## docker-airflow image + repository: puckel/docker-airflow + ## + ## image tag + tag: 1.10.0-4 + ## + ## Image pull policy + ## values: Always or IfNotPresent + pullPolicy: IfNotPresent + ## + ## image pull secret for private images + pullSecret: + ## + ## Set schedulerNumRuns to control how the scheduler behaves: + ## -1 will let it loop indefinitely, but it will never update the DAG + ## 1 will have the scheduler quit after each refresh, but kubernetes will restart it. + ## + ## A long-running scheduler process, at least with the CeleryExecutor, ends up not scheduling + ## some tasks. We still don’t know the exact cause, unfortunately. Airflow has a built-in + ## workaround in the form of the `num_runs` flag. + ## Airflow runs with num_runs set to 5. + ## + ## If set to a value != -1, you will see your scheduler regularly restart. This is its normal + ## behavior under these conditions. + schedulerNumRuns: "-1" + ## + ## Set schedulerDoPickle to toggle whether to have the scheduler + ## attempt to pickle the DAG object to send over to the workers, + ## instead of letting workers run their version of the code. + ## See the Airflow documentation for the --do_pickle argument: https://airflow.apache.org/cli.html#scheduler + schedulerDoPickle: true + ## + ## Number of replicas for web server. 
+ ## For the moment, we recommend leaving this value at 1, since the webserver instance performs + ## the 'initdb' operation; starting more replicas will cause all the web containers to execute + ## it, which may cause unwanted issues on the database. + webReplicas: 1 + ## + ## Custom airflow configuration environment variables + ## Use this to override any airflow setting by defining environment variables in the + ## following form: AIRFLOW__<section>__<key>. + ## See the Airflow documentation: http://airflow.readthedocs.io/en/latest/configuration.html?highlight=__CORE__#setting-configuration-options + ## Example: + ## config: + ## AIRFLOW__CORE__EXPOSE_CONFIG: "True" + ## HTTP_PROXY: "http://proxy.mycompany.com:123" + config: {} + ## + ## Configure pod disruption budget for the scheduler + podDisruptionBudget: + maxUnavailable: 1 + +## +## Workers configuration +workers: + enabled: true + ## + ## Number of worker pods to launch + replicas: 1 + ## + ## Custom resource configuration + resources: {} + # limits: + # cpu: "1" + # memory: "2G" + # requests: + # cpu: "0.5" + # memory: "512Mi" + ## + ## Annotations for the Worker pods + pod: + annotations: + ## Example: + ## iam.amazonaws.com/role: airflow-worker-Role + ## + ## Celery worker configuration + celery: + ## + ## number of parallel celery tasks per worker + instances: 1 + ## + ## Directory in which to mount secrets on worker nodes. + secretsDir: /var/airflow/secrets + ## + ## Secrets which will be mounted as a file at `secretsDir/<secret name>`. + secrets: [] + + +## +## Ingress configuration +ingress: + ## + ## enable ingress + ## Note: If you want to change the url prefix for the web ui or flower even if you do not use ingress, + ## you can still change ingress.web.path and ingress.flower.path + enabled: false + ## + ## Configure the webserver endpoint + web: + ## NOTE: This requires an airflow version > 1.9.x + ## For the moment (March 2018) this is **not** available on the official package, you will have + ## to use an image where airflow has been updated to its current HEAD. + ## You can use the following one: + ## stibbons31/docker-airflow-dev:2.0dev + ## + ## if path is '/airflow': + ## - UI will be accessible at 'http://mycompany.com/airflow/admin' + ## - Healthcheck is at 'http://mycompany.com/airflow/health' + ## - api is at 'http://mycompany.com/airflow/api' + ## NOTE: do NOT keep trailing slash. 
For root configuration, set an empty string. + path: "" + ## + ## hostname for the webserver + host: "" + ## + ## Annotations for the webserver + ## Airflow webserver handles relative paths completely, just let your load balancer pass the HTTP + ## headers and the requested URL through (no special configuration needed) + annotations: + ## + ## Example for Traefik: + # traefik.frontend.rule.type: PathPrefix + # kubernetes.io/ingress.class: traefik + tls: + ## Set to "true" to enable TLS termination at the ingress + enabled: false + ## If enabled, set "secretName" to the secret containing the TLS private key and certificate + ## Example: + ## secretName: example-com-crt + ## + ## Configure the flower endpoint + flower: + ## + ## If flower is '/airflow/flower': + ## - Flower UI is at 'http://mycompany.com/airflow/flower' + ## NOTE: you need to have a reverse proxy/load balancer able to do URL rewriting in order to have + ## flower mounted on a path other than root. Flower only does half the job in url prefixing: it + ## only generates the right URL/relative paths in the **returned HTML files**, but expects the + ## request to have been made at the root. + ## That's why we need a reverse proxy/load balancer that is able to strip the path + ## NOTE: do NOT keep trailing slash. For root configuration, set an empty string. + path: "" + ## + ## Configure the liveness path. Keep to "/" for Flower >= Jan 2018. + ## For previous versions, enter the same path as in the 'path' key + ## NOTE: keep the trailing slash. 
livenessPath: / + ## + ## hostname for flower + host: "" + ## + ## Annotations for the Flower endpoint + ## + ## ==== SKIP THE FOLLOWING BLOCK IF YOU HAVE FLOWER > JANUARY 2018 ============================= + ## Please note there is a small difference between the way the Airflow Web server and Flower handle + ## URL prefixes in HTTP requests: + ## Flower wants the HTTP request to look as if there was no URL prefix, but still generates + ## the right URLs in html pages thanks to its `--url-prefix` parameter + ## + ## Extracted from the Flower documentation: + ## (https://github.com/mher/flower/blob/master/docs/config.rst#url_prefix) + ## + ## To access Flower on http://example.com/flower run it with: + ## flower --url-prefix=/flower + ## + ## Use the following nginx configuration: + ## server { + ## listen 80; + ## server_name example.com; + ## + ## location /flower/ { + ## rewrite ^/flower/(.*)$ /$1 break; + ## proxy_pass http://example.com:5555; + ## proxy_set_header Host $host; + ## } + ## } + ## ==== IF YOU HAVE FLOWER > JANUARY 2018, NO MORE NEED TO STRIP THE PREFIX ==================== + annotations: + ## + ## NOTE: it is important here to have your reverse proxy strip the path/rewrite the URL + ## Example for Traefik: + # traefik.frontend.rule.type: PathPrefix ## Flower >= Jan 2018 + # traefik.frontend.rule.type: PathPrefixStrip ## Flower < Jan 2018 + # kubernetes.io/ingress.class: traefik + tls: + ## Set to "true" to enable TLS termination at the ingress + enabled: false + ## If enabled, set "secretName" to the secret containing the TLS private key and certificate + ## Example: + ## secretName: example-com-crt + + +## +## Storage configuration for DAGs +persistence: + ## + ## enable persistent storage + enabled: false + ## + ## Existing claim to use + # existingClaim: nil + ## + ## Persistent Volume Storage Class + ## If defined, storageClassName: <storageClass> + ## If set to "-", storageClassName: "", which disables dynamic provisioning + ## If undefined (the default) or 
set to null, no storageClassName spec is + ## set, choosing the default provisioner. (gp2 on AWS, standard on + ## GKE, AWS & OpenStack) + # storageClass: default + accessMode: ReadWriteOnce + ## + ## Persistent storage size request + size: 1Gi + + +## +## Configure DAGs deployment and update +dags: + ## + ## mount path for persistent volume. + ## Note that this location is referred to in airflow.cfg, so if you change it, you must update airflow.cfg accordingly. + path: /usr/local/airflow/dags + ## + ## Set to True to prevent pickling DAGs from scheduler to workers + doNotPickle: false + ## + ## Configure Git repository to fetch DAGs + git: + ## + ## url to clone the git repository + url: + ## + ## branch name, tag or sha1 to reset to + ref: master + initContainer: + ## Fetch the source code when the pod starts + enabled: false + ## install requirements.txt dependencies automatically + installRequirements: true + +## +## Enable RBAC +rbac: + ## + ## Specifies whether RBAC resources should be created + create: true + +## +## Create or use ServiceAccount +serviceAccount: + ## + ## Specifies whether a ServiceAccount should be created + create: true + ## The name of the ServiceAccount to use. + ## If not set and create is true, a name is generated using the fullname template + name: + +## +## Configuration values for the postgresql dependency. +## ref: https://github.com/kubernetes/charts/blob/master/stable/postgresql/README.md +postgresql: + ## + ## Use the PostgreSQL chart dependency. + ## Set to false if bringing your own PostgreSQL. + enabled: true + ## + ## If bringing your own PostgreSQL, the full uri to use + ## e.g. postgres://airflow:changeme@my-postgres.com:5432/airflow?sslmode=disable + # uri: + ## + ## PostgreSQL hostname + ## postgresHost: + ## + ## PostgreSQL port + postgresPort: 5432 + ## PostgreSQL User to create. + postgresUser: postgres + ## + ## PostgreSQL Password for the new user. + ## If not set, a random 10-character password will be used. 
+ postgresPassword: airflow + ## + ## PostgreSQL Database to create. + postgresDatabase: airflow + ## + ## Persistent Volume Storage configuration. + ## ref: https://kubernetes.io/docs/user-guide/persistent-volumes + persistence: + ## + ## Enable PostgreSQL persistence using Persistent Volume Claims. + enabled: true + ## + ## Persistent class + # storageClass: classname + ## + ## Access mode: + accessMode: ReadWriteOnce + + +## Configuration values for the Redis dependency. +## ref: https://github.com/kubernetes/charts/blob/master/stable/redis/README.md +redis: + ## + ## Use the redis chart dependency. + ## Set to false if bringing your own redis. + enabled: true + ## + ## Redis password + password: airflow + ## + ## Master configuration + master: + ## + ## Image configuration + # image: + ## + ## docker registry secret names (list) + # pullSecrets: nil + ## + ## Configure persistence + persistence: + ## + ## Use a PVC to persist data. + enabled: false + ## + ## Persistent class + # storageClass: classname + ## + ## Access mode: + accessMode: ReadWriteOnce + ## + ## Disable cluster management by default. + cluster: + enabled: false
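
The `airflow.config` map in values.yaml relies on Airflow's `AIRFLOW__<section>__<key>` environment-variable convention for overriding `airflow.cfg` options. As a minimal illustration (a hypothetical helper, not part of the chart or of Airflow itself), the mapping from a config section/key pair to the environment variable name looks like this:

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the AIRFLOW__<SECTION>__<KEY> environment variable name
    that overrides the corresponding airflow.cfg option.
    (Hypothetical helper for illustration only.)"""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"

# Matches the example in values.yaml: AIRFLOW__CORE__EXPOSE_CONFIG: "True"
print(airflow_env_var("core", "expose_config"))  # AIRFLOW__CORE__EXPOSE_CONFIG
```

Any entry produced this way can be placed under `airflow.config` in values.yaml; the chart injects the map into the pods via the `-env` ConfigMap shown in the deployment templates.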