Enables Kerberos sidecar support (#11130)
Some Airflow users authenticate their worker workflows with
Kerberos. Airflow has basic Kerberos support for some of the
operators, and it can refresh the temporary Kerberos tokens
via the `airflow kerberos` command.

This change adds support for a Kerberos sidecar that connects
to the Kerberos Key Distribution Center and retrieves the
token using a keytab that should be deployed as a Kubernetes Secret.

It uses a shared volume to share the temporary token. The nice
thing about setting it up as a sidecar is that the keytab
is never shared with the workers: the secret is mounted only
by the sidecar, and the workers have access only to the temporary
token.
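
As a rough illustration of how the pieces described above fit together, a values override along these lines could enable the sidecar. This is a sketch under stated assumptions, not a tested configuration: the file name, the `EXAMPLE.COM` realm, and the `kdc.example.com` host are placeholders, while the keys are the ones added to `chart/values.yaml` in this commit. The chart expects the keytab in a Secret named `<release-name>-kerberos-keytab`, under the key `kerberos.keytab`.

```yaml
# values-kerberos.yaml (hypothetical file name)
kerberos:
  # Gates the krb5.conf entry in the ConfigMap and the shared ccache volume.
  enabled: true
  # Placeholder principal; use the one your Kerberos administrators provide.
  principal: 'airflow@EXAMPLE.COM'
  # Placeholder krb5.conf content; realm and hosts are assumptions.
  config: |
    [libdefaults]
    default_realm = EXAMPLE.COM
    [realms]
    EXAMPLE.COM = {
      kdc = kdc.example.com
      admin_server = kdc.example.com
    }

workers:
  kerberosSidecar:
    # Gates the sidecar container and the Kerberos mounts on the worker.
    enabled: true
```

As the templates in this commit read, both flags seem to be needed together: `kerberos.enabled` controls the `krb5.conf` ConfigMap entry and the `kerberos-ccache` volume, while `workers.kerberosSidecar.enabled` controls the sidecar container and the worker's mounts and environment variables.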

Depends on #11129

(cherry picked from commit 4d2a787)
potiuk authored and kaxil committed Nov 18, 2020
1 parent a8b5048 commit 5f40cbc
Showing 6 changed files with 156 additions and 8 deletions.
2 changes: 1 addition & 1 deletion breeze
@@ -3073,7 +3073,7 @@ function breeze::run_breeze_command() {
# 3. last used version stored in ./build/PYTHON_MAJOR_MINOR_VERSION
# 4. DEFAULT_PYTHON_MAJOR_MINOR_VERSION from scripts/ci/libraries/_initialization.sh
#
-# Here points 2. and 3. are realised. If result is empty string , the 4. will be set in
+# Here points 2. and 3. are realized. If result is empty string , the 4. will be set in
# the next step (sanity_checks::basic_sanity_checks() is called and the version is still not set by then)
# finally, if --python flag is specified, it will override whatever is set above.
#
22 changes: 15 additions & 7 deletions chart/README.md
@@ -74,8 +74,7 @@ helm upgrade airflow . \
--set images.airflow.tag=8a0da78
```

-For local development purppose you can also u
-You can also build the image locally and use it via deployment method described by Breeze.
+For local development purpose you can also build the image locally and use it via deployment method described by Breeze.

## Mounting DAGS using Git-Sync side car with Persistence enabled

@@ -129,7 +128,7 @@ The following tables lists the configurable parameters of the Airflow chart and
| `privateRegistry.repository` | Repository where base image lives (eg: quay.io) | `~` |
| `networkPolicies.enabled` | Enable Network Policies to restrict traffic | `true` |
| `airflowHome` | Location of airflow home directory | `/opt/airflow` |
-| `rbacEnabled` | Deploy pods with Kubernets RBAC enabled | `true` |
+| `rbacEnabled` | Deploy pods with Kubernetes RBAC enabled | `true` |
| `executor` | Airflow executor (eg SequentialExecutor, LocalExecutor, CeleryExecutor, KubernetesExecutor) | `KubernetesExecutor` |
| `allowPodLaunching` | Allow airflow pods to talk to Kubernetes API to launch more pods | `true` |
| `defaultAirflowRepository` | Fallback docker repository to pull airflow image from | `apache/airflow` |
@@ -158,13 +157,22 @@ The following tables lists the configurable parameters of the Airflow chart and
| `data.resultBackendSecretName` | Secret name to mount Celery result backend connection string from | `~` |
| `data.metadataConection` | Field separated connection data (alternative to secret name) | `{}` |
| `data.resultBackendConnection` | Field separated connection data (alternative to secret name) | `{}` |
-| `fernetKey` | String representing an Airflow fernet key | `~` |
-| `fernetKeySecretName` | Secret name for Airlow fernet key | `~` |
+| `fernetKey` | String representing an Airflow Fernet key | `~` |
+| `fernetKeySecretName` | Secret name for Airflow Fernet key | `~` |
| `kerberos.enabled` | Enable Kerberos support for workers | `false` |
| `kerberos.ccacheMountPath` | Location of the ccache volume | `/var/kerberos-ccache` |
| `kerberos.ccacheFileName` | Name of the ccache file | `cache` |
| `kerberos.configPath` | Path for the Kerberos config file | `/etc/krb5.conf` |
| `kerberos.keytabPath` | Path for the Kerberos keytab file | `/etc/airflow.keytab` |
| `kerberos.principal` | Name of the Kerberos principal | `airflow@FOO.COM` |
| `kerberos.reinitFrequency` | Frequency of reinitialization of the Kerberos token | `3600` |
| `kerberos.config` | Content of the configuration file for Kerberos (might be templated using Helm templates) | `<see values.yaml>` |
| `workers.replicas` | Replica count for Celery workers (if applicable) | `1` |
| `workers.keda.enabled` | Enable KEDA autoscaling features | `false` |
| `workers.keda.pollingInverval` | How often KEDA should poll the backend database for metrics in seconds | `5` |
| `workers.keda.cooldownPeriod` | How often KEDA should wait before scaling down in seconds | `30` |
| `workers.keda.maxReplicaCount` | Maximum number of Celery workers KEDA can scale to | `10` |
| `workers.kerberosSidecar.enabled` | Enable Kerberos sidecar for the worker | `false` |
| `workers.persistence.enabled` | Enable log persistence in workers via StatefulSet | `false` |
| `workers.persistence.size` | Size of worker volumes if enabled | `100Gi` |
| `workers.persistence.storageClassName` | StorageClass worker volumes should use if enabled | `default` |
@@ -196,8 +204,8 @@ The following tables lists the configurable parameters of the Airflow chart and
| `webserver.resources.requests.cpu` | CPU Request of webserver | `~` |
| `webserver.resources.requests.memory` | Memory Request of webserver | `~` |
| `webserver.defaultUser` | Optional default airflow user information | `{}` |
-| `dags.persistence.*` | Dag persistence configutation | Please refer to `values.yaml` |
-| `dags.gitSync.*` | Git sync configuration | Please refer to `values.yaml` |
+| `dags.persistence.*` | Dag persistence configuration | Please refer to `values.yaml` |
+| `dags.gitSync.*` | Git sync configuration | Please refer to `values.yaml` |


Specify each parameter using the `--set key=value[,key=value]` argument to `helm install`. For example,
12 changes: 12 additions & 0 deletions chart/templates/_helpers.yaml
@@ -235,6 +235,14 @@
{{ default (printf "%s-elasticsearch" .Release.Name) .Values.elasticsearch.secretName }}
{{- end }}

{{ define "kerberos_keytab_secret" -}}
{{ .Release.Name }}-kerberos-keytab
{{- end }}

{{ define "kerberos_ccache_path" -}}
{{ printf "%s/%s" .Values.kerberos.ccacheMountPath .Values.kerberos.ccacheFileName }}
{{- end }}

{{ define "pgbouncer_config" }}
{{- $pgMetadataHost := .Values.data.metadataConnection.host | default (printf "%s-%s.%s.svc.cluster.local" .Release.Name "postgresql" .Release.Namespace) }}
{{- $pgResultBackendHost := .Values.data.resultBackendConnection.host | default (printf "%s-%s.%s.svc.cluster.local" .Release.Name "postgresql" .Release.Namespace) }}
@@ -265,6 +273,10 @@ log_connections = {{ .Values.pgbouncer.logConnections }}
{{ (printf "%s/logs" .Values.airflowHome) | quote }}
{{- end }}

{{ define "airflow_logs_no_quote" -}}
{{ (printf "%s/logs" .Values.airflowHome) }}
{{- end }}

{{ define "airflow_dags" -}}
{{- if .Values.dags.gitSync.enabled -}}
{{ (printf "%s/dags/%s/%s" .Values.airflowHome .Values.dags.gitSync.dest .Values.dags.gitSync.subPath ) }}
4 changes: 4 additions & 0 deletions chart/templates/configmap.yaml
@@ -62,4 +62,8 @@ data:
{{- else }}
{{ tpl (.Files.Get "files/pod-template-file.kubernetes-helm-yaml") . | nindent 4 }}
{{- end }}
{{- if .Values.kerberos.enabled }}
krb5.conf: |
{{ tpl .Values.kerberos.config . | nindent 4 }}
{{- end }}
{{- end }}
61 changes: 61 additions & 0 deletions chart/templates/workers/worker-deployment.yaml
@@ -124,6 +124,15 @@ spec:
mountPath: {{ template "airflow_config_path" . }}
subPath: airflow.cfg
readOnly: true
{{- if .Values.workers.kerberosSidecar.enabled }}
- name: config
mountPath: {{ .Values.kerberos.configPath | quote }}
subPath: krb5.conf
readOnly: true
- name: kerberos-ccache
mountPath: {{ .Values.kerberos.ccacheMountPath | quote }}
readOnly: true
{{- end }}
{{- if .Values.scheduler.airflowLocalSettings }}
- name: config
mountPath: {{ template "airflow_local_setting_path" . }}
@@ -145,10 +154,62 @@ spec:
- name: logs
mountPath: {{ template "airflow_logs" . }}
{{- end }}
{{- if .Values.workers.kerberosSidecar.enabled }}
- name: KRB5_CONFIG
value: {{ .Values.kerberos.configPath | quote }}
- name: KRB5CCNAME
value: {{ include "kerberos_ccache_path" . | quote }}
{{- end }}
{{- if .Values.workers.kerberosSidecar.enabled }}
- name: worker-kerberos
image: {{ template "airflow_image" . }}
imagePullPolicy: {{ .Values.images.airflow.pullPolicy }}
args: ["kerberos"]
resources:
{{ toYaml .Values.workers.resources | indent 12 }}
volumeMounts:
- name: logs
mountPath: {{ template "airflow_logs" . }}
- name: config
mountPath: {{ template "airflow_config_path" . }}
subPath: airflow.cfg
readOnly: true
- name: config
mountPath: {{ .Values.kerberos.configPath | quote }}
subPath: krb5.conf
readOnly: true
{{- if .Values.scheduler.airflowLocalSettings }}
- name: config
mountPath: {{ template "airflow_local_setting_path" . }}
subPath: airflow_local_settings.py
readOnly: true
{{- end }}
- name: kerberos-keytab
subPath: "kerberos.keytab"
mountPath: {{ .Values.kerberos.keytabPath | quote }}
readOnly: true
- name: kerberos-ccache
mountPath: {{ .Values.kerberos.ccacheMountPath | quote }}
readOnly: false
env:
- name: KRB5_CONFIG
value: {{ .Values.kerberos.configPath | quote }}
- name: KRB5CCNAME
value: {{ include "kerberos_ccache_path" . | quote }}
{{- include "custom_airflow_environment" . | indent 10 }}
{{- include "standard_airflow_environment" . | indent 10 }}
{{- end }}
volumes:
- name: kerberos-keytab
secret:
secretName: {{ include "kerberos_keytab_secret" . | quote }}
- name: config
configMap:
name: {{ template "airflow_config" . }}
{{- if .Values.kerberos.enabled }}
- name: kerberos-ccache
emptyDir: {}
{{- end }}
{{- if .Values.dags.persistence.enabled }}
- name: dags
persistentVolumeClaim:
63 changes: 63 additions & 0 deletions chart/values.yaml
@@ -128,6 +128,59 @@ data:
fernetKey: ~
fernetKeySecretName: ~


# In order to use Kerberos you need to create a secret containing the keytab file.
# The secret name should follow the naming convention of the application, where resources are
# named {{ .Release.Name }}-<POSTFIX>. In the case of the keytab file, the postfix is "kerberos-keytab",
# so if your release is named "my-release", the name of the secret should be "my-release-kerberos-keytab".
#
# The keytab content should be available in the "kerberos.keytab" key of the secret.
#
# apiVersion: v1
# kind: Secret
# data:
#   kerberos.keytab: <base64_encoded keytab file content>
# type: Opaque
#
#
# If you have such a keytab file, you can create the secret with a command similar to:
#
# kubectl create secret generic {{ .Release.Name }}-kerberos-keytab --from-file=kerberos.keytab
#
kerberos:
  enabled: false
  ccacheMountPath: '/var/kerberos-ccache'
  ccacheFileName: 'cache'
  configPath: '/etc/krb5.conf'
  keytabPath: '/etc/airflow.keytab'
  principal: 'airflow@FOO.COM'
  reinitFrequency: 3600
  config: |
    # This is an example config showing how you can use templating and what an "example" config
    # might look like. It works with the test Kerberos server that we are using during integration
    # testing at Apache Airflow (see `scripts/ci/docker-compose/integration-kerberos.yml`), but in
    # order to make it production-ready you must replace it with your own configuration that
    # matches your Kerberos deployment. Administrators of your Kerberos instance should
    # provide the right configuration.
    [logging]
    default = "FILE:{{ template "airflow_logs_no_quote" . }}/kerberos_libs.log"
    kdc = "FILE:{{ template "airflow_logs_no_quote" . }}/kerberos_kdc.log"
    admin_server = "FILE:{{ template "airflow_logs_no_quote" . }}/kadmind.log"
    [libdefaults]
    default_realm = FOO.COM
    ticket_lifetime = 10h
    renew_lifetime = 7d
    forwardable = true
    [realms]
    FOO.COM = {
      kdc = kdc-server.foo.com
      admin_server = admin_server.foo.com
    }
# Airflow Worker Config
workers:
# Number of airflow celery workers in StatefulSet
@@ -161,6 +214,10 @@ workers:
# of local-path provisioner.
fixPermissions: false

kerberosSidecar:
# Enable kerberos sidecar
enabled: false

resources: {}
# limits:
# cpu: 100m
@@ -495,6 +552,12 @@ config:
timeout: 30
retry_timeout: 'True'

kerberos:
keytab: '{{ .Values.kerberos.keytabPath }}'
reinit_frequency: '{{ .Values.kerberos.reinitFrequency }}'
principal: '{{ .Values.kerberos.principal }}'
ccache: '{{ .Values.kerberos.ccacheMountPath }}/{{ .Values.kerberos.ccacheFileName }}'

kubernetes:
namespace: '{{ .Release.Namespace }}'
airflow_configmap: '{{ include "airflow_config" . }}'