Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 42 additions & 0 deletions demos/airflow-scheduled-job/01-airflow-demo-clusterrole.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: airflow-demo-clusterrole
rules:
- apiGroups:
- spark.stackable.tech
resources:
- sparkapplications
verbs:
- create
- get
- list
- apiGroups:
- apps
resources:
- statefulsets
verbs:
- get
- watch
- list
- apiGroups:
- ""
resources:
- persistentvolumeclaims
verbs:
- list
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- watch
- list
- apiGroups:
- ""
resources:
- pods/exec
verbs:
- create
36 changes: 0 additions & 36 deletions demos/airflow-scheduled-job/01-airflow-spark-clusterrole.yaml

This file was deleted.

Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,11 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: airflow-spark-clusterrole-binding
name: airflow-demo-clusterrole-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: airflow-spark-clusterrole
name: airflow-demo-clusterrole
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: Group
Expand Down
10 changes: 6 additions & 4 deletions demos/airflow-scheduled-job/03-enable-and-run-spark-dag.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -9,10 +9,12 @@ spec:
containers:
- name: start-pyspark-job
image: oci.stackable.tech/sdp/tools:1.0.0-stackable0.0.0-dev
# N.B. it is possible for the scheduler to report that a DAG exists, only for the worker task to fail if a pod is unexpectedly
# restarted. Additionally, the db-init job takes a few minutes to complete before the cluster is deployed. The wait/watch steps
# below are not "water-tight" but add a layer of stability by at least ensuring that the db is initialized and ready and that
# all pods are reachable (albeit independent of each other).
# N.B. it is possible for the scheduler to report that a DAG exists,
# only for the worker task to fail if a pod is unexpectedly
# restarted. The wait/watch steps below are not "water-tight" but add
# a layer of stability by at least ensuring that the cluster is
# initialized and ready and that all pods are reachable (albeit
# independent of each other).
command:
- bash
- -euo
Expand Down
10 changes: 6 additions & 4 deletions demos/airflow-scheduled-job/04-enable-and-run-date-dag.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -9,10 +9,12 @@ spec:
containers:
- name: start-date-job
image: oci.stackable.tech/sdp/tools:1.0.0-stackable0.0.0-dev
# N.B. it is possible for the scheduler to report that a DAG exists, only for the worker task to fail if a pod is unexpectedly
# restarted. Additionally, the db-init job takes a few minutes to complete before the cluster is deployed. The wait/watch steps
# below are not "water-tight" but add a layer of stability by at least ensuring that the db is initialized and ready and that
# all pods are reachable (albeit independent of each other).
# N.B. it is possible for the scheduler to report that a DAG exists,
# only for the worker task to fail if a pod is unexpectedly
# restarted. The wait/watch steps below are not "water-tight" but add
# a layer of stability by at least ensuring that the cluster is
# initialized and ready and that all pods are reachable (albeit
# independent of each other).
command:
- bash
- -euo
Expand Down
68 changes: 68 additions & 0 deletions demos/airflow-scheduled-job/05-enable-and-run-kafka-dag.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
---
apiVersion: batch/v1
kind: Job
metadata:
name: start-kafka-job
spec:
template:
spec:
containers:
- name: start-kafka-job
image: oci.stackable.tech/sdp/tools:1.0.0-stackable0.0.0-dev
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
# N.B. it is possible for the scheduler to report that a DAG exists,
# only for the worker task to fail if a pod is unexpectedly
# restarted. The wait/watch steps below are not "water-tight" but add
# a layer of stability by at least ensuring that the cluster is
# initialized and ready and that all pods are reachable (albeit
# independent of each other).
command:
- bash
- -euo
- pipefail
- -c
- |
# Kafka: wait for cluster
kubectl rollout status --watch statefulset/kafka-broker-default
kubectl rollout status --watch statefulset/kafka-controller-default

# Kafka: create consumer offsets topics (required for group coordinator)
kubectl exec kafka-broker-default-0 -c kafka -- \
/stackable/kafka/bin/kafka-topics.sh \
--bootstrap-server kafka-broker-default-0-listener-broker.$(NAMESPACE).svc.cluster.local:9093 \
--create \
--if-not-exists \
--topic __consumer_offsets \
--partitions 50 \
--replication-factor 1 \
--config cleanup.policy=compact \
--command-config /stackable/config/client.properties

# Airflow: wait for cluster
kubectl rollout status --watch statefulset/airflow-webserver-default
kubectl rollout status --watch statefulset/airflow-scheduler-default

# Airflow: activate DAG
AIRFLOW_ADMIN_PASSWORD=$(cat /airflow-credentials/adminUser.password)
ACCESS_TOKEN=$(curl -XPOST http://airflow-webserver-default-headless:8080/auth/token -H 'Content-Type: application/json' -d '{"username": "admin", "password": "'$AIRFLOW_ADMIN_PASSWORD'"}' | jq -r .access_token)
curl -H "Authorization: Bearer $ACCESS_TOKEN" -H 'Content-Type: application/json' -XPATCH http://airflow-webserver-default-headless:8080/api/v2/dags/kafka_watcher -d '{"is_paused": false}' | jq

# Kafka: produce a message to create the topic
kubectl exec kafka-broker-default-0 -c kafka -- bash -c \
'echo "Hello World at: $(date)" | /stackable/kafka/bin/kafka-console-producer.sh \
--bootstrap-server kafka-broker-default-0-listener-broker.$(NAMESPACE).svc.cluster.local:9093 \
--producer.config /stackable/config/client.properties \
--topic test-topic'
volumeMounts:
- name: airflow-credentials
mountPath: /airflow-credentials
volumes:
- name: airflow-credentials
secret:
secretName: airflow-credentials
restartPolicy: OnFailure
backoffLimit: 20 # give some time for the Airflow cluster to be available
52 changes: 52 additions & 0 deletions demos/airflow-scheduled-job/06-create-opa-users.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
---
apiVersion: batch/v1
kind: Job
metadata:
name: start-users-job
spec:
template:
spec:
containers:
- name: start-users-job
image: oci.stackable.tech/sdp/tools:1.0.0-stackable0.0.0-dev
# N.B. it is possible for the scheduler to report that a DAG exists,
# only for the worker task to fail if a pod is unexpectedly
# restarted. The wait/watch steps below are not "water-tight" but add
# a layer of stability by at least ensuring that the cluster is
# initialized and ready and that all pods are reachable (albeit
# independent of each other).
command:
- bash
- -euo
- pipefail
- -c
- |
# Airflow: wait for cluster
kubectl rollout status --watch statefulset/airflow-webserver-default
kubectl rollout status --watch statefulset/airflow-scheduler-default

# Airflow: create users
kubectl exec airflow-webserver-default-0 -- airflow users create \
--username "jane.doe" \
--firstname "Jane" \
--lastname "Doe" \
--email "jane.doe@stackable.tech" \
--password "jane.doe" \
--role "User"

kubectl exec airflow-webserver-default-0 -- airflow users create \
--username "richard.roe" \
--firstname "Richard" \
--lastname "Roe" \
--email "richard.roe@stackable.tech" \
--password "richard.roe" \
--role "User"
volumeMounts:
- name: airflow-credentials
mountPath: /airflow-credentials
volumes:
- name: airflow-credentials
secret:
secretName: airflow-credentials
restartPolicy: OnFailure
backoffLimit: 20 # give some time for the Airflow cluster to be available
6 changes: 4 additions & 2 deletions demos/demos-v2.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -46,10 +46,12 @@ demos:
- airflow
- job-scheduling
manifests:
- plainYaml: https://raw.githubusercontent.com/stackabletech/demos/main/demos/airflow-scheduled-job/01-airflow-spark-clusterrole.yaml
- plainYaml: https://raw.githubusercontent.com/stackabletech/demos/main/demos/airflow-scheduled-job/02-airflow-spark-clusterrolebinding.yaml
- plainYaml: https://raw.githubusercontent.com/stackabletech/demos/main/demos/airflow-scheduled-job/01-airflow-demo-clusterrole.yaml
- plainYaml: https://raw.githubusercontent.com/stackabletech/demos/main/demos/airflow-scheduled-job/02-airflow-demo-clusterrolebinding.yaml
- plainYaml: https://raw.githubusercontent.com/stackabletech/demos/main/demos/airflow-scheduled-job/03-enable-and-run-spark-dag.yaml
- plainYaml: https://raw.githubusercontent.com/stackabletech/demos/main/demos/airflow-scheduled-job/04-enable-and-run-date-dag.yaml
- plainYaml: https://raw.githubusercontent.com/stackabletech/demos/main/demos/airflow-scheduled-job/05-enable-and-run-kafka-dag.yaml
- plainYaml: https://raw.githubusercontent.com/stackabletech/demos/main/demos/airflow-scheduled-job/06-create-opa-users.yaml
supportedNamespaces: []
resourceRequests:
cpu: 2401m
Expand Down
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/modules/demos/images/airflow-scheduled-job/airflow_2.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/modules/demos/images/airflow-scheduled-job/airflow_3.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file not shown.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/modules/demos/images/airflow-scheduled-job/overview.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading