Kubernetes pods are crashing when replacing CeleryExecutor with KubernetesExecutor #20101

@brata96

Official Helm Chart version

1.3.0 (latest released)

Apache Airflow version

2.0.2

Kubernetes Version

Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T11:13:54Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.20-gke.900", GitCommit:"1facb91642e16cb4f5be4e4a632c488aa4700382", GitTreeState:"clean", BuildDate:"2021-06-28T09:19:58Z", GoVersion:"go1.13.15b4", Compiler:"gc", Platform:"linux/amd64"}

Helm Chart configuration

`ingress:
enabled: true

web:
annotations: {
'kubernetes.io/ingress.class': 'nginx'
}
host: "airflow{{ url_suffix }}.{{ GCP_GKE_CLUSTER }}.{{ dns_domain }}"

flower:
annotations: {
'kubernetes.io/ingress.class': 'nginx'
}
host: "airflow-flower{{ url_suffix }}.{{ GCP_GKE_CLUSTER }}.{{ dns_domain }}"

web:
authenticate: false
webserverConfig:
stringOverride: |-
AUTH_ROLE_PUBLIC = "Admin"

resources:
requests:
cpu: "1000m"
memory: "4Gi"
limits:
cpu: "1000m"
memory: "4Gi"
web:
extraPipPackages:
## the following configs require Flask-AppBuilder 3.2.0 (or later)
- "Flask-AppBuilder~=3.2.0"
## the following configs require Authlib
- "Authlib~=0.15.3"

webserver:
service:
type: NodePort
authenticate: false

flower:
service:
type: NodePort

resources:
requests:
cpu: "500m"
memory: "1Gi"
limits:
cpu: "500m"
memory: "1Gi"

airflow:
users: []
connections:

executor: KubernetesExecutor

image:
repository: "{{ DOCKER_REGISTRY }}/**//airflow-cluster"
tag: "{{ version }}"

config:

AIRFLOW__WEBSERVER__EXPOSE_CONFIG: "True"

AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: "30"

AIRFLOW__CORE__REMOTE_LOGGING: "True"
AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER: "gs://{{ GCP_WORKLOAD_IDENTITY_PROJECT }}-airflow-cluster/airflow/logs"
AIRFLOW__CORE__REMOTE_LOG_CONN_ID: "google_cloud_airflow"

AIRFLOW__CORE__LOAD_EXAMPLES: "False"
AIRFLOW__CORE__LAZY_LOAD_PLUGINS: "False"
AIRFLOW__CORE__PLUGINS_FOLDER: "/opt/airflow/plugins"
AIRFLOW__WEBSERVER__RELOAD_ON_PLUGIN_CHANGE: "True"
# kubernetesexecutor config start
AIRFLOW__KUBERNETES__DAGS_IN_IMAGE: "False"
AIRFLOW__KUBERNETES__DAGS_VOLUME_HOST: "/opt/airflow/dags/repo/airflow-dags"
AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: "{{ DOCKER_REGISTRY }}/**********/*********/airflow-cluster"
AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: "{{ version }}"
AIRFLOW__KUBERNETES__DELETE_WORKER_PODS: "False"
# kubernetesexecutor config end
AIRFLOW__API__ENABLE_EXPERIMENTAL_API: "True"
AIRFLOW__API__AUTH_BACKEND: "airflow.api.auth.backend.default"
AIRFLOW__WEBSERVER__ENABLE_PROXY_FIX: "{{ enable_proxy_fix }}"

AUTH_ROLE_PUBLIC: "Admin"

AIRFLOW__METRICS__STATSD_ON: "True"

extraPipPackages:
- "apache-airflow-providers-google==3.0.0"
scheduler:
resources:
requests:
cpu: "2000m"
memory: "2Gi"
limits:
cpu: "2000m"
memory: "2Gi"

connections:
- id: google_cloud_airflow
type: google_cloud_platform
extra: '{ "extra__google_cloud_platform__project": "{{ GCP_PROJECT }}", "extra__google_cloud_platform__scope": "https://www.googleapis.com/auth/cloud-platform", "extra__google_cloud_platform__key_path": "", "extra__google_cloud_platform__keyfile_dict": ""}'

pools: |
{
"example": {
"description": "This is an example pool with 2 slots.",
"slots": 4
}
}
variables: |
{
"dags_folder": "/opt/airflow/dags/airflow-dags",
"*********": "{{ ********* }}"
}

workers:
extraVolumes:

replicas: 4
resources:
requests:
cpu: "1000m"
memory: "4Gi"
limits:
cpu: "1000m"
memory: "4Gi"

podDisruptionBudget:
enabled: true
maxUnavailable: "20%"

autoscaling:
enabled: true
maxReplicas: 20
metrics:
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80

celery:
instances: 6
gracefullTermination: true
gracefullTerminationPeriod: 540
terminationPeriod: 60

dags:
persistence:
enabled: true
storageClassName: "csi-gcs-test-sc"
accessMode: ReadOnlyMany
existingClaim: "airflow-gcs-pvc2"
size: 5Gi
rbac:
create: false

redis:
enabled: false

workers:
enabled: false

flower:
enabled: false`

Docker Image customisations

No response

What happened

I am trying to replace CeleryExecutor with KubernetesExecutor. After the switch, the Kubernetes pods are crashing. We have disabled Flower, the workers, and Redis, because with KubernetesExecutor each task runs in its own newly created pod.
`rbac:
  create: false

redis:
  enabled: false

workers:
  enabled: false

flower:
  enabled: false`
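
For readability, the executor-related part of the values above can be sketched as the fragment below. This is only a sketch: the original indentation was lost when the configuration was pasted, so the nesting shown here (for example `executor` under `airflow`) is an assumption inferred from the key names, not a confirmed layout. The comment on `rbac` is also an assumption about something that may be worth checking, not a confirmed cause of the crashes.

```yaml
# Sketch only: nesting is assumed from the key names in the report.
airflow:
  executor: KubernetesExecutor   # each task runs in its own pod

# Celery-specific components are switched off, as described above.
workers:
  enabled: false
flower:
  enabled: false
redis:
  enabled: false

rbac:
  # Kept as in the report. Assumption: with this set to false the chart
  # does not create RBAC resources, so the scheduler's service account
  # would already need permission to create and delete pods in the
  # namespace for KubernetesExecutor to launch task pods.
  create: false
```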

What you expected to happen

A successful deployment with KubernetesExecutor, without the pods crashing.

How to reproduce

Not sure!

Anything else

No

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
