Under which category would you file this issue?
Helm chart
Apache Airflow version
3.1.7 -> 3.2.0
What happened and how to reproduce it?
Upgraded Airflow from 3.1.7 to 3.2.0. The DB migrations pod was stuck for 20+ minutes.
Some logs:
2026-04-16T02:51:13.804504Z [info ] Context impl MySQLImpl. [alembic.runtime.migration] loc=migration.py:210
2026-04-16T02:51:13.804607Z [info ] Will assume non-transactional DDL. [alembic.runtime.migration] loc=migration.py:213
2026-04-16T02:51:13.806593Z [info ] Migrating the Airflow database (MySQL) [airflow.utils.db] loc=db.py:1179
2026-04-16T02:51:13.806680Z [info ] MySQL: Committing session to release metadata locks [airflow.utils.db] loc=db.py:794
2026-04-16T02:58:44.621134Z [info ] Context impl MySQLImpl. [alembic.runtime.migration] loc=migration.py:210
2026-04-16T02:58:44.621234Z [info ] Will assume non-transactional DDL. [alembic.runtime.migration] loc=migration.py:213
2026-04-16T02:58:44.687475Z [info ] Running upgrade cc92b33c6709 -> 82dbd68e6171, Add composite index (ti_id, id DESC) to task_reschedule. [alembic.runtime.migration] loc=migration.py:621
First, it was stuck for 7.5 minutes before it even started any migrations; then the actual migration (Add composite index) was stuck for another 20+ minutes.
The only way to get it to finish was to scale down all pods, after which the migration completed. Is this expected?
We are using MySQL.
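For anyone hitting the same lock, the manual workaround we used can be sketched roughly like this (the namespace, release name, and label selector below are assumptions — adjust them to your deployment; depending on persistence settings some components may be StatefulSets rather than Deployments):

```shell
# Hedged sketch of the workaround: scale down everything that holds connections
# to the metadata DB so nothing pins metadata locks, let the migration job
# finish, then scale back up. Namespace/release/label values are assumptions.
kubectl -n airflow scale deployment,statefulset -l "release=airflow" --replicas=0

# ...wait for the run-airflow-migrations job to complete, then restore replicas,
# e.g. by re-running `helm upgrade` with your values file.
```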
Graph of processes waiting for a lock. Migrations started around the beginning of this graph and ended when we scaled down the pods and let the migration finish.

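In case it helps with triage: while the migration was blocked, the sessions holding and waiting on the metadata lock can be inspected on the MySQL side. A diagnostic sketch (assumes MySQL 8.0 with performance_schema enabled and the sys schema installed; the schema name `airflow` is an assumption):

```shell
# Show metadata locks held/pending on the Airflow schema
# (schema name "airflow" is an assumption -- adjust to your DB).
mysql -e "SELECT object_type, object_schema, object_name, lock_type, lock_status, owner_thread_id
          FROM performance_schema.metadata_locks
          WHERE object_schema = 'airflow';"

# Show which session is blocking which, via the sys schema helper view.
mysql -e "SELECT * FROM sys.schema_table_lock_waits\G"
```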
What you think should happen instead?
The Helm chart should handle migrations cleanly, without requiring the user to scale down workloads.
Operating System
No response
Deployment
Official Apache Airflow Helm Chart
Apache Airflow Provider(s)
No response
Versions of Apache Airflow Providers
No response
Official Helm Chart version
1.20.0 (latest released)
Kubernetes Version
v1.30.14
Helm Chart configuration
defaultAirflowTag: "3.2.0"
airflowVersion: "3.2.0"
executor: "KubernetesExecutor"
allowPodLaunching: true
# Avoid having the Helm chart manage these secrets, which causes them to rotate
# on upgrades and causes issues for DAGs or web UI access
jwtSecretName: airflow-jwt-secret
fernetKeySecretName: airflow-fernet-key
apiSecretKeySecretName: airflow-api-secret-key
env:
  - name: AIRFLOW__API__EXPOSE_CONFIG
    value: "False"
  - name: AIRFLOW__API__BASE_URL
    value: "https://(redacted)"
  - name: AIRFLOW__EMAIL__EMAIL_BACKEND
    value: "airflow.utils.email.send_email_smtp"
  - name: AIRFLOW__SMTP__SMTP_HOST
    value: "redacted"
  - name: AIRFLOW__SMTP__SMTP_MAIL_FROM
    value: "redacted"
  - name: AIRFLOW__SMTP__SMTP_STARTTLS
    value: "False"
  - name: AIRFLOW__WEBSERVER__SHOW_TRIGGER_FORM_IF_NO_PARAMS
    value: "True"
  - name: AIRFLOW__WEBSERVER__WARN_DEPLOYMENT_EXPOSURE
    value: "False"
  ## in Airflow v2, parallelism=0 meant infinite;
  ## in v3, a value must be supplied
  - name: AIRFLOW__CORE__PARALLELISM
    value: "256"
# dags are mounted via hostPath
dags:
  persistence:
    enabled: false
logs:
  persistence:
    enabled: true
    #size: 5Gi
    #annotations: {}
    #storageClassName: "rook-cephfs-fs00"
    existingClaim: airflow-logs
redis:
  enabled: false
postgresql:
  enabled: false
statsd:
  enabled: false
extraEnv: |
  - name: AIRFLOW_KUBE_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: AIRFLOW_ENV
    valueFrom:
      configMapKeyRef:
        name: airflow.deployment
        key: environment
extraEnvFrom: |
  - secretRef:
      name: airflow-okta-creds
  - secretRef:
      name: airflow-admin-user-pwd
data:
  metadataSecretName: airflow-db-conn
  resultBackendSecretName: airflow-db-conn
# Root-level tolerations and affinity apply to all components (scheduler, webserver, triggerer, etc.),
# including the KubernetesExecutor pod template.
# To use different rules for the pod template, set workers.affinity/workers.tolerations instead.
# See: https://github.com/apache/airflow/blob/main/chart/files/pod-template-file.kubernetes-helm-yaml
tolerations:
  - key: airflow
    operator: Equal
    value: "true"
    effect: NoSchedule
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
            - key: node-role.kubernetes.io/worker-airflow
              operator: In
              values:
                - "true"
webserver:
  defaultUser:
    enabled: true
    role: Admin
    username: redacted
    firstName: redacted
    lastName: redacted
    email: redacted
# Note: this is only applied on account creation; if the account already exists, details won't be updated
createUserJob:
  ttlSecondsAfterFinished: 300
  command: ~
  args:
    - "bash"
    - "-c"
    - |-
      exec \
      airflow users create \
      -r "{{ .Values.webserver.defaultUser.role }}" \
      -u "{{ .Values.webserver.defaultUser.username }}" \
      -e "{{ .Values.webserver.defaultUser.email }}" \
      -f "{{ .Values.webserver.defaultUser.firstName }}" \
      -l "{{ .Values.webserver.defaultUser.lastName }}" \
      -p "${AIRFLOW_ADMIN_PASSWORD}"
  applyCustomEnv: true
securityContexts:
  pod:
    runAsUser: 50000
    runAsGroup: 0
    fsGroup: 50000
    # changing permissions is a slow operation, especially on big volumes like the logs volume,
    # so only change them if the root permissions do not match the expected permissions
    fsGroupChangePolicy: "OnRootMismatch"
    runAsNonRoot: true
triggerer:
  replicas: 2
  resources:
    requests:
      memory: "500Mi"
      cpu: "1000m"
    limits:
      memory: "1Gi"
      cpu: "2000m"
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: node-role.kubernetes.io/worker-airflow
                operator: In
                values:
                  - "true"
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: component
                operator: In
                values:
                  - triggerer
          topologyKey: "kubernetes.io/hostname"
  extraVolumes:
    - name: dags
      hostPath:
        path: redacted
        type: Directory
  extraVolumeMounts:
    - name: dags
      mountPath: /opt/airflow/dags
      readOnly: true
      subPath: dags-v3
scheduler:
  replicas: 2
  resources:
    requests:
      memory: "4Gi"
      cpu: "2"
    limits:
      memory: "8Gi"
      cpu: "4"
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: node-role.kubernetes.io/worker-airflow
                operator: In
                values:
                  - "true"
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: component
                operator: In
                values:
                  - scheduler
          topologyKey: "kubernetes.io/hostname"
dagProcessor:
  replicas: 2
  resources:
    # dagProcessor has a memory leak, increasing around 300MB/day
    requests:
      memory: "4Gi"
      cpu: "1"
    limits:
      memory: "8Gi"
      cpu: "2"
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: node-role.kubernetes.io/worker-airflow
                operator: In
                values:
                  - "true"
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: component
                operator: In
                values:
                  - dag-processor
          topologyKey: "kubernetes.io/hostname"
  extraVolumes:
    - name: dags
      hostPath:
        path: redacted
        type: Directory
  extraVolumeMounts:
    - name: dags
      mountPath: /opt/airflow/dags
      readOnly: true
      subPath: dags-v3
apiServer:
  replicas: 2
  resources:
    requests:
      memory: "2Gi"
      cpu: "1000m"
    limits:
      memory: "3Gi"
      cpu: "2000m"
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: node-role.kubernetes.io/worker-airflow
                operator: In
                values:
                  - "true"
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: component
                operator: In
                values:
                  - api-server
          topologyKey: "kubernetes.io/hostname"
  env:
    ## this should achieve the same as the community chart's `airflow.extraPipPackages`. Note: the leading underscore is necessary, not a typo!
    - name: _PIP_ADDITIONAL_REQUIREMENTS
      value: "airflow-exporter==1.6.0"
  service:
    type: NodePort
    ports:
      - name: api-server
        port: "{{ .Values.ports.apiServer }}"
        nodePort: 30010
  ## Uncomment below to enable Okta
  ## note: does not work when port-forwarding
  apiServerConfig: |
    import os
    from flask_appbuilder.security.manager import AUTH_OAUTH
    AUTH_TYPE = AUTH_OAUTH
    # registration configs
    AUTH_USER_REGISTRATION = True  # allow users who are not already in the FAB DB
    AUTH_USER_REGISTRATION_ROLE = "Viewer"  # this role will be given in addition to any AUTH_ROLES_MAPPING
    # the list of providers which the user can choose from
    OAUTH_PROVIDERS = [
        {
            "name": "okta",
            "icon": "fa-circle-o",
            "token_key": "access_token",
            "remote_app": {
                "client_id": os.environ["OKTA_CLIENT_ID"],
                "client_secret": os.environ["OKTA_CLIENT_SECRET"],
                "api_base_url": "redacted",
                "client_kwargs": {"scope": "openid profile email groups"},
                "server_metadata_url": "redacted",
                "access_token_url": "redacted",
                "authorize_url": "redacted",
            },
        },
    ]
    # a mapping from the values of `userinfo["role_keys"]` to a list of FAB roles
    AUTH_ROLES_MAPPING = {
        "Viewer": ["Viewer"],
        "redacted": ["Admin"],
    }
    # whether to replace ALL of the user's roles on each login, or only on registration
    AUTH_ROLES_SYNC_AT_LOGIN = True
    # force users to re-auth after 30 min of inactivity (to keep roles in sync)
    PERMANENT_SESSION_LIFETIME = 1800
## this is a bit of a confusing section: the workers section is technically for deploying Celery workers, which we don't use,
## but its extraVolumes/extraVolumeMounts are appended to the pod manifests of pods launched by Airflow.
## As of chart 1.20.0, workers.resources is deprecated in favor of workers.kubernetes.resources
workers:
  extraVolumes:
    - name: dags
      hostPath:
        path: redacted
        type: Directory
  extraVolumeMounts:
    - name: dags
      mountPath: /opt/airflow/dags
      readOnly: true
      subPath: dags-v3
  kubernetes:
    resources:
      requests:
        memory: "500Mi"
        cpu: "200m"
      limits:
        memory: "1Gi"
        cpu: "2000m"
Docker Image customizations
No response
Anything else?
No response
Are you willing to submit PR?
Code of Conduct