As part of the deployment, the worker pod was deployed first, and the airflow-redis-master pod was recreated afterwards. In the Airflow worker pod, we can see the Redis disconnect message. After a few retries the worker reconnects to Redis, but it then logs a missed heartbeat from the worker node and stops processing tasks indefinitely. Tasks stay in the queued state until the worker pod is restarted.
[2022-05-30 17:53:06,186: ERROR/MainProcess] consumer: Cannot connect to redis://:**@airflow-redis-master.-XX-XXX--env1.svc.cluster.local:6379/1: Error 111 connecting to airflow-redis-master-XX-XX-env1.svc.cluster.local:6379. Connection refused..
Trying again in 12.00 seconds... (6/100)
Relevant Logs
[2022-05-30 17:53:18,213: INFO/MainProcess] Connected to redis://:**@airflow-redis-master-XX-XXX-env1.svc.cluster.local:6379/1
[2022-05-30 17:53:18,228: INFO/MainProcess] mingle: searching for neighbors
[2022-05-30 17:53:19,239: INFO/MainProcess] mingle: all alone
[2022-05-30 17:53:24,246: INFO/MainProcess] missed heartbeat from celery@airflow-worker-0
@thesuperzapper, thanks for responding. We didn't define any custom values.
I have resolved this issue by configuring a liveness check for the worker pod, as described in apache/airflow#22378.
The only problem is that the worker containers are restarted whenever both workers have missed heartbeats and are not processing any messages.
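For readers who want to see what such a liveness check can look like, here is a minimal sketch. It is not the configuration from apache/airflow#22378. It assumes Airflow 2.x with the CeleryExecutor (so the Celery app is importable as `airflow.executors.celery_executor.app`) and a worker node name of the form `celery@<pod hostname>`, as in the log line `celery@airflow-worker-0` above. The function names and the 30-second timeout are hypothetical.

```python
import socket
import subprocess
import sys
from typing import List

# Assumption: module path of the Celery app in Airflow 2.x with the CeleryExecutor.
CELERY_APP = "airflow.executors.celery_executor.app"


def build_ping_command(hostname: str) -> List[str]:
    """Build a `celery inspect ping` command aimed at this pod's own worker."""
    return [
        sys.executable, "-m", "celery",
        "--app", CELERY_APP,
        "inspect", "ping",
        "--destination", f"celery@{hostname}",
    ]


def worker_is_alive(hostname: str, timeout_s: float = 30.0) -> bool:
    """Return True only if the ping command exits with status 0 in time."""
    try:
        result = subprocess.run(
            build_ping_command(hostname),
            capture_output=True,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung or unlaunchable ping is treated as an unhealthy worker.
        return False
    return result.returncode == 0


def main() -> int:
    """Exit code for a probe: 0 means healthy, 1 means restart the container."""
    return 0 if worker_is_alive(socket.gethostname()) else 1
```

A liveness probe could run this as a script that ends with `sys.exit(main())`. Because the ping travels through the broker, a worker that reconnected to Redis but no longer consumes messages should fail the check and be restarted by Kubernetes, instead of leaving tasks queued until someone restarts the pod by hand.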
Checks
User-Community Airflow Helm Chart
Chart Version
8.6.0
Kubernetes Version
Helm Version
Custom Helm Values