Greetings,
I have a question about how "on_failure_callback" is handled in a special case. In our current setup, the callback triggers a function that creates a Jira ticket. Nothing special about this; it works in general and we are happy with it.
What puzzles us is that, in some cases, we do not get a single ticket but multiple identical tickets. After going through the logs, we found the following:
When a network issue breaks the connection to the metadata database, the "on_failure_callback" gets triggered. This is what we would expect. We would then assume that the task is not retried, since we set the "retries" parameter to 0, and that the "on_failure_callback" therefore does not trigger again. But it does.
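To make the symptom easier to discuss, here is a minimal sketch of the kind of setup described above. It is not our production code: `create_jira_ticket`, `failure_key` and `notify_failure` are hypothetical names, and the Jira call is replaced by a stand-in that only records the payload. The sketch assumes the callback receives the usual Airflow context dict and that its task instance exposes `dag_id`, `task_id`, `run_id` and `try_number`.

```python
# Hypothetical stand-in for the real Jira client call; it only records payloads.
CREATED_TICKETS = []


def create_jira_ticket(summary, description):
    """Record a ticket payload instead of calling Jira (illustration only)."""
    CREATED_TICKETS.append({"summary": summary, "description": description})


def failure_key(context):
    """Build a key that identifies one task attempt from the callback context."""
    ti = context["task_instance"]
    return f"{ti.dag_id}.{ti.task_id}.{ti.run_id}.try{ti.try_number}"


def notify_failure(context):
    """Failure callback: takes the context dict and files one ticket per call."""
    key = failure_key(context)
    create_jira_ticket(
        summary=f"Task failed: {key}",
        description=str(context.get("exception")),
    )
    return key


# Assumed wiring in the DAG definition (not executed here):
#   default_args = {"retries": 0, "on_failure_callback": notify_failure}
```

Putting the attempt key into the ticket summary, or into a log line, would show whether the repeated callbacks belong to the same attempt (identical key) or to different attempts (changing `try_number`).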
I thought about possible race conditions (we are using Celery workers for the tasks) or specific retry mechanisms for the connection to the metadata database. I went through the configuration parameters (https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html) and compared them to our configuration. The only one that seemed relevant to me was "max_db_retries", since it relates to the metadata database. Unfortunately, the number of triggered failure callbacks is inconsistent, ranging from 2 to more than 5, while the parameter is set to its default value of 3.
I would love to hear from you regarding our issue. I tried to keep the question and description short, but if you have any further questions, I will provide more details about our setup and configuration.
Regards