Description
Apache Airflow version
2.7.3
What happened
When the AirbyteTriggerSyncOperator is called synchronously (asynchronous=False) and the timeout is reached, the Airbyte job should be cancelled; otherwise it keeps running in Airbyte.
This is just a matter of calling the cancel-job method that already exists in the hook: https://github.com/apache/airflow/blob/main/airflow/providers/airbyte/hooks/airbyte.py#L110C9-L110C9
What you think should happen instead
I think that if the Airbyte operator has not finished within the defined timeout, the Airbyte job should be stopped as well. Otherwise the Airbyte job may keep running and even finish after the timeout. The Airflow task will then have failed while the Airbyte job looks successful, which leaves Airflow and Airbyte in inconsistent states.
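To illustrate the behaviour I have in mind, here is a minimal, self-contained sketch. It does not use the real provider code: StubHook, get_status, JobTimeout and wait_or_cancel are hypothetical names made up for this example, and only the idea of calling the hook's existing cancel-job method when the timeout expires comes from the provider.

```python
import time


class JobTimeout(Exception):
    """Raised when the job does not finish in time (hypothetical name)."""


class StubHook:
    """Stand-in for the Airbyte hook; it only records calls (not the real API)."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self.cancelled = []

    def get_status(self, job_id):
        # Return the next scripted status; report "running" once exhausted.
        return next(self._statuses, "running")

    def cancel_job(self, job_id):
        # Record the cancellation instead of calling Airbyte.
        self.cancelled.append(job_id)


def wait_or_cancel(hook, job_id, wait_seconds=3.0, timeout=3600.0):
    """Poll until the job succeeds; cancel it and raise if the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = hook.get_status(job_id)
        if status == "succeeded":
            return status
        if status in ("failed", "cancelled"):
            raise RuntimeError(f"Job {job_id} ended with status {status}")
        if time.monotonic() >= deadline:
            # The missing step: stop the remote job before failing the task.
            hook.cancel_job(job_id)
            raise JobTimeout(f"Job {job_id} not finished after {timeout}s")
        time.sleep(wait_seconds)
```

With a hook whose job never leaves the "running" state, wait_or_cancel(hook, 42, wait_seconds=0.01, timeout=0.05) raises JobTimeout and leaves 42 in hook.cancelled, so both sides agree that the run did not succeed.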
How to reproduce
It's very easy to reproduce by triggering a connection sync with a very small timeout:
from airflow import DAG
from airflow.utils.dates import days_ago
from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator

with DAG(dag_id='trigger_airbyte_job_example',
         default_args={'owner': 'airflow'},
         schedule_interval='@daily',
         start_date=days_ago(1)
         ) as dag:
    money_to_json = AirbyteTriggerSyncOperator(
        task_id='airbyte_money_json_example',
        airbyte_conn_id='airbyte_conn_example',
        connection_id='1e3b5a72-7bfd-4808-a13c-204505490110',  # change this to something that works
        asynchronous=False,  # important to have this set to False
        timeout=10,  # something really small
        wait_seconds=3
    )
Operating System
Debian GNU/Linux 11 (bullseye)
Versions of Apache Airflow Providers
apache-airflow-providers-airbyte 3.4.0
Deployment
Docker-Compose
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct