This issue was moved to a discussion. You can continue the conversation there.
XCom not working when xcom_pull from subdag task in main DAG #27785
Comments
The complete code is:
As you can see, we've tested multiple ways of getting to the subdag's pushed value, all without success. Thanks in advance.
tl;dr: this actually is working; see my simplified example below. BUT there is an edge case where this doesn't work for manual triggers of the parent DAG.

If you load the attached DAG below (simplified from what you provided) into your Airflow environment, it will run every 5 minutes and the XCom value from the task in the subdag will be correctly read back by the task in the parent DAG. I think there may have been some bugs in your DAG code, but even if there weren't, I assume you were manually triggering the parent DAG, so the XCom could not be read successfully for the above reason.

```python
from __future__ import annotations

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.operators.subdag import SubDagOperator

DAG_NAME = "test_xcom"
SUB_DAG_NAME = "subdag_writer"
SUB_DAG_PUSH_TASK = "push_xcom_value_sub_dag"


def sub_dag() -> DAG:
    with DAG(
        dag_id=f"{DAG_NAME}.{SUB_DAG_NAME}",
        start_date=datetime(2022, 1, 1),
        catchup=False,
        schedule="*/5 * * * *",  # must match the parent DAG's schedule
    ) as subdag:

        def extract(**kwargs):
            print("hello Im running")
            return "xcom_value_test"

        PythonOperator(
            task_id=SUB_DAG_PUSH_TASK,
            python_callable=extract,
            do_xcom_push=True,  # the default, shown here for clarity
        )
    return subdag


with DAG(
    dag_id=DAG_NAME,
    start_date=datetime(2022, 1, 1),
    catchup=False,
    schedule="*/5 * * * *",
) as dag:

    def push_xcom(**kwargs):
        return "test_value"

    def read_xcom(ti, **kwargs):
        return_value = ti.xcom_pull(
            task_ids="push_xcom_value_sub_dag",
            dag_id="test_xcom.subdag_writer",
            key="return_value",
        )
        print(f"I got this from xcom: {return_value}")

    push_xcom_value = PythonOperator(
        task_id="push_xcom_value",
        python_callable=push_xcom,
    )

    bash_push_sub_dag = SubDagOperator(
        task_id=SUB_DAG_NAME,
        subdag=sub_dag(),
    )

    read_xcom_value = PythonOperator(
        task_id="read_xcom_value",
        python_callable=read_xcom,
    )

    push_xcom_value >> bash_push_sub_dag >> read_xcom_value
```
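For context on the edge case: SubDagOperator registers the child DAG under the id `"<parent_dag_id>.<child_task_id>"`, and a cross-DAG `xcom_pull` matches on both that `dag_id` and the run's logical date. A scheduled parent run and its subdag run share a logical date, so the lookup succeeds; a manually triggered parent run gets its own logical date, for which no matching subdag run exists. A minimal sketch of the naming convention (the helper below is hypothetical, not an Airflow API):

```python
# Hypothetical helper, not part of Airflow: it only illustrates the dag_id
# convention that SubDagOperator enforces and that the cross-DAG xcom_pull
# in the example above relies on.
def subdag_dag_id(parent_dag_id: str, child_task_id: str) -> str:
    # Airflow requires a subdag's dag_id to be "<parent_dag_id>.<child_task_id>".
    return f"{parent_dag_id}.{child_task_id}"


print(subdag_dag_id("test_xcom", "subdag_writer"))  # test_xcom.subdag_writer
```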
Hi Niko, test_xcom_issue.subdag_writer log file: read_xcom_value log file:
Hi Niko, I changed the code (basically some simplifications) and changed the schedule to every 5 minutes, and it worked. With this test, I assume the @once schedule doesn't work either, right? But your explanation doesn't make sense for this particular case, because it used the run "scheduled__2022-11-19T00:00:00+00:00". Code:
Thank you for your help and clarification. However, we have some use cases where the schedule is None and the pipeline is triggered externally through the REST API or another way. Is there a workaround for this use case? Best regards,
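One possible workaround sketch for the externally triggered case (an assumption on my part, not something confirmed in this thread): `TaskInstance.xcom_pull` accepts `include_prior_dates=True`, which widens the search beyond the current logical date, so a manually or API-triggered parent run can still find the most recent value the subdag pushed. The stub below only shows the call shape; at runtime `ti` is the real TaskInstance Airflow passes in.

```python
def read_xcom(ti, **kwargs):
    # Sketch of a possible workaround, assuming Airflow 2.x: when the parent
    # DAG has schedule=None and is triggered externally, an exact-logical-date
    # lookup in the subdag returns None. include_prior_dates=True also
    # searches earlier runs of the subdag for the value.
    value = ti.xcom_pull(
        task_ids="push_xcom_value_sub_dag",
        dag_id="test_xcom.subdag_writer",
        key="return_value",
        include_prior_dates=True,
    )
    print(f"I got this from xcom: {value}")
```

Note this returns the latest available value, not one tied to the triggering run, so it fits pipelines where "most recent" is acceptable.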
@ana-carolina-januario @AngeloPingoGalp just want to highlight that SubDAGs are a deprecated feature and will be removed one day; I don't know if anyone wants to invest time in changing deprecated functionality. Have you tried TaskGroups, which are the replacement for SubDAGs?
- https://airflow.apache.org/docs/apache-airflow/stable/concepts/dags.html#taskgroups-vs-subdags
- https://docs.astronomer.io/learn/task-groups
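For reference, a hedged sketch (assuming Airflow 2.4+; dag_id, group_id, and task names are my own, not from the thread) of the same pipeline rewritten with a TaskGroup. Everything lives in one DAG, so the reader pulls XCom with a plain `"group_id.task_id"` reference, no cross-DAG lookup, and it works the same for scheduled and manual runs:

```python
from __future__ import annotations

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.utils.task_group import TaskGroup


def extract(**kwargs):
    return "xcom_value_test"


def read_xcom(ti, **kwargs):
    # Tasks inside a TaskGroup get their task_id prefixed with the group id.
    value = ti.xcom_pull(task_ids="writer_group.push_xcom_value")
    print(f"I got this from xcom: {value}")


with DAG(
    dag_id="test_xcom_taskgroup",
    start_date=datetime(2022, 1, 1),
    catchup=False,
    schedule="*/5 * * * *",
) as dag:
    with TaskGroup(group_id="writer_group") as writer_group:
        PythonOperator(task_id="push_xcom_value", python_callable=extract)

    read_xcom_value = PythonOperator(
        task_id="read_xcom_value",
        python_callable=read_xcom,
    )

    writer_group >> read_xcom_value
```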
Hi,
Yes, I already started transforming our subdags into task groups after some research into this issue. Thanks & regards.
Yup, this was a typo of sorts. In my code snippet you can see that I had the schedule (also 5 min) there but commented out; it was a slightly outdated version. I'll update the snippet.
@ana-carolina-januario Can this issue be resolved?
Apache Airflow version
2.4.3
What happened
I have a DAG with a SubDagOperator task that contains two PythonOperator tasks. Both of the subdag's tasks return values and publish them to XCom. I confirmed the values are available in the XCom list (through the UI).
Inside the subdag, one task can read the values returned (through XCom) by the other task.
In the main DAG, the downstream task following the SubDagOperator task needs to pull from XCom the value returned by one of the subdag's tasks. This is the part that fails: a task in the main DAG can't read the XCom from the subdag's tasks.
I've tested reading the XCom using both BashOperator and PythonOperator.
I am using Python 3.10.8 and Airflow 2.4.3.
What you think should happen instead
Using the code:
the reader task should be able to get the values from XCom.
How to reproduce
I copied the code from the Airflow repository examples: the XCom usage example and the subdag usage example. The specific bash command used to reproduce the issue is:
Operating System
Debian GNU/Linux 11 (bullseye)
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==6.0.0
apache-airflow-providers-celery==3.0.0
apache-airflow-providers-cncf-kubernetes==4.4.0
apache-airflow-providers-common-sql==1.2.0
apache-airflow-providers-docker==3.2.0
apache-airflow-providers-elasticsearch==4.2.1
apache-airflow-providers-ftp==3.1.0
apache-airflow-providers-google==8.4.0
apache-airflow-providers-grpc==3.0.0
apache-airflow-providers-hashicorp==3.1.0
apache-airflow-providers-http==4.0.0
apache-airflow-providers-imap==3.0.0
apache-airflow-providers-jdbc==3.2.1
apache-airflow-providers-microsoft-azure==4.3.0
apache-airflow-providers-mysql==3.2.1
apache-airflow-providers-odbc==3.1.2
apache-airflow-providers-postgres==5.2.2
apache-airflow-providers-redis==3.0.0
apache-airflow-providers-sendgrid==3.0.0
apache-airflow-providers-sftp==4.1.0
apache-airflow-providers-slack==6.0.0
apache-airflow-providers-sqlite==3.2.1
apache-airflow-providers-ssh==3.2.0
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else
It occurs every time.
My team and I need this urgently.
Are you willing to submit a PR?
Code of Conduct