Using .output on non-templated fields #27285

potiuk · 2022-10-26T02:49:02Z

Discussed in #26938

Originally posted by stephenonethree October 7, 2022
I just discovered the .output property functionality that apparently was released in Airflow 2 for classic operators, as a simple way of accessing their output XComs. I think that this is a super useful feature because it would allow simpler connections between tasks than what I have been doing until now.

Until now, I've been explicitly giving a downstream task the task_ids and XCom names that it needs to pull from its upstreams (as hardcoded string parameters). Something like:

# pushes XCom named myvar
upstream_task = SomeOperator(task_id='upstream_task')

# this is a custom operator and within execute() I have a line roughly like:
# actual_input = context['task_instance'].xcom_pull(
#     dag_id=context['dag'].dag_id, task_ids=upstream_task_id, key=upstream_xcom_name)
this_task = OtherOperator(
    task_id='this_task',
    upstream_task_id='upstream_task',
    upstream_xcom_name='myvar'
)
upstream_task >> this_task

With .output I could simplify this to:

# pushes XCom named myvar
upstream_task = SomeOperator(task_id='upstream_task')

this_task = OtherOperator(
    task_id='this_task',
    actual_input=upstream_task.output['myvar']
)
upstream_task >> this_task

Unfortunately it seems that there is one limitation. On the TaskFlow documentation page (https://airflow.apache.org/docs/apache-airflow/2.4.1/tutorial/taskflow.html#consuming-xcoms-between-decorated-and-traditional-tasks) it says: Using the .output property as an input to another task is supported only for operator parameters listed as a template_field.

I don't use Jinja templating for very many of my parameters, as it's mostly irrelevant for me. So my questions are: is there a technical reason for this limitation? If not, is this limitation something that you are considering dropping in a future Airflow version? I suppose I could just make these fields templated to get around it, but I don't really want to turn on templating if I don't expect to need it, since I suppose it introduces the possibility of incorrect interpolation (though perhaps that's a remote possibility, because I don't think most of my variables will include {{ or }}).
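To make the limitation concrete, here is a toy model in plain Python. It is not Airflow's code: `OutputRef`, `ToyOperator`, and `resolve` are hypothetical names, and the sketch only assumes that lazy references are resolved in the same pass that renders the fields named in `template_fields`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputRef:
    """Hypothetical stand-in for a lazy reference like upstream_task.output['myvar']."""
    task_id: str
    key: str


class ToyOperator:
    # Only attributes named here are visited before execution (assumption of this sketch).
    template_fields = ("actual_input",)

    def __init__(self, task_id, actual_input=None, other_input=None):
        self.task_id = task_id
        self.actual_input = actual_input
        self.other_input = other_input

    def resolve(self, xcoms):
        """Swap OutputRef values for stored results, but only in template_fields."""
        for name in self.template_fields:
            value = getattr(self, name)
            if isinstance(value, OutputRef):
                setattr(self, name, xcoms[(value.task_id, value.key)])


# Pretend the upstream task already pushed a value under the key 'myvar'.
xcoms = {("upstream_task", "myvar"): 42}
ref = OutputRef("upstream_task", "myvar")

task = ToyOperator("this_task", actual_input=ref, other_input=ref)
task.resolve(xcoms)

print(task.actual_input)  # the resolved value
print(task.other_input)   # still the unresolved reference
```

In this model, `other_input` is never visited, so the reference reaches the operator unresolved. That is the behaviour the documentation sentence warns about.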

Let me know if I should file this as a feature request instead; for now I guess the Ideas category works.

uranusjr · 2022-10-26T04:32:10Z

I don’t think there’s a strict technical limitation, but a user perception consideration: not all fields can reference an XComArg (e.g. outlets and executor_config can’t), so even if the limitation is lifted, it still wouldn’t be all fields. Instead of trying to teach users which fields can reference dynamic values and which can’t, it’s easier to positively list the ones that can, and template_fields is an easy one to go with.

potiuk · 2022-10-27T00:10:41Z

Following the comment - I have a bold proposal ... It's not exactly what the original proposal is, but in a way it provides a possibility to do what was originally requested here.

Why don't we add an option (disabled by default) to make ALL ELIGIBLE fields "template_fields" (and automatically .output-capable)?

This has bothered me for a while, but I think making all fields templated has very little impact, and people have often complained about fields that are not templated. Performance overhead should be negligible (just walking through parameters and jinjafying them, which in most cases will be a no-op).

The only drawback it might have is that if a string accidentally contains " {{}}", it will be replaced with "", which is backwards-incompatible. We could also provide a mechanism that would exclude a field from being templated, just in case.

I think that has a number of benefits - for example, our users will not have to extend operators that miss some fields in "template_fields".

I am not too worried about "outlets" and executor_config not being available for .output, or about user education. As long as we simply error out in this case, that should be good.

In a way it would be similar to the render_template_as_native_obj DAG parameter.
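A rough sketch of what such an opt-in could look like, again as a toy model and not Airflow code. The names `template_all` and `exclude` are hypothetical, "eligible" is simplified to "any string attribute", and a small regex stands in for Jinja rendering.

```python
import re

# Toy stand-in for Jinja: matches {{ name }} placeholders only.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(text, context):
    """Replace each {{ name }} with context[name]."""
    return _PLACEHOLDER.sub(lambda m: str(context[m.group(1)]), text)


class ToyOperator:
    template_fields = ("command",)

    def __init__(self, command, label, raw_note, template_all=False, exclude=()):
        self.command = command
        self.label = label
        self.raw_note = raw_note
        self.template_all = template_all  # hypothetical opt-in, off by default
        self.exclude = tuple(exclude)     # hypothetical per-field escape hatch

    def fields_to_render(self):
        if not self.template_all:
            return list(self.template_fields)
        # Simplification: treat every string attribute as eligible.
        return [
            name
            for name, value in vars(self).items()
            if isinstance(value, str) and name not in self.exclude
        ]

    def render_fields(self, context):
        for name in self.fields_to_render():
            setattr(self, name, render(getattr(self, name), context))


context = {"ds": "2022-10-27"}
args = dict(command="echo {{ ds }}", label="run-{{ ds }}", raw_note="literal {{ ds }}")

op_default = ToyOperator(**args)
op_default.render_fields(context)

op_all = ToyOperator(**args, template_all=True, exclude=("raw_note",))
op_all.render_fields(context)

print(op_default.label)  # left untouched, because label is not in template_fields
print(op_all.label)      # rendered, because the opt-in is enabled
print(op_all.raw_note)   # left untouched, because it is explicitly excluded
```

With the flag off, behaviour is unchanged, which covers the backwards-compatibility concern. The exclude list handles strings that merely look like templates.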

rkarish · 2023-01-05T23:05:48Z

@potiuk Is this bold proposal still something you want to move forward with? I think it sounds interesting and I see how it can be useful to the Users. I can make an attempt at implementing the solution you have proposed.

potiuk · 2023-01-06T06:52:13Z

Sure. Why not. I have not seen any opposition so far, and I think it's worth doing - I also see no big drawback if it's going to be a similar setting to render_template_as_native_obj that you set for the whole DAG.

rkarish · 2023-01-19T09:00:29Z

Hey Jarek, I've opened a PR for this. I took a slightly different approach than a DAG level setting. I managed the functionality at the Operator level, but this can still be applied to the whole DAG with the use of default_args. I think that this approach gives the Users more flexibility. Looking forward to getting some feedback!
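As a rough illustration of why an operator-level setting can still cover a whole DAG, here is a toy model of default_args merging. It is not Airflow code: `ToyDag`, `ToyOperator`, and the flag name `template_all` are hypothetical, and the parameter name used in the actual PR is not stated in this thread.

```python
class ToyOperator:
    def __init__(self, task_id, template_all=False):
        self.task_id = task_id
        self.template_all = template_all  # hypothetical operator-level flag


class ToyDag:
    def __init__(self, default_args=None):
        self.default_args = dict(default_args or {})

    def make(self, operator_cls, **kwargs):
        # Explicit keyword arguments override the DAG-wide defaults.
        return operator_cls(**{**self.default_args, **kwargs})


dag = ToyDag(default_args={"template_all": True})

inherits_default = dag.make(ToyOperator, task_id="a")
opts_out = dag.make(ToyOperator, task_id="b", template_all=False)

print(inherits_default.template_all, opts_out.template_all)
```

Setting the flag once in default_args turns it on for every operator, and any single operator can still opt out.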

potiuk · 2023-01-19T18:33:39Z

Very nice. I asked others for input, but I like the way you proposed.

potiuk added the kind:feature and good first issue labels Oct 26, 2022

rkarish added a commit to rkarish/airflow that referenced this issue Jan 19, 2023

Add functionality for operators to template all eligible fields (apache#27285)

835e2d2

rkarish mentioned this issue Jan 19, 2023

Add functionality for operators to template all eligible fields (apac… #29034

Closed

rkarish added a commit to rkarish/airflow that referenced this issue Jan 20, 2023

Add functionality for operators to template all eligible fields (apache#27285)

7c9e0f7

rkarish added a commit to rkarish/airflow that referenced this issue Jan 20, 2023

Add new BaseOperator attributes to serialization (apache#27285)

455b01c

rkarish added a commit to rkarish/airflow that referenced this issue Jan 20, 2023

Merge branch 'apache#27285' of github.com:rkarish/airflow into apache#27285

bc3520b

rkarish added a commit to rkarish/airflow that referenced this issue Jan 20, 2023

Add functionality for operators to template all eligible fields (apache#27285)

a5e6b0f

rkarish added a commit to rkarish/airflow that referenced this issue Jan 20, 2023

Add new BaseOperator attributes to serialization (apache#27285)

639f224

rkarish added a commit to rkarish/airflow that referenced this issue Jan 20, 2023

Merge branch 'apache#27285' of github.com:rkarish/airflow into apache#27285

e46562a

rkarish added a commit to rkarish/airflow that referenced this issue Jan 20, 2023

Fix formatting (apache#27285)

74b4d0e

rkarish added a commit to rkarish/airflow that referenced this issue Jan 20, 2023

Add functionality for operators to template all eligible fields (apache#27285)

99c4427

rkarish added a commit to rkarish/airflow that referenced this issue Jan 20, 2023

Add new BaseOperator attributes to serialization (apache#27285)

196bf62

rkarish added a commit to rkarish/airflow that referenced this issue Jan 20, 2023

Fix formatting (apache#27285)

4f99923

rkarish added a commit to rkarish/airflow that referenced this issue Jan 21, 2023

Merge branch 'apache#27285' of github.com:rkarish/airflow into apache#27285

ecc7029

rkarish added a commit to rkarish/airflow that referenced this issue Jan 21, 2023

Update docstring and use MappedOperator instead of calling partial() for attributes (apache#27285)

b8e7ea1

rkarish added a commit to rkarish/airflow that referenced this issue Jan 21, 2023

Merge branch 'main' into apache#27285

c39f669
