Ensure DAG-level references are filled on unmap #33083

Merged

@uranusjr uranusjr commented Aug 3, 2023

Previously, the SerializedOperator produced by unmapping a serialized mapped operator was missing some DAG-level references. Those references were established separately in BaseSerialization and were overlooked when the unmap() function was implemented. This PR extracts that post-population reference-fixing code into a function and calls it as needed in unmap(), so an unmapped SerializedOperator is consistent with a non-mapped SerializedOperator that comes straight out of the database.

This was not an issue prior to 2.6, since the scheduler mostly did not access DAG-level references on a serialized operator (mapped or not). The introduction of the fail_fast flag requires the scheduler to access the DAG much later in a task's lifetime, so the references must be properly set.
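To illustrate the shape of the change, here is a minimal, self-contained sketch of the pattern described above. It is not Airflow's code: `ToyDag`, `ToyOperator`, `ToyMappedOperator`, `set_dag_level_refs` and `deserialize_dag` are hypothetical stand-ins, and the real serialization classes carry far more state. The point is only that the reference-fixing step lives in one function that both the deserialization path and `unmap()` call.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class ToyDag:
    """Hypothetical stand-in for a deserialized DAG."""
    dag_id: str
    fail_fast: bool = False
    tasks: dict[str, object] = field(default_factory=dict)


@dataclass(eq=False)
class ToyOperator:
    """Hypothetical stand-in for a non-mapped serialized operator."""
    task_id: str
    dag: ToyDag | None = None  # DAG-level reference, filled after population


def set_dag_level_refs(op, dag: ToyDag | None) -> None:
    """Shared post-population step: attach DAG-level references to an operator."""
    op.dag = dag


@dataclass(eq=False)
class ToyMappedOperator:
    """Hypothetical stand-in for a serialized mapped operator."""
    task_id: str
    dag: ToyDag | None = None

    def unmap(self) -> ToyOperator:
        op = ToyOperator(task_id=self.task_id)
        # The analogue of this call is what the PR adds: without it,
        # the unmapped operator would come back with op.dag left unset.
        set_dag_level_refs(op, self.dag)
        return op


def deserialize_dag(data: dict) -> ToyDag:
    """Populate operators first, then fix up DAG-level references."""
    dag = ToyDag(dag_id=data["dag_id"], fail_fast=data.get("fail_fast", False))
    for spec in data["tasks"]:
        cls = ToyMappedOperator if spec.get("mapped") else ToyOperator
        op = cls(task_id=spec["task_id"])
        set_dag_level_refs(op, dag)  # same helper as in unmap()
        dag.tasks[op.task_id] = op
    return dag


dag = deserialize_dag(
    {
        "dag_id": "example",
        "fail_fast": True,
        "tasks": [{"task_id": "plain"}, {"task_id": "expanded", "mapped": True}],
    }
)
unmapped = dag.tasks["expanded"].unmap()
# Code that runs late in the task's lifetime can now reach DAG-level settings.
print(unmapped.dag.fail_fast)
```

With the shared helper in place, an unmapped operator and a directly deserialized one expose the same DAG reference, which is the consistency the description asks for.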

@uranusjr uranusjr force-pushed the serialized-mapped-operator-unmap-dag branch from c44021e to 812f81b Compare August 3, 2023 19:36
Review comment on airflow/models/mappedoperator.py (outdated, resolved)
@uranusjr uranusjr merged commit bcfadcf into apache:main Aug 4, 2023
42 checks passed
@uranusjr uranusjr deleted the serialized-mapped-operator-unmap-dag branch August 4, 2023 04:40
@ephraimbuddy ephraimbuddy added the type:improvement Changelog: Improvements label Aug 4, 2023
@ephraimbuddy ephraimbuddy added this to the Airflow 2.7.0 milestone Aug 4, 2023
ephraimbuddy pushed a commit that referenced this pull request Aug 4, 2023
Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
(cherry picked from commit bcfadcf)