Dynamic task mapping zip() iterates unexpected number of times #26499
Comments
As per my debugging I guess the airflow/airflow/models/xcom_arg.py Line 455 in 4c33f6b
airflow/tests/models/test_xcom_arg.py Lines 185 to 225 in 4c33f6b
|
Hello, I am new here, but I have been wanting to contribute to this project. I do not want to create a random pull request, so I am showing my changes here. I did run the pre-commit and unit test shown above. I was not able to run the DAG above in Docker, however on macOS. My thought on this is that there may be different instances of NOTSET and, therefore, comparison is not working. It compares against the object, not a value. In order to solve this, perhaps testing against the class ArgNotSet would be more reliable. I would create a PR, but I would like to test against the failed use case and do not want to violate the contribution decorum.
|
@rjmcginness Your patch would break @tirkarthi I would expect the resulting value to contain |
@uranusjr Thank you. I was thinking that fillvalue would remain the same. It still receives an instance of NOTSET or a value. My code changes the test against the class, rather than against any instance of ArgNotSet. I did test this against the use case successfully. It seems strange that there would be multiple instances of NOTSET. Serialization/deserialization makes sense, as this may lead to creation of new instances. What was your thought on breaking fillvalue? |
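The hypothesis discussed in the comments above (an identity check against a sentinel stops matching once a second instance of the sentinel class exists) can be illustrated with a minimal sketch in plain Python. The names `ArgNotSetDemo`, `NOTSET_DEMO`, and both helper functions are hypothetical stand-ins, not Airflow's actual code, and `copy.deepcopy` is used only to simulate whatever round trip might produce a second instance.

```python
import copy


class ArgNotSetDemo:
    """Hypothetical stand-in for a 'not set' sentinel class (not Airflow's own)."""


# Module-level sentinel instance, analogous in spirit to a NOTSET constant.
NOTSET_DEMO = ArgNotSetDemo()


def is_unset_by_identity(value):
    # Matches only the exact module-level object.
    return value is NOTSET_DEMO


def is_unset_by_type(value):
    # Matches any instance of the sentinel class.
    return isinstance(value, ArgNotSetDemo)


# Assumption: a serialization round trip yields a new instance of the class.
# deepcopy is used here purely to simulate that effect.
round_tripped = copy.deepcopy(NOTSET_DEMO)

print(is_unset_by_identity(NOTSET_DEMO))    # True
print(is_unset_by_identity(round_tripped))  # False
print(is_unset_by_type(round_tripped))      # True
```

Under that assumption, a check against the class still recognizes the copied sentinel, while the identity check treats it as a real value. Whether this is what actually happens in the failing case is not confirmed in the thread.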
Ah, I missed the |
I did. I asked on Slack where the DAG file should go. I reproduced the error in v2.4.0. Then I ran it with my new code and received the expected output. I committed the changes to my forked repository, but have not made a pull request yet. |
A pull request would be a good idea then. Maybe it’d be easier to figure out what exactly triggered the error if the full diff is presented. |
@uranusjr Ok. Thank you. I have to look how to do the rebasing, then I will request the pull. I appreciate your help. |
Apache Airflow version
2.4.0
What happened
When running zip() with different-length lists, I get an unexpected result:
Iterates over [("a", 1), ("b", 2), ("c", 3), ("a", 1)], so it iterates for the length of the longest collection, but restarts iterating elements when reaching the length of the shortest collection.
I would expect it to behave like Python's builtin zip and iterate for the length of the shortest collection, so three times in the example above, i.e. [("a", 1), ("b", 2), ("c", 3)].
Additionally, I went digging in the source code and found the fillvalue argument, which works as expected:
Iterates over [("a", 1), ("b", 2), ("c", 3), ("foo", 4)].
However, with fillvalue not set, I would expect it to iterate only for the length of the shortest collection.
What you think should happen instead
I expect zip() to iterate over the number of elements of the shortest collection (without fillvalue set).
How to reproduce
See above.
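For reference, the expected semantics described above can be shown with plain Python, independent of Airflow. This is a minimal sketch: the input lists `letters` and `numbers` are assumptions inferred from the outputs quoted in this report, not the original DAG code.

```python
from itertools import zip_longest

# Assumed inputs, inferred from the outputs quoted in the report.
letters = ["a", "b", "c"]
numbers = [1, 2, 3, 4]

# Builtin zip stops at the shortest input.
shortest = list(zip(letters, numbers))

# zip_longest pads the shorter input with fillvalue.
padded = list(zip_longest(letters, numbers, fillvalue="foo"))

print(shortest)  # [('a', 1), ('b', 2), ('c', 3)]
print(padded)    # [('a', 1), ('b', 2), ('c', 3), ('foo', 4)]
```

The first result is what the report expects when fillvalue is not set; the second matches the behaviour reported as working when fillvalue is given.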
Operating System
macOS
Versions of Apache Airflow Providers
No response
Deployment
Other
Deployment details
OSS Airflow
Anything else
No response
Are you willing to submit PR?
Code of Conduct