Add recursive flag to glob in filesystem sensor #16894
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
@potiuk should these checks be failing? I didn't think this change would affect integrations with other components. The exit code is |
Yep. We have some instabilities with GitHub runners and memory. It is a constant struggle, but it seems to have intensified a bit recently.
I think that, while this change initially looks fully backwards compatible, in fact it is not. If someone already had a nested directory structure in their folder, this change might detect files that were skipped before.
I think a better (and fully backwards-compatible) solution would be to add an optional recursive flag with default = False to the sensor constructor. This way you would not even have to add any explanation to UPDATING.md.
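The compatibility concern above can be illustrated with the standard library alone. The following is a minimal sketch, not the sensor's code: the helper names `build_tree` and `matches` and the file layout are made up for illustration. It shows that a `**` pattern passed to `glob.glob` starts matching files in subdirectories only once `recursive=True` is set.

```python
import glob
import os
import tempfile


def build_tree(root):
    """Create root/top.txt and root/sub/nested.txt (hypothetical layout)."""
    top = os.path.join(root, "top.txt")
    nested = os.path.join(root, "sub", "nested.txt")
    os.makedirs(os.path.dirname(nested))
    for path in (top, nested):
        with open(path, "w") as handle:
            handle.write("x")
    return top, nested


def matches(root, recursive):
    """Return the sorted glob matches for '<root>/**'."""
    pattern = os.path.join(root, "**")
    return sorted(glob.glob(pattern, recursive=recursive))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        top, nested = build_tree(root)
        # Without recursive=True, '**' behaves like '*': only direct children match.
        print(nested in matches(root, recursive=False))
        # With recursive=True, files in subdirectories match as well.
        print(nested in matches(root, recursive=True))
```

A sensor that switched to `recursive=True` unconditionally could therefore start reporting success on files it previously ignored, which is why an opt-in flag is the safer choice.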
@potiuk Thanks for the feedback! Let me know if this looks OK now
airflow/sensors/filesystem.py
Outdated
:param recursive: when set True, enables recursive directory matching behavior of
    `**` in glob filepath parameter. Defaults to False.
An example would be nice too
Do you mean an example of invocation like below? Or an example scenario where this would be used like #16725?
FileSensor(
    task_id='test',
    filepath=temp_dir + "/**",
    fs_conn_id='fs_default',
    dag=self.dag,
    timeout=0,
    poke_interval=1,
    recursive=True,
)
Also, would this example go to the same place? The docstring for the class?
An example in airflow/example_dags.
It was missing for filesystem_sensor, but it would be nice to add one since we are changing it.
- :param recursive: when set True, enables recursive directory matching behavior of
-     `**` in glob filepath parameter. Defaults to False.
+ :param recursive: when set to ``True``, enables recursive directory matching behavior of
+     ``**`` in glob filepath parameter. Defaults to ``False``.
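To make the documented default concrete, here is a hedged sketch of an opt-in `recursive` argument on a file check. The function `any_file_matches` is hypothetical and is not the actual `FileSensor` implementation; it only forwards the flag to `glob.glob` and reports whether any match is a regular file.

```python
import glob
import os
import tempfile


def any_file_matches(filepath, recursive=False):
    """Return True if the glob pattern matches at least one regular file.

    Hypothetical helper: ``recursive`` defaults to False, so existing callers
    keep the old non-recursive matching unless they opt in.
    """
    for path in glob.glob(filepath, recursive=recursive):
        if os.path.isfile(path):
            return True
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Place a single file two directory levels below the root.
        deep_dir = os.path.join(root, "a", "b")
        os.makedirs(deep_dir)
        with open(os.path.join(deep_dir, "data.txt"), "w") as handle:
            handle.write("x")

        pattern = os.path.join(root, "**", "*.txt")
        print(any_file_matches(pattern))                  # default: non-recursive
        print(any_file_matches(pattern, recursive=True))  # opt-in: recursive
```

With the default, `**` matches exactly one directory level, so the file under `a/b` is not found; passing `recursive=True` lets `**` span any number of levels.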
Feel free to do that in a follow-up PR to add an example DAG
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease. |
@potiuk mind taking another look, since you had requested changes? :)
Fine for me :) |
Awesome work, congrats on your first merged pull request! |
This PR aims to fix apache#16725 by adding the `recursive` flag to `glob` in the filesystem sensor. closes: apache#16725
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.