:pyget_file_list <astro.files.get_file_list>

When to use the get_file_list operator

You can use get_file_list to retrieve a list of the files available at a storage path, using an Airflow connection for access. Because the result reflects whatever files are currently present at that path, you can use it to generate tasks dynamically.
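As a rough illustration of the idea, the following plain-Python sketch lists the files under a path and performs one unit of work per listed file. It does not use Airflow or the Astro SDK: `list_files`, `load_one`, and `demo` are hypothetical names made up for this sketch, and a temporary local directory stands in for the storage path.

```python
import tempfile
from pathlib import Path


def list_files(base_path: str, pattern: str = "*.csv") -> list[str]:
    """Hypothetical stand-in for a file-listing step: return matching paths, sorted."""
    return sorted(str(p) for p in Path(base_path).glob(pattern))


def load_one(file_path: str) -> str:
    """Hypothetical per-file work; a real pipeline would load the file into a table."""
    return f"loaded {Path(file_path).name}"


def demo() -> list[str]:
    # Create a throwaway directory with two CSV files and one unrelated file.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.csv", "b.csv", "notes.txt"):
            (Path(tmp) / name).write_text("id\n1\n")
        # One unit of work per listed file, analogous to one mapped task per file.
        return [load_one(f) for f in list_files(tmp)]


if __name__ == "__main__":
    print(demo())
```

The number of work items is not fixed in the code; it depends on how many files match at run time, which is the property that dynamic task generation relies on.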

The supported filesystems are file_location

Warning

Fetching a large number of files with this method can overload XCom. It can also create a large number of parallel tasks when the result is passed to the expand method of dynamic task mapping.
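One possible way to limit the number of parallel tasks is to group the listed files into fixed-size batches and map over the batches instead of over individual files. The sketch below is plain Python and is not part of the Astro SDK: `batch_files` is a hypothetical helper and the `gs://example-bucket/...` paths are invented for illustration. Batching reduces the number of mapped tasks only; it does not shrink the file list that is stored in XCom.

```python
from typing import Sequence


def batch_files(files: Sequence[str], batch_size: int) -> list[list[str]]:
    """Hypothetical helper: split a file list into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(files[i:i + batch_size]) for i in range(0, len(files), batch_size)]


# Invented example paths; a real list would come from the file-listing step.
files = [f"gs://example-bucket/data_{i}.csv" for i in range(10)]

# Ten files in batches of four give three work items instead of ten.
batches = batch_files(files, batch_size=4)

if __name__ == "__main__":
    for batch in batches:
        print(len(batch), batch[0])
```

With this approach each mapped task would need to handle a list of files rather than a single file, so it only fits when the downstream task can be written that way.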

The following example retrieves a file list from a GCS bucket and uses expand to dynamically generate one task per listed file, each of which loads its file into a BigQuery table.

../../../../example_dags/example_dynamic_task_template.py

  • :external+airflowDynamic task mapping - Apache Airflow <authoring-and-scheduling/dynamic-task-mapping>
  • Dynamic tasks - Astronomer