Dynamic Tasks in Airflow | parallel task execution with Celery Executor #35769
Unanswered
victor-rayan asked this question in Q&A
I'm seeking assistance in setting up dynamic tasks within my Airflow workflow. The aim is to read a very large dataset from an API.
I attempted an approach similar to the following:
However, due to the substantial volume of data I'm dealing with, this approach hasn’t been successful. I'm exploring alternative methods to efficiently manage this large dataset. Specifically, I'm considering a strategy where I read data from the CSV file incrementally to mitigate memory issues.
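As a rough sketch of the incremental read described above, the following uses only the Python standard library to yield the CSV in fixed-size batches, so that only one batch is held in memory at a time. The function name `iter_csv_chunks` and the default batch size are assumptions made for illustration, not something from my current DAG.

```python
import csv
from itertools import islice
from typing import Iterator


def iter_csv_chunks(path: str, chunk_size: int = 10_000) -> Iterator[list[dict]]:
    """Yield lists of at most `chunk_size` rows from a CSV file with a header row.

    Hypothetical helper: only the current chunk is materialised in memory.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        while True:
            # islice pulls the next `chunk_size` rows from the shared reader.
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            yield chunk
```

Each yielded batch could then be processed and discarded before the next one is read, for example `for batch in iter_csv_chunks("data.csv", 5_000): ...`, where `data.csv` is a placeholder path.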
I'm looking for insights, best practices, or alternative techniques for processing large datasets within Airflow. Any guidance on how to implement dynamic tasks (parallel task execution with the Celery executor) to process extensive CSV datasets in manageable portions would be highly appreciated.
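To make the idea concrete, here is a minimal sketch of what processing the CSV "in manageable portions" could look like with Airflow's dynamic task mapping (`.expand()`, available since Airflow 2.3). One task plans row ranges and one mapped task instance runs per range, which the Celery executor can then distribute across workers. The names `CSV_PATH`, `CHUNK_SIZE`, `plan_chunks`, `count_rows`, and `read_slice` are all hypothetical, and the sketch assumes every Celery worker can read the same file path (for example via shared storage). The Airflow import is guarded so the helper functions can be tried without Airflow installed.

```python
import csv
from datetime import datetime
from itertools import islice

CSV_PATH = "/data/input.csv"  # hypothetical path, must be readable by every worker
CHUNK_SIZE = 50_000           # hypothetical rows per mapped task


def plan_chunks(total_rows: int, chunk_size: int) -> list[dict]:
    """Split `total_rows` into small {"offset", "limit"} descriptors."""
    return [
        {"offset": start, "limit": min(chunk_size, total_rows - start)}
        for start in range(0, total_rows, chunk_size)
    ]


def count_rows(path: str) -> int:
    """Count data rows by streaming the file, without loading it into memory."""
    with open(path, newline="") as handle:
        return sum(1 for _ in csv.DictReader(handle))


def read_slice(path: str, offset: int, limit: int) -> list[dict]:
    """Read only rows [offset, offset + limit); earlier rows are skipped, not stored."""
    with open(path, newline="") as handle:
        return list(islice(csv.DictReader(handle), offset, offset + limit))


try:
    from airflow.decorators import dag, task
except ImportError:  # lets the helpers above run where Airflow is absent
    dag = task = None

if dag is not None:

    @dag(schedule=None, start_date=datetime(2023, 1, 1), catchup=False)
    def chunked_csv_pipeline():
        @task
        def plan() -> list[dict]:
            # Only the small descriptors go through XCom, never the rows.
            return plan_chunks(count_rows(CSV_PATH), CHUNK_SIZE)

        @task
        def process(bounds: dict) -> int:
            rows = read_slice(CSV_PATH, bounds["offset"], bounds["limit"])
            # Placeholder for the real per-chunk work.
            return len(rows)

        # One mapped task instance is created per descriptor returned by plan().
        process.expand(bounds=plan())

    chunked_csv_pipeline()
```

In this sketch only the offsets travel through XCom, so the data itself never passes through the metadata database. Two caveats: each mapped task still scans the file from the start to reach its offset, and the number of mapped instances is capped by the `[core] max_map_length` setting (1024 by default), so the chunk size would need to be large enough to stay under that limit.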
Thank you sincerely for your support and expertise!