[AIRFLOW-XXX] GSoD: How to make DAGs production ready #6515
@KKcorps I think one important thing to mention in this document is database access at parse time, and especially avoiding Airflow Variables at the top level of DAG files (they are still fine to use in the "execute" method).
This is a known problem that people complain about a lot: the scheduler opens and closes many connections to the database, because every time the file is parsed and a Variable is reached, a database connection is opened and a query is executed.
There are lots of examples around that use Variables in DAGs, but this is not really good practice, and I think this is the perfect place to mention it.
I believe environment variables are a better way to share common configuration.
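As a rough illustration of the environment-variable approach, a DAG file could read its shared settings like this. It is only a sketch: the names `MY_DAG_SOURCE_BUCKET`, `MY_DAG_RETRIES`, and `load_dag_config` are hypothetical and not part of Airflow.

```python
import os


def load_dag_config(environ=os.environ):
    """Read shared settings from the process environment.

    This runs every time the scheduler parses the file, but it never
    touches the Airflow metadata database.
    The variable names below are hypothetical, chosen for illustration.
    """
    return {
        "source_bucket": environ.get("MY_DAG_SOURCE_BUCKET", "example-bucket"),
        "retries": int(environ.get("MY_DAG_RETRIES", "2")),
    }


# Safe at module level: no database connection is opened here.
CONFIG = load_dag_config()
```

The values in `CONFIG` can then be passed to the DAG and its operators as ordinary Python values, so parsing the file costs no database queries.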
Great @KKcorps. It would also be good to search the existing documentation and check that there are no contradictions between the practices and the examples (I believe there are a few places where those bad practices are shown as examples :). And we can fix them together.
Agree. So using lots of Airflow variables in the file outside of task code will cause many DB connections.
It is fine to use them in a deferred way, in a Jinja-templated field. For configuration needed outside task code, use environment variables.
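The two deferred styles might look roughly like this. It is a sketch under assumptions: the Variable key `source_bucket` and the function name `print_bucket` are made up for illustration, and the operators that would consume these values are left out.

```python
# Deferred via Jinja: this is only a string at parse time. Airflow renders
# the {{ var.value.<key> }} expression when the task runs, so the Variable
# lookup happens then, not on every parse of the file.
# "source_bucket" is a hypothetical Variable key.
TEMPLATED_COMMAND = "echo 'bucket: {{ var.value.source_bucket }}'"


def print_bucket():
    """Task callable: the Variable is read only when the task executes."""
    # Imported and called inside the function body, so nothing here
    # runs while the scheduler is parsing the DAG file.
    from airflow.models import Variable

    print(Variable.get("source_bucket"))
```

`TEMPLATED_COMMAND` would be passed to a templated field such as `bash_command` on a `BashOperator`, and `print_bucket` to a `PythonOperator` as its `python_callable`.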