Replies: 1 comment 1 reply
What version of Airflow do you use? Could you also elaborate a bit more: what kind of deployment you have, what your configuration and providers are, what kind of DAGs you have, and what the sizes of the DAGs stored in the serialized DAG table are? The scheduler generally parses the DAGs continuously and uses the serialized version of the DAG (SerializedDag) to perform all kinds of operations, for example scheduling. As such, it will periodically load the SerializedDag into memory, and if you have a huge DAG with thousands of tasks, it has to be loaded into memory fully in order to make scheduling decisions. I wonder how big your DAG files are? And do you see this growth continuing forever? It would be great to see a few flamegraphs, taken at different times while the scheduler runs, to see if there is constant growth coming from a single place.
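To get a rough feel for how much memory decoding a large serialized document can take, here is a minimal, standard-library-only sketch. It does not touch Airflow at all: `build_fake_serialized_dag` and `measure_decode` are hypothetical helper names, and the JSON layout is an invented stand-in, not Airflow's actual serialization schema. It only compares the size of a JSON payload with the peak memory that `json.loads` allocates while decoding it, which is the kind of number that would be useful to report alongside the flamegraphs.

```python
import json
import tracemalloc


def build_fake_serialized_dag(num_tasks):
    """Build a JSON string loosely shaped like a DAG with many tasks.

    The schema here is invented for illustration; it is NOT Airflow's format.
    """
    tasks = [
        {
            "task_id": f"task_{i}",
            "operator": "ExampleOperator",
            "params": {"index": i, "command": "echo hello " * 5},
            "downstream_task_ids": [f"task_{i + 1}"] if i + 1 < num_tasks else [],
        }
        for i in range(num_tasks)
    ]
    return json.dumps({"dag_id": "example", "tasks": tasks})


def measure_decode(payload):
    """Return (payload size in bytes, peak bytes traced while decoding)."""
    tracemalloc.start()
    try:
        json.loads(payload)
        # get_traced_memory() returns (current, peak) since tracing started.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return len(payload.encode("utf-8")), peak


if __name__ == "__main__":
    for n in (100, 1000, 5000):
        size, peak = measure_decode(build_fake_serialized_dag(n))
        print(f"{n} tasks: payload {size} bytes, peak while decoding {peak} bytes")
```

The same measurement could be applied to a real serialized payload exported from your metadata database, assuming you can dump it to a string; how you export it depends on your Airflow version and database, so that part is left out here.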
I have a situation where a large number (thousands) of tasks get parsed and serialized ahead of time, and eventually my scheduler dies with OOM. Strangely, the memory goes to JSON decoding, which takes around a gigabyte of RAM within 30 minutes of startup. Flamegraph from memray here: https://a3s.fi/hardwick-2006633-flip/memray-flamegraph-memray.html.
Is there a way to postpone parsing all these tasks instead of doing it ahead of time?