Scheduler Memory Leak in Airflow 2.0.1 #14924

Closed
itispankajsingh opened this issue Mar 21, 2021 · 67 comments · Fixed by #18054
Labels
affected_version:2.0 (Issues Reported for 2.0) · area:Scheduler (including HA scheduler) · kind:bug (This is clearly a bug) · priority:critical (Showstopper bug that should be patched immediately)

Comments

@itispankajsingh

Apache Airflow version: 2.0.1

Kubernetes version (if you are using kubernetes) (use kubectl version): v1.17.4

Environment: Dev

  • OS (e.g. from /etc/os-release): RHEL7

What happened:

After running fine for some time, my Airflow tasks got stuck in the "scheduled" state with the below error in Task Instance Details:
"All dependencies are met but the task instance is not running. In most cases this just means that the task will probably be scheduled soon unless: - The scheduler is down or under heavy load If this task instance does not start soon please contact your Airflow administrator for assistance."

What you expected to happen:

I restarted the scheduler and then it started working fine. When I checked my metrics, I realized the scheduler has a memory leak: over the past 4 days it has reached up to 6 GB of memory utilization.

In 2.0+ we don't even have the run_duration config option to restart the scheduler periodically and work around this issue until it is resolved.

How to reproduce it:
I saw this issue in multiple dev instances of mine, all running Airflow 2.0.1 on Kubernetes with the KubernetesExecutor.
Below are the configs I changed from the defaults (see the airflow.cfg sketch after this list):
max_active_runs_per_dag=32
parallelism=64
dag_concurrency=32
sql_alchemy_pool_size=50
sql_alchemy_max_overflow=30
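For reference, a minimal airflow.cfg sketch with these overrides (an illustration only; it assumes the Airflow 2.0.1 layout, where all of these options live in the [core] section):

[core]
parallelism = 64
dag_concurrency = 32
max_active_runs_per_dag = 32
sql_alchemy_pool_size = 50
sql_alchemy_max_overflow = 30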

Anything else we need to know:

The scheduler memory leak occurs consistently in all instances I have been running; the scheduler's memory utilization keeps growing.

itispankajsingh added the kind:bug (This is clearly a bug) label Mar 21, 2021
@boring-cyborg

boring-cyborg bot commented Mar 21, 2021

Thanks for opening your first issue here! Be sure to follow the issue template!

itispankajsingh changed the title from "Scheduler Memory Leak in Airflow 2.0" to "Scheduler Memory Leak in Airflow 2.0.1" Mar 21, 2021
@xsu-git

xsu-git commented Mar 22, 2021

I have the same problem. I looked through all the channels and methods but could not solve it!
Thanks for raising this question; please let me know what the problem turns out to be.
I have to use a crontab script to restart the scheduler process regularly, but that is a poor workaround.
If it can't be solved, I can only fall back to 1.10.

@itispankajsingh
Author

This issue is also present in 1.10.*; however, in 2.0.* it is more severe, and we no longer have the run_duration option, so we have to deploy our own cron jobs to restart the scheduler regularly.
https://issues.apache.org/jira/browse/AIRFLOW-4593

@raphaelauv
Contributor

Until there is a fix or you find a specific cause, you could handle the OOM -> kubernetes/kubernetes#40157

There are a few initiatives 👍

@xsu-git

xsu-git commented Mar 22, 2021

Thanks for your comments.
It seems that the memory leak problem has not been solved yet.
I can only use a cron script to restart the process every hour.

vikramkoka added the priority:critical, affected_version:2.0, and area:Scheduler labels Mar 22, 2021
@uranusjr
Member

It seems like this is specific to the Kubernetes executor? It'd be awesome if you could confirm.

@xsu-git

xsu-git commented Mar 23, 2021

Not completely. I use the CeleryExecutor in my project and also have this problem.
After scheduling, VSZ and RSS keep growing.

@itispankajsingh
Author

One observation I have is that the rate of the memory leak increases with the number of DAGs (irrespective of whether they are being run). It definitely has something to do with the DAG parsing process.

@xsu-git

xsu-git commented Mar 23, 2021

Yep. In my production environment, with a small number of jobs, no problems appear at first,
but as more of the business is connected, the increase in the amount of work causes the scheduler process to die.

@uranusjr
Member

uranusjr commented Mar 24, 2021

Please keep this thread on topic with the scheduler memory issue. For usage questions, please open threads in Discussions instead.

@xsu-git

xsu-git commented Mar 31, 2021

In my production env I used:

linux release: Debian
airflow version: 2.0.1
executor = CeleryExecutor
max_active_runs_per_dag = 32
parallelism = 32
dag_concurrency = 16
sql_alchemy_pool_size = 16
sql_alchemy_max_overflow = 16

About 3 workers, 40 DAGs, 1000 tasks. Many tasks sometimes stay in the scheduled state and cannot start running.
When I use a cron script to restart the process every hour, the problem is solved.

@maison2710

I also got a similar issue with Airflow 2.0.1 when using the Kubernetes executor. Is there any update or timeline for this issue?

@brunoffaustino

brunoffaustino commented Aug 2, 2021

We are also facing this issue right now. Any news?
We currently use the CeleryExecutor, like @suxin1995
cc: @Cabeda @pmamarques

@CapBananoid

CapBananoid commented Sep 1, 2021

Hi,
if it can help, I have this kind of leak on 2.1.2 with the CeleryExecutor on Kubernetes.
The scheduler's memory increased by approximately 400 MB every day,
and I noticed a drop (from 2 GB to 500 MB) as soon as I deleted old logs in /opt/airflow/logs/scheduler (deleted all folders but the one pointed to by latest)...
I had 2 GB of logs and am now back to 300 MB, and I didn't need to restart the scheduler for that drop to occur.
I don't know if it's really related, but I think I will add a CronJob container on this volume to do this periodically.
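For illustration, a minimal Python sketch of the kind of cleanup such a periodic job could run (a sketch only, assuming the default /opt/airflow/logs/scheduler layout with one dated folder per day and a "latest" symlink; this is not the official clean-logs.sh script mentioned later in the thread):

import shutil
import time
from pathlib import Path

LOG_DIR = Path("/opt/airflow/logs/scheduler")
KEEP_DAYS = 3  # keep the most recent few days of scheduler logs

cutoff = time.time() - KEEP_DAYS * 86400
for entry in LOG_DIR.iterdir():
    # Never touch the "latest" symlink; only remove old dated directories.
    if entry.is_symlink() or not entry.is_dir():
        continue
    if entry.stat().st_mtime < cutoff:
        shutil.rmtree(entry)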

@potiuk
Member

potiuk commented Sep 1, 2021

That is a cool finding.

@suhanovv
Contributor

suhanovv commented Sep 2, 2021

We have the same problem:
Airflow 2.1.3 in a k8s cluster with CeleryExecutor.

15 days of logs give us
[memory usage graph]
~6 GB of memory usage, and ~1.7 GB after deleting 12 days of logs.

@potiuk
Member

potiuk commented Sep 2, 2021

A few questions.

Do you know which processes/containers keep the memory? Is it the scheduler (and which container)? Maybe you can see the breakdown per process as well? I understand this is whole-cluster memory, and I am trying to wrap my head around it and see where it can come from, because it is super weird behaviour to get memory back after deleting files (?).

Do you simply run "rm *" in "/opt/airflow/logs/scheduler" and it drops immediately after? Or is there some delay involved? Do you do anything else than rm? I understand you do not restart the scheduler. Can you check if maybe the scheduler restarted itself by coincidence (or triggered by the deletion)?

Maybe you can also see how many airflow-related processes you have when the scheduler runs? And maybe their number grows and then drops when you delete the logs?

@CapBananoid

CapBananoid commented Sep 2, 2021

I did nothing but a rm and it dropped almost immediately (sorry, the memory is reported back by Prometheus and there is a delay, but what I can tell you is that it dropped within 15 s after I did the rm).
I just did it on my dev instance, in fact; same result.
The container is the scheduler; I run a separate container for each service, and this is the one with the command "airflow scheduler -n -1".
I don't think the scheduler restarted by itself; if that had been the case, Kubernetes would have shown a failed container service and would have restarted it, and that is not the case.
I can tell you that the type of memory that is shown to grow is the one from the Prometheus metric called container_memory_cache.

@CapBananoid

CapBananoid commented Sep 2, 2021

Fun fact:
if I create new log folders (doing "cp -R 2021-09-01 XXX", for example), then memory rises 30 s afterwards.

@potiuk
Member

potiuk commented Sep 2, 2021

Ah right. The last line you wrote (container_memory_cache) is GOLD.

That probably would explain it and it's NOT AN ISSUE.

When you open many files, Linux will basically use as much memory as it can for file caches. Whenever you read or write a file, the disk blocks are also kept in memory just in case the file needs to be accessed again by any process. It marks them dirty when the blocks change and evicts such dirty blocks from memory. Also, when some process needs more memory than is available, the kernel will evict some unused pages from memory to free them. Basically, for any system that writes log files continuously and does not modify the logs later, the cache memory will grow CONTINUOUSLY until the limit set by the kernel configuration.

So depending on what your kernel configuration is (basically the kernel of your Kubernetes virtual machines under the hood), you will see the metric growing continuously (up to the kernel limit). You can limit the memory available to your scheduler container to cap it "per container" (by giving it fewer memory resources), but basically however much memory you give to the scheduler container will be used for cache after some time (and will not be explicitly freed - but that's not a problem, because the memory is effectively "free" - it's just used for cache and can be freed immediately when needed).

That would PERFECTLY explain why the memory drops immediately after the files are deleted - those files are deleted so the cache for those files should also get deleted by the system immediately.

Instead of looking at the total memory used, you should look at the container_memory_working_set_bytes metric. It reflects the "actively used" memory. You can read more here: https://blog.freshtracks.io/a-deep-dive-into-kubernetes-metrics-part-3-container-resource-metrics-361c5ee46e66

You can also test it by running (from https://linuxhint.com/clear_cache_linux/):

echo 1 > /proc/sys/vm/drop_caches

In the container. This should drop your caches immediately without deleting the files.
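As a complementary check, here is a minimal sketch (Linux only, assuming a reasonably recent kernel that reports MemAvailable) that reads /proc/meminfo, so you can watch Cached grow while MemAvailable stays healthy:

def read_meminfo(fields=("MemFree", "MemAvailable", "Buffers", "Cached")):
    # Return the selected /proc/meminfo counters, in kB.
    values = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, rest = line.split(":", 1)
            if key in fields:
                values[key] = int(rest.split()[0])
    return values


if __name__ == "__main__":
    # A growing "Cached" value with a stable "MemAvailable" points at the
    # page cache, not at a leaking process.
    print(read_meminfo())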

@potiuk
Member

potiuk commented Sep 2, 2021

Actually, one thing that might help to keep the "cache" memory down (though it has barely any consequences): do you happen to run any kind of automated log rotation? We have a "clean-logs.sh" script in the official image that can be run to clean the logs. This will have the side effect of freeing the Page Cache memory used by those files: https://github.com/apache/airflow/blob/main/scripts/in_container/prod/clean-logs.sh

@CapBananoid

CapBananoid commented Sep 2, 2021

I can easily set that up to launch with a CronJob, yes, but even if I understand the cache thing, I don't get why it would cache files it doesn't even need to look at (when I create a dummy folder in the logs folder).
I will also try implementing limits on the scheduler container to see if it does the trick.
The strange thing is that this memory increase is not seen with the web service.
And finally, what you are saying is that it doesn't relate to the memory leak topic, is that it? Or maybe the memory leak is a false flag for the freeze @itispankajsingh was concerned about.

@potiuk
Member

potiuk commented Sep 7, 2021

Ah cool. So at least we figured that one out. Then it should be no problem whatsoever. One thing we COULD do is potentially add a hint to the kernel to not add the log files to the cache, if this is indeed the Page Cache. There is no harm in general in this cache growing, but adding the hint might actually save us (and our users!) from diagnosing and investigating issues like this ;)

@potiuk
Member

potiuk commented Sep 7, 2021

@suhanovv -> this is the change you can try: #18054. While the ever-growing cache is not a problem, by implementing the advice to the kernel we can possibly avoid this cache growing in the first place.
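For illustration, a minimal sketch of the idea behind that change (a simplification assuming POSIX_FADV_DONTNEED is available, not the exact handler merged in #18054): extend logging.FileHandler so that, after each record is written, the kernel is advised that the log file's pages need not stay in the page cache.

import logging
import os


class NonCachingFileHandler(logging.FileHandler):
    """Illustrative handler: hint the kernel not to keep log pages cached."""

    def emit(self, record):
        super().emit(record)
        try:
            if self.stream is not None:
                # Flush Python's buffer, then advise the kernel that the pages
                # backing this log file will not be read again soon. Only
                # already written-back (clean) pages are dropped, so the hint
                # is best-effort.
                self.stream.flush()
                os.posix_fadvise(self.stream.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
        except (AttributeError, OSError):
            # os.posix_fadvise is POSIX-only; skip silently where unavailable.
            pass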

@lixiaoyong12

> We deployed the scheduler today and the memory increased from 100 MB to 220 MB.
>
> @lixiaoyong12 - what kind of memory are you talking about? Is it container_memory_working_set_bytes or container_memory_cache?

I deployed the scheduler directly on the Linux operating system.

@lixiaoyong12

> So I guess the quest continues. Hmm. Interesting that it indeed went down after some time. If that's the cache, then it would be strange for it to show up in container_memory_working_set_bytes (I presume the graph above is this?).
>
> I have another hypothesis. The Linux kernel also has "dentries" and "inode" caches - it keeps in memory the used/opened directory structure and file node information. And I believe those caches would also be cleared whenever the log files are deleted.
>
> If this is a cache, you can very easily check it - you can force cleaning of the caches and see the results:
>
> Cleaning just the PageCache:
>
> sync; echo 1 > /proc/sys/vm/drop_caches
>
> Cleaning dentries and inodes:
>
> sync; echo 2 > /proc/sys/vm/drop_caches
>
> Can you run such an experiment please?

sync; echo 1 > /proc/sys/vm/drop_caches -> It went down by about 40 MB, and there is still more than 200 MB.

@potiuk
Member

potiuk commented Sep 7, 2021

> I deployed the scheduler directly on the Linux operating system.

Still - you can see whether it's process or cache memory that grows:

For example, here you can see how to check the different types of memory used: https://phoenixnap.com/kb/linux-commands-check-memory-usage

Could you check what kind of memory is growing?

@lixiaoyong12

> We deployed the scheduler today and the memory increased from 100 MB to 220 MB.
>
> @lixiaoyong12 - what kind of memory are you talking about? Is it container_memory_working_set_bytes or container_memory_cache?
>
> I deployed the scheduler directly on the Linux operating system.

I used ps auxww | grep airflow at different times and found the memory increased from 100 MB to 220 MB.

@lixiaoyong12

> I deployed the scheduler directly on the Linux operating system.
>
> Still - you can see whether it's process or cache memory that grows:
>
> For example, here you can see how to check the different types of memory used: https://phoenixnap.com/kb/linux-commands-check-memory-usage
>
> Could you check what kind of memory is growing?

I used pmap -p 203557 | grep anon, and I found that the mapping 00007efdc9d0d000 115968K rw--- [ anon ] is the one that grows.

@potiuk
Member

potiuk commented Sep 7, 2021

Can you please dump a few pmap outputs at different times and share them in a .tar.gz or something, @lixiaoyong12? Without grep, so that we can see everything. Ideally over a timespan of a few hours, so that we can see that this is not a "temporary" fluctuation and can see the trend?

@potiuk
Member

potiuk commented Sep 7, 2021

Just to explain, @lixiaoyong12 -> when you have a number of different DAGs and schedules, I think - depending on frequency etc. - it would be perfectly normal for the scheduler to use more memory over time initially. Generally speaking, it should stabilize after some time and then fluctuate up/down depending on what is happening. That's why I want to make sure this is not such a fluctuation; it would also help if you could periodically run the cache cleanup and see whether the memory returns to more or less the same value after some time. That would be most helpful!
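For this kind of check, a minimal sketch (assuming the psutil package is installed; the PID below is the one from the pmap example above and should be replaced with your own scheduler PID) that samples the scheduler process RSS over time, which is the number to watch rather than the cache metrics:

import time
import psutil

SCHEDULER_PID = 203557  # replace with your scheduler's PID

proc = psutil.Process(SCHEDULER_PID)
while True:
    # memory_info().rss is the resident set size of the process itself,
    # independent of the kernel page cache discussed earlier in the thread.
    rss_mb = proc.memory_info().rss / (1024 * 1024)
    print(time.strftime("%Y-%m-%d %H:%M:%S"), f"rss={rss_mb:.1f} MiB")
    time.sleep(300)  # sample every 5 minutes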

potiuk added a commit to potiuk/airflow that referenced this issue Sep 7, 2021
Extends the standard python logging.FileHandler with advise to the
Kernel to not cache the file in PageCache when it is written. While
there is nothing wrong with such cache (it will be cleaned when memory
is needed), it causes ever-growing memory usage when scheduler is
running as it keeps on writing new log files and the files are not
rotated later on. This might lead to confusion for our users, who are
monitoring memory usage of Scheduler - without realising that it is
harmless and expected in this case.

Adding the advice to Kernel might help with not generating the cache
memory growth in the first place.

Closes: apache#14924
@potiuk
Member

potiuk commented Sep 7, 2021

I updated the fix in #18054 (hopefully it will be ok now) @suhanovv - in case you would like to try it. I will wait for it to pass the tests, but hopefully it is ok now (I mixed up os.open with open 🤦).

@suhanovv
Contributor

suhanovv commented Sep 8, 2021

@potiuk
Ok, we will deploy it to the test environment today.

@suhanovv
Contributor

suhanovv commented Sep 8, 2021

@potiuk
The last fix works as it should:

[memory usage graph]

@potiuk
Member

potiuk commented Sep 8, 2021

🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉

@potiuk
Member

potiuk commented Sep 8, 2021

Thanks a lot! That might really help with user confusion!

potiuk added a commit that referenced this issue Sep 9, 2021
* Advises the kernel to not cache log files generated by Airflow

Extends the standard python logging.FileHandler with advise to the
Kernel to not cache the file in PageCache when it is written. While
there is nothing wrong with such cache (it will be cleaned when memory
is needed), it causes ever-growing memory usage when scheduler is
running as it keeps on writing new log files and the files are not
rotated later on. This might lead to confusion for our users, who are
monitoring memory usage of Scheduler - without realising that it is
harmless and expected in this case.

Adding the advice to Kernel might help with not generating the cache
memory growth in the first place.

Closes: #14924
kaxil pushed a commit that referenced this issue Sep 15, 2021
* Advises the kernel to not cache log files generated by Airflow

Extends the standard python logging.FileHandler with advise to the
Kernel to not cache the file in PageCache when it is written. While
there is nothing wrong with such cache (it will be cleaned when memory
is needed), it causes ever-growing memory usage when scheduler is
running as it keeps on writing new log files and the files are not
rotated later on. This might lead to confusion for our users, who are
monitoring memory usage of Scheduler - without realising that it is
harmless and expected in this case.

Adding the advice to Kernel might help with not generating the cache
memory growth in the first place.

Closes: #14924
(cherry picked from commit 43f595f)
@sawaca96

sawaca96 commented Jun 27, 2022

@potiuk

I use Helm chart 1.6.0 and Airflow 2.2.5.

[memory usage graphs for the scheduler and triggerer]

Why does the memory continuously increase? Both the scheduler and the triggerer, but not the webserver.

@potiuk
Member

potiuk commented Jun 27, 2022

What kind of memory is it? See the whole thread. There are different kinds of memory, and you might be observing cache memory growth for whatever reason.

Depending on the type of memory, it might or might not be a problem. But you need to investigate it in detail. No one is able to diagnose it without you investigating it based on this thread.

The thread has all the relevant information. You need to see what process is leaking - whether it is Airflow, the system, or some other process.

BTW, I suggest you open a new discussion with all the details. There is little value in commenting on a closed issue. Remember also that this is a free forum where people help when they can, and their help is much more efficient if you give all the information and show that you've done your part. There are also companies offering paid Airflow support, and they can likely do the investigation for you.
