Homeassistant container memory leak - Out of Memory #93713
Comments
After circa 18 hours my memory consumption looks like:
|
After circa 30 hours my memory consumption looks like:
|
I'd start with |
OK, thank you. I did as you suggested, the log is attached. Thank you for any help. |
Something is creating a massive amount of tasks |
It's good that it's asyncio, since it means a callgrind might show it. Can you generate one with the |
Thank you for help, please find it attached - for 180s, if needed I can produce longer. |
It's not immediately obvious where the task leak is coming from, which means it's likely a "slow" leak. Leaks are always easier to find when they run the system out of memory quickly, as you can see the abnormal growth right away. The next step would be to disable the custom components below and see if the problem goes away.
If that turns out to be the cause, we would then need to dig through the code of the one causing it to look for the leak |
It is not that slow a leak... I am running HA on a server with 16GB RAM, and HA was able to grow from ~300MB to 4GB in about two days. Should I keep running the profiler for longer? Does your reply mean that there was nothing obvious in the callgrind? I am doing what you wrote anyway (testing the integrations one by one), but I need at least half of the integrations for the basic functions of my household, so I can't simply switch them off. Therefore it would be really nice if there were something in the logs to at least narrow my search. Thank you. |
It's likely in the profile, but I can't find it because it's so small. It's slow in the sense that during the time of the profile it only grows a little. When I say fast leak, I mean something that will crash in a few minutes. Those are easy to trace to the source because they show up in the profile by sorting. You could try running the profile for 30m to see if it's large enough to correlate the source. Keep in mind the profile has all the calls, so it's looking for a needle in a haystack. Anything we can do to make the leak faster or bigger will make it easier to find |
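For readers who want to try the "sort the profile" step themselves, here is a minimal sketch using only the standard library's `cProfile` and `pstats`. It is not the tooling used in this thread: `busy_work`, `top_functions`, and `profile_busy_work` are hypothetical names, and the profile is generated in-process purely for illustration.

```python
import cProfile
import io
import pstats


def busy_work(n: int) -> int:
    """Hypothetical hot function standing in for whatever dominates a profile."""
    return sum(i * i for i in range(n))


def top_functions(profile: cProfile.Profile, sort_key: str = "cumulative", limit: int = 5) -> str:
    """Return the top `limit` entries of a profile, sorted by `sort_key`."""
    buffer = io.StringIO()
    stats = pstats.Stats(profile, stream=buffer)
    stats.sort_stats(sort_key).print_stats(limit)
    return buffer.getvalue()


def profile_busy_work() -> str:
    """Profile the hypothetical workload and return the sorted report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    busy_work(200_000)
    profiler.disable()
    return top_functions(profiler)


if __name__ == "__main__":
    print(profile_busy_work())
```

`pstats.Stats` also accepts a filename, so if a saved profile is in the standard cProfile format, the same sorting should apply to it. With a slow leak the offending call may still sit far down the list, which is the needle-in-a-haystack problem described above.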
I ran the profiler for 1800s, but strangely the size of the files is similar. Anyway, I hope it is helpful. Thank you. |
Hello,
So the messagemanager method does not seem to be in the custom integrations either. I even ran grep in the Homeassistant and Hassio root directories, but nothing was found. Are you sure about the name (i.e. messagemanager)? I did not find it in the diagrams you attached either. BTW: is there any HOWTO one can follow to get similar output from the profiler? It would be helpful for users to try to identify faulty code themselves. |
@bdraco Can you please check the "messagemanager" method/class name you mentioned? Maybe I am doing it wrong, but I was not able to find it in the custom integrations either. |
Just to chime in here, I'm not using any of these integrations:
I'm also experiencing a memory leak that results in the home-assistant container getting killed (by the OOM killer) every 5-6 days. Quite annoying :) One of the pictures shows when the container got killed, and the other shows free memory getting steadily lower. |
@bdraco I am really sorry, but I have politely asked you several times during the past weeks a simple question regarding your original findings, i.e. where exactly you found "messagemanager" in the trace, and why I can't grep anything with that name in the HA source code, including custom integrations. I need this information to continue with my troubleshooting. Do I assume correctly from your silence that I should not expect an answer at all? |
Day job has me traveling quite a bit. I’m about 150 GitHub notifications behind |
So, I just put in an issue with HAOS because I'm also experiencing OOM reboots. My memory allocation graph is even steeper and my machine reboots more often. #2595. Is this more likely to be an HA than an HAOS issue? I'm going to try to downgrade to HAOS 9.5 and see if I still have leaks. |
I see the same on my instance. It seemed to start after updating HAOS or 2023.6 beta (coming from a dev build in the early stage of 2023.5, so it could be also related to a 2023.5 specific version). |
I observed the memory leak on 2023.5. There was a question about why my graph of memory usage is not as steep as for others. Well, I run HA on a server and I currently have 64GB of RAM, so it takes longer to consume it. (I extended my memory as an interim workaround, as my original 16GB of RAM was exhausted in just 2-3 days.) |
If you have a memory leak, please post a start log objects log (see above) and a callgrind. "Me too" comments are not helpful, as they don't add any information that gets us closer to tracking down a leak. |
callgrind_start_log_objects.zip See the attached file. If you need more or longer logs let me know. |
@VinceRMP Can you run the start log objects for 30 minutes? The way it works is that it keeps looking at what's in memory and sees what grows over time. Without a longer run it won't show what is leaking, as it doesn't have enough time to figure out what is normal memory churn and what is not going away. |
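To illustrate the idea described above (snapshot what is in memory, then see which counts keep growing), here is a minimal sketch built on the standard library's `gc` module. It is only an approximation of the concept, not the Profiler integration's actual code; `count_objects_by_type`, `growth_between`, and `LeakedThing` are hypothetical names.

```python
import gc
from collections import Counter


def count_objects_by_type() -> Counter:
    """Count every object the garbage collector tracks, keyed by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth_between(before: Counter, after: Counter, top: int = 10) -> list[tuple[str, int]]:
    """Return the types whose live-object count grew the most between two snapshots."""
    grown = [(name, after[name] - before[name]) for name in after if after[name] > before[name]]
    return sorted(grown, key=lambda item: item[1], reverse=True)[:top]


class LeakedThing:
    """Hypothetical stand-in for whatever object is being leaked."""


def simulate_leak(count: int = 500) -> list[tuple[str, int]]:
    """Take a snapshot, 'leak' some objects, snapshot again, and report growth."""
    before = count_objects_by_type()
    leaked = [LeakedThing() for _ in range(count)]  # kept alive on purpose
    after = count_objects_by_type()
    assert len(leaked) == count  # references still held when the second snapshot ran
    return growth_between(before, after)


if __name__ == "__main__":
    for type_name, delta in simulate_leak():
        print(f"{type_name}: +{delta}")
```

A single pair of snapshots cannot tell ordinary churn from a leak, which is why a longer run matters: only types that keep growing across many intervals are good suspects.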
At the moment I’m not able to do that. |
After some testing, I can reproduce the start of the memory leak and stop it as well.
This is also reported in the logs (I somehow didn't notice this before): This is the log which now has run for around 30 minutes: |
@VinceRMP that definitely looks like a different problem as there are no obvious task leaks in the log. Please continue in a new issue. |
Thanks for looking into it! I will create an issue in the frontend section. |
I think it should still go to core, as we need to figure out why it isn't releasing memory when the client disconnects |
@bdraco How about the original issue I logged? Did you get time to find out how I should identify the "messagemanager" you mentioned? |
https://www.home-assistant.io/integrations/profiler/
|
I wasn't able to find it. I don't think it's going to help anyway |
@litinoveweedle I think you might have some luck figuring out the source if you do |
You'll get something like
Keep in mind you have 50000 in your dump above, so it may run out of memory trying to list them all. You might want to do the dump before too much memory has leaked, but not so soon that there aren't enough to identify. |
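The exact command suggested above is not preserved in this thread. As a separate illustration of the memory concern, here is a minimal sketch that counts live asyncio tasks per coroutine name instead of listing each one, so the output stays small even with tens of thousands of tasks. It uses only the standard library; `summarize_tasks`, `stuck_worker`, and `demo` are hypothetical names and are not part of Home Assistant.

```python
import asyncio
from collections import Counter


def summarize_tasks(limit: int = 10) -> list[tuple[str, int]]:
    """Group the running loop's unfinished tasks by coroutine name.

    Must be called from inside a running event loop.
    """
    counts: Counter = Counter()
    for task in asyncio.all_tasks():
        coro = task.get_coro()
        counts[getattr(coro, "__qualname__", type(coro).__name__)] += 1
    return counts.most_common(limit)


async def stuck_worker() -> None:
    """Hypothetical coroutine that never finishes, standing in for a leaked task."""
    await asyncio.Event().wait()


async def demo(count: int = 100) -> list[tuple[str, int]]:
    """Create `count` stuck tasks, summarize them, then clean up."""
    tasks = [asyncio.create_task(stuck_worker()) for _ in range(count)]
    await asyncio.sleep(0)  # let the tasks start running
    summary = summarize_tasks()
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return summary


if __name__ == "__main__":
    for name, number in asyncio.run(demo()):
        print(f"{name}: {number}")
```

A coroutine name with an unusually large count is a reasonable place to start looking, though the name alone does not say which integration created the tasks.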
Sorry I don't understand. (On Task part). I will be back home on Sunday, I will try to see how this could be done. |
Ok, thank you for explanation, I will try to do that after HA restart |
Not sure it can help, or even if it is relevant, but while investigating an unrelated memory leak myself I found a potentially interesting library for your case: aiocoap, which defines a messagemanager and a function called fill_or_recognize_remote, as it appears in the callgrind graphs. This Python library is used by another library, pytradfri, itself used by the IKEA Tradfri HA component: https://github.com/home-assistant/core/blob/dev/homeassistant/components/tradfri/manifest.json @litinoveweedle if you use this component, it may be worth deactivating it to check whether it could be the culprit. |
Pretty sure I just stumbled on the source of this leak #96981 |
Thank you for your effort and for not forgetting this issue. Yesterday I upgraded to 2023.7.3, which contains your fix, and I can confirm that so far there is no memory leak anymore! As additional info, three days ago I discovered that my official IKEA integration had a new 'unknown' and 'unconfigured' device. I was able to remove this device via the integration configuration GUI, and that slowed the leak down significantly (but not completely). As to the origin of this mysterious device: I remember I had some issue with IKEA battery-powered blinds (Fyrtur), as after recharging their batteries some were not detected and I had to pair them to the IKEA gateway again. This was at about the same time I discovered the leak. I was going to report this finding, but you were faster with the fix. ;-) Thank you again. |
Thanks for confirming |
Read before posting
#95386
The problem
The Home Assistant container is leaking memory at about 1GB/day. I tried disabling integrations that are not strictly necessary, but could not find the culprit. I am not exactly sure when the problem first appeared, or whether it is caused by a new HA version or by a new integration. I did disable the ONVIF integration, which was mentioned in a separate OOM issue, but the HA container kept growing anyway.
Profiler PROFILER.START_LOG_OBJECT_SOURCES log attached. I saw many messages like '"Failed to serialize <class 'list'>"', but I am not sure whether it is a real issue or not.
Could be related to #93556 or #92026
What version of Home Assistant Core has the issue?
core-2023.5.4
What was the last working version of Home Assistant Core?
core-2023-4.X?
What type of installation are you running?
Home Assistant Supervised
Integration causing the issue
unknown
Link to integration documentation on our website
No response
Diagnostics information
home-assistant.log.zip
Example YAML snippet
No response
Anything in the logs that might be useful for us?
No response
Additional information
No response