Super high CPU usage with 2024.4.1 #115072
Comments
Install the profiler integration and use the profiler.start service (https://www.home-assistant.io/integrations/profiler/#service-profilerstart). Post the callgrind file here and we can take a look. |
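For readers following along, a minimal sketch of the service call suggested above, as it might be entered in Developer Tools or in a script. The 60-second duration is an illustrative assumption, not a value given in this thread.

```yaml
# Sketch only: start the profiler for a fixed window.
# The duration is an assumed example value; pick one that covers the CPU spike.
service: profiler.start
data:
  seconds: 60
```

Running it once while the system is healthy and once while CPU is high gives two profiles to compare.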
# Example configuration.yaml entry
logger:
  default: info
  logs:
    homeassistant.core: debug
You might want to enable debug logging for core as well, as it will show you where the massive amount of events you see in your log entry is coming from. |
I will have to try this later today as I'm not home to test. I did notice when updating to 2024.4.1 that a mushroom card (HACS) would constantly time out or reload. I removed this integration and went back to the 2024.4.1 image. If I start racking up CPU cycles again I'll let you know tonight. |
My install locked up again at 100% CPU after an hour. This release is broken. Reverting fixes it. |
Adding debug logging for core and profiler. I run three instances (2 dev for a custom component and 1 prod). Only my prod experiences this. My main dev is on 2024.4.1 with a fully vanilla instance except for the custom component I maintain and doesn't experience this. My prod has my whole house with 100% native HA components + my custom component. @bdraco when do I need to run profiler? When CPU is high or anytime? |
Generally you want to do it with both cases so you have a baseline to compare against |
Profiler is running now. Baseline below |
Baseline log |
Both of those look OK which is a good baseline to compare with when the problem does happen |
As luck would have it, it's running super stable at the moment. I've tried upgrading multiple times over the last few days with the same result every time: full system lockup within an hour or two, with python3 pegging 1 CPU at 100%. Will keep the logs going for a while still. |
In between this test and the last one, I did update zigbee2mqtt. Perhaps this may help with troubleshooting. |
Woke up this morning to a completely locked up and unresponsive system. Had to hard reset the host as it was unresponsive. No profiler but here's the huge log. Reverting back again. This release is definitely broken for me. Woke up to a cold house as no automation for the heat worked. |
@bdraco have you guys changed how automations speak to the event bus? Looking at this it seems one of my automations that has worked fine for years is going haywire and triggering every millisecond. Whatever it is you guys did needs reversing. This is a native automation, using a native HA component. That automation is due to your native gree component losing connectivity to the AC so I wait for it to become available again and resend the command it missed. Automation is below:
Nothing fancy. |
I have yet to fully step through it, though. My day job is hectic right now, so I only have a little time for HA. I'll take another look before the weekend if I get free cycles. From a cursory review, I don't think so since your automation doesn't look like you have any However, your automation is calling several other automations, so please post those as well. |
Here you go. First one:
Second one:
|
Updated to 2024.4.2 and will be actively monitoring it. Will try and run profiler before it locks up if it does. |
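You have a loop in your first automation; once automation.raise_temperature is triggered by your automation, it will continue triggering as fast as the system can loop. You have two of these blocks in your first automation:
- wait_for_trigger: []
  timeout:
    hours: 0
    minutes: 5
    seconds: 0
    milliseconds: 0
  continue_on_timeout: false
This block does nothing since you don't specify a trigger. I just tested this, and HA will interpret a null trigger as one that triggered, so the automation will continue immediately without delay. Were there formerly triggers specified here? |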
It should be the unavailable trigger id in there.
On Mon, Apr 8, 2024 at 19:02, Rick Auch ***@***.***> wrote:
… You have a loop in your first automation; once
automation.raise_temperature is triggered by your automation, it will
continue triggering as fast as the system can loop.
You have two of these blocks in your first automation:
- wait_for_trigger: []
  timeout:
    hours: 0
    minutes: 5
    seconds: 0
    milliseconds: 0
  continue_on_timeout: false
This block does nothing since you don't specify a trigger. I just tested
this, and HA will interpret a null trigger as one that triggered and so
the automation will continue immediately without delay.
Were there formerly triggers specified here?
|
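As an illustration of the point about the empty trigger list, here is a sketch of a wait_for_trigger block that actually waits. The original automation was not preserved in this thread, so the entity ID is hypothetical, and the state trigger is only an assumption about what "the unavailable trigger" was meant to be.

```yaml
# Sketch only: wait until the AC leaves the unavailable state, or give up after 5 minutes.
# climate.living_room_ac is a hypothetical entity ID, not taken from the thread.
- wait_for_trigger:
    - platform: state
      entity_id: climate.living_room_ac
      from: unavailable
  timeout:
    minutes: 5
  continue_on_timeout: false
```

With a real trigger in the list, the sequence pauses here instead of continuing immediately, which breaks the tight loop described above.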
I think I've found why my system was slow, but I have no idea what caused it. My fix was adding more RAM. With no changes other than an upgrade to 2024.4.0, I've had to bump up the RAM allocated to my VM from 8GB to 16GB. With it set to 8GB, around 5GB of swap space is now being used! |
My issues have all been caused by a bug in the timer logic in 2024.4.x releases. The issue for me is due to having a timer as a trigger in an automation. When the timer changes from active to idle, the automation is triggered. The trouble is, I need to add a delay of 1 second after the timer goes idle before triggering, otherwise HA keeps triggering the automation.
trigger:
  - platform: state
    entity_id:
      - timer.office_lights
    for:
      hours: 0
      minutes: 0
      seconds: 1
    to: idle
    from: active
|
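One possible alternative to the 1-second state trigger, sketched under the assumption that the automation should only run when the timer actually expires: the timer integration fires a timer.finished event, which is not fired when a timer is cancelled. This is a suggestion for comparison, not something proposed in the thread.

```yaml
# Sketch only: react to the timer expiring rather than to its state changing to idle.
trigger:
  - platform: event
    event_type: timer.finished
    event_data:
      entity_id: timer.office_lights
```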
Sounds like you also have a loop. Can you share the full automation? 2024.4 has included some significant improvements to speed, and therefore is uncovering some issues with users' incorrectly programmed automations & scripts where loops without interrupts exist. |
I think I found it. In one part of my routine, I call timer.cancel on the office_lights timer. This seems to send the timer to active and back to idle again even when the timer was already idle. That was triggering the automation again.
|
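If calling timer.cancel on an already idle timer is what retriggers the automation, as described above, one way to avoid it is to cancel only when the timer is active. This is a minimal sketch of that guard, not the fix the poster applied.

```yaml
# Sketch only: skip timer.cancel unless the timer is currently running.
- if:
    - condition: state
      entity_id: timer.office_lights
      state: active
  then:
    - service: timer.cancel
      target:
        entity_id: timer.office_lights
```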
@bdraco I think the leading theory here is that the way timers are handled changed and can cause loops. I corrected my automation by creating another automation to act as a trigger in my wait statement. This seems to have fixed the issue. Would you like me to close the issue, or would you like to use it to improve documentation? I can try my hand at better documenting this and making a PR for it if you want; just point me in the general direction of where. |
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. |
The problem
Updating my docker container to 2024.4.1 causes CPU to climb to 100% over the span of a few hours. It never goes down. The only fix is to revert back to 2024.3.3.
Something is wrong with this release.
What version of Home Assistant Core has the issue?
2024.4.1
What was the last working version of Home Assistant Core?
2024.3.3
What type of installation are you running?
Home Assistant Container
Integration causing the issue
recorder
Link to integration documentation on our website
No response
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
No response