RFC: Add system health component #20436
Was talking with Tinkerer, and we came to the conclusion that we should prioritize adding this component as it will help with helping (how meta).
The goal is to have a place in the UI that shows information about the machine. This will help people diagnose problems.
Some RFC about this implementation:
UI will look something like this:
Related issue (if applicable): fixes home-assistant/architecture#114
Pull request in home-assistant.io with documentation (if applicable): TODO home-assistant/home-assistant.io#<home-assistant.io PR number goes here>
Example entry for
I think the actions need to be a separate part, so that you can see all available checks and trigger a test to see whether there is a problem. For example, Hue registers a health check function. We can click on it to trigger the registered function, which checks whether the hub is connected (or other things) and returns one of 3 states:
So the health component has 2 parts: static system information (which rarely changes) and a health check function per domain.
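The two-part split described above could be sketched roughly as follows. This is only an illustration: `HealthRegistry`, `register_info`, `register_check`, and `hue_check` are hypothetical names, not the Home Assistant API, and the check simply returns a placeholder string because the three result states are not listed here.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

# Hypothetical types for illustration only; not the Home Assistant API.
InfoCallback = Callable[[], Dict[str, str]]
CheckCallback = Callable[[], Awaitable[str]]


class HealthRegistry:
    """Keeps static info providers and on-demand health checks apart."""

    def __init__(self) -> None:
        self._info: Dict[str, InfoCallback] = {}
        self._checks: Dict[str, CheckCallback] = {}

    def register_info(self, domain: str, callback: InfoCallback) -> None:
        # Part 1: cheap, rarely changing system information.
        self._info[domain] = callback

    def register_check(self, domain: str, callback: CheckCallback) -> None:
        # Part 2: a health check that only runs when triggered.
        self._checks[domain] = callback

    def static_info(self) -> Dict[str, Dict[str, str]]:
        return {domain: cb() for domain, cb in self._info.items()}

    def available_checks(self) -> List[str]:
        return sorted(self._checks)

    async def run_check(self, domain: str) -> str:
        return await self._checks[domain]()


async def hue_check() -> str:
    # Placeholder result; a real check would talk to the hub.
    return "hub connected"


registry = HealthRegistry()
registry.register_info("system", lambda: {"os": "Linux"})
registry.register_check("hue", hue_check)
```

The UI could then list `registry.available_checks()` and call `run_check("hue")` only when the user clicks that entry, while `static_info()` stays cheap enough to call on every page load.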
@pvizeli so what about info that we could run as an action but could also retrieve statically? For example, we will know if we're connected to HA Cloud?
One other thing I hope other components will add to this view is the last interaction: when the last interaction with Hue was, when the last error occurred, etc.
The problem is that this code works nicely only when there is no issue. As written, every callback is called each time the frontend requests data.
A component often needs to check data from an external API or from hardware. If there is an issue, the call runs into the default timeout. Also, some system APIs, such as Docker, consume a lot of resources and can block other processes.
I would prefer that we run the health check periodically, e.g., every hour, and that the frontend shows the cached data from the last health check. There should also be an option to trigger a new health check, with the understanding that it can take 30-60 seconds until you see the result. That also allows us to slow down the checks and not trigger them all at once.
With this mechanism, we can later add things like triggers based on background checks or a history of when the system had issues.
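A minimal sketch of the caching idea, assuming asyncio-based checks: checks run one at a time with a per-check timeout, results are cached, and the frontend reads only the cache. `CachedHealthChecker` and its parameters are hypothetical names, and the `"timeout"` and `"error: ..."` strings are placeholders rather than states defined in this PR.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict

# Hypothetical type for illustration only; not the Home Assistant API.
Check = Callable[[], Awaitable[str]]


class CachedHealthChecker:
    """Runs checks in the background and serves cached results."""

    def __init__(self, check_timeout: float = 10.0, stagger: float = 0.0) -> None:
        self._checks: Dict[str, Check] = {}
        self._cache: Dict[str, Dict[str, Any]] = {}
        self._timeout = check_timeout
        self._stagger = stagger

    def register(self, domain: str, check: Check) -> None:
        self._checks[domain] = check

    def cached(self) -> Dict[str, Dict[str, Any]]:
        # What the frontend would read: never blocks on a device or API.
        return dict(self._cache)

    async def refresh(self) -> Dict[str, Dict[str, Any]]:
        # Run checks sequentially so they are not all triggered at once.
        for domain, check in self._checks.items():
            try:
                result = await asyncio.wait_for(check(), self._timeout)
            except asyncio.TimeoutError:
                result = "timeout"  # placeholder state
            except Exception as err:  # a failing check must not stop the rest
                result = f"error: {err}"
            self._cache[domain] = {"result": result, "checked_at": time.time()}
            await asyncio.sleep(self._stagger)
        return self.cached()

    async def run_forever(self, interval: float = 3600.0) -> None:
        # Background loop, e.g. once per hour as suggested above.
        while True:
            await self.refresh()
            await asyncio.sleep(interval)
```

A manual "run health check now" button would simply await `refresh()`, while normal page loads call `cached()`. Storing `checked_at` with each result is what would later make a history of issues possible.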
If I need to grab this data from a device or, in the case of Hass.io, from the supervisor, the call runs into the API timeout if there is a problem. But you are right: for an integration with a persistent connection, like the cloud, it works perfectly.
That ends up as: if you see the healthy data, your system works as it should; otherwise, you have an issue.
For those running on a Pi, this might be a good check to do and report, as undervoltage can lead to SD card problems and throttling.
Added system_health: in configuration.yaml
Thu Feb 07 2019 13:33:49 GMT-0500 (Eastern Standard Time)