RFC: Add system health component #20436
balloob requested a review from home-assistant/core as a code owner Jan 26, 2019
homeassistant added cla-signed, component: updater, core, new-platform, component: system_health labels Jan 26, 2019
wafflebot (bot) assigned balloob Jan 26, 2019
wafflebot (bot) added the in progress label Jan 26, 2019
We should also include (custom) addons/components configured for diagnosis.
Yes please. Despite the warnings in the logs, lots of people overlook custom components (or heck, forget they've installed them).
Components will be added in a future PR.
balloob referenced this pull request Jan 26, 2019
Open: Add (custom) components to system health component #20479
I think the actions need to be a separate part, so you can see all available checks and trigger a test to see if there is a problem. For example, Hue registers a health check function. We can click on that and trigger the registered function, which checks whether the hub is connected (or other things) and returns one of 3 states. So the health component has 2 parts: static system information (which is usually very static) and a health check function per domain.
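A minimal sketch of the two-part split described in this comment, assuming a plain in-memory registry. `HealthRegistry`, `register_info`, and `register_check` are hypothetical names for illustration, not Home Assistant's API, and because the three states are not listed in the comment, the check's return value is left opaque here.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# Hypothetical types for illustration only.
InfoCallback = Callable[[], Dict[str, Any]]
CheckCallback = Callable[[], Awaitable[Any]]


class HealthRegistry:
    """Keeps static info and on-demand health checks, keyed by domain."""

    def __init__(self) -> None:
        self._info: Dict[str, InfoCallback] = {}
        self._checks: Dict[str, CheckCallback] = {}

    def register_info(self, domain: str, callback: InfoCallback) -> None:
        """Part 1: static system information for a domain."""
        self._info[domain] = callback

    def register_check(self, domain: str, callback: CheckCallback) -> None:
        """Part 2: a health check that a user can trigger for a domain."""
        self._checks[domain] = callback

    def available_checks(self) -> List[str]:
        """List the domains that offer a check, so a UI could show them."""
        return sorted(self._checks)

    def static_info(self) -> Dict[str, Dict[str, Any]]:
        """Collect the static info of every registered domain."""
        return {domain: cb() for domain, cb in self._info.items()}

    async def run_check(self, domain: str) -> Any:
        """Run only the check the user asked for."""
        return await self._checks[domain]()
```

With this shape, an integration such as Hue would call `register_check("hue", ...)` once at setup, and the frontend would only trigger `run_check("hue")` when the user clicks on it.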
houndci-bot reviewed Jan 28, 2019
homeassistant/helpers/system_info.py (Outdated)
@pvizeli so what about some info that we could run as an action, but could also retrieve statically? For example, we will know if we're connected to HA Cloud. One other thing I hope other components will add to this view is last interaction: when was the last interaction with Hue, when was the last error, etc.
balloob added some commits Jan 28, 2019
What am I doing… this needs to be WS commands instead of HTTP views.
balloob added some commits Jan 29, 2019
The problem is that this code works nicely only if you have no issue. You now call every callback any time you try to fetch data for the frontend. For a component, you need to check data on an external API or hardware. If there is an issue, you run into the default timeout. Also, some system APIs like Docker eat a lot of resources and can block other processes. I would prefer that we run the health check periodically, e.g. every hour, and the frontend sees the cached data of the last health check, with an option to trigger a new health check, knowing that it can take up to 30-60 sec until you see the result. That also allows us to slow down the checks and not trigger them all at once. With this mechanism, we can later add things like creating a trigger on background checks or a history of when the system had issues.
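The caching idea proposed here could be sketched roughly as follows, assuming plain `asyncio` coroutines as checks. `CachedHealth`, `refresh`, and `cached` are hypothetical names, not part of this PR; the sketch runs checks one after another with an optional pause, stores the last results, and lets a caller read them without triggering any check.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict, Optional

# Hypothetical type for illustration only.
CheckCallback = Callable[[], Awaitable[Any]]


class CachedHealth:
    """Runs health checks in the background and serves the last results."""

    def __init__(
        self, checks: Dict[str, CheckCallback], delay_between: float = 0.0
    ) -> None:
        self._checks = checks
        self._delay_between = delay_between
        self._results: Dict[str, Any] = {}
        self._last_run: Optional[float] = None

    async def refresh(self) -> None:
        """Run the checks sequentially so they are not all triggered at once."""
        results: Dict[str, Any] = {}
        for domain, check in self._checks.items():
            try:
                results[domain] = await check()
            except Exception as err:  # one failing check must not stop the rest
                results[domain] = {"error": str(err)}
            await asyncio.sleep(self._delay_between)
        self._results = results
        self._last_run = time.time()

    def cached(self) -> Dict[str, Any]:
        """Return the last results without running any check."""
        return {"last_run": self._last_run, "results": dict(self._results)}

    async def run_periodically(self, interval: float) -> None:
        """Background loop, e.g. interval=3600 for the hourly run suggested above."""
        while True:
            await self.refresh()
            await asyncio.sleep(interval)
```

A manual "check now" button would simply await `refresh()` and then read `cached()`, while the normal frontend view would only ever call `cached()`.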
@pvizeli The info command is for static info: firmware version, last interaction, connected to cloud, lovelace storage mode. A future PR will add a diagnostics command that will diagnose things on demand when the user clicks the button.
Very confused, can't get the tests to pass on CI but can locally. Mock is not getting applied.
houndci-bot reviewed Jan 29, 2019
balloob added some commits Jan 29, 2019
Jan 30, 2019
This was referenced
If I need to grab this data from a device or, in the case of hass.io, from the supervisor, the call runs into the API timeout whenever there is a problem. But you are right, for an integration with a running connection like the cloud it works perfectly. That ends up as: if you see the health data, your system works as it should; otherwise you have an issue.
Well, how can we do it otherwise? We give each component up to 5 seconds to get the data.
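As a rough illustration of the "up to 5 seconds per component" rule, the sketch below wraps each info callback in `asyncio.wait_for` and runs them concurrently, so one slow component cannot hold up the rest. `gather_info` and `INFO_TIMEOUT` are hypothetical names, and the error payload is an assumption rather than what the merged code returns.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

# Hypothetical type for illustration only.
InfoCallback = Callable[[], Awaitable[Dict[str, Any]]]

INFO_TIMEOUT = 5.0  # seconds; the figure mentioned in the comment above


async def _with_timeout(callback: InfoCallback, timeout: float) -> Dict[str, Any]:
    """Await one callback, replacing its result with an error if it is too slow."""
    try:
        return await asyncio.wait_for(callback(), timeout)
    except asyncio.TimeoutError:
        # Assumed error shape; the real payload may differ.
        return {"error": "Fetching info timed out"}


async def gather_info(
    callbacks: Dict[str, InfoCallback], timeout: float = INFO_TIMEOUT
) -> Dict[str, Dict[str, Any]]:
    """Collect info from all domains concurrently, each with its own timeout."""
    domains = list(callbacks)
    results = await asyncio.gather(
        *(_with_timeout(callbacks[domain], timeout) for domain in domains)
    )
    return dict(zip(domains, results))
```

Because the callbacks run concurrently, the whole request is bounded by roughly one timeout rather than by the sum over all components.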
I want this to be part of the beta, so will merge it. We can discuss and change things later, as it's an internal implementation.
balloob merged commit cb07ea0 into dev Jan 30, 2019
6 checks passed
delete-merged-branch (bot) deleted the system-health branch Jan 30, 2019
Brianckramer commented Jan 31, 2019
https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=147781&start=50#p972790 For those running on a Pi, this might be a good check to do and report, as undervoltage can lead to SD problems/throttling.
vlad36N commented Feb 7, 2019
Added system_health: in configuration.yaml. In log: Thu Feb 07 2019 13:33:49 GMT-0500 (Eastern Standard Time)
@vlad36N, please open an issue.
balloob commented Jan 26, 2019
Description:
Was talking with Tinkerer, and we came to the conclusion that we should prioritize adding this component as it will help with helping (how meta).
Goal is to get a place in the UI to show the info on the machine. This will help people with diagnosing problems.
Some RFC about this implementation:
UI will look something like this:
Related issue (if applicable): fixes home-assistant/architecture#114
Pull request in home-assistant.io with documentation (if applicable): TODO home-assistant/home-assistant.io#<home-assistant.io PR number goes here>
Example entry for configuration.yaml (if applicable):
system_health:
Checklist:
Local tests pass with tox. Your PR cannot be merged unless tests pass.
If the code does not interact with devices: