With both the aspeed fantach sensors and the 1-wire temperature sensors, we encountered long sensor read times leading to timeouts all the way up in btbridge, causing hard-to-diagnose failures from the host side IPMI handling.
These may not be the only very slow sensors we run across. An arbitrary 5-second timeout in btbridge may eventually prove too short for another sensor. Reads also appear very slow to the host when they may not need to be.
Joel and Cyril suggested polling sensors with a background thread and reading the most recent value out over IPMI to reduce latency, and I agree that this is the correct approach if we can ensure that the polling thread will time out appropriately if the sensor is unresponsive.
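To make the suggestion concrete, here is a minimal sketch of the polling idea in Python. It is not code from host-ipmid or phosphor-hwmon; `CachedSensor`, `read_fn`, `interval`, and `max_age` are hypothetical names chosen for illustration. Rather than interrupting a hung read, the sketch assumes it is enough for the consumer to see that the cached value has gone stale, so the IPMI-facing call never blocks on the hardware.

```python
import threading
import time
from typing import Callable, Optional, Tuple


class CachedSensor:
    """Polls a slow sensor in the background and serves the last good value."""

    def __init__(self, read_fn: Callable[[], float], interval: float, max_age: float):
        self._read_fn = read_fn      # hypothetical blocking read, e.g. a sysfs file read
        self._interval = interval    # seconds to wait between polls
        self._max_age = max_age      # seconds after which a cached value counts as stale
        self._lock = threading.Lock()
        self._value: Optional[float] = None
        self._stamp = 0.0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._poll, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()

    def _poll(self) -> None:
        while not self._stop.is_set():
            try:
                value = self._read_fn()   # may block for a long time
            except OSError:
                value = None              # failed read: keep the old value and let it age out
            if value is not None:
                with self._lock:
                    self._value = value
                    self._stamp = time.monotonic()
            self._stop.wait(self._interval)

    def get(self) -> Tuple[Optional[float], bool]:
        """Return (value, fresh) immediately, without touching the hardware."""
        with self._lock:
            value, stamp = self._value, self._stamp
        fresh = value is not None and (time.monotonic() - stamp) <= self._max_age
        return value, fresh
```

A request handler would call `get()` and report the sensor as unavailable when `fresh` is false. The trade-off is that a hung read leaves the polling thread stuck until the driver returns, but the host sees a prompt answer either way.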
(host-ipmid maybe isn't the right place for this, but it's the area that's being affected with sneaky failures, so it seemed like as good a place as any.)
Isn't this an issue with either the hwmon driver itself or phosphor-hwmon? I don't think it's correct to add special code to the IPMI providers, because we'd have the same trouble with REST, Redfish, etc.