phosphor-hwmon: Set a Fault/Functional property when the sysfs fault property is set #2329

Closed
spinler opened this issue Sep 18, 2017 · 2 comments

spinler commented Sep 18, 2017

The hwmon documentation states:

Each input channel may have an associated fault file. This can be used
to notify open diodes, unconnected fans etc. where the hardware
supports it. When this boolean has value 1, the measurement for that
channel should not be trusted.

phosphor-hwmon should check for that fault status, if it exists, when reading a sensor and set a new D-Bus property on that sensor (maybe using the OperationalStatus interface) to reflect the fault. The sensor object would not be removed from D-Bus, and the process should no longer exit.

Things to consider:

  • keep retrying and checking for the fault to go away?
  • even bother reading the sensor value if the fault is set?
  • not all device drivers behind sensors may support the _fault file. Does that mean the fault property should be set in D-Bus when the actual sensor value read fails?
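
To make the proposal concrete, here is a minimal sketch of the fault check, not the actual phosphor-hwmon code. The helper name `readFault` and its signature are made up for illustration. It assumes the hwmon convention that a channel such as `fan1` has an optional `fan1_fault` file next to `fan1_input`, and it reports "no fault file" separately from "fault clear" so the caller can decide what to do for drivers that lack the file.

```cpp
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not part of phosphor-hwmon).
// Returns std::nullopt when no readable <channel>_fault file exists,
// true when the file holds a non-zero value, false when it holds 0.
std::optional<bool> readFault(const fs::path& hwmonDir,
                              const std::string& channel)
{
    const fs::path faultFile = hwmonDir / (channel + "_fault");

    std::ifstream in(faultFile);
    if (!in.is_open())
    {
        // The driver does not expose a fault file for this channel.
        return std::nullopt;
    }

    int value = 0;
    if (!(in >> value))
    {
        // Assumption: an unreadable fault file is treated as "unknown".
        return std::nullopt;
    }

    return value != 0;
}
```

A caller could then skip reading the input file whenever the result is `true`, which relates to the second question above.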
@zahrens zahrens added this to the openBMC v2.0 Backlog milestone Sep 20, 2017
@geissonator geissonator modified the milestones: openBMC v2.0 Backlog, openBMC v3.0 Backlog Sep 29, 2017
@geissonator geissonator added defer and removed Phase 6 labels Sep 29, 2017
geissonator pushed a commit to openbmc/phosphor-hwmon that referenced this issue Mar 23, 2018
This is a temporary fix until the following issues are completed:
    openbmc/openbmc#2327
    openbmc/openbmc#2329

When an EAGAIN or an EREMOTEIO return code is received by hwmon
from the OCC driver in the 4.13 kernel, they should be translated to
an unavailable sensor(0x00) and failed sensor(0xFF) scaled values
respectively. This will keep the OCC hwmon instance running and allow
applications to continue using these sensors as they were reported under
the mainline openbmc/linux 4.10 kernel.

Tested:
    Verified return codes are caught and sensor value modified

Change-Id: Ie61859863e7d88878caa942e5f5b062acabe67aa
Signed-off-by: Matthew Barth <msbarth@us.ibm.com>
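
For readers following along, the translation described in the commit message above can be sketched roughly as below. This is an illustration under stated assumptions, not the committed change: the function name `substituteForErrno` and the two constant names are hypothetical, the scaling step the message mentions is left out, and `EREMOTEIO` is a Linux-specific errno value.

```cpp
#include <cerrno>
#include <cstdint>
#include <optional>

// Hypothetical names; the values come from the commit message.
constexpr std::int64_t unavailableSensor = 0x00; // substituted for EAGAIN
constexpr std::int64_t failedSensor = 0xFF;      // substituted for EREMOTEIO

// Returns the substitute reading for the two return codes named in the
// commit message, or std::nullopt for any other error so the caller can
// keep its existing error handling.
std::optional<std::int64_t> substituteForErrno(int err)
{
    switch (err)
    {
        case EAGAIN:
            return unavailableSensor;
        case EREMOTEIO:
            return failedSensor;
        default:
            return std::nullopt;
    }
}
```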
@spinler spinler assigned msbarth and unassigned spinler Mar 30, 2018
foxconn-bmc-ks pushed a commit to foxconn-bmc-ks/phosphor-hwmon that referenced this issue Apr 13, 2018
Signed-off-by: Doyle Huang <doyle.sy.huang@mail.foxconn.com>

msbarth commented Apr 26, 2018

When an associated fault file for a sensor exists, the OperationalStatus interface will be added, with the Functional property set from the value in that fault file (a faulted sensor is reported as non-functional). The interface will not be added to sensors that do not provide an associated fault file.

If a sensor is marked faulted (non-functional), its associated input file will not be read. When a sensor is faulted at startup, its value defaults to 0. If a sensor becomes faulted at any time during the monitoring loop, its value is not read from the input file and the value on D-Bus is not updated.

  • Add status interface to sensors
  • Fault status check before reading sensor value
  • Skip updating value for faulted sensors
  • The fault will be checked/updated on each iteration of the monitoring loop
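
As a rough sketch of one pass of the monitoring loop described above (not the code under review): the `Sensor` struct, `pollOnce`, and the two callbacks are hypothetical stand-ins for the D-Bus object and the sysfs reads. The fault callback returns an empty optional when the sensor has no fault file, in which case no status interface is added and the value is read as before.

```cpp
#include <cstdint>
#include <functional>
#include <optional>

// Hypothetical stand-in for a sensor's D-Bus object.
struct Sensor
{
    std::int64_t value = 0;          // published value, defaults to 0
    bool functional = true;          // OperationalStatus.Functional
    bool hasStatusInterface = false; // added only when a fault file exists
};

// One iteration of the monitoring loop for a single sensor.
void pollOnce(Sensor& sensor,
              const std::function<std::optional<bool>()>& readFault,
              const std::function<std::int64_t()>& readInput)
{
    // Check the fault status before touching the input file.
    const std::optional<bool> fault = readFault();
    if (fault.has_value())
    {
        sensor.hasStatusInterface = true;
        sensor.functional = !*fault;
        if (*fault)
        {
            // Faulted: skip the read and leave the previous value alone.
            return;
        }
    }

    sensor.value = readInput();
}
```

With this shape, a sensor that is faulted on the first pass keeps its default value of 0, and a sensor that faults later keeps the last value that was read while it was functional.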

@rfrandse

https://gerrit.openbmc-project.xyz/10315 Refresh sensor functional state
Resolves: #2329 phosphor-hwmon: Set a Fault/Functional property when the sysfs fault property is set

geissonator pushed a commit to openbmc/phosphor-hwmon that referenced this issue May 3, 2018
With openbmc/openbmc#2329, an OCC sensor value will not be read when the
associated fault file is set to true. This will set the value to 0 when
a sensor is faulted at startup or not update the previous value during
the monitoring loop if the OCC sensor becomes faulted.

Applications(i.e. fan control) needing to react to a faulted OCC sensor
can subscribe to property changed signals on the OperationalStatus
Functional property for the sensor's dbus object.

Tested:
    A faulted OCC sensor has a non-functional status on dbus

Change-Id: Ia43ebb1e0fe0227797bc4034e617ac357edd348d
Signed-off-by: Matthew Barth <msbarth@us.ibm.com>