Add new mechanism to get thermal notification #551

Open
ostasevych opened this issue Jan 10, 2024 · 3 comments

ostasevych commented Jan 10, 2024

By default, serverinfo reads temperatures from /sys/class/thermal/thermal_zone*/temp.
However, some AMD motherboards and their chipsets do not expose the data there, only through hwmon.
E.g., on my HP MicroServer the temperature data is exposed at:

k10temp:
temp1 /sys/devices/pci0000:00/0000:00:18.3/hwmon/hwmon3/temp1_input

w83795adg-i2c-1-2f:
temp1 /sys/devices/pci0000:00/0000:00:14.0/i2c-1/1-002f/temp1_input
temp2 /sys/devices/pci0000:00/0000:00:14.0/i2c-1/1-002f/temp2_input
temp5 /sys/devices/pci0000:00/0000:00:14.0/i2c-1/1-002f/temp5_input

jc42-i2c-0-18:
temp1 /sys/devices/pci0000:00/0000:00:14.0/i2c-0/0-0018/hwmon/hwmon0/temp1_input

jc42-i2c-0-19:
temp1 /sys/devices/pci0000:00/0000:00:14.0/i2c-0/0-0019/hwmon/hwmon1/temp1_input

And the output of find:

# find /sys -name "temp*_input"
/sys/devices/pci0000:00/0000:00:18.3/hwmon/hwmon3/temp1_input
/sys/devices/pci0000:00/0000:00:14.0/i2c-1/1-002f/temp1_input
/sys/devices/pci0000:00/0000:00:14.0/i2c-1/1-002f/temp5_input
/sys/devices/pci0000:00/0000:00:14.0/i2c-1/1-002f/temp2_input
/sys/devices/pci0000:00/0000:00:14.0/i2c-0/0-0019/hwmon/hwmon1/temp1_input
/sys/devices/pci0000:00/0000:00:14.0/i2c-0/0-0018/hwmon/hwmon0/temp1_input

lm-sensors also reports good data.

Is it possible to read the data in a more universal way, e.g. from the hwmon class rather than from the thermal_zone class?

Read more here: https://github.com/Mellanox/mlxsw/wiki/Temperature-and-Fan-Control
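
To make the question concrete, here is a minimal sketch of what reading from the hwmon class could look like. It is not serverinfo code: the function name listHwmonTemperatures and its $base parameter are hypothetical, and it uses plain file_get_contents rather than the project's own helpers. It assumes that some older drivers keep their temp*_input files on the parent device (as the w83795adg paths in the find output suggest), so it also looks below the "device" link of each hwmon directory.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of serverinfo: collect every temperature
 * input found below a hwmon class directory.
 *
 * @return array<int, array{chip: string, sensor: string, celsius: float}>
 */
function listHwmonTemperatures(string $base = '/sys/class/hwmon'): array {
    $readings = [];
    foreach (glob($base . '/hwmon*') ?: [] as $hwmon) {
        // Assumption: some older drivers keep their attributes on the
        // parent device, so also look below the "device" link.
        foreach ([$hwmon, $hwmon . '/device'] as $dir) {
            $inputs = glob($dir . '/temp*_input') ?: [];
            if ($inputs === []) {
                continue;
            }
            $name = @file_get_contents($dir . '/name');
            $chip = $name === false ? basename($hwmon) : trim($name);
            foreach ($inputs as $input) {
                $raw = @file_get_contents($input);
                if ($raw === false) {
                    continue; // sensor not readable right now
                }
                $readings[] = [
                    'chip' => $chip,
                    'sensor' => basename($input, '_input'),
                    // hwmon reports millidegrees Celsius
                    'celsius' => (int)trim($raw) / 1000.0,
                ];
            }
            break; // do not report the same chip twice
        }
    }
    return $readings;
}
```

Calling listHwmonTemperatures() with no argument scans /sys/class/hwmon; passing another directory is only meant to make the function easy to try against a fake tree.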

Author

ostasevych commented Jan 10, 2024

UPD: My quick and dirty hack in the function getThermalZones() in lib/OperatingSystems/DefaultOs.php. It checks whether thermal zones are present and, if not, reads the data from hwmon.

public function getThermalZones(): array {
        $result = [];
        // is_dir() does not expand wildcards, so check the glob() result instead.
        $thermalZones = glob('/sys/class/thermal/thermal_zone*') ?: [];
        if ($thermalZones !== []) {
                foreach ($thermalZones as $thermalZone) {
                        $tzone = [];
                        try {
                                $tzone['hash'] = md5($thermalZone);
                                $tzone['type'] = $this->readContent($thermalZone . '/type');
                                $tzone['temp'] = (float)((int)($this->readContent($thermalZone . '/temp')) / 1000);
                                if ($tzone['temp'] > 0) { $tzone['temp'] = '+'.$tzone['temp']; }
                        } catch (RuntimeException $e) {
                                continue;
                        }
                        $result[] = $tzone;
                }
        } else {
                // No thermal zones: read the first temperature input of each hwmon device.
                $hwmonDevices = glob('/sys/class/hwmon/hwmon*') ?: [];
                foreach ($hwmonDevices as $hwmonDevice) {
                        $tzone = [];
                        try {
                                $tzone['hash'] = md5($hwmonDevice);
                                $tzone['type'] = $this->readContent($hwmonDevice . '/name');
                                $tzone['temp'] = (float)((int)($this->readContent($hwmonDevice . '/temp1_input')) / 1000);
                        } catch (RuntimeException $e) {
                                continue;
                        }
                        $result[] = $tzone;
                }
        }
        return $result;
}
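
The hack prepends '+' by hand, which turns the temperature into a string only for positive values. A small sketch of an alternative, assuming a hypothetical helper named formatMillidegrees (not part of serverinfo), is to let sprintf add the sign and keep the raw millidegree value as an integer until display time:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: format a raw sysfs reading (millidegrees Celsius)
 * as a signed string, similar to what lm-sensors prints.
 */
function formatMillidegrees(int $milli): string {
    // The "+" flag makes sprintf print the sign for positive numbers too.
    return sprintf('%+.2f°C', $milli / 1000);
}

echo formatMillidegrees(13750), "\n";
echo formatMillidegrees(-4500), "\n";
```

Keeping the numeric value and the display string separate avoids mixing floats and strings in the same array field.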

The data are not so comfortable to interpret:

image

sensors gives the following data:

jc42-i2c-0-18
Adapter: SMBus PIIX4 adapter port 0 at 0b00
RAM1 Temp:    +13.75°C  (low  =  +0.0°C)
                       (high = +60.0°C, hyst = +54.0°C)
                       (crit = +70.0°C, hyst = +64.0°C)

jc42-i2c-0-19
Adapter: SMBus PIIX4 adapter port 0 at 0b00
RAM2 Temp:    +13.5°C  (low  =  +0.0°C)
                       (high = +60.0°C, hyst = +54.0°C)
                       (crit = +70.0°C, hyst = +64.0°C)

k10temp-pci-00c3
Adapter: PCI adapter
CPU Core Temp:  +24.75°C  (high = +70.0°C)
                         (crit = +100.0°C, hyst = +95.0°C)

And these data are completely missing:

w83795adg-i2c-1-2f
CPU Temp:     +26.0°C  (high = +109.0°C, hyst = +109.0°C)
                       (crit = +109.0°C, hyst = +109.0°C)  sensor = thermal diode
NB Temp:      +29.0°C  (high = +105.0°C, hyst = +105.0°C)
                       (crit = +105.0°C, hyst = +105.0°C)  sensor = thermal diode
MB Temp:       +4.5°C  (high = +39.0°C, hyst = +39.0°C)
                       (crit = +44.0°C, hyst = +44.0°C)  sensor = thermistor

Collaborator

kesselb commented Jan 10, 2024

Hey,

Using /sys/class/hwmon/hwmon looks okay to me.

"The data are not so comfortable to interpret:"

A device such as /sys/class/hwmon/hwmon1 is a "driver" and can have many sensors.

I guess you want something like the patch below to read all sensors.

Index: lib/OperatingSystems/Linux.php
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/lib/OperatingSystems/Linux.php b/lib/OperatingSystems/Linux.php
--- a/lib/OperatingSystems/Linux.php	(revision 268a3601683d8d1d0605ba3d1c17b44afab007e2)
+++ b/lib/OperatingSystems/Linux.php	(date 1704898019944)
@@ -232,6 +232,18 @@
 	public function getThermalZones(): array {
 		$data = [];
 
+		$drivers = glob('/sys/class/hwmon/hwmon*');
+		foreach ($drivers as $driver) {
+			$name = $this->readContent($driver . '/name');
+
+			$zones = glob($driver . '/temp*_label');
+			foreach ($zones as $zone) {
+				$type = $name . ' ' . $this->readContent($zone);
+				$temp = (int)$this->readContent(str_replace('_label', '_input', $zone)) / 1000;
+				$data[] = new ThermalZone(md5($zone), $type, $temp);
+			}
+		}
+
 		$zones = glob('/sys/class/thermal/thermal_zone*');
 		if ($zones === false) {
 			return $data;

image
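
One caveat with the patch above: it only lists sensors that have a temp*_label file, and in the kernel's hwmon sysfs interface that label is optional. The following is a sketch of a fallback, assuming a hypothetical helper named readDriverSensors and plain file_get_contents instead of the project's readContent. It iterates over temp*_input and uses the channel name (for example "temp1") when no label exists.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: read every temperature sensor of one hwmon directory.
 * Uses tempN_label when the driver provides it, otherwise falls back to "tempN".
 *
 * @return array<string, float> sensor name => degrees Celsius
 */
function readDriverSensors(string $driver): array {
    $sensors = [];
    foreach (glob($driver . '/temp*_input') ?: [] as $input) {
        $raw = @file_get_contents($input);
        if ($raw === false) {
            continue; // sensor not readable right now
        }
        $channel = basename($input, '_input'); // e.g. "temp1"
        $label = @file_get_contents($driver . '/' . $channel . '_label');
        $name = ($label === false || trim($label) === '') ? $channel : trim($label);
        // hwmon reports millidegrees Celsius
        $sensors[$name] = (int)trim($raw) / 1000.0;
    }
    return $sensors;
}
```

With this shape, a driver without any label files still contributes one entry per temp*_input, at the cost of less descriptive names.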

Author

ostasevych commented Jan 10, 2024

$data = [];
 
+		$drivers = glob('/sys/class/hwmon/hwmon*');
+		foreach ($drivers as $driver) {
+			$name = $this->readContent($driver . '/name');
+
+			$zones = glob($driver . '/temp*_label');
+			foreach ($zones as $zone) {
+				$type = $name . ' ' . $this->readContent($zone);
+				$temp = (int)$this->readContent(str_replace('_label', '_input', $zone)) / 1000;
+				$data[] = new ThermalZone(md5($zone), $type, $temp);
+			}
+		}

Oh, that is nice!

@kesselb Daniel, could you please post the full text of the patched getThermalZones function for NC v27 here?
