[Feature] Support for Framework 16 #20

Closed
hasechris opened this issue Feb 28, 2024 · 19 comments

Comments

@hasechris

Hi :-),

I just got my Framework 16 in the mail and set up Manjaro on it. Because the fan RPM is too low, the whole laptop feels (at least to me) quite warm.

I found your AUR package, but sadly it does not work because your script uses lm_sensors to determine the temperatures. Would you be able to add a mode where the temperatures are read with ectool? An AUR package for the Framework-patched version is already available.

Greetings
hasechris

@TamtamHero
Owner

Sorry, I don't have a Framework 16, and I'm not even running Arch/Manjaro (I don't even know who published fw-fanctrl on AUR, sorry!)
Why is it an issue for you to use lm_sensors, btw?

@hasechris
Author

hasechris commented Mar 3, 2024

Hmmm.

Yeah, I see it now. The package https://aur.archlinux.org/packages/fw-fanctrl-git is made by icedream. Maybe I can support you here, because I own a FW 16.

> Why is it an issue for you to use lm_sensors, btw ?

Ahem.. It seems that lm_sensors does not get the needed values from the hardware, or they appear in another form.

[root@nemu-framework-manjaro ~]# sensors
ucsi_source_psy_USBC000:002-isa-0000
Adapter: ISA adapter
in0:           0.00 V  (min =  +0.00 V, max =  +0.00 V)
curr1:       410.00 mA (max =  +0.00 A)

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +40.8°C  

ucsi_source_psy_USBC000:004-isa-0000
Adapter: ISA adapter
in0:           0.00 V  (min =  +0.00 V, max =  +0.00 V)
curr1:       680.00 mA (max =  +0.00 A)

amdgpu-pci-c100
Adapter: PCI adapter
vddgfx:      901.00 mV 
vddnb:       760.00 mV 
edge:         +38.0°C  
PPT:          12.07 W  (avg =   4.06 W)

BAT1-acpi-0
Adapter: ACPI interface
in0:          16.34 V  
curr1:       835.00 mA 

ucsi_source_psy_USBC000:003-isa-0000
Adapter: ISA adapter
in0:           5.00 V  (min =  +5.00 V, max =  +5.00 V)
curr1:         0.00 A  (max =  +1.50 A)

ucsi_source_psy_USBC000:001-isa-0000
Adapter: ISA adapter
in0:           5.00 V  (min =  +5.00 V, max =  +5.00 V)
curr1:         5.00 A  (max =  +3.00 A)

mt7921_phy0-pci-0100
Adapter: PCI adapter
temp1:        +46.0°C  

nvme-pci-0200
Adapter: PCI adapter
Composite:    +36.9°C  (low  = -40.1°C, high = +83.8°C)
                       (crit = +87.8°C)
Sensor 1:     +49.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +36.9°C  (low  = -273.1°C, high = +65261.8°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +40.8°C  
temp2:        +43.8°C  
temp3:        +40.8°C  
temp4:        +39.8°C  

But I think it would be cool if you supported reading the temperatures via ectool.
Temps are available via

[root@nemu-framework-manjaro ~]# ectool temps all
--sensor name -------- temperature -------- ratio (fan_off and fan_max) --
ambient_f75303@4d     315 K (= 42 C)        N/A (fan_off=0 K, fan_max=0 K)
charger_f75303@4d     316 K (= 43 C)        N/A (fan_off=0 K, fan_max=0 K)
apu_f75303@4d         314 K (= 41 C)           0% (320 K and 335 K)
cpu@4c                313 K (= 40 C)           0% (338 K and 370 K)
gpu_amb_f75303@4d     273 K (= 0 C)        N/A (fan_off=0 K, fan_max=0 K)
gpu_vr_f75303@4d      273 K (= 0 C)           0% (323 K and 347 K)
gpu_vram_f75303@4d    273 K (= 0 C)        N/A (fan_off=0 K, fan_max=0 K)
gpu_amdr23m@40        273 K (= 0 C)           0% (323 K and 353 K)
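
The listing above is regular enough to parse line by line. A minimal sketch, assuming the column layout stays as shown in this thread (the helper name `parse_ectool_temps` and the regular expression are illustrative, not part of fw-fanctrl or ectool):

```python
import re

# Trimmed sample copied from the `ectool temps all` output above.
SAMPLE = """\
--sensor name -------- temperature -------- ratio (fan_off and fan_max) --
ambient_f75303@4d     315 K (= 42 C)        N/A (fan_off=0 K, fan_max=0 K)
charger_f75303@4d     316 K (= 43 C)        N/A (fan_off=0 K, fan_max=0 K)
apu_f75303@4d         314 K (= 41 C)           0% (320 K and 335 K)
cpu@4c                313 K (= 40 C)           0% (338 K and 370 K)
gpu_amb_f75303@4d     273 K (= 0 C)        N/A (fan_off=0 K, fan_max=0 K)
"""

# Assumed line shape: "<name>   <kelvin> K (= <celsius> C) ..."
LINE_RE = re.compile(r"^(?P<name>\S+)\s+(?P<kelvin>\d+) K \(= (?P<celsius>-?\d+) C\)")


def parse_ectool_temps(text):
    """Return {sensor_name: temperature_in_celsius} for each sensor line."""
    temps = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # the header line does not match and is skipped
            temps[match.group("name")] = int(match.group("celsius"))
    return temps


temps = parse_ectool_temps(SAMPLE)
print(temps)
```

On a real machine the text would come from running `ectool temps all` as root instead of from the embedded sample.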

Fan speed (RPM) is available with

[root@nemu-framework-manjaro ~]# ectool pwmgetfanrpm all
Fan 0 RPM: 0
Fan 1 RPM: 0
[root@nemu-framework-manjaro ~]# 

We could either control the fans with a "dumb" ectool fanduty <fan-no> <pwmduty>, or override the fan thresholds that the EC already uses. The Embedded Controller sets these fan curves as default:

[root@nemu-framework-manjaro ~]# ectool thermalget
sensor  warn  high  halt   fan_off fan_max   name
  0      363   363    378      0       0     ambient_f75303@4d
  1      363   363    378      0       0     charger_f75303@4d
  2      363   363    378    320     335     apu_f75303@4d
  3      381   381    400    338     370     cpu@4c
  4        0     0      0      0       0     gpu_amb_f75303@4d
  5      344     0      0    323     347     gpu_vr_f75303@4d
  6        0     0      0      0       0     gpu_vram_f75303@4d
  7        0     0      0    323     353     gpu_amdr23m@40
(all temps in degrees Kelvin)
[root@nemu-framework-manjaro ~]# 

These settings can be overridden with

[root@nemu-framework-manjaro ~]# ectool thermalset 3 381 381 400 300 345

Then the settings are like this:

[root@nemu-framework-manjaro ~]# ectool thermalget
sensor  warn  high  halt   fan_off fan_max   name
  0      363   363    378      0       0     ambient_f75303@4d
  1      363   363    378      0       0     charger_f75303@4d
  2      363   363    378    320     335     apu_f75303@4d
  3      381   381    400    300     345     cpu@4c
  4        0     0      0      0       0     gpu_amb_f75303@4d
  5      344     0      0    323     347     gpu_vr_f75303@4d
  6        0     0      0      0       0     gpu_vram_f75303@4d
  7        0     0      0    323     353     gpu_amdr23m@40
(all temps in degrees Kelvin)
[root@nemu-framework-manjaro ~]# 
[root@nemu-framework-manjaro ~]# ectool temps all
--sensor name -------- temperature -------- ratio (fan_off and fan_max) --
ambient_f75303@4d     315 K (= 42 C)        N/A (fan_off=0 K, fan_max=0 K)
charger_f75303@4d     316 K (= 43 C)        N/A (fan_off=0 K, fan_max=0 K)
apu_f75303@4d         313 K (= 40 C)           0% (320 K and 335 K)
cpu@4c                311 K (= 38 C)          24% (300 K and 345 K)
gpu_amb_f75303@4d     273 K (= 0 C)        N/A (fan_off=0 K, fan_max=0 K)
gpu_vr_f75303@4d      273 K (= 0 C)           0% (323 K and 347 K)
gpu_vram_f75303@4d    273 K (= 0 C)        N/A (fan_off=0 K, fan_max=0 K)
gpu_amdr23m@40        273 K (= 0 C)           0% (323 K and 353 K)
[root@nemu-framework-manjaro ~]#
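
The ratio column in these listings is consistent with a plain linear interpolation between fan_off and fan_max. A minimal sketch, assuming that is what the EC computes (the formula and the function name `fan_ratio_percent` are inferred from the numbers in this thread, not taken from the EC source):

```python
def fan_ratio_percent(temp_k, fan_off_k, fan_max_k):
    """Position of temp_k between fan_off_k and fan_max_k as an integer percent.

    Returns None when no usable thresholds are set (ectool prints N/A).
    """
    if fan_max_k <= fan_off_k:
        return None
    if temp_k <= fan_off_k:
        return 0
    if temp_k >= fan_max_k:
        return 100
    # Integer division, assumed to mirror the truncated percentage shown by ectool.
    return 100 * (temp_k - fan_off_k) // (fan_max_k - fan_off_k)


print(fan_ratio_percent(311, 300, 345))  # cpu@4c after the thermalset call
print(fan_ratio_percent(313, 338, 370))  # cpu@4c with the default thresholds
print(fan_ratio_percent(315, 0, 0))      # ambient sensor, no thresholds
```

This prints 24, 0 and None, which matches the 24%, 0% and N/A entries in the listings above.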

@TamtamHero
Owner

TamtamHero commented Mar 5, 2024

Thanks for all the logs, very useful !
So it seems the issue is more about AMD and Intel having different reporting methods.

For AMD, I'm going to take the acpitz-acpi-0 temperatures, these are more precise than what the ectool seems to output.
https://community.frame.work/t/resolved-monitoring-amd-temperature-from-linux/39980/13

Could you please give me the output of sensors -j ?

@hasechris
Author

Hi :)

Here is the output of sensors -j. Maybe I need another driver for additional I2C sensors?

[root@nemu-framework-manjaro mkinitcpio.d]# sensors -j
{
   "ucsi_source_psy_USBC000:002-isa-0000":{
      "Adapter": "ISA adapter",
      "in0":{
         "in0_input": 0.000,
         "in0_min": 0.000,
         "in0_max": 0.000
      },
      "curr1":{
         "curr1_input": 0.000,
         "curr1_max": 0.000
      }
   },
   "k10temp-pci-00c3":{
      "Adapter": "PCI adapter",
      "Tctl":{
         "temp1_input": 48.375
      }
   },
   "ucsi_source_psy_USBC000:004-isa-0000":{
      "Adapter": "ISA adapter",
      "in0":{
         "in0_input": 0.000,
         "in0_min": 0.000,
         "in0_max": 0.000
      },
      "curr1":{
         "curr1_input": 0.410,
         "curr1_max": 0.000
      }
   },
   "mt7921_phy0-pci-0100":{
      "Adapter": "PCI adapter",
      "temp1":{
         "temp1_input": 50.000
      }
   },
   "BAT1-acpi-0":{
      "Adapter": "ACPI interface",
      "in0":{
         "in0_input": 16.310
      },
      "curr1":{
         "curr1_input": 0.000
      }
   },
   "ucsi_source_psy_USBC000:003-isa-0000":{
      "Adapter": "ISA adapter",
      "in0":{
         "in0_input": 5.000,
         "in0_min": 5.000,
         "in0_max": 5.000
      },
      "curr1":{
         "curr1_input": 5.000,
         "curr1_max": 3.000
      }
   },
   "ucsi_source_psy_USBC000:001-isa-0000":{
      "Adapter": "ISA adapter",
      "in0":{
         "in0_input": 0.000,
         "in0_min": 0.000,
         "in0_max": 0.000
      },
      "curr1":{
         "curr1_input": 0.680,
         "curr1_max": 0.000
      }
   },
   "amdgpu-pci-c100":{
      "Adapter": "PCI adapter",
      "vddgfx":{
         "in0_input": 0.715
      },
      "vddnb":{
         "in1_input": 0.764
      },
      "edge":{
         "temp1_input": 43.000
      },
      "PPT":{
         "power1_average": 6.209,
         "power1_input": 5.186
      }
   },
   "nvme-pci-0200":{
      "Adapter": "PCI adapter",
      "Composite":{
         "temp1_input": 40.850,
         "temp1_max": 83.850,
         "temp1_min": -40.150,
         "temp1_crit": 87.850,
         "temp1_alarm": 0.000
      },
      "Sensor 1":{
         "temp2_input": 52.850,
         "temp2_max": 65261.850,
         "temp2_min": -273.150
      },
      "Sensor 2":{
         "temp3_input": 40.850,
         "temp3_max": 65261.850,
         "temp3_min": -273.150
      }
   },
   "acpitz-acpi-0":{
      "Adapter": "ACPI interface",
      "temp1":{
         "temp1_input": 44.800
      },
      "temp2":{
         "temp2_input": 45.800
      },
      "temp3":{
         "temp3_input": 46.800
      },
      "temp4":{
         "temp4_input": 46.800
      }
   }
}
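
For anyone who wants to pull the temperatures out of this JSON programmatically, here is a minimal sketch. It works on a trimmed copy of the dump above; the helper name `temperature_inputs` is illustrative, and the only assumption is the `tempN_input` key naming visible in the output.

```python
import json

# Trimmed sample of the `sensors -j` output above.
SAMPLE = """
{
   "k10temp-pci-00c3":{
      "Adapter": "PCI adapter",
      "Tctl":{"temp1_input": 48.375}
   },
   "amdgpu-pci-c100":{
      "Adapter": "PCI adapter",
      "vddgfx":{"in0_input": 0.715},
      "edge":{"temp1_input": 43.000}
   },
   "acpitz-acpi-0":{
      "Adapter": "ACPI interface",
      "temp1":{"temp1_input": 44.800},
      "temp2":{"temp2_input": 45.800},
      "temp3":{"temp3_input": 46.800},
      "temp4":{"temp4_input": 46.800}
   }
}
"""


def temperature_inputs(sensors):
    """Yield (chip, label, celsius) for every tempN_input in parsed `sensors -j` data."""
    for chip, features in sensors.items():
        for label, values in features.items():
            if not isinstance(values, dict):
                continue  # skips the "Adapter" string
            for key, value in values.items():
                if key.startswith("temp") and key.endswith("_input"):
                    yield chip, label, value


readings = list(temperature_inputs(json.loads(SAMPLE)))
for chip, label, celsius in readings:
    print(f"{chip:20} {label:8} {celsius:.1f} C")
```

On a live system the JSON text could be obtained with `subprocess.run(["sensors", "-j"], capture_output=True, text=True).stdout` instead of the embedded sample.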

@TamtamHero
Owner

Thanks !
Can you try this branch on your laptop and tell me if it's all good ?
#21

@hasechris
Author

> Thanks ! Can you try this branch on your laptop and tell me if it's all good ? #21

Wow you are crazy :-)

I tried your branch, we have two problems.

Temps

It seems to me you are adding together the temps from acpitz-acpi-0, but these sensors are not the individual cores. The four temps correspond to the first four temp sensors from ectool.

temp1 => ambient
temp2 => charger
temp3 => apu (whatever this means - maybe the socket)
temp4 => cpu (seems to me "the hottest core from all")
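
The mapping above can be written down as a small lookup table. A minimal sketch, assuming the correspondence observed on this one Framework 16 (without dGPU) holds; the names `ACPITZ_TO_EC` and `label_acpitz` are illustrative only:

```python
# Mapping observed on one Framework 16 without a dGPU; treat it as an
# assumption for other machines or firmware versions.
ACPITZ_TO_EC = {
    "temp1": "ambient",
    "temp2": "charger",
    "temp3": "apu",
    "temp4": "cpu",
}


def label_acpitz(acpitz_block):
    """Return {ec_name: celsius} from the "acpitz-acpi-0" block of `sensors -j`."""
    labeled = {}
    for label, ec_name in ACPITZ_TO_EC.items():
        values = acpitz_block.get(label)
        if isinstance(values, dict):
            labeled[ec_name] = values[f"{label}_input"]
    return labeled


# Values taken from the `sensors -j` output earlier in this thread.
block = {
    "Adapter": "ACPI interface",
    "temp1": {"temp1_input": 44.8},
    "temp2": {"temp2_input": 45.8},
    "temp3": {"temp3_input": 46.8},
    "temp4": {"temp4_input": 46.8},
}
labeled = label_acpitz(block)
print(labeled)
```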

Additionally, if I see that correctly, the block acpitz-acpi-0 should get 4 additional temp sensors for the dGPU, but I don't have that.

The problem: the Framework 16 has two fans, and heatpipes from the CPU are connected to both of them. If you get the optional dGPU, additional heatpipes run to the same fans, and the fans then exhaust not only to the side but also to the back (where the additional heatpipe coolers for the dGPU would be). So these additional dGPU temps would then have to be considered as well.

ectool

You have an ectool in your repo and install it into the system. Sadly it seems that Framework had to patch the original ectool for the Framework 16, so your ectool simply gives me the following error:

Mär 06 19:38:37 nemu-framework-manjaro python3[40094]: Missing Chromium EC memory map.
Mär 06 19:38:37 nemu-framework-manjaro python3[40094]: Cannot find I2C adapter
Mär 06 19:38:37 nemu-framework-manjaro python3[40094]: Unable to establish host communication
Mär 06 19:38:37 nemu-framework-manjaro python3[40094]: Couldn't find EC

I simply removed your copy of ectool and used the ectool from AUR "fw-ectool-git". Then it worked.

Hope this helps and thank you VERY much for your time :)

Greetings
hasechris

@hasechris
Author

btw. do you have a sponsor link?

@TamtamHero
Owner

TamtamHero commented Mar 10, 2024

Damn. I didn't know D.Howett's ectool had continued its life on GitLab, that's a great discovery!

> Additionally, if i see that correctly, the block acpitz-acpi-0 should get 4 additional temp sensors for the dGPU - but I don't have that.

Do you mean that you don't have a dGPU on your laptop, or rather that you simply don't have the 4 additional sensors displayed ?
I see that there is an entry "amdgpu-pci-c100" with a temperature, it looks like it could be related to the discrete GPU, if you have one. Let me know.

Also, out of curiosity, does ectool allow you to drive the 2 fans independently ? What does sudo ./ectool pwmgetfanrpm show ?

> btw. do you have a sponsor link?

Thank you so much for your appreciation for the project! I'm glad to hear that you find value in it. However, I see this project more as a personal hobby rather than something I rely on for financial support. I enjoy maintaining it from time to time, and frankly it doesn't eat my time at all.

@hasechris
Author

> > btw. do you have a sponsor link?

> Thank you so much for your appreciation for the project! I'm glad to hear that you find value in it. However, I see this project more as a personal hobby rather than something I rely on for financial support. I enjoy maintaining it from time to time, and frankly it doesn't eat my time at all.

Daaamn, so wholesome :3 Thank you again BIG time. If you don't see a problem with it, I will also make a small comment in the Framework forum to let other people know that fw-fanctrl will support the Framework 16.

> Do you mean that you don't have a dGPU on your laptop, or rather that you simply don't have the 4 additional sensors displayed ?

I meant that I don't have the dGPU and (naturally) don't have the four additional temp sensors in lm_sensors.

> I see that there is an entry "amdgpu-pci-c100" with a temperature, it looks like it could be related to the discrete GPU, if you have one. Let me know.

Yeah, nope. I have this block in sensors, but I don't have the dGPU. I think this block in sensors belongs to the GPU integrated in the APU.

EDIT: Tested it a bit. The voltage seems to be tied to the AMD SoC, maybe only the CPU cores. The PPT field also seems to be the total power of the SoC, because with a pure CPU load and no GPU load I see about 45 watts in the PPT field.

EDIT2:
Ah, found the correlation. With lspci i get this info:

[root@nemu-framework-manjaro ~]# lspci
c1:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Phoenix1 (rev c2)

This matches the block title amdgpu-pci-c100 in sensors. I don't really understand this, because the power and voltage info does not seem to relate to the GPU, yet the value name is vddgfx. Sorry that this is not really a big help.
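
The link between the sensors block name and the lspci address can be made explicit. A minimal sketch, assuming lm-sensors encodes the PCI bus in the upper byte of the hex suffix and device/function in the lower byte, and that the PCI domain is 0; this matches amdgpu-pci-c100 and c1:00.0 here, but the decoding and the function name `chip_to_pci_address` are assumptions rather than documented behaviour quoted in this thread:

```python
def chip_to_pci_address(chip_name):
    """Convert a sensors chip name such as "amdgpu-pci-c100" to "c1:00.0"."""
    _prefix, bus_type, addr_hex = chip_name.rsplit("-", 2)
    if bus_type != "pci":
        raise ValueError(f"not a PCI chip name: {chip_name}")
    addr = int(addr_hex, 16)
    bus = (addr >> 8) & 0xFF      # upper byte: bus number
    device = (addr >> 3) & 0x1F   # next 5 bits: device (slot)
    function = addr & 0x07        # lowest 3 bits: function
    return f"{bus:02x}:{device:02x}.{function}"


for name in ("amdgpu-pci-c100", "mt7921_phy0-pci-0100", "nvme-pci-0200"):
    print(name, "->", chip_to_pci_address(name))
```

The resulting string can then be compared against the first column of lspci output.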

> Also, out of curiosity, does ectool allow you to drive the 2 fans independently ? What does sudo ./ectool pwmgetfanrpm show ?

Info:
Fan 0 = Left
Fan 1 = Right
I determined the fan positions by manually setting a fan duty via ectool fanduty <fanindex> <duty>.

RPM Info:
stress -c 6

[root@nemu-framework-manjaro ~]# ectool pwmgetfanrpm
Fan 0 RPM: 3996
Fan 1 RPM: 3627

stress -c 3

[root@nemu-framework-manjaro ~]# ectool pwmgetfanrpm
Fan 0 RPM: 2899
Fan 1 RPM: 2685

stress -c 2

[root@nemu-framework-manjaro ~]# ectool pwmgetfanrpm
Fan 0 RPM: 2318
Fan 1 RPM: 2022

It seems the EC drives the two fans with the same PWM duty, but fan 1 runs at an offset to fan 0.
I think the left fan has more thermal work to do; see the following images.

The picture of the dGPU (which I don't have) seems to split the thermal load 50:50 between both fans.
In the laptop itself the heatpipes for the left fan are smaller, but there are two of them. Additionally, there are some chips on the left side of the mainboard which are thermally connected to the left heatpipes. I don't really know what they are; could be the VRM or something else.

[Attached images: image, IMG_20240310_235045, IMG_20240310_235049, IMG_20240310_235124, IMG_20240310_235137, IMG_20240310_235351]
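
The size of the offset between the two fans can be worked out from the three pwmgetfanrpm readings in this comment. A minimal sketch; the helper names are illustrative and the parsing assumes the "Fan N RPM: value" format shown above:

```python
import re


def parse_fan_rpm(text):
    """Return {fan_index: rpm} from `ectool pwmgetfanrpm` output."""
    return {int(i): int(rpm) for i, rpm in re.findall(r"Fan (\d+) RPM: (\d+)", text)}


def relative_gap_percent(rpms):
    """How much slower fan 1 spins than fan 0, in percent of fan 0's speed."""
    if rpms[0] == 0:
        return None  # fans stopped, no meaningful ratio
    return 100 * (rpms[0] - rpms[1]) / rpms[0]


# Readings copied from this comment, keyed by the load that produced them.
READINGS = {
    "stress -c 6": "Fan 0 RPM: 3996\nFan 1 RPM: 3627\n",
    "stress -c 3": "Fan 0 RPM: 2899\nFan 1 RPM: 2685\n",
    "stress -c 2": "Fan 0 RPM: 2318\nFan 1 RPM: 2022\n",
}

gaps = {load: relative_gap_percent(parse_fan_rpm(out)) for load, out in READINGS.items()}
for load, gap in gaps.items():
    print(f"{load}: fan 1 is {gap:.1f}% slower than fan 0")
```

For these three samples the gap comes out at roughly 9.2%, 7.4% and 12.8%, so it is not a fixed percentage.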

@TamtamHero
Owner

> Additionally, if i see that correctly, the block acpitz-acpi-0 should get 4 additional temp sensors for the dGPU - but i dont have that.

By the way, where did you get this info ?

I got a dump from a 16" owner with dGPU from Framework's discord (thanks again @Sniffels) and it would be nice to know what to pick exactly.
For the record:

{
   "amdgpu-pci-0300":{
      "Adapter": "PCI adapter",
      "vddgfx":{
         "in0_input": 0.018
      },
      "fan1":{
         "fan1_input": 0.000,
         "fan1_min": 0.000,
         "fan1_max": 4900.000
      },
      "edge":{
         "temp1_input": 49.000,
         "temp1_crit": 100.000,
         "temp1_crit_hyst": -273.150,
         "temp1_emergency": 105.000
      },
      "junction":{
         "temp2_input": 50.000,
         "temp2_crit": 100.000,
         "temp2_crit_hyst": -273.150,
         "temp2_emergency": 105.000
      },
      "mem":{
         "temp3_input": 57.000,
         "temp3_crit": 105.000,
         "temp3_crit_hyst": -273.150,
         "temp3_emergency": 110.000
      },
      "PPT":{
         "power1_average": 1.000,
         "power1_cap": 100.000
      }
   },
   "ucsi_source_psy_USBC000:003-isa-0000":{
      "Adapter": "ISA adapter",
      "in0":{
         "in0_input": 5.000,
         "in0_min": 5.000,
         "in0_max": 5.000
      },
      "curr1":{
         "curr1_input": 0.000,
         "curr1_max": 1.500
      }
   },
   "k10temp-pci-00c3":{
      "Adapter": "PCI adapter",
      "Tctl":{
         "temp1_input": 46.625
      }
   },
   "ucsi_source_psy_USBC000:001-isa-0000":{
      "Adapter": "ISA adapter",
      "in0":{
         "in0_input": 0.000,
         "in0_min": 0.000,
         "in0_max": 0.000
      },
      "curr1":{
         "curr1_input": 0.000,
         "curr1_max": 0.000
      }
   },
   "BAT1-acpi-0":{
      "Adapter": "ACPI interface",
      "in0":{
         "in0_input": 17.724
      },
      "curr1":{
         "curr1_input": 0.000
      }
   },
   "amdgpu-pci-c400":{
      "Adapter": "PCI adapter",
      "vddgfx":{
         "in0_input": 1.273
      },
      "vddnb":{
         "in1_input": 0.761
      },
      "edge":{
         "temp1_input": 44.000
      },
      "PPT":{
         "power1_average": 5.252,
         "power1_input": 12.213
      }
   },
   "ucsi_source_psy_USBC000:004-isa-0000":{
      "Adapter": "ISA adapter",
      "in0":{
         "in0_input": 5.000,
         "in0_min": 5.000,
         "in0_max": 5.000
      },
      "curr1":{
         "curr1_input": 5.000,
         "curr1_max": 3.000
      }
   },
   "mt7921_phy0-pci-0400":{
      "Adapter": "PCI adapter",
      "temp1":{
         "temp1_input": 48.000
      }
   },
   "ucsi_source_psy_USBC000:002-isa-0000":{
      "Adapter": "ISA adapter",
      "in0":{
         "in0_input": 0.000,
         "in0_min": 0.000,
         "in0_max": 0.000
      },
      "curr1":{
         "curr1_input": 0.000,
         "curr1_max": 0.000
      }
   },
   "nvme-pci-0500":{
      "Adapter": "PCI adapter",
      "Composite":{
         "temp1_input": 45.850,
         "temp1_max": 85.850,
         "temp1_min": -0.150,
         "temp1_crit": 86.850,
         "temp1_alarm": 0.000
      },
      "Sensor 1":{
         "temp2_input": 39.850,
         "temp2_max": 65261.850,
         "temp2_min": -273.150
      },
      "Sensor 2":{
         "temp3_input": 42.850,
         "temp3_max": 65261.850,
         "temp3_min": -273.150
      }
   },
   "acpitz-acpi-0":{
      "Adapter": "ACPI interface",
      "temp1":{
         "temp1_input": 45.800
      },
      "temp2":{
         "temp2_input": 46.800
      },
      "temp3":{
         "temp3_input": 45.800
      },
      "temp4":{
         "temp4_input": 45.800
      },
      "temp5":{
         "temp5_input": 49.800
      },
      "temp6":{
         "temp6_input": 51.800
      },
      "temp7":{
         "temp7_input": 49.800
      },
      "temp8":{
         "temp8_input": 49.800
      }
   }
}

> Seems the EC does drive the two fans with the same pwm duty, but fan1 does have an offset to fan0.

Interesting. I wonder if we should replicate this behavior or drive both fans at the same speed. I would think that having both at the same speed is more efficient and quiet, but I'm no expert.

@hasechris
Author

> > Additionally, if i see that correctly, the block acpitz-acpi-0 should get 4 additional temp sensors for the dGPU - but i dont have that.

> By the way, where did you get this info ?

As I see it now, I assumed this part incorrectly. ectool temps all shows 8 temps, and I assumed the lower half would be for a dGPU and therefore reads 0°C on my system. It now seems to me that this part is for the GPU in the AMD APU, and the EC just does not have temp sensors for the GPU in the APU.

[root@nemu-framework-manjaro ~]# ectool temps all
--sensor name -------- temperature -------- ratio (fan_off and fan_max) --
ambient_f75303@4d     309 K (= 36 C)        N/A (fan_off=0 K, fan_max=0 K)
charger_f75303@4d     310 K (= 37 C)        N/A (fan_off=0 K, fan_max=0 K)
apu_f75303@4d         308 K (= 35 C)           0% (320 K and 335 K)
cpu@4c                305 K (= 32 C)           0% (338 K and 370 K)
#################################################            I mean these following 4 temp sensors
gpu_amb_f75303@4d     273 K (= 0 C)        N/A (fan_off=0 K, fan_max=0 K)
gpu_vr_f75303@4d      273 K (= 0 C)           0% (323 K and 347 K)
gpu_vram_f75303@4d    273 K (= 0 C)        N/A (fan_off=0 K, fan_max=0 K)
gpu_amdr23m@40        273 K (= 0 C)           0% (323 K and 353 K)
[root@nemu-framework-manjaro ~]# 

> I got a dump from a 16" owner with dGPU from Framework's discord (thanks again @Sniffels) and it would be nice to know what to pick exactly.

Wohoo, that sounds superb. I correlated the info from @Sniffels with my info from sensors -j.
The block amdgpu-pci-c400 seems to correlate to my block amdgpu-pci-c100. The available subvalues are the same.

I interpret this to mean that the PCI address of the internal GPU shifts upwards from c1:00.0 to c4:00.0 when the dGPU is connected. I have already seen this happen on my desktop computer when I inserted another PCIe card in the lowest slot: all PCIe addresses changed, even those of devices before the address of the new card.

Maybe @Sniffels could send output from lspci, then we can check this.

So for your software there would be two versions of blocks, which you would have to check.

First Set is without a dGPU.
I marked the temperatures which you would have to get and factor in.

{
   [...]
   "amdgpu-pci-c100":{
      "Adapter": "PCI adapter",
      "vddgfx":{
         "in0_input": 0.815
      },
      "vddnb":{
         "in1_input": 0.760
      },
      "edge":{
         "temp1_input": 29.000                        <<<<<<< Temp of internal GPU
      },
      "PPT":{
         "power1_average": 4.043,
         "power1_input": 5.142
      }
   },
   "acpitz-acpi-0":{
      "Adapter": "ACPI interface",
      "temp1":{
         "temp1_input": 34.800
      },
      "temp2":{
         "temp2_input": 35.800
      },
      "temp3":{
         "temp3_input": 33.800                     <<<<<<< Temp of CPU Socket
      },
      "temp4":{
         "temp4_input": 30.800                     <<<<<<< Temp of hottest CPU Core
      }
   }
}

Second Set with a dGPU:

{
   "amdgpu-pci-0300":{
      "Adapter": "PCI adapter",
      "vddgfx":{
         "in0_input": 0.018
      },
      "fan1":{
         "fan1_input": 0.000,
         "fan1_min": 0.000,
         "fan1_max": 4900.000
      },
      "edge":{
         "temp1_input": 49.000,                           <<<<<<< Temp of dGPU (seems to be the package temp - see link under this part)
         "temp1_crit": 100.000,
         "temp1_crit_hyst": -273.150,
         "temp1_emergency": 105.000
      },
      "junction":{
         "temp2_input": 50.000,                           <<<<<<< Temp of dGPU (seems to be "hottest part of GPU")
         "temp2_crit": 100.000,
         "temp2_crit_hyst": -273.150,
         "temp2_emergency": 105.000
      },
      "mem":{
         "temp3_input": 57.000,                           <<<<<<< Temp of dGPU (seems to be the vram temperature of GPU)
         "temp3_crit": 105.000,
         "temp3_crit_hyst": -273.150,
         "temp3_emergency": 110.000
      },
      "PPT":{
         "power1_average": 1.000,
         "power1_cap": 100.000
      }
   },
   "amdgpu-pci-c400":{
      "Adapter": "PCI adapter",
      "vddgfx":{
         "in0_input": 1.273
      },
      "vddnb":{
         "in1_input": 0.761
      },
      "edge":{
         "temp1_input": 44.000                            <<<<<<< Temp of internal GPU
      },
      "PPT":{
         "power1_average": 5.252,
         "power1_input": 12.213
      }
   },
   "acpitz-acpi-0":{
      "Adapter": "ACPI interface",
      "temp1":{
         "temp1_input": 45.800
      },
      "temp2":{
         "temp2_input": 46.800
      },
      "temp3":{
         "temp3_input": 45.800                     <<<<<<< Temp of CPU Socket (if the same as on my system)
      },
      "temp4":{
         "temp4_input": 45.800                     <<<<<<< Temp of hottest CPU Core (if the same as on my system)
      },
      "temp5":{
         "temp5_input": 49.800
      },
      "temp6":{
         "temp6_input": 51.800
      },
      "temp7":{
         "temp7_input": 49.800
      },
      "temp8":{
         "temp8_input": 49.800
      }
   }
}

Link with more information on edge and junction temperatures on AMD GPUs
https://www.reddit.com/r/Amd/comments/ck1jjl/comment/evizwql/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

Sadly I don't have any real information on the block acpitz-acpi-0. Maybe someone on the Framework Discord has more information on it. I just correlated the temps from acpitz-acpi-0 with the temps from ectool on my system while playing with a bit of load.


> > Seems the EC does drive the two fans with the same pwm duty, but fan1 does have an offset to fan0.

> Interesting. I wonder if we should replicate this behavior or drive both fans at the same speed. I would think that having both at the same speed is more efficient and quiet, but I'm no expert.

Yeah, I don't think that would be necessary. Going by the RPM readings I posted earlier, the difference is roughly 7-13 percent.

Greetings
hasechris

@TamtamHero
Owner

I got a second output from another guy on Discord (thanks @Colson Barbelo) and the dGPU got the same labeling.
I'm not going to rely on it remaining the same for all future GPUs that eventually get supported. Instead, I look for an entry that contains both the junction and edge properties, and I read the edge temp value. The edge value seems better than the junction value, since junction is probably quite jumpy and less representative of the thermal state of the GPU (as mentioned in the reddit link you sent).
Then I compare it to the CPU temp (acpitz-acpi-0 -> temp3) and only keep the highest value, so we make the fans roar even if only one of the two is hot.
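
For readers following along, the selection described above can be sketched as follows. This is an illustration of the described logic run against trimmed copies of the dumps in this thread, not the code from #21; the function name `pick_temperature` is hypothetical.

```python
def pick_temperature(sensors):
    """Return the hotter of the CPU temp and the dGPU edge temp, if a dGPU block exists."""
    candidates = [sensors["acpitz-acpi-0"]["temp3"]["temp3_input"]]
    for features in sensors.values():
        # A block exposing both "junction" and "edge" is taken to be the dGPU.
        if isinstance(features, dict) and "junction" in features and "edge" in features:
            candidates.append(features["edge"]["temp1_input"])
    return max(candidates)


# Trimmed from the dump of a Framework 16 with dGPU.
with_dgpu = {
    "amdgpu-pci-0300": {
        "Adapter": "PCI adapter",
        "edge": {"temp1_input": 49.0},
        "junction": {"temp2_input": 50.0},
    },
    "amdgpu-pci-c400": {"Adapter": "PCI adapter", "edge": {"temp1_input": 44.0}},
    "acpitz-acpi-0": {"Adapter": "ACPI interface", "temp3": {"temp3_input": 45.8}},
}

# Trimmed from the dump of a Framework 16 without dGPU.
without_dgpu = {
    "amdgpu-pci-c100": {"Adapter": "PCI adapter", "edge": {"temp1_input": 43.0}},
    "acpitz-acpi-0": {"Adapter": "ACPI interface", "temp3": {"temp3_input": 46.8}},
}

print(pick_temperature(with_dgpu))
print(pick_temperature(without_dgpu))
```

The integrated GPU block is ignored in both cases because it has an edge value but no junction value.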

I have updated #21, could you please try it a bit before I merge it ?

@hasechris
Author

> I got a second output from another guy on Discord (thanks @Colson Barbelo) and the dGPU got the same labeling. I'm not going to hope that it remains the same for all future GPUs that will get supported eventually, instead I'm looking for an entry that contains the junction and edge property, and I read the edge temp value . The edge one seems better than the junction one since this one is probably quite jumpy and less representative of the thermal state of the GPU (as mentioned in the reddit link you sent) Then I compare it to the CPU temp (acpitz-acpi-0 -> temp3) and only keep the highest value, so we make the fan roar even if a single one is hot.

> I have updated #21, could you please try it a bit before I merge it ?

Yep, I have installed the new version and am testing right now. Looks super good at first glance: everything seems to just work as expected, strategyOnDischarge also works, and I did not have to fill in batteryChargingStatusPath. Noice :)

I will test the next couple of days and get back to you. Thanks again for all this 👍
Maybe ask @colson Barbelo to test this against the dGPU.

Best regards
hasechris

@hasechris
Author

Feedback after a couple of days.

Everything just works :) You can release this. Thanks again VERY much 😁

@TamtamHero
Owner

TamtamHero commented Mar 16, 2024

Very nice, thanks !
Didn't get any volunteer for testing the PR with dGPU though 🤷‍♂️
I guess it's okay, if someone has an issue with it, feel free to open a new issue here :)

#21 merged

@cbiffle

cbiffle commented Mar 25, 2024

Just a heads up -- running fw-fanctrl (built at b7a8259) on a Framework 16 actually stops my fans from spinning up on high workloads, even on aggressive profiles. The machine happily got quite hot on the outside before I figured this out and shut the service off; the EC took over and cooled the machine back off.

I haven't figured out what the problem is yet, but thought you should be aware. With fw-fanctrl running I never saw the RPMs cross ~1100, even as the temperature reached 100C. On shutting it down, the EC immediately drove them to ~3600.

TamtamHero reopened this Mar 26, 2024
@TamtamHero
Owner

TamtamHero commented Mar 26, 2024

Weird, thanks for letting me know !
Could you please give me the result of the command sensors -j ?
Also, do you have a dGPU ? I just merged a fix for it here: #22

@ngraham20
Contributor

Hey @TamtamHero I am running a FW 16 with a dedicated GPU, and the software works great for me 👍

@TamtamHero
Owner

I assume #22 fixed your issue @cbiffle
