[BUG] Weird total CPU temp reading on FreeBSD and Ryzen 9 #488

Open
MikeJakubik opened this issue Dec 31, 2022 · 11 comments

@MikeJakubik

I get an odd total CPU temp reading on FreeBSD with a Ryzen 9: it's always -273C. The individual cores appear to have sane values, but they all display the same temp. Attached is a screenshot.

[mike@fbsd /usr/home/mike]$ uname -a
FreeBSD fbsd.localdomain 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-701b36961c: Thu Dec 29 19:28:32 EST 2022     mike@fbsd.localdomain:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64

[mike@fbsd /usr/home/mike]$ sysctl -a|grep temp
amdtemp0: <AMD CPU On-Die Thermal Sensors> on hostb0
vm.pfault_oom_attempts: 3
net.inet6.ip6.use_tempaddr: 0
net.inet6.ip6.temppltime: 86400
net.inet6.ip6.tempvltime: 604800
net.inet6.ip6.prefer_tempaddr: 0
        value:  /boot/kernel/amdtemp.ko
hw.usb.template: -1
kstat.zfs.misc.arcstats.arc_tempreserve: 0
dev.amdtemp.0.ccd1: 42.1C
dev.amdtemp.0.ccd0: 38.1C
dev.amdtemp.0.core0.sensor0: 39.2C
dev.amdtemp.0.sensor_offset: 0
dev.amdtemp.0.%parent: hostb0
dev.amdtemp.0.%pnpinfo: 
dev.amdtemp.0.%location: 
dev.amdtemp.0.%driver: amdtemp
dev.amdtemp.0.%desc: AMD CPU On-Die Thermal Sensors
dev.amdtemp.%parent: 
dev.cpu.31.temperature: 39.2C
dev.cpu.30.temperature: 39.2C
dev.cpu.29.temperature: 39.2C
dev.cpu.28.temperature: 39.2C
dev.cpu.27.temperature: 39.2C
dev.cpu.26.temperature: 39.2C
dev.cpu.25.temperature: 39.2C
dev.cpu.24.temperature: 39.2C
dev.cpu.23.temperature: 39.2C
dev.cpu.22.temperature: 39.2C
dev.cpu.21.temperature: 39.2C
dev.cpu.20.temperature: 39.2C
dev.cpu.19.temperature: 39.2C
dev.cpu.18.temperature: 39.2C
dev.cpu.17.temperature: 39.2C
dev.cpu.16.temperature: 39.2C
dev.cpu.15.temperature: 39.2C
dev.cpu.14.temperature: 39.2C
dev.cpu.13.temperature: 39.2C
dev.cpu.12.temperature: 39.2C
dev.cpu.11.temperature: 39.2C
dev.cpu.10.temperature: 39.2C
dev.cpu.9.temperature: 39.2C
dev.cpu.8.temperature: 39.2C
dev.cpu.7.temperature: 39.2C
dev.cpu.6.temperature: 39.2C
dev.cpu.5.temperature: 39.2C
dev.cpu.4.temperature: 39.2C
dev.cpu.3.temperature: 39.2C
dev.cpu.2.temperature: 39.2C
dev.cpu.1.temperature: 39.2C
dev.cpu.0.temperature: 39.2C

Screenshot_20221230_231537

@MikeJakubik MikeJakubik added the bug Something isn't working label Dec 31, 2022
@MikeJakubik
Author

MikeJakubik commented Dec 31, 2022

[mike@fbsd /usr/home/mike/Programs/btop]$ gmake CXX=g++12 STRIP=true ADDFLAGS="-march=native" info
 
 ██████╗ ████████╗ ██████╗ ██████╗
 ██╔══██╗╚══██╔══╝██╔═══██╗██╔══██╗   ██╗    ██╗
 ██████╔╝   ██║   ██║   ██║██████╔╝ ██████╗██████╗
 ██╔══██╗   ██║   ██║   ██║██╔═══╝  ╚═██╔═╝╚═██╔═╝
 ██████╔╝   ██║   ╚██████╔╝██║        ╚═╝    ╚═╝
 ╚═════╝    ╚═╝    ╚═════╝ ╚═╝      Makefile v1.4
PLATFORM   ?| FreeBSD
ARCH       ?| x86_64
CXX        ?| g++12 (12.2.0)
THREADS    :| 32
REQFLAGS   !| -std=c++20
WARNFLAGS  :| -Wall -Wextra -pedantic
OPTFLAGS   :| -O2 -ftree-loop-vectorize -flto=32
LDCXXFLAGS :| -pthread -D_FORTIFY_SOURCE=2 -D_GLIBCXX_ASSERTIONS -fexceptions -fstack-clash-protection -fcf-protection -fstack-protector -march=native -s -lstdc++ -lm -lkvm -ldevstat -Wl,-rpath=/usr/local/lib/gcc12
CXXFLAGS   +| $(REQFLAGS) $(LDCXXFLAGS) $(OPTFLAGS) $(WARNFLAGS)
LDFLAGS    +| $(LDCXXFLAGS) $(OPTFLAGS) $(WARNFLAGS)

I wish this would compile with LLVM rather than requiring GCC specifically (LLVM 13 is in base FreeBSD, 14 in the master branch). Under some load, this is how it looks (if it matters, this does not happen in bpytop, which usually works flawlessly):

Screenshot_20221231_173953

@imwints
Contributor

imwints commented Jan 2, 2023

I'm working on Clang support. I only have some compile flags to add and std::views::split to look at, which isn't implemented at all in libc++. You cannot compile libstdc++'s <ranges> with Clang at all.

Edit: I've read the other issues and PRs regarding llvm and it seems that there is currently no interest to support premature llvm support by @aristocratos, but if that has changed by now I'm willing to open a feature request.

@MikeJakubik
Author

MikeJakubik commented Jan 2, 2023

I'm working on Clang support. I only have some compile flags to add and std::views::split to look at, which isn't implemented at all in libc++. You cannot compile libstdc++'s <ranges> with Clang at all.

Edit: I've read the other issues and PRs regarding llvm and it seems that there is currently no interest to support premature llvm support by @aristocratos, but if that has changed by now I'm willing to open a feature request.

That's great to hear. It might make things easier on Apple computers too, since they come with LLVM by default. Any idea how or where that total CPU temp value is derived? Perhaps I can point to the right resource in FreeBSD itself (I also have access to some Intel and AMD Epyc servers running FreeBSD).

@aristocratos
Owner

@stwnt

I've read the other issues and PRs regarding llvm and it seems that there is currently no interest to support premature llvm [...]

I've never really voiced any opinion on llvm, since clang didn't support std::ranges before version 15, which was only recently released. I believe you are referring to the discussions about cmake?

However, supporting compilation with Clang 15 shouldn't be an issue with the current build system.

Testing for supported compiler flags is already done in the Makefile (some of the currently used flags are hardware specific), see:

btop/Makefile

Lines 42 to 43 in c4ee41e

#? Any flags added to TESTFLAGS must not contain whitespace for the testing to work
override TESTFLAGS := -fexceptions -fstack-clash-protection -fcf-protection

btop/Makefile

Lines 127 to 128 in c4ee41e

#? Filter out unsupported compiler flags
override GOODFLAGS := $(foreach flag,$(TESTFLAGS),$(strip $(shell echo "int main() {}" | $(CXX) -o /dev/null $(flag) -x c++ - >/dev/null 2>&1 && echo $(flag) || true)))

Theoretically, the only change needed in the Makefile is to add a check that runs $(CXX) --version, greps for clang, and verifies that $(CXX) -dumpversion is greater than or equal to 15.0.0.
Then add or switch any needed flags.
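
The check described above boils down to two tests: does the version banner mention clang, and is the major version at least 15? Below is a minimal C++ sketch of that logic only, not the Makefile change itself. The helper names `major_version` and `clang_is_supported` are hypothetical, and the strings passed in are assumed to be the captured output of `$(CXX) --version` and `$(CXX) -dumpversion`.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical helper: extract the leading major number from a version
// string such as "15.0.0". Returns -1 if the string does not start with a number.
int major_version(std::string_view version) {
    int major = -1;
    const auto result = std::from_chars(version.data(),
                                        version.data() + version.size(), major);
    if (result.ec != std::errc{}) return -1;
    return major;
}

// Hypothetical helper mirroring the described check: the --version banner
// must mention "clang" and the -dumpversion major must be >= minimum_major.
bool clang_is_supported(std::string_view version_banner,
                        std::string_view dumpversion,
                        int minimum_major = 15) {
    const bool is_clang = version_banner.find("clang") != std::string_view::npos;
    return is_clang && major_version(dumpversion) >= minimum_major;
}
```

Comparing only the major number is enough for a threshold of 15.0.0, since every 15.x.y release satisfies it.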

The Tools::ssplit() function in btop_tools.cpp was also an issue when compiling with msvc in btop4win, so the rewritten version of that function can be copied over from btop4win and used instead:
https://github.com/aristocratos/btop4win/blob/c2ab1e50e2fdcc294a6c16eeb878b36600d18eec/src/btop_tools.cpp#L370-L381
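
For readers who have not followed the link, here is a rough sketch of what a split helper without std::views::split can look like. This is not the btop or btop4win code: the name `split_fields` is hypothetical, and skipping empty fields is an assumption about the desired behaviour.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a split helper; NOT the btop4win implementation.
// Splits `text` on `delim` using only std::string_view::find, and skips
// empty fields (an assumption made for this sketch).
std::vector<std::string> split_fields(std::string_view text, char delim = ' ') {
    std::vector<std::string> out;
    std::size_t start = 0;
    while (start <= text.size()) {
        const std::size_t end = text.find(delim, start);
        const std::size_t stop = (end == std::string_view::npos) ? text.size() : end;
        if (stop > start) out.emplace_back(text.substr(start, stop - start));
        if (end == std::string_view::npos) break;  // no more delimiters
        start = end + 1;                           // continue after the delimiter
    }
    return out;
}
```

Because it relies only on std::string_view and std::vector, a helper along these lines should build with any C++17 or later standard library, whichever ranges features it implements.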

I can take a look at it when I've got some time if you are unfamiliar with (the sometimes a bit abstract) Makefile logic :)

@MikeJakubik
Regarding your issue, I'm not sure why you get temperatures for each core with sysctl. As far as I know, Ryzen only has sensors for the CCDs, so the only real temps in your output would be:

dev.amdtemp.0.ccd1: 42.1C
dev.amdtemp.0.ccd0: 38.1C

So the CPU "package" temp should be the average of 42.1 and 38.1, and the CPU cores should have the same temperature as the CCD they belong to.
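
As a worked example of that arithmetic, the sketch below averages the two CCD readings and assigns each core the temperature of its CCD. The function names are hypothetical, and the mapping assumes that logical CPUs are split evenly and contiguously across CCDs (CPUs 0 to 15 on ccd0, 16 to 31 on ccd1), which is not confirmed in this thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Average of the CCD readings, used here as the "package" temperature.
double package_temp(const std::vector<double>& ccd_temps) {
    if (ccd_temps.empty()) return 0.0;
    const double sum = std::accumulate(ccd_temps.begin(), ccd_temps.end(), 0.0);
    return sum / static_cast<double>(ccd_temps.size());
}

// ASSUMPTION: cores are distributed evenly and contiguously across CCDs.
// Each core reports the temperature of the CCD it is assumed to belong to.
std::vector<double> core_temps(const std::vector<double>& ccd_temps,
                               std::size_t core_count) {
    std::vector<double> out(core_count, 0.0);
    if (ccd_temps.empty() || core_count == 0) return out;
    // Cores per CCD, rounded up so every core maps to a valid CCD index.
    const std::size_t per_ccd =
        (core_count + ccd_temps.size() - 1) / ccd_temps.size();
    for (std::size_t i = 0; i < core_count; ++i) {
        out[i] = ccd_temps[std::min(i / per_ccd, ccd_temps.size() - 1)];
    }
    return out;
}
```

With the readings above (ccd0 at 38.1 and ccd1 at 42.1), `package_temp` gives roughly 40.1.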

The reason you are getting wrong values for the CPU is probably that AMD changed the names of the sensors again, so when btop was written (before Ryzen 9 was released) these sensor names weren't included.
Will take a look at it when I've got some time.

@MikeJakubik
Author

MikeJakubik commented Jan 4, 2023

The GCC vs LLVM thing isn't a major issue for me, I just thought it would be nice. I'm not a dev, just an admin, and I assumed C++ standards and features would be the same, but I guess not. The main issue is the display of a -270C temperature (how is this number calculated?). This works fine in bpytop and I'm pretty sure it used to in the C version too, so I'm not sure what changed. If I knew how this value was produced, it should be simple to pinpoint the issue.
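
One plausible explanation for a reading near -273C, not confirmed anywhere in this thread, is a unit conversion applied to a value that was never read. FreeBSD temperature sysctls report tenths of a kelvin, so a raw value left at 0 (for example because the sensor lookup failed) converts to about -273C. The sketch below only illustrates that arithmetic. The function names are hypothetical and this is not btop's actual code.

```cpp
#include <optional>

// 273.15 K expressed in tenths of a kelvin, rounded to an integer.
constexpr int kZeroCelsiusInDeciKelvin = 2732;

// Convert a raw deciKelvin reading to whole degrees Celsius.
int decikelvin_to_celsius(int raw_dk) {
    return (raw_dk - kZeroCelsiusInDeciKelvin) / 10;
}

// Hypothetical safer variant: treat a missing or zero reading as "no value"
// instead of converting it into a bogus temperature.
std::optional<int> celsius_or_nothing(std::optional<int> raw_dk) {
    if (!raw_dk || *raw_dk <= 0) return std::nullopt;
    return decikelvin_to_celsius(*raw_dk);
}
```

Under this assumption, `decikelvin_to_celsius(0)` yields -273, matching the value in the original report, while a raw reading of 3123 yields 39.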

@MikeJakubik
Author

MikeJakubik commented Jan 5, 2023

Also, just FYI: the Ryzen 9 and Epyc CPUs (and most Zen3+ parts I've seen) do report temps for each individual core (even for the L3 caches!), not just for the CCDs. This is probably why we see an individual temp entry per CPU in FreeBSD's dev.cpu sysctls (though they don't seem to report correctly in this case). Attached is a screenshot of the exact same system running Windows 11 with HWiNFO.

hwinfo

@MikeJakubik
Author

MikeJakubik commented Jan 8, 2023

Update.

I switched to the main branch from FreeBSD, recompiled, and it shows the total temp correctly now. However, IO does not work: it just reads 0% usage at all times. I'm going back to bpytop, as it works perfectly.

@danjenson

I am seeing something similar on Void Linux:
image
The CPU temp shown in the upper right is always higher than the measurement for any core.

@MikeJakubik
Author

MikeJakubik commented May 29, 2023

I tried compiling the latest master but ran into a few issues. It complains about -flto being invalid (40) and can't find something called fmt/core.h. I tried commenting out -flto and installing a port named libfmt, but it still can't find the header. I tried both gcc12 and clang15 and had the same issue with each. It seems like a ./configure script would be handy to detect these.

@imwints
Contributor

imwints commented May 29, 2023

@MikeJakubik
The PR for Clang isn't merged yet, but Clang 16 is required anyway.

Seems like you didn't pull the submodule properly

git pull
git submodule init
git pull --recurse-submodules

@MikeJakubik
Author

@MikeJakubik The PR for Clang isn't merged yet, but Clang 16 is required anyway.

Seems like you didn't pull the submodule properly

git pull
git submodule init
git pull --recurse-submodules

Ahh yes, I had not done git clone --recursive; doing so took care of libfmt. The rpath is still hardcoded in the Makefile, though. After changing it to reflect gcc12 I got it to compile, but it still shows a bogus overall CPU temp of 8C.

Screenshot 2023-05-29 051210
