priteau
Member

@priteau priteau commented Dec 5, 2022

The temp_max value is read dynamically from the device [1]. With Xeon CPUs (coretemp driver), it is often 90 °C, but sometimes lower.

This can help reduce the frequency of alerts on busy hypervisors.

[1] https://docs.kernel.org/hwmon/coretemp.html
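As a rough illustration of the idea, the sketch below compares each temperature reading against the limit reported by the device itself instead of a single fixed threshold. It is not the alert rule from this change, which is not shown in this conversation. It assumes the standard hwmon sysfs layout (`temp*_input` and `temp*_max` files holding millidegrees Celsius); the function names and the `fallback_max_c` value are hypothetical.

```python
from pathlib import Path


def read_celsius(path):
    """Read a hwmon temperature file (millidegrees Celsius) as degrees Celsius."""
    return int(Path(path).read_text().strip()) / 1000.0


def overheating_sensors(hwmon_dir, fallback_max_c=75.0):
    """Return (sensor, temperature, limit) for sensors at or above their limit.

    The limit is the device-reported temp*_max when that file exists.
    fallback_max_c is a hypothetical fixed threshold used only when the
    device reports no maximum; it is an assumption, not a value from the PR.
    """
    hot = []
    for input_file in sorted(Path(hwmon_dir).glob("temp*_input")):
        sensor = input_file.name[: -len("_input")]
        max_file = input_file.with_name(sensor + "_max")
        limit = read_celsius(max_file) if max_file.exists() else fallback_max_c
        temp = read_celsius(input_file)
        if temp >= limit:
            hot.append((sensor, temp, limit))
    return hot
```

With a device-reported limit of 90 °C, a busy hypervisor running at 85 °C would not be flagged, whereas a fixed lower threshold would flag it.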

@priteau priteau self-assigned this Dec 5, 2022
@priteau priteau requested a review from a team as a code owner December 5, 2022 12:25
@markgoddard
Contributor

In general it's better to push config changes into the earliest branch first, then merge through to later branches.

markgoddard
markgoddard previously approved these changes Dec 5, 2022
@priteau priteau changed the base branch from stackhpc/yoga to stackhpc/wallaby December 5, 2022 12:58
@priteau priteau dismissed markgoddard’s stale review December 5, 2022 12:58

The base branch was changed.

@priteau priteau changed the base branch from stackhpc/wallaby to stackhpc/xena December 5, 2022 12:59
@priteau
Member Author

priteau commented Dec 5, 2022

I rebased on stackhpc/xena, since the wallaby branch doesn't include these alerts.

@markgoddard markgoddard merged commit 48ace32 into stackhpc/xena Dec 7, 2022
@markgoddard markgoddard deleted the overheating-alert branch December 7, 2022 09:21