Retrieval of core_id and physical_package_id for CPUs that are offline. #361

Closed
kishen-v opened this issue Feb 21, 2024 · 0 comments · Fixed by #362


kishen-v commented Feb 21, 2024

When a CPU is taken offline with echo 0 > /sys/devices/system/cpu/cpuX/online, its topology directory no longer contains the core_id and physical_package_id files.

Because the file is missing, the message WARNING: failed to read int from file: open /sys/devices/system/cpu/cpuX/topology/physical_package_id: no such file or directory is printed. The same warning is printed for the missing core_id file.

On a large compute node with many offline CPUs, this produces a large number of warnings interleaved with the rest of the output.

One such instance, observed on a system with offline CPUs:

WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu10/topology/physical_package_id: no such file or directory
WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu10/topology/core_id: no such file or directory
WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu11/topology/physical_package_id: no such file or directory
WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu11/topology/core_id: no such file or directory
WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu12/topology/physical_package_id: no such file or directory
WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu12/topology/core_id: no such file or directory
WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu13/topology/physical_package_id: no such file or directory
WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu13/topology/core_id: no such file or directory
WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu14/topology/physical_package_id: no such file or directory
WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu14/topology/core_id: no such file or directory
WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu15/topology/physical_package_id: no such file or directory
WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu15/topology/core_id: no such file or directory
WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu16/topology/physical_package_id: no such file or directory

... (the same warnings repeat for cpu8 through cpu64)

This can be avoided by checking that the online file in the cpuX directory contains 1 before attempting to read the physical_package_id and core_id files.
