This repository was archived by the owner on Jun 30, 2025. It is now read-only.

@skmono skmono commented Nov 17, 2022

Remove numa dependency and acquire cores/threads by parsing lscpu

@skmono skmono requested a review from a team as a code owner November 17, 2022 05:16
@skmono skmono requested review from justalittlenoob and removed request for a team November 17, 2022 05:16

skmono commented Nov 17, 2022

@justalittlenoob What do you think of this approach? This PR removes the numa dependency.

@justalittlenoob

Hi @skmono
This PR looks good.
I have only one concern. If the code is compiled on one machine and executed on a different one, IPCL_THREAD_NUM may not be correct at runtime.

… hardcoded fixed value

* Set ```IPCL_NUM_NODES``` macro after parsing ```lscpu```

skmono commented Nov 18, 2022

@justalittlenoob You're right. Also hard coded num_sockets could pose the same issue. Let me dig into it a bit more.

@justalittlenoob

> @justalittlenoob You're right. Also hard coded num_sockets could pose the same issue. Let me dig into it a bit more.

I have an idea, FYI. How about adding several compilation options, such as IPCL_COMPILE_ONLY, IPCL_EXECUTE_ONLY and IPCL_COMPILE_AND_EXECUTE?
If the machine is only responsible for compiling (IPCL_COMPILE_ONLY), then we can move the detection of avx512, rdseed, rdrand, num_thread, etc. to compile time.
If the machine is only responsible for executing code (IPCL_EXECUTE_ONLY), then those features are detected in the C++ code at runtime.
And so on.
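
A minimal sketch of how such a build-mode switch could look for the thread count alone. The macros `DEMO_EXECUTE_ONLY` and `DEMO_BAKED_THREAD_COUNT` and the function `threadCount` are hypothetical names made up for this illustration; they are not options or APIs of this project.

```cpp
#include <thread>

// DEMO_BAKED_THREAD_COUNT is a hypothetical stand-in for a value that a build
// script would pass with -D after probing the build machine.
#ifndef DEMO_BAKED_THREAD_COUNT
#define DEMO_BAKED_THREAD_COUNT 4
#endif

inline unsigned threadCount() {
#ifdef DEMO_EXECUTE_ONLY
  // Execute-only build: ignore baked-in values and ask the machine we run on.
  const unsigned n = std::thread::hardware_concurrency();
  return n == 0 ? 1u : n;  // 0 means "unknown"; falling back to 1 is an assumption
#else
  // Default build: use the value recorded when the binary was compiled.
  return DEMO_BAKED_THREAD_COUNT;
#endif
}
```

Building with `-DDEMO_EXECUTE_ONLY` switches the function to the runtime query, so a binary moved to another machine no longer carries the build machine's count.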

skmono commented Nov 18, 2022

@justalittlenoob I added a /proc/cpuinfo parser to detect the number of sockets (nodes).

So there will be the following cases:

| Case | IPCL_THREAD_COUNT explicitly set | IPCL_THREAD_COUNT not set |
| --- | --- | --- |
| IPCL_DETECT_CPU_RUNTIME=OFF | cpus = IPCL_THREAD_COUNT; nodes = IPCL_NUM_NODES | cpus = std::thread::hardware_concurrency; nodes = IPCL_NUM_NODES |
| IPCL_DETECT_CPU_RUNTIME=ON | cpus = IPCL_THREAD_COUNT; nodes from parser | cpus = std::thread::hardware_concurrency; nodes from parser |
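
The four cases can be sketched as a single resolution function. This is an illustration only: `Topology` and `resolveTopology` are hypothetical names, and using 0 as the "not set" / "runtime detection off" marker is an assumption of this sketch, not something stated in the PR.

```cpp
#include <thread>

// Hypothetical result type, for illustration only.
struct Topology {
  unsigned cpus;
  unsigned nodes;
};

// thread_count_override: value of IPCL_THREAD_COUNT, or 0 if it was not set.
// compile_time_nodes:    socket count baked in at build time (IPCL_NUM_NODES).
// runtime_nodes:         socket count from the runtime parser, or 0 when
//                        runtime detection is OFF.
inline Topology resolveTopology(unsigned thread_count_override,
                                unsigned compile_time_nodes,
                                unsigned runtime_nodes) {
  unsigned cpus = thread_count_override != 0
                      ? thread_count_override
                      : std::thread::hardware_concurrency();
  // hardware_concurrency() may return 0 when the value is not computable;
  // falling back to 1 is an assumption of this sketch.
  if (cpus == 0) cpus = 1;

  const unsigned nodes = runtime_nodes != 0 ? runtime_nodes : compile_time_nodes;
  return Topology{cpus, nodes};
}
```

For example, `resolveTopology(8, 2, 0)` corresponds to the top-left cell (explicit thread count, runtime detection off), and `resolveTopology(0, 2, 4)` to the bottom-right cell.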

The CMake compile definition for nodes can be found in:

ipcl_get_core_thread_count(num_cores num_threads num_sockets)

# set cpu socket count parsed from lscpu precompile
add_compile_definitions(IPCL_NUM_NODES=${num_sockets})

The nodes parser is used from:

#ifdef IPCL_RUNTIME_DETECT_CPU_FEATURES
static const linuxCPUInfo cpuinfo;
static const linuxCPUInfo getLinuxCPUInfo() { return GetLinuxCPUInfo(); }
#endif

#ifdef IPCL_RUNTIME_DETECT_CPU_FEATURES
const linuxCPUInfo OMPUtilities::cpuinfo = OMPUtilities::getLinuxCPUInfo();
#endif // IPCL_RUNTIME_DETECT_CPU_FEATURES

What do you think?


skmono commented Nov 18, 2022

There is little difference in performance, since /proc/cpuinfo will be parsed only once.
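
To illustrate the parse-once behavior, here is a small sketch. It uses a function-local static rather than the static class member shown in the snippet above; that is a deliberate substitution for brevity, not the PR's code. `CpuInfo`, `parseCpuInfo`, `cachedCpuInfo` and the call counter are hypothetical names.

```cpp
#include <atomic>

// Hypothetical stand-in for the PR's linuxCPUInfo.
struct CpuInfo {
  unsigned sockets;
};

// Counts how often the (placeholder) parse runs, for demonstration only.
inline std::atomic<int>& parseCalls() {
  static std::atomic<int> calls{0};
  return calls;
}

// Placeholder for the expensive step, e.g. reading /proc/cpuinfo.
inline CpuInfo parseCpuInfo() {
  ++parseCalls();
  return CpuInfo{1};
}

// The static is initialized on first use, exactly once, and that
// initialization is thread-safe since C++11.
inline const CpuInfo& cachedCpuInfo() {
  static const CpuInfo info = parseCpuInfo();
  return info;
}
```

However many times `cachedCpuInfo()` is called, `parseCpuInfo()` runs once, so the cost is paid a single time per process.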

@justalittlenoob justalittlenoob left a comment

It's great.

@skmono skmono merged commit 2fcde46 into ipcl_v2.0.0 Nov 18, 2022
@skmono skmono deleted the skmono/refactor_thread_counts branch November 18, 2022 06:57