Skip set_mempolicy for phantom NUMA nodes#296
Closed
nileshnegi wants to merge 5 commits into
Closed
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR prevents numa_run_on_node() (and the resulting set_mempolicy calls) from being invoked for “phantom” NUMA nodes that aren’t present in the process’ allowed memory-node mask, avoiding EINVAL errors on platforms that expose NUMA nodes without memory backing.
Changes:
- Guard
numa_run_on_node(exeIndex)behind anuma_get_mems_allowed()/numa_bitmask_isbitset()check in the CPU executor path.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
CollectTopology() probes each NUMA node by allocating a small buffer, which calls numa_set_preferred(i) for all numa_num_configured_nodes(). If some nodes have no memory backing, set_mempolicy(MPOL_PREFERRED) returns EINVAL for them. Guard the call with numa_get_mems_allowed() so only nodes with real memory get a preferred policy set. Co-authored-by: Claude <claude@anthropic.com>
1366134 to
6b60e53
Compare
numa_num_configured_nodes() includes phantom nodes (MNNVL fabric nodes, HSA virtual nodes) that have no memory, causing set_mempolicy EINVAL on every startup. Switch to numa_bitmask_weight(numa_get_mems_allowed()) so only nodes with real memory are counted as CPU executors. Also update the CollectTopology HSA agent probe loop and the per-core NUMA counter to skip phantom nodes consistently. Print an [INFO] message when phantom nodes are detected so the reduction in executor count is visible and not surprising. Co-authored-by: Claude <claude@anthropic.com>
Instead of a standalone [INFO] printf before the banner, surface the phantom node count inline in the existing topology summary line: 2 configured CPU NUMA node(s) [38 total; detected 36 phantom NUMA node(s) with no memory/CPU] Falls back to the original format when no phantom nodes are present. Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude <claude@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (3)
src/header/TransferBench.hpp:7219
- This counts the allowed-memory bitmask from a
numa_get_mems_allowed()allocation without freeing it. Store the returned mask, compute the weight, then free it to avoid leaking during topology collection.
topo.numExecutors[EXE_CPU] = numCpus;
src/header/TransferBench.hpp:7231
- Calling
numa_get_mems_allowed()inside this loop leaks a newly allocated bitmask on every CPU iteration. Reuse a single mask for the loop and free it after the CPU-core scan.
topo.numSubExecutors[{EXE_CPU, node}]++;
src/header/TransferBench.hpp:7764
- This loop calls
numa_get_mems_allowed()for each node and never frees the returned bitmask, leaking memory during topology initialization. Allocate the mask once before the loop, reuse it for all nodes, and free it afterward.
AllocateMemory({MEM_CPU, node}, 1024, (void**)&tempBuffer);
Comment on lines
+1552
to
+1553
| if (deviceIdx < 0 || numa_bitmask_isbitset(numa_get_mems_allowed(), deviceIdx)) { | ||
| numa_set_preferred(deviceIdx); |
Comment on lines
+7229
to
7232
| int node = numa_node_of_cpu(cpuCore); | ||
| if (numa_bitmask_isbitset(numa_get_mems_allowed(), node)) | ||
| topo.numSubExecutors[{EXE_CPU, node}]++; | ||
| } |
Comment on lines
+7763
to
7767
| if (!numa_bitmask_isbitset(numa_get_mems_allowed(), node)) continue; | ||
| AllocateMemory({MEM_CPU, node}, 1024, (void**)&tempBuffer); | ||
| hsa_amd_pointer_info(tempBuffer, &info, NULL, NULL, NULL); | ||
| cpuAgents.push_back(info.agentOwner); | ||
| DeallocateMemory(MEM_CPU, tempBuffer, 1024); |
Comment on lines
+91
to
+93
| if (numPhantom > 0) | ||
| printf(" %d configured CPU NUMA node(s) [%d total; detected %d phantom NUMA node(s) with no memory/CPU]\n", | ||
| numCpus, numCpusTotal, numPhantom); |
The guard added to skip numa_set_preferred() for phantom NUMA nodes was internally inconsistent: it skipped the preferred policy but still called CheckPages(deviceIdx), which would deterministically fail since pages cannot land on a node with no memory backing. The guard is also now redundant — CollectTopology iterates only over nodes in mems_allowed (added in the previous commit), so phantom node indices never reach AllocateMemory. Co-authored-by: Claude <claude@anthropic.com>
Collaborator
|
This functionality has been addressed in a more robust manner in |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
Fix set_mempolicy EINVAL for memory-less NUMA nodes during topology init
Technical Details
CollectTopology() probes each NUMA node by allocating a small buffer, which calls
numa_set_preferred(i)for allnuma_num_configured_nodes(). If some nodes have no memory backing,set_mempolicy(MPOL_PREFERRED)returns EINVAL for them.Guard the call with
numa_get_mems_allowed()so only nodes with real memory get a preferred policy set.Test Plan
Test Result
Submission Checklist