Skip to content

Skip set_mempolicy for phantom NUMA nodes#296

Closed
nileshnegi wants to merge 5 commits into
candidatefrom
users/nileshnegi/fix/set-mempolicy-invalid
Closed

Skip set_mempolicy for phantom NUMA nodes#296
nileshnegi wants to merge 5 commits into
candidatefrom
users/nileshnegi/fix/set-mempolicy-invalid

Conversation

@nileshnegi
Copy link
Copy Markdown
Collaborator

@nileshnegi nileshnegi commented May 11, 2026

Motivation

Fix set_mempolicy EINVAL for memory-less NUMA nodes during topology init

Technical Details

CollectTopology() probes each NUMA node by allocating a small buffer, which calls numa_set_preferred(i) for all numa_num_configured_nodes(). If some nodes have no memory backing, set_mempolicy(MPOL_PREFERRED) returns EINVAL for them.

Guard the call with numa_get_mems_allowed() so only nodes with real memory get a preferred policy set.

Test Plan

Test Result

Submission Checklist

Copilot AI review requested due to automatic review settings May 11, 2026 00:20
@nileshnegi nileshnegi requested a review from a team as a code owner May 11, 2026 00:20
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR prevents numa_run_on_node() (and the resulting set_mempolicy calls) from being invoked for “phantom” NUMA nodes that aren’t present in the process’ allowed memory-node mask, avoiding EINVAL errors on platforms that expose NUMA nodes without memory backing.

Changes:

  • Guard numa_run_on_node(exeIndex) behind a numa_get_mems_allowed() / numa_bitmask_isbitset() check in the CPU executor path.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/header/TransferBench.hpp Outdated
Comment thread src/header/TransferBench.hpp Outdated
CollectTopology() probes each NUMA node by allocating a small buffer,
which calls numa_set_preferred(i) for all numa_num_configured_nodes().
If some nodes have no memory backing, set_mempolicy(MPOL_PREFERRED)
returns EINVAL for them.

Guard the call with numa_get_mems_allowed() so only nodes with real
memory get a preferred policy set.

Co-authored-by: Claude <claude@anthropic.com>
@nileshnegi nileshnegi force-pushed the users/nileshnegi/fix/set-mempolicy-invalid branch from 1366134 to 6b60e53 Compare May 11, 2026 00:33
@nileshnegi nileshnegi requested a review from Copilot May 11, 2026 01:37
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment thread src/header/TransferBench.hpp Outdated
nileshnegi and others added 2 commits May 13, 2026 16:50
numa_num_configured_nodes() includes phantom nodes (MNNVL fabric nodes,
HSA virtual nodes) that have no memory, causing set_mempolicy EINVAL on
every startup. Switch to numa_bitmask_weight(numa_get_mems_allowed()) so
only nodes with real memory are counted as CPU executors.

Also update the CollectTopology HSA agent probe loop and the per-core
NUMA counter to skip phantom nodes consistently. Print an [INFO] message
when phantom nodes are detected so the reduction in executor count is
visible and not surprising.

Co-authored-by: Claude <claude@anthropic.com>
Instead of a standalone [INFO] printf before the banner, surface the
phantom node count inline in the existing topology summary line:

  2 configured CPU NUMA node(s) [38 total; detected 36 phantom NUMA node(s) with no memory/CPU]

Falls back to the original format when no phantom nodes are present.

Co-authored-by: Claude <claude@anthropic.com>
Copilot AI review requested due to automatic review settings May 13, 2026 22:01
Co-authored-by: Claude <claude@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (3)

src/header/TransferBench.hpp:7219

  • This counts the allowed-memory bitmask from a numa_get_mems_allowed() allocation without freeing it. Store the returned mask, compute the weight, then free it to avoid leaking during topology collection.
    topo.numExecutors[EXE_CPU] = numCpus;

src/header/TransferBench.hpp:7231

  • Calling numa_get_mems_allowed() inside this loop leaks a newly allocated bitmask on every CPU iteration. Reuse a single mask for the loop and free it after the CPU-core scan.
        topo.numSubExecutors[{EXE_CPU, node}]++;

src/header/TransferBench.hpp:7764

  • This loop calls numa_get_mems_allowed() for each node and never frees the returned bitmask, leaking memory during topology initialization. Allocate the mask once before the loop, reuse it for all nodes, and free it afterward.
        AllocateMemory({MEM_CPU, node}, 1024, (void**)&tempBuffer);

Comment thread src/header/TransferBench.hpp Outdated
Comment on lines +1552 to +1553
if (deviceIdx < 0 || numa_bitmask_isbitset(numa_get_mems_allowed(), deviceIdx)) {
numa_set_preferred(deviceIdx);
Comment on lines +7229 to 7232
int node = numa_node_of_cpu(cpuCore);
if (numa_bitmask_isbitset(numa_get_mems_allowed(), node))
topo.numSubExecutors[{EXE_CPU, node}]++;
}
Comment on lines +7763 to 7767
if (!numa_bitmask_isbitset(numa_get_mems_allowed(), node)) continue;
AllocateMemory({MEM_CPU, node}, 1024, (void**)&tempBuffer);
hsa_amd_pointer_info(tempBuffer, &info, NULL, NULL, NULL);
cpuAgents.push_back(info.agentOwner);
DeallocateMemory(MEM_CPU, tempBuffer, 1024);
Comment thread src/client/Topology.hpp
Comment on lines +91 to +93
if (numPhantom > 0)
printf(" %d configured CPU NUMA node(s) [%d total; detected %d phantom NUMA node(s) with no memory/CPU]\n",
numCpus, numCpusTotal, numPhantom);
Comment thread src/header/TransferBench.hpp
The guard added to skip numa_set_preferred() for phantom NUMA nodes was
internally inconsistent: it skipped the preferred policy but still called
CheckPages(deviceIdx), which would deterministically fail since pages
cannot land on a node with no memory backing.

The guard is also now redundant — CollectTopology iterates only over
nodes in mems_allowed (added in the previous commit), so phantom node
indices never reach AllocateMemory.

Co-authored-by: Claude <claude@anthropic.com>
@gilbertlee-amd
Copy link
Copy Markdown
Collaborator

This functionality has been addressed in a more robust manner in
#303

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants