Support hardware with more than 1024 CPUs #126763

janvorli wants to merge 1 commit into dotnet:main
Conversation
A customer has reported that the .NET runtime fails to initialize on machines that have more than 1024 CPUs: `sched_getaffinity` was being passed the default instance of `cpu_set_t`, which supports at most 1024 CPUs and fails when the current machine has more. This change switches the `sched_getaffinity` calls to a dynamically allocated CPU set data structure so that any number of CPUs can be supported.

In the GC code, we keep the limit of at most 1024 heaps, but the CPU limit is now dynamic. The arrays `proc_no_to_heap_no` and `numa_node_to_heap_map` are now dynamically allocated based on the real number of processors configured on the system. The `AffinitySet` was also modified so it can hold affinities for a dynamic number of CPUs.

Several other arrays were originally sized by `MAX_SUPPORTED_CPUS`, but that was misleading, as they are really indexed by heap number. I've renamed the constant to `MAX_SUPPORTED_HEAPS` to make it clear that the number of supported CPUs is not limited.
Tagging subscribers to this area: @agocke, @dotnet/gc
Pull request overview
This PR updates CoreCLR (PAL, GC, and NativeAOT PAL) to correctly handle Linux machines with >1024 CPUs by avoiding fixed-size cpu_set_t usage and by making GC affinity-related data structures CPU-count-aware while keeping the GC heap limit at 1024.
Changes:
- Use dynamically-sized CPU affinity sets (`CPU_ALLOC`/`CPU_ALLOC_SIZE`) for `sched_getaffinity`/`sched_setaffinity` to support >1024 CPUs.
- Introduce `GCToOSInterface::GetMaxProcessorCount()` and make `AffinitySet` dynamically sized (plus rename `MAX_SUPPORTED_CPUS` → `MAX_SUPPORTED_HEAPS` for clarity).
- Allocate GC mapping tables based on actual processor capacity (e.g., `proc_no_to_heap_no`, `numa_node_to_heap_map`) while retaining the 1024-heap limit.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/coreclr/pal/src/thread/thread.cpp | Switch thread-start affinity reset to dynamically-sized cpu_set allocation. |
| src/coreclr/pal/src/misc/sysinfo.cpp | Make logical CPU count retrieval use dynamic cpu_set; add clamping for total CPU count. |
| src/coreclr/nativeaot/Runtime/unix/PalUnix.cpp | Update NativeAOT processor count initialization to use dynamic cpu_set. |
| src/coreclr/gc/windows/gcenv.windows.cpp | Initialize process affinity set dynamically; loop bounds updated to avoid 1024 CPU assumption. |
| src/coreclr/gc/unix/gcenv.unix.cpp | Initialize process affinity set based on configured CPU count; use dynamic cpu_set for affinity enumeration. |
| src/coreclr/gc/env/gcenv.os.h | Rename MAX_SUPPORTED_CPUS→MAX_SUPPORTED_HEAPS; make AffinitySet dynamically allocated; add GetMaxProcessorCount() API. |
| src/coreclr/gc/interface.cpp | Initialize config affinity set with max processor count; handle init failure for NUMA heap mapping. |
| src/coreclr/gc/gcconfig.cpp | Validate HeapAffinitizeRanges against dynamic max CPU count. |
| src/coreclr/gc/gc.cpp | Allocate CPU→heap and NUMA→heap maps dynamically; remove modulo mapping for processor→heap. |
| src/coreclr/gc/*.cpp / gcpriv.h | Replace array sizes indexed by heaps to use MAX_SUPPORTED_HEAPS and adjust signatures accordingly. |
```diff
  // Locate heap_number-th available processor
  uint16_t procIndex = 0;
  size_t cnt = heap_number;
- for (uint16_t i = 0; i < MAX_SUPPORTED_CPUS; i++)
+ for (uint16_t i = 0; i < g_totalCpuCount; i++)
  {
      if (g_processAffinitySet.Contains(i))
```
In GetProcessorForHeap, the loop counter is uint16_t but the upper bound is g_totalCpuCount (uint32_t). On systems with 65,536 processors, i will wrap and this loop becomes infinite. Use a wider type (e.g., uint32_t/size_t) for the iterator and only cast to uint16_t when storing the selected processor index.
| } | ||
|
|
||
| nrcpus = CPU_COUNT_S(CPU_ALLOC_SIZE(configuredCpuCount), pCpuSet); |
If sched_getaffinity fails, nrcpus is computed from the zeroed mask and may become 0, which can propagate into GetSystemInfo().dwNumberOfProcessors. Consider falling back to PAL_GetTotalCpuCount() (or at least clamping to 1) when st != 0 to avoid returning an invalid CPU count.
```diff
-     }
-     nrcpus = CPU_COUNT_S(CPU_ALLOC_SIZE(configuredCpuCount), pCpuSet);
+         nrcpus = PAL_GetTotalCpuCount();
+     }
+     else
+     {
+         nrcpus = CPU_COUNT_S(cpuSetSize, pCpuSet);
+         if (nrcpus < 1)
+         {
+             ASSERT("CPU_COUNT_S returned an invalid CPU count\n");
+             nrcpus = PAL_GetTotalCpuCount();
+         }
+     }
```
| } | ||
|
|
||
| count = (uint32_t)CPU_COUNT_S(CPU_ALLOC_SIZE(configuredCpuCount), pCpuSet); |
When sched_getaffinity fails, count is still computed from the (empty) cpu set and can become 0, which conflicts with the later _ASSERTE(count > 0) and can result in g_RhNumberOfProcessors being set to 0 in release builds. Consider falling back to GCToOSInterface::GetTotalProcessorCount() or clamping to 1 when st != 0.
```diff
-     }
-     count = (uint32_t)CPU_COUNT_S(CPU_ALLOC_SIZE(configuredCpuCount), pCpuSet);
+         count = GCToOSInterface::GetTotalProcessorCount();
+     }
+     else
+     {
+         count = (uint32_t)CPU_COUNT_S(cpuSetSize, pCpuSet);
+     }
```
```cpp
  int configuredCpuCount = sysconf(_SC_NPROCESSORS_CONF);
  if (configuredCpuCount == -1)
  {
      return false;
  }

  g_totalCpuCount = cpuCount;

  if (!g_processAffinitySet.Initialize(configuredCpuCount))
  {
      return false;
  }
```
GC init now hard-fails if sysconf(_SC_NPROCESSORS_CONF) returns -1. That introduces a new runtime-initialization failure mode; other call sites (e.g., PAL sysinfo/thread affinity) fall back to CPU_SETSIZE instead. Consider using a similar fallback here (and also validate configuredCpuCount >= 1) so GC initialization remains robust if sysconf returns an unexpected value.
```cpp
  #define __GCENV_OS_H__

  #include <new>
  using std::nothrow;
```
using std::nothrow; in a widely-included header injects a name into the global namespace for all includers and can cause subtle conflicts. Prefer removing the using-directive from the header and using new (std::nothrow) at the allocation sites instead.
```diff
- using std::nothrow;
```
```diff
  class AffinitySet
  {
      static const size_t BitsPerBitsetEntry = 8 * sizeof(uintptr_t);

-     uintptr_t m_bitset[MAX_SUPPORTED_CPUS / BitsPerBitsetEntry];
+     uintptr_t *m_bitset = nullptr;
+     size_t m_bitsetDataSize = 0;

      static uintptr_t GetBitsetEntryMask(size_t cpuIndex)
      {
          return (uintptr_t)1 << (cpuIndex & (BitsPerBitsetEntry - 1));
      }

      static size_t GetBitsetEntryIndex(size_t cpuIndex)
      {
          return cpuIndex / BitsPerBitsetEntry;
      }

  public:

-     static const size_t BitsetDataSize = MAX_SUPPORTED_CPUS / BitsPerBitsetEntry;
+     bool Initialize(int cpuCount)
+     {
+         m_bitsetDataSize = (cpuCount + BitsPerBitsetEntry - 1) / BitsPerBitsetEntry;
+         m_bitset = new (nothrow) uintptr_t[m_bitsetDataSize];
+         if (m_bitset == nullptr)
+         {
+             return false;
+         }
+
+         memset(m_bitset, 0, sizeof(uintptr_t) * m_bitsetDataSize);
+         return true;
+     }

-     AffinitySet()
+     ~AffinitySet()
      {
-         memset(m_bitset, 0, sizeof(m_bitset));
+         delete[] m_bitset;
      }
```
AffinitySet now owns a heap-allocated buffer and has a destructor, but it still has the compiler-generated copy/move operations. Any accidental copy (e.g., pass-by-value or assignment) would double-free m_bitset. Consider explicitly deleting copy/move ctor/assignment (or implementing deep copy / using a smart pointer) and also guarding Initialize against being called twice (to avoid leaking the previous buffer).
```diff
  {
      ASSERT("sched_getaffinity failed!\n");
      // The sched_getaffinity should never fail for getting affinity of the current process
      palError = ERROR_INTERNAL_ERROR;
      goto fail;
  }
+ int configuredCpuCount = sysconf(_SC_NPROCESSORS_CONF);
+ if (configuredCpuCount == -1)
+ {
+     // In the unlikely event that sysconf(_SC_NPROCESSORS_CONF) fails, just assume a reasonable default maximum number of CPUs to avoid failing thread creation.
+     configuredCpuCount = CPU_SETSIZE;
+ }

- st = sched_setaffinity(0, sizeof(cpu_set_t), &cpuSet);
- if (st != 0)
- {
-     if (errno == EPERM || errno == EACCES)
-     {
-         // Some sandboxed or restricted environments (snap strict confinement,
-         // vendor-modified Android kernels with strict SELinux policy) block
-         // sched_setaffinity even when passed a mask extracted via sched_getaffinity.
-         // Treat this as non-fatal — the thread will continue running on any
-         // available CPU rather than the originally affinitized one.
-         WARN("sched_setaffinity failed with EPERM/EACCES, ignoring\n");
-     }
-     else
+ cpu_set_t* pCpuSet = CPU_ALLOC(configuredCpuCount);
+ if (pCpuSet == nullptr)
+ {
+     ASSERT("CPU_ALLOC failed!\n");
+     palError = ERROR_OUTOFMEMORY;
+     goto fail;
+ }

+ size_t cpuSetSize = CPU_ALLOC_SIZE(configuredCpuCount);
+ CPU_ZERO_S(cpuSetSize, pCpuSet);
```
This change introduces a heap allocation (CPU_ALLOC/CPU_FREE) on every thread start to reset affinity. If thread creation is performance-sensitive in some workloads, consider caching the required cpu_set_t size (and possibly reusing a buffer) to avoid repeated malloc/free on the hot path, while still supporting >1024 CPUs.
```diff
-         ASSERT("sched_setaffinity failed!\n");
+         ASSERT("sched_getaffinity failed!\n");
+         CPU_FREE(pCpuSet);
          // The sched_getaffinity should never fail for getting affinity of the current process
```
We gracefully ignore sched_getaffinity failures in the GC in release builds. Should we do the same here?
Close #126747