Fix cpuinfo init on Linux without CPU sysfs lists #28230
Merged
tianleiwu merged 4 commits into microsoft:main on Apr 29, 2026
Pull request overview
Fixes ONNX Runtime startup failures on Linux ARM64 environments where /sys/devices/system/cpu/{possible,present} are unavailable by (1) making early cpuinfo-init logging safe before a default logger exists, and (2) patching the bundled pytorch/cpuinfo to fall back to sysconf(_SC_NPROCESSORS_ONLN) for both CPU counts and per-CPU present/possible flags.
Changes:
- Guard `LOGS_DEFAULT(...)` usage in `PosixEnv` so cpuinfo init failures won’t crash when logging hasn’t been initialized yet.
- Patch `pytorch/cpuinfo` Linux processor detection to provide robust sysfs-missing fallbacks (counts + flags).
- Add a standalone simulation script to validate the early-logging and sysfs-missing behaviors (incl. `LD_PRELOAD` sysfs hiding).
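
The logging guard in the first change can be illustrated with a small Python sketch. This is not the C++ code in `env.cc`: the registry, `register_default_logger`, `has_default_logger`, `log_default`, and `report_cpuinfo_init_failure` are hypothetical names that only mimic the pattern of checking for a default logger before using it and falling back to standard error otherwise.

```python
import sys
from typing import Callable, Optional

# Hypothetical stand-in for ORT's default-logger registration state.
_default_logger: Optional[Callable[[str], None]] = None


def register_default_logger(logger: Callable[[str], None]) -> None:
    """Register a callable that receives log messages (illustrative only)."""
    global _default_logger
    _default_logger = logger


def has_default_logger() -> bool:
    """Mimics the idea behind LoggingManager::HasDefaultLogger()."""
    return _default_logger is not None


def log_default(message: str) -> None:
    """Mimics LOGS_DEFAULT(): fails hard when no default logger exists."""
    if _default_logger is None:
        raise RuntimeError("Attempt to use DefaultLogger but none has been registered.")
    _default_logger(message)


def _write_stderr(message: str) -> None:
    print(message, file=sys.stderr)


def report_cpuinfo_init_failure(message: str, fallback=_write_stderr) -> str:
    """Use the default logger only if one is registered, else the fallback sink."""
    if has_default_logger():
        log_default(message)
        return "logger"
    fallback(message)
    return "fallback"
```

Calling `report_cpuinfo_init_failure("cpuinfo_initialize failed")` before any logger is registered writes to standard error instead of raising, which is the behavior the guard is meant to provide.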
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| onnxruntime/core/platform/posix/env.cc | Avoids crashing during early PosixEnv construction by falling back to std::cerr when no default logger exists. |
| cmake/external/onnxruntime_external_deps.cmake | Wires in the new cpuinfo patch during FetchContent dependency setup (Linux + ARM64/ARM64EC patch flow). |
| cmake/patches/cpuinfo/fix_missing_sysfs_fallback.patch | Adds sysconf(_SC_NPROCESSORS_ONLN)-based fallbacks for max CPU count and present/possible flags when sysfs cpulists are missing. |
| onnxruntime/test/common/test_cpuinfo_sysfs_fallback.py | Adds a manual/simulation validation script (compiles small programs + LD_PRELOAD shim). |
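
The sysfs-missing fallback summarized in the table can be sketched in Python as follows. The actual patch is C code in cpuinfo's `src/linux/processors.c`; the helper names here (`parse_cpulist`, `online_cpu_count`, `detect_cpus`) are assumptions made for illustration, and `os.sysconf("SC_NPROCESSORS_ONLN")` is used as the Python counterpart of `sysconf(_SC_NPROCESSORS_ONLN)`.

```python
import os
from pathlib import Path
from typing import Set

SYSFS_CPU_DIR = Path("/sys/devices/system/cpu")


def parse_cpulist(text: str) -> Set[int]:
    """Parse a kernel cpulist such as '0-3,8' into a set of CPU indices."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def online_cpu_count() -> int:
    """Number of online processors reported by sysconf, never below 1."""
    return max(1, os.sysconf("SC_NPROCESSORS_ONLN"))


def detect_cpus(list_name: str, sysfs_dir: Path = SYSFS_CPU_DIR) -> Set[int]:
    """Read the 'possible' or 'present' cpulist, or assume CPUs 0..nproc-1."""
    try:
        return parse_cpulist((sysfs_dir / list_name).read_text())
    except OSError:
        # The sysfs file is missing or unreadable: fall back to the online count.
        return set(range(online_cpu_count()))
```

Pointing `sysfs_dir` at a directory that does not exist exercises the fallback path without needing an `LD_PRELOAD` shim.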
…s, fix docstring
- Convert test_cpuinfo_sysfs_fallback.py from standalone functions to proper unittest.TestCase so pytest/unittest discovery works correctly
- Add platform guards (sys.platform == 'linux') and tool detection (shutil.which) with unittest.SkipTest for non-Linux or missing compilers
- Remove unused get_ort_root() function
- Fix docstring: 'intercepts open/fopen' -> 'intercepts fopen' to match impl
- Fix CodeQL implicit string concatenation warning by extracting the -c script to a named variable
- Remove fix_missing_sysfs_fallback.patch from Windows ARM64/ARM64EC block since it only modifies Linux-specific sources (src/linux/processors.c); keep it in the Linux-only elseif block
hariharans29
approved these changes
Apr 29, 2026
Description
Fixes ONNX Runtime startup on Linux ARM64 environments where `/sys/devices/system/cpu/possible` and `/sys/devices/system/cpu/present` are unavailable, such as AWS Lambda ARM64/Graviton and restricted build sandboxes.

There are two related failure modes:
- `PosixEnv` may be constructed before ORT's default logger is registered. If `cpuinfo_initialize()` fails during that early construction path, the existing `LOGS_DEFAULT(INFO)` call can terminate with `Attempt to use DefaultLogger but none has been registered.`
- `pytorch/cpuinfo` code treats missing Linux CPU `possible`/`present` sysfs cpulists as fatal on ARM Linux. The max-count helpers return `UINT32_MAX`, which wraps to `0` after `1 + UINT32_MAX` in ARM Linux initialization and prevents cpuinfo from reaching the later `/proc/cpuinfo` and `getauxval()` based detection paths.

Root Cause
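
The `1 + UINT32_MAX` wraparound in the second failure mode can be reproduced in isolation. The sketch below emulates 32-bit unsigned arithmetic with a mask; `u32` and `processor_slots` are hypothetical helper names, not cpuinfo functions.

```python
UINT32_MAX = 0xFFFFFFFF


def u32(value: int) -> int:
    """Truncate an integer to 32 bits, as uint32_t arithmetic would."""
    return value & UINT32_MAX


def processor_slots(max_index: int) -> int:
    """Compute 1 + max_index in uint32_t arithmetic."""
    return u32(1 + max_index)
```

With an ordinary maximum index such as `3`, `processor_slots` yields `4`. With the `UINT32_MAX` error sentinel it yields `0`, so the initialization sees no processors at all.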
The immediate import crash is caused by unsafe early logging in `onnxruntime/core/platform/posix/env.cc`. Python bindings can reference `Env::Default()` during module load before logging is initialized, so a cpuinfo initialization failure must not use `LOGS_DEFAULT()` unless a default logger exists.

The cpuinfo initialization failure is more subtle. A count-only fallback is not enough: after cpuinfo computes max possible/present CPU counts, it calls `cpuinfo_linux_detect_possible_processors()` and `cpuinfo_linux_detect_present_processors()` to set `CPUINFO_LINUX_FLAG_POSSIBLE` and `CPUINFO_LINUX_FLAG_PRESENT` on each processor. ARM Linux initialization later marks processors valid only if those flags are set. If only the count fallback is provided, `valid_processors` can remain zero and cpuinfo can proceed into an invalid partial initialization state.

Fix
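
Why the fix has to cover flags as well as counts can be shown with a toy model. The sketch assumes a processor counts as valid only when both flags are set; `LinuxFlag`, `init_processors`, and `count_valid` are hypothetical stand-ins for the cpuinfo structures, not its real API.

```python
from enum import IntFlag
from typing import List


class LinuxFlag(IntFlag):
    """Illustrative stand-ins for CPUINFO_LINUX_FLAG_POSSIBLE / _PRESENT."""
    POSSIBLE = 1
    PRESENT = 2


def init_processors(nproc: int, flag_fallback: bool) -> List[LinuxFlag]:
    """Allocate nproc processor entries, optionally applying the flag fallback."""
    if flag_fallback:
        return [LinuxFlag.POSSIBLE | LinuxFlag.PRESENT for _ in range(nproc)]
    # Count-only fallback: entries exist but no flags were ever set.
    return [LinuxFlag(0) for _ in range(nproc)]


def count_valid(processors: List[LinuxFlag]) -> int:
    """Count processors that carry both the POSSIBLE and PRESENT flags."""
    required = LinuxFlag.POSSIBLE | LinuxFlag.PRESENT
    return sum(1 for flags in processors if (flags & required) == required)
```

In this model `count_valid(init_processors(4, flag_fallback=False))` stays at zero even though four entries were allocated, while enabling the flag fallback makes all four valid.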
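- Make `PosixEnv` logging safe when cpuinfo initialization fails before a default logger exists:
  - Check `logging::LoggingManager::HasDefaultLogger()` before `LOGS_DEFAULT()`
  - Fall back to `std::cerr` when no logger is registered
- Patch `pytorch/cpuinfo` for missing sysfs cpulists:
  - Max possible/present processor fallback: `sysconf(_SC_NPROCESSORS_ONLN) - 1`
  - Possible/present flag fallback: processors `0..nproc-1`
- Add a validation script covering:
  - `sysconf(_SC_NPROCESSORS_ONLN)` count and present/possible flag fallback behavior
  - Hiding `/sys/devices/system/cpu/{possible,present}` via `LD_PRELOAD`

Testing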
Ran from a clean branch/worktree:
Result:
… `onnxruntime.capi` not built/importable in this workspace)
Also validated the cpuinfo patch directly:
And syntax-checked patched `src/linux/processors.c` in a temporary tree with cpuinfo headers.

Related Issue
Fixes #10038.