
tools/Config.mk: speed up builds by caching the prefix-detection shells #18868

Merged
xiaoxiang781216 merged 1 commit into apache:master from dfanache:fix/make-shell-speedup
May 12, 2026


@dfanache
Contributor

Summary

Currently, three top-level assignments in tools/Config.mk use the form export VAR ?= $(... ${shell ...} ...), which produces a recursively expanded (lazy) variable and slows down the build.

The right-hand side is re-expanded, and the embedded ${shell ...} rerun, every time the variable is used. Because these variables are also exported, make expands them each time it sets up a recipe's environment, spawning the tools/incdir / tools/define host helpers hundreds of times over a full build.
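To make the re-expansion concrete, here is a minimal sketch (not part of the patch) that counts how often an embedded $(shell ...) runs in a throwaway makefile. The COUNTED variable, the a/b/c targets and runs.log are hypothetical names, and the script assumes GNU make and a POSIX shell are on the PATH.

```shell
#!/bin/sh
# Sketch: count executions of an embedded $(shell ...) under "?=" versus ":=".
# COUNTED, runs.log and the a/b/c targets are hypothetical illustration names.
work=$(mktemp -d)
cd "$work"

# write_makefile OP: emit a makefile whose exported variable is assigned with OP.
write_makefile() {
    {
        printf 'export COUNTED %s $(shell echo run >> runs.log; echo value)\n' "$1"
        printf 'all: a b c\n'
        printf 'a b c:\n'
        printf '\t@echo "$@ sees $$COUNTED"\n'
    } > Makefile
}

# count_runs OP: build all targets and report how many times the shell ran.
count_runs() {
    rm -f runs.log
    write_makefile "$1"
    make -s all > /dev/null
    # Each line in runs.log records one execution of the embedded command.
    wc -l < runs.log | tr -d ' '
}

lazy_runs=$(count_runs '?=')
eager_runs=$(count_runs ':=')
echo "?= ran the shell $lazy_runs time(s); := ran it $eager_runs time(s)"
```

The ?= count should grow with the number of recipes, while the := count should stay at one regardless of how many targets are built.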

This PR replaces the three lines with:

ifeq ($(origin VAR),undefined)
  VAR := $(... ${shell ...} ...)
endif
export VAR

Here, := creates a simply expanded variable, so the shell runs just once at parse time. The ifeq ... wrapper preserves the override semantics of ?=: passing DEFINE_PREFIX=... on the make command line or as an environment variable skips the assignment.
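The override behaviour can be checked in isolation. The following sketch, again assuming GNU make on the PATH, uses a hypothetical GREETING variable and show target rather than the real Config.mk variables, and prints the value a recipe sees in three situations.

```shell
#!/bin/sh
# Sketch: the origin guard only assigns when the variable is undefined.
# GREETING and the "show" target are hypothetical illustration names.
work=$(mktemp -d)
cd "$work"

{
    printf 'ifeq ($(origin GREETING),undefined)\n'
    printf '  GREETING := $(shell echo computed-default)\n'
    printf 'endif\n'
    printf 'export GREETING\n'
    printf 'show:\n'
    printf '\t@echo "$$GREETING"\n'
} > Makefile

# No override: the guarded := assignment runs at parse time.
default_value=$(make -s show)
# Command-line override: origin is "command line", so the assignment is skipped.
cli_value=$(make -s show GREETING=from-cli)
# Environment override: origin is "environment", so the assignment is skipped.
env_value=$(GREETING=from-env make -s show)

echo "default=$default_value cli=$cli_value env=$env_value"
```

In all three cases the variable is still exported, so recipes see whichever value won.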

Impact

Developers and build systems: noticeably faster full builds on multi-core hosts, with no behaviour change, since the override semantics of ?= are preserved.

Measured impact on a 20-core build host is a ~26% reduction in wall time across the standard board configurations tested.

Testing

Host: Linux, 20-thread x86_64, arm-none-eabi-gcc 15.2.1, GNU make 4.4.1.

Methodology: three full clean builds per side (patched/unpatched), each preceded by make distclean and a fresh configure.sh + olddefconfig. The patched and unpatched runs were taken back to back on the same host, with the same -j20 invocation. The times reported in seconds are medians over the three runs; variance was always under 0.2s.

Config 1: raspberrypi-pico-2:nsh (rp23xx, Cortex-M33)

Extra apps enabled to increase the number of recipes: TESTING_OSTEST, SYSTEM_SYSTEM_TOP, TESTING_GETPRIME, BENCHMARKS_COREMARK, BENCHMARKS_DHRYSTONE, LIBC_FLOATINGPOINT.

| Metric | Baseline | Patched | Change |
|--------|----------|---------|--------|
| Wall | 9.67s | 7.15s | -2.52s, -26% |
| User | 75.48s | 73.51s | -1.97s |
| Sys | 25.09s | 23.36s | -1.73s |

The elf was flashed to a Raspberry Pi Pico 2 W; ostest ran to completion with exit status 0.

Config 2: imxrt1064-evk:netnsh (iMX RT 1064, Cortex-M7)

Default config, no app overrides. No hardware test; build success only.

| Metric | Baseline | Patched | Change |
|--------|----------|---------|--------|
| Wall | 12.37s | 9.01s | -3.36s, -27% |
| User | 106.57s | 95.38s | -11.19s |
| Sys | 35.14s | 30.16s | -4.98s |

Larger scale impact and fix origins

This optimization came about while trying to speed up the build of an RP2350-based custom PX4-Autopilot board target, where the build time drops from 126.01s to 14.69s wall time (an 88% reduction, 8.6x faster).
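For reference, the percentages and ratios quoted in this description follow directly from the wall times. A small sketch of the arithmetic, assuming awk is available; the pct_saved and speedup helper names are invented for illustration.

```shell
#!/bin/sh
# Sketch: derive the quoted percentages from the reported wall times.
# pct_saved and speedup are hypothetical helper names.

# pct_saved BASELINE PATCHED: share of wall time removed, as a rounded percent.
pct_saved() {
    awk -v b="$1" -v p="$2" 'BEGIN { printf "%.0f", (b - p) / b * 100 }'
}

# speedup BASELINE PATCHED: how many times faster the patched build is.
speedup() {
    awk -v b="$1" -v p="$2" 'BEGIN { printf "%.1f", b / p }'
}

pico_pct=$(pct_saved 9.67 7.15)      # raspberrypi-pico-2:nsh
imxrt_pct=$(pct_saved 12.37 9.01)    # imxrt1064-evk:netnsh
px4_pct=$(pct_saved 126.01 14.69)    # custom PX4 board target
px4_ratio=$(speedup 126.01 14.69)

echo "pico: -${pico_pct}%  imxrt: -${imxrt_pct}%  px4: -${px4_pct}% (${px4_ratio}x)"
```

Note that the percentage describes wall time removed, while the ratio describes how much faster the build runs; the two are different views of the same pair of measurements.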

In PX4, NuttX is built through CMake, and multiple isolated sub-makes are spawned (each NuttX library is a separate add_custom_command invoking make -C <libdir>, and the wrappers reset MAKELEVEL=0, probably to avoid jobserver collisions). The speed increase is proportionally larger here because the bug fires per sub-make rather than once.

I am not including timing logs of the PX4 builds as evidence for now, just as context on how the cost of the existing inefficiency grows with build complexity in other projects that include the OS.

@dfanache dfanache requested a review from xiaoxiang781216 as a code owner May 11, 2026 21:02
@github-actions github-actions bot added the Area: Build system and Size: S labels May 11, 2026
Three `export VAR ?= $(shell ...)` assignments cause GNU make to
re-run the embedded ${shell ...} every time the variable is exported
to a recipe's environment.  That spawns `tools/incdir` and
`tools/define` once per recipe, serialised through the master make
thread, which adds per-recipe overhead to multi-job builds.

Wrap each with `ifeq ($(origin VAR),undefined)` + `:=` so the shell
call runs once at parse time while preserving the override semantics
of `?=`.

Measured impact on a 20-core build host is a ~26% speedup of wall
time.

Signed-off-by: Daniel Fanache <dan@rts.ro>
@dfanache dfanache force-pushed the fix/make-shell-speedup branch from d444326 to db91870 Compare May 11, 2026 23:34
@dfanache
Contributor Author

Yikes - the CI builds were not happy. Because := expands its right-hand side as soon as Config.mk is parsed, $(INCDIR) resolves to tools/incdir, which might not be built yet.

I've added a shell || fallback to tools/incdir.sh (always present in the tree) so the parse-time evaluation always succeeds. Pretty fiddly to test locally so if another CI run is approved, it might just go green.
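The || fallback pattern can be sketched on its own as follows. This is not the exact line from the patch: helper and helper.sh are hypothetical stand-ins for the compiled tools/incdir binary and the tools/incdir.sh script, and the output format is invented for illustration.

```shell
#!/bin/sh
# Sketch: try the compiled helper first, fall back to the script if it fails.
# "helper" (absent here) and "helper.sh" are hypothetical stand-ins.
work=$(mktemp -d)

# The script variant is always present, like a file checked into the tree.
cat > "$work/helper.sh" <<'EOF'
#!/bin/sh
echo "-I$1"
EOF

# The compiled helper does not exist yet, so the left side fails and the
# right side of || supplies the value instead.
prefix=$("$work/helper" include 2>/dev/null || sh "$work/helper.sh" include)
printf '%s\n' "$prefix"
```

Because the right-hand side only runs when the left-hand side fails, the compiled helper would still be preferred once it exists.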

@lupyuen lupyuen requested a review from simbit18 May 12, 2026 06:08
Contributor

@linguini1 linguini1 left a comment


Awesome!

@xiaoxiang781216 xiaoxiang781216 merged commit 8b988e8 into apache:master May 12, 2026
41 checks passed