Replace scripts/text-rollup.py (which silently bucketed 99.7% as
<unknown> whenever the production build dropped DWARF) with
scripts/subsystem-rollup.py, an LTO-aware reporter that:
- strips GCC clone suffixes (.lto_priv/.constprop/.isra/.part/
.cold/.localalias, including stacked combinations) before
bucket lookup so specialized clones credit their parent
function's source dir;
- deduplicates IPA-ICF aliases by start address (ld.bfd has no
--icf, so all folding is GCC-side and surfaces as multiple
distinct names sharing one address) into a normalized
<icf-merged> bucket;
- uses addr2line -p -f -i with "(inlined by)" continuations to
attribute each address to the OUTERMOST inline frame ("where
the code lives in the image");
- filters by ELF section so .init.text/.exit.text/.head.text do
not inflate the rollup by ~7%;
- routes linker-emitted bytes outside any nm symbol to
<compiler-partition> rather than dropping them;
- lex-normalizes . and .. segments via posixpath.normpath before
bucketing, so cross-tree paths like lib/../scripts/dtc/libfdt/...
attribute to where the source actually lives (scripts) instead
of the meaningless depth-2 key lib/..;
- exits 2 on missing DWARF instead of silently emitting garbage.
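The clone-suffix stripping, address-based ICF dedup, and path normalization described above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/subsystem-rollup.py: the function names, the regex, the (addr, size, name) tuple shape, the bucket depth parameter, and the placeholder.c file name are all assumptions made for the example.

```python
import posixpath
import re

# Suffix families named in the commit message. Each may carry a trailing
# ".<digits>" and they may be stacked (e.g. ".constprop.0.isra.1").
_CLONE_RE = re.compile(
    r"(?:\.(?:lto_priv|constprop|isra|part|cold|localalias)(?:\.\d+)?)+$"
)


def strip_clone_suffix(name):
    """Return the parent function name for a GCC clone symbol."""
    return _CLONE_RE.sub("", name)


def bucket_for_path(path, depth=1):
    """Lexically normalize '.' and '..' segments, then keep `depth` dirs."""
    parts = posixpath.normpath(path).split("/")
    return "/".join(parts[:depth])


def dedup_by_address(symbols):
    """Collapse symbols that share a start address into one entry.

    `symbols` is an iterable of (addr, size, name) tuples (hypothetical
    shape). Addresses with several names are reported once, under the
    normalized <icf-merged> label, so their bytes are not counted twice.
    """
    by_addr = {}
    for addr, size, name in symbols:
        by_addr.setdefault(addr, []).append((size, name))
    merged = []
    for addr in sorted(by_addr):
        entries = by_addr[addr]
        size = max(s for s, _ in entries)
        if len(entries) > 1:
            label = "<icf-merged>"
        else:
            label = strip_clone_suffix(entries[0][1])
        merged.append((addr, size, label))
    return merged
```

For example, strip_clone_suffix("foo.constprop.0.isra.1") yields "foo", and bucket_for_path("lib/../scripts/dtc/libfdt/placeholder.c") yields "scripts" rather than a key built from "lib/..". The normalization is purely lexical, so it does not resolve symlinks.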
Outputs land next to section-sizes.txt: subsystem-rollup.txt
(table consumed by the gate), subsystem-rollup-bars.svg (sorted
horizontal bars), subsystem-rollup-tree.html (D3 treemap with
hover-to-drill).
Add --deep BUCKET flag (repeatable) plus --deep-output PATH to
emit two new sibling artifacts: subsystem-rollup-deep.txt (tab-
delimited per-bucket 2nd-level subdirectory rollup + top-N source
files, machine-parseable) and subsystem-rollup-deep.html (styled
tables with sticky header, in-page nav between buckets,
proportional share bars, hover tooltips). kernel-size-report.sh
invokes it with --deep kernel --deep lib. --deep-output rejects
.html-suffixed paths because the HTML sibling derives via
with_suffix(".html") and would otherwise alias the .txt silently.
The _esc helper escapes both " and ' in addition to <>&, so values
are safe in attribute contexts as well as tag content.
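The aliasing hazard and the escaping rule can be illustrated with a short sketch. The names validate_deep_output and esc are hypothetical stand-ins (the real helper is called _esc, and its body is not shown here); the sketch assumes the HTML sibling is derived with pathlib's with_suffix and that escaping is equivalent to html.escape with quote=True.

```python
import html
from pathlib import Path


def validate_deep_output(path_str):
    """Return (txt_path, html_path), refusing paths that would alias.

    If the caller passed "report.html", with_suffix(".html") would return
    the same path and the HTML writer would overwrite the text artifact.
    """
    txt_path = Path(path_str)
    if txt_path.suffix.lower() == ".html":
        raise ValueError(
            "--deep-output must not end in .html: the HTML sibling is "
            "derived from it and would alias the text output"
        )
    return txt_path, txt_path.with_suffix(".html")


def esc(value):
    """Escape &, <, >, double and single quotes for tag or attribute use."""
    return html.escape(str(value), quote=True)
```

With this sketch, validate_deep_output("subsystem-rollup-deep.txt") returns the .txt path paired with subsystem-rollup-deep.html, while a path ending in .html raises ValueError before anything is written.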
Add per-bucket budget gate: scripts/check-subsystem-budget.py
diffs subsystem-rollup.txt against configs/subsystem-budget.txt
with a default +/- 2% noise band per bucket (LTO re-decides what
to inline between rebuilds, so identical sources still produce
small fluctuations). Wired into kernel-size-report.sh after the
rollup as a warn-only stage: a breach prints to stderr and writes
subsystem-budget.txt with bucket-by-bucket status, but does not
abort the build. Total-bytes threshold remains the coarse gate;
this layer answers WHICH bucket regressed.
configs/subsystem-budget.txt ships with all rules commented out --
the operator pins ceilings 5-10% above observed sizes after one
diagnostic run.
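A warn-only comparison of this kind might look like the sketch below. It is not the logic of scripts/check-subsystem-budget.py: the dict inputs, the function names, and the reading of the noise band as "breach only when the observed size exceeds the ceiling by more than 2%" are assumptions made for illustration.

```python
import sys


def check_budget(observed, ceilings, band_pct=2.0):
    """Compare per-bucket sizes against pinned ceilings.

    `observed` and `ceilings` map bucket name -> bytes (hypothetical
    shapes). A bucket breaches only when it exceeds its ceiling by more
    than `band_pct` percent, so LTO inlining noise does not trip it.
    """
    results = []
    for bucket in sorted(ceilings):
        ceiling = ceilings[bucket]
        size = observed.get(bucket, 0)
        limit = ceiling * (1 + band_pct / 100.0)
        status = "BREACH" if size > limit else "ok"
        results.append((bucket, size, ceiling, status))
    return results


def report(results):
    """Warn-only stage: print breaches to stderr, never fail the build."""
    for bucket, size, ceiling, status in results:
        if status == "BREACH":
            print(f"budget breach: {bucket} {size} > {ceiling}",
                  file=sys.stderr)
    return 0  # exit status stays 0; the total-bytes gate remains coarse
```

Buckets without a pinned ceiling are simply not checked, which matches a budget file that ships with every rule commented out.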
qemu-trace-to-orderfile.py learns LTO-clone normalization for the
bootcost view (matches the rollup's clone-suffix stripping so the
same function's hot-path hits do not split across N specialized
clone names) and emits kernel_bootcost.txt rolled up by
context_switch / scheduler / syscall_entry / exec_path /
fork_clone / softirq_irq buckets. collect-kernel-profile.sh adds
the new artifact to its cleanup and output lists.
Multi-model review (Gemini + Codex) caught the lex-normalize gap,
the _esc quote-escape gap, and the --deep-output aliasing risk
before merge -- all three are addressed above.
The embedded initramfs is gzip-compressed
(CONFIG_INITRAMFS_COMPRESSION_GZIP=y, runtime decompressor
lib/zlib_inflate 4,588 bytes), but olddefconfig was silently
restoring upstream "default y" for every other RD_* selector. The
new sub-bucket rollup surfaced the cost: lib/zstd 36,942 bytes,
lib/lz4 10,972 bytes, lib/xz 6,598 bytes -- ~54KB of decompressor
library code with no consumer on this target. RD_ZSTD also pulls
lib/xxhash.c (~3KB), which cascades out automatically.
Add explicit "# CONFIG_RD_ZSTD is not set" / RD_LZ4 / RD_XZ
disables to the inline kernel .config block, and mirror them into
the existing positive olddefconfig-survivor verification. RD_LZMA /
RD_BZIP2 / RD_LZO are deferred (RD_LZO has 728 bytes of measured
cost; LZMA and BZIP2 have zero in this build and matter only for
hygiene).
Add a negative-guard loop that fails the build if any of
ZSTD_DECOMPRESS, ZSTD_COMMON, LZ4_DECOMPRESS, XZ_DEC, XXHASH,
DECOMPRESS_ZSTD, DECOMPRESS_LZ4, or DECOMPRESS_XZ survive
olddefconfig as =y. The decompressor libraries are hidden bools
selected by the RD_* options, but a future fs/ or net/ enable (e.g.
squashfs+zstd) could re-pull them through a different selector --
the guard catches that drift loudly so the size win does not
silently regress. XXHASH is in the list because the Kconfig comment
claims the cascade covers it; including it tightens the guard to
match the claim, and no other in-tree consumer of it is enabled
here.
Result: linux.axf 1,303,072 -> 1,229,344 bytes (-73,728 / -5.7% in
three steps; cumulative -32.4% vs the pre-pruning baseline). QEMU
boot-test (scripts/validate-qemu.sh) passes.
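The negative guard described above amounts to scanning the post-olddefconfig .config for forbidden symbols set to =y. The build itself runs this as a loop in its own script; the Python sketch below only illustrates the check, and the function name and the way the .config text is passed in are assumptions.

```python
import re

# Symbols from the commit message that must not survive olddefconfig as =y.
FORBIDDEN = (
    "ZSTD_DECOMPRESS", "ZSTD_COMMON", "LZ4_DECOMPRESS", "XZ_DEC",
    "XXHASH", "DECOMPRESS_ZSTD", "DECOMPRESS_LZ4", "DECOMPRESS_XZ",
)


def surviving_symbols(config_text, forbidden=FORBIDDEN):
    """Return the forbidden symbols that appear as CONFIG_<name>=y.

    Matching is on the whole symbol name, so CONFIG_XZ_DEC_X86=y does not
    count as CONFIG_XZ_DEC=y, and "# CONFIG_X is not set" lines are ignored.
    """
    enabled = set(
        re.findall(r"^CONFIG_([A-Za-z0-9_]+)=y$", config_text,
                   flags=re.MULTILINE)
    )
    return [name for name in forbidden if name in enabled]
```

A caller would read the generated .config, call surviving_symbols on its text, and exit non-zero with the offending names if the returned list is non-empty.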