
build: calibrate PGO collection workloads and fix renderer profile loss #51855

Merged: MarshallOfSound merged 1 commit into 43-x-y from trop/43-x-y-bp-build-calibrate-pgo-collection-workloads-and-fix-renderer-profile-loss-1780451800505 on Jun 3, 2026


@trop trop Bot commented Jun 3, 2026

Backport of #51852

See that PR for details.

Notes: none

The macOS collection ran the instrumented app sandboxed, and sandboxed
child processes fail their exit-time profraw write (LLVM Profile Error:
Operation not permitted). Which processes lose their profile is a
per-run race; when the web-benchmark renderers lose, the merged profile
ships with Blink essentially absent from the hot set (4 DOM functions
in the top-50k vs ~400 expected). The 42-x-y and 43-x-y collections
both hit this.

Use %c continuous mode on Darwin (counters live in an mmap established
before sandbox lockdown, so no exit-time write is needed) and pass
--no-sandbox on the macOS collect step, as the Linux step already
does.
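
A rough sketch of the launch configuration described above, not the actual CI script. The helper name `collection_launch_config` and the `collect-` file stem are hypothetical; `LLVM_PROFILE_FILE`, the `%m` and `%c` pattern specifiers, and `--no-sandbox` are the knobs the description refers to.

```python
def collection_launch_config(platform, profraw_dir):
    """Return (env, extra_args) for launching an instrumented build.

    Hypothetical helper: the real collection script is not shown here.
    """
    # %m lets processes share a pool of merged raw profile files.
    pattern = f"{profraw_dir}/collect-%m"
    if platform == "darwin":
        # %c enables continuous mode: counters live in an mmap'd file,
        # so nothing has to be written at exit after sandbox lockdown.
        pattern += "%c"
    pattern += ".profraw"

    env = {"LLVM_PROFILE_FILE": pattern}
    # Both the macOS and Linux collect steps run unsandboxed.
    # Other platforms are out of scope for this sketch.
    extra_args = ["--no-sandbox"] if platform in ("darwin", "linux") else []
    return env, extra_args
```

For example, `collection_launch_config("darwin", "/tmp/profraw")` yields a pattern ending in `collect-%m%c.profraw` plus the `--no-sandbox` argument.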

The synthetic workloads were also capped by wall time, not work, so
their tight loops dominated the count budget: the top 2 blocks held
~10% of all counts, which inflated the global hot threshold and crowded
Blink out of the hot set. PGO hotness is a threshold, not a share - a
path is hot once it clears ~10^5 counts, and extra volume only distorts
the distribution. Cap the loops by iterations (still 2-3 orders of
magnitude above the threshold), soft-cap MotionMark at 120s, and spend
the reclaimed time on two more Speedometer runs.
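
To illustrate the difference between capping by work and capping by wall time, here is a minimal sketch. `run_capped` and `HOT_THRESHOLD` are hypothetical names, and the real workloads are not Python; only the ~10^5 threshold and the idea of an iteration cap with an optional soft deadline come from the description above.

```python
import time

HOT_THRESHOLD = 10**5  # approximate hot-count threshold quoted above


def run_capped(step, max_iterations, soft_deadline_s=None):
    """Run step() at most max_iterations times.

    If soft_deadline_s is given, also stop once that much wall time has
    elapsed. Returns the number of iterations actually executed.
    """
    start = time.monotonic()
    done = 0
    while done < max_iterations:
        if soft_deadline_s is not None and time.monotonic() - start >= soft_deadline_s:
            break  # soft wall-time cap reached before the work cap
        step()
        done += 1
    return done
```

A tight loop capped at, say, `100 * HOT_THRESHOLD` iterations still clears the threshold comfortably, but its count no longer grows with however fast the CI machine happens to be.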

Add an async-churn workload (250k tiny ops: promise fs, immediates,
local socket echoes) so the per-async-operation machinery (AsyncWrap,
BaseObject lifecycle, stream plumbing) stays trained now that the
heavyweight workloads are capped. Counts for that machinery scale with
async op volume, not loop iterations.

Also clean the profraw directory at collection start: stale %m pool
files from a previous run silently merge their counters into the new
collection.
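
The cleanup step can be sketched as follows, assuming raw profiles are written as `*.profraw` files directly inside one directory. The function name and that layout are assumptions for illustration, not the PR's script.

```python
from pathlib import Path


def clean_profraw_dir(profraw_dir):
    """Delete leftover *.profraw files so stale counters cannot merge in.

    Creates the directory if it does not exist yet. Returns the number
    of files removed.
    """
    directory = Path(profraw_dir)
    directory.mkdir(parents=True, exist_ok=True)
    removed = 0
    for stale in directory.glob("*.profraw"):
        stale.unlink()
        removed += 1
    return removed
```

Running this before the first instrumented launch means every counter in the merged profile comes from the current collection.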

Validated against an instrumented 42-x-y build with the same scripts
CI runs:

- profraw count: 9 (renderers lost) -> 23+, with zero write errors
- blocks holding 90% of counts: 1,710 -> 14,000+
- Blink DOM functions in the top-50k: 4 -> ~404 (Chrome-profile parity)
- top-2-block share: ~10% -> 2.6%
- node serialization paths hot at sane magnitudes
  (StringBytes::Write 410M, contextBridge marshaling 15M)
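
Two of the distribution metrics quoted above (top-k share and the number of blocks holding 90% of counts) can be computed from a flat list of per-block counts. This is a sketch under that assumption: how the counts are extracted from the merged profile is not shown, and both function names are hypothetical.

```python
def top_k_share(counts, k):
    """Fraction of all counts held by the k largest blocks."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(sorted(counts, reverse=True)[:k]) / total


def blocks_for_share(counts, share=0.9):
    """Smallest number of blocks (largest first) covering `share` of counts."""
    total = sum(counts)
    running = 0
    for n, count in enumerate(sorted(counts, reverse=True), start=1):
        running += count
        if running >= share * total:
            return n
    return 0  # empty input
```

A lower top-2 share together with a higher block count for 90% coverage indicates that counts are spread across more of the program rather than concentrated in a few tight loops.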

Co-authored-by: Samuel Attard <sattard@anthropic.com>
@trop trop Bot added 43-x-y backport This is a backport PR semver/none labels Jun 3, 2026
@MarshallOfSound MarshallOfSound merged commit 880b4cf into 43-x-y Jun 3, 2026
32 checks passed
release-clerk Bot commented Jun 3, 2026

No Release Notes

@MarshallOfSound MarshallOfSound deleted the trop/43-x-y-bp-build-calibrate-pgo-collection-workloads-and-fix-renderer-profile-loss-1780451800505 branch June 3, 2026 01:57