[REPRO - DO NOT MERGE] reproduce early-resource-timings flake on CI #4630
Closed
thomas-lebeau wants to merge 4 commits into
Temporary repro branch. DO NOT MERGE.

- Add raw-timing + clock-resolution debug logging to the `retrieve early requests timings` test, to confirm WebKit's 1ms clamping is responsible for the duration=0 / download.start=0 failures.
- Narrow GitLab CI e2e job to only `webkit-pinned`, only the relevant `-g "retrieve early requests timings"` filter, `--repeat-each=50`, and `--retries=0` so each repeat is an independent attempt.

The hypothesis is that WebKit clamps `PerformanceResourceTiming` timestamps and `performance.now()` to 1ms. When the entire `/empty.css` request fits in a single tick (more likely on CI's faster Linux loopback than on local macOS), `startTime === responseEnd` and the SDK falls through `(duration === 0 && startTime < responseEnd)` to return 0.
All green! No new flaky tests detected. Commit SHA: 65c7e29
- Run the early-timings test with `--workers=1` so stdout isn't interleaved.
- Reduce `--repeat-each` to 30 (3 sub-tests × 30 = 90 attempts) to keep CI wall-clock reasonable now that the run is single-threaded.
- Tag each debug line with the title + repeat index so raw=/sdk= pairs are always matchable in the log.
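As an illustration of the tagging idea, here is a minimal sketch of a line formatter. The function name `formatDebugLine` and the exact line layout are hypothetical; only the `[EARLY-TIMINGS-DEBUG <browser>]` prefix and the `raw=` / `sdk=` markers come from this PR. In a Playwright test, the title and repeat index could plausibly be taken from `testInfo.title` and `testInfo.repeatEachIndex`.

```typescript
type DebugKind = 'raw' | 'sdk'

// Hypothetical formatter: every line carries the test title and repeat index,
// so a raw= line can be matched with its sdk= line even in a long log.
function formatDebugLine(
  browser: string,
  title: string,
  repeatIndex: number,
  kind: DebugKind,
  payload: unknown
): string {
  return `[EARLY-TIMINGS-DEBUG ${browser}] ${title} #${repeatIndex} ${kind}=${JSON.stringify(payload)}`
}

const rawLine = formatDebugLine('webkit', 'async', 3, 'raw', { startTime: 120, responseEnd: 120 })
const sdkLine = formatDebugLine('webkit', 'async', 3, 'sdk', { duration: 0 })
```

Both lines share the `async #3` tag, which is what makes the pair matchable when grepping the CI output.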
Add chromium to the matrix to compare timestamp clamping with webkit-pinned. Expectation: chromium will not flake (no 1ms-tick collapse) because it exposes higher-precision timestamps for same-origin resources.
Repro confirmed; see analysis in PR description. Closing without merging; fix follows in a separate PR.
WebKit clamps `PerformanceResourceTiming` timestamps and `performance.now()` to 1ms (privacy/Spectre mitigation). For the zero-byte `/empty.css` served over fast (Linux CI) loopback, the entire request occasionally completes in a single 1ms tick, collapsing every timestamp to the same value. The SDK then truthfully emits `duration: 0` and `download.start: 0`, and the strict `> 0` assertion in `expectToHaveValidTimings` fails. This is the top flaky E2E test this week (28/24/21 occurrences for the async/npm/bundle variants).

Adding a 50ms server-side delay guarantees `startTime < responseEnd` even under 1ms clamping. This keeps `expectToHaveValidTimings` strict, so a real regression that zeroed duration would still be caught on every browser.
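A minimal sketch of what such a server-side delay could look like, assuming a response object compatible with the `setHeader` / `end` methods of Node's `http.ServerResponse`. `MinimalResponse`, `serveEmptyCssWithDelay`, and the injectable `schedule` parameter are hypothetical names for illustration, not the repository's actual test server code.

```typescript
// Hypothetical subset of a Node/Express-style response object.
interface MinimalResponse {
  setHeader(name: string, value: string): void
  end(body?: string): void
}

type Schedule = (callback: () => void, delayMs: number) => void

// Default scheduler: a plain timer.
const timerSchedule: Schedule = (callback, delayMs) => {
  setTimeout(callback, delayMs)
}

// Send a zero-byte CSS response only after `delayMs` has elapsed, so the
// request cannot start and finish within the same 1ms clock tick.
function serveEmptyCssWithDelay(
  res: MinimalResponse,
  delayMs: number,
  schedule: Schedule = timerSchedule
): void {
  schedule(() => {
    res.setHeader('Content-Type', 'text/css')
    res.end()
  }, delayMs)
}
```

The scheduler is injectable only so the behaviour can be checked without waiting on a real timer; a real handler would likely call `setTimeout` directly.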
Re-running the same repro setup (webkit-pinned + chromium, repeat-each=30, retries=0) with the 50ms `/empty.css` delay applied, to verify that the fix yields 0 failures.
Motivation
The E2E test `rum resources › retrieve early requests timings` (async / npm / bundle variants at `test/e2e/scenario/rum/resources.scenario.ts:61`) is one of the top flaky tests this week on `webkit-pinned`: 28 / 24 / 21 occurrences over the last 7 days across many branches.

Working theory: WebKit clamps `PerformanceResourceTiming` timestamps and `performance.now()` to 1ms resolution. When the entire `/empty.css` request (a zero-byte response) fits inside one 1ms tick, every relevant timestamp clamps to the same value. The SDK's existing Safari fallback in `computeResourceEntryDuration` (`packages/rum-core/src/domain/resource/resourceUtils.ts:76`) is guarded by `startTime < responseEnd`, which fails when they're equal, so the SDK truthfully emits `duration: 0`, and the test asserts strictly `> 0`.

Locally on macOS, 50/50 repeats passed but every captured entry confirmed 1ms clamping. The failure presumably needs CI's faster Linux loopback to make collapse probable. This branch verifies that.
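The working theory can be modelled in a few lines. This is a sketch under stated assumptions, not the SDK source: `computeDurationSketch` only mirrors the guard quoted above, `clampToMillisecond` models clamping as rounding down (the real WebKit rounding mode is not confirmed here), and values stay in milliseconds.

```typescript
// Simplified stand-in for the PerformanceResourceTiming fields used here.
interface TimingEntry {
  startTime: number
  duration: number
  responseEnd: number
}

// Assumed model of 1ms clamping: round every timestamp down to a whole millisecond.
function clampToMillisecond(t: number): number {
  return Math.floor(t)
}

// Mirrors the guard described above: fall back to responseEnd - startTime
// only when duration is 0 and startTime is strictly before responseEnd.
function computeDurationSketch(entry: TimingEntry): number {
  if (entry.duration === 0 && entry.startTime < entry.responseEnd) {
    return entry.responseEnd - entry.startTime
  }
  return entry.duration
}

// Build the entry a clamped clock would report for a request.
function clampedEntry(rawStart: number, rawEnd: number): TimingEntry {
  const startTime = clampToMillisecond(rawStart)
  const responseEnd = clampToMillisecond(rawEnd)
  return { startTime, responseEnd, duration: responseEnd - startTime }
}

// A 0.7ms request inside one tick: both timestamps clamp to 120.
const collapsed = clampedEntry(120.2, 120.9)
// The same request with roughly 50ms added: the timestamps stay apart.
const spread = clampedEntry(120.2, 171.1)
```

With `collapsed`, `startTime` equals `responseEnd`, the guard does not apply, and the computed duration stays 0, which is the value the strict `> 0` assertion rejects.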
Changes
- Add debug logging to the test (`test/e2e/scenario/rum/resources.scenario.ts`) to log:
  - `PerformanceResourceTiming` for `empty.css` (all 12 timestamps + size fields).
  - `performance.now()` samples to expose clock granularity.
  - `resource.duration`, `download`, `first_byte`, `size`, `transfer_size`.
- Narrow the `.gitlab-ci.yml` e2e job to:
  - `webkit-pinned` only (skip chromium/firefox/etc.)
  - the `retrieve early requests timings` test only, via `-g`
  - `--repeat-each=50` × 3 sub-tests = 150 attempts
  - `--retries=0` so each repeat is an independent observation

Test instructions
- Wait for the `e2e: [webkit-pinned]` job to finish.
- Find the `[EARLY-TIMINGS-DEBUG webkit]` lines in the job log.
- `nowSamples` shows 4 identical integer-ms values (clock clamped to 1ms).
- Failing attempts: `startTime === responseStart === responseEnd` in `raw=` and the corresponding `sdk=` shows `duration: 0`.
- Passing attempts: `duration: ≥1ms`.
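The `nowSamples` check can be sketched as follows. `sampleClock` and `looksClampedToOneMs` are hypothetical helper names, and the heuristic (all samples identical and whole milliseconds) is an assumption based on the instructions above rather than the test's actual code. In a browser the clock would be `() => performance.now()`.

```typescript
// A clock is any function returning a timestamp in milliseconds.
type Clock = () => number

// Take `count` back-to-back readings of the clock.
function sampleClock(now: Clock, count: number): number[] {
  const samples: number[] = []
  for (let i = 0; i < count; i++) {
    samples.push(now())
  }
  return samples
}

// Heuristic: back-to-back samples that are all the same whole-millisecond
// value suggest the clock is clamped to 1ms resolution.
function looksClampedToOneMs(samples: number[]): boolean {
  return samples.length > 0 && samples.every((s) => Number.isInteger(s) && s === samples[0])
}

// Example with a fake clamped clock that always reports 120ms.
const nowSamples = sampleClock(() => 120, 4)
const clamped = looksClampedToOneMs(nowSamples)
```

A higher-resolution clock would typically return distinct fractional values across four consecutive reads, so the heuristic would report `false` for it.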