
display: return realized X root size instead of failing on libxcvt rounding #276

Merged: hiroTamada merged 5 commits into main from hypeship/display-realized-dims on Jun 4, 2026

@hiroTamada
Contributor

@hiroTamada hiroTamada commented Jun 4, 2026

Summary

  • PATCH /display previously enforced strict equality between the request and the X root size after a resize. libxcvt's CVT 8-pixel grid round (e.g. 1365 → 1360) and the hard-coded FWXGA bump for 1360×768 → 1366×768 make that equality impossible to satisfy for some requests, returning 500.
  • Callers that treat /display 500s as vm_unrecoverable were tainting the browser instance on every odd-width /configure call.
  • Replace the strict post-condition with a single xrandr read of the X root after neko's synchronous resize returns. Use the realized dimensions for the response body so callers' coordinate math lines up with what the X server actually rendered. Log the request-vs-realized gap.

Where the rounding lives

libxcvt_gen_mode_info (called by neko via XCreateScreenMode) rounds widths down to the CVT 8-pixel grid in lib/libxcvt.c:102:

hdisplay_rnd = mode_info->hdisplay - (mode_info->hdisplay % CVT_H_GRANULARITY); // 8

Then at the end of the same function there's an FWXGA carve-out (lib/libxcvt.c:294-298) that bumps the canonical 1360×768 CVT output back to 1366×768 to match real laptop EDIDs:

/* FWXGA hack adapted from hw/xfree86/modes/xf86EdidModes.c, because you can't say 1366 */
if (mode_info->hdisplay == 1360 && mode_info->vdisplay == 768) {
     mode_info->hdisplay = 1366;
     ...
}

So 1365×768 → 1360×768 (CVT round) → 1366×768 (FWXGA bump). Verified by linking a small probe against libxcvt 0.1.3 directly:

input       output      rules applied
1363×768    1366×768    CVT→1360, FWXGA→1366
1365×768    1366×768    CVT→1360, FWXGA→1366
1919×1080   1912×1080   CVT only
391×844     384×844     CVT only
1365×769    1360×769    CVT only (FWXGA needs vdisplay==768)

No amount of rounding logic on the kernel-images side can mirror this — the right contract is to read what landed, not predict it.
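
To make the table concrete, here is a minimal Go sketch of only the two rules described above. The helper name predictRealized is hypothetical and exists nowhere in kernel-images or libxcvt; it is illustrative only and, for the reason just given, not something the handler should rely on.

```go
package main

// cvtHGranularity mirrors CVT_H_GRANULARITY (8) from the libxcvt line quoted above.
const cvtHGranularity = 8

// predictRealized is a hypothetical helper that models only the two rules
// described in this PR. Any other libxcvt behaviour is deliberately ignored.
func predictRealized(w, h int) (int, int) {
	// Rule 1: round the width down to the CVT 8-pixel grid.
	w -= w % cvtHGranularity
	// Rule 2: the FWXGA carve-out bumps 1360x768 back to 1366x768.
	if w == 1360 && h == 768 {
		w = 1366
	}
	return w, h
}
```

Feeding it the five inputs from the table reproduces the listed outputs, which is all it is meant to show.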

Why this approach over the alternatives

  • Pre-round in resolveDisplayParams: would need to mirror libxcvt's CVT-8 round and FWXGA carve-out and any future libxcvt rule. Brittle.
  • Read realized dims from neko's HTTP response: neko's screenConfigurationChange handler returns the request body (HttpSuccess(w, data)), not the realized config (size) — a separate upstream bug. Worth fixing in kernel/neko but doesn't block this.
  • Read X root via xrandr after neko returns (this PR): X server is the ground truth. Neko's call is synchronous so a single read at that point captures the realized dimensions.
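
As a rough sketch of the third option, the X root size can be pulled out of the "current W x H" field that xrandr prints on its "Screen N:" header line. The helper name parseXRootSize and its error text are assumptions for illustration; the parser actually used in this PR is not shown here.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// currentRe matches the "current W x H" part of xrandr's "Screen N:" line.
var currentRe = regexp.MustCompile(`current (\d+) x (\d+)`)

// parseXRootSize is a hypothetical helper: it extracts the current root
// size from raw xrandr output and errors if the field is missing.
func parseXRootSize(xrandrOut string) (w, h int, err error) {
	m := currentRe.FindStringSubmatch(xrandrOut)
	if m == nil {
		return 0, 0, fmt.Errorf("no current size in xrandr output")
	}
	if w, err = strconv.Atoi(m[1]); err != nil {
		return 0, 0, err
	}
	if h, err = strconv.Atoi(m[2]); err != nil {
		return 0, 0, err
	}
	return w, h, nil
}
```

The input would typically come from running xrandr (for example with --query) against the right DISPLAY; how the handler invokes it is not specified in this description.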

Verified

  • Built chromium-headful-test:latest locally and ran the e2e reproducer (TestDisplayResizeOddWidthHonoursLibxcvtRounding):
    • Before: 500 with X root verification: x root is 1366x768, want 1365x768.
    • After: 200 with body {"height":768,"refresh_rate":60,"width":1366} and x_root=1366x768.
  • All existing headful resize subtests still pass: headful_start_maximized, headful_kiosk, headful_xorg_no_neko. headless_default skipped because the headless image isn't built locally (unrelated).
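
On the caller side, the point of the new body is that coordinate math should use the realized width rather than the requested one. A minimal decoding sketch follows; the struct and function names are assumptions, and only the three JSON keys come from the response shown above (refresh_rate is assumed to be an integer).

```go
package main

import "encoding/json"

// displayResponse mirrors the three keys visible in the 200 body above.
// The type name is hypothetical; the real API type may differ.
type displayResponse struct {
	Width       int `json:"width"`
	Height      int `json:"height"`
	RefreshRate int `json:"refresh_rate"`
}

// decodeDisplayResponse parses a PATCH /display response body so the
// caller can use the realized dimensions instead of what it requested.
func decodeDisplayResponse(body []byte) (displayResponse, error) {
	var r displayResponse
	err := json.Unmarshal(body, &r)
	return r, err
}
```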

Test plan

  • go build ./...
  • go test ./cmd/api/api/... ./lib/nekoclient/... (unit tests pass)
  • e2e: TestDisplayResizeOddWidthHonoursLibxcvtRounding passes against built image
  • e2e regression: TestDisplayResizeChromiumWindow headful subtests pass
  • Smoke-test in a staging metro before merging

Follow-ups

  • File an upstream fix in kernel/neko for screenConfigurationChange to return size instead of data so the HTTP response carries the realized dims. Independent of this PR.
  • Consider also relaxing the browser pool's taint policy so future display-resize 500s (for any reason) don't kill the VM.

🤖 Generated with Claude Code


Note

Medium Risk
Changes headful display resize success criteria and API-reported dimensions; mis-polling could return wrong sizes, but CDP maximize failures still fail the request.

Overview
PATCH /display on the headful Xorg path no longer fails when the X server lands on a size libxcvt rounded away from the request (e.g. 1365×768 → 1366×768). Strict waitForXRootSize verification is replaced by waitForXRootRealized, which polls xrandr until the root matches the request, or stays stable within 32px of it—so CVT/FWXGA rounding is accepted while transient dummy-DDX sizes (e.g. 3840×2160 during kiosk settle) are not. The handler uses those dimensions in the 200 body (with logging when they differ from the request); CDP maximize re-assert runs after that poll. Neko client docs note the API echoes the request, not realized size.

Adds TestDisplayResizeOddWidthHonoursLibxcvtRounding for the production 1365×768 case.

Reviewed by Cursor Bugbot for commit 2050e6a.

…unding

PATCH /display previously enforced strict equality between the request and
the X root size after a resize. libxcvt's CVT 8-pixel grid round (e.g.
1365 → 1360) and the hard-coded FWXGA bump for 1360×768 → 1366×768 make
that equality impossible to satisfy for some requests, returning 500.
Callers that treat /display 500s as fatal end up tainting the browser
instance on every odd-width resize.

Replace the strict post-condition with a single xrandr read of the X root
after neko's synchronous resize returns. Use the realized dimensions for
the response body so callers' coordinate math lines up with what the X
server actually rendered. Log the request-vs-realized gap for diagnosability.

The e2e reproducer is flipped from asserting the 500 to asserting the 200
+ realized 1366 in the response body.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@hiroTamada hiroTamada marked this pull request as ready for review June 4, 2026 14:55
@firetiger-agent

Created a monitoring plan for this PR.

What this PR does: Fixes browser VMs being permanently tainted and recycled when a display resize request uses a width libxcvt rounds to a different value (e.g., 1365px → 1366px). Previously every such /configure call returned a 500 and marked the VM unrecoverable; after the fix, the resize succeeds and the response reports the actual realized screen dimensions.

Intended effect:

  • "X root verification" configure-claim failures: pre-deploy baseline 988–1,559/hr during active-use windows (observed June 3 21:00–22:00 UTC); confirmed if this drops to 0 post-deploy.
  • failed to configure and claim total rate: normal baseline is 0–28/hr (non-display causes); confirmed if it stays at or below that level without any "X root verification" errors in the detail.

Risks:

  • xrandr subprocess failure on new path: "failed to read X root after resize" appears in configure-claim error detail; alert if any occurrence post-deploy (pre-deploy baseline: 0).
  • CDP maximize regression: "CDP maximize re-assert failed" in configure-claim errors; alert if any occurrence (pre-deploy baseline: 0).
  • Fix reverts to broken state — if the image rollout is partial or a canary exits early, the "X root verification" errors resume; alert if count > 0 for any post-deploy hour.


A single xrandr read immediately after neko's resize could catch a
transient — chromium running in --kiosk briefly pushes the X root to the
dummy DDX's max mode (3840×2160) while mutter settles on the new screen,
producing a misleading "realized" size in the response.

Replace the single read with a short poll loop that returns early when
either (a) xrandr reports the requested size, or (b) consecutive readings
are stable for ~150ms. The deadline (10s) only triggers if the X root
never converges, in which case the last observation is still returned —
the path stays non-fatal.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
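
The poll loop described in the commit message above could look roughly like the following. This is a sketch under stated assumptions, not the code in display.go: pollRealized, xSize, and the injected read function are hypothetical names, and retrying on read errors until the deadline is a guess at the behaviour.

```go
package main

import (
	"errors"
	"time"
)

// xSize is a hypothetical width/height pair for the X root.
type xSize struct{ W, H int }

// pollRealized is a hypothetical stand-in for waitForXRootRealized as
// described at this commit. It returns early when a reading equals the
// request or when stableN consecutive readings agree; at the deadline it
// returns the last observation so the path stays non-fatal.
func pollRealized(read func() (xSize, error), want xSize,
	interval time.Duration, stableN int, deadline time.Duration) (xSize, error) {
	var last xSize
	seen := false
	streak := 0
	end := time.Now().Add(deadline)
	for {
		got, err := read()
		if err == nil { // assumption: failed reads are simply retried
			if got == want {
				return got, nil // fast path: the request landed exactly
			}
			if seen && got == last {
				streak++
			} else {
				streak = 1
			}
			last, seen = got, true
			if streak >= stableN {
				return got, nil // readings have settled on one value
			}
		}
		if time.Now().After(end) {
			if !seen {
				return xSize{}, errors.New("no X root reading before deadline")
			}
			return last, nil // never converged: report the last observation
		}
		time.Sleep(interval)
	}
}
```

With stableN set to 3 and a 50 ms interval the stability window would be in the neighbourhood of the ~150ms mentioned above; the real interval is not stated, so treat those numbers as placeholders.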
Comment thread server/cmd/api/api/display.go
waitForXRootRealized would treat any three consecutive stable readings as
the realized size, which silently accepts the pre-resize baseline if the
X server hasn't committed the new mode yet. The response would echo back
the old dimensions as 'realized' — a success-looking lie.

Capture the pre-resize X root, pass it through, and exclude it from the
stable-match condition. The match-the-request fast path is unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Reviewed by Cursor Bugbot for commit 1e848b7.

Comment thread server/cmd/api/api/display.go
The previous commit added a guard that refused to accept the pre-resize
baseline as a stable-N match, defending against a hypothetical race
where the X server hadn't committed the new mode yet. In practice
XSetScreenConfiguration is a synchronous X protocol round-trip, so by
the time neko's call returns the server has committed — that race
doesn't exist.

The guard had a real cost: when a request rounds back to the current X
root (idempotent re-PATCH of an odd width, or any resize whose libxcvt
realization equals the current screen), the poll loop would never
declare baseline-stable and would wait the full 10s before returning.

Drop the guard. Stable-N now accepts any value, including the baseline.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@IlyaasK IlyaasK force-pushed the hypeship/display-realized-dims branch from 8c80a4a to 07268ef Compare June 4, 2026 16:59
…N path

Why: chromium --start-maximized briefly drives xrandr to report the dummy
DDX's max mode (e.g. 3840x2160) while mode-switch propagates. A naive
stable-N would echo that transient into the response body. Real libxcvt
rounding is <16 px; the dummy max is >1000 px off — acceptableDelta=32
sits between them.
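
The threshold reasoning above can be sketched as a small predicate applied to a stable reading before it is accepted. Only the name acceptableDelta and its value 32 come from the commit message; the function name, the per-axis comparison, and the inclusive bound are assumptions.

```go
package main

// acceptableDelta is the 32 px threshold named in the commit message.
const acceptableDelta = 32

// absInt returns the absolute value of v.
func absInt(v int) int {
	if v < 0 {
		return -v
	}
	return v
}

// plausiblyRealized is a hypothetical predicate: a reading counts as a
// libxcvt-rounded version of the request only if both axes are within
// acceptableDelta pixels of it (inclusive bound is an assumption).
func plausiblyRealized(gotW, gotH, wantW, wantH int) bool {
	return absInt(gotW-wantW) <= acceptableDelta &&
		absInt(gotH-wantH) <= acceptableDelta
}
```

Under these assumptions a 1366×768 reading for a 1365×768 request passes, while the transient 3840×2160 reading is rejected and polling would continue.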
@hiroTamada hiroTamada merged commit 7fbc8cd into main Jun 4, 2026
10 checks passed
@hiroTamada hiroTamada deleted the hypeship/display-realized-dims branch June 4, 2026 17:16
2 participants