Skip to content

fix(browser): fall back to Hyprland grim capture for headed viewport screenshots#67149

Open
Angfr95 wants to merge 4 commits intoopenclaw:mainfrom
Angfr95:fix/hyprland-headed-viewport-screenshot-timeout
Open

fix(browser): fall back to Hyprland grim capture for headed viewport screenshots#67149
Angfr95 wants to merge 4 commits intoopenclaw:mainfrom
Angfr95:fix/hyprland-headed-viewport-screenshot-timeout

Conversation

@Angfr95
Copy link
Copy Markdown

@Angfr95 Angfr95 commented Apr 15, 2026

Problem

On Omarchy/Hyprland/Wayland, headed Chromium viewport screenshots time out (~13s) with Page.captureScreenshot timed out after 10000ms on certain pages (GitHub Trending, Yahoo Finance), even though the page loaded successfully. Headless mode and the control UI work fine.

Root Cause

CDP Page.captureScreenshot relies on Chromium's internal raster path, which can stall on headed Wayland sessions for compositor-dependent reasons. The page is loaded and interactive, but the screenshot command hangs.

Fix

Add a Hyprland-native capture fallback in the /screenshot route:

  1. New file hyprland-capture.ts: detects active Hyprland session, creates a dedicated headless output via hyprctl, moves the browser window onto it, and captures with grim (~20-170ms vs 10s+ timeout).
  2. Modified agent.snapshot.ts: for normal viewport screenshots on headed non-headless Linux, tries grim capture first. Falls back to CDP transparently on failure or when not on Hyprland.

Scope: only normal viewport screenshots. Full-page and element captures still use CDP/Playwright.

Tests

  • agent.snapshot.test.ts: 8/8 pass
  • screenshot.test.ts: 2/2 pass
  • cdp.test.ts: 31/31 pass

Fixes #64317

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0c13d0ddb6

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +375 to +377
const hyprlandPng =
browserPid && browserPid > 0
? await tryHyprlandViewportCapture({ browserPid, timeoutMs: 3000 })
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep Hyprland fallback bound to the requested target tab

In the headed Hyprland path, /screenshot now calls tryHyprlandViewportCapture with only browserPid, which captures compositor output for the browser window rather than the selected CDP target. Because ensureTabAvailable(targetId) does not activate/focus that tab, requests for a background targetId can return an image of a different tab/window, so automation can read or act on the wrong page state.

Useful? React with 👍 / 👎.

Comment on lines +379 to +380
if (hyprlandPng) {
buffer = hyprlandPng;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve requested image format when Hyprland capture succeeds

When type: "jpeg" is requested, the Hyprland branch assigns the raw grim output directly to buffer, but grim emits PNG bytes. saveNormalizedScreenshotResponse only converts on resize/compression pressure; otherwise it keeps the original buffer and labels it from the requested type, which can produce .jpeg media IDs backed by PNG content and break consumers that rely on consistent output format.

Useful? React with 👍 / 👎.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 15, 2026

Greptile Summary

This PR adds a Hyprland-native screenshot fallback (grim via hyprctl headless output) to fix CDP Page.captureScreenshot timeouts on headed Wayland/Omarchy sessions. The detection and binary resolution logic is well-guarded and the fallback to CDP on non-Hyprland environments is transparent.

Two P1 defects need addressing before merge:

  • PNG/JPEG mismatch: grim always outputs PNG, but when type: "jpeg" is requested and the buffer fits within normalizeBrowserScreenshot's 5 MB / 2000 px limits (the typical case for a 1920×1080 viewport), the saved file gets contentType: "image/jpeg" despite containing PNG bytes, breaking downstream image consumers.
  • Concurrent setup race: The module-level singleton is not guarded against concurrent async callers — two simultaneous requests both enter if (!cachedCapture) and independently call setupHeadedBrowserViewportCapture, dispatching duplicate movetoworkspacesilent commands and orphaning the first capture reference.

Confidence Score: 4/5

Not safe to merge as-is; two P1 defects (format mismatch and concurrent setup race) need to be fixed first.

The Hyprland detection and grim capture mechanics are sound, and the non-Hyprland fallback is transparent. However, the PNG-labeled-as-JPEG bug will silently corrupt screenshot responses whenever JPEG is requested on Hyprland, and the concurrent-setup race can cause double workspace dispatches under load. Both are straightforward fixes but represent present defects in the changed code.

Both changed files: hyprland-capture.ts for the singleton race and missing exit cleanup; agent.snapshot.ts for the PNG/JPEG format override in the Hyprland branch.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/browser/src/browser/hyprland-capture.ts
Line: 421-449

Comment:
**Race condition in singleton setup**

Because `detectHyprlandSession()` and `setupHeadedBrowserViewportCapture()` both contain `await` points, two concurrent screenshot requests that both enter the `if (!cachedCapture)` branch will each proceed through setup independently. Both dispatch `movetoworkspacesilent` for the same PID; the second write to `cachedCapture` silently overwrites the first, leaving the first capture object abandoned without teardown.

Fix: deduplicate concurrent callers with a shared in-flight promise:

```ts
let setupInFlight: Promise<HeadedBrowserViewportCapture | null> | null = null;

if (!cachedCapture) {
  if (!setupInFlight) {
    setupInFlight = (async () => {
      const session = await detectHyprlandSession();
      if (!session) return null;
      return await setupHeadedBrowserViewportCapture({ browserPid: params.browserPid, session });
    })().finally(() => { setupInFlight = null; });
  }
  const capture = await setupInFlight;
  if (!capture) return null;
  cachedCapture = capture;
  cachedPid = params.browserPid;
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/browser/src/browser/routes/agent.snapshot.ts
Line: 378-389

Comment:
**PNG/JPEG format mismatch when `type === "jpeg"`**

`grim` always outputs PNG. When `type === "jpeg"` and the screenshot fits within `normalizeBrowserScreenshot`'s limits (< 5 MB, dimensions ≤ 2000 px — the typical case for a 1920×1080 viewport), `normalizeBrowserScreenshot` returns `{ buffer }` with no `contentType`. The fallback `image/${params.type}` evaluates to `"image/jpeg"`, so the saved file is labeled as JPEG despite containing raw PNG bytes.

Fix: override `type` to `"png"` for the Hyprland branch:

```ts
if (hyprlandPng) {
  await saveNormalizedScreenshotResponse({
    res,
    buffer: hyprlandPng,
    type: "png", // grim always produces PNG
    targetId: tab.targetId,
    url: tab.url,
  });
  return;
}
buffer = await captureScreenshot({
  wsUrl: tab.wsUrl ?? "",
  fullPage,
  format: type,
  quality: type === "jpeg" ? 85 : undefined,
});
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/browser/src/browser/hyprland-capture.ts
Line: 417-419

Comment:
**No process-exit cleanup for headless compositor output**

The `browser-capture` headless output is never removed on process exit (`SIGTERM`, crash, normal exit). On next run `ensureCaptureOutput` will find and reuse it, so it is not permanently leaked, but the dangling monitor is visible to users in their Hyprland session.

Consider registering exit handlers that call `teardownHeadedBrowserViewportCapture(cachedCapture)` once, lazily, when the first capture is set up.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(browser): fall back to Hyprland grim..." | Re-trigger Greptile

Comment on lines +421 to +449
export async function tryHyprlandViewportCapture(params: {
browserPid: number;
timeoutMs?: number;
}): Promise<Buffer | null> {
if (process.platform !== "linux") return null;
try {
if (cachedCapture && cachedPid !== params.browserPid) {
await teardownHeadedBrowserViewportCapture(cachedCapture).catch(() => {});
cachedCapture = null;
cachedPid = null;
}
if (!cachedCapture) {
const session = await detectHyprlandSession();
if (!session) return null;
cachedCapture = await setupHeadedBrowserViewportCapture({
browserPid: params.browserPid,
session,
});
cachedPid = params.browserPid;
}
return await captureViewportPng({
capture: cachedCapture,
timeoutMs: params.timeoutMs ?? 2000,
});
} catch {
cachedCapture = null;
cachedPid = null;
return null;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Race condition in singleton setup

Because detectHyprlandSession() and setupHeadedBrowserViewportCapture() both contain await points, two concurrent screenshot requests that both enter the if (!cachedCapture) branch will each proceed through setup independently. Both dispatch movetoworkspacesilent for the same PID; the second write to cachedCapture silently overwrites the first, leaving the first capture object abandoned without teardown.

Fix: deduplicate concurrent callers with a shared in-flight promise:

let setupInFlight: Promise<HeadedBrowserViewportCapture | null> | null = null;

if (!cachedCapture) {
  if (!setupInFlight) {
    setupInFlight = (async () => {
      const session = await detectHyprlandSession();
      if (!session) return null;
      return await setupHeadedBrowserViewportCapture({ browserPid: params.browserPid, session });
    })().finally(() => { setupInFlight = null; });
  }
  const capture = await setupInFlight;
  if (!capture) return null;
  cachedCapture = capture;
  cachedPid = params.browserPid;
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/browser/src/browser/hyprland-capture.ts
Line: 421-449

Comment:
**Race condition in singleton setup**

Because `detectHyprlandSession()` and `setupHeadedBrowserViewportCapture()` both contain `await` points, two concurrent screenshot requests that both enter the `if (!cachedCapture)` branch will each proceed through setup independently. Both dispatch `movetoworkspacesilent` for the same PID; the second write to `cachedCapture` silently overwrites the first, leaving the first capture object abandoned without teardown.

Fix: deduplicate concurrent callers with a shared in-flight promise:

```ts
let setupInFlight: Promise<HeadedBrowserViewportCapture | null> | null = null;

if (!cachedCapture) {
  if (!setupInFlight) {
    setupInFlight = (async () => {
      const session = await detectHyprlandSession();
      if (!session) return null;
      return await setupHeadedBrowserViewportCapture({ browserPid: params.browserPid, session });
    })().finally(() => { setupInFlight = null; });
  }
  const capture = await setupInFlight;
  if (!capture) return null;
  cachedCapture = capture;
  cachedPid = params.browserPid;
}
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 378 to 389
: null;
if (hyprlandPng) {
buffer = hyprlandPng;
} else {
buffer = await captureScreenshot({
wsUrl: tab.wsUrl ?? "",
fullPage,
format: type,
quality: type === "jpeg" ? 85 : undefined,
});
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 PNG/JPEG format mismatch when type === "jpeg"

grim always outputs PNG. When type === "jpeg" and the screenshot fits within normalizeBrowserScreenshot's limits (< 5 MB, dimensions ≤ 2000 px — the typical case for a 1920×1080 viewport), normalizeBrowserScreenshot returns { buffer } with no contentType. The fallback image/${params.type} evaluates to "image/jpeg", so the saved file is labeled as JPEG despite containing raw PNG bytes.

Fix: override type to "png" for the Hyprland branch:

if (hyprlandPng) {
  await saveNormalizedScreenshotResponse({
    res,
    buffer: hyprlandPng,
    type: "png", // grim always produces PNG
    targetId: tab.targetId,
    url: tab.url,
  });
  return;
}
buffer = await captureScreenshot({
  wsUrl: tab.wsUrl ?? "",
  fullPage,
  format: type,
  quality: type === "jpeg" ? 85 : undefined,
});
Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/browser/src/browser/routes/agent.snapshot.ts
Line: 378-389

Comment:
**PNG/JPEG format mismatch when `type === "jpeg"`**

`grim` always outputs PNG. When `type === "jpeg"` and the screenshot fits within `normalizeBrowserScreenshot`'s limits (< 5 MB, dimensions ≤ 2000 px — the typical case for a 1920×1080 viewport), `normalizeBrowserScreenshot` returns `{ buffer }` with no `contentType`. The fallback `image/${params.type}` evaluates to `"image/jpeg"`, so the saved file is labeled as JPEG despite containing raw PNG bytes.

Fix: override `type` to `"png"` for the Hyprland branch:

```ts
if (hyprlandPng) {
  await saveNormalizedScreenshotResponse({
    res,
    buffer: hyprlandPng,
    type: "png", // grim always produces PNG
    targetId: tab.targetId,
    url: tab.url,
  });
  return;
}
buffer = await captureScreenshot({
  wsUrl: tab.wsUrl ?? "",
  fullPage,
  format: type,
  quality: type === "jpeg" ? 85 : undefined,
});
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +417 to +419
// ── Singleton session cache ─────────────────────────────────────────────
let cachedCapture: HeadedBrowserViewportCapture | null = null;
let cachedPid: number | null = null;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No process-exit cleanup for headless compositor output

The browser-capture headless output is never removed on process exit (SIGTERM, crash, normal exit). On next run ensureCaptureOutput will find and reuse it, so it is not permanently leaked, but the dangling monitor is visible to users in their Hyprland session.

Consider registering exit handlers that call teardownHeadedBrowserViewportCapture(cachedCapture) once, lazily, when the first capture is set up.

Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/browser/src/browser/hyprland-capture.ts
Line: 417-419

Comment:
**No process-exit cleanup for headless compositor output**

The `browser-capture` headless output is never removed on process exit (`SIGTERM`, crash, normal exit). On next run `ensureCaptureOutput` will find and reuse it, so it is not permanently leaked, but the dangling monitor is visible to users in their Hyprland session.

Consider registering exit handlers that call `teardownHeadedBrowserViewportCapture(cachedCapture)` once, lazily, when the first capture is set up.

How can I resolve this? If you propose a fix, please make it concise.

@Angfr95 Angfr95 force-pushed the fix/hyprland-headed-viewport-screenshot-timeout branch from 0c13d0d to 0b586d1 Compare April 15, 2026 11:50
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0b586d1134

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +433 to +435
if (cachedCapture && cachedPid !== params.browserPid) {
await teardownHeadedBrowserViewportCapture(cachedCapture).catch(() => {});
cachedCapture = null;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Serialize teardown with capture for mixed-profile screenshots

Guarding only setup with setupLock leaves capture and teardown unsynchronized: if a screenshot for PID A is already in captureViewportPng, a concurrent request for PID B will execute teardownHeadedBrowserViewportCapture on the shared output immediately. That can invalidate the in-flight grim capture for PID A, causing an avoidable fallback to the original CDP timeout path and flaky failures whenever two profiles are snapshotted in parallel.

Useful? React with 👍 / 👎.

Comment on lines +458 to +460
} catch {
cachedCapture = null;
cachedPid = null;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Tear down Hyprland output on capture fallback errors

The error path clears cachedCapture/cachedPid and returns null without cleaning up a headless output that may already have been created (and possibly had the browser moved onto it). After transient setup/capture failures, this can leave browser-capture state behind until a later successful PID-switch teardown happens, so a one-off failure can persistently alter compositor state.

Useful? React with 👍 / 👎.

…screenshots

On Omarchy/Hyprland/Wayland, CDP Page.captureScreenshot can hang 10s+
on headed Chromium for certain pages (GitHub Trending, Yahoo Finance)
even though the page loaded successfully.

Add a native Hyprland workaround: create a dedicated headless output,
move the browser window onto it, and capture with grim (~20-170ms).
Falls back to CDP transparently when not on Hyprland or for fullPage/
element screenshots.

Addresses review feedback:
- Force type=png when grim captures (grim always outputs PNG)
- Add setup mutex to prevent concurrent race on singleton cache

Fixes openclaw#64317
@Angfr95 Angfr95 force-pushed the fix/hyprland-headed-viewport-screenshot-timeout branch from 0b586d1 to ab78955 Compare April 15, 2026 12:01
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d5d3cd3363

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +433 to +437
if (cachedCapture && cachedPid !== params.browserPid) {
await teardownHeadedBrowserViewportCapture(cachedCapture).catch(() => {});
cachedCapture = null;
cachedPid = null;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Clean up cached Hyprland output when capture session ends

browser-capture teardown currently runs only when another screenshot arrives with a different PID (or when a capture error is thrown), so a successful screenshot followed by browser stop/restart can leave the headless Hyprland output and moved window state behind indefinitely. This makes /screenshot introduce persistent compositor side effects across runs instead of being self-contained; adding lifecycle cleanup outside the PID-mismatch path (for example on profile stop/shutdown) would prevent orphaned outputs.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dcfabcdae5

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

? Math.max(250, Math.min(10000, Math.floor(params.timeoutMs)))
: DEFAULT_CAPTURE_TIMEOUT_MS;

const png = await execBuffer(grim, ["-o", params.capture.outputName, "-l", "0", "-"], {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Capture only tab viewport in Hyprland fallback

captureViewportPng runs grim -o <output> and the route returns that buffer as the /screenshot result, which captures the full compositor output rather than the page viewport that Page.captureScreenshot normally returns. On headed Linux this can include browser chrome and non-content pixels (for example when the window is not exactly full-output), so downstream consumers that rely on viewport-faithful screenshots for visual reasoning or click targeting get incorrect imagery whenever the Hyprland fallback path is taken.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Headed Chromium viewport screenshots time out on Omarchy/Hyprland/Wayland after successful page load

1 participant