Add a server-side console screenshot endpoint / hcloud server screenshot (PNG of the VGA console, no interactive VNC session) #1420

@alexandru-savinov

Summary

I'd like to request a server-side console screenshot capability: an action that captures the current VGA/console framebuffer of a Cloud server and returns it as an image (PNG/JPEG), without requiring an interactive noVNC session. Ideally surfaced in the CLI as:

hcloud server screenshot <server> -o console.png

I realize a still image of the console is a new platform/API capability, and that this repo is a thin wrapper over the public API, so this is filed here mainly to track the CLI surface and gauge interest. If the maintainers would rather I route the platform side through the in-Console Feedback menu (the channel announced on docs.hetzner.cloud/whats-new on 2025-08-26 for product/API feature requests), I'm happy to cross-post there; a pointer would be appreciated.

Use case

The motivating scenario is debugging a server that is unbootable or otherwise unreachable, where you have no other window into what's happening:

  • A server stuck in the UEFI/EDK2 shell or boot menu.
  • A stage-1 / initramfs boot hang, a kernel panic, or a GRUB/BIOS screen frozen before networking comes up.
  • Anything where SSH never comes up and there is no serial-console-log API to read instead.
  • CI / automation, where there is no human available to open the noVNC console and look at the screen. A pipeline that provisions a server, waits, and finds it unreachable currently has no programmatic way to capture "what is on the screen right now" for triage or for attaching to a failure report.

In all of these, the one piece of information that would unblock diagnosis is a single image of the console. Today there is no way to get that without a human and a browser.

The gap

The Cloud API's console support is interactive only. Its single console endpoint is:

POST /v1/servers/{id}/actions/request_console

"Requests credentials for remote access via VNC over websocket to keyboard, monitor, and mouse for a Server. The provided URL is valid for 1 minute […]"

It returns a wss_url (e.g. wss://console.hetzner.cloud/?server_id=…&token=…, token valid ~1 minute) plus a VNC password. That is a live noVNC/VNC-over-WebSocket interactive session — keyboard/monitor/mouse — not a still image.
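
For reference, a minimal sketch of calling this action from Python with only the standard library. The helper names (`build_console_request`, `parse_console_response`, `request_console`) are illustrative and not part of any official client; the `wss_url` and `password` field names follow the response described above.

```python
import json
import urllib.request

API_BASE = "https://api.hetzner.cloud/v1"


def build_console_request(server_id: int, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST for the request_console action."""
    return urllib.request.Request(
        f"{API_BASE}/servers/{server_id}/actions/request_console",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def parse_console_response(body: bytes) -> tuple[str, str]:
    """Extract (wss_url, password) from the action's JSON response."""
    data = json.loads(body)
    return data["wss_url"], data["password"]


def request_console(server_id: int, token: str) -> tuple[str, str]:
    """Send the request; the returned URL has to be dialed within about a minute."""
    with urllib.request.urlopen(build_console_request(server_id, token)) as resp:
        return parse_console_response(resp.read())
```

Everything after this point (dialing the WebSocket, the VNC handshake, decoding pixels) is left to the caller, which is the gap this issue is about.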

Checked against the authoritative OpenAPI spec (docs.hetzner.cloud/cloud.spec.json): of the 24 /servers/{id}/actions/* paths, request_console is the only console-related one. The canonical clients mirror this exactly — hcloud-go exposes only ServerClient.RequestConsole ("requests a WebSocket VNC console") and hcloud-python only request_console (→ wss_url + password). Neither has any screenshot method, because the API exposes none.

Industry parity

Server-side console screenshots are a fairly common primitive among major providers — usually built exactly for the "instance won't boot / can't SSH in" case:

| Provider | Screenshot API? | API / CLI | Returns |
|---|---|---|---|
| AWS EC2 | Yes | GetConsoleScreenshot (aws ec2 get-console-screenshot) | Base64-encoded JPEG (imageData), server-side, no session |
| Google Cloud (GCE) | Yes | instances.getScreenshot (gcloud compute instances get-screenshot) | Base64-encoded JPEG (contents); needs a virtual display device |
| Azure VMs | Yes | RetrieveBootDiagnosticsData (Boot Diagnostics) | Short-lived SAS URL to vm.screenshot.bmp (BMP) |
| Hetzner Cloud | No | request_console only | Interactive VNC wss_url + password (no image) |
| DigitalOcean | No | - | Interactive browser/VNC console only |
| Vultr | No | - | Interactive noVNC console only |
| Linode (Akamai) | No | - | Glish (interactive VNC) / Lish (shell) only |

So this isn't universal — DigitalOcean, Vultr, and Linode also lack it — but the three hyperscalers (AWS, GCP, Azure) all provide it, and AWS/GCP in particular return it as a simple base64 image, which is a nice shape to mirror.
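
To illustrate the shape being mirrored, here is a small sketch of the AWS flow using boto3's `get_console_screenshot`. The wrapper name `ec2_console_screenshot` is hypothetical, and the client is passed in so the function itself has no third-party import.

```python
import base64


def ec2_console_screenshot(ec2_client, instance_id: str) -> bytes:
    """Return the decoded JPEG bytes of an EC2 instance's console."""
    # WakeUp=True asks EC2 to wake a sleeping display before capturing.
    response = ec2_client.get_console_screenshot(
        InstanceId=instance_id, WakeUp=True
    )
    # ImageData is a base64 string; one decode yields the image bytes.
    return base64.b64decode(response["ImageData"])
```

With boto3 installed this would be called as `ec2_console_screenshot(boto3.client("ec2"), "i-…")` and the result written straight to a `.jpg` file: one request, no session, no protocol work.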

Current workaround (and why it's fragile)

Because there's no native endpoint, the only way to get a console image today is to call request_console, dial the wss:// proxy, speak the RFB/VNC protocol yourself, and decode the framebuffer — all within the ~1-minute token window. People have repeatedly built tooling to do exactly this, which is itself evidence of the demand:

  • agentydragon/ducktape — hetzner_vnc_screenshot — agent skill that does request_console → WebSocket → RFB (via asyncvnc) → PNG (Pillow). Explicitly framed as the workaround for diagnosing boot problems, kernel panics, and stuck GRUB/BIOS/UEFI screens.
  • t-unix/hcloud-console — Go CLI that decodes the RFB 3.8 framebuffer; --once <id> captures a single frame and exits (scriptable). "Exists precisely because the Hetzner API exposes no native console screenshot."
  • hilbix/hcloud-console — Python+noVNC middleware bridging the request-console VNC WebSocket to a browser viewer; canonical example of the fragile wss→RFB bridge built around the missing endpoint.
  • RavuAlHemio — hetzner-vnc-alt-sysrq.js — browser-console helper for the noVNC console; adjacent evidence of the same ecosystem of hacks against the VNC websocket.

Each of these reimplements a VNC client just to grab one picture, has to race the 1-minute token, and breaks if the proxy/RFB details change. A native endpoint would make all of that unnecessary.
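
To give a sense of what "speak RFB yourself" involves, below is a rough sketch of two of the byte-level messages a client has to handle, following the message layouts in RFC 6143. It is not taken from any of the tools listed above, and it deliberately leaves out the WebSocket framing, the security handshake with the VNC password, and the decoding of the pixel data that comes back, all of which a working client also needs.

```python
import struct

RFB_VERSION_3_8 = b"RFB 003.008\n"  # 12-byte ProtocolVersion message


def framebuffer_update_request(width: int, height: int,
                               incremental: bool = False) -> bytes:
    """Client message type 3: ask for the rectangle (0, 0, width, height)."""
    return struct.pack(">BBHHHH", 3, int(incremental), 0, 0, width, height)


def parse_server_init(data: bytes) -> dict:
    """Parse ServerInit: screen size, 16-byte pixel format, desktop name."""
    width, height = struct.unpack_from(">HH", data, 0)
    bits_per_pixel, depth, big_endian, true_colour = struct.unpack_from(
        ">BBBB", data, 4
    )
    (name_len,) = struct.unpack_from(">I", data, 20)
    name = data[24:24 + name_len].decode("latin-1")
    return {
        "width": width,
        "height": height,
        "bits_per_pixel": bits_per_pixel,
        "depth": depth,
        "big_endian": bool(big_endian),
        "true_colour": bool(true_colour),
        "name": name,
    }
```

Even this fragment only gets as far as asking for a frame; it all has to complete inside the token window before any image exists.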

Proposed shape

Modeling on AWS/GCP, something like a read-style action:

POST /v1/servers/{id}/actions/request_console_screenshot

returning a small JSON body, e.g.:

{
  "image": "<base64-encoded image data>",
  "content_type": "image/png"
}

(JPEG would be equally fine — matching AWS/GCP — as long as content_type makes it explicit.) No interactive session, no token-window race; the caller just decodes the base64 and writes a file.
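
A sketch of what client code could look like against that response, assuming the hypothetical `image` / `content_type` fields proposed above (no such endpoint exists today, and the function names are illustrative):

```python
import base64
import json

# Map the proposed content_type values to file extensions.
EXTENSIONS = {"image/png": ".png", "image/jpeg": ".jpg"}


def decode_screenshot(body: bytes) -> tuple[bytes, str]:
    """Return (image_bytes, file_extension) from the proposed JSON body."""
    data = json.loads(body)
    content_type = data["content_type"]
    if content_type not in EXTENSIONS:
        raise ValueError(f"unexpected content_type: {content_type!r}")
    # validate=True rejects non-base64 characters instead of skipping them.
    image = base64.b64decode(data["image"], validate=True)
    return image, EXTENSIONS[content_type]


def save_screenshot(body: bytes, stem: str) -> str:
    """Write the image next to `stem` and return the resulting path."""
    image, extension = decode_screenshot(body)
    path = stem + extension
    with open(path, "wb") as fh:
        fh.write(image)
    return path
```

Compared with the RFB workaround, the whole client side reduces to one JSON parse and one base64 decode.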

CLI surface:

hcloud server screenshot <server> -o console.png   # decode + write the image
hcloud server screenshot <server> > console.png    # raw image to stdout

Notes

  • This is genuinely a platform capability, so I understand it may not be actionable in this repo directly — feel free to redirect me to the Console Feedback menu or to hcloud-go/the API team. I'd just like it tracked and the demand visible.
  • Not urgent and not a blocker — the workarounds above function. The ask is to make the common "my VM won't boot, show me the screen" case a one-liner instead of a custom VNC client. Thanks for considering it.
