e2e: abstract container behind backend interface (docker + hypeman) #273
Introduce a Backend interface in server/e2e that captures the public surface the ~24 e2e_*_test.go files consume via *TestContainer (Start/Stop, the API/CDP/ChromeDriver endpoint accessors, API clients, Wait* helpers, Exec, ExitCh, Container). TestContainer is now a thin facade that delegates to a Backend selected at construction time.
Two backends are provided:
- dockerBackend: the historical testcontainers-go logic, moved verbatim behind the interface. Default, so existing CI is unchanged.
- hypemanBackend: starts the image as a remote VM on a running Hypeman dev server via the github.com/kernel/hypeman-go client. Endpoints target the instance's network IP on the fixed guest ports (10001/9222/9224); Exec runs against the instance API server's /process/exec endpoint to preserve the (exitCode, combinedOutput, error) contract.
Backend selection is via the KI_E2E_BACKEND env var (docker|hypeman, default docker). Hypeman connection details are read from env only and never hardcoded: KI_E2E_HYPEMAN_BASE_URL (or HYPEMAN_BASE_URL) and HYPEMAN_AUTH_TOKEN (or the SDK-native HYPEMAN_API_KEY). Optional GPU passthrough via KI_E2E_HYPEMAN_GPU_DEVICES and VM sizing via KI_E2E_HYPEMAN_SIZE.
Test changes are minimal: six direct port-field accesses in two test files now use backend-agnostic accessors (CDPAddr, ChromeDriverURL, plus new ChromeDriverAddr/ChromeDriverWSURL helpers) instead of hardcoding 127.0.0.1:<port>, which only ever worked for the Docker backend.
Added infra-free unit tests for backend selection and hypeman config validation.
This unblocks running the e2e suite against the GPU image (chromium-headful-vgpu) from kernel-images-private via the hypeman backend.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
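The selection rule in the commit message above (KI_E2E_BACKEND, docker|hypeman, default docker) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the two-method Backend interface, the selectBackend helper, and the addresses returned are assumptions made for the example; only the backend names and the env-var values come from the text.

```go
package main

import "fmt"

// Backend is a trimmed, illustrative subset of the surface described above.
// The real interface also covers Start/Stop, Exec, Wait* helpers, and more.
type Backend interface {
	Name() string
	CDPAddr() string
}

// dockerBackend stands in for the testcontainers-go implementation.
type dockerBackend struct{}

func (dockerBackend) Name() string    { return "docker" }
func (dockerBackend) CDPAddr() string { return "127.0.0.1:9222" } // illustrative address

// hypemanBackend stands in for the remote-VM implementation.
type hypemanBackend struct{ host string }

func (hypemanBackend) Name() string      { return "hypeman" }
func (b hypemanBackend) CDPAddr() string { return b.host + ":9222" }

// selectBackend (hypothetical helper) maps a KI_E2E_BACKEND value to an
// implementation. An empty value falls back to docker, so existing CI that
// never sets the variable keeps its behavior.
func selectBackend(value, hypemanHost string) (Backend, error) {
	switch value {
	case "", "docker":
		return dockerBackend{}, nil
	case "hypeman":
		return hypemanBackend{host: hypemanHost}, nil
	default:
		return nil, fmt.Errorf("unknown KI_E2E_BACKEND %q (want docker|hypeman)", value)
	}
}

func main() {
	for _, v := range []string{"", "hypeman", "podman"} {
		b, err := selectBackend(v, "10.0.0.5")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(b.Name(), b.CDPAddr())
	}
}
```

Rejecting unknown values, rather than silently defaulting, is a design choice of this sketch; it makes a typo in the env var fail loudly instead of running the wrong backend.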
Addresses review feedback on the backend interface:
- Remove Container() testcontainers.Container from the Backend interface (and the TestContainer facade). It leaked Docker-specifics into the otherwise backend-agnostic surface and was dead: no e2e test consumed it. The Docker backend keeps its *testcontainers.Container internally for Start/Exec.
- Hypeman backend: reach instances via a single host-level wildcard ingress (find-or-create, keyed by tag managed-by=ki-e2e) instead of the instance's private network IP. Set KI_E2E_HYPEMAN_INGRESS_DOMAIN to route "<instance>-<role>.<domain>" through the host's reverse proxy to guest ports 10001/9222/9224; ingress is created at most once per host and never per instance. Unset = previous raw-IP behavior (needs L3 reachability to the instance subnet). KI_E2E_HYPEMAN_INGRESS_TLS toggles https/wss on :443.
Verification: go build ./... and go vet ./e2e/ pass; new table tests cover raw-IP, ingress, and TLS endpoint derivation plus the shared-ingress params. Docker-backend e2e (TestDisplayResolutionChange + TestScreenshotHeadless) passes against onkernel/chromium-headful-private + chromium-headless-private.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
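The endpoint derivation described in this commit (raw IP on the fixed guest ports, or "<instance>-<role>.<domain>" through the ingress, with TLS switching to https/wss) might look something like the sketch below. The deriveEndpoints helper, the endpoints struct, the role names api/cdp/cd, and the choice to omit an explicit port on ingress URLs are all assumptions for illustration; the actual backend may differ.

```go
package main

import "fmt"

// endpoints holds the URLs a test would use to reach one instance.
type endpoints struct {
	API, CDP, ChromeDriver string
}

// deriveEndpoints is a hypothetical helper. With an empty domain it uses the
// raw-IP path on the fixed guest ports (10001/9222/9224). With a domain it
// builds "<instance>-<role>.<domain>" hostnames and lets the scheme imply the
// port (https/wss imply 443 when tls is true).
func deriveEndpoints(instance, ip, domain string, tls bool) endpoints {
	if domain == "" {
		return endpoints{
			API:          fmt.Sprintf("http://%s:10001", ip),
			CDP:          fmt.Sprintf("ws://%s:9222", ip),
			ChromeDriver: fmt.Sprintf("http://%s:9224", ip),
		}
	}
	httpScheme, wsScheme := "http", "ws"
	if tls {
		httpScheme, wsScheme = "https", "wss"
	}
	// Role names are assumed for the example.
	host := func(role string) string {
		return fmt.Sprintf("%s-%s.%s", instance, role, domain)
	}
	return endpoints{
		API:          httpScheme + "://" + host("api"),
		CDP:          wsScheme + "://" + host("cdp"),
		ChromeDriver: httpScheme + "://" + host("cd"),
	}
}

func main() {
	fmt.Printf("%+v\n", deriveEndpoints("vm1", "10.0.0.5", "", false))
	fmt.Printf("%+v\n", deriveEndpoints("vm1", "", "e2e.example.test", true))
}
```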
… Start)
Per review: Start() reading env vars is surprising and couples the backend to the process environment. Introduce hypemanConfig holding every option (BaseURL, Token, IngressDomain, IngressTLS, RawIP, Size, DiskIOBps, GPUDevices, GPUProfile). newHypemanBackend(image, cfg) and Start now consume only the struct; env parsing collapses to a single hypemanConfigFromEnv() called by the e2e factory, so other callers can populate options explicitly and never touch the environment.
Also defaults DiskIOBps to 62MB/s (KI_E2E_HYPEMAN_DISK_IO_BPS overrides): ad-hoc hypeman instances otherwise get ~15MB/s, which starves the in-guest playwright daemon's cold first-read (~43MB of node_modules) past its 5s start budget. With 62MB/s the daemon starts in time. Validated: persist_login TestCookiePersistence Headless now PASSES on hypeman (was failing on "playwright daemon failed to start within 5s").
go build/vet/unit pass (incl. new TestHypemanConfigFromEnv); live hypeman TestDisplayResolutionChange passes via the new construction path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
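One way to keep env parsing in a single function, as this commit describes, is to pass the lookup function in so the parser can be unit-tested without touching the process environment. The sketch below is illustrative only: it keeps four of the struct's fields, the configFromEnv and firstNonEmpty names are hypothetical, the "base URL and token are required" check is an assumed validation rule, and reading "62MB/s" as decimal megabytes is also an assumption. The override variable KI_E2E_HYPEMAN_DISK_IO_BPS is not parsed here because its value format is not stated.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// hypemanConfig is a trimmed version of the struct named in the commit.
type hypemanConfig struct {
	BaseURL    string
	Token      string
	GPUDevices []string
	DiskIOBps  int64
}

// Assumption: "62MB/s" is interpreted as 62 * 10^6 bytes per second.
const defaultDiskIOBps = 62 * 1000 * 1000

// firstNonEmpty returns the value of the first key that is set and non-empty.
func firstNonEmpty(getenv func(string) string, keys ...string) string {
	for _, k := range keys {
		if v := getenv(k); v != "" {
			return v
		}
	}
	return ""
}

// configFromEnv (hypothetical name) is the only place that reads env values.
// In production the caller would pass os.Getenv; tests pass a map lookup.
func configFromEnv(getenv func(string) string) (hypemanConfig, error) {
	cfg := hypemanConfig{
		BaseURL:   firstNonEmpty(getenv, "KI_E2E_HYPEMAN_BASE_URL", "HYPEMAN_BASE_URL"),
		Token:     firstNonEmpty(getenv, "HYPEMAN_AUTH_TOKEN", "HYPEMAN_API_KEY"),
		DiskIOBps: defaultDiskIOBps,
	}
	// Comma-split GPU devices, dropping blanks and surrounding whitespace.
	for _, d := range strings.Split(getenv("KI_E2E_HYPEMAN_GPU_DEVICES"), ",") {
		if d = strings.TrimSpace(d); d != "" {
			cfg.GPUDevices = append(cfg.GPUDevices, d)
		}
	}
	if cfg.BaseURL == "" || cfg.Token == "" {
		return cfg, errors.New("hypeman base URL and token are required")
	}
	return cfg, nil
}

func main() {
	env := map[string]string{
		"HYPEMAN_BASE_URL":           "https://hypeman.example.test",
		"HYPEMAN_API_KEY":            "placeholder-token",
		"KI_E2E_HYPEMAN_GPU_DEVICES": "gpu0, gpu1",
	}
	cfg, err := configFromEnv(func(k string) string { return env[k] })
	fmt.Println(cfg.BaseURL, cfg.GPUDevices, cfg.DiskIOBps, err)
}
```

As a rough sanity check on the disk-IO default using the figures in the commit: reading ~43 MB at ~15 MB/s takes about 2.9 s of the 5 s budget on I/O alone, while at 62 MB/s it takes about 0.7 s.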
…ckend
Mirrors the `test` job but with KI_E2E_BACKEND=hypeman, pointing
E2E_CHROMIUM_*_IMAGE at the public onkernel/chromium-{headful,headless}:<sha>
tags that build-headful/build-headless just pushed. Hypeman pulls those images
itself on instance create, so the runner needs no docker login. Uses org
var/secret HYPEMAN_API_URL / HYPEMAN_API_KEY.
Note: we deliberately do NOT build the images inside Hypeman — its builder VM's
writable layer is RAM-backed and hard-capped at memory_mb=16384, which is too
small for the chromium image build (fails with "no space left on device"). The
registry-pull approach sidesteps that entirely. See PR description.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Created a monitoring plan for this PR.
What this PR does: Adds a pluggable backend interface to the e2e test harness so browser instances can run on Hypeman remote VMs instead of (or alongside) local Docker containers, and wires up a new optional
Intended effect:
Risks:
Status updates will be posted automatically on this PR as monitoring progresses.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: CDP version uses plain HTTP
- fetchBrowserWebSocketURL now derives /json/version from CDPURL and maps ws/wss to http/https so TLS ingress backends use HTTPS correctly.
Or push these changes by commenting:
@cursor push 42b24aa0dc
Preview (42b24aa0dc)
diff --git a/server/e2e/e2e_cdp_reconnect_test.go b/server/e2e/e2e_cdp_reconnect_test.go
--- a/server/e2e/e2e_cdp_reconnect_test.go
+++ b/server/e2e/e2e_cdp_reconnect_test.go
@@ -456,7 +456,20 @@
}
func fetchBrowserWebSocketURL(ctx context.Context, c *TestContainer) (string, error) {
- versionURL := fmt.Sprintf("http://%s/json/version", c.CDPAddr())
+ versionEndpoint, err := url.Parse(c.CDPURL())
+ if err != nil {
+ return "", err
+ }
+ switch versionEndpoint.Scheme {
+ case "ws":
+ versionEndpoint.Scheme = "http"
+ case "wss":
+ versionEndpoint.Scheme = "https"
+ }
+ versionEndpoint.Path = "/json/version"
+ versionEndpoint.RawQuery = ""
+ versionEndpoint.Fragment = ""
+ versionURL := versionEndpoint.String()
req, err := http.NewRequestWithContext(ctx, http.MethodGet, versionURL, nil)
if err != nil {
return "", err
Reviewed by Cursor Bugbot for commit 32c997e. Configure here.
Bugbot Autofix prepared fixes for both issues found in the latest run.
Or push these changes by commenting:
Preview (78ab86654d)
diff --git a/server/e2e/backend_hypeman.go b/server/e2e/backend_hypeman.go
--- a/server/e2e/backend_hypeman.go
+++ b/server/e2e/backend_hypeman.go
@@ -257,22 +257,36 @@
}
c.instanceID = inst.ID
+ cleanupOnError := func(startErr error) error {
+ cleanupCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+ defer cancel()
+ if err := c.client.Instances.Delete(cleanupCtx, c.instanceID); err != nil {
+ return fmt.Errorf("%w (cleanup failed for instance %s: %v)", startErr, c.instanceID, err)
+ }
+ c.instanceID = ""
+ c.ip = ""
+ return startErr
+ }
+
// Wait for the guest program to start. The SDK caps the server-side wait at
// a few minutes; loop until our context deadline if needed.
if err := c.waitForRunning(ctx); err != nil {
- return err
+ return cleanupOnError(err)
}
if c.useIngress {
// Ensure the wildcard ingress rules exist; endpoints derive from the
// instance name + domain, so no instance IP is needed.
- return c.ensureIngress(ctx)
+ if err := c.ensureIngress(ctx); err != nil {
+ return cleanupOnError(err)
+ }
+ return nil
}
// Raw-IP fallback: reach the instance directly on its private network IP.
ip, err := c.resolveIP(ctx)
if err != nil {
- return err
+ return cleanupOnError(err)
}
c.ip = ip
return nil
diff --git a/server/e2e/container.go b/server/e2e/container.go
--- a/server/e2e/container.go
+++ b/server/e2e/container.go
@@ -2,6 +2,7 @@
import (
"context"
+ "net/url"
"strings"
"testing"
@@ -68,12 +69,30 @@
// derived from ChromeDriverURL (without scheme). Useful for substring assertions
// on proxy-rewritten URLs.
func (c *TestContainer) ChromeDriverAddr() string {
- return strings.TrimPrefix(c.backend.ChromeDriverURL(), "http://")
+ u, err := url.Parse(c.backend.ChromeDriverURL())
+ if err == nil && u.Host != "" {
+ return u.Host
+ }
+
+ addr := strings.TrimPrefix(c.backend.ChromeDriverURL(), "http://")
+ return strings.TrimPrefix(addr, "https://")
}
// ChromeDriverWSURL returns the WebSocket URL (ws://host:port/path) for the
// instance's ChromeDriver proxy. path should include a leading slash.
func (c *TestContainer) ChromeDriverWSURL(path string) string {
+ u, err := url.Parse(c.backend.ChromeDriverURL())
+ if err == nil && u.Host != "" {
+ if u.Scheme == "https" {
+ u.Scheme = "wss"
+ } else {
+ u.Scheme = "ws"
+ }
+ u.Path = path
+ u.RawQuery = ""
+ u.Fragment = ""
+ return u.String()
+ }
return "ws://" + c.ChromeDriverAddr() + path
}
diff --git a/server/e2e/e2e_cdp_reconnect_test.go b/server/e2e/e2e_cdp_reconnect_test.go
--- a/server/e2e/e2e_cdp_reconnect_test.go
+++ b/server/e2e/e2e_cdp_reconnect_test.go
@@ -456,12 +456,24 @@
}
func fetchBrowserWebSocketURL(ctx context.Context, c *TestContainer) (string, error) {
- versionURL := fmt.Sprintf("http://%s/json/version", c.CDPAddr())
- req, err := http.NewRequestWithContext(ctx, http.MethodGet, versionURL, nil)
+ versionURL, err := url.Parse(c.CDPURL())
if err != nil {
return "", err
}
+ if versionURL.Scheme == "wss" {
+ versionURL.Scheme = "https"
+ } else {
+ versionURL.Scheme = "http"
+ }
+ versionURL.Path = "/json/version"
+ versionURL.RawQuery = ""
+ versionURL.Fragment = ""
+ req, err := http.NewRequestWithContext(ctx, http.MethodGet, versionURL.String(), nil)
+ if err != nil {
+ return "", err
+ }
+
resp, err := http.DefaultClient.Do(req)
if err != nil {
return "", err
…rgets
The test-hypeman job ran `make test`, which runs unit tests first; a flaky chromium-dependent unit test (lib/devtoolsproxy, unrelated to the backend) failed and blocked the e2e suite from running at all. Split `test` into `test-unit` + `test-e2e` and point the hypeman job at `test-e2e` so it exercises only the e2e suite on the Hypeman backend (unit tests already run in the `test` job). The var/secret fix is confirmed working; the prior config error is gone.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bugbot (correctly): after Instances.New, a failure in waitForRunning/ensureIngress/resolveIP returned from Start without deleting the instance, and tests only register Stop after a successful Start, so failed runs leaked a remote VM. Start now tears the instance down (fresh ctx, so a cancelled/expired parent ctx still deletes) if bring-up fails.
Defense in depth for the cases Start can't cover (panic/timeout/crashed runner after a successful Start): tag instances managed-by=ki-e2e on create, and add a nightly workflow (hypeman-reap-e2e.yml) that deletes "ki-e2e-" instances older than 3h (> the 2h e2e timeout, so it can't touch an in-progress run). One reaper covers instances from both this repo and the private fork (shared dev server).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…t_ready)
A freshly-pushed image tag isn't on the hypeman host yet on first use; the create call returns a retryable 400 image_not_ready while the pull runs in the background. Poll Instances.New until the pull completes or ctx is done, instead of failing the first test that uses a new tag.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rivate)
The hypeman e2e backend lives upstream here, but actually *running* it against the staging hypeman server is moving to kernel-images-private on a Tailscale-joined runner: CDP/ChromeDriver are being made tailnet-only, kernel-images is public (its CI logs would leak live instance CDP URLs), and self-hosted/tailnet runners shouldn't be exposed to a public repo. The public CI keeps the docker-backend e2e only.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


Summary
Abstracts the e2e browser instance behind a `Backend` interface in `server/e2e` with two interchangeable implementations selected by `KI_E2E_BACKEND` (default `docker`, so existing CI is unchanged):
- `dockerBackend`: the historical `testcontainers-go` logic, moved behind the interface.
- `hypemanBackend`: starts the image as a remote VM via `github.com/kernel/hypeman-go`, reaching it through the host's wildcard ingress (hostname `{instance}.<domain>`, routed by listen port, TLS-terminated). `api` reuses the host's existing `444→10001` browser ingress; `cdp 9222`/`cd 9224` are find-or-created once per host (matched by rule shape across all ingresses, never per-instance). Domain is derived from the base URL (`KI_E2E_HYPEMAN_INGRESS_DOMAIN` overrides); `KI_E2E_HYPEMAN_RAW_IP=1` falls back to the instance's private IP.
The ~24 `e2e_*_test.go` files keep using `*TestContainer` unchanged; it's now a thin facade over the selected `Backend`.
Config / env
- `KI_E2E_BACKEND=docker|hypeman`
- `HYPEMAN_BASE_URL` + `HYPEMAN_API_KEY` (or `KI_E2E_HYPEMAN_BASE_URL` / `HYPEMAN_AUTH_TOKEN`); optional `KI_E2E_HYPEMAN_INGRESS_DOMAIN`, `KI_E2E_HYPEMAN_INGRESS_TLS` (default on), `KI_E2E_HYPEMAN_RAW_IP`, `KI_E2E_HYPEMAN_GPU_PROFILE` (vGPU images), `KI_E2E_HYPEMAN_SIZE`, `KI_E2E_HYPEMAN_DISK_IO_BPS` (default `62MB/s`).
- Env is parsed in one place (`hypemanConfigFromEnv`): `newHypemanBackend(image, cfg)` and `Start` consume an explicit `hypemanConfig`, so the backend can be constructed programmatically with explicit options and never touches the env. Secrets are referenced by env-var name only, never hardcoded.
Review feedback addressed
- Removed `Container() testcontainers.Container` from the interface + facade: it leaked Docker specifics and was dead (no test used it). The Docker backend keeps `*testcontainers.Container` internally.
- `HostAccess`: reframed from "Docker host.docker.internal (Docker backend only)" to a backend-agnostic capability ("reach a service on the test host"); the Docker backend maps `host.docker.internal`, the hypeman backend rejects it explicitly (no silent no-op) since a remote VM has no host-loopback bridge. Used by the private capmonster / persisted-login tests, which therefore stay on the Docker backend.
- `Start()` no longer reads the environment: introduced a `hypemanConfig` struct holding every option (base URL, token, ingress domain/TLS, raw-IP, size, disk-IO, GPU devices/profile). Env parsing collapses to a single `hypemanConfigFromEnv()` called by the e2e factory; the backend and `Start` consume only the struct.
- Defaults `DiskIoBps` to `62MB/s` (`KI_E2E_HYPEMAN_DISK_IO_BPS` overrides). Ad-hoc Hypeman instances otherwise get ~15 MB/s, which starves a playwright-daemon-dependent test's cold first-read (~43 MB of `node_modules`) past the server's 5 s daemon-start budget; at 62 MB/s the daemon starts in time. (Validated cross-repo in the private-fork mirror kernel/kernel-images-private#226, where the cookie-persistence e2e, which exercises `ExecutePlaywrightCode`, now passes on Hypeman; it previously failed on `playwright daemon failed to start within 5s`.)
Verification (both backends exercised end-to-end)
Build/vet/unit: `go build ./...`, `go vet ./e2e/` clean; table tests cover backend selection, raw-IP vs ingress vs TLS endpoint derivation, the per-role ingress params, domain derivation, the HostAccess rejection, and `hypemanConfigFromEnv` mapping (SDK-native fallbacks, TLS default, comma-split GPU devices). The live Hypeman smoke ran via the new `hypemanConfigFromEnv → newHypemanBackend(image, cfg)` construction path.
Docker backend - PASS (`onkernel/chromium-headful-private` + `chromium-headless-private`):
Hypeman backend - PASS against the live staging dev server (`https://hypeman.dev-yul-hypeman-1.kernel.sh`):
This created a real instance, reused the `:444→10001` ingress + created `ki-e2e-cdp`/`ki-e2e-cd` once, then drove `PATCH /display` (1024→1920×1080→1280×720) and verified Xvfb resolution via the API server + `Exec`, all over the TLS ingress. Instance + behavior confirmed; created ingresses persist for reuse, instances are cleaned up on `Stop`.
GPU (vGPU image): `KI_E2E_HYPEMAN_GPU_PROFILE` lets the backend boot `chromium-headful-vgpu`; the GPU-specific tests live in the private fork. They currently boot the vGPU instance to Running but its in-guest API needs the production GPU/Neko/NVIDIA-licensing env to become ready; this is tracked there.
Unblocks running the public e2e suite against the GPU image from kernel-images-private via the hypeman backend.
CI: running e2e against the Hypeman backend
Added a `test-hypeman` job to `server-test.yaml` that runs the same suite with `KI_E2E_BACKEND=hypeman`. It reuses the public `onkernel/chromium-{headful,headless}:<sha>` images that `build-headful`/`build-headless` push to Docker Hub; Hypeman pulls them itself on instance-create (any registry works via the host's docker creds; validated: the e2e suite already runs on Hypeman against a private `onkernel/chromium-headless-private` tag). Uses org variable `HYPEMAN_BASE_URL` + secret `HYPEMAN_API_KEY`. The runner needs no docker login.
This is the first full-suite run on the Hypeman backend; individual tests may still need backend-specific fixes, so it's reasonable to keep this check non-required in branch protection until it's consistently green.
Conceded: building images inside Hypeman (local-dev iteration) is blocked
The original goal also included a "build a local Dockerfile in Hypeman" path for local dev (edit Dockerfile → build in Hypeman → run e2e against it, without pushing to a registry). This is currently blocked and intentionally not implemented, because:
- A build API exists (`POST /builds`, async, usable as `InstanceNewParams.Image`; verified end-to-end with a trivial image), but
- the builder VM's writable layer is RAM-backed and hard-capped at `memory_mb=16384` (the API rejects more with `memory_mb exceeds maximum of 16384 MB`), and 16 GB is not enough for the chromium image build. Measured scaling on the headless Dockerfile: 2 GB → apt fails at ~18 s, 8 GB → ~28 s, 16 GB → apt passes but the concurrent Go-module/node stages then fail with `no space left on device`.
So building chromium in Hypeman needs a server-side change (a builder disk-size param decoupled from memory, a higher cap, or a pre-populated `global_cache_key` for the heavy base layers; the param exists with `"ubuntu"`/`"browser"` as documented example keys, which looks like the intended scaling path). Until then:
Note
Medium Risk
Large new e2e infrastructure path (remote VM lifecycle, ingress mutation, secrets via env) with moderate blast radius if misconfigured in CI, though default Docker behavior is preserved.
Overview
Introduces a pluggable `Backend` for browser e2e instances so the same `*TestContainer` API can run against Docker (testcontainers, default) or remote Hypeman VMs via `KI_E2E_BACKEND`. Existing e2e tests stay on `NewTestContainer`/`Start`/`Stop`; Docker logic moves to `dockerBackend`, and a new `hypemanBackend` provisions VMs with `hypeman-go`, ingress or raw-IP routing, image-pull retries, and cleanup on failed bring-up.
Hypeman-specific behavior: env is parsed once into `hypemanConfig`; `HostAccess` is rejected (no host loopback bridge); default disk I/O is `62MB/s`; wildcard ingress rules are find-or-created per host. `TestContainer` drops direct testcontainers exposure and adds helpers (`ChromeDriverAddr`, `ChromeDriverWSURL`) so BiDi/CDP tests use backend-derived URLs instead of hardcoded ports.
Makefile: `test` splits into `test-unit` and `test-e2e` for CI jobs that only run e2e. Deps: adds `github.com/kernel/hypeman-go` plus unit tests in `backend_test.go` for backend selection and Hypeman endpoint/ingress logic.
Reviewed by Cursor Bugbot for commit 810889c. Bugbot is set up for automated code reviews on this repo.