feat: reuse mock-LLM E2E tests for Docker image validation by malhotra5 · Pull Request #992 · OpenHands/agent-canvas

malhotra5 · 2026-06-01T17:03:37Z

A human has tested these changes.

Why

Related to #511

The mock-LLM E2E tests currently only validate the npm build path (bin/agent-canvas.mjs + uvx). The Docker all-in-one image (ghcr.io/openhands/agent-canvas) has no automated behavioral validation — a broken entrypoint, misconfigured proxy route, or missing dependency would only be caught manually. The test specs and helpers are already well-factored and infrastructure-agnostic, so reusing them for Docker validation is straightforward.

Summary

Split MOCK_LLM_BASE_URL into test-facing (MOCK_LLM_BASE_URL) and agent-facing (MOCK_LLM_AGENT_URL) constants for Docker networking compatibility
Added playwright.mock-llm-docker.config.ts that launches a Docker container (--network host) instead of bin/agent-canvas.mjs, pointing at the exact same test specs
Added CI workflow (.github/workflows/mock-llm-docker-e2e.yml) that chains off the existing Docker CI via workflow_run: it pulls the already-built image from GHCR (no rebuild), runs the tests against it, and posts a PR comment

Issue Number

N/A

How to Test

npm path (unchanged behavior):

npm run build:app
npm run test:e2e:mock-llm

Docker path (new):

# Build or pull a Docker image
docker build -f docker/Dockerfile \
  --build-arg AGENT_SERVER_IMAGE=ghcr.io/openhands/agent-server:1.24.0-python \
  --build-arg AUTOMATION_VERSION=1.0.0a5 \
  -t agent-canvas-test:local .

# Run the same test specs against Docker
MOCK_LLM_DOCKER_IMAGE=agent-canvas-test:local npm run test:e2e:mock-llm:docker

CI: The Docker E2E workflow triggers automatically after the existing Docker workflow completes successfully (via workflow_run). It can also be triggered manually via workflow_dispatch with a custom image tag.

Notes

The npm path is fully backward-compatible — MOCK_LLM_AGENT_URL defaults to MOCK_LLM_BASE_URL when not set
Docker config uses --network host (Linux-only). For macOS/Windows Docker Desktop, set MOCK_LLM_AGENT_URL=http://host.docker.internal:9999
The CI workflow does NOT rebuild the Docker image — it chains off the existing Docker workflow and pulls the already-pushed sha-<short>-amd64 tag from GHCR
workflow_dispatch is still available for testing specific image versions manually
render-mock-llm-report.mjs now accepts a --title flag to differentiate Docker vs npm reports
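
For readers skimming the notes, the URL split could look roughly like the sketch below. This is an illustration only, not the code in mock-llm-helpers.ts: the helper name `resolveMockLlmUrls`, the `localhost` host, and passing the environment in as a plain object are all assumptions; the port 9999 is taken from the Docker Desktop note above.

```typescript
// Hypothetical sketch; the real constants live in
// tests/e2e/mock-llm/utils/mock-llm-helpers.ts and may be shaped differently.
type Env = Record<string, string | undefined>;

// Assumed host-local address of the mock LLM server (port from the note above).
const MOCK_LLM_BASE_URL = "http://localhost:9999";

interface MockLlmUrls {
  baseUrl: string; // used by the tests for the mock server's admin API
  agentUrl: string; // written into the LLM profile, used by the agent-server
}

function resolveMockLlmUrls(env: Env): MockLlmUrls {
  const override = (env["MOCK_LLM_AGENT_URL"] ?? "").trim();
  return {
    baseUrl: MOCK_LLM_BASE_URL,
    // Fall back to the test-facing URL so the npm path needs no extra env var.
    agentUrl: override !== "" ? override : MOCK_LLM_BASE_URL,
  };
}
```

With an empty environment both URLs are identical (the npm path); setting MOCK_LLM_AGENT_URL=http://host.docker.internal:9999 changes only the agent-facing URL.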

This PR was created by an AI agent (OpenHands) on behalf of the user.

🐳 Docker images for this PR

• GHCR package: https://github.com/OpenHands/agent-canvas/pkgs/container/agent-canvas

Component	Value
Image	`ghcr.io/openhands/agent-canvas`
Architectures	amd64, arm64
Agent Server	`ghcr.io/openhands/agent-server:1.24.0-python`
Automation	`openhands-automation==1.0.0a5`
Commit	`7b048d3246efbd60aba89133dedb322cb8f89b42`

Pull (multi-arch manifest)

# Multi-arch manifest — Docker automatically pulls the correct architecture
docker pull ghcr.io/openhands/agent-canvas:sha-7b048d3

Run

docker run -it --rm \
  -p 8000:8000 \
  ghcr.io/openhands/agent-canvas:sha-7b048d3

All tags pushed for this build

ghcr.io/openhands/agent-canvas:sha-7b048d3-amd64
ghcr.io/openhands/agent-canvas:feat-mock-llm-docker-e2e-amd64
ghcr.io/openhands/agent-canvas:pr-992-amd64
ghcr.io/openhands/agent-canvas:sha-7b048d3-arm64
ghcr.io/openhands/agent-canvas:feat-mock-llm-docker-e2e-arm64
ghcr.io/openhands/agent-canvas:pr-992-arm64
ghcr.io/openhands/agent-canvas:sha-7b048d3
ghcr.io/openhands/agent-canvas:feat-mock-llm-docker-e2e
ghcr.io/openhands/agent-canvas:pr-992

About Multi-Architecture Support

Each tag (e.g., sha-7b048d3) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., sha-7b048d3-amd64) are also available if needed

Add a Docker-specific Playwright config (playwright.mock-llm-docker.config.ts) that runs the exact same test specs and helpers against the agent-canvas Docker image instead of the npm build path (bin/agent-canvas.mjs + uvx).

Key changes:

- Split MOCK_LLM_BASE_URL into two constants in mock-llm-helpers.ts:
  - MOCK_LLM_BASE_URL: always host-local, used by tests for admin API
  - MOCK_LLM_AGENT_URL: env-overridable, used when configuring the LLM profile (the URL the agent-server uses for inference). Defaults to MOCK_LLM_BASE_URL for backward compatibility with the npm path.
- New playwright.mock-llm-docker.config.ts:
  - Starts the mock LLM server on the host (same as npm path)
  - Runs the Docker container with --network host (Linux CI)
  - Points to the same testDir (tests/e2e/mock-llm/) and specs
  - Separate output dirs to avoid collision with npm path results
- New CI workflow (.github/workflows/mock-llm-docker-e2e.yml):
  - Builds the Docker image from current code (or uses a pre-built image)
  - Runs the same specs against the container
  - Posts PR comment with differentiated report title
- render-mock-llm-report.mjs: accept --title flag for Docker vs npm reports
- npm run test:e2e:mock-llm:docker script added
- .gitignore updated for docker test output dirs

The npm path (test:e2e:mock-llm) is fully backward-compatible: no env var override is needed since MOCK_LLM_AGENT_URL defaults to MOCK_LLM_BASE_URL.

Co-authored-by: openhands <openhands@all-hands.dev>

vercel · 2026-06-01T17:03:44Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agent-canvas	Ready	Preview, Comment	Jun 1, 2026 7:00pm

github-actions · 2026-06-01T17:09:00Z

✅ Mock-LLM E2E Tests

7/7 passed

Commit: 873eeafd · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	285ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	25.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	5.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.5s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

github-actions · 2026-06-01T17:11:24Z

❌ Mock-LLM Docker E2E Test Results

5/7 passed · 1 failed · 1 skipped

Commit: 873eeafd · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	550ms
❌	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	6.7s
⏭️	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	0ms
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	5.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

🔍 Failure details (1)

❌ mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI

Error: Expected at least 1 automation, got: {"automations":[],"total":0}

expect(received).toBeGreaterThanOrEqual(expected)

Expected: >= 1
Received:    0

Instead of rebuilding the Docker image in the E2E workflow (duplicating ~10-15 min of Docker build time), use workflow_run to trigger automatically after the existing 'Docker' workflow completes successfully.

The workflow now:

- Triggers on: workflow_run (Docker completed) + workflow_dispatch (manual)
- Derives the image tag from the Docker build's commit SHA (ghcr.io/openhands/agent-canvas:sha-<short>-amd64)
- Pulls the already-built image from GHCR (no rebuild needed)
- Checks out code at the same SHA as the Docker build
- Extracts PR number from workflow_run.pull_requests[] for comments

Removed: Docker build steps, Buildx setup, build-arg resolution. All image building stays in docker.yml where it belongs.

Co-authored-by: openhands <openhands@all-hands.dev>
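
The tag derivation described in this commit can be sketched as follows. This is a guess at the logic, not the workflow's actual shell step: the function name `imageTagForCommit` is hypothetical, and the 7-character short SHA is inferred from the tags listed earlier in this PR (for example sha-7b048d3-amd64).

```typescript
// Hypothetical sketch of deriving the GHCR tag from the Docker build's commit SHA.
const IMAGE_REPO = "ghcr.io/openhands/agent-canvas";

type Arch = "amd64" | "arm64";

function imageTagForCommit(sha: string, arch: Arch = "amd64"): string {
  // Require a full SHA so a truncated or empty value fails loudly instead of
  // producing a tag that does not exist in the registry.
  if (!/^[0-9a-f]{40}$/i.test(sha)) {
    throw new Error(`expected a full 40-character commit SHA, got: ${sha}`);
  }
  // Assumption: the Docker workflow tags images with the first 7 hex characters.
  const short = sha.slice(0, 7).toLowerCase();
  return `${IMAGE_REPO}:sha-${short}-${arch}`;
}
```

For the commit shown in the image table above, this yields ghcr.io/openhands/agent-canvas:sha-7b048d3-amd64, which matches one of the pushed tags.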

github-actions · 2026-06-01T17:14:49Z

❌ Mock-LLM E2E Tests

4/7 passed · 1 failed · 2 skipped

Commit: d0d30867 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	225ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	24.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.1s
❌	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	5.1s
⏭️	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	0ms
⏭️	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	0ms

🔍 Failure details (1)

❌ mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API

Error: Profile "mock-llm-e2e" should have an "Active" badge

expect(received).toBe(expected) // Object.is equality

Expected: true
Received: false

The 'Active badge' check in step 2 used a hardcoded 1-second waitForTimeout before reloading. On a loaded CI runner the profile activation mutation may not persist in time, causing the reload to show stale state. This is a pre-existing flake (identical test code passed on the first push and failed on the second).

Replace with expect.poll() that retries the reload+check cycle with increasing intervals (1s, 2s, 3s) up to 15 seconds total.

Co-authored-by: openhands <openhands@all-hands.dev>
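
To make the retry timing concrete, here is a standalone sketch of the same idea. It is a stand-in for Playwright's expect.poll, not its implementation and not the spec's code: `pollSchedule` and `pollUntil` are hypothetical names, the last interval is assumed to repeat, and the time spent inside each check is ignored.

```typescript
// Offsets (ms from the start) at which a check would run. The last interval
// repeats once the list is exhausted; attempts stop at the timeout.
function pollSchedule(intervals: number[], timeoutMs: number): number[] {
  if (intervals.length === 0 || intervals.some((i) => i <= 0)) {
    throw new Error("intervals must be a non-empty list of positive numbers");
  }
  const offsets: number[] = [];
  let elapsed = 0;
  for (let attempt = 0; elapsed < timeoutMs; attempt++) {
    offsets.push(elapsed);
    elapsed += intervals[Math.min(attempt, intervals.length - 1)]!;
  }
  return offsets;
}

// Runs `check` on that schedule and resolves true on the first success.
// `sleep` is injected so the sketch does not depend on timer globals.
async function pollUntil(
  check: () => Promise<boolean>,
  intervals: number[],
  timeoutMs: number,
  sleep: (ms: number) => Promise<void>,
): Promise<boolean> {
  let previous = 0;
  for (const offset of pollSchedule(intervals, timeoutMs)) {
    await sleep(offset - previous);
    previous = offset;
    if (await check()) return true; // e.g. reload the page, look for the badge
  }
  return false;
}
```

With intervals of 1s, 2s, 3s and a 15-second budget, this sketch checks at 0s, 1s, 3s, 6s, 9s and 12s, so a slow mutation gets several chances instead of the single one a fixed 1-second wait allowed.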

github-actions · 2026-06-01T17:20:35Z

✅ Mock-LLM E2E Tests

7/7 passed

Commit: efd67c83 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	276ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	24.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.5s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

workflow_run only fires when the workflow file exists on the default branch (main). Since mock-llm-docker-e2e.yml is new and only on the PR branch, GitHub doesn't recognize it as a workflow_run listener yet.

Add pull_request trigger (gated by 'e2e-tests' label, skip forks) that polls the Docker workflow via gh API until it completes for the PR's head SHA, then pulls the already-built image from GHCR and runs tests.

After merge, workflow_run takes over as the primary automatic trigger. The pull_request path remains as a fallback for label-gated runs.

Co-authored-by: openhands <openhands@all-hands.dev>
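
The decision the polling loop has to make on each iteration can be sketched like this. It is an assumption about the logic, not the workflow's script: the function names are hypothetical, and the interface lists only the `head_sha`, `status` and `conclusion` fields that GitHub's REST API returns for a workflow run.

```typescript
// Subset of the fields returned for a workflow run by GitHub's REST API.
interface WorkflowRun {
  head_sha: string;
  status: string; // e.g. "queued", "in_progress", "completed"
  conclusion: string | null; // e.g. "success", "failure"; null until completed
}

type DockerBuildState = "not-found" | "pending" | "succeeded" | "failed";

// Classify the Docker workflow for a given PR head SHA. The first matching
// run wins; this assumes the caller passes runs newest-first.
function dockerBuildState(runs: WorkflowRun[], headSha: string): DockerBuildState {
  const run = runs.find((r) => r.head_sha === headSha);
  if (!run) return "not-found";
  if (run.status !== "completed") return "pending";
  return run.conclusion === "success" ? "succeeded" : "failed";
}

// Message for the case where polling gives up, keeping "never started"
// distinct from "started but too slow".
function timeoutMessage(lastState: DockerBuildState, headSha: string): string {
  return lastState === "not-found"
    ? `No Docker workflow run found for ${headSha}`
    : `Docker workflow run for ${headSha} did not complete in time`;
}
```

A loop built on this would keep polling while the state is "not-found" or "pending", pull the image on "succeeded", and fail fast on "failed".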

github-actions · 2026-06-01T17:30:14Z

✅ Mock-LLM E2E Tests

7/7 passed

Commit: d8a44260 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	241ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	24.6s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.6s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	5.4s

github-actions · 2026-06-01T17:34:35Z

❌ Mock-LLM Docker E2E Test Results

5/7 passed · 1 failed · 1 skipped

Commit: d8a44260 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	590ms
❌	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	6.6s
⏭️	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	0ms
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

🔍 Failure details (1)

❌ mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI

Error: Expected at least 1 automation, got: {"automations":[],"total":0}

expect(received).toBeGreaterThanOrEqual(expected)

Expected: >= 1
Received:    0

github-actions · 2026-06-01T17:43:54Z

✅ Mock-LLM E2E Tests

7/7 passed

Commit: 19624954 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	262ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	25.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.0s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

github-actions · 2026-06-01T17:48:08Z

❌ Mock-LLM Docker E2E Test Results

5/7 passed · 1 failed · 1 skipped

Commit: 19624954 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	585ms
❌	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	6.7s
⏭️	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	0ms
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.8s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

🔍 Failure details (1)

❌ mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI

Error: Expected at least 1 automation, got: {"automations":[],"total":0}

expect(received).toBeGreaterThanOrEqual(expected)

Expected: >= 1
Received:    0

…o Docker entrypoint

The Docker entrypoint was missing several environment variables that the npm path (dev-with-automation.mjs) sets for the automation backend:

- FILE_STORE=local: without this, the automation backend may fall back to cloud storage (S3/GCS) which fails without credentials, causing tarball-based presets (preset/prompt, preset/plugin) to silently error
- LOCAL_STORAGE_PATH: where to store files on the local filesystem
- AUTOMATION_BASE_URL: publicly-reachable base URL for callback URLs
- AUTOMATION_WORKSPACE_BASE: where automation runs unpack tarballs

This explains the Docker E2E failure: the agent's curl to create an automation via /api/automation/v1/preset/prompt returned an error (likely 500 from missing storage config), but the mock LLM doesn't care about terminal output and proceeded to return the scripted final reply. The test then found 0 automations.

Co-authored-by: openhands <openhands@all-hands.dev>
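
Because the failure here was silent, a small guard that reports unset variables might have surfaced it sooner. The sketch below is illustrative only: `missingAutomationEnv` is a hypothetical helper, not something in the entrypoint or the test suite, and it checks only the four variable names listed in this commit message.

```typescript
type Env = Record<string, string | undefined>;

// The four variables this commit adds to the Docker entrypoint.
const AUTOMATION_ENV_VARS = [
  "FILE_STORE",
  "LOCAL_STORAGE_PATH",
  "AUTOMATION_BASE_URL",
  "AUTOMATION_WORKSPACE_BASE",
] as const;

// Returns the names that are unset or empty, in declaration order.
function missingAutomationEnv(env: Env): string[] {
  return AUTOMATION_ENV_VARS.filter((name) => !env[name]);
}
```

An empty result means all four are set; anything else names exactly what the automation backend would be missing before a preset call starts returning errors.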

github-actions · 2026-06-01T18:01:34Z

✅ Mock-LLM E2E Tests

7/7 passed

Commit: 589800f2 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	236ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	25.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.6s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

# Conflicts:
#	tests/e2e/mock-llm/utils/mock-llm-helpers.ts

github-actions · 2026-06-01T18:04:49Z

⚠️ Mock-LLM Docker E2E Test Results

0/0 passed

Commit: 589800f2 · Workflow run

Status	Test	Duration

github-actions · 2026-06-01T18:07:45Z

✅ Mock-LLM E2E Tests

12/12 passed

Commit: dc7307ff · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.5s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.5s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	215ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	23.4s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	7.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.5s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

github-actions · 2026-06-01T18:12:13Z

❌ Mock-LLM Docker E2E Test Results

8/12 passed · 1 failed · 3 skipped

Commit: dc7307ff · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.8s
❌	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	209ms
⏭️	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	0ms
⏭️	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	0ms
⏭️	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	0ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	524ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	33.9s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

🔍 Failure details (1)

❌ mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured

Error: page.goto: net::ERR_CONNECTION_REFUSED at http://localhost:18301/
Call log:
  - navigating to "http://localhost:18301/", waiting until "domcontentloaded"

The mock-llm-auth-modes.spec.ts tests npm-binary-specific --auth-required behaviour (a second static-server instance on port 18301). The Docker image doesn't provide this second server; it has its own auth handling. Exclude the spec from the Docker test run via testIgnore.

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-06-01T18:17:25Z

✅ Mock-LLM E2E Tests

12/12 passed

Commit: 8ad87b1d · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.5s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.1s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.7s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	214ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	23.4s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.5s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

Instead of excluding the auth-modes spec from the Docker E2E run or spinning up a host-side static server with a duplicate build/ directory, the Docker entrypoint now supports an optional PUBLIC_MODE_PORT env var. When set, entrypoint.sh starts a second static-server instance from the same baked-in frontend assets with --auth-required (no session key injected).

This tests the actual Docker image's auth gate behaviour, not a host-side approximation.

The Playwright Docker config passes -e PUBLIC_MODE_PORT=18301 to the container and exports MOCK_LLM_PUBLIC_MODE_URL so the auth-modes spec can reach it. With --network host the port is accessible from the host.

Co-authored-by: openhands <openhands@all-hands.dev>
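
As a rough picture of how the config side of this might be wired, the sketch below builds the container arguments and the URL the spec would use. It is not the content of playwright.mock-llm-docker.config.ts: the function names and the use of --rm are assumptions, while --network host, -e PUBLIC_MODE_PORT and port 18301 come from the description above.

```typescript
interface DockerE2EOptions {
  image: string; // e.g. the value of MOCK_LLM_DOCKER_IMAGE
  publicModePort: number; // port for the second, auth-required static server
}

// Arguments for `docker`, assuming host networking (Linux only).
function dockerRunArgs(opts: DockerE2EOptions): string[] {
  return [
    "run",
    "--rm", // assumption: the test container is disposable
    "--network",
    "host",
    "-e",
    `PUBLIC_MODE_PORT=${opts.publicModePort}`,
    opts.image,
  ];
}

// Value the config could export as MOCK_LLM_PUBLIC_MODE_URL. With host
// networking the container's port is reachable on localhost directly.
function publicModeUrl(port: number): string {
  return `http://localhost:${port}`;
}
```

Keeping the port in one option means the container flag and the URL handed to the auth-modes spec cannot drift apart.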

github-actions · 2026-06-01T18:20:30Z

⚠️ Mock-LLM Docker E2E Test Results

0/0 passed

Commit: 8ad87b1d · Workflow run

Status	Test	Duration

github-actions · 2026-06-01T18:23:14Z

✅ Mock-LLM E2E Tests

12/12 passed

Commit: 12818344 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.5s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.7s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	213ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	25.4s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.5s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

github-actions · 2026-06-01T18:28:25Z

✅ Mock-LLM Docker E2E Test Results

12/12 passed

Commit: 12818344 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	461ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	28.8s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.9s

all-hands-bot · 2026-06-01T18:31:05Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation.

all-hands-bot

Summary

Solid PR that cleanly extends the existing mock-LLM E2E test infrastructure to validate the Docker all-in-one image, reusing specs and helpers with minimal changes. The MOCK_LLM_BASE_URL → MOCK_LLM_AGENT_URL abstraction is the right design for Docker networking, the three-trigger CI chain (workflow_run / pull_request / workflow_dispatch) handles the ordering constraint elegantly, and the expect.poll refactor in the spec is a genuine CI robustness improvement.

A few items worth addressing before merge, noted inline.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.

…es, document env vars

- Drop 'unlabeled' from pull_request trigger types to avoid wasted workflow runs when any label is removed (the job-level if: condition would skip immediately anyway)
- Distinguish 'no Docker run found' vs 'didn't complete in time' in the polling loop's final error message
- Add comment explaining /api/automation/v1 probe returns 200 without auth so the readiness check won't spin for 180s
- Document FILE_STORE, LOCAL_STORAGE_PATH, AUTOMATION_BASE_URL, and AUTOMATION_WORKSPACE_BASE in the entrypoint header; these affect production deployments, not just E2E tests

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-06-01T18:46:59Z

✅ Mock-LLM E2E Tests

12/12 passed

Commit: 422c1567 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.6s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	205ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	25.4s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.5s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

github-actions · 2026-06-01T18:51:44Z

✅ Mock-LLM Docker E2E Test Results

12/12 passed

Commit: 422c1567 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.4s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	463ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	29.8s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.8s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

tofarr

🍰

github-actions · 2026-06-01T19:02:25Z

✅ Mock-LLM E2E Tests

12/12 passed

Commit: 7b048d32 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.5s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	217ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	32.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.5s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	4.4s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

github-actions · 2026-06-01T19:07:09Z

✅ Mock-LLM Docker E2E Test Results

12/12 passed

Commit: 7b048d32 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.3s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	509ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	27.7s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

github-actions · 2026-06-01T19:08:35Z

📸 Snapshot Test Report

✅ All snapshots match the main branch baselines.

Category	Count
🔴 Changed	0
🆕 New	0
✅ Unchanged	73
Total	73

✅ Unchanged snapshots (73)

archived-conversation

conversation-panel-with-archived-badges
conversation-view-archived
conversation-view-sandbox-error

automations

automations-delete-modal
automations-list-active-inactive
automations-no-automations
automations-search-no-results

backends-extended

backend-add-blank-disabled
backend-add-cloud-advanced-open
backend-add-cloud-no-key-disabled
backend-add-cloud-with-key-enabled
backend-add-form-partially-filled
backend-add-invalid-url-disabled
backend-add-local-ready
backend-add-name-only-disabled
backend-add-two-column-layout
backend-add-whitespace-host-disabled
backend-after-switch
backend-cancel-nothing-saved
backend-dropdown-two-backends
backend-edit-prefilled
backend-manage-after-removal
backend-manage-two-listed
backend-remove-cancelled
backend-remove-confirmation
backend-switch-overlay

backends

backend-add-modal
backend-manage-modal
backend-selector-open

changes-tab

changes-deleted-file
changes-diff-viewer
changes-empty

collapsible-thinking

reasoning-content-collapsed
reasoning-content-expanded
think-action-collapsed
think-action-expanded

mcp-page

mcp-custom-server-1-editor-open
mcp-custom-server-2-url-filled
mcp-custom-server-3-all-filled
mcp-custom-server-4-installed
mcp-custom-server-editor
mcp-empty-installed
mcp-search-filtered
mcp-slack-install-1-marketplace
mcp-slack-install-2-modal
mcp-slack-install-3-filled
mcp-slack-install-4-installed

onboarding

onboarding-step-0-choose-agent
onboarding-step-1-check-backend
onboarding-step-2-setup-llm
onboarding-step-3-say-hello

projects-workspace-browser

projects-workspace-browser

settings-page

add-backend-modal
analytics-consent-modal
home-screen
settings-app-page
settings-page

settings-secrets

secrets-add-form-filled
secrets-add-form
secrets-after-save
secrets-delete-confirm
secrets-list

settings-verification

condenser-settings
verification-settings-off
verification-settings-on

sidebar

sidebar-collapsed
sidebar-conversation-panel
sidebar-filter-menu

skills-page

skills-empty
skills-loaded
skills-no-match
skills-search-filtered
skills-type-filter

Generated by the Snapshot Tests workflow. This comment was created by an AI agent (OpenHands) on behalf of the repo maintainers.

* feat: reuse mock-LLM E2E tests for Docker image validation

Add a Docker-specific Playwright config (playwright.mock-llm-docker.config.ts) that runs the exact same test specs and helpers against the agent-canvas Docker image instead of the npm build path (bin/agent-canvas.mjs + uvx).

Key changes:

- Split MOCK_LLM_BASE_URL into two constants in mock-llm-helpers.ts:
  - MOCK_LLM_BASE_URL: always host-local, used by tests for admin API
  - MOCK_LLM_AGENT_URL: env-overridable, used when configuring the LLM profile (the URL the agent-server uses for inference). Defaults to MOCK_LLM_BASE_URL for backward compatibility with the npm path.
- New playwright.mock-llm-docker.config.ts:
  - Starts the mock LLM server on the host (same as npm path)
  - Runs the Docker container with --network host (Linux CI)
  - Points to the same testDir (tests/e2e/mock-llm/) and specs
  - Separate output dirs to avoid collision with npm path results
- New CI workflow (.github/workflows/mock-llm-docker-e2e.yml):
  - Builds the Docker image from current code (or uses a pre-built image)
  - Runs the same specs against the container
  - Posts PR comment with differentiated report title
- render-mock-llm-report.mjs: accept --title flag for Docker vs npm reports
- npm run test:e2e:mock-llm:docker script added
- .gitignore updated for docker test output dirs

The npm path (test:e2e:mock-llm) is fully backward-compatible: no env var override is needed since MOCK_LLM_AGENT_URL defaults to MOCK_LLM_BASE_URL.

Co-authored-by: openhands <openhands@all-hands.dev>

* refactor: chain Docker E2E off existing Docker CI via workflow_run

Instead of rebuilding the Docker image in the E2E workflow (duplicating ~10-15 min of Docker build time), use workflow_run to trigger automatically after the existing 'Docker' workflow completes successfully.

The workflow now:

- Triggers on: workflow_run (Docker completed) + workflow_dispatch (manual)
- Derives the image tag from the Docker build's commit SHA (ghcr.io/openhands/agent-canvas:sha-<short>-amd64)
- Pulls the already-built image from GHCR, so no rebuild is needed
- Checks out code at the same SHA as the Docker build
- Extracts PR number from workflow_run.pull_requests[] for comments

Removed: Docker build steps, Buildx setup, build-arg resolution. All image building stays in docker.yml where it belongs.

Co-authored-by: openhands <openhands@all-hands.dev>

* fix: replace flaky 1s timeout with polling for Active badge assertion

The 'Active badge' check in step 2 used a hardcoded 1-second waitForTimeout before reloading. On a loaded CI runner the profile activation mutation may not persist in time, causing the reload to show stale state. This is a pre-existing flake (identical test code passed on the first push and failed on the second).

Replace with expect.poll() that retries the reload+check cycle with increasing intervals (1s, 2s, 3s) up to 15 seconds total.

Co-authored-by: openhands <openhands@all-hands.dev>

* fix: add pull_request trigger for Docker E2E (workflow_run bootstrap)

workflow_run only fires when the workflow file exists on the default branch (main). Since mock-llm-docker-e2e.yml is new and only on the PR branch, GitHub doesn't recognize it as a workflow_run listener yet.

Add pull_request trigger (gated by 'e2e-tests' label, skip forks) that polls the Docker workflow via gh API until it completes for the PR's head SHA, then pulls the already-built image from GHCR and runs tests.

After merge, workflow_run takes over as the primary automatic trigger. The pull_request path remains as a fallback for label-gated runs.

Co-authored-by: openhands <openhands@all-hands.dev>

* fix: add FILE_STORE, AUTOMATION_BASE_URL, AUTOMATION_WORKSPACE_BASE to Docker entrypoint

The Docker entrypoint was missing several environment variables that the npm path (dev-with-automation.mjs) sets for the automation backend:

- FILE_STORE=local: without this, the automation backend may fall back to cloud storage (S3/GCS), which fails without credentials, causing tarball-based presets (preset/prompt, preset/plugin) to silently error
- LOCAL_STORAGE_PATH: where to store files on the local filesystem
- AUTOMATION_BASE_URL: publicly-reachable base URL for callback URLs
- AUTOMATION_WORKSPACE_BASE: where automation runs unpack tarballs

This explains the Docker E2E failure: the agent's curl to create an automation via /api/automation/v1/preset/prompt returned an error (likely 500 from missing storage config), but the mock LLM doesn't care about terminal output and proceeded to return the scripted final reply. The test then found 0 automations.

Co-authored-by: openhands <openhands@all-hands.dev>

* fix: exclude auth-modes spec from Docker E2E tests

The mock-llm-auth-modes.spec.ts tests npm-binary-specific --auth-required behaviour (a second static-server instance on port 18301). The Docker image doesn't provide this second server; it has its own auth handling. Exclude the spec from the Docker test run via testIgnore.

Co-authored-by: openhands <openhands@all-hands.dev>

* feat: run auth-modes tests inside Docker via PUBLIC_MODE_PORT

Instead of excluding the auth-modes spec from the Docker E2E run or spinning up a host-side static server with a duplicate build/ directory, the Docker entrypoint now supports an optional PUBLIC_MODE_PORT env var. When set, entrypoint.sh starts a second static-server instance from the same baked-in frontend assets with --auth-required (no session key injected). This tests the actual Docker image's auth gate behaviour, not a host-side approximation.

The Playwright Docker config passes -e PUBLIC_MODE_PORT=18301 to the container and exports MOCK_LLM_PUBLIC_MODE_URL so the auth-modes spec can reach it. With --network host the port is accessible from the host.

Co-authored-by: openhands <openhands@all-hands.dev>

* address review feedback: drop unlabeled trigger, improve error messages, document env vars

- Drop 'unlabeled' from pull_request trigger types to avoid wasted workflow runs when any label is removed (the job-level if: condition would skip immediately anyway)
- Distinguish 'no Docker run found' vs 'didn't complete in time' in the polling loop's final error message
- Add comment explaining /api/automation/v1 probe returns 200 without auth so the readiness check won't spin for 180s
- Document FILE_STORE, LOCAL_STORAGE_PATH, AUTOMATION_BASE_URL, and AUTOMATION_WORKSPACE_BASE in the entrypoint header; these affect production deployments, not just E2E tests

Co-authored-by: openhands <openhands@all-hands.dev>

---------

Co-authored-by: openhands <openhands@all-hands.dev>
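To make the "increasing intervals (1s, 2s, 3s) up to 15 seconds total" wording from the flaky-badge fix concrete, here is a simplified TypeScript model of such a retry schedule. It is a standalone sketch, not Playwright's implementation and not the spec's code. The function name `pollAttemptOffsets` is hypothetical, the model assumes the last interval repeats once the list is exhausted, and it ignores the time each reload and check takes.

```typescript
// Simplified model: at which offsets (ms from the start) would a check be retried?
// Assumptions: the first attempt happens immediately, the last interval repeats,
// and each attempt itself takes no time.
function pollAttemptOffsets(intervalsMs: number[], timeoutMs: number): number[] {
  if (intervalsMs.length === 0) {
    throw new Error("at least one interval is required");
  }
  const offsets: number[] = [0];
  let elapsed = 0;
  for (let i = 0; ; i++) {
    const wait = intervalsMs[Math.min(i, intervalsMs.length - 1)] as number;
    if (wait <= 0) {
      throw new Error("intervals must be positive");
    }
    if (elapsed + wait > timeoutMs) {
      break; // the next attempt would start after the deadline
    }
    elapsed += wait;
    offsets.push(elapsed);
  }
  return offsets;
}

// Intervals and timeout taken from the commit message above.
const badgeSchedule = pollAttemptOffsets([1000, 2000, 3000], 15000);
```

Under these assumptions the check gets several chances spread over the 15-second budget, where the old fixed 1-second wait gave it one.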

vercel Bot deployed to Preview June 1, 2026 17:04 View deployment

malhotra5 added the e2e-tests Triggers mock-LLM E2E tests on PRs label Jun 1, 2026

malhotra5 marked this pull request as ready for review June 1, 2026 17:06

vercel Bot deployed to Preview June 1, 2026 17:06 View deployment

malhotra5 force-pushed the feat/mock-llm-docker-e2e branch from 873eeaf to d0d3086 Compare June 1, 2026 17:12

vercel Bot deployed to Preview June 1, 2026 17:12 View deployment

github-actions Bot added a commit that referenced this pull request Jun 1, 2026

snapshot images for PR #992 run 26770048633

b436943

vercel Bot deployed to Preview June 1, 2026 17:18 View deployment

vercel Bot deployed to Preview June 1, 2026 17:28 View deployment

Merge remote-tracking branch 'origin/main' into feat/mock-llm-docker-e2e

1962495

vercel Bot deployed to Preview June 1, 2026 17:42 View deployment

vercel Bot deployed to Preview June 1, 2026 17:59 View deployment

Merge remote-tracking branch 'origin/main' into feat/mock-llm-docker-e2e

dc7307f

# Conflicts: # tests/e2e/mock-llm/utils/mock-llm-helpers.ts

vercel Bot deployed to Preview June 1, 2026 18:05 View deployment

vercel Bot deployed to Preview June 1, 2026 18:15 View deployment

vercel Bot deployed to Preview June 1, 2026 18:21 View deployment

malhotra5 requested a review from all-hands-bot June 1, 2026 18:30

all-hands-bot reviewed Jun 1, 2026

vercel Bot deployed to Preview June 1, 2026 18:44 View deployment

tofarr approved these changes Jun 1, 2026

Merge branch 'main' into feat/mock-llm-docker-e2e

7b048d3

malhotra5 enabled auto-merge (squash) June 1, 2026 18:59

vercel Bot deployed to Preview June 1, 2026 19:00 View deployment

malhotra5 disabled auto-merge June 1, 2026 19:01

malhotra5 merged commit 70d1798 into main Jun 1, 2026
17 checks passed

malhotra5 deleted the feat/mock-llm-docker-e2e branch June 1, 2026 19:09

Conversation

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM E2E Tests

github-actions Bot commented Jun 1, 2026

❌ Mock-LLM Docker E2E Test Results

❌ mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI

github-actions Bot commented Jun 1, 2026

❌ Mock-LLM E2E Tests

❌ mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM E2E Tests

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM E2E Tests

github-actions Bot commented Jun 1, 2026

❌ Mock-LLM Docker E2E Test Results

❌ mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM E2E Tests

github-actions Bot commented Jun 1, 2026

❌ Mock-LLM Docker E2E Test Results

❌ mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM E2E Tests

github-actions Bot commented Jun 1, 2026

⚠️ Mock-LLM Docker E2E Test Results

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM E2E Tests

github-actions Bot commented Jun 1, 2026

❌ Mock-LLM Docker E2E Test Results

❌ mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM E2E Tests

github-actions Bot commented Jun 1, 2026

⚠️ Mock-LLM Docker E2E Test Results

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM E2E Tests

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM Docker E2E Test Results