From efd68cefd309480990943135306e26c246632cfb Mon Sep 17 00:00:00 2001 From: Brian Love Date: Fri, 15 May 2026 20:44:34 -0700 Subject: [PATCH 01/14] =?UTF-8?q?docs:=20cockpit=20aimock=20e2e=20Phase=20?= =?UTF-8?q?2=20design=20=E2=80=94=20harness=20library=20+=20per-example=20?= =?UTF-8?q?layout?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Phase 1's single-globalSetup pattern doesn't scale to 15+ cockpit examples. Phase 2 introduces libs/internal/aimock-harness with a createGlobalSetup factory and migrates per-example e2e dirs to live next to each example's Angular app. Streaming gets migrated; c-tool-calls lands as the first new-pattern example. Future phases each add one example as a small additive PR. --- ...05-15-cockpit-aimock-harness-lib-design.md | 277 ++++++++++++++++++ 1 file changed, 277 insertions(+) create mode 100644 docs/superpowers/specs/2026-05-15-cockpit-aimock-harness-lib-design.md diff --git a/docs/superpowers/specs/2026-05-15-cockpit-aimock-harness-lib-design.md b/docs/superpowers/specs/2026-05-15-cockpit-aimock-harness-lib-design.md new file mode 100644 index 000000000..be9250b42 --- /dev/null +++ b/docs/superpowers/specs/2026-05-15-cockpit-aimock-harness-lib-design.md @@ -0,0 +1,277 @@ +# Cockpit aimock E2E — Phase 2: harness library + per-example layout + +> **Place in the larger plan.** Phase 1 ([#349](https://github.com/cacheplane/angular-agent-framework/pull/349)) shipped the cockpit aimock harness as a single dir under `apps/cockpit/e2e/` with one pilot (`streaming`). That pattern doesn't scale to 15+ cockpit examples — every new example would have to extend the same `globalSetup` to spin up an additional Angular dev server. Phase 2 restructures to a per-example layout backed by a shared internal library, then lands `c-tool-calls` as the second example to validate the new pattern. 
+ +## Goal + +Refactor the cockpit aimock harness so each cockpit example owns its own e2e directory next to its Angular app. A shared internal library (`libs/internal/aimock-harness`) holds the runner, helpers, and a `createGlobalSetup` factory. Per-example dirs contain only the playwright config (calling the factory with this app's specifics) + fixtures + spec. Phase 2 ships the library, migrates `streaming`, and adds `c-tool-calls`. + +## Library + +Same as Phase 1: [`@copilotkit/aimock`](https://github.com/CopilotKit/aimock). The new internal library wraps it with the project's specific orchestration (langgraph + Angular dev server boot). + +## Non-goals + +- Adding more than two example specs in this PR (`streaming` migrated, `c-tool-calls` added). PRs 3+ each add one example. +- Promoting `libs/internal/aimock-harness` to a published `@ngaf/*` library — internal-only for now. +- Changing the chat aimock harness at `examples/chat/aimock-e2e/`. Independent and untouched (the chat harness doesn't have the same scaling concern — it's one example). +- CI workflow restructure beyond what's needed to invoke `nx run-many` over the cockpit-*-angular projects. + +## Architecture + +``` +[Playwright test on CI/local] + ↓ drives real Chromium +[Angular dev server (per example, port from project.json)] + ↓ /api proxy → :8123 +[LangGraph dev server :8123 (cockpit/langgraph/streaming/python)] + ↓ OPENAI_BASE_URL=http://localhost:AIMOCK_PORT/v1 +[aimock node process (one per Playwright run, fixtures from per-example dir)] +``` + +Each example's e2e run boots ONE Angular dev server (the one for that example), shares the langgraph dev server (the streaming/python deployment serving 12 graphs), and points aimock at the per-example fixtures dir. + +When Phase 3+ adds an example whose graph lives in a different python project (e.g., `cockpit/langgraph/memory/python`), that example's `playwright.config.ts` passes a different `langgraphCwd` to `createGlobalSetup`. 
The factory handles the difference; per-example configs stay simple. + +## File layout + +### Internal library (new) + +``` +libs/internal/aimock-harness/ +├── src/ +│ ├── aimock-runner.ts # Copy of examples/chat/aimock-e2e/aimock-runner.ts (proven shape). +│ ├── test-helpers.ts # sendPromptAndWait helper. Path defaults to '/' (single-page cockpit examples) but accepts an override. +│ ├── global-setup-factory.ts # createGlobalSetup({ langgraphCwd, angularProject, angularPort, fixturesDir }) → globalSetup function. +│ ├── global-teardown.ts # Generic teardown; reads the shared state slot. +│ └── index.ts # Public exports. +├── project.json # Nx library: name "internal-aimock-harness", no published artifact. +├── tsconfig.json +└── README.md +``` + +`project.json` declares `tags: ["scope:internal"]` and is excluded from the publish workflow. + +### Per-example e2e dirs + +For each cockpit example getting aimock coverage: + +``` +cockpit///angular/e2e/ +├── playwright.config.ts # imports createGlobalSetup from @ngaf-internal/aimock-harness; passes app-specific opts. +├── fixtures/ +│ └── .json # captured aimock fixture for this example. +├── scripts/ +│ └── record-.py # dev capture recipe for this example's fixture. +├── tsconfig.json +└── .spec.ts # Playwright test. +``` + +The Angular project's existing `project.json` gains an `e2e` target pointing at the per-example playwright config. CI invokes `npx nx run-many --target=e2e --projects=cockpit-*-angular --skip-nx-cache`. 
+ +### Phase 2 PR concretely + +**Created (library + per-example dirs for streaming and tool-calls):** +- `libs/internal/aimock-harness/` (new lib + 5 src files + project.json + tsconfig + README) +- `cockpit/langgraph/streaming/angular/e2e/` (5 files: playwright.config.ts, tsconfig.json, fixtures/streaming.json, scripts/record-streaming.py, streaming.spec.ts) +- `cockpit/chat/tool-calls/angular/e2e/` (5 files: same shape) + +**Modified:** +- `cockpit/langgraph/streaming/angular/project.json`: add `e2e` target. +- `cockpit/chat/tool-calls/angular/project.json`: add `e2e` target. +- `apps/cockpit/project.json`: drop the now-orphaned `e2e` target. +- `.github/workflows/ci.yml`: `Cockpit — e2e` job runs `nx run-many --target=e2e --projects=cockpit-*-angular`. +- `tsconfig.json` (root) or `nx.json` paths: register the new internal library import alias if needed. + +**Deleted:** +- `apps/cockpit/e2e/`: entire directory (everything moved out). + +## Components + +### `libs/internal/aimock-harness/src/aimock-runner.ts` + +Byte-for-byte port of `examples/chat/aimock-e2e/aimock-runner.ts`. Same `LLMock({ port: 0, chunkSize: 4096 })` setup. Same `addFixturesFromJSON` API. + +The chat harness's copy stays as-is: the two harnesses are still independent per the Phase 1 spec; promoting the chat harness onto this library is out of scope (a future cleanup). + +### `libs/internal/aimock-harness/src/test-helpers.ts` + +```typescript +export interface SendPromptAndWaitOptions { + /** Route to navigate to before sending the prompt. Default: '/'. */ + path?: string; +} + +export async function sendPromptAndWait( + page: Page, + prompt: string, + opts?: SendPromptAndWaitOptions, +): Promise<Locator>; +``` + +Same wait-on-`data-streaming="false"` invariant as today's helper. Path is configurable; cockpit examples default to `/`, but consumers like an `/embed`-routed app can pass `{path: '/embed'}`. 
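The wait-on-finalized invariant boils down to polling a condition against a deadline. The real helper leans on Playwright's `expect` and `expect.poll`; the sketch below strips Playwright out to show only the idea. `pollUntil`, `PollOptions`, and `fakeBubble` are hypothetical names for illustration, not part of Playwright or the harness.

```typescript
import { setTimeout as delay } from 'node:timers/promises';

interface PollOptions {
  timeoutMs: number;
  intervalMs: number;
}

// Hypothetical helper: re-reads a value until `isSettled` accepts it or the
// deadline passes. Not a Playwright API.
async function pollUntil<T>(
  read: () => Promise<T>,
  isSettled: (value: T) => boolean,
  { timeoutMs, intervalMs }: PollOptions,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await read();
    if (isSettled(value)) return value;
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs}ms`);
    }
    await delay(intervalMs);
  }
}

// Stand-in for an assistant bubble: still streaming on the first two reads.
function fakeBubble(): () => Promise<{ streaming: boolean; text: string }> {
  let reads = 0;
  return async () => {
    reads += 1;
    return reads < 3
      ? { streaming: true, text: 'partial' }
      : { streaming: false, text: 'final answer' };
  };
}

async function demo(): Promise<string> {
  // Only accept the bubble once streaming has stopped AND it has text.
  const settled = await pollUntil(
    fakeBubble(),
    (bubble) => !bubble.streaming && bubble.text.trim().length > 0,
    { timeoutMs: 1_000, intervalMs: 10 },
  );
  return settled.text;
}

demo().then((text) => console.log(text));
```

The point of gating on both conditions is that text observed while `streaming` is still true may be a partial render, which is where flaky assertions tend to come from.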
+ +### `libs/internal/aimock-harness/src/global-setup-factory.ts` + +```typescript +export interface CreateGlobalSetupOpts { + /** Repo-relative path to the python langgraph project (e.g., 'cockpit/langgraph/streaming/python'). */ + langgraphCwd: string; + /** Port the langgraph dev server binds. Defaults to 8123. Override when an example uses a different python project AND another langgraph might be running on 8123. */ + langgraphPort?: number; + /** Nx project name of the Angular dev server (e.g., 'cockpit-chat-tool-calls-angular'). */ + angularProject: string; + /** Port the Angular dev server should bind. */ + angularPort: number; + /** Absolute path to the per-example fixtures dir (e.g., resolve(__dirname, 'fixtures')). */ + fixturesDir: string; + /** Optional timeout overrides. Defaults: 90_000 (langgraph), 120_000 (Angular). */ + langgraphReadyTimeoutMs?: number; + angularReadyTimeoutMs?: number; +} + +export function createGlobalSetup(opts: CreateGlobalSetupOpts): () => Promise<void>; +``` + +Phase 3+ examples that hit a different python project pass their own `langgraphPort`. When `nx run-many --parallel=N` (N>1) is enabled later, this also avoids cross-run port collisions. + +The returned function boots aimock, then langgraph (with `OPENAI_BASE_URL` injected), then the named Angular dev server, in that order. It stores the shared state on a global slot keyed by `angularProject` so concurrent Playwright workers don't collide (today they don't run concurrently per Phase 1 config; this is just defensive). + +### `libs/internal/aimock-harness/src/global-teardown.ts` + +```typescript +export default async function globalTeardown(): Promise<void>; +``` + +Walks the shared state slots, kills processes in reverse order, awaits aimock stop. Idempotent. 
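The slot-per-project registry and the reverse-order, idempotent teardown can be sketched without any real processes. This is a minimal illustration under assumptions: `Stoppable`, `HarnessState`, and `HarnessRegistry` are hypothetical names, and fake handles stand in for the aimock server and the two child processes.

```typescript
// Hypothetical stand-ins; the real harness stores an aimock handle and two
// ChildProcess objects rather than these fakes.
interface Stoppable {
  stop(): Promise<void>;
}

interface HarnessState {
  aimock: Stoppable;
  langgraph: Stoppable;
  angular: Stoppable;
}

class HarnessRegistry {
  // One slot per Angular project name.
  private readonly slots = new Map<string, HarnessState>();

  register(project: string, state: HarnessState): void {
    this.slots.set(project, state);
  }

  get size(): number {
    return this.slots.size;
  }

  // Stops each slot in reverse boot order (Angular, langgraph, aimock), then
  // clears the registry so a second call has nothing left to do.
  async teardownAll(): Promise<void> {
    for (const state of this.slots.values()) {
      await state.angular.stop();
      await state.langgraph.stop();
      await state.aimock.stop();
    }
    this.slots.clear();
  }
}

// Fake handle that records the order in which things are stopped.
function fakeHandle(name: string, log: string[]): Stoppable {
  return {
    stop: async () => {
      log.push(name);
    },
  };
}

async function demo(): Promise<string[]> {
  const log: string[] = [];
  const registry = new HarnessRegistry();
  registry.register('cockpit-chat-tool-calls-angular', {
    aimock: fakeHandle('aimock', log),
    langgraph: fakeHandle('langgraph', log),
    angular: fakeHandle('angular', log),
  });
  await registry.teardownAll();
  await registry.teardownAll(); // second call is a no-op
  return log;
}

demo().then((log) => console.log(log.join(' -> ')));
```

Clearing the registry after the walk is what makes a repeated teardown safe; the design in this spec achieves the same effect by resetting the global slot.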
+ +### Per-example `playwright.config.ts` + +```typescript +// SPDX-License-Identifier: MIT +import { defineConfig, devices } from '@playwright/test'; +import { createGlobalSetup } from '@ngaf-internal/aimock-harness'; +import { resolve } from 'node:path'; + +export default defineConfig({ + testDir: '.', + testMatch: '**/*.spec.ts', + fullyParallel: false, + workers: 1, + retries: process.env.CI ? 2 : 0, + reporter: process.env.CI ? [['list'], ['html', { open: 'never' }]] : 'list', + use: { + baseURL: 'http://localhost:4504', + trace: 'retain-on-failure', + }, + projects: [{ name: 'chromium', use: { ...devices['Desktop Chrome'] } }], + globalSetup: resolve(__dirname, './global-setup-impl.ts'), + globalTeardown: '@ngaf-internal/aimock-harness/global-teardown', +}); +``` + +The `globalSetup` field needs a file path (Playwright loads it as a module). We have two options: +- (a) Per-example `global-setup-impl.ts` that re-exports `createGlobalSetup({ ... })` with this app's specifics. +- (b) Use Playwright's globalSetup feature with the factory called inside. + +Option (a) is cleaner — one ~5-line file per example, fully explicit about which app it's wiring. Spec design picks (a). + +### Per-example `global-setup-impl.ts` + +```typescript +// SPDX-License-Identifier: MIT +import { createGlobalSetup } from '@ngaf-internal/aimock-harness'; +import { resolve } from 'node:path'; + +export default createGlobalSetup({ + langgraphCwd: 'cockpit/langgraph/streaming/python', + angularProject: 'cockpit-chat-tool-calls-angular', + angularPort: 4504, + fixturesDir: resolve(__dirname, 'fixtures'), +}); +``` + +### Per-example `.spec.ts` + +Standard Playwright spec, importing `sendPromptAndWait` from `@ngaf-internal/aimock-harness`. 
+ +## c-tool-calls pilot scenario + +**Prompt:** `"What's the status of UA123?"` + +**Captured fixture (`fixtures/c-tool-calls.json`) — TWO entries, ordered:** +- `match: { userMessage: PROMPT, hasToolResult: true }`, `response: { content: "" }` — continuation call after tool result is in history. +- `match: { userMessage: PROMPT }`, `response: { toolCalls: [{ name: "lookup_flight", arguments: { flight_number: "UA123" } }] }` — first call. + +The `lookup_flight` tool itself executes server-side in the langgraph ToolNode (returns canned UA123 data from `aviation_data.py`). Aimock doesn't mock the tool — only the LLM calls. + +**Spec assertions:** +1. A `` (or whatever the `` primitive renders per call) is in the DOM with text mentioning `lookup_flight`. Proves the parent's tool_call routed through the chat-tool-calls UI. +2. The finalized assistant bubble (`chat-message[data-role="assistant"][data-streaming="false"]`) contains a phrase from the captured continuation response (likely `UA123` or the flight's origin/destination from the canned aviation data). Proves the continuation completed end-to-end. + +**Capture script:** mirrors `chat_graphs.py`'s `_build_tool_calls_graph()`: ChatOpenAI(gpt-5-mini, streaming=True) bound with `AVIATION_TOOLS`, system prompt from `prompts/tool-calls.md`. Captures parent first call (tool_calls), then re-invokes with synthetic AIMessage(tool_calls) + ToolMessage(tool result from `lookup_flight`) in history to capture the continuation. Same pattern as chat aimock Phase 2d's research-subagent capture. + +## CI integration + +Update the existing `Cockpit — e2e` job (no rename, no `deploy.needs` change): + +```yaml +- run: npx nx run-many --target=e2e --projects=cockpit-*-angular --skip-nx-cache +``` + +Each cockpit example with an `e2e` target runs sequentially in this single job. When wall-clock becomes a problem (probably 4–5 examples in), shard via Nx affected logic or split into per-product matrix jobs. 
+ +The existing setup (`uv sync` for streaming/python, playwright install) stays, since Phase 2 examples still hit `streaming/python`. Future examples hitting other python projects get additional `uv sync` steps when needed. + +## Local dev workflow + +``` +# Run a single example's e2e: +npx nx e2e cockpit-chat-tool-calls-angular + +# Run all cockpit example e2e (serially: the runs share langgraph port 8123): +npx nx run-many --target=e2e --projects=cockpit-*-angular --parallel=1 + +# Refresh a fixture (needs OPENAI_API_KEY): +OPENAI_API_KEY=sk-... uv run --project cockpit/langgraph/streaming/python \ + python cockpit/chat/tool-calls/angular/e2e/scripts/record-c-tool-calls.py +``` + +## Library import path + +Use Nx's TypeScript path aliases. Add to root `tsconfig.json` (or `tsconfig.base.json` if present): + +```json +"paths": { + "@ngaf-internal/aimock-harness": ["libs/internal/aimock-harness/src/index.ts"], + "@ngaf-internal/aimock-harness/global-teardown": ["libs/internal/aimock-harness/src/global-teardown.ts"] +} +``` + +Why `@ngaf-internal/*` prefix: signals "internal use only, not published" without colliding with the published `@ngaf/*` namespace. Pattern can be reused for future internal libraries. + +## Risks and unknowns + +- **Path-alias resolution at Playwright runtime.** Playwright loads its config as Node ESM/CJS; need to confirm the alias works at runtime (vitest/Angular handle aliases, but Playwright's Node loader may not). If aliases don't resolve, fall back to a relative import path from each playwright.config.ts (`import { createGlobalSetup } from '../../../../../libs/internal/aimock-harness/src'`, five levels up from each `e2e/` dir). De-risk first thing in the implementation plan. +- **Per-example port collisions during Nx parallel runs.** `nx run-many` by default runs targets in parallel. If two `e2e` runs try to bind the same `8123` langgraph port simultaneously, they collide. 
Mitigations: (a) make langgraph port configurable per example (most cockpit examples will use 8123 anyway since they share `streaming/python`), (b) configure `nx run-many --parallel=1` in the CI invocation. Phase 2 will start with `--parallel=1` for safety; future phase optimizes. +- **Migrating streaming spec without regression.** Phase 1's streaming spec passed in CI. Migration must preserve: same fixture content, same prompt, same assertion. Diff should be path moves only. +- **`apps/cockpit/project.json` e2e target removal.** Anything in CI workflow files or scripts that references `nx e2e cockpit` needs updating to `nx run-many --target=e2e --projects=cockpit-*-angular`. Grep confirms there's only the one CI step today (per recent PR #349). + +## Acceptance criteria + +Phase 2 merges when: +- `libs/internal/aimock-harness/` exists with the runner, helpers, factory, teardown, and a README documenting the API. +- TypeScript path alias `@ngaf-internal/aimock-harness` resolves at Playwright runtime. +- `cockpit/langgraph/streaming/angular/e2e/` exists with the migrated streaming spec; `nx e2e cockpit-langgraph-streaming-angular` passes. +- `cockpit/chat/tool-calls/angular/e2e/` exists with the new c-tool-calls spec; `nx e2e cockpit-chat-tool-calls-angular` passes. +- Both specs pass 3/3 consecutive local runs (with port cooldown between). +- `nx run-many --target=e2e --projects=cockpit-*-angular --parallel=1` runs both green. +- `apps/cockpit/e2e/` deleted entirely; `apps/cockpit/project.json` no longer has an `e2e` target. +- CI `Cockpit — e2e` job updated to use `nx run-many`; passes on PR. +- The chat aimock harness at `examples/chat/aimock-e2e/` is unchanged. + +## What lands next (Phase 3+, NOT this PR) + +- **Phase 3**: `c-subagents` (also unblocked by PR #347). Adds one e2e dir, one fixture, one spec. The library handles all the orchestration. +- **Phase 4+**: c-interrupts, c-generative-ui, c-a2ui, etc. — each one PR. 
+- **Eventual cleanup**: migrate the chat aimock harness (`examples/chat/aimock-e2e/`) onto the same library. Currently independent for historical reasons; promoting once a third+ harness wants the same code. From a067651ec4c957bcc3f8eef7bd72d9e767d52f11 Mon Sep 17 00:00:00 2001 From: Brian Love Date: Fri, 15 May 2026 20:53:49 -0700 Subject: [PATCH 02/14] docs: cockpit aimock e2e Phase 2 implementation plan 11 tasks. Task 0 de-risks path-alias resolution at Playwright runtime (falls back to relative imports if aliases don't work). Tasks 1-5 scaffold + implement the library. Task 6 wires the alias (or skips if relative imports needed). Tasks 7-8 migrate streaming + add c-tool-calls. Tasks 9-10 delete the old layout + update CI. Task 11 verifies + ships. --- .../2026-05-15-cockpit-aimock-harness-lib.md | 1265 +++++++++++++++++ 1 file changed, 1265 insertions(+) create mode 100644 docs/superpowers/plans/2026-05-15-cockpit-aimock-harness-lib.md diff --git a/docs/superpowers/plans/2026-05-15-cockpit-aimock-harness-lib.md b/docs/superpowers/plans/2026-05-15-cockpit-aimock-harness-lib.md new file mode 100644 index 000000000..5e2781921 --- /dev/null +++ b/docs/superpowers/plans/2026-05-15-cockpit-aimock-harness-lib.md @@ -0,0 +1,1265 @@ +# Cockpit aimock E2E — Phase 2 Implementation Plan (harness library + per-example layout) + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development. Steps use checkbox (`- [ ]`) syntax. + +**Goal:** Stand up `libs/internal/aimock-harness` (shared runner + helpers + globalSetup factory), migrate the existing `streaming` spec to a per-example layout under `cockpit/langgraph/streaming/angular/e2e/`, and add `c-tool-calls` as the second example under the new pattern. + +**Architecture:** Each cockpit example owns its own e2e dir next to its Angular app. Per-example `playwright.config.ts` calls `createGlobalSetup({ ... 
})` from the shared lib with this app's specifics (langgraph cwd, Angular project name + port, fixtures dir). CI runs all of them via `nx run-many --target=e2e --projects=cockpit-*-angular --parallel=1`. + +**Tech Stack:** `@copilotkit/aimock`, Playwright, Nx, TypeScript path aliases. + +**Spec:** [docs/superpowers/specs/2026-05-15-cockpit-aimock-harness-lib-design.md](../specs/2026-05-15-cockpit-aimock-harness-lib-design.md) + +--- + +## Working environment + +- Worktree: `/tmp/aimock-harness` (branch `claude/aimock-harness-lib`). +- `node_modules` symlinked from main checkout; `npx`/`nx`/`uv` work directly. +- License header `// SPDX-License-Identifier: MIT` on line 1 of every new TS file. +- One commit per task. DO NOT push, amend, or `git add -A`. +- Spec commit (`efd68cef`) already on the branch; this plan adds another. + +--- + +## Task 0: De-risk path-alias resolution at Playwright runtime + +**Files:** None (investigation only). + +The spec assumes `import { createGlobalSetup } from '@ngaf-internal/aimock-harness'` resolves at Playwright config-load time. Vitest and Angular handle TS path aliases via their bundlers; Playwright loads its config through Node's module resolver, which does NOT honor `tsconfig.json` paths by default. If aliases don't resolve, the implementation falls back to relative imports. + +- [ ] **Step 1: Inspect existing Playwright configs in the repo** + +```bash +cd /tmp/aimock-harness +grep -rn "from '@ngaf\|from '@nx" apps/cockpit/e2e/*.ts examples/chat/aimock-e2e/*.ts 2>/dev/null | head -10 +``` + +Expected: zero results. The existing harnesses use only relative imports (`./aimock-runner`, etc.), so we can't crib path-alias-in-Playwright pattern from existing code. + +- [ ] **Step 2: Check if the repo uses tsconfig-paths or similar Node-side alias loaders** + +```bash +grep -n "tsconfig-paths\|register" /tmp/aimock-harness/package.json /tmp/aimock-harness/playwright.config.* 2>/dev/null +``` + +Expected: probably empty. 
Note in the report. + +- [ ] **Step 3: Test the alias one-shot** + +Create scratch files: + +```bash +mkdir -p /tmp/alias-test/lib/src /tmp/alias-test/consumer +cat > /tmp/alias-test/tsconfig.json << 'EOF' +{ + "compilerOptions": { + "module": "CommonJS", + "moduleResolution": "Node", + "target": "ES2022", + "esModuleInterop": true, + "strict": true, + "paths": { + "@scratch/lib": ["lib/src/index.ts"] + }, + "baseUrl": "." + } +} +EOF + +cat > /tmp/alias-test/lib/src/index.ts << 'EOF' +export const greeting = "hello from lib"; +EOF + +cat > /tmp/alias-test/consumer/playwright.config.ts << 'EOF' +import { greeting } from '@scratch/lib'; +import { defineConfig } from '@playwright/test'; +console.log('alias resolved:', greeting); +export default defineConfig({ testDir: '.' }); +EOF + +cd /tmp/alias-test/consumer +npx playwright test --list 2>&1 | head -10 +``` + +Expected outcomes: +- (a) Prints `alias resolved: hello from lib` → Playwright honors paths. Use the alias. +- (b) Throws `Cannot find module '@scratch/lib'` → Playwright doesn't resolve paths. Fall back to relative imports in the implementation. + +Report which outcome occurred. + +Cleanup: `rm -rf /tmp/alias-test`. + +- [ ] **Step 4: Report** + +DE-RISK COMPLETE. Note: +- Whether the path alias resolved. +- If it didn't, the relative-import paths the spec needs to use: + - From `cockpit/langgraph/streaming/angular/e2e/playwright.config.ts` → `libs/internal/aimock-harness/src/index.ts` is `../../../../../libs/internal/aimock-harness/src`. + - From `cockpit/chat/tool-calls/angular/e2e/playwright.config.ts` → same depth, same relative path: `../../../../../libs/internal/aimock-harness/src`. + +If aliases don't work, Tasks 6 and 7 swap import statements to use the relative path. The library's internal structure stays the same. 
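The relative paths listed in Step 4 can be double-checked mechanically rather than by counting `../` segments by hand. A minimal sketch, assuming a placeholder root `/repo` (any absolute prefix gives the same result); `relativeImport` is a hypothetical helper name.

```typescript
import { posix } from 'node:path';

// Placeholder repo root; only the repo-relative parts matter for the result.
const REPO = '/repo';

// Repo-relative locations taken from the plan above.
const libSrc = 'libs/internal/aimock-harness/src';
const e2eDirs = [
  'cockpit/langgraph/streaming/angular/e2e',
  'cockpit/chat/tool-calls/angular/e2e',
];

/** Relative import specifier from a directory to the library source dir. */
function relativeImport(fromDir: string, toDir: string): string {
  return posix.relative(posix.join(REPO, fromDir), posix.join(REPO, toDir));
}

for (const dir of e2eDirs) {
  console.log(`${dir} -> ${relativeImport(dir, libSrc)}`);
}
```

Both e2e directories sit five levels below the repo root, so both resolve to the same `../../../../../libs/internal/aimock-harness/src` specifier.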
+ +--- + +## Task 1: Scaffold `libs/internal/aimock-harness/` + +**Files:** +- Create: `libs/internal/aimock-harness/project.json` +- Create: `libs/internal/aimock-harness/tsconfig.json` +- Create: `libs/internal/aimock-harness/README.md` +- Create: `libs/internal/aimock-harness/src/index.ts` + +- [ ] **Step 1: Create project.json** + +Write `libs/internal/aimock-harness/project.json`: + +```json +{ + "name": "internal-aimock-harness", + "$schema": "../../../node_modules/nx/schemas/project-schema.json", + "projectType": "library", + "sourceRoot": "libs/internal/aimock-harness/src", + "tags": ["scope:internal"], + "targets": { + "lint": { + "executor": "nx:run-commands", + "options": { + "cwd": "libs/internal/aimock-harness", + "command": "tsc --noEmit" + } + }, + "test": { + "executor": "nx:run-commands", + "options": { + "cwd": "libs/internal/aimock-harness", + "command": "vitest run" + } + } + } +} +``` + +- [ ] **Step 2: Create tsconfig.json** + +Write `libs/internal/aimock-harness/tsconfig.json`: + +```json +{ + "compilerOptions": { + "target": "ES2022", + "module": "ES2022", + "moduleResolution": "Bundler", + "esModuleInterop": true, + "strict": true, + "skipLibCheck": true, + "noEmit": true, + "types": ["node"] + }, + "include": ["src/**/*.ts"], + "exclude": ["node_modules"] +} +``` + +- [ ] **Step 3: Create README.md** + +Write `libs/internal/aimock-harness/README.md`: + +```markdown +# @ngaf-internal/aimock-harness + +Internal-only library that wraps [`@copilotkit/aimock`](https://github.com/CopilotKit/aimock) for our cockpit example aimock e2e suite. + +NOT published. The `@ngaf-internal/*` namespace is reserved for internal libraries that are tightly coupled to repo-specific orchestration (langgraph + Angular dev server boot) and shouldn't appear in consumer-facing API surfaces. 
+ +## API + +```typescript +import { createGlobalSetup, sendPromptAndWait } from '@ngaf-internal/aimock-harness'; +``` + +- `createGlobalSetup(opts)` — returns a Playwright globalSetup function that boots aimock + langgraph + the named Angular dev server. +- `sendPromptAndWait(page, prompt, opts?)` — Playwright helper. Goes to a path (default `/`), sends the prompt, waits for `chat-message[data-role="assistant"][data-streaming="false"]`, returns the bubble locator. + +## Per-example consumer shape + +``` +cockpit///angular/e2e/ +├── playwright.config.ts // imports createGlobalSetup, passes app-specific opts +├── global-setup-impl.ts // re-exports createGlobalSetup({...}) as default +├── fixtures/.json +├── scripts/record-.py +└── .spec.ts +``` + +See `cockpit/langgraph/streaming/angular/e2e/` for a working example. +``` + +- [ ] **Step 4: Create src/index.ts (skeleton)** + +Write `libs/internal/aimock-harness/src/index.ts`: + +```typescript +// SPDX-License-Identifier: MIT +export { startAimock, type AimockHandle, type AimockStartOptions } from './aimock-runner'; +export { sendPromptAndWait, type SendPromptAndWaitOptions } from './test-helpers'; +export { createGlobalSetup, type CreateGlobalSetupOpts } from './global-setup-factory'; +``` + +(The imports point at files Tasks 2/3/5 create; this file is committed now and the modules are added in their tasks. tsc will fail until those tasks land — that's intentional and tracked by the per-task verification.) 
+ +- [ ] **Step 5: Commit Task 1** + +```bash +cd /tmp/aimock-harness +git add libs/internal/aimock-harness/project.json \ + libs/internal/aimock-harness/tsconfig.json \ + libs/internal/aimock-harness/README.md \ + libs/internal/aimock-harness/src/index.ts +git commit -m "feat(internal-aimock-harness): scaffold internal library" +``` + +--- + +## Task 2: Port aimock-runner.ts + tests from chat harness + +**Files:** +- Create: `libs/internal/aimock-harness/src/aimock-runner.ts` +- Create: `libs/internal/aimock-harness/src/aimock-runner.spec.ts` + +- [ ] **Step 1: Copy aimock-runner.ts byte-for-byte from chat harness** + +```bash +cd /tmp/aimock-harness +cp examples/chat/aimock-e2e/aimock-runner.ts libs/internal/aimock-harness/src/aimock-runner.ts +``` + +- [ ] **Step 2: Copy aimock-runner.spec.ts byte-for-byte** + +```bash +cp examples/chat/aimock-e2e/aimock-runner.spec.ts libs/internal/aimock-harness/src/aimock-runner.spec.ts +``` + +The spec's `import { startAimock, type AimockHandle } from './aimock-runner';` is already correct for the new location (same relative). + +- [ ] **Step 3: Run vitest on the new lib** + +```bash +cd /tmp/aimock-harness/libs/internal/aimock-harness +npx vitest run aimock-runner.spec.ts +``` + +Expected: 3 passed (boots replay server, stop is idempotent, loads directory of fixtures). 
+ +- [ ] **Step 4: Commit Task 2** + +```bash +cd /tmp/aimock-harness +git add libs/internal/aimock-harness/src/aimock-runner.ts \ + libs/internal/aimock-harness/src/aimock-runner.spec.ts +git commit -m "feat(internal-aimock-harness): port aimock-runner + tests from chat harness" +``` + +--- + +## Task 3: Implement test-helpers.ts with configurable path + +**Files:** +- Create: `libs/internal/aimock-harness/src/test-helpers.ts` +- Create: `libs/internal/aimock-harness/src/test-helpers.spec.ts` + +- [ ] **Step 1: Write test-helpers.ts** + +Write `libs/internal/aimock-harness/src/test-helpers.ts`: + +```typescript +// SPDX-License-Identifier: MIT +import { expect, type Locator, type Page } from '@playwright/test'; + +export interface SendPromptAndWaitOptions { + /** Route to navigate to before sending the prompt. Default: '/'. */ + path?: string; +} + +/** + * Send a user prompt and wait for the assistant bubble to finalize. + * + * "Finalized" means `chat-message[data-role="assistant"][data-streaming="false"]`: + * the chat composition wires `[streaming]` to `agent.isLoading() && i === lastIndex` + * on the latest assistant ``, so the attribute flips to `"false"` + * once the agent stops loading and the markdown render has settled. + * + * Asserting on intermediate streaming-state DOM (partial `
`, in-flight + * code fences, etc.) is the source of e2e flake; always wait on this + * attribute before counting or text-matching downstream of the assistant turn. + */ +export async function sendPromptAndWait( + page: Page, + prompt: string, + opts?: SendPromptAndWaitOptions, +): Promise<Locator> { + const path = opts?.path ?? '/'; + await page.goto(path); + const input = page.getByRole('textbox', { name: /message|prompt/i }); + await input.fill(prompt); + await page.getByRole('button', { name: /send/i }).click(); + + const finalizedAssistant = page + .locator('chat-message[data-role="assistant"][data-streaming="false"]') + .last(); + await expect(finalizedAssistant).toBeAttached({ timeout: 45_000 }); + await expect + .poll(async () => ((await finalizedAssistant.innerText()) ?? '').trim().length, { + timeout: 30_000, + }) + .toBeGreaterThan(0); + return finalizedAssistant; +} +``` + +- [ ] **Step 2: Write a small unit test for the path defaulting** + +Write `libs/internal/aimock-harness/src/test-helpers.spec.ts`: + +```typescript +// SPDX-License-Identifier: MIT +import { describe, it, expect } from 'vitest'; +import type { SendPromptAndWaitOptions } from './test-helpers'; + +// The helper itself is integration-level (drives a real Playwright page); +// per-example specs exercise it. This file just locks in the type contract. + +describe('SendPromptAndWaitOptions', () => { + it('accepts an empty options object', () => { + const opts: SendPromptAndWaitOptions = {}; + expect(opts.path).toBeUndefined(); + }); + + it('accepts a path override', () => { + const opts: SendPromptAndWaitOptions = { path: '/embed' }; + expect(opts.path).toBe('/embed'); + }); +}); +``` + +- [ ] **Step 3: Run vitest** + +```bash +cd /tmp/aimock-harness/libs/internal/aimock-harness +npx vitest run test-helpers.spec.ts +``` + +Expected: 2 passed. 
+ +- [ ] **Step 4: Commit Task 3** + +```bash +cd /tmp/aimock-harness +git add libs/internal/aimock-harness/src/test-helpers.ts \ + libs/internal/aimock-harness/src/test-helpers.spec.ts +git commit -m "feat(internal-aimock-harness): test-helpers with configurable path" +``` + +--- + +## Task 4: Implement global-teardown.ts + +**Files:** +- Create: `libs/internal/aimock-harness/src/global-teardown.ts` + +- [ ] **Step 1: Write global-teardown.ts** + +Write `libs/internal/aimock-harness/src/global-teardown.ts`: + +```typescript +// SPDX-License-Identifier: MIT +import type { ChildProcess } from 'node:child_process'; +import type { AimockHandle } from './aimock-runner'; + +interface SharedState { + aimock: AimockHandle; + langgraph: ChildProcess; + angular: ChildProcess; +} + +declare global { + // eslint-disable-next-line no-var + var __AIMOCK_HARNESS_STATE__: Map<string, SharedState> | undefined; +} + +/** + * Default Playwright globalTeardown. Walks every state slot the factory + * registered (one per Angular project), kills processes in reverse order + * (Angular → langgraph → aimock), awaits aimock stop. Idempotent. + */ +export default async function globalTeardown(): Promise<void> { + const states = globalThis.__AIMOCK_HARNESS_STATE__; + if (!states) return; + for (const state of states.values()) { + state.angular.kill('SIGTERM'); + state.langgraph.kill('SIGTERM'); + await state.aimock.stop(); + } + globalThis.__AIMOCK_HARNESS_STATE__ = undefined; +} +``` + +- [ ] **Step 2: Type-check** + +```bash +cd /tmp/aimock-harness/libs/internal/aimock-harness +npx tsc --noEmit +``` + +Expected: errors only on `global-setup-factory.ts` (Task 5 hasn't created it yet). The error list should mention only `global-setup-factory` and the index re-export of it. If anything else fails, STOP. 
+ +- [ ] **Step 3: Commit Task 4** + +```bash +cd /tmp/aimock-harness +git add libs/internal/aimock-harness/src/global-teardown.ts +git commit -m "feat(internal-aimock-harness): global-teardown with multi-slot state" +``` + +--- + +## Task 5: Implement global-setup-factory.ts + +**Files:** +- Create: `libs/internal/aimock-harness/src/global-setup-factory.ts` + +- [ ] **Step 1: Write global-setup-factory.ts** + +Write `libs/internal/aimock-harness/src/global-setup-factory.ts`: + +```typescript +// SPDX-License-Identifier: MIT +import { spawn, type ChildProcess } from 'node:child_process'; +import { setTimeout as delay } from 'node:timers/promises'; +import { resolve } from 'node:path'; +import { startAimock, type AimockHandle } from './aimock-runner'; + +export interface CreateGlobalSetupOpts { + /** Repo-relative path to the python langgraph project. */ + langgraphCwd: string; + /** Port the langgraph dev server binds. Default: 8123. */ + langgraphPort?: number; + /** Nx project name of the Angular dev server. */ + angularProject: string; + /** Port the Angular dev server should bind. */ + angularPort: number; + /** Absolute path to the per-example fixtures dir. */ + fixturesDir: string; + /** Default 90_000. */ + langgraphReadyTimeoutMs?: number; + /** Default 120_000. 
*/ + angularReadyTimeoutMs?: number; +} + +interface SharedState { + aimock: AimockHandle; + langgraph: ChildProcess; + angular: ChildProcess; +} + +declare global { + // eslint-disable-next-line no-var + var __AIMOCK_HARNESS_STATE__: Map<string, SharedState> | undefined; +} + +async function waitForPort(url: string, timeoutMs: number, label: string): Promise<void> { + const start = Date.now(); + while (Date.now() - start < timeoutMs) { + try { + const res = await fetch(url); + if (res.ok || res.status === 404) return; + } catch { + // server not up yet + } + await delay(500); + } + throw new Error(`[${label}] not ready at ${url} within ${timeoutMs}ms`); +} + +function repoRoot(opts: CreateGlobalSetupOpts): string { + // The factory is called from per-example playwright configs that themselves + // live many levels deep. We compute REPO_ROOT relative to the fixturesDir + // (which the consumer passes as an absolute path) so the consumer doesn't + // need to pass it explicitly. The repo root is the nearest ancestor that + // contains a `cockpit/` directory; for our layout, walking up from any + // per-example fixturesDir hits the repo root in 6 levels. + let dir = opts.fixturesDir; + for (let i = 0; i < 10; i++) { + if (require('node:fs').existsSync(require('node:path').join(dir, 'cockpit'))) { + return dir; + } + dir = require('node:path').dirname(dir); + } + throw new Error('repo root not found from fixturesDir; passed: ' + opts.fixturesDir); +} + +export function createGlobalSetup(opts: CreateGlobalSetupOpts): () => Promise<void> { + const langgraphPort = opts.langgraphPort ?? 8123; + const langgraphTimeout = opts.langgraphReadyTimeoutMs ?? 90_000; + const angularTimeout = opts.angularReadyTimeoutMs ?? 
120_000; + + return async function globalSetup(): Promise<void> { + const root = repoRoot(opts); + const aimock = await startAimock({ mode: 'replay', fixturePath: opts.fixturesDir }); + // eslint-disable-next-line no-console + console.log(`[aimock-harness] aimock listening at ${aimock.baseUrl}`); + + const langgraph = spawn( + 'uv', + ['run', 'langgraph', 'dev', '--port', String(langgraphPort), '--no-browser'], + { + cwd: resolve(root, opts.langgraphCwd), + env: { + ...process.env, + OPENAI_BASE_URL: aimock.baseUrl, + OPENAI_API_KEY: 'test-not-used', + }, + stdio: 'pipe', + }, + ); + langgraph.stdout?.on('data', (b) => process.stdout.write(`[langgraph] ${b}`)); + langgraph.stderr?.on('data', (b) => process.stderr.write(`[langgraph] ${b}`)); + + await waitForPort(`http://localhost:${langgraphPort}/ok`, langgraphTimeout, 'langgraph'); + // eslint-disable-next-line no-console + console.log(`[aimock-harness] langgraph ready on :${langgraphPort}`); + + const angular = spawn( + 'npx', + ['nx', 'serve', opts.angularProject, '--port', String(opts.angularPort)], + { + cwd: root, + env: { ...process.env }, + stdio: 'pipe', + }, + ); + angular.stdout?.on('data', (b) => process.stdout.write(`[angular] ${b}`)); + angular.stderr?.on('data', (b) => process.stderr.write(`[angular] ${b}`)); + + await waitForPort(`http://localhost:${opts.angularPort}/`, angularTimeout, 'angular'); + // eslint-disable-next-line no-console + console.log(`[aimock-harness] angular ready on :${opts.angularPort} (${opts.angularProject})`); + + if (!globalThis.__AIMOCK_HARNESS_STATE__) { + globalThis.__AIMOCK_HARNESS_STATE__ = new Map(); + } + globalThis.__AIMOCK_HARNESS_STATE__.set(opts.angularProject, { aimock, langgraph, angular }); + }; +} +``` + +- [ ] **Step 2: Type-check the whole library** + +```bash +cd /tmp/aimock-harness/libs/internal/aimock-harness +npx tsc --noEmit +``` + +Expected: no errors. 
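The `repoRoot()` heuristic in the factory can be exercised in isolation. The following is a minimal sketch, assuming the same "nearest ancestor that contains `cockpit/`" rule. `findAncestorContaining` is a hypothetical helper name (not part of the library), and the temp-directory tree only imitates the per-example layout.

```typescript
import { existsSync, mkdirSync, mkdtempSync, rmSync } from 'node:fs';
import { dirname, join } from 'node:path';
import { tmpdir } from 'node:os';

/**
 * Hypothetical helper: walk up from `startDir` until a directory that
 * contains `marker` is found, checking at most `maxLevels` directories.
 */
function findAncestorContaining(startDir: string, marker: string, maxLevels = 10): string {
  let dir = startDir;
  for (let i = 0; i < maxLevels; i++) {
    if (existsSync(join(dir, marker))) return dir;
    const parent = dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  throw new Error(`no ancestor of ${startDir} contains ${marker}`);
}

// Demo against a throwaway tree shaped like the per-example layout.
const fakeRepo = mkdtempSync(join(tmpdir(), 'repo-root-demo-'));
const fixturesDir = join(fakeRepo, 'cockpit', 'chat', 'tool-calls', 'angular', 'e2e', 'fixtures');
mkdirSync(fixturesDir, { recursive: true });

const resolvedRoot = findAncestorContaining(fixturesDir, 'cockpit');

// Clean up the throwaway tree.
rmSync(fakeRepo, { recursive: true, force: true });
```

If a per-example directory ever contained its own nested `cockpit/` folder, the walk would stop early at that directory, which is one way the heuristic could land on the wrong root.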
+ +- [ ] **Step 3: Run all lib tests** + +```bash +cd /tmp/aimock-harness/libs/internal/aimock-harness +npx vitest run +``` + +Expected: 5 passed (3 from aimock-runner + 2 from test-helpers). + +- [ ] **Step 4: Commit Task 5** + +```bash +cd /tmp/aimock-harness +git add libs/internal/aimock-harness/src/global-setup-factory.ts +git commit -m "feat(internal-aimock-harness): createGlobalSetup factory" +``` + +--- + +## Task 6: Wire path alias (or use relative imports per Task 0 finding) + +**Files:** +- Modify: root tsconfig (`tsconfig.json` or `tsconfig.base.json`) + +If Task 0 reported aliases work, do this task. Otherwise SKIP and note in your report that consumers will use relative imports; Tasks 7 and 8 use the relative paths instead. + +- [ ] **Step 1: Locate the root tsconfig with `paths`** + +```bash +cd /tmp/aimock-harness +test -f tsconfig.base.json && echo "base" || (test -f tsconfig.json && echo "root") +grep -l '"paths"' /tmp/aimock-harness/tsconfig*.json +``` + +Use whichever file already declares `paths` for `@ngaf/*` (or similar). If neither exists with paths, add to `tsconfig.json` (the root). + +- [ ] **Step 2: Add the alias** + +In the chosen tsconfig file, add to `compilerOptions.paths`: + +```json +"@ngaf-internal/aimock-harness": ["libs/internal/aimock-harness/src/index.ts"], +"@ngaf-internal/aimock-harness/global-teardown": ["libs/internal/aimock-harness/src/global-teardown.ts"] +``` + +Verify the file is valid JSON: +```bash +python3 -c "import json; json.load(open(''))" && echo "OK" +``` + +- [ ] **Step 3: Commit Task 6** + +```bash +cd /tmp/aimock-harness +git add +git commit -m "chore(tsconfig): add @ngaf-internal/aimock-harness path alias" +``` + +If Task 0 found aliases don't resolve in Playwright, commit message is irrelevant — skip this task entirely. 
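A note on why Step 2 adds two entries: a `paths` key with no `*` wildcard matches only that exact module specifier, so the `/global-teardown` subpath needs its own key. The sketch below is a simplified model of that lookup, not TypeScript's or Playwright's actual resolver; `resolveAlias` and `PathsMap` are hypothetical names used only for illustration.

```typescript
// Simplified model of a `compilerOptions.paths` lookup (illustration only).
type PathsMap = Record<string, string[]>;

function resolveAlias(specifier: string, paths: PathsMap): string | undefined {
  // An exact key wins outright.
  const exact = paths[specifier];
  if (exact !== undefined && exact.length > 0) return exact[0];

  // Otherwise try single-`*` patterns, preferring the longest prefix.
  let bestPrefixLength = -1;
  let bestTarget: string | undefined;
  for (const [pattern, targets] of Object.entries(paths)) {
    const star = pattern.indexOf('*');
    const [firstTarget] = targets;
    if (star === -1 || firstTarget === undefined) continue;
    const prefix = pattern.slice(0, star);
    const suffix = pattern.slice(star + 1);
    const fits =
      specifier.length >= prefix.length + suffix.length &&
      specifier.startsWith(prefix) &&
      specifier.endsWith(suffix);
    if (fits && prefix.length > bestPrefixLength) {
      const captured = specifier.slice(prefix.length, specifier.length - suffix.length);
      bestPrefixLength = prefix.length;
      bestTarget = firstTarget.replace('*', captured);
    }
  }
  return bestTarget;
}

const bothEntries: PathsMap = {
  '@ngaf-internal/aimock-harness': ['libs/internal/aimock-harness/src/index.ts'],
  '@ngaf-internal/aimock-harness/global-teardown': [
    'libs/internal/aimock-harness/src/global-teardown.ts',
  ],
};

const indexTarget = resolveAlias('@ngaf-internal/aimock-harness', bothEntries);
const teardownTarget = resolveAlias('@ngaf-internal/aimock-harness/global-teardown', bothEntries);

// With only the first entry, the subpath has nothing to match against.
const withoutSubpathEntry = resolveAlias('@ngaf-internal/aimock-harness/global-teardown', {
  '@ngaf-internal/aimock-harness': ['libs/internal/aimock-harness/src/index.ts'],
});
```

Whether these aliases also resolve at Playwright runtime is the question Task 0 answers; this model only covers the key-matching rule.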
+ +--- + +## Task 7: Migrate the streaming spec to per-example layout + +**Files:** +- Create: `cockpit/langgraph/streaming/angular/e2e/playwright.config.ts` +- Create: `cockpit/langgraph/streaming/angular/e2e/global-setup-impl.ts` +- Create: `cockpit/langgraph/streaming/angular/e2e/tsconfig.json` +- Create: `cockpit/langgraph/streaming/angular/e2e/.gitignore` +- Create: `cockpit/langgraph/streaming/angular/e2e/fixtures/streaming.json` (copied from existing) +- Create: `cockpit/langgraph/streaming/angular/e2e/scripts/record-streaming.py` (copied from existing) +- Create: `cockpit/langgraph/streaming/angular/e2e/streaming.spec.ts` (copied + import fixed) +- Modify: `cockpit/langgraph/streaming/angular/project.json` (add `e2e` target) + +This task migrates the Phase 1 spec verbatim to its new home. Task 9 deletes the old `apps/cockpit/e2e/` location. + +- [ ] **Step 1: Create the playwright config** + +Write `cockpit/langgraph/streaming/angular/e2e/playwright.config.ts`. **Use the alias if Task 0 confirmed it works; otherwise use the relative import path the de-risk reported.** + +```typescript +// SPDX-License-Identifier: MIT +import { defineConfig, devices } from '@playwright/test'; + +export default defineConfig({ + testDir: '.', + testMatch: '**/*.spec.ts', + fullyParallel: false, + workers: 1, + retries: process.env.CI ? 2 : 0, + reporter: process.env.CI ? [['list'], ['html', { open: 'never' }]] : 'list', + use: { + baseURL: 'http://localhost:4300', + trace: 'retain-on-failure', + }, + projects: [{ name: 'chromium', use: { ...devices['Desktop Chrome'] } }], + globalSetup: './global-setup-impl.ts', + globalTeardown: require.resolve('@ngaf-internal/aimock-harness/global-teardown'), +}); +``` + +If aliases don't work: replace the `globalTeardown` line with `globalTeardown: require.resolve('../../../../../libs/internal/aimock-harness/src/global-teardown'),`. 
+ +- [ ] **Step 2: Create global-setup-impl.ts** + +Write `cockpit/langgraph/streaming/angular/e2e/global-setup-impl.ts`: + +```typescript +// SPDX-License-Identifier: MIT +import { resolve } from 'node:path'; +import { createGlobalSetup } from '@ngaf-internal/aimock-harness'; + +export default createGlobalSetup({ + langgraphCwd: 'cockpit/langgraph/streaming/python', + angularProject: 'cockpit-langgraph-streaming-angular', + angularPort: 4300, + fixturesDir: resolve(__dirname, 'fixtures'), +}); +``` + +If aliases don't work: replace the `import` with `import { createGlobalSetup } from '../../../../../libs/internal/aimock-harness/src';`. + +- [ ] **Step 3: Create tsconfig.json + .gitignore** + +Write `cockpit/langgraph/streaming/angular/e2e/tsconfig.json`: + +```json +{ + "compilerOptions": { + "target": "ES2022", + "module": "ES2022", + "moduleResolution": "Bundler", + "esModuleInterop": true, + "strict": true, + "skipLibCheck": true, + "noEmit": true, + "types": ["node"] + }, + "include": ["**/*.ts"], + "exclude": ["node_modules", "test-results", "playwright-report"] +} +``` + +Write `cockpit/langgraph/streaming/angular/e2e/.gitignore`: + +``` +test-results/ +playwright-report/ +*.tmp +``` + +- [ ] **Step 4: Copy the streaming fixture, capture script, and spec from the old location** + +```bash +cd /tmp/aimock-harness +mkdir -p cockpit/langgraph/streaming/angular/e2e/fixtures cockpit/langgraph/streaming/angular/e2e/scripts +cp apps/cockpit/e2e/fixtures/streaming.json cockpit/langgraph/streaming/angular/e2e/fixtures/streaming.json +cp apps/cockpit/e2e/scripts/record-streaming.py cockpit/langgraph/streaming/angular/e2e/scripts/record-streaming.py +cp apps/cockpit/e2e/streaming.spec.ts cockpit/langgraph/streaming/angular/e2e/streaming.spec.ts +``` + +The streaming.spec.ts currently imports `from './test-helpers'`. 
Update it to use the library: + +```bash +cd /tmp/aimock-harness +# Edit cockpit/langgraph/streaming/angular/e2e/streaming.spec.ts: +# Change: import { sendPromptAndWait } from './test-helpers'; +# To: import { sendPromptAndWait } from '@ngaf-internal/aimock-harness'; +# (or the relative path if aliases don't work) +``` + +- [ ] **Step 5: Add the `e2e` target to the angular project** + +Open `cockpit/langgraph/streaming/angular/project.json`. Add to `targets`: + +```json +"e2e": { + "executor": "@nx/playwright:playwright", + "options": { + "config": "cockpit/langgraph/streaming/angular/e2e/playwright.config.ts" + } +} +``` + +Verify JSON is valid: +```bash +python3 -c "import json; json.load(open('cockpit/langgraph/streaming/angular/project.json'))" && echo "OK" +``` + +- [ ] **Step 6: Run the migrated spec** + +```bash +cd /tmp/aimock-harness +cp /Users/blove/repos/angular-agent-framework/examples/chat/python/.env cockpit/langgraph/streaming/python/.env +node libs/licensing/scripts/generate-public-key.mjs 2>&1 | tail -1 +npx playwright install --with-deps chromium +npx nx e2e cockpit-langgraph-streaming-angular +``` + +Expected: 1 test passes (the migrated streaming spec). Wall-clock ~60-120s. + +If the test fails, STOP and report. Likely causes: +- Path alias / relative import still wrong. +- `repoRoot()` heuristic in the factory didn't land on the correct dir — check the log lines from `[aimock-harness]`. +- langgraph or Angular fail to start — check `[langgraph]` / `[angular]` log lines. 
+ +- [ ] **Step 7: Commit Task 7** + +```bash +cd /tmp/aimock-harness +git add cockpit/langgraph/streaming/angular/e2e/ cockpit/langgraph/streaming/angular/project.json +git commit -m "feat(cockpit-langgraph-streaming): migrate aimock e2e to per-example layout" +``` + +--- + +## Task 8: Add c-tool-calls e2e (capture fixture + write spec) + +**Files:** +- Create: `cockpit/chat/tool-calls/angular/e2e/playwright.config.ts` +- Create: `cockpit/chat/tool-calls/angular/e2e/global-setup-impl.ts` +- Create: `cockpit/chat/tool-calls/angular/e2e/tsconfig.json` +- Create: `cockpit/chat/tool-calls/angular/e2e/.gitignore` +- Create: `cockpit/chat/tool-calls/angular/e2e/scripts/record-c-tool-calls.py` +- Create: `cockpit/chat/tool-calls/angular/e2e/fixtures/c-tool-calls.json` (generated by script) +- Create: `cockpit/chat/tool-calls/angular/e2e/c-tool-calls.spec.ts` +- Modify: `cockpit/chat/tool-calls/angular/project.json` (add `e2e` target) + +- [ ] **Step 1: Create the per-example dir scaffolding** + +Write `cockpit/chat/tool-calls/angular/e2e/playwright.config.ts`: + +```typescript +// SPDX-License-Identifier: MIT +import { defineConfig, devices } from '@playwright/test'; + +export default defineConfig({ + testDir: '.', + testMatch: '**/*.spec.ts', + fullyParallel: false, + workers: 1, + retries: process.env.CI ? 2 : 0, + reporter: process.env.CI ? [['list'], ['html', { open: 'never' }]] : 'list', + use: { + baseURL: 'http://localhost:4504', + trace: 'retain-on-failure', + }, + projects: [{ name: 'chromium', use: { ...devices['Desktop Chrome'] } }], + globalSetup: './global-setup-impl.ts', + globalTeardown: require.resolve('@ngaf-internal/aimock-harness/global-teardown'), +}); +``` + +(If aliases don't work: same relative-path swap as Task 7 Step 1.) 
+ +Write `cockpit/chat/tool-calls/angular/e2e/global-setup-impl.ts`: + +```typescript +// SPDX-License-Identifier: MIT +import { resolve } from 'node:path'; +import { createGlobalSetup } from '@ngaf-internal/aimock-harness'; + +export default createGlobalSetup({ + langgraphCwd: 'cockpit/langgraph/streaming/python', + angularProject: 'cockpit-chat-tool-calls-angular', + angularPort: 4504, + fixturesDir: resolve(__dirname, 'fixtures'), +}); +``` + +Write `cockpit/chat/tool-calls/angular/e2e/tsconfig.json` and `.gitignore` — same content as Task 7 Step 3. + +- [ ] **Step 2: Write the capture script** + +Write `cockpit/chat/tool-calls/angular/e2e/scripts/record-c-tool-calls.py`: + +```python +"""Capture parent first-call (tool_call) + continuation (text) for c-tool-calls. + +Mirrors cockpit/langgraph/streaming/python/src/chat_graphs.py's +_build_tool_calls_graph() LLM setup: ChatOpenAI(gpt-5-mini, streaming=True) +bound with AVIATION_TOOLS, system prompt from prompts/tool-calls.md. + +Two LLM calls captured, written into one fixture with the hasToolResult +discriminator on the continuation entry. + +Run from repo root: + OPENAI_API_KEY=sk-... 
uv run --project cockpit/langgraph/streaming/python \\ + python cockpit/chat/tool-calls/angular/e2e/scripts/record-c-tool-calls.py +""" +import json +import os +import sys +import uuid +from pathlib import Path + +env_path = Path("cockpit/langgraph/streaming/python/.env") +if env_path.exists(): + for line in env_path.read_text().splitlines(): + line = line.strip() + if line and not line.startswith("#") and "=" in line: + k, _, v = line.partition("=") + os.environ.setdefault(k.strip(), v.strip().strip('"').strip("'")) + +if not os.environ.get("OPENAI_API_KEY"): + print("OPENAI_API_KEY not set", file=sys.stderr) + sys.exit(1) + +sys.path.insert(0, str(Path("cockpit/langgraph/streaming/python/src").resolve())) + +from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, ToolMessage +from langchain_openai import ChatOpenAI + +from src.aviation_tools import AVIATION_TOOLS, lookup_flight # type: ignore + +PROMPT = "What's the status of UA123?" +SYSTEM_PROMPT = ( + Path("cockpit/langgraph/streaming/python/prompts/tool-calls.md").read_text() +) + +llm = ChatOpenAI(model="gpt-5-mini", temperature=0).bind_tools(AVIATION_TOOLS) + +# 1. Parent's first call. +first = llm.invoke([SystemMessage(content=SYSTEM_PROMPT), HumanMessage(content=PROMPT)]) +assert first.tool_calls, f"Parent did not emit tool_calls; content={first.content!r}" +tc = first.tool_calls[0] +tc_args = tc.get("args") or {} +tc_id = tc.get("id") or f"call_{uuid.uuid4().hex[:12]}" +print(f"1. parent tool_call name={tc.get('name')} args={tc_args}") + +# 2. Tool result (real lookup_flight). +tool_result = lookup_flight.invoke(tc_args) # returns canned aviation data +print(f"2. tool result length={len(str(tool_result))}") + +# 3. Parent's continuation call. 
+continuation = llm.invoke( + [ + SystemMessage(content=SYSTEM_PROMPT), + HumanMessage(content=PROMPT), + AIMessage( + content="", + tool_calls=[{"name": tc.get("name"), "args": tc_args, "id": tc_id, "type": "tool_call"}], + ), + ToolMessage(content=str(tool_result), tool_call_id=tc_id), + ], +) +text = continuation.content if isinstance(continuation.content, str) else "" +if not text.strip(): + print("Continuation returned empty; aborting", file=sys.stderr) + sys.exit(2) +print(f"3. continuation: {len(text)} chars; first 80: {text[:80]!r}") + +fixture = { + "fixtures": [ + # ORDER MATTERS: continuation match is more specific (hasToolResult); + # aimock evaluates fixtures top-to-bottom and picks the first match. + { + "match": {"userMessage": PROMPT, "hasToolResult": True}, + "response": {"content": text}, + }, + { + "match": {"userMessage": PROMPT}, + "response": {"toolCalls": [{"name": tc.get("name"), "arguments": tc_args}]}, + }, + ] +} + +out_path = Path("cockpit/chat/tool-calls/angular/e2e/fixtures/c-tool-calls.json") +out_path.parent.mkdir(parents=True, exist_ok=True) +out_path.write_text(json.dumps(fixture, indent=2) + "\n") +print(f"\nWrote fixture to {out_path}") +``` + +- [ ] **Step 3: Run the capture script** + +```bash +cd /tmp/aimock-harness +uv run --project cockpit/langgraph/streaming/python python cockpit/chat/tool-calls/angular/e2e/scripts/record-c-tool-calls.py +``` + +Expected: prints all three steps; writes the fixture. + +If the parent doesn't emit `tool_calls`: STOP. Check that the `tool-calls.md` prompt is asking the LLM to use tools (it should after PR #347). + +- [ ] **Step 4: Inspect the fixture and pick a phrase** + +```bash +head -50 cockpit/chat/tool-calls/angular/e2e/fixtures/c-tool-calls.json +``` + +Pick a 1-2 word phrase from the continuation `content` that's likely to render verbatim. Good candidates: `UA123`, the flight number, or specific flight detail wording (e.g., `delayed`, `on time`, an airport code). 
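The "ORDER MATTERS" comment in the capture script is easier to see with a toy matcher. This sketch assumes only what that comment states (top-to-bottom evaluation, first match wins) and compares the prompt by exact string equality. It is not aimock's actual matching code, and `pickFirst`, `Fixture`, and `IncomingRequest` are hypothetical names.

```typescript
// Toy first-match-wins matcher (illustration only; not aimock's implementation).
interface FixtureMatch {
  userMessage: string;
  hasToolResult?: boolean;
}
interface Fixture {
  match: FixtureMatch;
  response: { content?: string; toolCalls?: { name: string }[] };
}
interface IncomingRequest {
  lastUserMessage: string;
  hasToolResult: boolean;
}

function matches(m: FixtureMatch, req: IncomingRequest): boolean {
  if (m.userMessage !== req.lastUserMessage) return false;
  // A fixture without the discriminator accepts first calls and continuations alike.
  if (m.hasToolResult !== undefined && m.hasToolResult !== req.hasToolResult) return false;
  return true;
}

function pickFirst(fixtures: Fixture[], req: IncomingRequest): Fixture | undefined {
  return fixtures.find((fx) => matches(fx.match, req));
}

const PROMPT = "What's the status of UA123?";
const continuation: Fixture = {
  match: { userMessage: PROMPT, hasToolResult: true },
  response: { content: '(captured continuation text)' },
};
const firstCall: Fixture = {
  match: { userMessage: PROMPT },
  response: { toolCalls: [{ name: 'lookup_flight' }] },
};

const specificFirst = [continuation, firstCall]; // order the script writes
const generalFirst = [firstCall, continuation]; // reversed order

const afterToolSpecificFirst = pickFirst(specificFirst, { lastUserMessage: PROMPT, hasToolResult: true });
const afterToolGeneralFirst = pickFirst(generalFirst, { lastUserMessage: PROMPT, hasToolResult: true });
const firstTurn = pickFirst(specificFirst, { lastUserMessage: PROMPT, hasToolResult: false });
```

With the general entry first, the request that follows the tool result would be answered with the tool call again under this model, so the text response would never be reached.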
+ +- [ ] **Step 5: Write the spec** + +Write `cockpit/chat/tool-calls/angular/e2e/c-tool-calls.spec.ts` (replace `` with the phrase from Step 4): + +```typescript +// SPDX-License-Identifier: MIT +import { test, expect } from '@playwright/test'; +import { sendPromptAndWait } from '@ngaf-internal/aimock-harness'; + +const PROMPT = "What's the status of UA123?"; + +test('c-tool-calls: parent dispatches lookup_flight tool, continuation surfaces flight data', async ({ page }) => { + const bubble = await sendPromptAndWait(page, PROMPT); + + // The chat-tool-calls primitive renders a card per tool call. Card label + // includes the tool name. Asserting it's in the DOM proves the parent's + // tool_call routed through the chat-tool-calls UI primitive. + const toolCallChip = page.getByRole('button', { name: /lookup_flight|tool/i }).first(); + await expect(toolCallChip).toBeVisible({ timeout: 30_000 }); + + // The continuation's text mentions a distinctive phrase from the captured + // response — proves the tool-result-then-text loop completed end-to-end. + const finalText = await bubble.innerText(); + expect(finalText.toLowerCase()).toContain(''.toLowerCase()); +}); +``` + +(If aliases don't work: `import { sendPromptAndWait } from '../../../../../libs/internal/aimock-harness/src';`) + +- [ ] **Step 6: Add the e2e target to the angular project** + +Open `cockpit/chat/tool-calls/angular/project.json`. Add to `targets`: + +```json +"e2e": { + "executor": "@nx/playwright:playwright", + "options": { + "config": "cockpit/chat/tool-calls/angular/e2e/playwright.config.ts" + } +} +``` + +Verify JSON valid: +```bash +python3 -c "import json; json.load(open('cockpit/chat/tool-calls/angular/project.json'))" && echo "OK" +``` + +- [ ] **Step 7: Run the spec** + +```bash +cd /tmp/aimock-harness +npx nx e2e cockpit-chat-tool-calls-angular +``` + +Expected: 1 test passes within ~60-120s. 
+ +If it fails: +- "tool call chip not visible" → inspect the trace; the chip's accessible name may differ from `lookup_flight`. Adjust the selector. +- "innerText missing the phrase" → pick a different phrase from the fixture's content that's more likely to render verbatim. +- "no chat-message" → harness wiring problem; check `[aimock-harness]` log lines and `cockpit-chat-tool-calls-angular`'s app code (it should use `` per `tool-calls.component.ts`). + +- [ ] **Step 8: Stability check** + +Run 3 consecutive runs with port cooldown: + +```bash +for i in 1 2 3; do + echo "=== Run $i ===" + rm -rf cockpit/chat/tool-calls/angular/e2e/test-results cockpit/chat/tool-calls/angular/e2e/playwright-report + sleep 8 + npx nx e2e cockpit-chat-tool-calls-angular +done +``` + +Expected: 3/3 pass. + +- [ ] **Step 9: Commit Task 8** + +```bash +cd /tmp/aimock-harness +git add cockpit/chat/tool-calls/angular/e2e/ cockpit/chat/tool-calls/angular/project.json +git commit -m "test(cockpit-chat-tool-calls): aimock e2e — multi-turn tool-call flow" +``` + +--- + +## Task 9: Delete `apps/cockpit/e2e/` and remove its target + +**Files:** +- Delete: entire `apps/cockpit/e2e/` directory +- Modify: `apps/cockpit/project.json` (remove `e2e` target) + +- [ ] **Step 1: Delete the old dir** + +```bash +cd /tmp/aimock-harness +git rm -r apps/cockpit/e2e/ +``` + +Expected: removes the harness scaffolding, fixtures, scripts, and the streaming spec (already migrated in Task 7). + +- [ ] **Step 2: Remove the e2e target from apps/cockpit/project.json** + +Open `apps/cockpit/project.json` and locate the `"e2e"` target block: + +```json + "e2e": { + "executor": "@nx/playwright:playwright", + "options": { + "config": "apps/cockpit/e2e/playwright.config.ts" + } + }, +``` + +Delete the entire `"e2e": { ... }` entry (and adjust trailing commas in the surrounding JSON). 
+ +Verify the file is still valid JSON: +```bash +python3 -c "import json; json.load(open('apps/cockpit/project.json'))" && echo "OK" +``` + +- [ ] **Step 3: Verify nothing references the deleted dir** + +```bash +cd /tmp/aimock-harness +grep -rn "apps/cockpit/e2e/\|nx e2e cockpit\b" \ + --include='*.ts' --include='*.json' --include='*.yml' --include='*.md' \ + | grep -v 'node_modules\|test-results\|playwright-report\|docs/superpowers/' +``` + +Expected: zero matches (the docs under `docs/superpowers/` are excluded since they document the migration). + +If any matches remain, STOP and report. + +- [ ] **Step 4: Commit Task 9** + +```bash +cd /tmp/aimock-harness +git add apps/cockpit/project.json +git commit -m "chore(cockpit): drop legacy apps/cockpit/e2e (migrated to per-example dirs)" +``` + +The `git rm -r` from Step 1 staged the deletions; the `git add` here stages the project.json modification. + +--- + +## Task 10: Update CI workflow + +**Files:** +- Modify: `.github/workflows/ci.yml` + +- [ ] **Step 1: Locate and update the cockpit-e2e job** + +Open `.github/workflows/ci.yml`, find the `cockpit-e2e` job. The `npx nx e2e cockpit` line needs to change to `npx nx run-many --target=e2e --projects=cockpit-*-angular --parallel=1 --skip-nx-cache`. 
+ +Updated job body: + +```yaml + cockpit-e2e: + name: Cockpit — e2e + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v6.0.2 + - uses: actions/setup-node@v6.3.0 + with: + node-version: 22 + cache: npm + - name: Install uv + uses: astral-sh/setup-uv@v8.0.0 + with: + python-version: '3.12' + - run: npm ci + - working-directory: cockpit/langgraph/streaming/python + run: uv sync + - run: npx playwright install --with-deps chromium + - run: npx nx run-many --target=e2e --projects=cockpit-*-angular --parallel=1 --skip-nx-cache + - name: Upload Playwright trace on failure + if: failure() + uses: actions/upload-artifact@v4 + with: + name: cockpit-e2e-trace + path: | + cockpit/**/angular/e2e/test-results/ + retention-days: 7 +``` + +Note the trace upload path now uses a glob across all per-example test-results dirs. + +- [ ] **Step 2: Verify YAML parses** + +```bash +cd /tmp/aimock-harness +npx -y js-yaml .github/workflows/ci.yml > /dev/null && echo "OK" +``` + +- [ ] **Step 3: Commit Task 10** + +```bash +cd /tmp/aimock-harness +git add .github/workflows/ci.yml +git commit -m "ci(cockpit): nx run-many for per-example aimock e2e" +``` + +--- + +## Task 11: Verify, push, open PR + +- [ ] **Step 1: Run nx run-many locally** + +```bash +cd /tmp/aimock-harness +npx nx run-many --target=e2e --projects=cockpit-*-angular --parallel=1 --skip-nx-cache +``` + +Expected: 2 projects targeted (`cockpit-langgraph-streaming-angular` + `cockpit-chat-tool-calls-angular`); both pass. + +If it fails for either, STOP and report. + +- [ ] **Step 2: Run library unit tests one final time** + +```bash +cd /tmp/aimock-harness/libs/internal/aimock-harness +npx vitest run +``` + +Expected: 5 passed. + +- [ ] **Step 3: Confirm working tree clean** + +```bash +cd /tmp/aimock-harness +git status --short +``` + +Expected: empty (only `node_modules` symlink + any test-results dirs as untracked). 
+ +Remove any stray `cockpit/langgraph/streaming/python/.env` (gitignored, but verify): `rm -f cockpit/langgraph/streaming/python/.env`. + +- [ ] **Step 4: Push** + +```bash +cd /tmp/aimock-harness +git push -u origin claude/aimock-harness-lib +``` + +- [ ] **Step 5: Open PR** + +```bash +gh pr create --title "feat(cockpit): aimock harness library + per-example e2e layout (Phase 2)" --body "$(cat <<'EOF' +## Summary + +Phase 2 of the cockpit aimock e2e plan. Restructures the harness so each cockpit example owns its own e2e dir next to its Angular app, backed by a shared internal library (`libs/internal/aimock-harness`). + +- **New library** `@ngaf-internal/aimock-harness` exporting `createGlobalSetup`, `sendPromptAndWait`, `startAimock`, and a default global-teardown. Internal-only (not published). +- **Migrated** the Phase 1 streaming spec to `cockpit/langgraph/streaming/angular/e2e/`. +- **Added** `c-tool-calls` as the first new-pattern example with full multi-turn fixture (parent tool_call → tool result → continuation). Asserts the chat-tool-calls UI primitive activates AND the continuation's text surfaces flight data from the captured response. +- **Deleted** `apps/cockpit/e2e/` entirely; dropped the e2e target from `apps/cockpit/project.json`. +- **CI** `Cockpit — e2e` job now runs `nx run-many --target=e2e --projects=cockpit-*-angular --parallel=1`. + +Sits on Phase 1 (#349) + the c-* aviation refactor (#347 + #350). 
+ +## Test plan + +- [x] Library vitest suite green (5 tests) +- [x] streaming spec passes 3/3 stability runs after migration +- [x] c-tool-calls spec passes 3/3 stability runs +- [x] `nx run-many --target=e2e --projects=cockpit-*-angular --parallel=1` runs both green +- [x] No production code touched (only harness lib, per-example e2e dirs, project.json e2e target additions/removal, CI workflow) +- [ ] CI green on this PR + +## Notes for reviewers + +- Each future cockpit example PR is now small: one new `e2e/` dir under the example's angular app, one fixture, one spec, one project.json `e2e` target. The library handles all orchestration. +- Path-alias resolution at Playwright runtime was de-risked in Task 0 (the implementer reports the result in the PR body if a fallback to relative imports was needed). +- Module duplication with the chat aimock harness (`examples/chat/aimock-e2e/aimock-runner.ts` etc.) is intentional and untouched here. Migrating the chat harness onto this library is a separate cleanup PR. + +Spec: `docs/superpowers/specs/2026-05-15-cockpit-aimock-harness-lib-design.md` +Plan: `docs/superpowers/plans/2026-05-15-cockpit-aimock-harness-lib.md` +EOF +)" +``` + +- [ ] **Step 6: Watch CI** + +```bash +gh pr checks --watch --interval 30 +``` + +Report when CI completes. + +--- + +## Self-review checklist + +- [x] Spec coverage: + - Library scaffolding → Tasks 1-5 + - Path alias → Task 6 (skippable per Task 0 finding) + - Streaming migration → Task 7 + - c-tool-calls → Task 8 + - Old dir deletion → Task 9 + - CI update → Task 10 + - Risks (path alias, port collisions) → Task 0 + `--parallel=1` flag in Task 10 +- [x] Placeholder scan: no TBD/TODO. Path-alias and `` are intentional implementer-fills based on Task 0 / Task 8 Step 4 findings. +- [x] Type consistency: `AimockHandle`, `AimockStartOptions`, `startAimock`, `sendPromptAndWait`, `createGlobalSetup`, `CreateGlobalSetupOpts` all consistent across tasks. 
+- [x] Constraints: `@copilotkit/aimock` only in TS imports, package.json, plan/spec/README. Not in commit messages. + +## Execution handoff + +Plan complete. Recommended: **subagent-driven-development** with Task 0 dispatched first as a blocking gate. If Task 0 finds aliases don't resolve, the implementer adapts Tasks 7 + 8 to use relative imports throughout (and skips Task 6). From 429f0775b7c72b5d09a1ddc3d2eb5cfd9f86f7f6 Mon Sep 17 00:00:00 2001 From: Brian Love Date: Fri, 15 May 2026 21:06:06 -0700 Subject: [PATCH 03/14] feat(internal-aimock-harness): scaffold internal library --- libs/internal/aimock-harness/README.md | 27 ++++++++++++++++++++++ libs/internal/aimock-harness/project.json | 23 ++++++++++++++++++ libs/internal/aimock-harness/src/index.ts | 4 ++++ libs/internal/aimock-harness/tsconfig.json | 14 +++++++++++ 4 files changed, 68 insertions(+) create mode 100644 libs/internal/aimock-harness/README.md create mode 100644 libs/internal/aimock-harness/project.json create mode 100644 libs/internal/aimock-harness/src/index.ts create mode 100644 libs/internal/aimock-harness/tsconfig.json diff --git a/libs/internal/aimock-harness/README.md b/libs/internal/aimock-harness/README.md new file mode 100644 index 000000000..b10391a94 --- /dev/null +++ b/libs/internal/aimock-harness/README.md @@ -0,0 +1,27 @@ +# @ngaf-internal/aimock-harness + +Internal-only library that wraps [`@copilotkit/aimock`](https://github.com/CopilotKit/aimock) for our cockpit example aimock e2e suite. + +NOT published. The `@ngaf-internal/*` namespace is reserved for internal libraries that are tightly coupled to repo-specific orchestration (langgraph + Angular dev server boot) and shouldn't appear in consumer-facing API surfaces. + +## API + +```typescript +import { createGlobalSetup, sendPromptAndWait } from '@ngaf-internal/aimock-harness'; +``` + +- `createGlobalSetup(opts)` — returns a Playwright globalSetup function that boots aimock + langgraph + the named Angular dev server. 
+- `sendPromptAndWait(page, prompt, opts?)` — Playwright helper. Goes to a path (default `/`), sends the prompt, waits for `chat-message[data-role="assistant"][data-streaming="false"]`, returns the bubble locator. + +## Per-example consumer shape + +``` +cockpit///angular/e2e/ +├── playwright.config.ts // imports createGlobalSetup, passes app-specific opts +├── global-setup-impl.ts // re-exports createGlobalSetup({...}) as default +├── fixtures/.json +├── scripts/record-.py +└── .spec.ts +``` + +See `cockpit/langgraph/streaming/angular/e2e/` for a working example. diff --git a/libs/internal/aimock-harness/project.json b/libs/internal/aimock-harness/project.json new file mode 100644 index 000000000..74d394186 --- /dev/null +++ b/libs/internal/aimock-harness/project.json @@ -0,0 +1,23 @@ +{ + "name": "internal-aimock-harness", + "$schema": "../../../node_modules/nx/schemas/project-schema.json", + "projectType": "library", + "sourceRoot": "libs/internal/aimock-harness/src", + "tags": ["scope:internal"], + "targets": { + "lint": { + "executor": "nx:run-commands", + "options": { + "cwd": "libs/internal/aimock-harness", + "command": "tsc --noEmit" + } + }, + "test": { + "executor": "nx:run-commands", + "options": { + "cwd": "libs/internal/aimock-harness", + "command": "vitest run" + } + } + } +} diff --git a/libs/internal/aimock-harness/src/index.ts b/libs/internal/aimock-harness/src/index.ts new file mode 100644 index 000000000..ea8d48eef --- /dev/null +++ b/libs/internal/aimock-harness/src/index.ts @@ -0,0 +1,4 @@ +// SPDX-License-Identifier: MIT +export { startAimock, type AimockHandle, type AimockStartOptions } from './aimock-runner'; +export { sendPromptAndWait, type SendPromptAndWaitOptions } from './test-helpers'; +export { createGlobalSetup, type CreateGlobalSetupOpts } from './global-setup-factory'; diff --git a/libs/internal/aimock-harness/tsconfig.json b/libs/internal/aimock-harness/tsconfig.json new file mode 100644 index 000000000..7377c0089 --- /dev/null 
+++ b/libs/internal/aimock-harness/tsconfig.json @@ -0,0 +1,14 @@ +{ + "compilerOptions": { + "target": "ES2022", + "module": "ES2022", + "moduleResolution": "Bundler", + "esModuleInterop": true, + "strict": true, + "skipLibCheck": true, + "noEmit": true, + "types": ["node"] + }, + "include": ["src/**/*.ts"], + "exclude": ["node_modules"] +} From 1b0abcee47f9c6acd095012c038ca100c48ebad6 Mon Sep 17 00:00:00 2001 From: Brian Love Date: Fri, 15 May 2026 21:08:20 -0700 Subject: [PATCH 04/14] feat(internal-aimock-harness): port aimock-runner + tests from chat harness --- .../aimock-harness/src/aimock-runner.spec.ts | 71 +++++++++++++++++ .../aimock-harness/src/aimock-runner.ts | 78 +++++++++++++++++++ 2 files changed, 149 insertions(+) create mode 100644 libs/internal/aimock-harness/src/aimock-runner.spec.ts create mode 100644 libs/internal/aimock-harness/src/aimock-runner.ts diff --git a/libs/internal/aimock-harness/src/aimock-runner.spec.ts b/libs/internal/aimock-harness/src/aimock-runner.spec.ts new file mode 100644 index 000000000..7c096476d --- /dev/null +++ b/libs/internal/aimock-harness/src/aimock-runner.spec.ts @@ -0,0 +1,71 @@ +// SPDX-License-Identifier: MIT +import { describe, it, expect, afterEach } from 'vitest'; +import { writeFileSync, mkdtempSync, rmSync } from 'node:fs'; +import { join } from 'node:path'; +import { tmpdir } from 'node:os'; +import { startAimock, type AimockHandle } from './aimock-runner'; + +describe('startAimock', () => { + let handle: AimockHandle | null = null; + let workDir = ''; + + afterEach(async () => { + if (handle) await handle.stop(); + handle = null; + if (workDir) rmSync(workDir, { recursive: true, force: true }); + workDir = ''; + }); + + it('boots a replay server backed by a fixture file', async () => { + workDir = mkdtempSync(join(tmpdir(), 'aimock-test-')); + const fixturePath = join(workDir, 'hi.json'); + writeFileSync( + fixturePath, + JSON.stringify({ + fixtures: [ + { match: { userMessage: 'say hi briefly' }, 
response: { content: 'Hi!' } }, + ], + }), + ); + + handle = await startAimock({ mode: 'replay', fixturePath }); + expect(handle.port).toBeGreaterThan(0); + expect(handle.baseUrl).toMatch(/^http:\/\/.+\/v1$/); + + // The OpenAI SDK call path is exercised in Task 0's de-risk; this + // unit test stops at "the harness started cleanly and exposes the + // documented shape." + }); + + it('stop() is idempotent', async () => { + workDir = mkdtempSync(join(tmpdir(), 'aimock-test-')); + const fixturePath = join(workDir, 'hi.json'); + writeFileSync(fixturePath, JSON.stringify({ fixtures: [] })); + handle = await startAimock({ mode: 'replay', fixturePath }); + await handle.stop(); + await handle.stop(); + expect(true).toBe(true); + }); + + it('loads and merges all .json files in a directory', async () => { + workDir = mkdtempSync(join(tmpdir(), 'aimock-test-')); + writeFileSync( + join(workDir, 'a.json'), + JSON.stringify({ + fixtures: [{ match: { userMessage: 'one' }, response: { content: 'A' } }], + }), + ); + writeFileSync( + join(workDir, 'b.json'), + JSON.stringify({ + fixtures: [{ match: { userMessage: 'two' }, response: { content: 'B' } }], + }), + ); + // Non-JSON file in the dir should be ignored. + writeFileSync(join(workDir, 'README.md'), '# not a fixture'); + + handle = await startAimock({ mode: 'replay', fixturePath: workDir }); + expect(handle.port).toBeGreaterThan(0); + expect(handle.baseUrl).toMatch(/^http:\/\/.+\/v1$/); + }); +}); diff --git a/libs/internal/aimock-harness/src/aimock-runner.ts b/libs/internal/aimock-harness/src/aimock-runner.ts new file mode 100644 index 000000000..5392cb777 --- /dev/null +++ b/libs/internal/aimock-harness/src/aimock-runner.ts @@ -0,0 +1,78 @@ +// SPDX-License-Identifier: MIT +import { LLMock } from '@copilotkit/aimock'; +import { readFileSync, readdirSync, statSync } from 'node:fs'; +import { join } from 'node:path'; + +export interface AimockHandle { + /** Port the mock server is listening on. 
*/
    +  readonly port: number;
    +  /** Full base URL the OpenAI SDK should target (includes /v1 suffix). */
    +  readonly baseUrl: string;
    +  /** Tear down the server. Safe to call multiple times. */
    +  stop(): Promise<void>;
    +}
    +
    +export interface AimockStartOptions {
    +  mode: 'replay';
    +  /** Path to a single fixture file OR a directory of fixture files. */
    +  fixturePath: string;
    +}
    +
    +// Raw JSON entry shape passes through to aimock's FixtureFileEntry; the
    +// `match` block can carry richer discriminators (toolName, hasToolResult,
    +// turnIndex, etc.) that are needed to distinguish a parent LLM's first call
    +// from its continuation after a tool round. We don't narrow the shape here;
    +// aimock's `addFixturesFromJSON` validates structure at load time.
    +type FixtureFileEntry = Record<string, unknown>;
    +
    +function loadFixtureEntries(fixturePath: string): FixtureFileEntry[] {
    +  const stats = statSync(fixturePath);
    +  const out: FixtureFileEntry[] = [];
    +  const readFile = (full: string): void => {
    +    const raw = readFileSync(full, 'utf-8');
    +    const parsed = JSON.parse(raw) as { fixtures: FixtureFileEntry[] };
    +    for (const fx of parsed.fixtures) out.push(fx);
    +  };
    +  if (stats.isDirectory()) {
    +    const files = readdirSync(fixturePath)
    +      .filter((f) => f.endsWith('.json'))
    +      .sort();
    +    for (const file of files) readFile(join(fixturePath, file));
    +    return out;
    +  }
    +  readFile(fixturePath);
    +  return out;
    +}
    +
    +export async function startAimock(opts: AimockStartOptions): Promise<AimockHandle> {
    +  const entries = loadFixtureEntries(opts.fixturePath);
    +
    +  // Use a large chunkSize so each response arrives in 1-2 SSE deltas. This
    +  // intentionally turns off the partial-markdown streaming path for harness
    +  // tests: structural assertions (code fence, list) measure the FINAL rendered
    +  // DOM, not the progressive render. With aggressive default chunking, the
    +  // partial-markdown parser sometimes can't recover a triple-backtick fence
    +  // that gets split mid-token, and the final state ends up as inline <code>
    +  // instead of <pre><code>. Streaming-progressive behavior is covered by the
    +  // Phase 1 unit-variance tables; the e2e harness is for final-state
    +  // invariants and cross-stack integration.
    +  const mock = new LLMock({ port: 0, chunkSize: 4096 });
    +  if (entries.length > 0) {
    +    mock.addFixturesFromJSON(entries as never);
    +  }
    +  await mock.start();
    +
    +  const port = mock.port;
    +  const baseUrl = `${mock.url}/v1`;
    +  let stopped = false;
    +
    +  return {
    +    port,
    +    baseUrl,
    +    async stop() {
    +      if (stopped) return;
    +      stopped = true;
    +      await mock.stop();
    +    },
    +  };
    +}
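
The directory branch of `loadFixtureEntries` above can be illustrated without touching the filesystem. The following is a minimal sketch under stated assumptions: `mergeFixtureFiles` and the in-memory file map are hypothetical stand-ins invented for illustration, not part of the harness or of aimock. It shows only the filter, sort, and concatenate steps.

```typescript
// Hypothetical illustration only: `mergeFixtureFiles` is NOT part of the
// harness. It stands in for the directory branch of loadFixtureEntries,
// with an in-memory map (file name -> file contents) replacing the disk.
type FixtureEntry = Record<string, unknown>;

function mergeFixtureFiles(files: Record<string, string>): FixtureEntry[] {
  const merged: FixtureEntry[] = [];
  // Keep only .json files and visit them in sorted name order, so the
  // merged array is deterministic regardless of listing order.
  const names = Object.keys(files)
    .filter((name) => name.endsWith('.json'))
    .sort();
  for (const name of names) {
    const parsed = JSON.parse(files[name]) as { fixtures: FixtureEntry[] };
    merged.push(...parsed.fixtures);
  }
  return merged;
}

// Files are listed out of order on purpose; README.md must be ignored.
const merged = mergeFixtureFiles({
  'b.json': JSON.stringify({
    fixtures: [{ match: { userMessage: 'two' }, response: { content: 'B' } }],
  }),
  'README.md': '# not a fixture',
  'a.json': JSON.stringify({
    fixtures: [{ match: { userMessage: 'one' }, response: { content: 'A' } }],
  }),
});
console.log(merged.length);
```

If aimock resolves overlapping `match` blocks in array order (an assumption worth checking against its documentation), the sorted file names would decide which fixture wins when two files match the same request.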
    
    From 74aa34575e484e3158c27ab4a71bc695c3e70a8e Mon Sep 17 00:00:00 2001
    From: Brian Love 
    Date: Fri, 15 May 2026 22:20:26 -0700
    Subject: [PATCH 05/14] feat(internal-aimock-harness): test-helpers with
     configurable path
    
    ---
     .../aimock-harness/src/test-helpers.spec.ts   | 18 ++++++++
     .../aimock-harness/src/test-helpers.ts        | 42 +++++++++++++++++++
     2 files changed, 60 insertions(+)
     create mode 100644 libs/internal/aimock-harness/src/test-helpers.spec.ts
     create mode 100644 libs/internal/aimock-harness/src/test-helpers.ts
    
    diff --git a/libs/internal/aimock-harness/src/test-helpers.spec.ts b/libs/internal/aimock-harness/src/test-helpers.spec.ts
    new file mode 100644
    index 000000000..4c90d3007
    --- /dev/null
    +++ b/libs/internal/aimock-harness/src/test-helpers.spec.ts
    @@ -0,0 +1,18 @@
    +// SPDX-License-Identifier: MIT
    +import { describe, it, expect } from 'vitest';
    +import type { SendPromptAndWaitOptions } from './test-helpers';
    +
    +// The helper itself is integration-level (drives a real Playwright page);
    +// per-example specs exercise it. This file just locks in the type contract.
    +
    +describe('SendPromptAndWaitOptions', () => {
    +  it('accepts an empty options object', () => {
    +    const opts: SendPromptAndWaitOptions = {};
    +    expect(opts.path).toBeUndefined();
    +  });
    +
    +  it('accepts a path override', () => {
    +    const opts: SendPromptAndWaitOptions = { path: '/embed' };
    +    expect(opts.path).toBe('/embed');
    +  });
    +});
    diff --git a/libs/internal/aimock-harness/src/test-helpers.ts b/libs/internal/aimock-harness/src/test-helpers.ts
    new file mode 100644
    index 000000000..d8bf44f47
    --- /dev/null
    +++ b/libs/internal/aimock-harness/src/test-helpers.ts
    @@ -0,0 +1,42 @@
    +// SPDX-License-Identifier: MIT
    +import { expect, type Locator, type Page } from '@playwright/test';
    +
    +export interface SendPromptAndWaitOptions {
    +  /** Route to navigate to before sending the prompt. Default: '/'. */
    +  path?: string;
    +}
    +
    +/**
    + * Send a user prompt and wait for the assistant bubble to finalize.
    + *
    + * "Finalized" means `chat-message[data-role="assistant"][data-streaming="false"]`:
    + * the chat composition wires `[streaming]` to `agent.isLoading() && i === lastIndex`
    + * on the latest assistant `<chat-message>`, so the attribute flips to `"false"`
    + * once the agent stops loading and the markdown render has settled.
    + *
    + * Asserting on intermediate streaming-state DOM (partial `
      `, in-flight
    + * code fences, etc.) is the source of e2e flake; always wait on this
    + * attribute before counting or text-matching downstream of the assistant turn.
    + */
    +export async function sendPromptAndWait(
    +  page: Page,
    +  prompt: string,
    +  opts?: SendPromptAndWaitOptions,
    +): Promise<Locator> {
    +  const path = opts?.path ?? '/';
    +  await page.goto(path);
    +  const input = page.getByRole('textbox', { name: /message|prompt/i });
    +  await input.fill(prompt);
    +  await page.getByRole('button', { name: /send/i }).click();
    +
    +  const finalizedAssistant = page
    +    .locator('chat-message[data-role="assistant"][data-streaming="false"]')
    +    .last();
    +  await expect(finalizedAssistant).toBeAttached({ timeout: 45_000 });
    +  await expect
    +    .poll(async () => ((await finalizedAssistant.innerText()) ?? '').trim().length, {
    +      timeout: 30_000,
    +    })
    +    .toBeGreaterThan(0);
    +  return finalizedAssistant;
    +}

    From 06cd32d75651b62e361c532f0514fe3bf5c34525 Mon Sep 17 00:00:00 2001
    From: Brian Love
    Date: Fri, 15 May 2026 22:22:44 -0700
    Subject: [PATCH 06/14] feat(internal-aimock-harness): global-teardown with
     multi-slot state

    ---
     .../aimock-harness/src/global-teardown.ts     | 30 +++++++++++++++++++
     1 file changed, 30 insertions(+)
     create mode 100644 libs/internal/aimock-harness/src/global-teardown.ts

    diff --git a/libs/internal/aimock-harness/src/global-teardown.ts b/libs/internal/aimock-harness/src/global-teardown.ts
    new file mode 100644
    index 000000000..678dbc3ba
    --- /dev/null
    +++ b/libs/internal/aimock-harness/src/global-teardown.ts
    @@ -0,0 +1,30 @@
    +// SPDX-License-Identifier: MIT
    +import type { ChildProcess } from 'node:child_process';
    +import type { AimockHandle } from './aimock-runner';
    +
    +interface SharedState {
    +  aimock: AimockHandle;
    +  langgraph: ChildProcess;
    +  angular: ChildProcess;
    +}
    +
    +declare global {
    +  // eslint-disable-next-line no-var
    +  var __AIMOCK_HARNESS_STATE__: Map<string, SharedState> | undefined;
    +}
    +
    +/**
    + * Default Playwright globalTeardown.
Walks every state slot the factory
    + * registered (one per Angular project), kills processes in reverse order
    + * (Angular → langgraph → aimock), awaits aimock stop. Idempotent.
    + */
    +export default async function globalTeardown(): Promise<void> {
    +  const states = globalThis.__AIMOCK_HARNESS_STATE__;
    +  if (!states) return;
    +  for (const state of states.values()) {
    +    state.angular.kill('SIGTERM');
    +    state.langgraph.kill('SIGTERM');
    +    await state.aimock.stop();
    +  }
    +  globalThis.__AIMOCK_HARNESS_STATE__ = undefined;
    +}

    From f77b95c94b0d5b0dc506ed64677c7145ec378e0c Mon Sep 17 00:00:00 2001
    From: Brian Love
    Date: Fri, 15 May 2026 22:25:21 -0700
    Subject: [PATCH 07/14] feat(internal-aimock-harness): createGlobalSetup
     factory

    ---
     .../src/global-setup-factory.ts               | 118 ++++++++++++++++++
     1 file changed, 118 insertions(+)
     create mode 100644 libs/internal/aimock-harness/src/global-setup-factory.ts

    diff --git a/libs/internal/aimock-harness/src/global-setup-factory.ts b/libs/internal/aimock-harness/src/global-setup-factory.ts
    new file mode 100644
    index 000000000..f7d069de0
    --- /dev/null
    +++ b/libs/internal/aimock-harness/src/global-setup-factory.ts
    @@ -0,0 +1,118 @@
    +// SPDX-License-Identifier: MIT
    +import { spawn, type ChildProcess } from 'node:child_process';
    +import { setTimeout as delay } from 'node:timers/promises';
    +import { resolve } from 'node:path';
    +import { startAimock, type AimockHandle } from './aimock-runner';
    +
    +export interface CreateGlobalSetupOpts {
    +  /** Repo-relative path to the python langgraph project. */
    +  langgraphCwd: string;
    +  /** Port the langgraph dev server binds. Default: 8123. */
    +  langgraphPort?: number;
    +  /** Nx project name of the Angular dev server. */
    +  angularProject: string;
    +  /** Port the Angular dev server should bind. */
    +  angularPort: number;
    +  /** Absolute path to the per-example fixtures dir. */
    +  fixturesDir: string;
    +  /** Default 90_000. */
    +  langgraphReadyTimeoutMs?: number;
    +  /** Default 120_000.
*/
    +  angularReadyTimeoutMs?: number;
    +}
    +
    +interface SharedState {
    +  aimock: AimockHandle;
    +  langgraph: ChildProcess;
    +  angular: ChildProcess;
    +}
    +
    +declare global {
    +  // eslint-disable-next-line no-var
    +  var __AIMOCK_HARNESS_STATE__: Map<string, SharedState> | undefined;
    +}
    +
    +async function waitForPort(url: string, timeoutMs: number, label: string): Promise<void> {
    +  const start = Date.now();
    +  while (Date.now() - start < timeoutMs) {
    +    try {
    +      const res = await fetch(url);
    +      if (res.ok || res.status === 404) return;
    +    } catch {
    +      // server not up yet
    +    }
    +    await delay(500);
    +  }
    +  throw new Error(`[${label}] not ready at ${url} within ${timeoutMs}ms`);
    +}
    +
    +function repoRoot(opts: CreateGlobalSetupOpts): string {
    +  // The factory is called from per-example playwright configs that themselves
    +  // live many levels deep. We compute REPO_ROOT relative to the fixturesDir
    +  // (which the consumer passes as an absolute path) so the consumer doesn't
    +  // need to pass it explicitly. The repo root is the nearest ancestor that
    +  // contains a `cockpit/` directory; for our layout, walking up from any
    +  // per-example fixturesDir hits the repo root in 5 levels.
    +  let dir = opts.fixturesDir;
    +  for (let i = 0; i < 10; i++) {
    +    if (require('node:fs').existsSync(require('node:path').join(dir, 'cockpit'))) {
    +      return dir;
    +    }
    +    dir = require('node:path').dirname(dir);
    +  }
    +  throw new Error('repo root not found from fixturesDir; passed: ' + opts.fixturesDir);
    +}
    +
    +export function createGlobalSetup(opts: CreateGlobalSetupOpts): () => Promise<void> {
    +  const langgraphPort = opts.langgraphPort ?? 8123;
    +  const langgraphTimeout = opts.langgraphReadyTimeoutMs ?? 90_000;
    +  const angularTimeout = opts.angularReadyTimeoutMs ??
120_000;
    +
    +  return async function globalSetup(): Promise<void> {
    +    const root = repoRoot(opts);
    +    const aimock = await startAimock({ mode: 'replay', fixturePath: opts.fixturesDir });
    +    // eslint-disable-next-line no-console
    +    console.log(`[aimock-harness] aimock listening at ${aimock.baseUrl}`);
    +
    +    const langgraph = spawn(
    +      'uv',
    +      ['run', 'langgraph', 'dev', '--port', String(langgraphPort), '--no-browser'],
    +      {
    +        cwd: resolve(root, opts.langgraphCwd),
    +        env: {
    +          ...process.env,
    +          OPENAI_BASE_URL: aimock.baseUrl,
    +          OPENAI_API_KEY: 'test-not-used',
    +        },
    +        stdio: 'pipe',
    +      },
    +    );
    +    langgraph.stdout?.on('data', (b) => process.stdout.write(`[langgraph] ${b}`));
    +    langgraph.stderr?.on('data', (b) => process.stderr.write(`[langgraph] ${b}`));
    +
    +    await waitForPort(`http://localhost:${langgraphPort}/ok`, langgraphTimeout, 'langgraph');
    +    // eslint-disable-next-line no-console
    +    console.log(`[aimock-harness] langgraph ready on :${langgraphPort}`);
    +
    +    const angular = spawn(
    +      'npx',
    +      ['nx', 'serve', opts.angularProject, '--port', String(opts.angularPort)],
    +      {
    +        cwd: root,
    +        env: { ...process.env },
    +        stdio: 'pipe',
    +      },
    +    );
    +    angular.stdout?.on('data', (b) => process.stdout.write(`[angular] ${b}`));
    +    angular.stderr?.on('data', (b) => process.stderr.write(`[angular] ${b}`));
    +
    +    await waitForPort(`http://localhost:${opts.angularPort}/`, angularTimeout, 'angular');
    +    // eslint-disable-next-line no-console
    +    console.log(`[aimock-harness] angular ready on :${opts.angularPort} (${opts.angularProject})`);
    +
    +    if (!globalThis.__AIMOCK_HARNESS_STATE__) {
    +      globalThis.__AIMOCK_HARNESS_STATE__ = new Map();
    +    }
    +    globalThis.__AIMOCK_HARNESS_STATE__.set(opts.angularProject, { aimock, langgraph, angular });
    +  };
    +}

    From 2b0b52e923b1d11d900b7b307ebc64815bc2b8af Mon Sep 17 00:00:00 2001
    From: Brian Love
    Date: Fri, 15 May 2026 22:28:42 -0700
    Subject: [PATCH 08/14] chore(tsconfig): add @ngaf-internal/aimock-harness path
     alias

    ---
     tsconfig.base.json | 4 +++-
     1 file changed, 3
insertions(+), 1 deletion(-) diff --git a/tsconfig.base.json b/tsconfig.base.json index 470b9d414..d40c120ed 100644 --- a/tsconfig.base.json +++ b/tsconfig.base.json @@ -36,7 +36,9 @@ "@ngaf/render": ["libs/render/src/public-api.ts"], "@ngaf/telemetry": ["libs/telemetry/src/index.ts"], "@ngaf/telemetry/browser": ["libs/telemetry/src/browser/public-api.ts"], - "@ngaf/telemetry/node": ["libs/telemetry/src/node/index.ts"] + "@ngaf/telemetry/node": ["libs/telemetry/src/node/index.ts"], + "@ngaf-internal/aimock-harness": ["libs/internal/aimock-harness/src/index.ts"], + "@ngaf-internal/aimock-harness/global-teardown": ["libs/internal/aimock-harness/src/global-teardown.ts"] }, "skipLibCheck": true, "strict": true, From be7ecf4a2850ac7ba2e4cfa54452efe568ecf868 Mon Sep 17 00:00:00 2001 From: Brian Love Date: Fri, 15 May 2026 22:40:10 -0700 Subject: [PATCH 09/14] feat(cockpit-langgraph-streaming): migrate aimock e2e to per-example layout --- .../streaming/angular/e2e/.gitignore | 3 + .../angular/e2e/fixtures/streaming.json | 12 ++++ .../angular/e2e/global-setup-impl.ts | 10 ++++ .../angular/e2e/playwright.config.ts | 18 ++++++ .../angular/e2e/scripts/record-streaming.py | 58 +++++++++++++++++++ .../streaming/angular/e2e/streaming.spec.ts | 43 +++++--------- .../streaming/angular/e2e/tsconfig.json | 14 +++++ .../langgraph/streaming/angular/project.json | 6 ++ 8 files changed, 136 insertions(+), 28 deletions(-) create mode 100644 cockpit/langgraph/streaming/angular/e2e/.gitignore create mode 100644 cockpit/langgraph/streaming/angular/e2e/fixtures/streaming.json create mode 100644 cockpit/langgraph/streaming/angular/e2e/global-setup-impl.ts create mode 100644 cockpit/langgraph/streaming/angular/e2e/playwright.config.ts create mode 100644 cockpit/langgraph/streaming/angular/e2e/scripts/record-streaming.py create mode 100644 cockpit/langgraph/streaming/angular/e2e/tsconfig.json diff --git a/cockpit/langgraph/streaming/angular/e2e/.gitignore 
b/cockpit/langgraph/streaming/angular/e2e/.gitignore new file mode 100644 index 000000000..059a55910 --- /dev/null +++ b/cockpit/langgraph/streaming/angular/e2e/.gitignore @@ -0,0 +1,3 @@ +test-results/ +playwright-report/ +*.tmp diff --git a/cockpit/langgraph/streaming/angular/e2e/fixtures/streaming.json b/cockpit/langgraph/streaming/angular/e2e/fixtures/streaming.json new file mode 100644 index 000000000..d54869ff9 --- /dev/null +++ b/cockpit/langgraph/streaming/angular/e2e/fixtures/streaming.json @@ -0,0 +1,12 @@ +{ + "fixtures": [ + { + "match": { + "userMessage": "Tell me one quick fact about Angular signals in two sentences." + }, + "response": { + "content": "Angular signals are a reactive primitive (signal, computed, effect) that track dependencies to provide fine-grained reactivity and more efficient change detection. They let you update state synchronously via set()/update() and ensure only consumers that read an affected signal are re\u2011evaluated." + } + } + ] +} diff --git a/cockpit/langgraph/streaming/angular/e2e/global-setup-impl.ts b/cockpit/langgraph/streaming/angular/e2e/global-setup-impl.ts new file mode 100644 index 000000000..3f2390e7b --- /dev/null +++ b/cockpit/langgraph/streaming/angular/e2e/global-setup-impl.ts @@ -0,0 +1,10 @@ +// SPDX-License-Identifier: MIT +import { resolve } from 'node:path'; +import { createGlobalSetup } from '../../../../../libs/internal/aimock-harness/src'; + +export default createGlobalSetup({ + langgraphCwd: 'cockpit/langgraph/streaming/python', + angularProject: 'cockpit-langgraph-streaming-angular', + angularPort: 4300, + fixturesDir: resolve(__dirname, 'fixtures'), +}); diff --git a/cockpit/langgraph/streaming/angular/e2e/playwright.config.ts b/cockpit/langgraph/streaming/angular/e2e/playwright.config.ts new file mode 100644 index 000000000..9ec8c24f5 --- /dev/null +++ b/cockpit/langgraph/streaming/angular/e2e/playwright.config.ts @@ -0,0 +1,18 @@ +// SPDX-License-Identifier: MIT +import { defineConfig, 
devices } from '@playwright/test'; + +export default defineConfig({ + testDir: '.', + testMatch: '**/*.spec.ts', + fullyParallel: false, + workers: 1, + retries: process.env.CI ? 2 : 0, + reporter: process.env.CI ? [['list'], ['html', { open: 'never' }]] : 'list', + use: { + baseURL: 'http://localhost:4300', + trace: 'retain-on-failure', + }, + projects: [{ name: 'chromium', use: { ...devices['Desktop Chrome'] } }], + globalSetup: './global-setup-impl.ts', + globalTeardown: require.resolve('../../../../../libs/internal/aimock-harness/src/global-teardown'), +}); diff --git a/cockpit/langgraph/streaming/angular/e2e/scripts/record-streaming.py b/cockpit/langgraph/streaming/angular/e2e/scripts/record-streaming.py new file mode 100644 index 000000000..3a9228085 --- /dev/null +++ b/cockpit/langgraph/streaming/angular/e2e/scripts/record-streaming.py @@ -0,0 +1,58 @@ +"""Capture a real text response from the streaming graph's LLM. + +Mirrors cockpit/langgraph/streaming/python/src/graph.py's +build_streaming_graph() setup: ChatOpenAI(gpt-5-mini, streaming=True) ++ system prompt from prompts/streaming.md. + +Run from repo root: + OPENAI_API_KEY=sk-... uv run --project cockpit/langgraph/streaming/python \ + python apps/cockpit/e2e/scripts/record-streaming.py +""" +import json +import os +import sys +from pathlib import Path + +env_path = Path("cockpit/langgraph/streaming/python/.env") +if env_path.exists(): + for line in env_path.read_text().splitlines(): + line = line.strip() + if line and not line.startswith("#") and "=" in line: + k, _, v = line.partition("=") + os.environ.setdefault(k.strip(), v.strip().strip('"').strip("'")) + +if not os.environ.get("OPENAI_API_KEY"): + print("OPENAI_API_KEY not set (in env or .env)", file=sys.stderr) + sys.exit(1) + +from langchain_core.messages import HumanMessage, SystemMessage +from langchain_openai import ChatOpenAI + +PROMPT = "Tell me one quick fact about Angular signals in two sentences." 
+SYSTEM_PROMPT = ( + Path("cockpit/langgraph/streaming/python/prompts/streaming.md").read_text() +) + +llm = ChatOpenAI(model="gpt-5-mini", temperature=0) +response = llm.invoke( + [SystemMessage(content=SYSTEM_PROMPT), HumanMessage(content=PROMPT)], +) +text = response.content if isinstance(response.content, str) else "" +if not text.strip(): + print("LLM returned empty content; cannot build fixture", file=sys.stderr) + sys.exit(2) +print(f"captured {len(text)} chars; first 80: {text[:80]!r}") + +fixture = { + "fixtures": [ + { + "match": {"userMessage": PROMPT}, + "response": {"content": text}, + } + ] +} + +out_path = Path("apps/cockpit/e2e/fixtures/streaming.json") +out_path.parent.mkdir(parents=True, exist_ok=True) +out_path.write_text(json.dumps(fixture, indent=2) + "\n") +print(f"\nWrote fixture to {out_path}") diff --git a/cockpit/langgraph/streaming/angular/e2e/streaming.spec.ts b/cockpit/langgraph/streaming/angular/e2e/streaming.spec.ts index b5938670f..18231181a 100644 --- a/cockpit/langgraph/streaming/angular/e2e/streaming.spec.ts +++ b/cockpit/langgraph/streaming/angular/e2e/streaming.spec.ts @@ -1,31 +1,18 @@ -import { expect, test } from '@playwright/test'; +// SPDX-License-Identifier: MIT +import { test, expect } from '@playwright/test'; +import { sendPromptAndWait } from '../../../../../libs/internal/aimock-harness/src'; -test.describe('LangGraph Streaming Example', () => { - test.beforeEach(async ({ page }) => { - await page.goto('http://localhost:4300'); - await page.waitForSelector('app-streaming', { state: 'attached' }); - }); +test('streaming: assistant text from the mocked LLM renders in the cockpit chat composition', async ({ page }) => { + const bubble = await sendPromptAndWait( + page, + 'Tell me one quick fact about Angular signals in two sentences.', + ); - test('renders the chat interface', async ({ page }) => { - await expect(page.locator('textarea[name="messageText"]')).toBeVisible(); - await 
expect(page.locator('button[type="submit"]')).toBeVisible(); - await expect(page.locator('button[type="submit"]')).toHaveText('Send'); - }); - - test('sends a message and receives a streamed response', async ({ page }) => { - // Type a message - await page.fill('textarea[name="messageText"]', 'Say exactly: test response ok'); - - // Click send - await page.click('button[type="submit"]'); - - // Wait for the AI response to appear - await expect(page.locator('.chat-md').first()).toBeVisible({ timeout: 30000 }); - - // The AI response should have content - await expect(page.locator('.chat-md').first()).not.toBeEmpty({ timeout: 30000 }); - - // The button should show Send again (not Streaming...) - await expect(page.locator('button[type="submit"]')).toHaveText('Send', { timeout: 30000 }); - }); + // The captured fixture's content (Angular signals fact) must reach the + // rendered bubble. Proves: aimock served the streaming graph's LLM call, + // langgraph routed back the AI message, the cockpit-langgraph-streaming-angular + // app rendered it via the chat composition, and the streaming-finalized + // signal (data-streaming="false") settled. 
+ const finalText = await bubble.innerText(); + expect(finalText.toLowerCase()).toContain('signal'); }); diff --git a/cockpit/langgraph/streaming/angular/e2e/tsconfig.json b/cockpit/langgraph/streaming/angular/e2e/tsconfig.json new file mode 100644 index 000000000..0b5aeecbf --- /dev/null +++ b/cockpit/langgraph/streaming/angular/e2e/tsconfig.json @@ -0,0 +1,14 @@ +{ + "compilerOptions": { + "target": "ES2022", + "module": "ES2022", + "moduleResolution": "Bundler", + "esModuleInterop": true, + "strict": true, + "skipLibCheck": true, + "noEmit": true, + "types": ["node"] + }, + "include": ["**/*.ts"], + "exclude": ["node_modules", "test-results", "playwright-report"] +} diff --git a/cockpit/langgraph/streaming/angular/project.json b/cockpit/langgraph/streaming/angular/project.json index 6f75b03d1..f059e9715 100644 --- a/cockpit/langgraph/streaming/angular/project.json +++ b/cockpit/langgraph/streaming/angular/project.json @@ -56,6 +56,12 @@ "cwd": "cockpit/langgraph/streaming/angular", "command": "npx tsx -e \"import { langgraphStreamingAngularModule } from './src/index.ts'; const module = langgraphStreamingAngularModule; if (module.id !== 'langgraph-streaming-angular' || module.title !== 'LangGraph Streaming (Angular)') { throw new Error('Unexpected module shape for ' + module.id); }\"" } + }, + "e2e": { + "executor": "@nx/playwright:playwright", + "options": { + "config": "cockpit/langgraph/streaming/angular/e2e/playwright.config.ts" + } } } } From 67f93cd8097316fcc0e4d9d22577a32c0a1b7ee2 Mon Sep 17 00:00:00 2001 From: Brian Love Date: Fri, 15 May 2026 23:21:54 -0700 Subject: [PATCH 10/14] fix(internal-aimock-harness): wait for agent-idle signal not per-message streaming --- .../tool-calls/angular/e2e/tool-calls.spec.ts | 20 ------------- .../aimock-harness/src/test-helpers.ts | 29 ++++++++++++++----- 2 files changed, 22 insertions(+), 27 deletions(-) delete mode 100644 cockpit/chat/tool-calls/angular/e2e/tool-calls.spec.ts diff --git 
a/cockpit/chat/tool-calls/angular/e2e/tool-calls.spec.ts b/cockpit/chat/tool-calls/angular/e2e/tool-calls.spec.ts deleted file mode 100644 index 5f33c76f6..000000000 --- a/cockpit/chat/tool-calls/angular/e2e/tool-calls.spec.ts +++ /dev/null @@ -1,20 +0,0 @@ -import { expect, test } from '@playwright/test'; - -test.describe('Chat Tool Calls Example', () => { - test.beforeEach(async ({ page }) => { - await page.goto('http://localhost:4504'); - await page.waitForSelector('app-tool-calls', { state: 'attached' }); - }); - - test('renders the chat interface with tool calls sidebar', async ({ page }) => { - await expect(page.locator('chat')).toBeVisible(); - await expect(page.locator('aside')).toBeVisible(); - await expect(page.locator('aside h3')).toHaveText('Tool Calls'); - }); - - test('displays the available tools list', async ({ page }) => { - await expect(page.locator('aside')).toContainText('search'); - await expect(page.locator('aside')).toContainText('calculator'); - await expect(page.locator('aside')).toContainText('weather'); - }); -}); diff --git a/libs/internal/aimock-harness/src/test-helpers.ts b/libs/internal/aimock-harness/src/test-helpers.ts index d8bf44f47..4231bef23 100644 --- a/libs/internal/aimock-harness/src/test-helpers.ts +++ b/libs/internal/aimock-harness/src/test-helpers.ts @@ -27,16 +27,31 @@ export async function sendPromptAndWait( await page.goto(path); const input = page.getByRole('textbox', { name: /message|prompt/i }); await input.fill(prompt); - await page.getByRole('button', { name: /send/i }).click(); + // Capture the send button BEFORE click — same node will flip to "Stop + // generating" while loading, then back to "Send" when the agent finishes + // ALL turns (tool calls + continuations included). + const sendButton = page.getByRole('button', { name: /send/i }); + await sendButton.click(); + // Wait for the agent to enter the loading state (Stop generating visible). + // Brief — typically <1s. 
Catches the case where the click didn't dispatch. + await expect(page.getByRole('button', { name: /stop generating/i })).toBeVisible({ + timeout: 10_000, + }); + + // Now wait for the agent to fully finish: Stop generating gone, Send back. + // This is the durable agent-level idle signal — survives multi-turn flows + // (tool_call → tool_result → continuation). Per-message data-streaming + // flips multiple times during a single turn and races with .last(). + await expect(page.getByRole('button', { name: /stop generating/i })).not.toBeAttached({ + timeout: 60_000, + }); + + // Return the last finalized assistant bubble — guaranteed to be the + // FINAL message in the turn now that the agent is fully idle. const finalizedAssistant = page .locator('chat-message[data-role="assistant"][data-streaming="false"]') .last(); - await expect(finalizedAssistant).toBeAttached({ timeout: 45_000 }); - await expect - .poll(async () => ((await finalizedAssistant.innerText()) ?? '').trim().length, { - timeout: 30_000, - }) - .toBeGreaterThan(0); + await expect(finalizedAssistant).toBeAttached({ timeout: 5_000 }); return finalizedAssistant; } From e30b50f4d565801db064e28006f28bfdf54a22bd Mon Sep 17 00:00:00 2001 From: Brian Love Date: Fri, 15 May 2026 23:43:46 -0700 Subject: [PATCH 11/14] =?UTF-8?q?test(cockpit-chat-tool-calls):=20aimock?= =?UTF-8?q?=20e2e=20=E2=80=94=20multi-turn=20tool-call=20flow?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .../chat/tool-calls/angular/e2e/.gitignore | 3 + .../angular/e2e/c-tool-calls.spec.ts | 20 ++++ .../angular/e2e/fixtures/c-tool-calls.json | 28 ++++++ .../angular/e2e/global-setup-impl.ts | 10 ++ .../angular/e2e/playwright.config.ts | 18 ++++ .../e2e/scripts/record-c-tool-calls.py | 95 +++++++++++++++++++ .../chat/tool-calls/angular/e2e/tsconfig.json | 14 +++ cockpit/chat/tool-calls/angular/project.json | 6 ++ 8 files changed, 194 insertions(+) create mode 100644 
cockpit/chat/tool-calls/angular/e2e/.gitignore create mode 100644 cockpit/chat/tool-calls/angular/e2e/c-tool-calls.spec.ts create mode 100644 cockpit/chat/tool-calls/angular/e2e/fixtures/c-tool-calls.json create mode 100644 cockpit/chat/tool-calls/angular/e2e/global-setup-impl.ts create mode 100644 cockpit/chat/tool-calls/angular/e2e/playwright.config.ts create mode 100644 cockpit/chat/tool-calls/angular/e2e/scripts/record-c-tool-calls.py create mode 100644 cockpit/chat/tool-calls/angular/e2e/tsconfig.json diff --git a/cockpit/chat/tool-calls/angular/e2e/.gitignore b/cockpit/chat/tool-calls/angular/e2e/.gitignore new file mode 100644 index 000000000..059a55910 --- /dev/null +++ b/cockpit/chat/tool-calls/angular/e2e/.gitignore @@ -0,0 +1,3 @@ +test-results/ +playwright-report/ +*.tmp diff --git a/cockpit/chat/tool-calls/angular/e2e/c-tool-calls.spec.ts b/cockpit/chat/tool-calls/angular/e2e/c-tool-calls.spec.ts new file mode 100644 index 000000000..2aa396c68 --- /dev/null +++ b/cockpit/chat/tool-calls/angular/e2e/c-tool-calls.spec.ts @@ -0,0 +1,20 @@ +// SPDX-License-Identifier: MIT +import { test, expect } from '@playwright/test'; +import { sendPromptAndWait } from '../../../../../libs/internal/aimock-harness/src'; + +const PROMPT = "What's the status of UA123?"; + +test('c-tool-calls: parent dispatches lookup_flight tool, continuation surfaces flight data', async ({ page }) => { + const bubble = await sendPromptAndWait(page, PROMPT); + + // The chat-tool-calls primitive renders a card per tool call. Card label + // includes the tool name. Asserting it's in the DOM proves the parent's + // tool_call routed through the chat-tool-calls UI primitive. + const toolCallChip = page.getByRole('button', { name: /lookup_flight|tool/i }).first(); + await expect(toolCallChip).toBeVisible({ timeout: 30_000 }); + + // The continuation's text mentions a distinctive phrase from the captured + // response — proves the tool-result-then-text loop completed end-to-end. 
+ const finalText = await bubble.innerText(); + expect(finalText.toLowerCase()).toContain('ua123'); +}); diff --git a/cockpit/chat/tool-calls/angular/e2e/fixtures/c-tool-calls.json b/cockpit/chat/tool-calls/angular/e2e/fixtures/c-tool-calls.json new file mode 100644 index 000000000..c7527f58c --- /dev/null +++ b/cockpit/chat/tool-calls/angular/e2e/fixtures/c-tool-calls.json @@ -0,0 +1,28 @@ +{ + "fixtures": [ + { + "match": { + "userMessage": "What's the status of UA123?", + "hasToolResult": true + }, + "response": { + "content": "I looked up UA123 using lookup_flight. Status: on time.\n\nSummary:\n- Flight: UA123 (United Airlines)\n- Route: LAX \u2192 JFK\n- Departure (local): 08:00 from gate B14\n- Arrival (local): 16:30\n- Aircraft: Boeing 787\n- Scheduled duration: 5 hr 30 min\n\nWould you like me to monitor this flight for updates or find alternate flights?" + } + }, + { + "match": { + "userMessage": "What's the status of UA123?" + }, + "response": { + "toolCalls": [ + { + "name": "lookup_flight", + "arguments": { + "flight_number": "UA123" + } + } + ] + } + } + ] +} diff --git a/cockpit/chat/tool-calls/angular/e2e/global-setup-impl.ts b/cockpit/chat/tool-calls/angular/e2e/global-setup-impl.ts new file mode 100644 index 000000000..8cd6a301a --- /dev/null +++ b/cockpit/chat/tool-calls/angular/e2e/global-setup-impl.ts @@ -0,0 +1,10 @@ +// SPDX-License-Identifier: MIT +import { resolve } from 'node:path'; +import { createGlobalSetup } from '../../../../../libs/internal/aimock-harness/src'; + +export default createGlobalSetup({ + langgraphCwd: 'cockpit/langgraph/streaming/python', + angularProject: 'cockpit-chat-tool-calls-angular', + angularPort: 4504, + fixturesDir: resolve(__dirname, 'fixtures'), +}); diff --git a/cockpit/chat/tool-calls/angular/e2e/playwright.config.ts b/cockpit/chat/tool-calls/angular/e2e/playwright.config.ts new file mode 100644 index 000000000..d473509d5 --- /dev/null +++ b/cockpit/chat/tool-calls/angular/e2e/playwright.config.ts @@ -0,0 
+1,18 @@ +// SPDX-License-Identifier: MIT +import { defineConfig, devices } from '@playwright/test'; + +export default defineConfig({ + testDir: '.', + testMatch: '**/*.spec.ts', + fullyParallel: false, + workers: 1, + retries: process.env.CI ? 2 : 0, + reporter: process.env.CI ? [['list'], ['html', { open: 'never' }]] : 'list', + use: { + baseURL: 'http://localhost:4504', + trace: 'retain-on-failure', + }, + projects: [{ name: 'chromium', use: { ...devices['Desktop Chrome'] } }], + globalSetup: './global-setup-impl.ts', + globalTeardown: require.resolve('../../../../../libs/internal/aimock-harness/src/global-teardown'), +}); diff --git a/cockpit/chat/tool-calls/angular/e2e/scripts/record-c-tool-calls.py b/cockpit/chat/tool-calls/angular/e2e/scripts/record-c-tool-calls.py new file mode 100644 index 000000000..a0d529fe7 --- /dev/null +++ b/cockpit/chat/tool-calls/angular/e2e/scripts/record-c-tool-calls.py @@ -0,0 +1,95 @@ +"""Capture parent first-call (tool_call) + continuation (text) for c-tool-calls. + +Mirrors cockpit/langgraph/streaming/python/src/chat_graphs.py's +_build_tool_calls_graph() LLM setup: ChatOpenAI(gpt-5-mini, streaming=True) +bound with AVIATION_TOOLS, system prompt from prompts/tool-calls.md. + +Two LLM calls captured, written into one fixture with the hasToolResult +discriminator on the continuation entry. + +Run from repo root: + OPENAI_API_KEY=sk-... 
uv run --project cockpit/langgraph/streaming/python \ + python cockpit/chat/tool-calls/angular/e2e/scripts/record-c-tool-calls.py +""" +import asyncio +import json +import os +import sys +import uuid +from pathlib import Path + +env_path = Path("cockpit/langgraph/streaming/python/.env") +if env_path.exists(): + for line in env_path.read_text().splitlines(): + line = line.strip() + if line and not line.startswith("#") and "=" in line: + k, _, v = line.partition("=") + os.environ.setdefault(k.strip(), v.strip().strip('"').strip("'")) + +if not os.environ.get("OPENAI_API_KEY"): + print("OPENAI_API_KEY not set", file=sys.stderr) + sys.exit(1) + +sys.path.insert(0, str(Path("cockpit/langgraph/streaming/python/src").resolve())) + +from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, ToolMessage +from langchain_openai import ChatOpenAI + +from src.aviation_tools import ALL_TOOLS as AVIATION_TOOLS, lookup_flight # type: ignore + +PROMPT = "What's the status of UA123?" +SYSTEM_PROMPT = ( + Path("cockpit/langgraph/streaming/python/prompts/tool-calls.md").read_text() +) + +llm = ChatOpenAI(model="gpt-5-mini", temperature=0).bind_tools(AVIATION_TOOLS) + +# 1. Parent's first call. +first = llm.invoke([SystemMessage(content=SYSTEM_PROMPT), HumanMessage(content=PROMPT)]) +assert first.tool_calls, f"Parent did not emit tool_calls; content={first.content!r}" +tc = first.tool_calls[0] +tc_args = tc.get("args") or {} +tc_id = tc.get("id") or f"call_{uuid.uuid4().hex[:12]}" +print(f"1. parent tool_call name={tc.get('name')} args={tc_args}") + +# 2. Tool result (real lookup_flight). +tool_result = asyncio.run(lookup_flight.ainvoke(tc_args)) # returns canned aviation data +print(f"2. tool result length={len(str(tool_result))}") + +# 3. Parent's continuation call. 
+continuation = llm.invoke( + [ + SystemMessage(content=SYSTEM_PROMPT), + HumanMessage(content=PROMPT), + AIMessage( + content="", + tool_calls=[{"name": tc.get("name"), "args": tc_args, "id": tc_id, "type": "tool_call"}], + ), + ToolMessage(content=str(tool_result), tool_call_id=tc_id), + ], +) +text = continuation.content if isinstance(continuation.content, str) else "" +if not text.strip(): + print("Continuation returned empty; aborting", file=sys.stderr) + sys.exit(2) +print(f"3. continuation: {len(text)} chars; first 80: {text[:80]!r}") + +fixture = { + "fixtures": [ + # ORDER MATTERS: continuation match is more specific (hasToolResult); + # aimock evaluates fixtures top-to-bottom and picks the first match. + { + "match": {"userMessage": PROMPT, "hasToolResult": True}, + "response": {"content": text}, + }, + { + "match": {"userMessage": PROMPT}, + "response": {"toolCalls": [{"name": tc.get("name"), "arguments": tc_args}]}, + }, + ] +} + +out_path = Path("cockpit/chat/tool-calls/angular/e2e/fixtures/c-tool-calls.json") +out_path.parent.mkdir(parents=True, exist_ok=True) +out_path.write_text(json.dumps(fixture, indent=2) + "\n") +print(f"\nWrote fixture to {out_path}") diff --git a/cockpit/chat/tool-calls/angular/e2e/tsconfig.json b/cockpit/chat/tool-calls/angular/e2e/tsconfig.json new file mode 100644 index 000000000..0b5aeecbf --- /dev/null +++ b/cockpit/chat/tool-calls/angular/e2e/tsconfig.json @@ -0,0 +1,14 @@ +{ + "compilerOptions": { + "target": "ES2022", + "module": "ES2022", + "moduleResolution": "Bundler", + "esModuleInterop": true, + "strict": true, + "skipLibCheck": true, + "noEmit": true, + "types": ["node"] + }, + "include": ["**/*.ts"], + "exclude": ["node_modules", "test-results", "playwright-report"] +} diff --git a/cockpit/chat/tool-calls/angular/project.json b/cockpit/chat/tool-calls/angular/project.json index 533b9f573..d7faed5ff 100644 --- a/cockpit/chat/tool-calls/angular/project.json +++ b/cockpit/chat/tool-calls/angular/project.json @@ 
-56,6 +56,12 @@ "cwd": "cockpit/chat/tool-calls/angular", "command": "npx tsx -e \"import { chatToolCallsAngularModule } from './src/index.ts'; const module = chatToolCallsAngularModule; if (module.id !== 'chat-tool-calls-angular' || module.title !== 'Chat Tool Calls (Angular)') { throw new Error('Unexpected module shape for ' + module.id); }\"" } + }, + "e2e": { + "executor": "@nx/playwright:playwright", + "options": { + "config": "cockpit/chat/tool-calls/angular/e2e/playwright.config.ts" + } } } } From 07f41e03768d224742a2fd859dc995772996b59a Mon Sep 17 00:00:00 2001 From: Brian Love Date: Fri, 15 May 2026 23:54:43 -0700 Subject: [PATCH 12/14] chore(cockpit): drop legacy apps/cockpit/e2e (migrated to per-example dirs) --- apps/cockpit/e2e/.gitignore | 3 - apps/cockpit/e2e/README.md | 33 -------- apps/cockpit/e2e/aimock-runner.spec.ts | 71 ------------------ apps/cockpit/e2e/aimock-runner.ts | 78 ------------------- apps/cockpit/e2e/fixtures/streaming.json | 12 --- apps/cockpit/e2e/global-setup.ts | 79 -------------------- apps/cockpit/e2e/global-teardown.ts | 9 --- apps/cockpit/e2e/playwright.config.ts | 24 ------ apps/cockpit/e2e/scripts/record-streaming.py | 58 -------------- apps/cockpit/e2e/streaming.spec.ts | 18 ----- apps/cockpit/e2e/test-helpers.ts | 32 -------- apps/cockpit/e2e/tsconfig.json | 15 ---- apps/cockpit/project.json | 6 -- 13 files changed, 438 deletions(-) delete mode 100644 apps/cockpit/e2e/.gitignore delete mode 100644 apps/cockpit/e2e/README.md delete mode 100644 apps/cockpit/e2e/aimock-runner.spec.ts delete mode 100644 apps/cockpit/e2e/aimock-runner.ts delete mode 100644 apps/cockpit/e2e/fixtures/streaming.json delete mode 100644 apps/cockpit/e2e/global-setup.ts delete mode 100644 apps/cockpit/e2e/global-teardown.ts delete mode 100644 apps/cockpit/e2e/playwright.config.ts delete mode 100644 apps/cockpit/e2e/scripts/record-streaming.py delete mode 100644 apps/cockpit/e2e/streaming.spec.ts delete mode 100644 apps/cockpit/e2e/test-helpers.ts 
delete mode 100644 apps/cockpit/e2e/tsconfig.json diff --git a/apps/cockpit/e2e/.gitignore b/apps/cockpit/e2e/.gitignore deleted file mode 100644 index 059a55910..000000000 --- a/apps/cockpit/e2e/.gitignore +++ /dev/null @@ -1,3 +0,0 @@ -test-results/ -playwright-report/ -*.tmp diff --git a/apps/cockpit/e2e/README.md b/apps/cockpit/e2e/README.md deleted file mode 100644 index 708953da5..000000000 --- a/apps/cockpit/e2e/README.md +++ /dev/null @@ -1,33 +0,0 @@ -# cockpit e2e - -Cross-stack E2E harness for cockpit example apps. Uses [`@copilotkit/aimock`](https://github.com/CopilotKit/aimock) as a deterministic mock for LLM API calls; the per-product Python LangGraph dev server is launched with `OPENAI_BASE_URL` pointed at it; Playwright drives the example Angular app in real Chromium. - -Phase 1 covers `c-messages` only. Future phases each add one example (one fixture + one spec file per PR). - -## Run the suite - -``` -npx nx e2e cockpit -``` - -Replay-only. No `OPENAI_API_KEY` needed. Reads committed fixtures from `fixtures/`. - -## Refresh a fixture - -Each captured fixture has a recipe script under `scripts/`. Example for the c-messages fixture: - -``` -OPENAI_API_KEY=sk-... uv run --project cockpit/langgraph/streaming/python \ - python apps/cockpit/e2e/scripts/record-c-messages.py -``` - -Commit the updated `fixtures/c-messages.json`. Scripts are dev-only; CI never runs them. - -## Layout - -- `aimock-runner.ts` — programmatic boot of the mock server (mirrors `examples/chat/aimock-e2e/aimock-runner.ts`). -- `test-helpers.ts` — `sendPromptAndWait` helper that waits on `chat-message[data-streaming="false"]`. -- `fixtures/` — committed JSON fixtures keyed by example. -- `scripts/` — fixture-capture recipes (one per fixture). -- `playwright.config.ts` — Playwright config with globalSetup that boots aimock + LangGraph + Angular dev server. -- `c-messages.spec.ts` — Phase 1 pilot. 
diff --git a/apps/cockpit/e2e/aimock-runner.spec.ts b/apps/cockpit/e2e/aimock-runner.spec.ts deleted file mode 100644 index 7c096476d..000000000 --- a/apps/cockpit/e2e/aimock-runner.spec.ts +++ /dev/null @@ -1,71 +0,0 @@ -// SPDX-License-Identifier: MIT -import { describe, it, expect, afterEach } from 'vitest'; -import { writeFileSync, mkdtempSync, rmSync } from 'node:fs'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { startAimock, type AimockHandle } from './aimock-runner'; - -describe('startAimock', () => { - let handle: AimockHandle | null = null; - let workDir = ''; - - afterEach(async () => { - if (handle) await handle.stop(); - handle = null; - if (workDir) rmSync(workDir, { recursive: true, force: true }); - workDir = ''; - }); - - it('boots a replay server backed by a fixture file', async () => { - workDir = mkdtempSync(join(tmpdir(), 'aimock-test-')); - const fixturePath = join(workDir, 'hi.json'); - writeFileSync( - fixturePath, - JSON.stringify({ - fixtures: [ - { match: { userMessage: 'say hi briefly' }, response: { content: 'Hi!' } }, - ], - }), - ); - - handle = await startAimock({ mode: 'replay', fixturePath }); - expect(handle.port).toBeGreaterThan(0); - expect(handle.baseUrl).toMatch(/^http:\/\/.+\/v1$/); - - // The OpenAI SDK call path is exercised in Task 0's de-risk; this - // unit test stops at "the harness started cleanly and exposes the - // documented shape." 
- }); - - it('stop() is idempotent', async () => { - workDir = mkdtempSync(join(tmpdir(), 'aimock-test-')); - const fixturePath = join(workDir, 'hi.json'); - writeFileSync(fixturePath, JSON.stringify({ fixtures: [] })); - handle = await startAimock({ mode: 'replay', fixturePath }); - await handle.stop(); - await handle.stop(); - expect(true).toBe(true); - }); - - it('loads and merges all .json files in a directory', async () => { - workDir = mkdtempSync(join(tmpdir(), 'aimock-test-')); - writeFileSync( - join(workDir, 'a.json'), - JSON.stringify({ - fixtures: [{ match: { userMessage: 'one' }, response: { content: 'A' } }], - }), - ); - writeFileSync( - join(workDir, 'b.json'), - JSON.stringify({ - fixtures: [{ match: { userMessage: 'two' }, response: { content: 'B' } }], - }), - ); - // Non-JSON file in the dir should be ignored. - writeFileSync(join(workDir, 'README.md'), '# not a fixture'); - - handle = await startAimock({ mode: 'replay', fixturePath: workDir }); - expect(handle.port).toBeGreaterThan(0); - expect(handle.baseUrl).toMatch(/^http:\/\/.+\/v1$/); - }); -}); diff --git a/apps/cockpit/e2e/aimock-runner.ts b/apps/cockpit/e2e/aimock-runner.ts deleted file mode 100644 index 5392cb777..000000000 --- a/apps/cockpit/e2e/aimock-runner.ts +++ /dev/null @@ -1,78 +0,0 @@ -// SPDX-License-Identifier: MIT -import { LLMock } from '@copilotkit/aimock'; -import { readFileSync, readdirSync, statSync } from 'node:fs'; -import { join } from 'node:path'; - -export interface AimockHandle { - /** Port the mock server is listening on. */ - readonly port: number; - /** Full base URL the OpenAI SDK should target (includes /v1 suffix). */ - readonly baseUrl: string; - /** Tear down the server. Safe to call multiple times. */ - stop(): Promise; -} - -export interface AimockStartOptions { - mode: 'replay'; - /** Path to a single fixture file OR a directory of fixture files. 
*/ - fixturePath: string; -} - -// Raw JSON entry shape passes through to aimock's FixtureFileEntry — the -// `match` block can carry richer discriminators (toolName, hasToolResult, -// turnIndex, etc.) that are needed to distinguish a parent LLM's first call -// from its continuation after a tool round. We don't narrow the shape here; -// aimock's `addFixturesFromJSON` validates structure at load time. -type FixtureFileEntry = Record; - -function loadFixtureEntries(fixturePath: string): FixtureFileEntry[] { - const stats = statSync(fixturePath); - const out: FixtureFileEntry[] = []; - const readFile = (full: string): void => { - const raw = readFileSync(full, 'utf-8'); - const parsed = JSON.parse(raw) as { fixtures: FixtureFileEntry[] }; - for (const fx of parsed.fixtures) out.push(fx); - }; - if (stats.isDirectory()) { - const files = readdirSync(fixturePath) - .filter((f) => f.endsWith('.json')) - .sort(); - for (const file of files) readFile(join(fixturePath, file)); - return out; - } - readFile(fixturePath); - return out; -} - -export async function startAimock(opts: AimockStartOptions): Promise { - const entries = loadFixtureEntries(opts.fixturePath); - - // Use a large chunkSize so each response arrives in 1-2 SSE deltas. This - // intentionally turns off the partial-markdown streaming path for harness - // tests: structural assertions (code fence, list) measure the FINAL rendered - // DOM, not the progressive render. With aggressive default chunking, the - // partial-markdown parser sometimes can't recover a triple-backtick fence - // that gets split mid-token, and the final state ends up as inline - // instead of
      . Streaming-progressive behavior is covered by the
      -  // Phase 1 unit-variance tables; the e2e harness is for final-state
      -  // invariants and cross-stack integration.
      -  const mock = new LLMock({ port: 0, chunkSize: 4096 });
      -  if (entries.length > 0) {
      -    mock.addFixturesFromJSON(entries as never);
      -  }
      -  await mock.start();
      -
      -  const port = mock.port;
      -  const baseUrl = `${mock.url}/v1`;
      -  let stopped = false;
      -
      -  return {
      -    port,
      -    baseUrl,
      -    async stop() {
      -      if (stopped) return;
      -      stopped = true;
      -      await mock.stop();
      -    },
      -  };
      -}
      diff --git a/apps/cockpit/e2e/fixtures/streaming.json b/apps/cockpit/e2e/fixtures/streaming.json
      deleted file mode 100644
      index d54869ff9..000000000
      --- a/apps/cockpit/e2e/fixtures/streaming.json
      +++ /dev/null
      @@ -1,12 +0,0 @@
      -{
      -  "fixtures": [
      -    {
      -      "match": {
      -        "userMessage": "Tell me one quick fact about Angular signals in two sentences."
      -      },
      -      "response": {
      -        "content": "Angular signals are a reactive primitive (signal, computed, effect) that track dependencies to provide fine-grained reactivity and more efficient change detection. They let you update state synchronously via set()/update() and ensure only consumers that read an affected signal are re\u2011evaluated."
      -      }
      -    }
      -  ]
      -}
      diff --git a/apps/cockpit/e2e/global-setup.ts b/apps/cockpit/e2e/global-setup.ts
      deleted file mode 100644
      index ac5c7a157..000000000
      --- a/apps/cockpit/e2e/global-setup.ts
      +++ /dev/null
      @@ -1,79 +0,0 @@
      -// SPDX-License-Identifier: MIT
      -import { spawn, type ChildProcess } from 'node:child_process';
      -import { setTimeout as delay } from 'node:timers/promises';
      -import { resolve } from 'node:path';
      -import { startAimock, type AimockHandle } from './aimock-runner';
      -
      -interface SharedState {
      -  aimock: AimockHandle;
      -  langgraph: ChildProcess;
      -  angular: ChildProcess;
      -}
      -
      -declare global {
      -  // eslint-disable-next-line no-var
      -  var __COCKPIT_AIMOCK_E2E_STATE__: SharedState | undefined;
      -}
      -
      -const REPO_ROOT = resolve(__dirname, '../../..');
      -const FIXTURE_PATH = process.env.AIMOCK_FIXTURE
      -  ? resolve(__dirname, process.env.AIMOCK_FIXTURE)
      -  : resolve(__dirname, 'fixtures');
      -
      -async function waitForPort(url: string, timeoutMs: number): Promise<void> {
      -  const start = Date.now();
      -  while (Date.now() - start < timeoutMs) {
      -    try {
      -      const res = await fetch(url);
      -      if (res.ok || res.status === 404) return;
      -    } catch {
      -      // server not up yet
      -    }
      -    await delay(500);
      -  }
      -  throw new Error(`Server at ${url} did not become ready within ${timeoutMs}ms`);
      -}
      -
      -export default async function globalSetup(): Promise<void> {
      -  const aimock = await startAimock({ mode: 'replay', fixturePath: FIXTURE_PATH });
      -  // eslint-disable-next-line no-console
      -  console.log(`[cockpit] aimock listening at ${aimock.baseUrl}`);
      -
      -  const langgraph = spawn(
      -    'uv',
      -    ['run', 'langgraph', 'dev', '--port', '8123', '--no-browser'],
      -    {
      -      cwd: resolve(REPO_ROOT, 'cockpit/langgraph/streaming/python'),
      -      env: {
      -        ...process.env,
      -        OPENAI_BASE_URL: aimock.baseUrl,
      -        OPENAI_API_KEY: 'test-not-used',
      -      },
      -      stdio: 'pipe',
      -    },
      -  );
      -  langgraph.stdout?.on('data', (b) => process.stdout.write(`[langgraph] ${b}`));
      -  langgraph.stderr?.on('data', (b) => process.stderr.write(`[langgraph] ${b}`));
      -
      -  await waitForPort('http://localhost:8123/ok', 90_000);
      -  // eslint-disable-next-line no-console
      -  console.log('[cockpit] langgraph ready on :8123');
      -
      -  const angular = spawn(
      -    'npx',
      -    ['nx', 'serve', 'cockpit-langgraph-streaming-angular', '--port', '4300'],
      -    {
      -      cwd: REPO_ROOT,
      -      env: { ...process.env },
      -      stdio: 'pipe',
      -    },
      -  );
      -  angular.stdout?.on('data', (b) => process.stdout.write(`[angular] ${b}`));
      -  angular.stderr?.on('data', (b) => process.stderr.write(`[angular] ${b}`));
      -
      -  await waitForPort('http://localhost:4300/', 120_000);
      -  // eslint-disable-next-line no-console
      -  console.log('[cockpit] angular ready on :4300');
      -
      -  globalThis.__COCKPIT_AIMOCK_E2E_STATE__ = { aimock, langgraph, angular };
      -}
      diff --git a/apps/cockpit/e2e/global-teardown.ts b/apps/cockpit/e2e/global-teardown.ts
      deleted file mode 100644
      index 6bdbe43d1..000000000
      --- a/apps/cockpit/e2e/global-teardown.ts
      +++ /dev/null
      @@ -1,9 +0,0 @@
      -// SPDX-License-Identifier: MIT
      -export default async function globalTeardown(): Promise<void> {
      -  const state = globalThis.__COCKPIT_AIMOCK_E2E_STATE__;
      -  if (!state) return;
      -  state.angular.kill('SIGTERM');
      -  state.langgraph.kill('SIGTERM');
      -  await state.aimock.stop();
      -  globalThis.__COCKPIT_AIMOCK_E2E_STATE__ = undefined;
      -}
      diff --git a/apps/cockpit/e2e/playwright.config.ts b/apps/cockpit/e2e/playwright.config.ts
      deleted file mode 100644
      index de3ffaa70..000000000
      --- a/apps/cockpit/e2e/playwright.config.ts
      +++ /dev/null
      @@ -1,24 +0,0 @@
      -// SPDX-License-Identifier: MIT
      -import { defineConfig, devices } from '@playwright/test';
      -
      -export default defineConfig({
      -  testDir: '.',
      -  testMatch: '**/*.spec.ts',
      -  testIgnore: ['aimock-runner.spec.ts'],
      -  fullyParallel: false,
      -  workers: 1,
      -  retries: process.env.CI ? 2 : 0,
      -  reporter: process.env.CI ? [['list'], ['html', { open: 'never' }]] : 'list',
      -  use: {
      -    baseURL: 'http://localhost:4300',
      -    trace: 'retain-on-failure',
      -  },
      -  projects: [
      -    {
      -      name: 'chromium',
      -      use: { ...devices['Desktop Chrome'] },
      -    },
      -  ],
      -  globalSetup: './global-setup.ts',
      -  globalTeardown: './global-teardown.ts',
      -});
      diff --git a/apps/cockpit/e2e/scripts/record-streaming.py b/apps/cockpit/e2e/scripts/record-streaming.py
      deleted file mode 100644
      index 3a9228085..000000000
      --- a/apps/cockpit/e2e/scripts/record-streaming.py
      +++ /dev/null
      @@ -1,58 +0,0 @@
      -"""Capture a real text response from the streaming graph's LLM.
      -
      -Mirrors cockpit/langgraph/streaming/python/src/graph.py's
      -build_streaming_graph() setup: ChatOpenAI(gpt-5-mini, streaming=True)
      -+ system prompt from prompts/streaming.md.
      -
      -Run from repo root:
      -  OPENAI_API_KEY=sk-... uv run --project cockpit/langgraph/streaming/python \
      -    python apps/cockpit/e2e/scripts/record-streaming.py
      -"""
      -import json
      -import os
      -import sys
      -from pathlib import Path
      -
      -env_path = Path("cockpit/langgraph/streaming/python/.env")
      -if env_path.exists():
      -    for line in env_path.read_text().splitlines():
      -        line = line.strip()
      -        if line and not line.startswith("#") and "=" in line:
      -            k, _, v = line.partition("=")
      -            os.environ.setdefault(k.strip(), v.strip().strip('"').strip("'"))
      -
      -if not os.environ.get("OPENAI_API_KEY"):
      -    print("OPENAI_API_KEY not set (in env or .env)", file=sys.stderr)
      -    sys.exit(1)
      -
      -from langchain_core.messages import HumanMessage, SystemMessage
      -from langchain_openai import ChatOpenAI
      -
      -PROMPT = "Tell me one quick fact about Angular signals in two sentences."
      -SYSTEM_PROMPT = (
      -    Path("cockpit/langgraph/streaming/python/prompts/streaming.md").read_text()
      -)
      -
      -llm = ChatOpenAI(model="gpt-5-mini", temperature=0)
      -response = llm.invoke(
      -    [SystemMessage(content=SYSTEM_PROMPT), HumanMessage(content=PROMPT)],
      -)
      -text = response.content if isinstance(response.content, str) else ""
      -if not text.strip():
      -    print("LLM returned empty content; cannot build fixture", file=sys.stderr)
      -    sys.exit(2)
      -print(f"captured {len(text)} chars; first 80: {text[:80]!r}")
      -
      -fixture = {
      -    "fixtures": [
      -        {
      -            "match": {"userMessage": PROMPT},
      -            "response": {"content": text},
      -        }
      -    ]
      -}
      -
      -out_path = Path("apps/cockpit/e2e/fixtures/streaming.json")
      -out_path.parent.mkdir(parents=True, exist_ok=True)
      -out_path.write_text(json.dumps(fixture, indent=2) + "\n")
      -print(f"\nWrote fixture to {out_path}")
      diff --git a/apps/cockpit/e2e/streaming.spec.ts b/apps/cockpit/e2e/streaming.spec.ts
      deleted file mode 100644
      index ca0074474..000000000
      --- a/apps/cockpit/e2e/streaming.spec.ts
      +++ /dev/null
      @@ -1,18 +0,0 @@
      -// SPDX-License-Identifier: MIT
      -import { test, expect } from '@playwright/test';
      -import { sendPromptAndWait } from './test-helpers';
      -
      -test('streaming: assistant text from the mocked LLM renders in the cockpit chat composition', async ({ page }) => {
      -  const bubble = await sendPromptAndWait(
      -    page,
      -    'Tell me one quick fact about Angular signals in two sentences.',
      -  );
      -
      -  // The captured fixture's content (Angular signals fact) must reach the
      -  // rendered bubble. Proves: aimock served the streaming graph's LLM call,
      -  // langgraph routed back the AI message, the cockpit-langgraph-streaming-angular
      -  // app rendered it via the chat composition, and the streaming-finalized
      -  // signal (data-streaming="false") settled.
      -  const finalText = await bubble.innerText();
      -  expect(finalText.toLowerCase()).toContain('signal');
      -});
      diff --git a/apps/cockpit/e2e/test-helpers.ts b/apps/cockpit/e2e/test-helpers.ts
      deleted file mode 100644
      index 0bbe9a252..000000000
      --- a/apps/cockpit/e2e/test-helpers.ts
      +++ /dev/null
      @@ -1,32 +0,0 @@
      -// SPDX-License-Identifier: MIT
      -import { expect, type Locator, type Page } from '@playwright/test';
      -
      -/**
      - * Send a user prompt and wait for the assistant bubble to finalize.
      - *
      - * "Finalized" means `chat-message[data-role="assistant"][data-streaming="false"]`:
      - * the chat composition wires `[streaming]` to `agent.isLoading() && i === lastIndex`
      - * on the latest assistant `<chat-message>`, so the attribute flips to `"false"`
      - * once the agent stops loading and the markdown render has settled.
      - *
      - * Asserting on intermediate streaming-state DOM (partial `
        `, in-flight - * code fences, etc.) is the source of e2e flake — always wait on this - * attribute before counting or text-matching downstream of the assistant turn. - */ -export async function sendPromptAndWait(page: Page, prompt: string): Promise { - await page.goto('/'); - const input = page.getByRole('textbox', { name: /message|prompt/i }); - await input.fill(prompt); - await page.getByRole('button', { name: /send/i }).click(); - - const finalizedAssistant = page - .locator('chat-message[data-role="assistant"][data-streaming="false"]') - .last(); - await expect(finalizedAssistant).toBeAttached({ timeout: 45_000 }); - await expect - .poll(async () => ((await finalizedAssistant.innerText()) ?? '').trim().length, { - timeout: 30_000, - }) - .toBeGreaterThan(0); - return finalizedAssistant; -} diff --git a/apps/cockpit/e2e/tsconfig.json b/apps/cockpit/e2e/tsconfig.json deleted file mode 100644 index 234dd6a8b..000000000 --- a/apps/cockpit/e2e/tsconfig.json +++ /dev/null @@ -1,15 +0,0 @@ -{ - "compilerOptions": { - "target": "ES2022", - "module": "ES2022", - "moduleResolution": "Bundler", - "esModuleInterop": true, - "strict": true, - "skipLibCheck": true, - "allowImportingTsExtensions": false, - "noEmit": true, - "types": ["node"] - }, - "include": ["**/*.ts"], - "exclude": ["node_modules", "test-results", "playwright-report"] -} diff --git a/apps/cockpit/project.json b/apps/cockpit/project.json index 66343afef..3579dae9a 100644 --- a/apps/cockpit/project.json +++ b/apps/cockpit/project.json @@ -46,12 +46,6 @@ "configFile": "apps/cockpit/vite.config.mts" } }, - "e2e": { - "executor": "@nx/playwright:playwright", - "options": { - "config": "apps/cockpit/e2e/playwright.config.ts" - } - }, "serve-streaming": { "executor": "nx:run-commands", "options": { From 7712b56d1007b6f4d229ad927a74a00b774c4559 Mon Sep 17 00:00:00 2001 From: Brian Love Date: Fri, 15 May 2026 23:57:00 -0700 Subject: [PATCH 13/14] ci(cockpit): nx run-many for per-example aimock e2e --- 
.github/workflows/ci.yml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index f3aeb9c3f..727fee354 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -169,13 +169,14 @@ jobs: - working-directory: cockpit/langgraph/streaming/python run: uv sync - run: npx playwright install --with-deps chromium - - run: npx nx e2e cockpit --skip-nx-cache + - run: npx nx run-many --target=e2e --projects=cockpit-*-angular --parallel=1 --skip-nx-cache - name: Upload Playwright trace on failure if: failure() uses: actions/upload-artifact@v4 with: name: cockpit-e2e-trace - path: apps/cockpit/e2e/test-results/ + path: | + cockpit/**/angular/e2e/test-results/ retention-days: 7 website-e2e: From 582e888f9616ee7cea0fb9df551f7cc4ec313046 Mon Sep 17 00:00:00 2001 From: Brian Love Date: Sat, 16 May 2026 01:17:15 -0700 Subject: [PATCH 14/14] fix(cockpit-aimock): per-example langgraph ports + harder teardown for sequential CI runs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit CI's sequential per-example e2e loop hit `OSError: Port 8123 is already in use` on the second run. Three compounding causes: 1. The factory spawned langgraph and Angular non-detached, so SIGTERM to the `uv`/`npx` parent didn't propagate to the actual server children (`python langgraph dev`, `node nx serve`). Process tree survived teardown. 2. The teardown's "wait for port free" did a TCP connect-refused check, not a real bind() check. langgraph's _is_port_available does a real bind, which fails on TIME_WAIT sockets that connect refuses don't surface. 3. Even with both fixed, TIME_WAIT sockets on 8123 from the first run's client connections (Playwright + Angular both opened many) blocked langgraph's bind() on the second run for far longer than the 5s sleep between targets. Fixes: - spawn(detached: true) + process.kill(-pid, 'SIGKILL') in teardown to kill the whole process group. 
- waitForPortFree now does a real bind() check (mirrors langgraph's check). - Each per-example pins its OWN langgraph port: streaming keeps 8123, tool-calls offsets to 8124. Angular proxy.conf.json target updated to match. Future examples pick the next unused port. Decouples examples from each other — TIME_WAIT on one example's port no longer blocks the next example. - CI loop replaced with explicit shell loop (was nx run-many --parallel=1) for clearer per-example failure attribution and a 5s settle between targets. Verified locally: 2-run sequential loop passes consistently. --- .github/workflows/ci.yml | 20 +++++- .../angular/e2e/global-setup-impl.ts | 6 ++ .../chat/tool-calls/angular/proxy.conf.json | 2 +- .../src/global-setup-factory.ts | 18 ++++- .../aimock-harness/src/global-teardown.ts | 68 ++++++++++++++++++- 5 files changed, 108 insertions(+), 6 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 727fee354..ab2d581c5 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -169,7 +169,25 @@ jobs: - working-directory: cockpit/langgraph/streaming/python run: uv sync - run: npx playwright install --with-deps chromium - - run: npx nx run-many --target=e2e --projects=cockpit-*-angular --parallel=1 --skip-nx-cache + # Explicit sequential loop (not `nx run-many --parallel=1`) so each + # per-example e2e gets a fresh Playwright process and a clean port + # state between invocations. nx run-many's scheduler doesn't insert + # a delay between target completions, which races OS-level port + # release on the second iteration (langgraph dev binds 8123 in + # every example). The 5s sleep between targets gives the OS time + # to fully release the port despite the harness's globalTeardown + # already process-group-killing the langgraph child tree. + # + # Each new cockpit example with an `e2e` target adds one line below. 
+ - name: Run cockpit example aimock e2e suites + run: | + set -e + for proj in cockpit-langgraph-streaming-angular cockpit-chat-tool-calls-angular; do + echo "::group::nx e2e $proj" + npx nx e2e "$proj" --skip-nx-cache + echo "::endgroup::" + sleep 5 + done - name: Upload Playwright trace on failure if: failure() uses: actions/upload-artifact@v4 diff --git a/cockpit/chat/tool-calls/angular/e2e/global-setup-impl.ts b/cockpit/chat/tool-calls/angular/e2e/global-setup-impl.ts index 8cd6a301a..9125b1a47 100644 --- a/cockpit/chat/tool-calls/angular/e2e/global-setup-impl.ts +++ b/cockpit/chat/tool-calls/angular/e2e/global-setup-impl.ts @@ -4,6 +4,12 @@ import { createGlobalSetup } from '../../../../../libs/internal/aimock-harness/s export default createGlobalSetup({ langgraphCwd: 'cockpit/langgraph/streaming/python', + // Each cockpit example pins its OWN langgraph port to avoid TIME_WAIT + // collisions when a sequential CI loop runs multiple per-example e2es + // back-to-back. The streaming pilot keeps the historical 8123 default; + // tool-calls offsets to 8124. Future examples pick the next unused port. + // The Angular proxy.conf.json target must match. 
+ langgraphPort: 8124, angularProject: 'cockpit-chat-tool-calls-angular', angularPort: 4504, fixturesDir: resolve(__dirname, 'fixtures'), diff --git a/cockpit/chat/tool-calls/angular/proxy.conf.json b/cockpit/chat/tool-calls/angular/proxy.conf.json index 8523362d7..33430440a 100644 --- a/cockpit/chat/tool-calls/angular/proxy.conf.json +++ b/cockpit/chat/tool-calls/angular/proxy.conf.json @@ -1,6 +1,6 @@ { "/api": { - "target": "http://localhost:8123", + "target": "http://localhost:8124", "secure": false, "changeOrigin": true, "pathRewrite": { "^/api": "" }, diff --git a/libs/internal/aimock-harness/src/global-setup-factory.ts b/libs/internal/aimock-harness/src/global-setup-factory.ts index f7d069de0..a3d22a892 100644 --- a/libs/internal/aimock-harness/src/global-setup-factory.ts +++ b/libs/internal/aimock-harness/src/global-setup-factory.ts @@ -24,7 +24,9 @@ export interface CreateGlobalSetupOpts { interface SharedState { aimock: AimockHandle; langgraph: ChildProcess; + langgraphPort: number; angular: ChildProcess; + angularPort: number; } declare global { @@ -85,6 +87,11 @@ export function createGlobalSetup(opts: CreateGlobalSetupOpts): () => Promise process.stdout.write(`[langgraph] ${b}`)); @@ -101,6 +108,9 @@ export function createGlobalSetup(opts: CreateGlobalSetupOpts): () => Promise process.stdout.write(`[angular] ${b}`)); @@ -113,6 +123,12 @@ export function createGlobalSetup(opts: CreateGlobalSetupOpts): () => Promise | undefined; } +/** + * Returns true when the port can be BOUND (not merely connected to — + * TIME_WAIT sockets refuse connections but still block fresh `bind()` + * without SO_REUSEADDR, which is the check langgraph dev does on + * startup). We mirror that check by trying a real bind+listen here. 
+ */ +async function portBindable(port: number): Promise<boolean> { + return new Promise((resolve) => { + const server = createServer(); + server.once('error', () => { + server.close(); + resolve(false); + }); + server.listen(port, '127.0.0.1', () => { + server.close(() => resolve(true)); + }); + }); +} + +async function waitForPortFree(port: number, timeoutMs = 60_000): Promise<void> { + const start = Date.now(); + while (Date.now() - start < timeoutMs) { + if (await portBindable(port)) return; + await delay(500); + } + // Don't throw — teardown should be best-effort. The next run's + // globalSetup will report a clearer error if the port is genuinely stuck. +} + /** * Default Playwright globalTeardown. Walks every state slot the factory * registered (one per Angular project), kills processes in reverse order - * (Angular → langgraph → aimock), awaits aimock stop. Idempotent. + * (Angular → langgraph → aimock), awaits aimock stop, then waits for the + * langgraph and Angular ports to actually release. Idempotent. + * + * Port-release wait matters under `nx run-many --parallel=1` where the + * NEXT per-example e2e starts moments after this teardown returns. Without + * the wait, sequential runs race the OS's TCP TIME_WAIT cleanup and the + * next setup hits EADDRINUSE. */ +function killGroup(proc: ChildProcess): void { + // The processes are spawned with detached: true, so each has its own + // process group with pgid === pid. Signaling -pid hits the whole group + // (parent + all descendants), which is needed because uv/npx wrap + // actual long-lived servers (python/node) and don't forward signals + // to children on their own. + if (!proc.pid) return; + try { + process.kill(-proc.pid, 'SIGKILL'); + } catch { + // Process group may already be gone; fall back to direct kill.
+ try { + proc.kill('SIGKILL'); + } catch { + // already dead + } + } +} + export default async function globalTeardown(): Promise<void> { const states = globalThis.__AIMOCK_HARNESS_STATE__; if (!states) return; for (const state of states.values()) { - state.angular.kill('SIGTERM'); - state.langgraph.kill('SIGTERM'); + killGroup(state.angular); + killGroup(state.langgraph); await state.aimock.stop(); + await Promise.all([ + waitForPortFree(state.langgraphPort), + waitForPortFree(state.angularPort), + ]); } globalThis.__AIMOCK_HARNESS_STATE__ = undefined; }