diff --git a/.cursor/rules/agent-evaluation-authoring.mdc b/.cursor/rules/agent-evaluation-authoring.mdc new file mode 100644 index 000000000..34e509cce --- /dev/null +++ b/.cursor/rules/agent-evaluation-authoring.mdc @@ -0,0 +1,14 @@ +--- +description: Authoring standards for docs/agent-evaluation (no eval leakage in user turns) +globs: docs/agent-evaluation/**/* +--- + +When editing anything under `docs/agent-evaluation/`, read and follow **`docs/agent-evaluation/AGENTS.md`**. + +**Quick guardrails for `scenarios/*.md`:** + +- **`### Turn N — User`** blockquotes = in-character **product engineer** speech only. +- **Never** in user lines: `Option 1/2/3`, `Turn 0`, `scenario`, `eval`, `success criteria`, `scoreScenario`, references to “the prompt/instructions you already have” or named template sections. +- Put rubric detail in **`## Success criteria`** / **Intent** / **Failure modes**, not in the user quote. + +Full checklist and rationale: **`docs/agent-evaluation/AGENTS.md`**. diff --git a/.github/workflows/docs-agent-eval-ci.yml b/.github/workflows/docs-agent-eval-ci.yml new file mode 100644 index 000000000..49fb76e87 --- /dev/null +++ b/.github/workflows/docs-agent-eval-ci.yml @@ -0,0 +1,70 @@ +# Runs scenarios 01+02 (curl + TypeScript SDK) with heuristic + LLM judge. +# Sets EVAL_LOCAL_DOCS=1 so the agent reads repo docs under docs/ (not production WebFetch). +# Triggers: workflow_dispatch, or push (main) / pull_request when docs / OpenAPI / agent-eval / TS SDK paths change. +# Each run bills Anthropic (agent + judge). +# Requires repo secrets: ANTHROPIC_API_KEY, EVAL_TEST_DESTINATION_URL, OUTPOST_API_KEY +# (OUTPOST_TEST_WEBHOOK_URL uses the same URL as EVAL_TEST_DESTINATION_URL in CI.) +# See docs/agent-evaluation/README.md § CI (recommended slice). 
+name: Docs agent eval (CI slice) + +on: + workflow_dispatch: + push: + branches: + - main + paths: + - "docs/content/**" + - "docs/apis/**" + - "docs/agent-evaluation/**" + - "docs/README.md" + - "docs/AGENTS.md" + - "sdks/outpost-typescript/**" + - ".github/workflows/docs-agent-eval-ci.yml" + pull_request: + paths: + - "docs/content/**" + - "docs/apis/**" + - "docs/agent-evaluation/**" + - "docs/README.md" + - "docs/AGENTS.md" + - "sdks/outpost-typescript/**" + - ".github/workflows/docs-agent-eval-ci.yml" + +jobs: + eval-ci: + # Fork PRs cannot use repository secrets; skip instead of failing a required-looking job. + if: github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository + runs-on: ubuntu-latest + timeout-minutes: 60 + defaults: + run: + working-directory: docs/agent-evaluation + + steps: + - name: Checkout code + uses: actions/checkout@v4 + + - name: Set up Node.js + uses: actions/setup-node@v4 + with: + node-version: "20" + cache: npm + cache-dependency-path: docs/agent-evaluation/package-lock.json + + - name: Install dependencies + run: npm ci + + - name: Run eval CI slice (scenarios 01, 02) + env: + ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }} + EVAL_TEST_DESTINATION_URL: ${{ secrets.EVAL_TEST_DESTINATION_URL }} + EVAL_LOCAL_DOCS: "1" + run: ./scripts/ci-eval.sh + + - name: Execute generated curl + TypeScript artifacts (live Outpost) + env: + OUTPOST_API_KEY: ${{ secrets.OUTPOST_API_KEY }} + OUTPOST_TEST_WEBHOOK_URL: ${{ secrets.EVAL_TEST_DESTINATION_URL }} + OUTPOST_API_BASE_URL: https://api.outpost.hookdeck.com/2025-07-01 + OUTPOST_CI_PUBLISH_TOPIC: user.created + run: ./scripts/execute-ci-artifacts.sh diff --git a/.gitignore b/.gitignore index 0790e846d..23b769f99 100644 --- a/.gitignore +++ b/.gitignore @@ -1,10 +1,15 @@ # Environment variables .env +.env.ci .outpost.yaml # Built binaries /dist /bin + +# Documentation (local build artifacts; content lives under docs/content/) +/docs/dist/ 
+/docs/TEMP-*.md /tmp # Golang test coverage diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 000000000..0fb773eda --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,5 @@ +# Coding agent notes (Outpost) + +When you change files under **`docs/agent-evaluation/`** (scenarios, scoring, harness docs), read and apply **[`docs/agent-evaluation/AGENTS.md`](docs/agent-evaluation/AGENTS.md)** first. It defines anti–“teach to the test” rules for user-turn wording and scenario structure. + +For this repo’s PR review format, see **`CLAUDE.md`**. diff --git a/README.md b/README.md index b5978eb84..1e00d30e7 100644 --- a/README.md +++ b/README.md @@ -53,7 +53,7 @@ Outpost is built and maintained by [Hookdeck](https://hookdeck.com?ref=github-ou ![Outpost architecture](docs/public/images/architecture.png) -Read [Outpost Concepts](https://outpost.hookdeck.com/docs/concepts) to learn more about the Outpost architecture and design. +Read [Outpost Concepts](https://hookdeck.com/docs/outpost/concepts) to learn more about the Outpost architecture and design. ## Features @@ -70,17 +70,17 @@ Read [Outpost Concepts](https://outpost.hookdeck.com/docs/concepts) to learn mor - **Webhook best practices**: Opt-out webhook best practices, such as headers for idempotency, timestamp and signature, and signature rotation. - **SDKs and MCP server**: Go, Python, and TypeScript SDK are available. Outpost also ships with an MCP server. All generated by [Speakeasy](https://speakeasy.com). -See the [Outpost Features](https://outpost.hookdeck.com/docs/features) for more information. +See the [Outpost Features](https://hookdeck.com/docs/outpost/features) for more information. 
## Documentation -- [Overview](https://outpost.hookdeck.com/docs/overview) -- [Concepts](https://outpost.hookdeck.com/docs/concepts) -- [Quickstarts](https://outpost.hookdeck.com/docs/quickstarts) -- [Features](https://outpost.hookdeck.com/docs/features) -- [Guides](https://outpost.hookdeck.com/docs/guides) -- [API Reference](https://outpost.hookdeck.com/docs/api) -- [Configuration Reference](https://outpost.hookdeck.com/docs/references/configuration) +- [Overview](https://hookdeck.com/docs/outpost/overview) +- [Concepts](https://hookdeck.com/docs/outpost/concepts) +- [Quickstarts](https://hookdeck.com/docs/outpost/quickstarts) +- [Features](https://hookdeck.com/docs/outpost/features) +- [Guides](https://hookdeck.com/docs/outpost/guides) +- [API Reference](https://hookdeck.com/docs/outpost/api) +- [Configuration Reference](https://hookdeck.com/docs/outpost/self-hosting/configuration) _The Outpost documentation is built using the [Zudoku documentation framework](https://zuplo.link/outpost)._ @@ -144,7 +144,7 @@ For other cloud Redis services or self-hosted Redis clusters, set `REDIS_CLUSTER ```sh go run cmd/redis-debug/main.go your-redis-host 6379 password 0 [tls] [cluster] ``` -See the [Redis Troubleshooting Guide](https://docs.outpost.hookdeck.com/references/troubleshooting-redis) for detailed guidance. +See the [Redis Troubleshooting Guide](https://hookdeck.com/docs/outpost/self-hosting/guides/troubleshooting-redis) for detailed guidance. Start the Outpost dependencies and services: @@ -241,7 +241,7 @@ Open the `redirect_url` link to view the Outpost portal. ![Dashboard homepage](docs/public/images/dashboard-homepage.png) -Continue to use the [Outpost API](https://outpost.hookdeck.com/docs/api) or the Outpost portal to add and test more destinations. +Continue to use the [Outpost API](https://hookdeck.com/docs/outpost/api) or the Outpost portal to add and test more destinations. 
## Contributing diff --git a/build/entrypoint.sh b/build/entrypoint.sh index ab22587f8..fce97672c 100755 --- a/build/entrypoint.sh +++ b/build/entrypoint.sh @@ -23,7 +23,7 @@ if ! /usr/local/bin/outpost migrate init --current --log-format=json; then echo " docker run --rm hookdeck/outpost migrate --help" echo "" echo "Learn more about Outpost migration workflow at:" - echo " https://outpost.hookdeck.com/docs/guides/migration" + echo " https://hookdeck.com/docs/outpost/self-hosting/guides/migration" echo "" exit 1 fi diff --git a/docs/agent-evaluation/.env.example b/docs/agent-evaluation/.env.example new file mode 100644 index 000000000..79e210a37 --- /dev/null +++ b/docs/agent-evaluation/.env.example @@ -0,0 +1,37 @@ +# Copy to .env and fill in. .env is gitignored at the repo root. + +# Required for npm run eval (Claude Agent SDK — calls Anthropic only) +ANTHROPIC_API_KEY= + +# Required for Turn 0 template (test webhook URL injected into the prompt) +EVAL_TEST_DESTINATION_URL= + +# Strongly recommended for a *full* eval: run the agent’s curl/script/app against a real project. +# The harness does not read this key; you (or a future verifier) use it after the run. +# OUTPOST_API_KEY= # required for ./scripts/execute-ci-artifacts.sh after eval:ci; GitHub Actions CI execution step +# OUTPOST_API_BASE_URL=https://api.outpost.hookdeck.com/2025-07-01 +# OUTPOST_TEST_WEBHOOK_URL=https://hkdk.events/your-source-id # often same as EVAL_TEST_DESTINATION_URL +# OUTPOST_CI_PUBLISH_TOPIC=user.created # optional; publish topic for npm run smoke:execute-ci (must exist in project) + +# Optional (see npm run eval -- --help) +# EVAL_API_BASE_URL=https://api.outpost.hookdeck.com/2025-07-01 +# EVAL_TOPICS_LIST=- user.created +# EVAL_DOCS_URL=https://hookdeck.com/docs/outpost +# EVAL_LOCAL_DOCS=1 +# EVAL_LLMS_FULL_URL= +# Default includes Write, Edit, Bash (per-run workspace + installs). 
Override to narrow: +# EVAL_TOOLS=Read,Glob,Grep,WebFetch,Write,Edit,Bash +# EVAL_MODEL= +# EVAL_MAX_TURNS=40 +# Long runs (08–10): periodic stderr heartbeats while each agent query is in flight +# EVAL_PROGRESS=1 +# EVAL_PROGRESS_INTERVAL_MS=30000 +# EVAL_PERMISSION_MODE=dontAsk +# EVAL_PERSIST_SESSION=true +# Debug only: allow Write/Edit outside the per-run workspace (not recommended) +# EVAL_DISABLE_WORKSPACE_WRITE_GUARD=1 + +# Scoring is ON by default after each scenario (heuristic + LLM). Opt out: +# EVAL_NO_SCORE_HEURISTIC=1 +# EVAL_NO_SCORE_LLM=1 +# EVAL_SCORE_MODEL=claude-sonnet-4-20250514 diff --git a/docs/agent-evaluation/AGENTS.md b/docs/agent-evaluation/AGENTS.md new file mode 100644 index 000000000..ea6cee0d8 --- /dev/null +++ b/docs/agent-evaluation/AGENTS.md @@ -0,0 +1,46 @@ +# Agent evaluation — authoring rules for humans & coding agents + +This file applies to **everything under `docs/agent-evaluation/`** (scenarios, README, tracker, harness TypeScript). Follow it when adding or editing eval specs so we do not **teach to the test** or confuse **evaluator docs** with **in-character user speech**. + +## Who reads what + +| Audience | Content | +|----------|---------| +| **The model under test** | Turn 0 = pasted [`hookdeck-outpost-agent-prompt.mdoc`](../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc) template only, plus **Turn N — User** blockquotes (verbatim user role-play). | +| **Humans / harness** | Intent, preconditions, eval harness JSON, Success criteria, Failure modes, `score-transcript.ts`, README. | + +**Never** put harness vocabulary into **user** lines. The user is a product engineer, not an eval runner. + +## Anti-leakage rules (user turns) + +In **`### Turn N — User`** blockquotes, **do not** use: + +- **Option 1 / 2 / 3** (those labels exist only inside the dashboard template; a real user says what they want in plain language). +- **Turn 0**, **Turn 1**, or any **turn** numbering (that is script metadata). 
+- Phrases like **“the instructions you already have”**, **“the full-stack section of the prompt”**, **“follow the Hookdeck Outpost template”** as a stand-in for requirements (the model already has Turn 0; state the *product ask*, not a pointer to a doc section). +- **“Match the prompt”**, **“dashboard prompt”**, **“eval”**, **“scenario”**, **“success criteria”**, **heuristic names**, **`scoreScenarioNN`**. + +**Do** use natural operator language: stack, repo, product behavior, security (key on server), domain topics, README/env, Hookdeck project/topics **as the customer would say them**. + +It is fine for **Success criteria**, **Failure modes**, and **Intent** to name `scoreScenarioNN`, Turn 0, Option 3, etc. — those sections are not pasted as the user. + +## Alignment without parroting + +- **Product bar** (domain publish, topic reconciliation, full-stack UI depth) belongs in **Success criteria** and in the **prompt template** in `hookdeck-outpost-agent-prompt.mdoc`. +- **User turns** should **request outcomes** (“I need customers to see failed deliveries and retry”) not **cite** where in the template that is spelled out. + +If you add a new requirement, update **Success criteria** (and heuristics only when a **durable, low–false-positive** check exists). Do not stuff the verbatim rubric into the user quote. + +## Pre-merge checklist (scenarios) + +Before merging changes to `scenarios/*.md`: + +- [ ] Every **`> ...` user** line reads like a **real customer** message (read aloud test). +- [ ] No **Option N** / **Turn 0** / **scenario** / **prompt section** leakage in user blockquotes. +- [ ] **Success criteria** still state the full bar; nothing removed from criteria and only moved into user text. +- [ ] If integration depth changed, **`src/score-transcript.ts`** and this **README** scenario table are updated when rubrics change. + +## Where Cursor loads this + +- A **repo-root** [`AGENTS.md`](../../AGENTS.md) points here so agents see this folder’s rules. 
+- [`.cursor/rules/agent-evaluation-authoring.mdc`](../../.cursor/rules/agent-evaluation-authoring.mdc) applies when editing paths under `docs/agent-evaluation/`. diff --git a/docs/agent-evaluation/README.md b/docs/agent-evaluation/README.md new file mode 100644 index 000000000..1c5799797 --- /dev/null +++ b/docs/agent-evaluation/README.md @@ -0,0 +1,244 @@ +# Agent evaluation — Hookdeck Outpost onboarding + +This folder contains **manual** scenario specs (markdown) and an **automated** runner that uses the [Claude Agent SDK](https://platform.claude.com/docs/en/agent-sdk/overview) (`src/run-agent-eval.ts`). + +**Authoring standards (user-turn wording, no eval leakage):** [`AGENTS.md`](AGENTS.md) — also enforced via [`.cursor/rules/agent-evaluation-authoring.mdc`](../../.cursor/rules/agent-evaluation-authoring.mdc) when editing here. + +## Where success criteria live + +| What | Where | +|------|--------| +| **Human checklist** (full eval, including execution) | Each file under [`scenarios/`](scenarios/) — section **Success criteria** (static + **Execution (full pass)** rows). | +| **Manual run write-up** | [`results/RUN-RECORDING.template.md`](results/RUN-RECORDING.template.md) — copy to a local file under `results/` (gitignored). | +| **Automated transcript rubric** (regex heuristics) | [`src/score-transcript.ts`](src/score-transcript.ts) — `scoreScenario01`–`scoreScenario10` (assistant text + tool-written file corpus). Scenarios **08–10** include **`publish_beyond_test_only`** (domain publish signal vs test-only). | +| **LLM judge** (Anthropic vs **`## Success criteria`** in each scenario) | [`src/llm-judge.ts`](src/llm-judge.ts) — runs after each scenario unless **`--no-score-llm`**; also `npm run score -- --llm`. | + +**Deliberate scope:** `npm run eval` **requires** **`--scenario`**, **`--scenarios`**, or **`--all`**. There is no silent “run everything” default — you choose the scenarios and accept the cost. 
After **each** run: **`transcript.json`**, **`heuristic-score.json`**, and **`llm-score.json`** (judge reads the same **Success criteria** as humans). Exit **1** if any enabled score fails. + +Opt out of scoring: **`--no-score`** (heuristic only), **`--no-score-llm`** (drops the Success-criteria judge), or **`.env`**: **`EVAL_NO_SCORE_HEURISTIC=1`**, **`EVAL_NO_SCORE_LLM=1`**. Transcript-only: **`npm run eval -- --no-score --no-score-llm`**. + +Each scenario run uses one directory: + +`results/runs/-scenario-NN/` + +- **`transcript.json`** — full SDK log (written only **after** the agent finishes all turns — long runs may show little console output until then) +- **Harness sidecars (siblings of the run folder, not inside it)** — so the agent sandbox cannot read them: + - **`-scenario-NN.eval-started.json`** — written when the scenario begins (pid, scenario id, paths) + - **`-scenario-NN.eval-failure.json`** — uncaught exception before `transcript.json` + - **`-scenario-NN.eval-aborted.json`** — **SIGTERM** / **SIGINT** before completion (not **SIGKILL**) + If **`transcript.json`** is missing, check these files next to **`…/runs/-scenario-NN/`** (same directory as the run folder, not inside it). +- **`heuristic-score.json`** / **`llm-score.json`** — by default (unless disabled above) +- **Agent-written files** — the SDK **`cwd`** is this directory. Defaults include **`Write`**, **`Edit`**, and **`Bash`** for clones, installs, and generated code. 
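The run-folder layout above can be inspected with a few lines of TypeScript. This is a minimal sketch, not harness code: `runStatus` and the `RunStatus` type are hypothetical names, and it assumes each sidecar file shares the run folder's base name and sits next to it, as the bullets above describe.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical status labels for illustration; the harness does not define these.
type RunStatus = "completed" | "failed" | "aborted" | "started" | "unknown";

// Classify a run folder by looking for transcript.json inside it, then for
// sidecar files that are siblings of the folder (assumed: same base name).
function runStatus(runDir: string): RunStatus {
  const dir = path.resolve(runDir); // also strips any trailing slash
  if (fs.existsSync(path.join(dir, "transcript.json"))) return "completed";
  if (fs.existsSync(`${dir}.eval-failure.json`)) return "failed";
  if (fs.existsSync(`${dir}.eval-aborted.json`)) return "aborted";
  if (fs.existsSync(`${dir}.eval-started.json`)) return "started";
  return "unknown";
}
```

A run that reports `started` with no transcript and no failure or abort sidecar is either still in flight or was killed with **SIGKILL**, which the harness cannot record.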
+ +Re-score a finished run without re-invoking the agent — uses **today’s** [`src/score-transcript.ts`](src/score-transcript.ts) and **scenario markdown on disk** (so LLM criteria update when you edit **`## Success criteria`**): + +- **`npm run score -- --run results/runs/ --write`** — refresh **`heuristic-score.json`** +- Add **`--llm`** to also re-run the judge and write **`llm-score.json`** (needs **`ANTHROPIC_API_KEY`**) + +Legacy flat files `*-scenario-NN.json` next to `runs/` are still accepted by **`npm run score`** for older runs. + +**Execution** (live Outpost) is still not auto-verified; the LLM is instructed to set `execution_in_transcript.pass` to **null** unless the transcript itself reports HTTP results. + +## Automated runs (Claude Agent SDK) + +From `docs/agent-evaluation/`: + +```sh +npm install +cp .env.example .env # then edit: ANTHROPIC_API_KEY, EVAL_TEST_DESTINATION_URL, … +npm run eval -- --scenario 01 +npm run eval -- --scenarios 01,02,08 +npm run eval -- --all # explicit full suite (every scenario file) +npm run eval:ci # same as --scenarios 01,02 + heuristic + LLM judge (see § CI) +npm run eval -- --dry-run +``` + +The runner loads **`docs/agent-evaluation/.env`** automatically (via `dotenv`). Shell exports still override `.env` if both are set. + +### Wall time (scenarios **08–10** and other heavy baselines) + +Scenarios that **`git clone`** a full SaaS template and run **`npm` / `pnpm` / `docker compose`** installs are **slow by design**. Expect **roughly 30–90+ minutes** of wall time for a single run of **08**, **09**, or **10** (clone + install + several agent turns). The harness prints little to the terminal until **`transcript.json`** is written at the end, which can look hung. + +- **Progress on stderr:** set **`EVAL_PROGRESS=1`** so the runner prints **periodic lines** (default every **30s** per agent query, plus every **25** SDK messages). 
You still see activity when the agent is inside a **long Bash** call and the SDK emits **no** new messages for a while. Tune with **`EVAL_PROGRESS_INTERVAL_MS`** (minimum **5000**). Default is off so CI and short runs stay quiet. +- **Stop early:** **Ctrl+C** (**SIGINT**) in the terminal running `npm run eval`. The runner writes **`*-scenario-NN.eval-aborted.json`** next to the run folder (see **Harness sidecars** at the top of this file). +- **Skip re-clone:** If the baseline is already under the run directory, **`EVAL_SKIP_HARNESS_PRE_STEPS=1`** skips **`git_clone`** from the scenario harness (see each scenario’s **`## Eval harness`** block). +- **Cap agent length (smoke only):** **`EVAL_MAX_TURNS`** (default **80**) limits SDK turns; lowering it may end the run sooner but often **fails** the integration before success criteria are met—use for debugging, not a real pass. +- **Save judge time only:** **`--no-score-llm`** skips the Success-criteria LLM judge at the end (saves a few minutes; you lose that rubric). + +For **fast** automated signal in CI, use **`eval:ci`** (**01** + **02** only)—not **08**. + +### CI (recommended slice) + +For **pull-request or main-branch** automation, run **two** scenarios only: + +| Scenario | Why | +|----------|-----| +| **01** (curl) | Shortest path: managed API, tenant → destination → publish, no `npm install` / framework scaffold. Cheap signal that the prompt + heuristics still align with the curl quickstart. | +| **02** (TypeScript) | Most common integration style: **`@hookdeck/outpost-sdk`**, env vars, same API flow in code. Still much faster than **05** (Next.js) or **08** (clone a full SaaS repo). 
 | + +**Commands:** + +```sh +cd docs/agent-evaluation && npm ci && npm run eval:ci +# or: ./scripts/ci-eval.sh # requires ANTHROPIC_API_KEY + EVAL_TEST_DESTINATION_URL in the environment +# after a successful eval:ci, live Outpost smoke: OUTPOST_API_KEY + OUTPOST_TEST_WEBHOOK_URL ./scripts/execute-ci-artifacts.sh +``` + +`eval:ci` is **`npm run eval -- --scenarios 01,02`**: both **heuristic** checks and the **LLM judge** (grounded in each scenario’s **`## Success criteria`**). Skipping the judge would leave you with regex-only signal, which does not encode the product checklist. + +**GitHub Actions:** add repository secrets **`ANTHROPIC_API_KEY`**, **`EVAL_TEST_DESTINATION_URL`**, and **`OUTPOST_API_KEY`**. Workflow **`.github/workflows/docs-agent-eval-ci.yml`** runs **`./scripts/ci-eval.sh`** with **`EVAL_LOCAL_DOCS=1`** (agent **reads docs from the repo**), then **`./scripts/execute-ci-artifacts.sh`**: picks the **newest** **`*-scenario-01`** / **`*-scenario-02`** pair from **`results/runs/`**, runs the generated **`.sh`** then **`npx tsx`** on the TypeScript artifact (**`npm install`** in the **02** run dir when **`package.json`** exists). **`OUTPOST_TEST_WEBHOOK_URL`** in CI is set from the same secret as **`EVAL_TEST_DESTINATION_URL`**. Triggers on **`workflow_dispatch`** (manual: Actions → **Docs agent eval (CI slice)** → **Run workflow**, pick branch), pushes to **`main`**, and **pull requests** when **`docs/content/**`**, **`docs/apis/**`**, **`sdks/outpost-typescript/**`**, root **`docs/README.md`** / **`docs/AGENTS.md`**, or **`docs/agent-evaluation/**`** change (GitHub does not allow **`paths`** + **`paths-ignore`** together on the same event, so edits under e.g. **`docs/agent-evaluation/README.md`** also match **`docs/agent-evaluation/**`** and can trigger a run). Uses **`ubuntu-latest`** (Claude Agent SDK needs normal filesystem access — avoid tight sandboxes; see **Permissions / failures** below). 
 **Fork PRs** skip this job (secrets are not available). + +- **`ANTHROPIC_API_KEY`** — required for the agent and for the **LLM judge** (Success criteria) after each scenario you run. +- **`EVAL_TEST_DESTINATION_URL`** — required for Turn 0; same Source URL as `{{TEST_DESTINATION_URL}}` (and, in CI, reused as **`OUTPOST_TEST_WEBHOOK_URL`** for execution). +- **`OUTPOST_API_KEY`** — required for **`execute-ci-artifacts.sh`** and for **GitHub Actions** execution after **`eval:ci`**. For **local** transcript-only runs you can omit it. Put the key in **`docs/agent-evaluation/.env`** (or export); never paste it into chat. +- **`EVAL_LOCAL_DOCS=1`** — Turn 0 replaces public doc URLs with **absolute paths to MDX/OpenAPI files in this repo** (agent uses **Read** on **`docs/`** instead of **WebFetch** to production). Use locally when validating unpublished docs; **GitHub Actions** sets this for **`docs-agent-eval-ci.yml`**. +- **`EVAL_SKIP_HARNESS_PRE_STEPS=1`** — skip **`git_clone`** (and any future **`preSteps`**) declared in a scenario’s **`## Eval harness`** JSON block; useful offline or when the baseline folder is already present. + +- **Turn 0** text is built from [`hookdeck-outpost-agent-prompt.mdoc`](../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc) (`## Template`) with placeholders filled from environment variables. +- Transcripts are written to `results/runs/-scenario-NN/transcript.json` (gitignored). + +See `npm run eval -- --help` for env vars (`EVAL_TOOLS`, `EVAL_MODEL`, etc.). + +### Permissions / failures (why a run might not work) + +Three different things get called “permissions”: + +1. **Cursor (or CI) sandbox and `tsx`** — The `tsx` **CLI** opens an IPC pipe in `/tmp` (or similar), which some sandboxes block (`listen EPERM`). This repo’s `npm run eval` uses **`node --import tsx`** instead so Node loads the tsx **loader** only (no CLI IPC). 
If you still see EPERM, run the same command in a normal terminal outside the sandbox, or use `npm run eval:tsx-cli` only where IPC is allowed. + +2. **Claude Agent SDK `dontAsk` + `allowedTools`** — In `dontAsk` mode, tools **not** listed in `allowedTools` are denied (no prompt). Defaults include **`Write`**, **`Edit`**, and **`Bash`** so app scenarios can scaffold and install dependencies inside the per-run directory. With **`EVAL_LOCAL_DOCS=1`**: **`Read,Glob,Grep,Write,Edit,Bash`**. Otherwise **`Read,Glob,Grep,WebFetch,Write,Edit,Bash`**. Narrow **`EVAL_TOOLS`** only if you need a stricter harness (e.g. transcript-only, no shell). + +3. **Run-directory sandbox (`PreToolUse`)** — Under `permissionMode: dontAsk`, hooks enforce boundaries (not `canUseTool` alone): + - **Write / Edit / NotebookEdit** — target path must resolve under `results/runs/-scenario-NN/`. **`EVAL_DISABLE_WORKSPACE_WRITE_GUARD=1`** disables this only (debug). + - **Read / Glob / Grep** — must stay under that same run directory, and (when **`EVAL_LOCAL_DOCS=1`**) under **`docs/`** of the Outpost repo for local MDX/OpenAPI only. **`EVAL_DISABLE_WORKSPACE_READ_GUARD=1`** disables read/glob/grep/bash/agent checks (restores pre–workspace-sandbox behavior). + - **Bash** — commands must not reference the Outpost **`repositoryRoot`** on disk unless the reference stays inside the run dir or (with local docs) inside **`docs/`**. + - **Agent** (subagent) — **denied by default** so runs cannot spider the monorepo for “free” SDK context. **`EVAL_ALLOW_AGENT_TOOL=1`** to opt in. + - Turn 0 also appends a short **workspace boundary** block (absolute run-dir paths) so the model treats only the clone as the product under integration. + +Changing **`EVAL_PERMISSION_MODE`** is usually unnecessary; widening **`EVAL_TOOLS`** (or using local docs) fixes most tool denials. 
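The Write / Edit boundary in item 3 comes down to a path-containment test. The sketch below shows one way to express that test; it is not the harness's actual `PreToolUse` hook. `isInsideRunDir` is a hypothetical helper, the `example-scenario-01` folder name is a placeholder, and symlinks are not resolved here (a real guard would likely need `fs.realpathSync` as well).

```typescript
import * as path from "node:path";

// Returns true when `target` resolves to `runDir` itself or to a path beneath it.
// Relative targets are resolved against `runDir`, which is the agent's cwd.
function isInsideRunDir(runDir: string, target: string): boolean {
  const root = path.resolve(runDir);
  const resolved = path.resolve(root, target);
  const rel = path.relative(root, resolved);
  if (rel === "") return true; // the run dir itself
  // A leading ".." segment means the target climbed out of the run dir;
  // an absolute result can occur on Windows when the drives differ.
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  return !escapes && !path.isAbsolute(rel);
}

// Placeholder run dir for illustration only.
const exampleRunDir = "/tmp/results/runs/example-scenario-01";
console.log(isInsideRunDir(exampleRunDir, "src/publish.ts"));
console.log(isInsideRunDir(exampleRunDir, "../../../package.json"));
```

Comparing with `path.relative` instead of a plain string-prefix check matters for the sidecar files: `example-scenario-01.eval-started.json` begins with the run folder's name but is a sibling, so it is correctly treated as outside.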
+ +### Transcript vs execution (full pass) + +`npm run eval` only captures **what the model produced**; by itself it does **not** call Outpost (transcript review). **`./scripts/execute-ci-artifacts.sh`** (and the **GitHub Actions** workflow’s second step) runs the **01** shell + **02** TypeScript outputs against **live** Outpost when **`OUTPOST_API_KEY`** and **`OUTPOST_TEST_WEBHOOK_URL`** are set. + +**Local smoke (no agent):** to verify secrets and the managed API the same way CI does—without depending on a fresh eval transcript—run from **`docs/agent-evaluation/`** with **`OUTPOST_API_KEY`** and **`OUTPOST_TEST_WEBHOOK_URL`** set (e.g. **`source .env`**): + +```sh +npm run smoke:execute-ci +``` + +That writes a temporary **`*-scenario-01` / `*-scenario-02`** pair under **`results/runs/`** with hand-maintained scripts: shell destination uses **`topics: ["*"]`** so you do not need every topic name pre-created; publish still uses **`OUTPOST_CI_PUBLISH_TOPIC`** (default **`user.created`**, overridable in the environment), which **must exist** in your Outpost project’s topic list. **`execute-ci-artifacts.sh`** was not exercised end-to-end in-repo before CI; use this command after changing execution logic. + +**CI `curl: (22) … 404`:** the agent-generated shell script is calling an Outpost URL that returned **404**. Common causes: wrong **`OUTPOST_API_BASE_URL`** in the script (CI now sets the managed URL explicitly), or a **publish/destination topic** that does not exist in the project tied to **`OUTPOST_API_KEY`**. Ensure **`user.created`** is configured in that project, or set **`OUTPOST_CI_PUBLISH_TOPIC`** to a topic you do have. Compare the failing **`curl`** line in the Actions log with the [curl quickstart](../content/quickstarts/hookdeck-outpost-curl.mdoc). + +A **full pass** also answers: *did the generated curl / script / app succeed against a live Outpost project?* Each scenario’s **Success criteria** ends with **Execution** checkboxes for that step. 
To run them: + +1. Add **`OUTPOST_API_KEY`** (and **`OUTPOST_TEST_WEBHOOK_URL`** / **`OUTPOST_API_BASE_URL`** when the artifact expects them) to `docs/agent-evaluation/.env` so your shell has them after `dotenv` or when you `source` / copy into the directory where you run the code. +2. Run the agent’s commands or start its app and complete the flows the scenario describes. +3. Record pass/fail in your run notes ([`results/RUN-RECORDING.template.md`](results/RUN-RECORDING.template.md)). + +#### Integration scenarios (08–10): depth to verify + +These measure **existing-app integration**, not a greenfield demo. When you **execute** the artifact: + +- **Topic reconciliation:** Confirm README maps **`publish` topics** to **real domain events** and, when the **configured topic list from onboarding** is incomplete, tells the operator to **add topics in Hookdeck**—not to retarget the app to a stale list (unless the scenario was explicitly wiring-only). +- **Domain publish:** Prefer a smoke step that performs a **real product action** (signup, create entity, etc.) and observe an accepted publish—not **only** a “send test event” button. +- **Heuristic `publish_beyond_test_only`:** [`score-transcript.ts`](src/score-transcript.ts) adds a weak automated check that the transcript corpus suggests publish beyond synthetic test-only paths; it is **not** a substitute for execution or the LLM judge reading **Success criteria**. + +## Single source of truth for the dashboard prompt + +The **full prompt template** (the text operators paste as Turn 0) lives in **one** place: + +**[`docs/content/quickstarts/hookdeck-outpost-agent-prompt.mdoc`](../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc)** — use the fenced block under **## Template**. + +For eval runs, example placeholder substitutions (non-secret) are in [`fixtures/placeholder-values-for-turn0.md`](fixtures/placeholder-values-for-turn0.md) only. That file intentionally **does not** duplicate the template. 
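Replacing the `{{…}}` placeholders in the template is a plain string substitution. The following is a rough sketch of that step, assuming placeholder names are upper-case identifiers such as `{{TEST_DESTINATION_URL}}`. `fillPlaceholders` is a hypothetical helper, not the function the runner uses, and the URL in the usage line is a made-up example.

```typescript
// Replace every {{NAME}} in `template` with values[NAME].
// Throws if any placeholder has no value, so an unfilled Turn 0 is never sent.
function fillPlaceholders(
  template: string,
  values: Record<string, string | undefined>,
): string {
  const missing: string[] = [];
  const filled = template.replace(
    /\{\{\s*([A-Z0-9_]+)\s*\}\}/g,
    (match: string, name: string) => {
      const value = values[name];
      if (value === undefined || value === "") {
        if (!missing.includes(name)) missing.push(name);
        return match; // leave the placeholder visible in the output
      }
      return value;
    },
  );
  if (missing.length > 0) {
    throw new Error(`Unfilled placeholders: ${missing.join(", ")}`);
  }
  return filled;
}

// Usage with an illustrative value only.
const turn0 = fillPlaceholders("Send test events to {{TEST_DESTINATION_URL}}.", {
  TEST_DESTINATION_URL: "https://example.test/webhook",
});
console.log(turn0);
```

Failing loudly on a missing value is a deliberate choice in this sketch: a Turn 0 that still contains a literal `{{…}}` would skew the transcript in ways that are hard to spot afterwards.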
+ +The Hookdeck dashboard should eventually render the **same** template body from product-side source; until then, this MDX page is the documentation canonical copy. + +## How to run an evaluation (manual) + +1. **Turn 0:** Open the [agent prompt template](../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc), copy **## Template**, replace `{{…}}` (see [placeholder examples](fixtures/placeholder-values-for-turn0.md)). +2. **Pick a scenario:** e.g. [`scenarios/01-basics-curl.md`](scenarios/01-basics-curl.md). +3. **New agent thread:** Paste Turn 0, then follow each **Turn N — User** line from the scenario verbatim (or as specified). +4. **Judge output:** Use the scenario’s **Success criteria** checkboxes (human decision). +5. **Record:** Copy [`results/RUN-RECORDING.template.md`](results/RUN-RECORDING.template.md) to a local filename under `results/` (see [`results/README.md`](results/README.md)); those files are **gitignored** by default. + +### Helper script (optional) + +From the repo root: + +```sh +./docs/agent-evaluation/scripts/run-scenario.sh 01 +``` + +This **only prints** paths and reminders. It does **not** start an agent or call OpenAI/Anthropic/etc. + +## Judging results + +- **Automated runs:** use **Success criteria** in each `scenarios/*.md` (definition of pass). Each **`npm run eval -- --scenario|scenarios|all`** run applies **heuristic + LLM** scorers unless you pass **`--no-score`** / **`--no-score-llm`**; **Execution** rows stay manual unless you add a verifier. +- **Manual runs** use the checklist in [`results/RUN-RECORDING.template.md`](results/RUN-RECORDING.template.md). + +There is still **no single portable “IDE agent” CLI** for all vendors; the SDK runner is the supported path for headless Anthropic-based CI. 
+ +## Measuring scenarios + +| Layer | What it answers | Where | +|--------|-----------------|--------| +| **Definition** | What “good” means (product + transcript) | **`## Success criteria`** in each [`scenarios/*.md`](scenarios/) | +| **Heuristic** | Fast, deterministic signal from transcript JSON | [`src/score-transcript.ts`](src/score-transcript.ts) — combines assistant text with **Write/Edit tool inputs** and tool results so on-disk artifacts count | +| **LLM judge** | Structured pass/fail vs the same **Success criteria** | After each scenario when **`--no-score-llm`** is not set; or `npm run score -- --run --llm` — [`src/llm-judge.ts`](src/llm-judge.ts) | +| **Execution** | Live API / app smoke test | Human (or future script); not automated here | + +**Heuristic functions** (failed checks set **`npm run eval`** / **`npm run score`** exit **1** when that scorer ran): + +| Scenario | Function | Topics covered (summary) | +|----------|----------|---------------------------| +| 01 | `scoreScenario01` | Managed URL, tenant PUT, webhook destination POST, publish `data`, no key leak, optional verify turn | +| 02 | `scoreScenario02` | TS SDK, `Outpost`, env key, tenants/destinations/publish, webhook env, run command | +| 03 | `scoreScenario03` | Python SDK import, client, same API calls, env, webhook URL | +| 04 | `scoreScenario04` | Go module, `New`/`WithSecurity`, Upsert/Create/Publish, env, webhook URL | +| 05 | `scoreScenario05` | Next.js signals, TS SDK, API routes, two flows, server env key, no `NEXT_PUBLIC_` key, README, optional stress-turn Hookdeck hint | +| 06 | `scoreScenario06` | FastAPI, `outpost_sdk`, uvicorn, server env, two flows, README, webhook docs | +| 07 | `scoreScenario07` | `net/http`, Go SDK + `CreateDestinationCreateWebhook`, HTML UI, two flows, `go run`, README | +| 08 | `scoreScenario08` | Clone **next-saas-starter** (or git baseline), TS SDK, publish/destinations/tenants, server env key, per-customer webhook story | +| 09 | 
`scoreScenario09` | Clone **full-stack-fastapi-template** (or git baseline), `outpost_sdk`, integration + domain hook, env key, no client `NEXT_PUBLIC_`/`VITE_` key wiring, `publish_beyond_test_only`, README/env docs signal | +| 10 | `scoreScenario10` | Clone **startersaas-go-api** (or git baseline), Go Outpost SDK, publish + handler hook, env key | + +Export **`SCENARIO_IDS_WITH_HEURISTIC_RUBRIC`** in `score-transcript.ts` lists IDs **01–10** for tooling. + +## Scenarios + +To record each **`npm run eval -- --scenario …`** run, automated scores, and **whether you ran the generated code** with `OUTPOST_API_KEY`, use **[`SCENARIO-RUN-TRACKER.md`](SCENARIO-RUN-TRACKER.md)** (committed; not under `results/`, which is gitignored). + +| ID | File | Goal | +|----|------|------| +| 1 | [scenarios/01-basics-curl.md](scenarios/01-basics-curl.md) | Minimal **curl** only (managed API). | +| 2 | [scenarios/02-basics-typescript.md](scenarios/02-basics-typescript.md) | Minimal **TypeScript** script (`@hookdeck/outpost-sdk`). | +| 3 | [scenarios/03-basics-python.md](scenarios/03-basics-python.md) | Minimal **Python** script (`outpost_sdk`). | +| 4 | [scenarios/04-basics-go.md](scenarios/04-basics-go.md) | Minimal **Go** program (`outpost-go`). | +| 5 | [scenarios/05-app-nextjs.md](scenarios/05-app-nextjs.md) | Small **Next.js** app: UI to register a webhook destination and trigger a test publish. | +| 6 | [scenarios/06-app-fastapi.md](scenarios/06-app-fastapi.md) | Small **FastAPI** app with the same UX as scenario 5. | +| 7 | [scenarios/07-app-go-http.md](scenarios/07-app-go-http.md) | Small **Go** `net/http` app + simple HTML UI (same UX as scenario 5). | +| 8 | [scenarios/08-integrate-nextjs-existing.md](scenarios/08-integrate-nextjs-existing.md) | **Existing Next.js SaaS** baseline — add outbound webhooks via Outpost ([leerob/next-saas-starter](https://github.com/leerob/next-saas-starter)). 
| +| 9 | [scenarios/09-integrate-fastapi-existing.md](scenarios/09-integrate-fastapi-existing.md) | **Existing FastAPI full-stack** baseline — Outpost integration ([fastapi/full-stack-fastapi-template](https://github.com/fastapi/full-stack-fastapi-template)). | +| 10 | [scenarios/10-integrate-go-existing.md](scenarios/10-integrate-go-existing.md) | **Existing Go SaaS API** baseline — Outpost integration ([devinterface/startersaas-go-api](https://github.com/devinterface/startersaas-go-api)). | + +Scenarios **1–4** align with **“Try it out”**; **5–7** with **“Build a minimal example”**; **8–10** with **“Integrate with an existing app”** using pinned OSS baselines (Java / .NET can be added later the same way). + +## Agent skills recommendation + +**Recommend yes** for teams standardizing on Hookdeck’s skill pack: the [outpost skill](https://github.com/hookdeck/agent-skills/tree/main/skills/outpost) gives agents a consistent overview (tenants, destinations, topics, curl shape) and links into docs. + +**Caveats (update the skill in `hookdeck/agent-skills`, not in this repo):** + +1. **Managed-first** — The published skill is still **self-hosted heavy** (Docker block first; managed is a short table). For Hookdeck Outpost GA, the skill should foreground [managed quickstarts](../content/quickstarts/hookdeck-outpost-curl.mdoc), `https://api.outpost.hookdeck.com/2025-07-01`, **Settings → Secrets**, and `OUTPOST_API_KEY` / optional `OUTPOST_API_BASE_URL` to match product copy. +2. **REST paths** — Examples must use **`/tenants/{id}`**, not `PUT $BASE_URL/$TENANT_ID` (that path is wrong for the real API). +3. **Naming** — Align env var naming with docs (`OUTPOST_API_KEY` or documented dashboard name), not ad-hoc `HOOKDECK_API_KEY` unless the dashboard literally uses that string. +4. **Router vs. deep skills** — Today `outpost` is one monolithic `SKILL.md`. The skill itself mentions **future** destination-specific skills (`outpost-webhooks`, etc.). 
For scale, consider either **sections** with clear headings or **child skills** (e.g. `outpost-managed-quickstart`, `outpost-self-hosted`) once content grows, without forcing users to install many tiles for the common case. + +Until the skill is updated, agents should still be pointed at the **quickstart MDX pages** in this repo (or production docs URLs); the skill is supplementary. + +## Related docs + +- [Agent prompt template (SSoT)](../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc) +- [Upstream skill notes](SKILL-UPSTREAM-NOTES.md) +- [TEMP tracking note](../TEMP-hookdeck-outpost-onboarding-status.md) diff --git a/docs/agent-evaluation/SCENARIO-RUN-TRACKER.md b/docs/agent-evaluation/SCENARIO-RUN-TRACKER.md new file mode 100644 index 000000000..f55ad9cf7 --- /dev/null +++ b/docs/agent-evaluation/SCENARIO-RUN-TRACKER.md @@ -0,0 +1,139 @@ +# Scenario run tracker + +Use this table while you **run scenarios one at a time** and **execute the generated artifacts** against a real Outpost project. + +## How to use + +1. **Automated agent eval** (from `docs/agent-evaluation/`): + ```sh + npm run eval -- --scenario + ``` + Each run creates **`results/runs/-scenario-/`** with `transcript.json`, `heuristic-score.json`, `llm-score.json`, and whatever the agent wrote (scripts, apps, clones). +2. **Fill the table:** paste or note the **run directory** (stamp), mark **Heuristic** / **LLM** pass or fail (from the sidecars or console). **Run directory** should be the **latest** folder matching `results/runs/*-scenario-` whose `heuristic-score.json` has **`overallTranscriptPass: true`** (re-scan directories when updating this file). +3. **Execution (generated code):** with **`OUTPOST_API_KEY`** (and **`OUTPOST_TEST_WEBHOOK_URL`** / **`OUTPOST_API_BASE_URL`** if needed) in your shell or `.env`, run the artifact the scenario expects, e.g. `bash outpost-quickstart.sh`, `npx tsx …`, `python …`, `go run …`, `npm run dev` in the generated app folder. 
Mark **Pass** / **Fail** / **Skip** and add **Notes** (HTTP status, delivery in Hookdeck Console, etc.). **Do not edit generated files to force a pass** — test what the agent produced; note OS/environment (e.g. Linux vs macOS) when relevant. **This column is the primary bar for “does the output actually work?”** Heuristic and LLM scores are supplementary. +4. **Optional:** copy a row to your local run log under `results/` if you use `RUN-RECORDING.template.md`. + +--- + +## Tracker + + +| ID | Scenario file | Run directory (`results/runs/…`) | Heuristic | LLM judge | Execution (generated code) | Notes | +| --- | ------------------------------------------------------------------------------ | -------------------------------------- | ---------------------- | --------- | -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| 01 | [01-basics-curl.md](scenarios/01-basics-curl.md) | `2026-04-10T09-28-52-764Z-scenario-01` | Pass (7/7) | Pass | Pass | Artifact: `**quickstart.sh`**. Heuristic + LLM from `npm run eval -- --scenario 01`; harness sidecars are sibling `*.eval-*.json` under `results/runs/` (not inside run dir). Execution: `OUTPOST_API_KEY` from `docs/agent-evaluation/.env` + `bash quickstart.sh` in run dir; tenant **200**, destination **201**, publish **202**; exit 0. 
| +| 02 | [02-basics-typescript.md](scenarios/02-basics-typescript.md) | `2026-04-10T15-01-35-359Z-scenario-02` | Pass (9/9) | Pass | Pass | `EVAL_LOCAL_DOCS=1` after **scope-router** update to [agent prompt template](../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc). Artifact: `**outpost-quickstart.ts`** + `package.json` (SDK)—**no** Next.js scaffold. Heuristic + LLM pass; harness sidecars sibling under `results/runs/`. Earlier passes: `2026-04-10T10-49-02-890Z-scenario-02`, `2026-04-10T10-34-35-461Z-scenario-02`. Over-build run: `2026-04-10T09-39-06-362Z-scenario-02` (Next.js + script; LLM fail). | +| 03 | [03-basics-python.md](scenarios/03-basics-python.md) | `2026-04-10T11-02-19-073Z-scenario-03` | Pass (8/8) | Pass | Pass | `EVAL_LOCAL_DOCS=1` with [scope-router prompt](../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc). Artifact: `**outpost_quickstart.py`** + `.env.example` (`python-dotenv`, `outpost_sdk`)—**no** web framework. Heuristic + LLM pass; judge `execution_in_transcript` **pass** (agent ran script; printed event id). Harness sidecars sibling under `results/runs/`. Earlier run: `2026-04-08T15-34-12-720Z-scenario-03`. | +| 04 | [04-basics-go.md](scenarios/04-basics-go.md) | `2026-04-08T15-48-31-367Z-scenario-04` | Pass (9/9) | Pass | Pass | `EVAL_LOCAL_DOCS=1`. Artifacts: `**main.go`**, `go.mod` (replace → repo `sdks/outpost-go`). `docs/agent-evaluation/.env` + `go run .`; tenant, destination, publish OK. | +| 05 | [05-app-nextjs.md](scenarios/05-app-nextjs.md) | `2026-04-08T16-12-10-708Z-scenario-05` | Pass (10/10) | Pass | Pass | **Last heuristic-pass run:** `**outpost-nextjs-demo/`** — simpler two-route app (`/api/register`, `/api/publish`), fixed topic. Richer app + assessment: **§ Scenario 05 — assessment** (`**nextjs-webhook-demo/`** in `2026-04-08T17-21-22-170Z-scenario-05`) — LLM + execution pass; heuristic **9/10** (`managed_base_not_selfhosted`, doc-corpus). 
| +| 06 | [06-app-fastapi.md](scenarios/06-app-fastapi.md) | `2026-04-09T08-38-42-008Z-scenario-06` | Pass (8/8) | Pass | Pass | `EVAL_LOCAL_DOCS=1`. `**main.py`** + `requirements.txt`, `outpost_sdk` + FastAPI. HTML: destinations list, add webhook (topics from API + URL), publish test event, delete. Execution: `python3 -m venv .venv`, `pip install -r requirements.txt`, run-dir `.env`, `uvicorn main:app` on :8766; **GET /** 200, **POST /destinations** 303, **POST /publish** 303. | +| 07 | [07-app-go-http.md](scenarios/07-app-go-http.md) | `2026-04-09T09-10-23-291Z-scenario-07` | Pass (9/9) | Pass | Pass | `EVAL_LOCAL_DOCS=1`. `**go-portal-demo/`** — `main.go` + `templates/`, `net/http`, `outpost-go` (`replace` → repo `sdks/outpost-go`). Multi-step create destination + **GET/POST /publish**. Execution: `PORT=8777` + key/base from `docs/agent-evaluation/.env`; **GET /** 200, **POST /publish** 200. Eval ~25 min wall time. | +| 08 | [08-integrate-nextjs-existing.md](scenarios/08-integrate-nextjs-existing.md) | `2026-04-10T14-29-04-214Z-scenario-08` | Pass (10/10) | Pass | Pass | `EVAL_LOCAL_DOCS=1` + [scope-router prompt](../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc). Harness `**next-saas-starter/`** under run dir (gitignored). **Execution pass** — operator QA (Postgres, `.env`, migrate/seed/dev, Outpost UI/API). See **§ Scenario 08 — execution notes** for reproducibility (seed/`server-only`, destination-schema `key` vs SDK). Earlier: `2026-04-10T11-08-35-921Z-scenario-08` (8/8), `2026-04-09T14-48-16-906Z-scenario-08`, `2026-04-09T11-08-32-505Z-scenario-08`. | +| 09 | [09-integrate-fastapi-existing.md](scenarios/09-integrate-fastapi-existing.md) | `2026-04-10T19-54-20-037Z-scenario-09` | Pass (10/10) | Pass | Pass | `EVAL_LOCAL_DOCS=1`. **Artifact:** `full-stack-fastapi-template/` under run dir (**gitignored**). **Heuristic + LLM** from this stamp; harness sidecars sibling under `results/runs/`. 
Docker: default **5173** / **8000** / **1080** / **1025**; if host **5432** is taken, map DB e.g. **54334:5432** in `compose.override.yml`. After a **fresh DB volume**, clear the SPA token or **re-login**: stale JWT → **404 User not found** on `/api/v1/users/me` and `/api/v1/outpost/destinations`. **§ Scenario 09 — post-agent work** (below) still describes template fixes vs baseline. **Legacy runs:** `2026-04-10T19-22-02-903Z-scenario-09`, `2026-04-09T22-16-54-750Z-scenario-09` (6/6), `2026-04-09T20-48-16-530Z-scenario-09`, `2026-04-09T15-51-44-184Z-scenario-09`. | +| 10 | [10-integrate-go-existing.md](scenarios/10-integrate-go-existing.md) | `2026-04-10T22-14-20-704Z-scenario-10` | Pass (7/7) | Pass | Pass | `EVAL_LOCAL_DOCS=1`. Harness clone **`startersaas-go-api/`** under run dir (**gitignored**); pin [**devinterface/startersaas-go-api**](https://github.com/devinterface/startersaas-go-api). **Execution:** `go build` OK; **`docker compose build`** fails on baseline **Go 1.21** image vs **`go 1.22`** in `go.mod` (upstream Dockerfile). **Smoke:** Mongo **:27018**, `go run .`, **`POST /api/v1/auth/signup`** with **`privacyAccepted` / `marketingAccepted` as JSON booleans** → **200**; log **`[outpost] published user.created`**. **Outpost delivery** to Hookdeck Source verified with a distinct **`POST /publish`** probe (tenant + webhook destination + event). | + + +### Scenario 08 — execution notes (`2026-04-10T14-29-04-214Z-scenario-08`) + +**Execution:** **Pass**, based on operator QA on **`next-saas-starter/`** (artifact **not** committed; run folder under `results/runs/` is gitignored). + +Reproducibility / gotchas: + +- **`pnpm db:migrate`**: succeeds against local Postgres when `POSTGRES_URL` is set (see clone `README.md`). +- **`pnpm db:seed`**: as generated, importing `stripe` from **`lib/payments/stripe.ts`** pulls Outpost and **`server-only`**, which throws when the seed script runs under **`tsx`** (not the Next server). 
Common **local** fix: instantiate **`Stripe`** directly in **`lib/db/seed.ts`** with the same **`apiVersion`** as the payments module so seed does not load that file. Requires valid **Stripe** keys in `.env`. Re-running seed after a successful run fails on duplicate **`test@test.com`**, which is expected. +- **`pnpm dev`**: if another **`next dev`** already holds **`.next/dev/lock`** for this tree, stop it or remove the lock; port **3000** may be taken (Next picks another port). Turbopack may warn about multiple lockfiles when the app sits under the monorepo; see Next’s **`turbopack.root`** guidance if needed. +- **Destination schema `key`**: the API returns `key` on schema fields; older SDK parsing may strip it and break create-destination payloads keyed from labels. Regenerating the SDKs fixes this; until then, a BFF raw fetch + mapping keeps the UI aligned with the API. + +### Scenario 09 — post-agent work (representative: `2026-04-09T22-16-54-750Z-scenario-09`; latest eval stamp `2026-04-10T19-54-20-037Z-scenario-09`) + +Work applied **after** the agent transcript so the FastAPI + React artifact matches current integration guidance (eval honesty + local execution). The template tree under `results/runs/-scenario-09/` is **not committed** (see `results/.gitignore`); repo **docs** and **prompt** updates that back this scenario **are** in git. + +**Frontend / router** + +- **TanStack Router:** `frontend/src/routeTree.gen.ts` now registers `/_layout/webhooks` (agent added the route file but not the generated tree). +- **API base URL:** webhooks page used browser-relative `/api/...` against nginx; switched to backend base (`OpenAPI.BASE` / `VITE_API_URL`). +- **Destination types:** Outpost JSON uses **`type`** and **`icon`** (not `id` / `svg`); fixed controlled radios / **Next** in the create wizard. + +**Backend** + +- **`POST /api/v1/webhooks/publish-test`**: synthetic `publish` for integration testing. 
+- **`GET /api/v1/webhooks/events`**, **`GET /api/v1/webhooks/attempts`**, **`POST /api/v1/webhooks/retry`**: BFF proxies for tenant-scoped **events list**, **attempts**, and **manual retry** (admin key server-side). + +**Dashboard UI (webhooks page)** + +- **Send test event**, **Event activity** (filter by destination, select event → attempts table, **Retry** on failed attempts). + +**Docs & prompt (repository)** + +- [Building your own UI](../content/guides/building-your-own-ui.mdoc): destination-type field fixes; **Events, attempts, and retries** section (features, how they connect, links to API). +- [Agent prompt template](../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc): full-stack guidance mentions **events list**, **attempts**, **retry**, alongside test publish. + +### Scenario 09 — review notes (resolved, 2026-04-10) + +Operator feedback from exercising the FastAPI full-stack artifact is **closed** in-repo: + +1. **Event activity IA:** [Building your own UI](../content/guides/building-your-own-ui.mdoc) documents **default** destination → activity and **optional** tenant-wide activity with the same list endpoints; no open doc gap. +2. **Domain topics + real publishes vs test-only:** [Agent prompt](../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc) (topic reconciliation, domain publish, test publish as separate), scenarios **08–10** success criteria + user-turn scripts, [README](README.md) execution notes, and heuristic **`publish_beyond_test_only`** in [`src/score-transcript.ts`](src/score-transcript.ts) cover what we measure. + +The **copied agent template** (the `## Hookdeck Outpost integration` block) intentionally stays **scenario-agnostic**: it does not name eval baselines, harness repos, or scenario IDs, only product-level integration guidance and doc links. 
+ +### Column hints + + +| Column | Meaning | +| ----------------- | ---------------------------------------------------------------------------------------------------------- | +| **Run directory** | Latest `results/runs/*-scenario-` with `heuristic-score.json` → `overallTranscriptPass: true` (folder contains `transcript.json`) | +| **Heuristic** | `heuristic-score.json` → `overallTranscriptPass` (or `passed`/`total`) | +| **LLM judge** | `llm-score.json` → `overall_transcript_pass` | +| **Execution** | Your smoke test of the **produced** script/app with real credentials — **not** automated by `npm run eval` | + + +### Status legend (suggested) + +Use short text or symbols in cells, e.g. **Pass** / **Fail** / **Skip** / **N/A**, or ✅ / ❌ / — + +--- + +## Scenario 05 — assessment (`2026-04-08T17-21-22-170Z`) + +**Status:** Deep-dive on the **richer** Next.js artifact (`nextjs-webhook-demo/`). The **tracker table** row for scenario **05** points at **`2026-04-08T16-12-10-708Z-scenario-05`** (`outpost-nextjs-demo/`) as the **latest heuristic-pass** run (10/10); this section documents **`17-21-22`** separately because it failed that check while still passing LLM + execution. + + +| Dimension | Result | +| ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| **Run directory** | `results/runs/2026-04-08T17-21-22-170Z-scenario-05/` | +| **Artifact** | `nextjs-webhook-demo/` — Next.js App Router, `@hookdeck/outpost-sdk`, Outpost calls **only** in `app/api/**/route.ts` (managed API via SDK default unless `OUTPOST_API_BASE_URL` is set). 
| +| **Heuristic** | **9/10**; `overallTranscriptPass` false; single failure: `managed_base_not_selfhosted` because the transcript corpus included a **Read** of the older [Building your own UI](../content/guides/building-your-own-ui.mdoc) containing `localhost:3333/api/v1`. The **generated app does not** use that URL. See § Scenario 05 heuristic. | +| **LLM judge** | **Pass**: matches scenario 05 success criteria (Next.js structure, server-side SDK, distinct destination + publish UI, tenant/topic handling, README env, managed default). | +| **Execution** | **Pass** (re-checked): `npm run build` in `nextjs-webhook-demo/`; `npm run dev` with `docs/agent-evaluation/.env`; `POST /api/destinations` → **201**, `POST /api/publish` → **200**. | + + +**What the app demonstrates (UX / model):** + +1. **Tenant:** Editable tenant id; copy states destinations and publishes are scoped to it. +2. **Register webhook destination:** URL field + **topic checkboxes** populated from **`GET /api/topics`** (server lists topics from Outpost); **`POST /api/destinations`** upserts tenant and creates webhook destination for selected topics. +3. **Destinations list:** **`GET /api/destinations?tenantId=`** table (type, target, topics) with refresh; matches the “tenant → many destinations” mental model. +4. **Publish test event:** Separate action; **`POST /api/publish`** with chosen topic; UI notes fan-out to matching destinations. + +**Comparison with older run `2026-04-08T16-12-10-708Z` (`outpost-nextjs-demo/`):** Simpler two-route app (`/api/register`, `/api/publish`), **fixed topic** in routes, **no** topics or destinations list APIs, **10/10** heuristic (no offending doc fragment in corpus). Useful as a minimal baseline; **17-21-22** is the richer assessment target. + +--- + +## Scenario 05 heuristic — `managed_base_not_selfhosted` + +Scenario 05 includes a regex check (`managed_base_not_selfhosted`) in [`src/score-transcript.ts`](src/score-transcript.ts) (`scoreScenario05`). 
It looks at the **whole scoring corpus**: assistant-visible text **plus** content that ended up in the transcript from tools (e.g. **Read** of a doc file), not just files in the run folder. + +- It fails if the corpus contains a **self-hosted** default API path: specifically the literal substring `localhost:3333/api/v1` (Outpost’s common local dev URL), or a similar `localhost: / api/v1` pattern, unless `OUTPOST_API_BASE_URL` also appears (see code for the exact conditions). +- **Historical cause:** Older [Building your own UI](../content/guides/building-your-own-ui.mdoc) curl examples used `localhost:3333/api/v1`. If the agent **read** that page during a run, those lines were embedded in `transcript.json`, the check fired, and `overallTranscriptPass` became **false** even when the **generated Next.js app** only used the **managed** SDK default. That was a **harness / doc-corpus** interaction, not proof the app targeted local Outpost. +- **Doc update:** `docs/content/guides/building-your-own-ui.mdoc` was rewritten to be **managed / self-hosted agnostic** (`OUTPOST_API_BASE_URL`, OpenAPI-shaped paths). Examples **no longer contain** the literal `localhost:3333/api/v1`, so a future eval whose corpus only picks up the current file should **not** fail this check for that substring. Re-run scenario 05 to confirm; other `localhost` patterns could still match if they appear elsewhere in the corpus. +- **Run `2026-04-08T16-12-10-708Z`:** heuristic **10/10**, `overallTranscriptPass: true`. +- **Run `2026-04-08T17-21-22-170Z`:** heuristic **9/10**, `overallTranscriptPass: false` — failed `managed_base_not_selfhosted`; LLM judge still **passed**; transcript included **Read** of the **previous** `building-your-own-ui.mdx` with `localhost:3333/api/v1`. + +**Possible follow-ups:** narrow the heuristic to tool-written files under the run workspace only, or exclude known doc paths from the substring that triggers this check. 
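To make the doc-corpus interaction concrete, here is a minimal sketch of a check with the shape described above. It is not the code in `scoreScenario05`: the function name, the regex, and the reading of "similar pattern" as "any localhost port followed by `/api/v1`" are assumptions, and the exact conditions live in the real source.

```typescript
// Hypothetical re-statement of the check; the real logic lives in scoreScenario05.
const SELF_HOSTED_LITERAL = "localhost:3333/api/v1";
// Assumption: "similar" means any localhost port followed by /api/v1.
const SELF_HOSTED_PATTERN = /localhost:\d+\/api\/v1/;

function managedBaseNotSelfHosted(corpus: string): boolean {
  const mentionsSelfHosted =
    corpus.includes(SELF_HOSTED_LITERAL) || SELF_HOSTED_PATTERN.test(corpus);
  const mentionsOverride = corpus.includes("OUTPOST_API_BASE_URL");
  // Pass when no self-hosted default appears, or when the override variable is also present.
  return !mentionsSelfHosted || mentionsOverride;
}

// The corpus mixes assistant text with tool output, e.g. a Read of a doc page.
const assistantText = "The app uses the managed SDK default base URL.";
const docRead = "curl localhost:3333/api/v1/tenants/acme";

console.log(managedBaseNotSelfHosted(assistantText));
console.log(managedBaseNotSelfHosted(assistantText + "\n" + docRead));
```

In this sketch the second call fails only because of the doc text, which mirrors the false positive recorded for run `2026-04-08T17-21-22-170Z`. Scoring only tool-written files, as suggested in the follow-ups, would amount to building `corpus` from a narrower set of inputs.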
+ +## Action items + +- Scenario 05: optionally re-run eval after the UI guide rewrite to confirm `managed_base_not_selfhosted` no longer false-positives on that doc **Read**; then consider whether the heuristic can be narrowed (see § above). + +--- + +Full harness docs: [README.md](README.md). \ No newline at end of file diff --git a/docs/agent-evaluation/fixtures/placeholder-values-for-turn0.md b/docs/agent-evaluation/fixtures/placeholder-values-for-turn0.md new file mode 100644 index 000000000..152bcf9d3 --- /dev/null +++ b/docs/agent-evaluation/fixtures/placeholder-values-for-turn0.md @@ -0,0 +1,29 @@ +# Placeholder values for Turn 0 (eval / local testing) + +The **prompt template itself** lives in one place only: + +**[`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc)** (from repo root: `docs/content/quickstarts/...`). Copy the fenced block under **## Template**, then replace each `{{PLACEHOLDER}}` using the table below. + +Do **not** paste real API keys into chat. Have operators put `OUTPOST_API_KEY` in a project **`.env`** (or another loader), not in the agent transcript. Use a throwaway Hookdeck project when possible. + +For **`npm run eval -- --scenario …`** (or **`--scenarios`** / **`--all`**), the runner only needs **`ANTHROPIC_API_KEY`** and **`EVAL_TEST_DESTINATION_URL`**. To score a **full** eval (generated commands/code actually work), you still need **`OUTPOST_API_KEY`** (and usually **`OUTPOST_TEST_WEBHOOK_URL`**) when you **execute** the agent’s output afterward. Optional **`EVAL_LOCAL_DOCS=1`** points Turn 0 at repo paths instead of live `{{DOCS_URL}}` links. 
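The split between what the runner needs and what execution needs can be sketched as a small pre-flight check. The variable names come from the paragraph above; `missingVars` and the two groupings are hypothetical and do not describe how the harness itself validates its environment.

```typescript
// Hypothetical pre-flight check; the harness may validate its environment differently.
type Env = Record<string, string | undefined>;

// Needed by the eval runner itself.
const RUNNER_VARS = ["ANTHROPIC_API_KEY", "EVAL_TEST_DESTINATION_URL"];
// Needed later, when you execute what the agent generated.
const EXECUTION_VARS = ["OUTPOST_API_KEY"];

function missingVars(env: Env, required: string[]): string[] {
  // Treat unset and empty values the same way.
  return required.filter((name) => !env[name]);
}

// Example environment with placeholder values (never real keys).
const sampleEnv: Env = {
  ANTHROPIC_API_KEY: "<redacted>",
  EVAL_TEST_DESTINATION_URL: "<redacted>",
};

console.log("runner missing:", missingVars(sampleEnv, RUNNER_VARS));
console.log("execution missing:", missingVars(sampleEnv, EXECUTION_VARS));
```

Passing the environment in as an argument keeps the function easy to test; a real script would likely hand it the process environment after loading `.env`.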
+ +--- + +## Example substitutions (non-secret) + + +| Placeholder | Example | +| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `{{API_BASE_URL}}` | `https://api.outpost.hookdeck.com/2025-07-01` | +| `{{TOPICS_LIST}}` | `- user.created` | +| `{{TEST_DESTINATION_URL}}` | Hookdeck Console **Source** URL the dashboard feeds in (for automated evals, set `EVAL_TEST_DESTINATION_URL` to the same value). Example: `https://hkdk.events/...` | +| `{{DOCS_URL}}` | `https://hookdeck.com/docs/outpost` (same path segments as `/docs/outpost/…` on hookdeck.com; see `docs/content/nav.json`) | +| `{{LLMS_FULL_URL}}` | Omit the line in the template if unused, or your public `llms-full.txt` URL | + + +--- + +## Dashboard implementation note + +When this text is embedded in the Hookdeck product, the **same** template body should be rendered from one dashboard/backend source so docs and product stay aligned. The MDX page in this repo is the documentation **canonical** copy until product source is wired to match it. 
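Rendering the template from one source, whether in docs tooling or the dashboard, comes down to filling the `{{PLACEHOLDER}}` tokens listed in the table above. The sketch below shows one way to do that and to report anything left unfilled. `fillPlaceholders` is a hypothetical helper, and the short template string is made up for the example; the real template lives in the prompt page.

```typescript
// Hypothetical renderer: fills {{NAME}} tokens and reports any left unfilled.
interface FillResult {
  text: string;
  missing: string[];
}

function fillPlaceholders(template: string, values: Record<string, string>): FillResult {
  const missing: string[] = [];
  const text = template.replace(/\{\{([A-Z0-9_]+)\}\}/g, (match: string, name: string) => {
    const value = values[name];
    if (value !== undefined) return value;
    missing.push(name);
    return match; // leave unknown placeholders visible rather than guessing
  });
  return { text, missing };
}

// Made-up template fragment; values are the non-secret examples from the table.
const sampleTemplate = "Base URL: {{API_BASE_URL}}\nTopics:\n{{TOPICS_LIST}}\nDocs: {{DOCS_URL}}";
const rendered = fillPlaceholders(sampleTemplate, {
  API_BASE_URL: "https://api.outpost.hookdeck.com/2025-07-01",
  TOPICS_LIST: "- user.created",
});

console.log(rendered.text);
console.log("unfilled:", rendered.missing);
```

Leaving unknown tokens in place, instead of dropping them, makes a forgotten value easy to spot before the text is pasted as Turn 0.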
\ No newline at end of file diff --git a/docs/agent-evaluation/package-lock.json b/docs/agent-evaluation/package-lock.json new file mode 100644 index 000000000..12d5ab75e --- /dev/null +++ b/docs/agent-evaluation/package-lock.json @@ -0,0 +1,2096 @@ +{ + "name": "outpost-agent-evaluation", + "version": "1.0.0", + "lockfileVersion": 3, + "requires": true, + "packages": { + "": { + "name": "outpost-agent-evaluation", + "version": "1.0.0", + "dependencies": { + "@anthropic-ai/claude-agent-sdk": "^0.2.92", + "dotenv": "^16.4.7" + }, + "devDependencies": { + "tsx": "^4.19.4", + "typescript": "^5.8.3" + }, + "engines": { + "node": ">=18" + } + }, + "node_modules/@anthropic-ai/claude-agent-sdk": { + "version": "0.2.92", + "resolved": "https://registry.npmjs.org/@anthropic-ai/claude-agent-sdk/-/claude-agent-sdk-0.2.92.tgz", + "integrity": "sha512-loYyxVUC5gBwHjGi9Fv0b84mduJTp9Z3Pum+y/7IVQDb4NynKfVQl6l4VeDKZaW+1QTQtd25tY4hwUznD7Krqw==", + "license": "SEE LICENSE IN README.md", + "dependencies": { + "@anthropic-ai/sdk": "^0.80.0", + "@modelcontextprotocol/sdk": "^1.27.1" + }, + "engines": { + "node": ">=18.0.0" + }, + "optionalDependencies": { + "@img/sharp-darwin-arm64": "^0.34.2", + "@img/sharp-darwin-x64": "^0.34.2", + "@img/sharp-linux-arm": "^0.34.2", + "@img/sharp-linux-arm64": "^0.34.2", + "@img/sharp-linux-x64": "^0.34.2", + "@img/sharp-linuxmusl-arm64": "^0.34.2", + "@img/sharp-linuxmusl-x64": "^0.34.2", + "@img/sharp-win32-arm64": "^0.34.2", + "@img/sharp-win32-x64": "^0.34.2" + }, + "peerDependencies": { + "zod": "^4.0.0" + } + }, + "node_modules/@anthropic-ai/sdk": { + "version": "0.80.0", + "resolved": "https://registry.npmjs.org/@anthropic-ai/sdk/-/sdk-0.80.0.tgz", + "integrity": "sha512-WeXLn7zNVk3yjeshn+xZHvld6AoFUOR3Sep6pSoHho5YbSi6HwcirqgPA5ccFuW8QTVJAAU7N8uQQC6Wa9TG+g==", + "license": "MIT", + "dependencies": { + "json-schema-to-ts": "^3.1.1" + }, + "bin": { + "anthropic-ai-sdk": "bin/cli" + }, + "peerDependencies": { + "zod": "^3.25.0 || ^4.0.0" + }, + 
"peerDependenciesMeta": { + "zod": { + "optional": true + } + } + }, + "node_modules/@babel/runtime": { + "version": "7.29.2", + "resolved": "https://registry.npmjs.org/@babel/runtime/-/runtime-7.29.2.tgz", + "integrity": "sha512-JiDShH45zKHWyGe4ZNVRrCjBz8Nh9TMmZG1kh4QTK8hCBTWBi8Da+i7s1fJw7/lYpM4ccepSNfqzZ/QvABBi5g==", + "license": "MIT", + "engines": { + "node": ">=6.9.0" + } + }, + "node_modules/@esbuild/aix-ppc64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.27.7.tgz", + "integrity": "sha512-EKX3Qwmhz1eMdEJokhALr0YiD0lhQNwDqkPYyPhiSwKrh7/4KRjQc04sZ8db+5DVVnZ1LmbNDI1uAMPEUBnQPg==", + "cpu": [ + "ppc64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "aix" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/android-arm": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.27.7.tgz", + "integrity": "sha512-jbPXvB4Yj2yBV7HUfE2KHe4GJX51QplCN1pGbYjvsyCZbQmies29EoJbkEc+vYuU5o45AfQn37vZlyXy4YJ8RQ==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/android-arm64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.27.7.tgz", + "integrity": "sha512-62dPZHpIXzvChfvfLJow3q5dDtiNMkwiRzPylSCfriLvZeq0a1bWChrGx/BbUbPwOrsWKMn8idSllklzBy+dgQ==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/android-x64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.27.7.tgz", + "integrity": "sha512-x5VpMODneVDb70PYV2VQOmIUUiBtY3D3mPBG8NxVk5CogneYhkR7MmM3yR/uMdITLrC1ml/NV1rj4bMJuy9MCg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" 
+ ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/darwin-arm64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.27.7.tgz", + "integrity": "sha512-5lckdqeuBPlKUwvoCXIgI2D9/ABmPq3Rdp7IfL70393YgaASt7tbju3Ac+ePVi3KDH6N2RqePfHnXkaDtY9fkw==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/darwin-x64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.27.7.tgz", + "integrity": "sha512-rYnXrKcXuT7Z+WL5K980jVFdvVKhCHhUwid+dDYQpH+qu+TefcomiMAJpIiC2EM3Rjtq0sO3StMV/+3w3MyyqQ==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/freebsd-arm64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.27.7.tgz", + "integrity": "sha512-B48PqeCsEgOtzME2GbNM2roU29AMTuOIN91dsMO30t+Ydis3z/3Ngoj5hhnsOSSwNzS+6JppqWsuhTp6E82l2w==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/freebsd-x64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.27.7.tgz", + "integrity": "sha512-jOBDK5XEjA4m5IJK3bpAQF9/Lelu/Z9ZcdhTRLf4cajlB+8VEhFFRjWgfy3M1O4rO2GQ/b2dLwCUGpiF/eATNQ==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-arm": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.27.7.tgz", + "integrity": "sha512-RkT/YXYBTSULo3+af8Ib0ykH8u2MBh57o7q/DAs3lTJlyVQkgQvlrPTnjIzzRPQyavxtPtfg0EopvDyIt0j1rA==", + "cpu": [ + "arm" + ], 
+ "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-arm64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.27.7.tgz", + "integrity": "sha512-RZPHBoxXuNnPQO9rvjh5jdkRmVizktkT7TCDkDmQ0W2SwHInKCAV95GRuvdSvA7w4VMwfCjUiPwDi0ZO6Nfe9A==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-ia32": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.27.7.tgz", + "integrity": "sha512-GA48aKNkyQDbd3KtkplYWT102C5sn/EZTY4XROkxONgruHPU72l+gW+FfF8tf2cFjeHaRbWpOYa/uRBz/Xq1Pg==", + "cpu": [ + "ia32" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-loong64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.27.7.tgz", + "integrity": "sha512-a4POruNM2oWsD4WKvBSEKGIiWQF8fZOAsycHOt6JBpZ+JN2n2JH9WAv56SOyu9X5IqAjqSIPTaJkqN8F7XOQ5Q==", + "cpu": [ + "loong64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-mips64el": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.27.7.tgz", + "integrity": "sha512-KabT5I6StirGfIz0FMgl1I+R1H73Gp0ofL9A3nG3i/cYFJzKHhouBV5VWK1CSgKvVaG4q1RNpCTR2LuTVB3fIw==", + "cpu": [ + "mips64el" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-ppc64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.27.7.tgz", + "integrity": 
"sha512-gRsL4x6wsGHGRqhtI+ifpN/vpOFTQtnbsupUF5R5YTAg+y/lKelYR1hXbnBdzDjGbMYjVJLJTd2OFmMewAgwlQ==", + "cpu": [ + "ppc64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-riscv64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.27.7.tgz", + "integrity": "sha512-hL25LbxO1QOngGzu2U5xeXtxXcW+/GvMN3ejANqXkxZ/opySAZMrc+9LY/WyjAan41unrR3YrmtTsUpwT66InQ==", + "cpu": [ + "riscv64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-s390x": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.27.7.tgz", + "integrity": "sha512-2k8go8Ycu1Kb46vEelhu1vqEP+UeRVj2zY1pSuPdgvbd5ykAw82Lrro28vXUrRmzEsUV0NzCf54yARIK8r0fdw==", + "cpu": [ + "s390x" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-x64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.27.7.tgz", + "integrity": "sha512-hzznmADPt+OmsYzw1EE33ccA+HPdIqiCRq7cQeL1Jlq2gb1+OyWBkMCrYGBJ+sxVzve2ZJEVeePbLM2iEIZSxA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/netbsd-arm64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/netbsd-arm64/-/netbsd-arm64-0.27.7.tgz", + "integrity": "sha512-b6pqtrQdigZBwZxAn1UpazEisvwaIDvdbMbmrly7cDTMFnw/+3lVxxCTGOrkPVnsYIosJJXAsILG9XcQS+Yu6w==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "netbsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/netbsd-x64": { + "version": "0.27.7", + "resolved": 
"https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.27.7.tgz", + "integrity": "sha512-OfatkLojr6U+WN5EDYuoQhtM+1xco+/6FSzJJnuWiUw5eVcicbyK3dq5EeV/QHT1uy6GoDhGbFpprUiHUYggrw==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "netbsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/openbsd-arm64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.27.7.tgz", + "integrity": "sha512-AFuojMQTxAz75Fo8idVcqoQWEHIXFRbOc1TrVcFSgCZtQfSdc1RXgB3tjOn/krRHENUB4j00bfGjyl2mJrU37A==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openbsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/openbsd-x64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.27.7.tgz", + "integrity": "sha512-+A1NJmfM8WNDv5CLVQYJ5PshuRm/4cI6WMZRg1by1GwPIQPCTs1GLEUHwiiQGT5zDdyLiRM/l1G0Pv54gvtKIg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openbsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/openharmony-arm64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/openharmony-arm64/-/openharmony-arm64-0.27.7.tgz", + "integrity": "sha512-+KrvYb/C8zA9CU/g0sR6w2RBw7IGc5J2BPnc3dYc5VJxHCSF1yNMxTV5LQ7GuKteQXZtspjFbiuW5/dOj7H4Yw==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openharmony" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/sunos-x64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.27.7.tgz", + "integrity": "sha512-ikktIhFBzQNt/QDyOL580ti9+5mL/YZeUPKU2ivGtGjdTYoqz6jObj6nOMfhASpS4GU4Q/Clh1QtxWAvcYKamA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "sunos" + ], + "engines": { + 
"node": ">=18" + } + }, + "node_modules/@esbuild/win32-arm64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.27.7.tgz", + "integrity": "sha512-7yRhbHvPqSpRUV7Q20VuDwbjW5kIMwTHpptuUzV+AA46kiPze5Z7qgt6CLCK3pWFrHeNfDd1VKgyP4O+ng17CA==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/win32-ia32": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.27.7.tgz", + "integrity": "sha512-SmwKXe6VHIyZYbBLJrhOoCJRB/Z1tckzmgTLfFYOfpMAx63BJEaL9ExI8x7v0oAO3Zh6D/Oi1gVxEYr5oUCFhw==", + "cpu": [ + "ia32" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/win32-x64": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.27.7.tgz", + "integrity": "sha512-56hiAJPhwQ1R4i+21FVF7V8kSD5zZTdHcVuRFMW0hn753vVfQN8xlx4uOPT4xoGH0Z/oVATuR82AiqSTDIpaHg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@hono/node-server": { + "version": "1.19.13", + "resolved": "https://registry.npmjs.org/@hono/node-server/-/node-server-1.19.13.tgz", + "integrity": "sha512-TsQLe4i2gvoTtrHje625ngThGBySOgSK3Xo2XRYOdqGN1teR8+I7vchQC46uLJi8OF62YTYA3AhSpumtkhsaKQ==", + "license": "MIT", + "engines": { + "node": ">=18.14.1" + }, + "peerDependencies": { + "hono": "^4" + } + }, + "node_modules/@img/sharp-darwin-arm64": { + "version": "0.34.5", + "resolved": "https://registry.npmjs.org/@img/sharp-darwin-arm64/-/sharp-darwin-arm64-0.34.5.tgz", + "integrity": "sha512-imtQ3WMJXbMY4fxb/Ndp6HBTNVtWCUI0WdobyheGf5+ad6xX8VIDO8u2xE4qc/fr08CKG/7dDseFtn6M6g/r3w==", + "cpu": [ + "arm64" + ], + "license": "Apache-2.0", + "optional": true, + "os": [ + 
"darwin" + ], + "engines": { + "node": "^18.17.0 || ^20.3.0 || >=21.0.0" + }, + "funding": { + "url": "https://opencollective.com/libvips" + }, + "optionalDependencies": { + "@img/sharp-libvips-darwin-arm64": "1.2.4" + } + }, + "node_modules/@img/sharp-darwin-x64": { + "version": "0.34.5", + "resolved": "https://registry.npmjs.org/@img/sharp-darwin-x64/-/sharp-darwin-x64-0.34.5.tgz", + "integrity": "sha512-YNEFAF/4KQ/PeW0N+r+aVVsoIY0/qxxikF2SWdp+NRkmMB7y9LBZAVqQ4yhGCm/H3H270OSykqmQMKLBhBJDEw==", + "cpu": [ + "x64" + ], + "license": "Apache-2.0", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": "^18.17.0 || ^20.3.0 || >=21.0.0" + }, + "funding": { + "url": "https://opencollective.com/libvips" + }, + "optionalDependencies": { + "@img/sharp-libvips-darwin-x64": "1.2.4" + } + }, + "node_modules/@img/sharp-libvips-darwin-arm64": { + "version": "1.2.4", + "resolved": "https://registry.npmjs.org/@img/sharp-libvips-darwin-arm64/-/sharp-libvips-darwin-arm64-1.2.4.tgz", + "integrity": "sha512-zqjjo7RatFfFoP0MkQ51jfuFZBnVE2pRiaydKJ1G/rHZvnsrHAOcQALIi9sA5co5xenQdTugCvtb1cuf78Vf4g==", + "cpu": [ + "arm64" + ], + "license": "LGPL-3.0-or-later", + "optional": true, + "os": [ + "darwin" + ], + "funding": { + "url": "https://opencollective.com/libvips" + } + }, + "node_modules/@img/sharp-libvips-darwin-x64": { + "version": "1.2.4", + "resolved": "https://registry.npmjs.org/@img/sharp-libvips-darwin-x64/-/sharp-libvips-darwin-x64-1.2.4.tgz", + "integrity": "sha512-1IOd5xfVhlGwX+zXv2N93k0yMONvUlANylbJw1eTah8K/Jtpi15KC+WSiaX/nBmbm2HxRM1gZ0nSdjSsrZbGKg==", + "cpu": [ + "x64" + ], + "license": "LGPL-3.0-or-later", + "optional": true, + "os": [ + "darwin" + ], + "funding": { + "url": "https://opencollective.com/libvips" + } + }, + "node_modules/@img/sharp-libvips-linux-arm": { + "version": "1.2.4", + "resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-arm/-/sharp-libvips-linux-arm-1.2.4.tgz", + "integrity": 
"sha512-bFI7xcKFELdiNCVov8e44Ia4u2byA+l3XtsAj+Q8tfCwO6BQ8iDojYdvoPMqsKDkuoOo+X6HZA0s0q11ANMQ8A==", + "cpu": [ + "arm" + ], + "license": "LGPL-3.0-or-later", + "optional": true, + "os": [ + "linux" + ], + "funding": { + "url": "https://opencollective.com/libvips" + } + }, + "node_modules/@img/sharp-libvips-linux-arm64": { + "version": "1.2.4", + "resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-arm64/-/sharp-libvips-linux-arm64-1.2.4.tgz", + "integrity": "sha512-excjX8DfsIcJ10x1Kzr4RcWe1edC9PquDRRPx3YVCvQv+U5p7Yin2s32ftzikXojb1PIFc/9Mt28/y+iRklkrw==", + "cpu": [ + "arm64" + ], + "license": "LGPL-3.0-or-later", + "optional": true, + "os": [ + "linux" + ], + "funding": { + "url": "https://opencollective.com/libvips" + } + }, + "node_modules/@img/sharp-libvips-linux-x64": { + "version": "1.2.4", + "resolved": "https://registry.npmjs.org/@img/sharp-libvips-linux-x64/-/sharp-libvips-linux-x64-1.2.4.tgz", + "integrity": "sha512-tJxiiLsmHc9Ax1bz3oaOYBURTXGIRDODBqhveVHonrHJ9/+k89qbLl0bcJns+e4t4rvaNBxaEZsFtSfAdquPrw==", + "cpu": [ + "x64" + ], + "license": "LGPL-3.0-or-later", + "optional": true, + "os": [ + "linux" + ], + "funding": { + "url": "https://opencollective.com/libvips" + } + }, + "node_modules/@img/sharp-libvips-linuxmusl-arm64": { + "version": "1.2.4", + "resolved": "https://registry.npmjs.org/@img/sharp-libvips-linuxmusl-arm64/-/sharp-libvips-linuxmusl-arm64-1.2.4.tgz", + "integrity": "sha512-FVQHuwx1IIuNow9QAbYUzJ+En8KcVm9Lk5+uGUQJHaZmMECZmOlix9HnH7n1TRkXMS0pGxIJokIVB9SuqZGGXw==", + "cpu": [ + "arm64" + ], + "license": "LGPL-3.0-or-later", + "optional": true, + "os": [ + "linux" + ], + "funding": { + "url": "https://opencollective.com/libvips" + } + }, + "node_modules/@img/sharp-libvips-linuxmusl-x64": { + "version": "1.2.4", + "resolved": "https://registry.npmjs.org/@img/sharp-libvips-linuxmusl-x64/-/sharp-libvips-linuxmusl-x64-1.2.4.tgz", + "integrity": 
"sha512-+LpyBk7L44ZIXwz/VYfglaX/okxezESc6UxDSoyo2Ks6Jxc4Y7sGjpgU9s4PMgqgjj1gZCylTieNamqA1MF7Dg==", + "cpu": [ + "x64" + ], + "license": "LGPL-3.0-or-later", + "optional": true, + "os": [ + "linux" + ], + "funding": { + "url": "https://opencollective.com/libvips" + } + }, + "node_modules/@img/sharp-linux-arm": { + "version": "0.34.5", + "resolved": "https://registry.npmjs.org/@img/sharp-linux-arm/-/sharp-linux-arm-0.34.5.tgz", + "integrity": "sha512-9dLqsvwtg1uuXBGZKsxem9595+ujv0sJ6Vi8wcTANSFpwV/GONat5eCkzQo/1O6zRIkh0m/8+5BjrRr7jDUSZw==", + "cpu": [ + "arm" + ], + "license": "Apache-2.0", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": "^18.17.0 || ^20.3.0 || >=21.0.0" + }, + "funding": { + "url": "https://opencollective.com/libvips" + }, + "optionalDependencies": { + "@img/sharp-libvips-linux-arm": "1.2.4" + } + }, + "node_modules/@img/sharp-linux-arm64": { + "version": "0.34.5", + "resolved": "https://registry.npmjs.org/@img/sharp-linux-arm64/-/sharp-linux-arm64-0.34.5.tgz", + "integrity": "sha512-bKQzaJRY/bkPOXyKx5EVup7qkaojECG6NLYswgktOZjaXecSAeCWiZwwiFf3/Y+O1HrauiE3FVsGxFg8c24rZg==", + "cpu": [ + "arm64" + ], + "license": "Apache-2.0", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": "^18.17.0 || ^20.3.0 || >=21.0.0" + }, + "funding": { + "url": "https://opencollective.com/libvips" + }, + "optionalDependencies": { + "@img/sharp-libvips-linux-arm64": "1.2.4" + } + }, + "node_modules/@img/sharp-linux-x64": { + "version": "0.34.5", + "resolved": "https://registry.npmjs.org/@img/sharp-linux-x64/-/sharp-linux-x64-0.34.5.tgz", + "integrity": "sha512-MEzd8HPKxVxVenwAa+JRPwEC7QFjoPWuS5NZnBt6B3pu7EG2Ge0id1oLHZpPJdn3OQK+BQDiw9zStiHBTJQQQQ==", + "cpu": [ + "x64" + ], + "license": "Apache-2.0", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": "^18.17.0 || ^20.3.0 || >=21.0.0" + }, + "funding": { + "url": "https://opencollective.com/libvips" + }, + "optionalDependencies": { + "@img/sharp-libvips-linux-x64": 
"1.2.4" + } + }, + "node_modules/@img/sharp-linuxmusl-arm64": { + "version": "0.34.5", + "resolved": "https://registry.npmjs.org/@img/sharp-linuxmusl-arm64/-/sharp-linuxmusl-arm64-0.34.5.tgz", + "integrity": "sha512-fprJR6GtRsMt6Kyfq44IsChVZeGN97gTD331weR1ex1c1rypDEABN6Tm2xa1wE6lYb5DdEnk03NZPqA7Id21yg==", + "cpu": [ + "arm64" + ], + "license": "Apache-2.0", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": "^18.17.0 || ^20.3.0 || >=21.0.0" + }, + "funding": { + "url": "https://opencollective.com/libvips" + }, + "optionalDependencies": { + "@img/sharp-libvips-linuxmusl-arm64": "1.2.4" + } + }, + "node_modules/@img/sharp-linuxmusl-x64": { + "version": "0.34.5", + "resolved": "https://registry.npmjs.org/@img/sharp-linuxmusl-x64/-/sharp-linuxmusl-x64-0.34.5.tgz", + "integrity": "sha512-Jg8wNT1MUzIvhBFxViqrEhWDGzqymo3sV7z7ZsaWbZNDLXRJZoRGrjulp60YYtV4wfY8VIKcWidjojlLcWrd8Q==", + "cpu": [ + "x64" + ], + "license": "Apache-2.0", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": "^18.17.0 || ^20.3.0 || >=21.0.0" + }, + "funding": { + "url": "https://opencollective.com/libvips" + }, + "optionalDependencies": { + "@img/sharp-libvips-linuxmusl-x64": "1.2.4" + } + }, + "node_modules/@img/sharp-win32-arm64": { + "version": "0.34.5", + "resolved": "https://registry.npmjs.org/@img/sharp-win32-arm64/-/sharp-win32-arm64-0.34.5.tgz", + "integrity": "sha512-WQ3AgWCWYSb2yt+IG8mnC6Jdk9Whs7O0gxphblsLvdhSpSTtmu69ZG1Gkb6NuvxsNACwiPV6cNSZNzt0KPsw7g==", + "cpu": [ + "arm64" + ], + "license": "Apache-2.0 AND LGPL-3.0-or-later", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": "^18.17.0 || ^20.3.0 || >=21.0.0" + }, + "funding": { + "url": "https://opencollective.com/libvips" + } + }, + "node_modules/@img/sharp-win32-x64": { + "version": "0.34.5", + "resolved": "https://registry.npmjs.org/@img/sharp-win32-x64/-/sharp-win32-x64-0.34.5.tgz", + "integrity": 
"sha512-+29YMsqY2/9eFEiW93eqWnuLcWcufowXewwSNIT6UwZdUUCrM3oFjMWH/Z6/TMmb4hlFenmfAVbpWeup2jryCw==", + "cpu": [ + "x64" + ], + "license": "Apache-2.0 AND LGPL-3.0-or-later", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": "^18.17.0 || ^20.3.0 || >=21.0.0" + }, + "funding": { + "url": "https://opencollective.com/libvips" + } + }, + "node_modules/@modelcontextprotocol/sdk": { + "version": "1.29.0", + "resolved": "https://registry.npmjs.org/@modelcontextprotocol/sdk/-/sdk-1.29.0.tgz", + "integrity": "sha512-zo37mZA9hJWpULgkRpowewez1y6ML5GsXJPY8FI0tBBCd77HEvza4jDqRKOXgHNn867PVGCyTdzqpz0izu5ZjQ==", + "license": "MIT", + "dependencies": { + "@hono/node-server": "^1.19.9", + "ajv": "^8.17.1", + "ajv-formats": "^3.0.1", + "content-type": "^1.0.5", + "cors": "^2.8.5", + "cross-spawn": "^7.0.5", + "eventsource": "^3.0.2", + "eventsource-parser": "^3.0.0", + "express": "^5.2.1", + "express-rate-limit": "^8.2.1", + "hono": "^4.11.4", + "jose": "^6.1.3", + "json-schema-typed": "^8.0.2", + "pkce-challenge": "^5.0.0", + "raw-body": "^3.0.0", + "zod": "^3.25 || ^4.0", + "zod-to-json-schema": "^3.25.1" + }, + "engines": { + "node": ">=18" + }, + "peerDependencies": { + "@cfworker/json-schema": "^4.1.1", + "zod": "^3.25 || ^4.0" + }, + "peerDependenciesMeta": { + "@cfworker/json-schema": { + "optional": true + }, + "zod": { + "optional": false + } + } + }, + "node_modules/accepts": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/accepts/-/accepts-2.0.0.tgz", + "integrity": "sha512-5cvg6CtKwfgdmVqY1WIiXKc3Q1bkRqGLi+2W/6ao+6Y7gu/RCwRuAhGEzh5B4KlszSuTLgZYuqFqo5bImjNKng==", + "license": "MIT", + "dependencies": { + "mime-types": "^3.0.0", + "negotiator": "^1.0.0" + }, + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/ajv": { + "version": "8.18.0", + "resolved": "https://registry.npmjs.org/ajv/-/ajv-8.18.0.tgz", + "integrity": "sha512-PlXPeEWMXMZ7sPYOHqmDyCJzcfNrUr3fGNKtezX14ykXOEIvyK81d+qydx89KY5O71FKMPaQ2vBfBFI5NHR63A==", + "license": 
"MIT", + "dependencies": { + "fast-deep-equal": "^3.1.3", + "fast-uri": "^3.0.1", + "json-schema-traverse": "^1.0.0", + "require-from-string": "^2.0.2" + }, + "funding": { + "type": "github", + "url": "https://github.com/sponsors/epoberezkin" + } + }, + "node_modules/ajv-formats": { + "version": "3.0.1", + "resolved": "https://registry.npmjs.org/ajv-formats/-/ajv-formats-3.0.1.tgz", + "integrity": "sha512-8iUql50EUR+uUcdRQ3HDqa6EVyo3docL8g5WJ3FNcWmu62IbkGUue/pEyLBW8VGKKucTPgqeks4fIU1DA4yowQ==", + "license": "MIT", + "dependencies": { + "ajv": "^8.0.0" + }, + "peerDependencies": { + "ajv": "^8.0.0" + }, + "peerDependenciesMeta": { + "ajv": { + "optional": true + } + } + }, + "node_modules/body-parser": { + "version": "2.2.2", + "resolved": "https://registry.npmjs.org/body-parser/-/body-parser-2.2.2.tgz", + "integrity": "sha512-oP5VkATKlNwcgvxi0vM0p/D3n2C3EReYVX+DNYs5TjZFn/oQt2j+4sVJtSMr18pdRr8wjTcBl6LoV+FUwzPmNA==", + "license": "MIT", + "dependencies": { + "bytes": "^3.1.2", + "content-type": "^1.0.5", + "debug": "^4.4.3", + "http-errors": "^2.0.0", + "iconv-lite": "^0.7.0", + "on-finished": "^2.4.1", + "qs": "^6.14.1", + "raw-body": "^3.0.1", + "type-is": "^2.0.1" + }, + "engines": { + "node": ">=18" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/bytes": { + "version": "3.1.2", + "resolved": "https://registry.npmjs.org/bytes/-/bytes-3.1.2.tgz", + "integrity": "sha512-/Nf7TyzTx6S3yRJObOAV7956r8cr2+Oj8AC5dt8wSP3BQAoeX58NoHyCU8P8zGkNXStjTSi6fzO6F0pBdcYbEg==", + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/call-bind-apply-helpers": { + "version": "1.0.2", + "resolved": "https://registry.npmjs.org/call-bind-apply-helpers/-/call-bind-apply-helpers-1.0.2.tgz", + "integrity": "sha512-Sp1ablJ0ivDkSzjcaJdxEunN5/XvksFJ2sMBFfq6x0ryhQV/2b/KwFe21cMpmHtPOSij8K99/wSfoEuTObmuMQ==", + "license": "MIT", + "dependencies": { + "es-errors": "^1.3.0", + "function-bind": "^1.1.2" + }, 
+ "engines": { + "node": ">= 0.4" + } + }, + "node_modules/call-bound": { + "version": "1.0.4", + "resolved": "https://registry.npmjs.org/call-bound/-/call-bound-1.0.4.tgz", + "integrity": "sha512-+ys997U96po4Kx/ABpBCqhA9EuxJaQWDQg7295H4hBphv3IZg0boBKuwYpt4YXp6MZ5AmZQnU/tyMTlRpaSejg==", + "license": "MIT", + "dependencies": { + "call-bind-apply-helpers": "^1.0.2", + "get-intrinsic": "^1.3.0" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/content-disposition": { + "version": "1.0.1", + "resolved": "https://registry.npmjs.org/content-disposition/-/content-disposition-1.0.1.tgz", + "integrity": "sha512-oIXISMynqSqm241k6kcQ5UwttDILMK4BiurCfGEREw6+X9jkkpEe5T9FZaApyLGGOnFuyMWZpdolTXMtvEJ08Q==", + "license": "MIT", + "engines": { + "node": ">=18" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/content-type": { + "version": "1.0.5", + "resolved": "https://registry.npmjs.org/content-type/-/content-type-1.0.5.tgz", + "integrity": "sha512-nTjqfcBFEipKdXCv4YDQWCfmcLZKm81ldF0pAopTvyrFGVbcR6P/VAAd5G7N+0tTr8QqiU0tFadD6FK4NtJwOA==", + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/cookie": { + "version": "0.7.2", + "resolved": "https://registry.npmjs.org/cookie/-/cookie-0.7.2.tgz", + "integrity": "sha512-yki5XnKuf750l50uGTllt6kKILY4nQ1eNIQatoXEByZ5dWgnKqbnqmTrBE5B4N7lrMJKQ2ytWMiTO2o0v6Ew/w==", + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/cookie-signature": { + "version": "1.2.2", + "resolved": "https://registry.npmjs.org/cookie-signature/-/cookie-signature-1.2.2.tgz", + "integrity": "sha512-D76uU73ulSXrD1UXF4KE2TMxVVwhsnCgfAyTg9k8P6KGZjlXKrOLe4dJQKI3Bxi5wjesZoFXJWElNWBjPZMbhg==", + "license": "MIT", + "engines": { + "node": ">=6.6.0" + } + }, + "node_modules/cors": { + "version": "2.8.6", + "resolved": "https://registry.npmjs.org/cors/-/cors-2.8.6.tgz", + 
"integrity": "sha512-tJtZBBHA6vjIAaF6EnIaq6laBBP9aq/Y3ouVJjEfoHbRBcHBAHYcMh/w8LDrk2PvIMMq8gmopa5D4V8RmbrxGw==", + "license": "MIT", + "dependencies": { + "object-assign": "^4", + "vary": "^1" + }, + "engines": { + "node": ">= 0.10" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/cross-spawn": { + "version": "7.0.6", + "resolved": "https://registry.npmjs.org/cross-spawn/-/cross-spawn-7.0.6.tgz", + "integrity": "sha512-uV2QOWP2nWzsy2aMp8aRibhi9dlzF5Hgh5SHaB9OiTGEyDTiJJyx0uy51QXdyWbtAHNua4XJzUKca3OzKUd3vA==", + "license": "MIT", + "dependencies": { + "path-key": "^3.1.0", + "shebang-command": "^2.0.0", + "which": "^2.0.1" + }, + "engines": { + "node": ">= 8" + } + }, + "node_modules/debug": { + "version": "4.4.3", + "resolved": "https://registry.npmjs.org/debug/-/debug-4.4.3.tgz", + "integrity": "sha512-RGwwWnwQvkVfavKVt22FGLw+xYSdzARwm0ru6DhTVA3umU5hZc28V3kO4stgYryrTlLpuvgI9GiijltAjNbcqA==", + "license": "MIT", + "dependencies": { + "ms": "^2.1.3" + }, + "engines": { + "node": ">=6.0" + }, + "peerDependenciesMeta": { + "supports-color": { + "optional": true + } + } + }, + "node_modules/depd": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/depd/-/depd-2.0.0.tgz", + "integrity": "sha512-g7nH6P6dyDioJogAAGprGpCtVImJhpPk/roCzdb3fIh61/s/nPsfR6onyMwkCAR/OlC3yBC0lESvUoQEAssIrw==", + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/dotenv": { + "version": "16.6.1", + "resolved": "https://registry.npmjs.org/dotenv/-/dotenv-16.6.1.tgz", + "integrity": "sha512-uBq4egWHTcTt33a72vpSG0z3HnPuIl6NqYcTrKEg2azoEyl2hpW0zqlxysq2pK9HlDIHyHyakeYaYnSAwd8bow==", + "license": "BSD-2-Clause", + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://dotenvx.com" + } + }, + "node_modules/dunder-proto": { + "version": "1.0.1", + "resolved": "https://registry.npmjs.org/dunder-proto/-/dunder-proto-1.0.1.tgz", + "integrity": 
"sha512-KIN/nDJBQRcXw0MLVhZE9iQHmG68qAVIBg9CqmUYjmQIhgij9U5MFvrqkUL5FbtyyzZuOeOt0zdeRe4UY7ct+A==", + "license": "MIT", + "dependencies": { + "call-bind-apply-helpers": "^1.0.1", + "es-errors": "^1.3.0", + "gopd": "^1.2.0" + }, + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/ee-first": { + "version": "1.1.1", + "resolved": "https://registry.npmjs.org/ee-first/-/ee-first-1.1.1.tgz", + "integrity": "sha512-WMwm9LhRUo+WUaRN+vRuETqG89IgZphVSNkdFgeb6sS/E4OrDIN7t48CAewSHXc6C8lefD8KKfr5vY61brQlow==", + "license": "MIT" + }, + "node_modules/encodeurl": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/encodeurl/-/encodeurl-2.0.0.tgz", + "integrity": "sha512-Q0n9HRi4m6JuGIV1eFlmvJB7ZEVxu93IrMyiMsGC0lrMJMWzRgx6WGquyfQgZVb31vhGgXnfmPNNXmxnOkRBrg==", + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/es-define-property": { + "version": "1.0.1", + "resolved": "https://registry.npmjs.org/es-define-property/-/es-define-property-1.0.1.tgz", + "integrity": "sha512-e3nRfgfUZ4rNGL232gUgX06QNyyez04KdjFrF+LTRoOXmrOgFKDg4BCdsjW8EnT69eqdYGmRpJwiPVYNrCaW3g==", + "license": "MIT", + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/es-errors": { + "version": "1.3.0", + "resolved": "https://registry.npmjs.org/es-errors/-/es-errors-1.3.0.tgz", + "integrity": "sha512-Zf5H2Kxt2xjTvbJvP2ZWLEICxA6j+hAmMzIlypy4xcBg1vKVnx89Wy0GbS+kf5cwCVFFzdCFh2XSCFNULS6csw==", + "license": "MIT", + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/es-object-atoms": { + "version": "1.1.1", + "resolved": "https://registry.npmjs.org/es-object-atoms/-/es-object-atoms-1.1.1.tgz", + "integrity": "sha512-FGgH2h8zKNim9ljj7dankFPcICIK9Cp5bm+c2gQSYePhpaG5+esrLODihIorn+Pe6FGJzWhXQotPv73jTaldXA==", + "license": "MIT", + "dependencies": { + "es-errors": "^1.3.0" + }, + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/esbuild": { + "version": "0.27.7", + "resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.27.7.tgz", + "integrity": 
"sha512-IxpibTjyVnmrIQo5aqNpCgoACA/dTKLTlhMHihVHhdkxKyPO1uBBthumT0rdHmcsk9uMonIWS0m4FljWzILh3w==", + "dev": true, + "hasInstallScript": true, + "license": "MIT", + "bin": { + "esbuild": "bin/esbuild" + }, + "engines": { + "node": ">=18" + }, + "optionalDependencies": { + "@esbuild/aix-ppc64": "0.27.7", + "@esbuild/android-arm": "0.27.7", + "@esbuild/android-arm64": "0.27.7", + "@esbuild/android-x64": "0.27.7", + "@esbuild/darwin-arm64": "0.27.7", + "@esbuild/darwin-x64": "0.27.7", + "@esbuild/freebsd-arm64": "0.27.7", + "@esbuild/freebsd-x64": "0.27.7", + "@esbuild/linux-arm": "0.27.7", + "@esbuild/linux-arm64": "0.27.7", + "@esbuild/linux-ia32": "0.27.7", + "@esbuild/linux-loong64": "0.27.7", + "@esbuild/linux-mips64el": "0.27.7", + "@esbuild/linux-ppc64": "0.27.7", + "@esbuild/linux-riscv64": "0.27.7", + "@esbuild/linux-s390x": "0.27.7", + "@esbuild/linux-x64": "0.27.7", + "@esbuild/netbsd-arm64": "0.27.7", + "@esbuild/netbsd-x64": "0.27.7", + "@esbuild/openbsd-arm64": "0.27.7", + "@esbuild/openbsd-x64": "0.27.7", + "@esbuild/openharmony-arm64": "0.27.7", + "@esbuild/sunos-x64": "0.27.7", + "@esbuild/win32-arm64": "0.27.7", + "@esbuild/win32-ia32": "0.27.7", + "@esbuild/win32-x64": "0.27.7" + } + }, + "node_modules/escape-html": { + "version": "1.0.3", + "resolved": "https://registry.npmjs.org/escape-html/-/escape-html-1.0.3.tgz", + "integrity": "sha512-NiSupZ4OeuGwr68lGIeym/ksIZMJodUGOSCZ/FSnTxcrekbvqrgdUxlJOMpijaKZVjAJrWrGs/6Jy8OMuyj9ow==", + "license": "MIT" + }, + "node_modules/etag": { + "version": "1.8.1", + "resolved": "https://registry.npmjs.org/etag/-/etag-1.8.1.tgz", + "integrity": "sha512-aIL5Fx7mawVa300al2BnEE4iNvo1qETxLrPI/o05L7z6go7fCw1J6EQmbK4FmJ2AS7kgVF/KEZWufBfdClMcPg==", + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/eventsource": { + "version": "3.0.7", + "resolved": "https://registry.npmjs.org/eventsource/-/eventsource-3.0.7.tgz", + "integrity": 
"sha512-CRT1WTyuQoD771GW56XEZFQ/ZoSfWid1alKGDYMmkt2yl8UXrVR4pspqWNEcqKvVIzg6PAltWjxcSSPrboA4iA==", + "license": "MIT", + "dependencies": { + "eventsource-parser": "^3.0.1" + }, + "engines": { + "node": ">=18.0.0" + } + }, + "node_modules/eventsource-parser": { + "version": "3.0.6", + "resolved": "https://registry.npmjs.org/eventsource-parser/-/eventsource-parser-3.0.6.tgz", + "integrity": "sha512-Vo1ab+QXPzZ4tCa8SwIHJFaSzy4R6SHf7BY79rFBDf0idraZWAkYrDjDj8uWaSm3S2TK+hJ7/t1CEmZ7jXw+pg==", + "license": "MIT", + "engines": { + "node": ">=18.0.0" + } + }, + "node_modules/express": { + "version": "5.2.1", + "resolved": "https://registry.npmjs.org/express/-/express-5.2.1.tgz", + "integrity": "sha512-hIS4idWWai69NezIdRt2xFVofaF4j+6INOpJlVOLDO8zXGpUVEVzIYk12UUi2JzjEzWL3IOAxcTubgz9Po0yXw==", + "license": "MIT", + "peer": true, + "dependencies": { + "accepts": "^2.0.0", + "body-parser": "^2.2.1", + "content-disposition": "^1.0.0", + "content-type": "^1.0.5", + "cookie": "^0.7.1", + "cookie-signature": "^1.2.1", + "debug": "^4.4.0", + "depd": "^2.0.0", + "encodeurl": "^2.0.0", + "escape-html": "^1.0.3", + "etag": "^1.8.1", + "finalhandler": "^2.1.0", + "fresh": "^2.0.0", + "http-errors": "^2.0.0", + "merge-descriptors": "^2.0.0", + "mime-types": "^3.0.0", + "on-finished": "^2.4.1", + "once": "^1.4.0", + "parseurl": "^1.3.3", + "proxy-addr": "^2.0.7", + "qs": "^6.14.0", + "range-parser": "^1.2.1", + "router": "^2.2.0", + "send": "^1.1.0", + "serve-static": "^2.2.0", + "statuses": "^2.0.1", + "type-is": "^2.0.1", + "vary": "^1.1.2" + }, + "engines": { + "node": ">= 18" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/express-rate-limit": { + "version": "8.3.2", + "resolved": "https://registry.npmjs.org/express-rate-limit/-/express-rate-limit-8.3.2.tgz", + "integrity": "sha512-77VmFeJkO0/rvimEDuUC5H30oqUC4EyOhyGccfqoLebB0oiEYfM7nwPrsDsBL1gsTpwfzX8SFy2MT3TDyRq+bg==", + "license": "MIT", + "dependencies": { + 
"ip-address": "10.1.0" + }, + "engines": { + "node": ">= 16" + }, + "funding": { + "url": "https://github.com/sponsors/express-rate-limit" + }, + "peerDependencies": { + "express": ">= 4.11" + } + }, + "node_modules/fast-deep-equal": { + "version": "3.1.3", + "resolved": "https://registry.npmjs.org/fast-deep-equal/-/fast-deep-equal-3.1.3.tgz", + "integrity": "sha512-f3qQ9oQy9j2AhBe/H9VC91wLmKBCCU/gDOnKNAYG5hswO7BLKj09Hc5HYNz9cGI++xlpDCIgDaitVs03ATR84Q==", + "license": "MIT" + }, + "node_modules/fast-uri": { + "version": "3.1.0", + "resolved": "https://registry.npmjs.org/fast-uri/-/fast-uri-3.1.0.tgz", + "integrity": "sha512-iPeeDKJSWf4IEOasVVrknXpaBV0IApz/gp7S2bb7Z4Lljbl2MGJRqInZiUrQwV16cpzw/D3S5j5Julj/gT52AA==", + "funding": [ + { + "type": "github", + "url": "https://github.com/sponsors/fastify" + }, + { + "type": "opencollective", + "url": "https://opencollective.com/fastify" + } + ], + "license": "BSD-3-Clause" + }, + "node_modules/finalhandler": { + "version": "2.1.1", + "resolved": "https://registry.npmjs.org/finalhandler/-/finalhandler-2.1.1.tgz", + "integrity": "sha512-S8KoZgRZN+a5rNwqTxlZZePjT/4cnm0ROV70LedRHZ0p8u9fRID0hJUZQpkKLzro8LfmC8sx23bY6tVNxv8pQA==", + "license": "MIT", + "dependencies": { + "debug": "^4.4.0", + "encodeurl": "^2.0.0", + "escape-html": "^1.0.3", + "on-finished": "^2.4.1", + "parseurl": "^1.3.3", + "statuses": "^2.0.1" + }, + "engines": { + "node": ">= 18.0.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/forwarded": { + "version": "0.2.0", + "resolved": "https://registry.npmjs.org/forwarded/-/forwarded-0.2.0.tgz", + "integrity": "sha512-buRG0fpBtRHSTCOASe6hD258tEubFoRLb4ZNA6NxMVHNw2gOcwHo9wyablzMzOA5z9xA9L1KNjk/Nt6MT9aYow==", + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/fresh": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/fresh/-/fresh-2.0.0.tgz", + "integrity": 
"sha512-Rx/WycZ60HOaqLKAi6cHRKKI7zxWbJ31MhntmtwMoaTeF7XFH9hhBp8vITaMidfljRQ6eYWCKkaTK+ykVJHP2A==", + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/fsevents": { + "version": "2.3.3", + "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.3.tgz", + "integrity": "sha512-5xoDfX+fL7faATnagmWPpbFtwh/R77WmMMqqHGS65C3vvB0YHrgF+B1YmZ3441tMj5n63k0212XNoJwzlhffQw==", + "dev": true, + "hasInstallScript": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": "^8.16.0 || ^10.6.0 || >=11.0.0" + } + }, + "node_modules/function-bind": { + "version": "1.1.2", + "resolved": "https://registry.npmjs.org/function-bind/-/function-bind-1.1.2.tgz", + "integrity": "sha512-7XHNxH7qX9xG5mIwxkhumTox/MIRNcOgDrxWsMt2pAr23WHp6MrRlN7FBSFpCpr+oVO0F744iUgR82nJMfG2SA==", + "license": "MIT", + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/get-intrinsic": { + "version": "1.3.0", + "resolved": "https://registry.npmjs.org/get-intrinsic/-/get-intrinsic-1.3.0.tgz", + "integrity": "sha512-9fSjSaos/fRIVIp+xSJlE6lfwhES7LNtKaCBIamHsjr2na1BiABJPo0mOjjz8GJDURarmCPGqaiVg5mfjb98CQ==", + "license": "MIT", + "dependencies": { + "call-bind-apply-helpers": "^1.0.2", + "es-define-property": "^1.0.1", + "es-errors": "^1.3.0", + "es-object-atoms": "^1.1.1", + "function-bind": "^1.1.2", + "get-proto": "^1.0.1", + "gopd": "^1.2.0", + "has-symbols": "^1.1.0", + "hasown": "^2.0.2", + "math-intrinsics": "^1.1.0" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/get-proto": { + "version": "1.0.1", + "resolved": "https://registry.npmjs.org/get-proto/-/get-proto-1.0.1.tgz", + "integrity": "sha512-sTSfBjoXBp89JvIKIefqw7U2CCebsc74kiY6awiGogKtoSGbgjYE/G/+l9sF3MWFPNc9IcoOC4ODfKHfxFmp0g==", + "license": "MIT", + "dependencies": { + "dunder-proto": "^1.0.1", + "es-object-atoms": "^1.0.0" + }, + "engines": { + "node": ">= 0.4" + } + }, 
+ "node_modules/get-tsconfig": { + "version": "4.13.7", + "resolved": "https://registry.npmjs.org/get-tsconfig/-/get-tsconfig-4.13.7.tgz", + "integrity": "sha512-7tN6rFgBlMgpBML5j8typ92BKFi2sFQvIdpAqLA2beia5avZDrMs0FLZiM5etShWq5irVyGcGMEA1jcDaK7A/Q==", + "dev": true, + "license": "MIT", + "dependencies": { + "resolve-pkg-maps": "^1.0.0" + }, + "funding": { + "url": "https://github.com/privatenumber/get-tsconfig?sponsor=1" + } + }, + "node_modules/gopd": { + "version": "1.2.0", + "resolved": "https://registry.npmjs.org/gopd/-/gopd-1.2.0.tgz", + "integrity": "sha512-ZUKRh6/kUFoAiTAtTYPZJ3hw9wNxx+BIBOijnlG9PnrJsCcSjs1wyyD6vJpaYtgnzDrKYRSqf3OO6Rfa93xsRg==", + "license": "MIT", + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/has-symbols": { + "version": "1.1.0", + "resolved": "https://registry.npmjs.org/has-symbols/-/has-symbols-1.1.0.tgz", + "integrity": "sha512-1cDNdwJ2Jaohmb3sg4OmKaMBwuC48sYni5HUw2DvsC8LjGTLK9h+eb1X6RyuOHe4hT0ULCW68iomhjUoKUqlPQ==", + "license": "MIT", + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/hasown": { + "version": "2.0.2", + "resolved": "https://registry.npmjs.org/hasown/-/hasown-2.0.2.tgz", + "integrity": "sha512-0hJU9SCPvmMzIBdZFqNPXWa6dqh7WdH0cII9y+CyS8rG3nL48Bclra9HmKhVVUHyPWNH5Y7xDwAB7bfgSjkUMQ==", + "license": "MIT", + "dependencies": { + "function-bind": "^1.1.2" + }, + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/hono": { + "version": "4.12.12", + "resolved": "https://registry.npmjs.org/hono/-/hono-4.12.12.tgz", + "integrity": "sha512-p1JfQMKaceuCbpJKAPKVqyqviZdS0eUxH9v82oWo1kb9xjQ5wA6iP3FNVAPDFlz5/p7d45lO+BpSk1tuSZMF4Q==", + "license": "MIT", + "peer": true, + "engines": { + "node": ">=16.9.0" + } + }, + "node_modules/http-errors": { + "version": "2.0.1", + "resolved": "https://registry.npmjs.org/http-errors/-/http-errors-2.0.1.tgz", + "integrity": 
"sha512-4FbRdAX+bSdmo4AUFuS0WNiPz8NgFt+r8ThgNWmlrjQjt1Q7ZR9+zTlce2859x4KSXrwIsaeTqDoKQmtP8pLmQ==", + "license": "MIT", + "dependencies": { + "depd": "~2.0.0", + "inherits": "~2.0.4", + "setprototypeof": "~1.2.0", + "statuses": "~2.0.2", + "toidentifier": "~1.0.1" + }, + "engines": { + "node": ">= 0.8" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/iconv-lite": { + "version": "0.7.2", + "resolved": "https://registry.npmjs.org/iconv-lite/-/iconv-lite-0.7.2.tgz", + "integrity": "sha512-im9DjEDQ55s9fL4EYzOAv0yMqmMBSZp6G0VvFyTMPKWxiSBHUj9NW/qqLmXUwXrrM7AvqSlTCfvqRb0cM8yYqw==", + "license": "MIT", + "dependencies": { + "safer-buffer": ">= 2.1.2 < 3.0.0" + }, + "engines": { + "node": ">=0.10.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/inherits": { + "version": "2.0.4", + "resolved": "https://registry.npmjs.org/inherits/-/inherits-2.0.4.tgz", + "integrity": "sha512-k/vGaX4/Yla3WzyMCvTQOXYeIHvqOKtnqBduzTHpzpQZzAskKMhZ2K+EnBiSM9zGSoIFeMpXKxa4dYeZIQqewQ==", + "license": "ISC" + }, + "node_modules/ip-address": { + "version": "10.1.0", + "resolved": "https://registry.npmjs.org/ip-address/-/ip-address-10.1.0.tgz", + "integrity": "sha512-XXADHxXmvT9+CRxhXg56LJovE+bmWnEWB78LB83VZTprKTmaC5QfruXocxzTZ2Kl0DNwKuBdlIhjL8LeY8Sf8Q==", + "license": "MIT", + "engines": { + "node": ">= 12" + } + }, + "node_modules/ipaddr.js": { + "version": "1.9.1", + "resolved": "https://registry.npmjs.org/ipaddr.js/-/ipaddr.js-1.9.1.tgz", + "integrity": "sha512-0KI/607xoxSToH7GjN1FfSbLoU0+btTicjsQSWQlh/hZykN8KpmMf7uYwPW3R+akZ6R/w18ZlXSHBYXiYUPO3g==", + "license": "MIT", + "engines": { + "node": ">= 0.10" + } + }, + "node_modules/is-promise": { + "version": "4.0.0", + "resolved": "https://registry.npmjs.org/is-promise/-/is-promise-4.0.0.tgz", + "integrity": "sha512-hvpoI6korhJMnej285dSg6nu1+e6uxs7zG3BYAm5byqDsgJNWwxzM6z6iZiAgQR4TJ30JmBTOwqZUw3WlyH3AQ==", + 
"license": "MIT" + }, + "node_modules/isexe": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/isexe/-/isexe-2.0.0.tgz", + "integrity": "sha512-RHxMLp9lnKHGHRng9QFhRCMbYAcVpn69smSGcq3f36xjgVVWThj4qqLbTLlq7Ssj8B+fIQ1EuCEGI2lKsyQeIw==", + "license": "ISC" + }, + "node_modules/jose": { + "version": "6.2.2", + "resolved": "https://registry.npmjs.org/jose/-/jose-6.2.2.tgz", + "integrity": "sha512-d7kPDd34KO/YnzaDOlikGpOurfF0ByC2sEV4cANCtdqLlTfBlw2p14O/5d/zv40gJPbIQxfES3nSx1/oYNyuZQ==", + "license": "MIT", + "funding": { + "url": "https://github.com/sponsors/panva" + } + }, + "node_modules/json-schema-to-ts": { + "version": "3.1.1", + "resolved": "https://registry.npmjs.org/json-schema-to-ts/-/json-schema-to-ts-3.1.1.tgz", + "integrity": "sha512-+DWg8jCJG2TEnpy7kOm/7/AxaYoaRbjVB4LFZLySZlWn8exGs3A4OLJR966cVvU26N7X9TWxl+Jsw7dzAqKT6g==", + "license": "MIT", + "dependencies": { + "@babel/runtime": "^7.18.3", + "ts-algebra": "^2.0.0" + }, + "engines": { + "node": ">=16" + } + }, + "node_modules/json-schema-traverse": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/json-schema-traverse/-/json-schema-traverse-1.0.0.tgz", + "integrity": "sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug==", + "license": "MIT" + }, + "node_modules/json-schema-typed": { + "version": "8.0.2", + "resolved": "https://registry.npmjs.org/json-schema-typed/-/json-schema-typed-8.0.2.tgz", + "integrity": "sha512-fQhoXdcvc3V28x7C7BMs4P5+kNlgUURe2jmUT1T//oBRMDrqy1QPelJimwZGo7Hg9VPV3EQV5Bnq4hbFy2vetA==", + "license": "BSD-2-Clause" + }, + "node_modules/math-intrinsics": { + "version": "1.1.0", + "resolved": "https://registry.npmjs.org/math-intrinsics/-/math-intrinsics-1.1.0.tgz", + "integrity": "sha512-/IXtbwEk5HTPyEwyKX6hGkYXxM9nbj64B+ilVJnC/R6B0pH5G4V3b0pVbL7DBj4tkhBAppbQUlf6F6Xl9LHu1g==", + "license": "MIT", + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/media-typer": { + "version": "1.1.0", + "resolved": 
"https://registry.npmjs.org/media-typer/-/media-typer-1.1.0.tgz", + "integrity": "sha512-aisnrDP4GNe06UcKFnV5bfMNPBUw4jsLGaWwWfnH3v02GnBuXX2MCVn5RbrWo0j3pczUilYblq7fQ7Nw2t5XKw==", + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/merge-descriptors": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/merge-descriptors/-/merge-descriptors-2.0.0.tgz", + "integrity": "sha512-Snk314V5ayFLhp3fkUREub6WtjBfPdCPY1Ln8/8munuLuiYhsABgBVWsozAG+MWMbVEvcdcpbi9R7ww22l9Q3g==", + "license": "MIT", + "engines": { + "node": ">=18" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/mime-db": { + "version": "1.54.0", + "resolved": "https://registry.npmjs.org/mime-db/-/mime-db-1.54.0.tgz", + "integrity": "sha512-aU5EJuIN2WDemCcAp2vFBfp/m4EAhWJnUNSSw0ixs7/kXbd6Pg64EmwJkNdFhB8aWt1sH2CTXrLxo/iAGV3oPQ==", + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/mime-types": { + "version": "3.0.2", + "resolved": "https://registry.npmjs.org/mime-types/-/mime-types-3.0.2.tgz", + "integrity": "sha512-Lbgzdk0h4juoQ9fCKXW4by0UJqj+nOOrI9MJ1sSj4nI8aI2eo1qmvQEie4VD1glsS250n15LsWsYtCugiStS5A==", + "license": "MIT", + "dependencies": { + "mime-db": "^1.54.0" + }, + "engines": { + "node": ">=18" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/ms": { + "version": "2.1.3", + "resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz", + "integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA==", + "license": "MIT" + }, + "node_modules/negotiator": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/negotiator/-/negotiator-1.0.0.tgz", + "integrity": "sha512-8Ofs/AUQh8MaEcrlq5xOX0CQ9ypTF5dl78mjlMNfOK08fzpgTHQRQPBxcPlEtIw0yRpws+Zo/3r+5WRby7u3Gg==", + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/object-assign": { + "version": "4.1.1", 
+ "resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz", + "integrity": "sha512-rJgTQnkUnH1sFw8yT6VSU3zD3sWmu6sZhIseY8VX+GRu3P6F7Fu+JNDoXfklElbLJSnc3FUQHVe4cU5hj+BcUg==", + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/object-inspect": { + "version": "1.13.4", + "resolved": "https://registry.npmjs.org/object-inspect/-/object-inspect-1.13.4.tgz", + "integrity": "sha512-W67iLl4J2EXEGTbfeHCffrjDfitvLANg0UlX3wFUUSTx92KXRFegMHUVgSqE+wvhAbi4WqjGg9czysTV2Epbew==", + "license": "MIT", + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/on-finished": { + "version": "2.4.1", + "resolved": "https://registry.npmjs.org/on-finished/-/on-finished-2.4.1.tgz", + "integrity": "sha512-oVlzkg3ENAhCk2zdv7IJwd/QUD4z2RxRwpkcGY8psCVcCYZNq4wYnVWALHM+brtuJjePWiYF/ClmuDr8Ch5+kg==", + "license": "MIT", + "dependencies": { + "ee-first": "1.1.1" + }, + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/once": { + "version": "1.4.0", + "resolved": "https://registry.npmjs.org/once/-/once-1.4.0.tgz", + "integrity": "sha512-lNaJgI+2Q5URQBkccEKHTQOPaXdUxnZZElQTZY0MFUAuaEqe1E+Nyvgdz/aIyNi6Z9MzO5dv1H8n58/GELp3+w==", + "license": "ISC", + "dependencies": { + "wrappy": "1" + } + }, + "node_modules/parseurl": { + "version": "1.3.3", + "resolved": "https://registry.npmjs.org/parseurl/-/parseurl-1.3.3.tgz", + "integrity": "sha512-CiyeOxFT/JZyN5m0z9PfXw4SCBJ6Sygz1Dpl0wqjlhDEGGBP1GnsUVEL0p63hoG1fcj3fHynXi9NYO4nWOL+qQ==", + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/path-key": { + "version": "3.1.1", + "resolved": "https://registry.npmjs.org/path-key/-/path-key-3.1.1.tgz", + "integrity": "sha512-ojmeN0qd+y0jszEtoY48r0Peq5dwMEkIlCOu6Q5f41lfkswXuKtYrhgoTpLnyIcHm24Uhqx+5Tqm2InSwLhE6Q==", + "license": "MIT", + "engines": { + "node": ">=8" + } + }, + "node_modules/path-to-regexp": { + "version": "8.4.2", + "resolved": 
"https://registry.npmjs.org/path-to-regexp/-/path-to-regexp-8.4.2.tgz", + "integrity": "sha512-qRcuIdP69NPm4qbACK+aDogI5CBDMi1jKe0ry5rSQJz8JVLsC7jV8XpiJjGRLLol3N+R5ihGYcrPLTno6pAdBA==", + "license": "MIT", + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/pkce-challenge": { + "version": "5.0.1", + "resolved": "https://registry.npmjs.org/pkce-challenge/-/pkce-challenge-5.0.1.tgz", + "integrity": "sha512-wQ0b/W4Fr01qtpHlqSqspcj3EhBvimsdh0KlHhH8HRZnMsEa0ea2fTULOXOS9ccQr3om+GcGRk4e+isrZWV8qQ==", + "license": "MIT", + "engines": { + "node": ">=16.20.0" + } + }, + "node_modules/proxy-addr": { + "version": "2.0.7", + "resolved": "https://registry.npmjs.org/proxy-addr/-/proxy-addr-2.0.7.tgz", + "integrity": "sha512-llQsMLSUDUPT44jdrU/O37qlnifitDP+ZwrmmZcoSKyLKvtZxpyV0n2/bD/N4tBAAZ/gJEdZU7KMraoK1+XYAg==", + "license": "MIT", + "dependencies": { + "forwarded": "0.2.0", + "ipaddr.js": "1.9.1" + }, + "engines": { + "node": ">= 0.10" + } + }, + "node_modules/qs": { + "version": "6.15.0", + "resolved": "https://registry.npmjs.org/qs/-/qs-6.15.0.tgz", + "integrity": "sha512-mAZTtNCeetKMH+pSjrb76NAM8V9a05I9aBZOHztWy/UqcJdQYNsf59vrRKWnojAT9Y+GbIvoTBC++CPHqpDBhQ==", + "license": "BSD-3-Clause", + "dependencies": { + "side-channel": "^1.1.0" + }, + "engines": { + "node": ">=0.6" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/range-parser": { + "version": "1.2.1", + "resolved": "https://registry.npmjs.org/range-parser/-/range-parser-1.2.1.tgz", + "integrity": "sha512-Hrgsx+orqoygnmhFbKaHE6c296J+HTAQXoxEF6gNupROmmGJRoyzfG3ccAveqCBrwr/2yxQ5BVd/GTl5agOwSg==", + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/raw-body": { + "version": "3.0.2", + "resolved": "https://registry.npmjs.org/raw-body/-/raw-body-3.0.2.tgz", + "integrity": "sha512-K5zQjDllxWkf7Z5xJdV0/B0WTNqx6vxG70zJE4N0kBs4LovmEYWJzQGxC9bS9RAKu3bgM40lrd5zoLJ12MQ5BA==", + "license": "MIT", + 
"dependencies": { + "bytes": "~3.1.2", + "http-errors": "~2.0.1", + "iconv-lite": "~0.7.0", + "unpipe": "~1.0.0" + }, + "engines": { + "node": ">= 0.10" + } + }, + "node_modules/require-from-string": { + "version": "2.0.2", + "resolved": "https://registry.npmjs.org/require-from-string/-/require-from-string-2.0.2.tgz", + "integrity": "sha512-Xf0nWe6RseziFMu+Ap9biiUbmplq6S9/p+7w7YXP/JBHhrUDDUhwa+vANyubuqfZWTveU//DYVGsDG7RKL/vEw==", + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/resolve-pkg-maps": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/resolve-pkg-maps/-/resolve-pkg-maps-1.0.0.tgz", + "integrity": "sha512-seS2Tj26TBVOC2NIc2rOe2y2ZO7efxITtLZcGSOnHHNOQ7CkiUBfw0Iw2ck6xkIhPwLhKNLS8BO+hEpngQlqzw==", + "dev": true, + "license": "MIT", + "funding": { + "url": "https://github.com/privatenumber/resolve-pkg-maps?sponsor=1" + } + }, + "node_modules/router": { + "version": "2.2.0", + "resolved": "https://registry.npmjs.org/router/-/router-2.2.0.tgz", + "integrity": "sha512-nLTrUKm2UyiL7rlhapu/Zl45FwNgkZGaCpZbIHajDYgwlJCOzLSk+cIPAnsEqV955GjILJnKbdQC1nVPz+gAYQ==", + "license": "MIT", + "dependencies": { + "debug": "^4.4.0", + "depd": "^2.0.0", + "is-promise": "^4.0.0", + "parseurl": "^1.3.3", + "path-to-regexp": "^8.0.0" + }, + "engines": { + "node": ">= 18" + } + }, + "node_modules/safer-buffer": { + "version": "2.1.2", + "resolved": "https://registry.npmjs.org/safer-buffer/-/safer-buffer-2.1.2.tgz", + "integrity": "sha512-YZo3K82SD7Riyi0E1EQPojLz7kpepnSQI9IyPbHHg1XXXevb5dJI7tpyN2ADxGcQbHG7vcyRHk0cbwqcQriUtg==", + "license": "MIT" + }, + "node_modules/send": { + "version": "1.2.1", + "resolved": "https://registry.npmjs.org/send/-/send-1.2.1.tgz", + "integrity": "sha512-1gnZf7DFcoIcajTjTwjwuDjzuz4PPcY2StKPlsGAQ1+YH20IRVrBaXSWmdjowTJ6u8Rc01PoYOGHXfP1mYcZNQ==", + "license": "MIT", + "dependencies": { + "debug": "^4.4.3", + "encodeurl": "^2.0.0", + "escape-html": "^1.0.3", + "etag": "^1.8.1", + "fresh": "^2.0.0", + 
"http-errors": "^2.0.1", + "mime-types": "^3.0.2", + "ms": "^2.1.3", + "on-finished": "^2.4.1", + "range-parser": "^1.2.1", + "statuses": "^2.0.2" + }, + "engines": { + "node": ">= 18" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/serve-static": { + "version": "2.2.1", + "resolved": "https://registry.npmjs.org/serve-static/-/serve-static-2.2.1.tgz", + "integrity": "sha512-xRXBn0pPqQTVQiC8wyQrKs2MOlX24zQ0POGaj0kultvoOCstBQM5yvOhAVSUwOMjQtTvsPWoNCHfPGwaaQJhTw==", + "license": "MIT", + "dependencies": { + "encodeurl": "^2.0.0", + "escape-html": "^1.0.3", + "parseurl": "^1.3.3", + "send": "^1.2.0" + }, + "engines": { + "node": ">= 18" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/setprototypeof": { + "version": "1.2.0", + "resolved": "https://registry.npmjs.org/setprototypeof/-/setprototypeof-1.2.0.tgz", + "integrity": "sha512-E5LDX7Wrp85Kil5bhZv46j8jOeboKq5JMmYM3gVGdGH8xFpPWXUMsNrlODCrkoxMEeNi/XZIwuRvY4XNwYMJpw==", + "license": "ISC" + }, + "node_modules/shebang-command": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/shebang-command/-/shebang-command-2.0.0.tgz", + "integrity": "sha512-kHxr2zZpYtdmrN1qDjrrX/Z1rR1kG8Dx+gkpK1G4eXmvXswmcE1hTWBWYUzlraYw1/yZp6YuDY77YtvbN0dmDA==", + "license": "MIT", + "dependencies": { + "shebang-regex": "^3.0.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/shebang-regex": { + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/shebang-regex/-/shebang-regex-3.0.0.tgz", + "integrity": "sha512-7++dFhtcx3353uBaq8DDR4NuxBetBzC7ZQOhmTQInHEd6bSrXdiEyzCvG07Z44UYdLShWUyXt5M/yhz8ekcb1A==", + "license": "MIT", + "engines": { + "node": ">=8" + } + }, + "node_modules/side-channel": { + "version": "1.1.0", + "resolved": "https://registry.npmjs.org/side-channel/-/side-channel-1.1.0.tgz", + "integrity": 
"sha512-ZX99e6tRweoUXqR+VBrslhda51Nh5MTQwou5tnUDgbtyM0dBgmhEDtWGP/xbKn6hqfPRHujUNwz5fy/wbbhnpw==", + "license": "MIT", + "dependencies": { + "es-errors": "^1.3.0", + "object-inspect": "^1.13.3", + "side-channel-list": "^1.0.0", + "side-channel-map": "^1.0.1", + "side-channel-weakmap": "^1.0.2" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/side-channel-list": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/side-channel-list/-/side-channel-list-1.0.0.tgz", + "integrity": "sha512-FCLHtRD/gnpCiCHEiJLOwdmFP+wzCmDEkc9y7NsYxeF4u7Btsn1ZuwgwJGxImImHicJArLP4R0yX4c2KCrMrTA==", + "license": "MIT", + "dependencies": { + "es-errors": "^1.3.0", + "object-inspect": "^1.13.3" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/side-channel-map": { + "version": "1.0.1", + "resolved": "https://registry.npmjs.org/side-channel-map/-/side-channel-map-1.0.1.tgz", + "integrity": "sha512-VCjCNfgMsby3tTdo02nbjtM/ewra6jPHmpThenkTYh8pG9ucZ/1P8So4u4FGBek/BjpOVsDCMoLA/iuBKIFXRA==", + "license": "MIT", + "dependencies": { + "call-bound": "^1.0.2", + "es-errors": "^1.3.0", + "get-intrinsic": "^1.2.5", + "object-inspect": "^1.13.3" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/side-channel-weakmap": { + "version": "1.0.2", + "resolved": "https://registry.npmjs.org/side-channel-weakmap/-/side-channel-weakmap-1.0.2.tgz", + "integrity": "sha512-WPS/HvHQTYnHisLo9McqBHOJk2FkHO/tlpvldyrnem4aeQp4hai3gythswg6p01oSoTl58rcpiFAjF2br2Ak2A==", + "license": "MIT", + "dependencies": { + "call-bound": "^1.0.2", + "es-errors": "^1.3.0", + "get-intrinsic": "^1.2.5", + "object-inspect": "^1.13.3", + "side-channel-map": "^1.0.1" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + 
"node_modules/statuses": { + "version": "2.0.2", + "resolved": "https://registry.npmjs.org/statuses/-/statuses-2.0.2.tgz", + "integrity": "sha512-DvEy55V3DB7uknRo+4iOGT5fP1slR8wQohVdknigZPMpMstaKJQWhwiYBACJE3Ul2pTnATihhBYnRhZQHGBiRw==", + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/toidentifier": { + "version": "1.0.1", + "resolved": "https://registry.npmjs.org/toidentifier/-/toidentifier-1.0.1.tgz", + "integrity": "sha512-o5sSPKEkg/DIQNmH43V0/uerLrpzVedkUh8tGNvaeXpfpuwjKenlSox/2O/BTlZUtEe+JG7s5YhEz608PlAHRA==", + "license": "MIT", + "engines": { + "node": ">=0.6" + } + }, + "node_modules/ts-algebra": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/ts-algebra/-/ts-algebra-2.0.0.tgz", + "integrity": "sha512-FPAhNPFMrkwz76P7cdjdmiShwMynZYN6SgOujD1urY4oNm80Ou9oMdmbR45LotcKOXoy7wSmHkRFE6Mxbrhefw==", + "license": "MIT" + }, + "node_modules/tsx": { + "version": "4.21.0", + "resolved": "https://registry.npmjs.org/tsx/-/tsx-4.21.0.tgz", + "integrity": "sha512-5C1sg4USs1lfG0GFb2RLXsdpXqBSEhAaA/0kPL01wxzpMqLILNxIxIOKiILz+cdg/pLnOUxFYOR5yhHU666wbw==", + "dev": true, + "license": "MIT", + "dependencies": { + "esbuild": "~0.27.0", + "get-tsconfig": "^4.7.5" + }, + "bin": { + "tsx": "dist/cli.mjs" + }, + "engines": { + "node": ">=18.0.0" + }, + "optionalDependencies": { + "fsevents": "~2.3.3" + } + }, + "node_modules/type-is": { + "version": "2.0.1", + "resolved": "https://registry.npmjs.org/type-is/-/type-is-2.0.1.tgz", + "integrity": "sha512-OZs6gsjF4vMp32qrCbiVSkrFmXtG/AZhY3t0iAMrMBiAZyV9oALtXO8hsrHbMXF9x6L3grlFuwW2oAz7cav+Gw==", + "license": "MIT", + "dependencies": { + "content-type": "^1.0.5", + "media-typer": "^1.1.0", + "mime-types": "^3.0.0" + }, + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/typescript": { + "version": "5.9.3", + "resolved": "https://registry.npmjs.org/typescript/-/typescript-5.9.3.tgz", + "integrity": 
"sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==", + "dev": true, + "license": "Apache-2.0", + "bin": { + "tsc": "bin/tsc", + "tsserver": "bin/tsserver" + }, + "engines": { + "node": ">=14.17" + } + }, + "node_modules/unpipe": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/unpipe/-/unpipe-1.0.0.tgz", + "integrity": "sha512-pjy2bYhSsufwWlKwPc+l3cN7+wuJlK6uz0YdJEOlQDbl6jo/YlPi4mb8agUkVC8BF7V8NuzeyPNqRksA3hztKQ==", + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/vary": { + "version": "1.1.2", + "resolved": "https://registry.npmjs.org/vary/-/vary-1.1.2.tgz", + "integrity": "sha512-BNGbWLfd0eUPabhkXUVm0j8uuvREyTh5ovRa/dyow/BqAbZJyC+5fU+IzQOzmAKzYqYRAISoRhdQr3eIZ/PXqg==", + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/which": { + "version": "2.0.2", + "resolved": "https://registry.npmjs.org/which/-/which-2.0.2.tgz", + "integrity": "sha512-BLI3Tl1TW3Pvl70l3yq3Y64i+awpwXqsGBYWkkqMtnbXgrMD+yj7rhW0kuEDxzJaYXGjEW5ogapKNMEKNMjibA==", + "license": "ISC", + "dependencies": { + "isexe": "^2.0.0" + }, + "bin": { + "node-which": "bin/node-which" + }, + "engines": { + "node": ">= 8" + } + }, + "node_modules/wrappy": { + "version": "1.0.2", + "resolved": "https://registry.npmjs.org/wrappy/-/wrappy-1.0.2.tgz", + "integrity": "sha512-l4Sp/DRseor9wL6EvV2+TuQn63dMkPjZ/sp9XkghTEbV9KlPS1xUsZ3u7/IQO4wxtcFB4bgpQPRcR3QCvezPcQ==", + "license": "ISC" + }, + "node_modules/zod": { + "version": "4.3.6", + "resolved": "https://registry.npmjs.org/zod/-/zod-4.3.6.tgz", + "integrity": "sha512-rftlrkhHZOcjDwkGlnUtZZkvaPHCsDATp4pGpuOOMDaTdDDXF91wuVDJoWoPsKX/3YPQ5fHuF3STjcYyKr+Qhg==", + "license": "MIT", + "peer": true, + "funding": { + "url": "https://github.com/sponsors/colinhacks" + } + }, + "node_modules/zod-to-json-schema": { + "version": "3.25.2", + "resolved": "https://registry.npmjs.org/zod-to-json-schema/-/zod-to-json-schema-3.25.2.tgz", + "integrity": 
"sha512-O/PgfnpT1xKSDeQYSCfRI5Gy3hPf91mKVDuYLUHZJMiDFptvP41MSnWofm8dnCm0256ZNfZIM7DSzuSMAFnjHA==", + "license": "ISC", + "peerDependencies": { + "zod": "^3.25.28 || ^4" + } + } + } +} diff --git a/docs/agent-evaluation/package.json b/docs/agent-evaluation/package.json new file mode 100644 index 000000000..f9812c162 --- /dev/null +++ b/docs/agent-evaluation/package.json @@ -0,0 +1,26 @@ +{ + "name": "outpost-agent-evaluation", + "version": "1.0.0", + "private": true, + "type": "module", + "description": "Claude Agent SDK harness for Outpost onboarding scenario evals", + "scripts": { + "eval": "node --import tsx src/run-agent-eval.ts", + "eval:ci": "node --import tsx src/run-agent-eval.ts --scenarios 01,02", + "smoke:execute-ci": "bash scripts/smoke-test-execute-ci-artifacts.sh", + "eval:tsx-cli": "tsx src/run-agent-eval.ts", + "score": "node --import tsx src/score-eval.ts", + "typecheck": "tsc --noEmit" + }, + "engines": { + "node": ">=18" + }, + "dependencies": { + "@anthropic-ai/claude-agent-sdk": "^0.2.92", + "dotenv": "^16.4.7" + }, + "devDependencies": { + "tsx": "^4.19.4", + "typescript": "^5.8.3" + } +} diff --git a/docs/agent-evaluation/results/.gitignore b/docs/agent-evaluation/results/.gitignore new file mode 100644 index 000000000..3a2f71330 --- /dev/null +++ b/docs/agent-evaluation/results/.gitignore @@ -0,0 +1,5 @@ +# Ignore local run recordings; keep README + template committed +* +!.gitignore +!README.md +!RUN-RECORDING.template.md diff --git a/docs/agent-evaluation/results/README.md b/docs/agent-evaluation/results/README.md new file mode 100644 index 000000000..9fe1615cc --- /dev/null +++ b/docs/agent-evaluation/results/README.md @@ -0,0 +1,57 @@ +# Agent evaluation — results + +This directory holds **manual run write-ups** and, under `**runs/`**, **automated** artifacts from `npm run eval`. Almost everything here is **gitignored** by default (see `[.gitignore](.gitignore)`). + +Full workflow and env vars: `**[../README.md](../README.md)`**. 
+ +--- + +## Automated runs (`runs/`) + +From `docs/agent-evaluation/`: + +```sh +npm run eval -- --scenario 01 +npm run eval -- --scenarios 01,02 +npm run eval -- --all +``` + +Each run is a **directory** (same timestamp stem, all gitignored): + +`runs/-scenario-NN/` + +| Path in run dir | What it is | +| --------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------- | +| `transcript.json` | Full Claude Agent SDK transcript (`meta` + `messages`). | +| `heuristic-score.json` | **Heuristic** transcript checks ([`../src/score-transcript.ts`](../src/score-transcript.ts)); rubrics **01–10** (`scoreScenario01`–`10`). | +| `llm-score.json` | **LLM judge** output ([`../src/llm-judge.ts`](../src/llm-judge.ts)) vs **`## Success criteria`** in the scenario markdown. | +| *(other files)* | Anything the agent **`Write`**s (e.g. `outpost-quickstart.sh`); SDK **`cwd`** is this directory. | + +Legacy flat `runs/-scenario-NN.json` (and `*.score.json` / `*.llm-score.json` beside it) still work with **`npm run score`**. + +Re-score an existing run without re-running the agent: + +```sh +npm run score -- --run results/runs/-scenario-NN --write +npm run score -- --run results/runs/-scenario-NN --llm --write +``` + +**Execution** (curl/SDK against live Outpost with `OUTPOST_API_KEY`) is **not** recorded in these JSON files. Use **`../scripts/execute-ci-artifacts.sh`** after **`eval:ci`**, or the second step in **`.github/workflows/docs-agent-eval-ci.yml`**, and the **Execution (full pass)** rows in [`../scenarios/`](../scenarios/) for human notes. + +--- + +## Manual run recordings + +For **IDE-only** or ad-hoc runs (no `npm run eval`): + +1. Copy [`RUN-RECORDING.template.md`](RUN-RECORDING.template.md) to a **local-only** name (e.g. `2026-04-08-s01-cursor.md`) in this directory. +2. 
Fill in transcript summary, heuristic/LLM pointers if you ran `npm run score` separately, **Execution verification**, and notes. +3. Do not commit raw recordings unless your policy allows it; anonymized summaries in a PR are fine. + +Success criteria for every scenario: **[`../scenarios/*.md`](../scenarios/)** — section **Success criteria**. + +--- + +## Template + +See [`RUN-RECORDING.template.md`](RUN-RECORDING.template.md). \ No newline at end of file diff --git a/docs/agent-evaluation/results/RUN-RECORDING.template.md b/docs/agent-evaluation/results/RUN-RECORDING.template.md new file mode 100644 index 000000000..047b9fa84 --- /dev/null +++ b/docs/agent-evaluation/results/RUN-RECORDING.template.md @@ -0,0 +1,36 @@ +# Agent eval recording (copy this file, rename, fill in) + +**Scenario:** (e.g. `01-basics-curl` — link to `../scenarios/....md`) +**Date:** YYYY-MM-DD +**Agent / client:** (e.g. Cursor Agent, Claude Code, Copilot Chat) +**Model:** (if known) +**Outpost skill enabled?** yes / no + +## Environment + +- Docs / prompt source: (commit SHA or “main @ date”) +- Hookdeck project: throwaway / prod (describe) + +## Transcript summary + +(Optional bullets — do not paste secrets.) + +- Turn 0: … +- Turn 1: … + +## Success criteria (from scenario doc) + +Copy the checklist from the scenario and mark **PASS** / **FAIL** / **N/A**. + +- … + +## Execution verification (full pass) + +Did you run the generated curl / script / app against a **live** Outpost project with **`OUTPOST_API_KEY`** (and related env vars)? 
+ +- **Execution:** PASS / FAIL / SKIPPED (transcript-only) +- Notes (HTTP status codes, error bodies — no secrets): + +## Notes / regressions + +… \ No newline at end of file diff --git a/docs/agent-evaluation/scenarios/01-basics-curl.md b/docs/agent-evaluation/scenarios/01-basics-curl.md new file mode 100644 index 000000000..7d90026f4 --- /dev/null +++ b/docs/agent-evaluation/scenarios/01-basics-curl.md @@ -0,0 +1,47 @@ +# Scenario 1 — Basics with curl + +## Intent + +Agent should produce a **minimal shell + curl** flow against the **managed** API (no SDK), matching the official curl quickstart. Prefer a **single runnable shell script** (e.g. `outpost-quickstart.sh`) that sets variables and runs all curls, so the operator can `chmod +x` and run once; inline copy-paste blocks are acceptable if the user asked only for “commands.” + +## Preconditions + +- `OUTPOST_API_KEY` set in the environment (user states this; agent must not ask for the raw key in chat). +- Topics include at least one topic used in the script (e.g. `user.created`). + +## Automated eval (Claude Agent SDK) + +The harness sets the agent **cwd** to an empty directory under `docs/agent-evaluation/results/runs/-scenario-NN/`. **`Write` / `Edit` / `NotebookEdit` paths are enforced** to that directory only (absolute paths elsewhere are denied). Save the script as e.g. **`outpost-quickstart.sh`** in that folder (relative path or a path under the run dir), not under `examples/` or the repo root. + +## Conversation script + +### Turn 0 + +Paste the **## Template** block from [`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc), with `{{…}}` filled using your project or [`fixtures/placeholder-values-for-turn0.md`](../fixtures/placeholder-values-for-turn0.md). + +### Turn 1 — User + +> I want option 1 — **the simplest thing possible**. I don’t need a framework or SDK; just the smallest path to see tenant → webhook → publish working. 
+ +### Turn 2 — User (optional) + +> How do I know the event actually reached my test URL? + +## Success criteria + +**Measurement:** Heuristic rubric `scoreScenario01` in [`../src/score-transcript.ts`](../src/score-transcript.ts) (assistant text + tool-written script content). LLM judge: `npm run score -- --run --llm`. Execution row remains manual. + +- Uses managed base URL `https://api.outpost.hookdeck.com/2025-07-01` (or explicit `OUTPOST_API_BASE_URL`), **not** `localhost:3333/api/v1`, unless the user asked for self-hosted. +- Tenant: `PUT .../tenants/{tenant_id}` with `Authorization: Bearer` (or documents equivalent). +- Destination: `POST .../tenants/{tenant_id}/destinations` with `type: webhook`, `topics` including the configured topic or `*`, and `config.url` pointing at a test HTTPS URL (env or placeholder). +- Publish: `POST .../publish` with `tenant_id`, `topic`, and a top-level JSON field **`data`** (the event payload object — see OpenAPI `PublishRequest` and curl quickstart). Not `payload`. Typically also `eligible_for_retry`. +- Delivers as one **shell script** (or one fenced `bash` block meant to be saved as `.sh`), not only three unrelated snippets without a shebang/variables. +- Does **not** embed a pasted API key in the reply. +- Verification mentions Hookdeck Console / dashboard logs if Turn 2 was asked. +- **Execution (full pass):** With `OUTPOST_API_KEY` (and `OUTPOST_API_BASE_URL` if the snippet uses it) set in your environment, run the agent’s tenant → destination → publish sequence against a real project. Expect success per the **curl quickstart** and **OpenAPI** (tenant and destination typically 2xx; publish uses the documented success status—often **202**). Confirm delivery via Hookdeck Console / project logs (or `GET .../attempts` as appropriate). *Skip only if you are doing transcript-only triage.* + +## Failure modes to note + +- Wrong path (`PUT /{tenant}` without `/tenants/`). +- Mixing self-hosted base path with managed host. 
+- Skipping topic alignment with dashboard configuration. \ No newline at end of file diff --git a/docs/agent-evaluation/scenarios/02-basics-typescript.md b/docs/agent-evaluation/scenarios/02-basics-typescript.md new file mode 100644 index 000000000..afbc4b7f2 --- /dev/null +++ b/docs/agent-evaluation/scenarios/02-basics-typescript.md @@ -0,0 +1,45 @@ +# Scenario 2 — Basics with TypeScript + +## Intent + +Agent should produce a **single runnable `.ts` file** using `@hookdeck/outpost-sdk`, following the managed TypeScript quickstart pattern. + +## Preconditions + +- Node 18+; user can run `npx tsx`. +- `OUTPOST_API_KEY` and `OUTPOST_TEST_WEBHOOK_URL` available as env vars. + +## Automated eval (Claude Agent SDK) + +The harness sets the agent **cwd** to `docs/agent-evaluation/results/runs/-scenario-NN/`. Write the script and any `package.json` there with **Write** / **Edit**; use **Bash** for `npm install`, `npx tsx`, etc., so the folder is a runnable mini-project. + +## Conversation script + +### Turn 0 + +Paste the **## Template** block from [`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc), with `{{…}}` filled using your project or [`fixtures/placeholder-values-for-turn0.md`](../fixtures/placeholder-values-for-turn0.md). + +### Turn 1 — User + +> Option 1. Let’s do it in **TypeScript**. + +### Turn 2 — User (optional) + +> How do I run it locally? + +## Success criteria + +**Measurement:** Heuristic `scoreScenario02` in [`src/score-transcript.ts`](../src/score-transcript.ts); LLM judge maps the bullets below ([README.md § Measuring scenarios](../README.md#measuring-scenarios)). Execution row is manual. + +- Depends on `@hookdeck/outpost-sdk`; uses `Outpost` client with `apiKey` from `process.env.OUTPOST_API_KEY`. +- Calls `tenants.upsert`, `destinations.create` (webhook), `publish.event`. +- Uses a topic that matches the dashboard list from the prompt (or asks which topic if ambiguous). 
+- Webhook URL from `OUTPOST_TEST_WEBHOOK_URL` (or clearly documented env). +- No API key in source; fails fast if env missing. +- Mentions `npx tsx script.ts` or equivalent run instructions. +- **Execution (full pass):** With `OUTPOST_API_KEY`, `OUTPOST_TEST_WEBHOOK_URL`, and optional `OUTPOST_API_BASE_URL` set, the generated script runs to completion (no uncaught API errors) and prints or logs an event id or other clear success signal. *Skip only for transcript-only triage.* + +## Failure modes to note + +- Defaulting to localhost API without user asking for self-hosted. +- Using raw `fetch` when user asked for TypeScript SDK specifically. \ No newline at end of file diff --git a/docs/agent-evaluation/scenarios/03-basics-python.md b/docs/agent-evaluation/scenarios/03-basics-python.md new file mode 100644 index 000000000..c0d747373 --- /dev/null +++ b/docs/agent-evaluation/scenarios/03-basics-python.md @@ -0,0 +1,43 @@ +# Scenario 3 — Basics with Python + +## Intent + +Agent should produce a **single Python script** using `outpost_sdk`, equivalent to scenario 2. + +## Preconditions + +- Python 3.9+; `pip install outpost_sdk`. +- `OUTPOST_API_KEY`, `OUTPOST_TEST_WEBHOOK_URL` set. + +## Automated eval (Claude Agent SDK) + +The harness sets the agent **cwd** to `docs/agent-evaluation/results/runs/-scenario-NN/`. Save `*.py`, `requirements.txt` or `pyproject.toml` with **Write** / **Edit**; use **Bash** for `pip` / `uv` installs so the run directory is self-contained. + +## Conversation script + +### Turn 0 + +Paste the **## Template** block from [`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc), with `{{…}}` filled using your project or [`fixtures/placeholder-values-for-turn0.md`](../fixtures/placeholder-values-for-turn0.md). + +### Turn 1 — User + +> Option 1. I’d like to use **Python**. + +### Turn 2 — User (optional) + +> One file I can run with `python` is enough. 
+ +## Success criteria + +**Measurement:** Heuristic `scoreScenario03` in [`src/score-transcript.ts`](../src/score-transcript.ts); LLM judge maps the checklist below ([README.md § Measuring scenarios](../README.md#measuring-scenarios)). Execution row is manual. + +- `from outpost_sdk import Outpost` (or equivalent documented import path). +- `Outpost(api_key=..., server_url=...)` with optional base URL from env. +- `tenants.upsert`, `destinations.create`, `publish.event` as in the **Python quickstart** (including `request=` for publish where the SDK requires it). +- Topic aligned with prompt; webhook URL from env. +- No secrets in file. +- **Execution (full pass):** With `OUTPOST_API_KEY`, `OUTPOST_TEST_WEBHOOK_URL`, and optional base URL env vars set, `python …` (as documented) completes without API errors and prints an event id or clear success. *Skip only for transcript-only triage.* + +## Failure modes to note + +- Using `requests` only when user asked for the official SDK. \ No newline at end of file diff --git a/docs/agent-evaluation/scenarios/04-basics-go.md b/docs/agent-evaluation/scenarios/04-basics-go.md new file mode 100644 index 000000000..e1d8a6db8 --- /dev/null +++ b/docs/agent-evaluation/scenarios/04-basics-go.md @@ -0,0 +1,42 @@ +# Scenario 4 — Basics with Go + +## Intent + +Agent should produce a **small Go program** using `github.com/hookdeck/outpost/sdks/outpost-go`, equivalent to scenarios 2–3. + +## Preconditions + +- Go toolchain; module with `outpost-go` dependency. +- `OUTPOST_API_KEY`, `OUTPOST_TEST_WEBHOOK_URL` set. + +## Automated eval (Claude Agent SDK) + +The harness sets the agent **cwd** to `docs/agent-evaluation/results/runs/-scenario-NN/`. Write `go.mod`, `main.go`, etc. with **Write** / **Edit**; use **Bash** for `go mod init`, `go mod tidy`, and `go run` so the folder is a complete module. 
+ +## Conversation script + +### Turn 0 + +Paste the **## Template** block from [`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc), with `{{…}}` filled using your project or [`fixtures/placeholder-values-for-turn0.md`](../fixtures/placeholder-values-for-turn0.md). + +### Turn 1 — User + +> Option 1. I want to try it in **Go**. + +### Turn 2 — User (optional) + +> Keep the program small — one `main` or a couple of files is fine. + +## Success criteria + +**Measurement:** Heuristic `scoreScenario04` in [`src/score-transcript.ts`](../src/score-transcript.ts); LLM judge maps the checklist below ([`README.md` § Measuring scenarios](../README.md#measuring-scenarios)). Execution row is manual. + +- [ ] `outpostgo.New` with `WithSecurity` (and optional `WithServerURL`). +- [ ] `Tenants.Upsert`, `Destinations.Create` with `CreateDestinationCreateWebhook` (or correct union wrapper), `Publish.Event`. +- [ ] Topic and tenant id explicit; matches prompt topics. +- [ ] No API key in source. +- [ ] **Execution (full pass):** With `OUTPOST_API_KEY`, `OUTPOST_TEST_WEBHOOK_URL`, and optional server URL env vars set, `go run …` succeeds and prints ids or clear success. *Skip only for transcript-only triage.* + +## Failure modes to note + +- Passing raw struct to `Create` without `CreateDestinationCreateWebhook` wrapper (common compile mistake). diff --git a/docs/agent-evaluation/scenarios/05-app-nextjs.md b/docs/agent-evaluation/scenarios/05-app-nextjs.md new file mode 100644 index 000000000..c6f861f4a --- /dev/null +++ b/docs/agent-evaluation/scenarios/05-app-nextjs.md @@ -0,0 +1,57 @@ +# Scenario 5 — Minimal example app (Next.js) + +## Intent + +Agent scaffolds a **minimal Next.js** app (App Router or Pages Router acceptable) with a **simple UI** that lets an operator: + +1. Register a **webhook destination** for a tenant (URL input + submit). +2. 
After registration, **trigger a test publish** to a configured topic so the destination receives an event. + +Server-side code must call Outpost with the API key from **environment** (e.g. `OUTPOST_API_KEY`), never exposed to the browser. + +## Preconditions + +- User has Node 18+; comfortable creating a Next app. +- `OUTPOST_API_KEY`, managed base URL, at least one topic, and `OUTPOST_TEST_WEBHOOK_URL` or user-supplied URL pattern documented. + +## Automated eval (Claude Agent SDK) + +The harness sets the agent **cwd** to an empty directory under `docs/agent-evaluation/results/runs/-scenario-NN/`. You **must** scaffold the Next.js app **into that directory** (e.g. `npx create-next-app@latest` with flags for non-interactive use) using **Bash**, then implement routes/server code with **Write** / **Edit**. Chat-only snippets are not enough for this scenario—the run folder should contain a real project tree that reviewers can run with `npm install && npm run dev`. + +## Conversation script + +### Turn 0 + +Paste the **## Template** block from [`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc), with `{{…}}` filled using your project or [`fixtures/placeholder-values-for-turn0.md`](../fixtures/placeholder-values-for-turn0.md). + +### Turn 1 — User + +> Option 2 — a **tiny demo app**. Can we use **Next.js**? I want a minimal page: somewhere to put a webhook URL, register it for a customer, and a way to fire one test event. + +### Turn 2 — User (optional) + +> Can you add a short README — what goes in `.env` and how I start the dev server? + +### Turn 3 — User (stress) + +> I don’t have a public webhook URL yet. What should I put in that field? + +*Expected:* agent points to a Hookdeck Console Source URL (or equivalent) consistent with the quickstarts and Turn 0 test destination.
+ +## Success criteria + +**Measurement:** Heuristic `scoreScenario05` in [`src/score-transcript.ts`](../src/score-transcript.ts); LLM judge maps the bullets below ([`README.md` § Measuring scenarios](../README.md#measuring-scenarios)). Execution row is manual. + +- Next.js project structure with install/run instructions. +- API routes or server actions perform Outpost calls; **no API key** in client bundles. +- UI flow covers **create destination** and **publish** (two distinct actions visible to the user). +- Tenant id and topic are configurable or clearly documented constants. +- Uses managed base URL by default. +- README lists required env vars. +- **Execution (full pass):** After `npm install` and `npm run dev` (or documented command), a manual smoke test completes **both** flows: register webhook destination and trigger test publish, without 5xx from your app’s Outpost calls and with Outpost accepting the requests. Requires `OUTPOST_API_KEY` and related env in `.env.local` or as documented. *Skip only for transcript-only triage.* + +## Failure modes to note + +- Calling Outpost directly from browser-side code with embedded key. +- Only publishing without a UI path to register the destination first. +- Hard-coding localhost Outpost without user request. \ No newline at end of file diff --git a/docs/agent-evaluation/scenarios/06-app-fastapi.md b/docs/agent-evaluation/scenarios/06-app-fastapi.md new file mode 100644 index 000000000..db8bb76f6 --- /dev/null +++ b/docs/agent-evaluation/scenarios/06-app-fastapi.md @@ -0,0 +1,47 @@ +# Scenario 6 — Minimal example app (FastAPI + Jinja or HTMX) + +## Intent + +Same product behavior as [scenario 5](05-app-nextjs.md), but stack is **Python FastAPI**: + +- Server renders a **simple HTML form** (Jinja2 templates, HTMX, or minimal static HTML served by FastAPI). +- Endpoints (or form posts) call `outpost_sdk` with env-based API key. +- User can submit webhook URL → create destination; user can trigger test publish.
+ +## Preconditions + +- Python 3.9+; `fastapi`, `uvicorn`, `outpost_sdk`. + +## Automated eval (Claude Agent SDK) + +The harness sets the agent **cwd** to `docs/agent-evaluation/results/runs/-scenario-NN/`. Create the FastAPI app **in that directory**: add source files with **Write** / **Edit**, install deps with **Bash** (`pip` / `uv`). The run folder must be a small but complete app (not only code pasted in chat). + +## Conversation script + +### Turn 0 + +Paste the **## Template** block from [`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc), with `{{…}}` filled using your project or [`fixtures/placeholder-values-for-turn0.md`](../fixtures/placeholder-values-for-turn0.md). + +### Turn 1 — User + +> Option 2 — **FastAPI**, same idea as a tiny demo: simple HTML, register a webhook for a tenant, button to send one test event. Keep the codebase small. + +### Turn 2 — User (optional) + +> README with env vars and how to run it would help. + +## Success criteria + +**Measurement:** Heuristic `scoreScenario06` in [`src/score-transcript.ts`](../src/score-transcript.ts); LLM judge maps the checklist below ([`README.md` § Measuring scenarios](../README.md#measuring-scenarios)). Execution row is manual. + +- [ ] FastAPI app runs with one command documented (`uvicorn ...`). +- [ ] Outpost calls only server-side; API key from environment. +- [ ] Two user-visible actions: **register webhook** and **publish test event**. +- [ ] Managed API base URL by default. +- [ ] README with `OUTPOST_API_KEY`, `OUTPOST_TEST_WEBHOOK_URL` or equivalent. +- [ ] **Execution (full pass):** App starts (`uvicorn` or as documented); manual smoke test completes **register webhook** and **publish test event** without server errors on Outpost calls. Env vars set including `OUTPOST_API_KEY`. *Skip only for transcript-only triage.* + +## Failure modes to note + +- Exposing API key to templates/inline JS. 
+- Using only `curl` subprocesses when user asked for FastAPI + SDK. diff --git a/docs/agent-evaluation/scenarios/07-app-go-http.md b/docs/agent-evaluation/scenarios/07-app-go-http.md new file mode 100644 index 000000000..03f9fe31c --- /dev/null +++ b/docs/agent-evaluation/scenarios/07-app-go-http.md @@ -0,0 +1,46 @@ +# Scenario 7 — Minimal example app (Go net/http) + +## Intent + +Same behavior as scenarios 5–6: **small Go program** using `net/http` (no heavy framework required) that serves **basic HTML** with: + +1. Form or fields for webhook URL → create webhook destination (via `outpost-go`). +2. Control to **publish** one test event. + +## Preconditions + +- Go 1.22+; `outpost-go` module. + +## Automated eval (Claude Agent SDK) + +The harness sets the agent **cwd** to `docs/agent-evaluation/results/runs/-scenario-NN/`. Initialize the module and server **there** (`go mod init`, `go get`, etc. via **Bash**; `main.go` / `handlers.go` via **Write** / **Edit**). Reviewers should be able to `go run .` from the run directory after the eval. + +## Conversation script + +### Turn 0 + +Paste the **## Template** block from [`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc), with `{{…}}` filled using your project or [`fixtures/placeholder-values-for-turn0.md`](../fixtures/placeholder-values-for-turn0.md). + +### Turn 1 — User + +> Option 2 — **Go** with the standard library: small HTTP server, basic HTML, register a webhook and publish one test event. + +### Turn 2 — User (optional) + +> One or two files is fine if you can keep it readable. + +## Success criteria + +**Measurement:** Heuristic `scoreScenario07` in [`src/score-transcript.ts`](../src/score-transcript.ts); LLM judge maps the bullets below ([`README.md` § Measuring scenarios](../README.md#measuring-scenarios)). Execution row is manual. + +- `go run .` (or `go run main.go`) documented. +- HTML UI with two flows: **create destination**, **publish**.
+- SDK used server-side only; `OUTPOST_API_KEY` from env. +- Correct `CreateDestinationCreateWebhook` usage. +- README lists env vars and port. +- **Execution (full pass):** `go run …` starts the server; manual smoke test completes **create destination** and **publish** through the HTML UI without Outpost API failures. `OUTPOST_API_KEY` (and related env) set. *Skip only for transcript-only triage.* + +## Failure modes to note + +- Embedding API key in HTML/JS. +- Omitting publish action after destination registration. \ No newline at end of file diff --git a/docs/agent-evaluation/scenarios/08-integrate-nextjs-existing.md b/docs/agent-evaluation/scenarios/08-integrate-nextjs-existing.md new file mode 100644 index 000000000..8d459ccfe --- /dev/null +++ b/docs/agent-evaluation/scenarios/08-integrate-nextjs-existing.md @@ -0,0 +1,80 @@ +# Scenario 8 — Integrate Outpost into an existing Next.js SaaS app + +## Intent + +Operators often have a **production-shaped SaaS codebase** (auth, teams, dashboard) and need **outbound webhooks** for their customers. This scenario measures whether the agent can work **inside an existing app tree** (here: a pinned open-source baseline), understand where **domain events** happen, and **wire Hookdeck Outpost** so events are **published** to Outpost (with **per-tenant webhook destinations** documented or implemented). + +**Baseline application (pin this in evals):** [**leerob/next-saas-starter**](https://github.com/leerob/next-saas-starter) — Next.js, PostgreSQL, Drizzle, team/member flows, MIT license. It is a common reference for “real” SaaS structure; adjust the prompt if you standardize on another repo. + +## Preconditions + +- Node 18+; `git` available. +- Same **initial onboarding prompt** as other scenarios (`OUTPOST_API_KEY` **not** in the pasted text; test destination URL from dashboard). + +## Eval harness + +The runner executes **`preSteps`** below with shell **`cwd`** = `results/runs/-scenario-08/` before Turn 0. 
**`agentCwd`** is the SDK process working directory (the baseline repo root). Set **`EVAL_SKIP_HARNESS_PRE_STEPS=1`** to skip preSteps; if **`agentCwd`** is missing, the harness falls back to the run directory. When **`urlEnv`** is set and that variable is non-empty, it overrides **`url`**. + +```eval-harness +{ + "preSteps": [ + { + "type": "git_clone", + "url": "https://github.com/leerob/next-saas-starter.git", + "into": "next-saas-starter", + "depth": 1, + "urlEnv": "EVAL_NEXT_SAAS_BASELINE_URL" + } + ], + "agentCwd": "next-saas-starter" +} +``` + +## Automated eval (Claude Agent SDK) + +Same as other scenarios, except the agent starts **inside** the cloned tree above. Expect **`npm` / `pnpm install`** via **Bash**, then **Write** / **Edit** for Outpost. Reviewers inspect that tree plus `transcript.json`. + +## Conversation script + +### Turn 0 + +Paste the **## Template** block from [`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc), with `{{…}}` filled using your project or [`fixtures/placeholder-values-for-turn0.md`](../fixtures/placeholder-values-for-turn0.md). + +### Turn 1 — User + +> I’m integrating into our existing **Next.js** SaaS app—you’re in this repo with me. Install dependencies, get it running, then add **Hookdeck Outpost** so we can send **outbound webhooks** to our customers. +> +> Tie it to **real product behavior** (not a throwaway demo page). I need a clear story for **how each customer registers their webhook** and which topics they receive. Use **topic names that match our domain**; if Hookdeck doesn’t list a topic we need yet, tell me exactly what to add in the project—don’t point our code at the wrong names just to match a short list unless I’ve said we’re only doing a quick wiring spike. Document env vars and setup in the **README**. Keep the Outpost API key on the **server** only. 
+ +### Turn 2 — User (optional) + +> When should we create or sync the Outpost **tenant** with our own customer or team model? + +## Success criteria + +**Measurement:** Heuristic `scoreScenario08` in [`src/score-transcript.ts`](../src/score-transcript.ts); LLM judge maps the bullets below ([`README.md` § Measuring scenarios](../README.md#measuring-scenarios)). Execution row is manual. + +**Contract:** The baseline ships a **customer-facing dashboard**. Treat it like **Existing application (full-stack products)** in [`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc). The detailed UI bar is **not** repeated here—use **[Building your own UI — Implementation checklists](../../content/guides/building-your-own-ui.mdoc#implementation-checklists)** (*Planning and contract*, *Destinations experience*, *Activity, attempts, and retries*). The agent must self-verify with **Before you stop (verify)** in the same prompt (full-stack UI item). + +- Baseline app is the documented **next-saas-starter** (or an explicitly justified fork): harness clone under the run directory plus install / integration steps reflected in the transcript or that tree. +- **Outpost TypeScript SDK** used **server-side only**; no `NEXT_PUBLIC_*` API key. +- **Topic reconciliation:** README or inline notes map **each `publish` topic** to a **real domain event**; if the app needs topics not in the **configured project list** from onboarding, instructions say to **add them in Hookdeck** (domain-first—not reshaping product logic to fit a stale default list unless wiring-only scope was agreed). +- **Domain publish:** At least one **`publish` on a real domain path** (signup, CRUD, billing, etc.)—**not** only a synthetic “test event” route. 
+- **Separate test publish:** A **distinct** server-side control (button, action, or route) that publishes a **test** event for the signed-in tenant—**in addition to** domain publish; does **not** satisfy the domain-publish requirement by itself (see prompt). +- **Full-stack destination + activity UI:** Customers can **drill into** a destination (detail or edit—per product policy), reach **destination-scoped activity** (events / attempts / manual retry for failures) via **your** authenticated routes, and **create** destinations using **dynamic** fields from **`GET /destination-types`** (each field’s **`key`** → `config` / `credentials`). **List rows** link or navigate into that flow—not **only** create + delete with no detail or activity. Omit sub-items only if Turn 1 explicitly scoped **backend-only** or excluded activity UI (then document how operators verify delivery instead). +- **Per-customer webhook** story: **tenant ↔ customer** mapping is consistent for publish and destination APIs. +- README (or equivalent) lists **env vars** for Outpost. +- **Execution (full pass):** With `OUTPOST_API_KEY` set, the app runs; perform a **real in-app action** that triggers the domain publish and confirm Outpost accepts it (2xx/202). Exercise **test publish** and **activity / retry** in the UI when present. Smoke from **`results/runs/…-scenario-08/next-saas-starter/`** (not transcript-only triage). + +## Failure modes to note + +- Pasting a greenfield Next app instead of integrating the **baseline** in the workspace. +- **List-only** destinations (no drill-down to detail or destination-scoped activity) while the baseline still has a product dashboard—unless the user explicitly scoped backend-only. +- **No separate test publish** when customers can manage destinations from the UI. +- Publishing only from a demo or **test-only** route with no domain path. 
+- **Topics** in code with no README telling the operator to **add** them in Hookdeck when the onboarding topic list was incomplete (or silently retargeting domain logic to unrelated configured names). +- Calling Outpost from client components with secrets. + +## Future baselines + +Java / .NET “existing app” scenarios can follow the same shape: harness pre-clones a fixed public baseline into the run workspace + a natural-language **integration** Turn 1 + Success criteria + `scoreScenarioNN`. diff --git a/docs/agent-evaluation/scenarios/09-integrate-fastapi-existing.md b/docs/agent-evaluation/scenarios/09-integrate-fastapi-existing.md new file mode 100644 index 000000000..fe4e1ed18 --- /dev/null +++ b/docs/agent-evaluation/scenarios/09-integrate-fastapi-existing.md @@ -0,0 +1,84 @@ +# Scenario 9 — Integrate Outpost into an existing FastAPI SaaS app + +## Intent + +Same as [scenario 8](08-integrate-nextjs-existing.md), but stack is **Python + FastAPI** with a **multi-tenant / team** style baseline that also ships a **real web UI** (so operators can exercise dashboards, not only OpenAPI). + +**Baseline application (pin this in evals):** [**fastapi/full-stack-fastapi-template**](https://github.com/fastapi/full-stack-fastapi-template) — maintained full-stack app: **FastAPI** backend (SQLModel, **Pydantic v2**), **React + TypeScript + Vite** frontend, PostgreSQL, Docker Compose, JWT auth, MIT license. Substitute only if you document another baseline in the scenario and update heuristics. + +**Supersedes:** The previous pin [**philipokiokio/FastAPI_SAAS_Template**](https://github.com/philipokiokio/FastAPI_SAAS_Template) (stale dependencies, API-only, no product UI). + +## Preconditions + +- Python 3.10+; **Node.js 18+** (for the frontend); `git` available. +- **Docker** (recommended) — template dev flow uses Docker Compose for API, DB, and frontend; see repository `development.md`. 
+- Same **initial onboarding prompt** as other scenarios (`OUTPOST_API_KEY` **not** in the pasted text; test destination URL from dashboard). + +## Eval harness + +```eval-harness +{ + "preSteps": [ + { + "type": "git_clone", + "url": "https://github.com/fastapi/full-stack-fastapi-template.git", + "into": "full-stack-fastapi-template", + "depth": 1, + "urlEnv": "EVAL_FASTAPI_BASELINE_URL" + } + ], + "agentCwd": "full-stack-fastapi-template" +} +``` + +Optional: set **`EVAL_FASTAPI_BASELINE_URL`** to override the clone URL (fork or pinned commit). + +## Automated eval (Claude Agent SDK) + +The agent starts **inside** the cloned baseline above. Expect **`docker compose`** and/or **`uv` / `pip`** per **`development.md`** and **`backend/README.md`**, then **Write** / **Edit** for Outpost integration (backend-first; UI hooks optional but encouraged when they clarify the customer webhook story). + +## Conversation script + +### Turn 0 + +Paste the **## Template** from [`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc) with placeholders filled. + +### Turn 1 — User + +> This workspace is our **full-stack FastAPI + React** product (the template we ship). Follow the repo’s dev docs to bring up API, DB, and frontend, then integrate **Hookdeck Outpost** for **per-customer webhooks**. +> +> I want customers to manage **destinations** from the product (or through our authenticated API), a **separate** way to **fire a test event** that isn’t pretending to be production traffic, and enough **delivery visibility** that they can see **events**, **attempts**, and **retry** when something failed—all **through our backend**, never with the platform API key in the browser. +> +> Wire **publish** into **one real workflow** we already have (signups, records, teams—whatever fits this codebase). **Topics** should match that workflow. 
If Hookdeck doesn’t list a name we need, document what I should add there; don’t reshape the product around random topic strings unless I’ve said this is wiring-only. Document env vars and how **tenant** maps to our customer or team model. Don’t expose the API key to clients. + +### Turn 2 — User (optional) + +> When should we create or sync the Outpost **tenant** with our own customer or team model? + +## Success criteria + +**Measurement:** Heuristic `scoreScenario09` in [`src/score-transcript.ts`](../src/score-transcript.ts); LLM judge (reads this section); execution manual. + +**Contract:** Same full-stack bar as scenario **8**, pinned to this template. **Canonical checklist:** [Building your own UI — Implementation checklists](../../content/guides/building-your-own-ui.mdoc#implementation-checklists). **Agent self-verify:** [`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc) → *Before you stop (verify)* (full-stack UI item). Do not duplicate checklist rows in transcripts—confirm against the guide. + +- **full-stack-fastapi-template** (or documented alternative) present via harness **`preSteps`** with install steps in the transcript or tree. +- **`outpost_sdk`** with **`publish.event`** (and related calls as needed) on a **real** code path in the **backend** (server-side only for secrets)—**not** only a synthetic test-publish endpoint unless the scenario was explicitly scoped to wiring-only. +- **Domain + test publish:** At least one **`publish` on a real domain path** (entity create/update, signup, etc.). A **separate** test-publish path or control is **required** for this baseline—it **does not** replace the domain publish requirement. +- API key from **environment** or secure backend settings only — not hard-coded, not exposed via **`NEXT_PUBLIC_*`**, **`VITE_*`**, or other client-visible env patterns. 
+- **Topic reconciliation:** each **`topic` in code** ties to a real domain event; gaps vs the **configured project topic list** from onboarding are resolved by **adding topics in Hookdeck** (documented), not by retargeting domain logic to a mismatched list unless wiring-only scope was agreed. +- **Destinations + tenant:** Per-customer (or per-team) **destination** management via **authenticated** UI or BFF routes: **list**, **create**, and **drill-down** (detail and **destination-scoped activity**—events, attempts, **manual retry**). **Dynamic** forms from **`GET /destination-types`** with correct **`key`** → `config` / `credentials`. **`tenant_id`** is consistent between publish and destination APIs. Omit drill-down / activity only if Turn 1 scoped **backend-only** or excluded activity UI (document verification instead). +- **Operator docs:** Root **README**, **backend/README**, **development.md**, or **`.env.example`** (whichever the template uses) lists **Outpost env vars** and how to run and verify. +- **Execution (full pass):** Stack runs per template docs; trigger a **real domain action** that fires publish; Outpost accepts. Exercise **test publish** and **activity / retry** in the UI when in scope. *Skip for transcript-only.* + +## Failure modes to note + +- Greenfield FastAPI “hello world” instead of the **cloned** baseline. +- Using raw `httpx` to Outpost when the scenario asks for **`outpost_sdk`**. +- Putting `OUTPOST_API_KEY` in `NEXT_PUBLIC_*`, `VITE_*`, or other client bundles. +- **Only** test/synthetic publish with no domain hook, or **only** domain publish with no **separate** test-publish control when a dashboard is in scope. +- **No** events/attempts/retry surfaced for customers when the baseline includes a product UI and the user did not ask to skip that scope. +- **Flat list** of destinations with no navigation to **detail** or **per-destination activity** (same as scenario 8 failure mode). 
+ +## Future baselines + +Other “existing FastAPI app” pins can follow the same shape: harness pre-clone + natural-language integration Turn 1 + success criteria + `scoreScenario09`. diff --git a/docs/agent-evaluation/scenarios/10-integrate-go-existing.md b/docs/agent-evaluation/scenarios/10-integrate-go-existing.md new file mode 100644 index 000000000..01ca61438 --- /dev/null +++ b/docs/agent-evaluation/scenarios/10-integrate-go-existing.md @@ -0,0 +1,70 @@ +# Scenario 10 — Integrate Outpost into an existing Go SaaS API + +## Intent + +Same integration goal as [scenarios 8–9](08-integrate-nextjs-existing.md), for a **Go** REST API baseline with **auth and typical SaaS** structure. + +**Baseline application (pin this in evals):** [**devinterface/startersaas-go-api**](https://github.com/devinterface/startersaas-go-api) — Go API, JWT, MongoDB, Stripe hooks, Docker — MIT license, small enough to clone in an eval. If you standardize on another Go SaaS boilerplate, update this file and `scoreScenario10`’s baseline check. + +## Preconditions + +- Go 1.21+; `git` available. + +## Eval harness + +```eval-harness +{ + "preSteps": [ + { + "type": "git_clone", + "url": "https://github.com/devinterface/startersaas-go-api.git", + "into": "startersaas-go-api", + "depth": 1, + "urlEnv": "EVAL_GO_SAAS_BASELINE_URL" + } + ], + "agentCwd": "startersaas-go-api" +} +``` + +## Automated eval (Claude Agent SDK) + +The agent starts **inside** the cloned baseline above. Expect **`go mod`** / **`go get`** for **`outpost-go`**, then source edits. + +## Conversation script + +### Turn 0 + +Paste the **## Template** from [`hookdeck-outpost-agent-prompt.mdoc`](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc) with placeholders filled. + +### Turn 1 — User + +> Existing **Go** API—you’re in this repo with me. Get it building, then add **Hookdeck Outpost** for outbound webhooks. 
+> +> Trigger **publish** from **one real handler** (signup, billing, etc.—not a throwaway test-only route by itself). **`topic` values should match that domain**. If our Hookdeck project’s topic list is missing something, document what to add; don’t point production code at the wrong names just to match a stub list unless I’ve said this is a minimal wiring pass. **`OUTPOST_API_KEY`** from env only. Explain how customers register webhook URLs and what to put in **README** / env. Use the **test receiver URL** from our Hookdeck setup when you want to prove delivery end-to-end. + +### Turn 2 — User (optional) + +> If customers submit a webhook URL in a settings endpoint, where does destination creation live? + +## Success criteria + +**Measurement:** Heuristic `scoreScenario10` in [`src/score-transcript.ts`](../src/score-transcript.ts); LLM judge; execution manual. + +**Contract:** This baseline is an **API-first** Go service (no first-party customer dashboard in the pin). It does **not** inherit the full **[Building your own UI](../../content/guides/building-your-own-ui.mdoc)** dashboard checklist wholesale—agents follow **[Existing application](../../content/quickstarts/hookdeck-outpost-agent-prompt.mdoc#existing-application)** (minimum integration depth) plus **API-only** guidance in **Existing application (full-stack products)** (*Document how tenants manage destinations via **your** API*). If a future pin adds a UI, scenarios should be updated to require the **Implementation checklists** linked above. + +- **startersaas-go-api** (or documented alternative) present via harness **`preSteps`** with build instructions attempted in the transcript or tree. +- **Outpost Go SDK** used with **`Publish.Event`** (and related types) on a **real** handler path—not only a test-only route unless wiring-only scope was agreed. +- No API key in source; **`os.Getenv("OUTPOST_API_KEY")`** (or config loader) only. 
+- **Topic reconciliation** (domain-first; operator adds missing Hookdeck topics as documented); **tenant** mapping consistent everywhere Outpost is called. +- **Customer webhook registration:** At least one **concrete** story—**implemented** authenticated route(s) and/or **OpenAPI/README**—for how a customer **creates or updates** a webhook destination (URL + topics) for their tenant. Prefer real **`Destinations.Create`** (or update) calls over prose-only if the Turn 1 story asks where destination creation lives. +- **Test / verify delivery:** A **separate** mechanism from domain publish: e.g. documented **`curl`** + test receiver URL, a **small admin/test publish** endpoint, or README steps to trigger a test event—so operators can prove end-to-end delivery without relying solely on production traffic. Domain publish remains **required**; test-only wiring does **not** replace it (see prompt *Before you stop*). +- **Execution (full pass):** Server runs; trigger the **domain** handler; Outpost accepts publish. Optionally exercise documented test publish / destination registration. *Skip for transcript-only.* + +## Failure modes to note + +- New `main.go` only, without using the **cloned** baseline’s routes/models. +- Wrong `Create` shape without **`CreateDestinationCreateWebhook`** when creating webhook destinations. +- Publish only from a **test** helper with no real handler path. +- **Vague** “customers paste a URL somewhere” with no API contract, handler, or README steps for destination creation when the conversation asked for it. +- **No** operator-facing way to smoke-test delivery (test publish or documented curl) when README promises outbound webhooks. 
diff --git a/docs/agent-evaluation/scripts/ci-eval.sh b/docs/agent-evaluation/scripts/ci-eval.sh new file mode 100755 index 000000000..980442967 --- /dev/null +++ b/docs/agent-evaluation/scripts/ci-eval.sh @@ -0,0 +1,23 @@ +#!/usr/bin/env bash +# CI-friendly agent eval: scenarios 01+02 with heuristic + LLM judge (Success criteria from each scenario .md). +# +# Required secrets (e.g. GitHub Actions): ANTHROPIC_API_KEY, EVAL_TEST_DESTINATION_URL +# Optional: same vars in docs/agent-evaluation/.env for local runs. +# +# Scenarios: 01 = curl quickstart shape; 02 = TypeScript SDK script. See README § CI. +# After success, run ./scripts/execute-ci-artifacts.sh with OUTPOST_API_KEY + OUTPOST_TEST_WEBHOOK_URL for live Outpost (CI does this automatically). +set -euo pipefail + +ROOT="$(cd "$(dirname "$0")/.." && pwd)" +cd "$ROOT" + +if [[ -z "${ANTHROPIC_API_KEY:-}" ]]; then + echo "ci-eval: ANTHROPIC_API_KEY is not set" >&2 + exit 1 +fi +if [[ -z "${EVAL_TEST_DESTINATION_URL:-}" ]]; then + echo "ci-eval: EVAL_TEST_DESTINATION_URL is not set" >&2 + exit 1 +fi + +exec npm run eval:ci diff --git a/docs/agent-evaluation/scripts/execute-ci-artifacts.sh b/docs/agent-evaluation/scripts/execute-ci-artifacts.sh new file mode 100755 index 000000000..03c046d8c --- /dev/null +++ b/docs/agent-evaluation/scripts/execute-ci-artifacts.sh @@ -0,0 +1,111 @@ +#!/usr/bin/env bash +# After a successful eval:ci (same ISO stamp for scenario-01 and scenario-02), run generated +# curl script and TypeScript quickstart against live Outpost (tenant → destination → publish). +# +# Required env: OUTPOST_API_KEY, OUTPOST_TEST_WEBHOOK_URL (often same URL as EVAL_TEST_DESTINATION_URL) +# Optional: OUTPOST_API_BASE_URL (managed default if unset) +set -euo pipefail + +ROOT="$(cd "$(dirname "$0")/.." 
&& pwd)" +RUNS="$ROOT/results/runs" + +if [[ -z "${OUTPOST_API_KEY:-}" ]]; then + echo "execute-ci-artifacts: OUTPOST_API_KEY is not set" >&2 + exit 1 +fi +export OUTPOST_TEST_WEBHOOK_URL="${OUTPOST_TEST_WEBHOOK_URL:-${EVAL_TEST_DESTINATION_URL:-}}" +if [[ -z "${OUTPOST_TEST_WEBHOOK_URL:-}" ]]; then + echo "execute-ci-artifacts: OUTPOST_TEST_WEBHOOK_URL or EVAL_TEST_DESTINATION_URL must be set" >&2 + exit 1 +fi + +# Managed API default (agent-generated scripts often expect this in the environment). +# Use := so empty string from .env is treated like unset (otherwise curl hits /tenants without /2025-07-01 → 404). +: "${OUTPOST_API_BASE_URL:=https://api.outpost.hookdeck.com/2025-07-01}" +export OUTPOST_API_BASE_URL + +if [[ ! -d "$RUNS" ]]; then + echo "execute-ci-artifacts: missing $RUNS (run eval:ci first)" >&2 + exit 1 +fi + +# Latest scenario-01 run directory by mtime (same batch shares stamp with scenario-02). +d01="" +best=0 +for d in "$RUNS"/*-scenario-01; do + [[ -d "$d" ]] || continue + m=$(stat -c %Y "$d" 2>/dev/null || stat -f %m "$d") + if (( m >= best )); then + best=$m + d01=$d + fi +done + +if [[ -z "$d01" ]]; then + echo "execute-ci-artifacts: no *-scenario-01 directory under $RUNS" >&2 + exit 1 +fi + +prefix=${d01%-scenario-01} +d02="${prefix}-scenario-02" +if [[ ! 
-d "$d02" ]]; then + echo "execute-ci-artifacts: expected paired run dir missing: $d02" >&2 + exit 1 +fi + +pick_sh() { + local dir=$1 f + for f in "$dir"/*quickstart*.sh "$dir"/outpost*.sh; do + [[ -f "$f" ]] && { echo "$f"; return 0; } + done + for f in "$dir"/*.sh; do + [[ -f "$f" ]] && { echo "$f"; return 0; } + done + return 1 +} + +pick_ts() { + local dir=$1 f + for f in "$dir"/outpost-quickstart.ts "$dir"/*quickstart*.ts; do + [[ -f "$f" ]] && { echo "$f"; return 0; } + done + for f in "$dir"/*.ts; do + [[ -f "$f" ]] && { echo "$f"; return 0; } + done + return 1 +} + +echo "execute-ci-artifacts: scenario 01 dir=$d01" +sh_path=$(pick_sh "$d01") || { + echo "execute-ci-artifacts: no .sh script found in $d01" >&2 + exit 1 +} +echo "execute-ci-artifacts: running bash $sh_path" +export OUTPOST_API_KEY OUTPOST_TEST_WEBHOOK_URL +[[ -n "${OUTPOST_API_BASE_URL:-}" ]] && export OUTPOST_API_BASE_URL +chmod +x "$sh_path" 2>/dev/null || true +# Run from the scenario 01 run dir so relative paths in the generated script behave. +cd "$d01" +bash "$sh_path" || { + echo "execute-ci-artifacts: scenario 01 shell failed (curl exit 22 = HTTP error). 404 is often a wrong path or a publish/destination topic that is not configured in your Outpost project. Set OUTPOST_API_BASE_URL if needed; try npm run smoke:execute-ci (uses destination topics [\"*\"])." >&2 + exit 1 +} + +echo "execute-ci-artifacts: scenario 02 dir=$d02" +ts_path=$(pick_ts "$d02") || { + echo "execute-ci-artifacts: no .ts file found in $d02" >&2 + exit 1 +} +echo "execute-ci-artifacts: running npx tsx $ts_path (from $d02)" +cd "$d02" +if [[ -f package.json ]]; then + npm install --no-audit --no-fund +fi +export OUTPOST_API_KEY OUTPOST_TEST_WEBHOOK_URL +[[ -n "${OUTPOST_API_BASE_URL:-}" ]] && export OUTPOST_API_BASE_URL +npx --yes tsx "$ts_path" || { + echo "execute-ci-artifacts: scenario 02 TypeScript failed. 
Check OUTPOST_API_KEY, OUTPOST_TEST_WEBHOOK_URL, and that OUTPOST_CI_PUBLISH_TOPIC (default user.created) exists in the project. Try: npm run smoke:execute-ci" >&2 + exit 1 +} + +echo "execute-ci-artifacts: OK (scenario 01 shell + scenario 02 TypeScript)" diff --git a/docs/agent-evaluation/scripts/run-scenario.sh b/docs/agent-evaluation/scripts/run-scenario.sh new file mode 100755 index 000000000..de47f2c87 --- /dev/null +++ b/docs/agent-evaluation/scripts/run-scenario.sh @@ -0,0 +1,46 @@ +#!/usr/bin/env bash +# Manual agent evaluation helper: prints paths and Turn 0 instructions. +# Does NOT invoke an LLM or run automated tests. +set -euo pipefail + +ROOT="$(cd "$(dirname "$0")/.." && pwd)" +REPO_ROOT="$(cd "$ROOT/../.." && pwd)" + +usage() { + echo "Usage: $0 <01|02|03|04|05|06|07|08|09|10>" + echo "Prints the scenario file path and how to obtain Turn 0 from the single source of truth." + echo "" + echo "This script does not call an API or start an agent." +} + +if [[ "${1:-}" == "-h" || "${1:-}" == "--help" || -z "${1:-}" ]]; then + usage + exit 0 +fi + +id="$1" +shopt -s nullglob +matches=( "$ROOT/scenarios/${id}"-*.md ) +shopt -u nullglob + +if [[ ${#matches[@]} -eq 0 ]]; then + echo "No scenario matching: scenarios/${id}-*.md" >&2 + exit 1 +fi + +scenario="${matches[0]}" + +echo "=== Outpost agent eval (manual) ===" +echo "" +echo "Scenario file:" +echo " $scenario" +echo "" +echo "Turn 0 — copy the fenced block under '## Template' from:" +echo " $REPO_ROOT/docs/content/quickstarts/hookdeck-outpost-agent-prompt.mdoc" +echo "" +echo "Placeholder examples (not the template):" +echo " $ROOT/fixtures/placeholder-values-for-turn0.md" +echo "" +echo "Record results (local copy; see results/.gitignore):" +echo " cp \"$ROOT/results/RUN-RECORDING.template.md\" \"$ROOT/results/$(date +%F)-s${id}-.md\"" +echo "" diff --git a/docs/agent-evaluation/scripts/smoke-test-execute-ci-artifacts.sh b/docs/agent-evaluation/scripts/smoke-test-execute-ci-artifacts.sh new file mode 
100755 index 000000000..e85d1869b --- /dev/null +++ b/docs/agent-evaluation/scripts/smoke-test-execute-ci-artifacts.sh @@ -0,0 +1,126 @@ +#!/usr/bin/env bash +# Local / operator check for the same path as CI: materialize a fresh *-scenario-01 / *-scenario-02 +# pair with hand-maintained scripts (wildcard destination topics), then run execute-ci-artifacts.sh. +# +# Requires: OUTPOST_API_KEY, OUTPOST_TEST_WEBHOOK_URL (source docs/agent-evaluation/.env or export) +# Optional: OUTPOST_API_BASE_URL, OUTPOST_CI_PUBLISH_TOPIC (default user.created — must exist in your project) +# +# Does not invoke the agent. Use this to verify secrets and managed API before relying on CI execution. +set -euo pipefail + +ROOT="$(cd "$(dirname "$0")/.." && pwd)" +cd "$ROOT" +if [[ -f .env ]]; then + set -a + # shellcheck disable=SC1091 + source .env + set +a +fi +if [[ -f .env.ci ]]; then + set -a + # shellcheck disable=SC1091 + source .env.ci + set +a +fi + +# Same as CI: webhook URL is often stored as EVAL_TEST_DESTINATION_URL in .env / .env.ci +export OUTPOST_TEST_WEBHOOK_URL="${OUTPOST_TEST_WEBHOOK_URL:-${EVAL_TEST_DESTINATION_URL:-}}" + +if [[ -z "${OUTPOST_API_KEY:-}" || -z "${OUTPOST_TEST_WEBHOOK_URL:-}" ]]; then + echo "smoke-test-execute-ci: set OUTPOST_API_KEY and OUTPOST_TEST_WEBHOOK_URL (or EVAL_TEST_DESTINATION_URL), e.g. source .env" >&2 + exit 1 +fi + +RUNS="$ROOT/results/runs" +mkdir -p "$RUNS" + +STAMP="ci-smoke-$(date -u +%Y-%m-%dT%H-%M-%S)-$(printf '%03d' $((RANDOM % 1000)))Z" +d01="$RUNS/${STAMP}-scenario-01" +d02="$RUNS/${STAMP}-scenario-02" +mkdir -p "$d01" "$d02" + +PUBLISH_TOPIC="${OUTPOST_CI_PUBLISH_TOPIC:-user.created}" + +# Shell: managed API, unique tenant, destination topics * (no dashboard topic list required), then publish. 
+cat > "$d01/outpost_quickstart.sh" << 'EOSH' +#!/usr/bin/env bash +set -euo pipefail +BASE="${OUTPOST_API_BASE_URL:-https://api.outpost.hookdeck.com/2025-07-01}" +TENANT_ID="ci_smoke_${RANDOM}_$(date +%s)" +TOPIC="${OUTPOST_CI_PUBLISH_TOPIC:-user.created}" +DEST_JSON="$(OUTPOST_TEST_WEBHOOK_URL="$OUTPOST_TEST_WEBHOOK_URL" python3 -c ' +import json, os +print(json.dumps({"type": "webhook", "topics": ["*"], "config": {"url": os.environ["OUTPOST_TEST_WEBHOOK_URL"]}})) +')" +curl -sS -f -X PUT "$BASE/tenants/$TENANT_ID" \ + -H "Authorization: Bearer $OUTPOST_API_KEY" -o /dev/null +curl -sS -f -X POST "$BASE/tenants/$TENANT_ID/destinations" \ + -H "Authorization: Bearer $OUTPOST_API_KEY" -H "Content-Type: application/json" \ + -d "$DEST_JSON" -o /dev/null +curl -sS -f -X POST "$BASE/publish" \ + -H "Authorization: Bearer $OUTPOST_API_KEY" -H "Content-Type: application/json" \ + -d "$(TENANT_ID="$TENANT_ID" TOPIC="$TOPIC" python3 -c ' +import json, os +print(json.dumps({ + "tenant_id": os.environ["TENANT_ID"], + "topic": os.environ["TOPIC"], + "eligible_for_retry": True, + "metadata": {"source": "ci-smoke-sh"}, + "data": {"smoke": True}, +})) +')" -o /dev/null -w "publish_http=%{http_code}\n" +echo "smoke shell OK tenant=$TENANT_ID" +EOSH +chmod +x "$d01/outpost_quickstart.sh" + +# TypeScript: same semantics (wildcard subscription); publish uses OUTPOST_CI_PUBLISH_TOPIC. +cat > "$d02/package.json" << 'EOJSON' +{ + "name": "ci-smoke-outpost-ts", + "private": true, + "type": "module", + "dependencies": { + "@hookdeck/outpost-sdk": "^0.9.0" + } +} +EOJSON + +cat > "$d02/outpost-quickstart.ts" << 'EOTS' +import { Outpost } from "@hookdeck/outpost-sdk"; + +const apiKey = process.env.OUTPOST_API_KEY; +if (!apiKey) throw new Error("Set OUTPOST_API_KEY"); +const webhookUrl = process.env.OUTPOST_TEST_WEBHOOK_URL; +if (!webhookUrl) throw new Error("Set OUTPOST_TEST_WEBHOOK_URL"); + +const outpost = new Outpost({ + apiKey, + ...(process.env.OUTPOST_API_BASE_URL + ? 
{ serverURL: process.env.OUTPOST_API_BASE_URL } + : {}), +}); + +const tenantId = `ci_smoke_ts_${Math.random().toString(36).slice(2)}_${Date.now()}`; +const topic = process.env.OUTPOST_CI_PUBLISH_TOPIC ?? "user.created"; + +await outpost.tenants.upsert(tenantId); +await outpost.destinations.create(tenantId, { + type: "webhook", + topics: ["*"], + config: { url: webhookUrl }, +}); +const published = await outpost.publish.event({ + tenantId, + topic, + eligibleForRetry: true, + metadata: { source: "ci-smoke-ts" }, + data: { smoke: true }, +}); +console.log("smoke ts OK event id:", published.id); +EOTS + +touch "$d01" "$d02" +echo "smoke-test-execute-ci: wrote $d01 and $d02 (publish topic=$PUBLISH_TOPIC)" +export OUTPOST_CI_PUBLISH_TOPIC="$PUBLISH_TOPIC" +./scripts/execute-ci-artifacts.sh +echo "smoke-test-execute-ci: OK" diff --git a/docs/agent-evaluation/src/eval-harness.ts b/docs/agent-evaluation/src/eval-harness.ts new file mode 100644 index 000000000..d8facfba8 --- /dev/null +++ b/docs/agent-evaluation/src/eval-harness.ts @@ -0,0 +1,226 @@ +/** + * Declarative pre-steps for agent eval scenarios (see `## Eval harness` in scenario markdown). + */ + +import { existsSync } from "node:fs"; +import { readdir } from "node:fs/promises"; +import { join, resolve, sep } from "node:path"; + +export interface EvalHarnessConfig { + readonly preSteps: HarnessPreStep[]; + /** Directory under the run folder for the agent process `cwd` (default `"."` = run dir). */ + readonly agentCwd: string; +} + +export type HarnessPreStep = GitClonePreStep; + +export interface GitClonePreStep { + readonly type: "git_clone"; + readonly url: string; + /** Target directory name under the run dir (single segment, no `..`). */ + readonly into: string; + readonly depth?: number; + /** If set and `process.env[urlEnv]` is non-empty, use it instead of `url`. */ + readonly urlEnv?: string; +} + +const DEFAULT_CONFIG: EvalHarnessConfig = { preSteps: [], agentCwd: "." 
}; + +function envFlagTruthy(v: string | undefined): boolean { + if (!v) return false; + const s = v.trim().toLowerCase(); + return s === "1" || s === "true" || s === "yes"; +} + +/** Resolved path must stay under `root` (no `..` escape). */ +export function pathMustStayInsideRunDir(root: string, relativeOrAbsolute: string): string { + const resolved = resolve(relativeOrAbsolute); + const r = resolve(root); + if (resolved === r) return resolved; + const prefix = r.endsWith(sep) ? r : r + sep; + if (!resolved.startsWith(prefix)) { + throw new Error(`Path escapes run directory: ${relativeOrAbsolute} -> ${resolved}`); + } + return resolved; +} + +function assertSingleRunSubdir(name: string, field: string): void { + if (!name || name === "." || name === "..") { + throw new Error(`eval-harness: invalid ${field} (empty, ., or ..)`); + } + if (name.includes("/") || name.includes("\\") || name.includes("..")) { + throw new Error(`eval-harness: ${field} must be a single path segment: ${JSON.stringify(name)}`); + } +} + +function isRecord(v: unknown): v is Record<string, unknown> { + return typeof v === "object" && v !== null && !Array.isArray(v); +} + +function parseGitCloneStep(raw: Record<string, unknown>, index: number): GitClonePreStep { + const url = raw.url; + const into = raw.into; + if (typeof url !== "string" || url.length === 0) { + throw new Error(`eval-harness: preSteps[${index}] git_clone requires non-empty string "url"`); + } + if (typeof into !== "string" || into.length === 0) { + throw new Error(`eval-harness: preSteps[${index}] git_clone requires non-empty string "into"`); + } + assertSingleRunSubdir(into, "into"); + const depth = raw.depth; + if (depth !== undefined && (typeof depth !== "number" || !Number.isInteger(depth) || depth < 1)) { + throw new Error(`eval-harness: preSteps[${index}] git_clone "depth" must be a positive integer`); + } + const urlEnv = raw.urlEnv; + if (urlEnv !== undefined && (typeof urlEnv !== "string" || urlEnv.length === 0)) { + throw new Error(`eval-harness: 
preSteps[${index}] git_clone "urlEnv" must be a non-empty string`); + } + return { + type: "git_clone", + url, + into, + ...(depth !== undefined ? { depth } : {}), + ...(urlEnv ? { urlEnv } : {}), + }; +} + +function parsePreStep(raw: unknown, index: number): HarnessPreStep { + if (!isRecord(raw)) { + throw new Error(`eval-harness: preSteps[${index}] must be an object`); + } + const t = raw.type; + if (t === "git_clone") { + return parseGitCloneStep(raw, index); + } + throw new Error(`eval-harness: preSteps[${index}] unknown type ${JSON.stringify(t)}`); +} + +/** + * Parse `## Eval harness` and a ```eval-harness JSON block. Missing section → default (no pre-steps, cwd = run dir). + */ +export function parseEvalHarness(markdown: string): EvalHarnessConfig { + const m = markdown.match(/^## Eval harness\s*$/m); + if (!m || m.index === undefined) { + return DEFAULT_CONFIG; + } + const afterHeader = markdown.slice(m.index + m[0].length); + const nextH2 = afterHeader.match(/^## [^\s#]/m); + const section = nextH2?.index !== undefined ? afterHeader.slice(0, nextH2.index) : afterHeader; + const fence = section.match(/```eval-harness\s*\n([\s\S]*?)```/); + if (!fence) { + throw new Error( + 'Scenario has "## Eval harness" but no ```eval-harness ... ``` JSON block (add one, or remove the heading).', + ); + } + let parsed: unknown; + try { + parsed = JSON.parse(fence[1]!.trim()); + } catch (e) { + throw new Error( + `eval-harness: invalid JSON in ## Eval harness block: ${e instanceof Error ? 
e.message : String(e)}`, + ); + } + if (!isRecord(parsed)) { + throw new Error("eval-harness: root must be a JSON object"); + } + const preRaw = parsed.preSteps; + const preSteps: HarnessPreStep[] = []; + if (preRaw !== undefined) { + if (!Array.isArray(preRaw)) { + throw new Error('eval-harness: "preSteps" must be an array'); + } + for (let i = 0; i < preRaw.length; i++) { + preSteps.push(parsePreStep(preRaw[i], i)); + } + } + let agentCwd = "."; + const ac = parsed.agentCwd; + if (ac !== undefined) { + if (typeof ac !== "string") { + throw new Error('eval-harness: "agentCwd" must be a string'); + } + agentCwd = ac.trim() || "."; + } + if (agentCwd !== "." && agentCwd !== "") { + assertSingleRunSubdir(agentCwd, "agentCwd"); + } else { + agentCwd = "."; + } + return { preSteps, agentCwd }; +} + +async function dirLooksCloned(target: string): Promise<boolean> { + if (!existsSync(target)) return false; + const entries = await readdir(target); + return entries.length > 0; +} + +async function runGitClone(runDir: string, step: GitClonePreStep): Promise<void> { + const url = + (step.urlEnv && process.env[step.urlEnv]?.trim()) || step.url; + if (!url) { + throw new Error( + `eval-harness: git_clone into ${step.into} has no URL (set "url" or env ${step.urlEnv ?? "(none)"})`, + ); + } + const target = join(runDir, step.into); + if (await dirLooksCloned(target)) { + console.error(`Harness: skip git_clone (directory already non-empty): ${target}`); + return; + } + const { execFile } = await import("node:child_process"); + const { promisify } = await import("node:util"); + const execFileAsync = promisify(execFile); + const depth = step.depth ?? 
1; + console.error(`Harness: git clone -> ${target}`); + try { + await execFileAsync("git", ["clone", "--depth", String(depth), url, target], { + cwd: runDir, + maxBuffer: 50 * 1024 * 1024, + }); + } catch (err) { + if (await dirLooksCloned(target)) { + return; + } + throw new Error( + `Harness git_clone failed (${url} -> ${target}): ${err instanceof Error ? err.message : String(err)}`, + ); + } +} + +/** + * Run harness pre-steps and return absolute agent cwd + run dir for the write guard. + */ +export async function applyEvalHarness( + runDir: string, + config: EvalHarnessConfig, +): Promise<{ agentCwd: string; writeGuardRoot: string }> { + const writeGuardRoot = runDir; + const skip = envFlagTruthy(process.env.EVAL_SKIP_HARNESS_PRE_STEPS); + + if (!skip) { + for (const step of config.preSteps) { + if (step.type === "git_clone") { + await runGitClone(runDir, step); + } + } + } else if (config.preSteps.length > 0) { + console.error("Harness: EVAL_SKIP_HARNESS_PRE_STEPS set — skipped all preSteps."); + } + + const relative = config.agentCwd === "." ? "" : config.agentCwd; + const agentCwd = relative ? join(runDir, relative) : runDir; + pathMustStayInsideRunDir(runDir, agentCwd); + + if (!existsSync(agentCwd)) { + if (skip) { + console.error( + `Harness: agent cwd ${agentCwd} missing (pre-steps skipped); falling back to run dir ${runDir}`, + ); + return { agentCwd: runDir, writeGuardRoot }; + } + throw new Error(`Harness: agent cwd does not exist after pre-steps: ${agentCwd}`); + } + + return { agentCwd, writeGuardRoot }; +} diff --git a/docs/agent-evaluation/src/llm-judge.ts b/docs/agent-evaluation/src/llm-judge.ts new file mode 100644 index 000000000..b3e9ae0b9 --- /dev/null +++ b/docs/agent-evaluation/src/llm-judge.ts @@ -0,0 +1,230 @@ +/** + * LLM-as-judge scoring via Anthropic Messages API. + * Feeds scenario Success criteria + assistant transcript; returns structured JSON from the model. 
+ */ + +import { readFile } from "node:fs/promises"; +import { basename, dirname, join } from "node:path"; +import { extractTranscriptScoringText } from "./score-transcript.js"; + +const ANTHROPIC_MESSAGES_URL = "https://api.anthropic.com/v1/messages"; +const DEFAULT_SCORE_MODEL = "claude-sonnet-4-20250514"; +const MAX_TRANSCRIPT_CHARS = 180_000; + +export interface LlmCriterionJudgment { + readonly criterion: string; + readonly pass: boolean; + readonly evidence: string; +} + +export interface LlmJudgeReport { + readonly version: 1; + readonly model: string; + readonly runFile: string; + readonly scenarioFile: string; + readonly overall_transcript_pass: boolean; + /** LLM cannot run curls; always note limits */ + readonly execution_in_transcript: { + readonly pass: boolean | null; + readonly note: string; + }; + readonly criteria: readonly LlmCriterionJudgment[]; + readonly summary: string; +} + +interface RunJson { + meta?: { + scenarioId?: string; + scenarioFile?: string; + turns?: readonly { label?: string; messageCount?: number }[]; + }; + messages?: unknown[]; +} + +export function extractSuccessCriteriaMarkdown(fullMd: string): string { + const anchor = "## Success criteria"; + const i = fullMd.indexOf(anchor); + if (i === -1) { + return "(No ## Success criteria section found.)"; + } + const rest = fullMd.slice(i); + const sub = rest.slice(anchor.length); + const rel = sub.search(/\n## [A-Za-z]/); + return rel === -1 ? 
rest.trim() : rest.slice(0, anchor.length + rel).trim(); +} + +function stripJsonFence(text: string): string { + const t = text.trim(); + const m = t.match(/^```(?:json)?\s*([\s\S]*?)```$/m); + if (m) return m[1].trim(); + return t; +} + +function parseJudgeJson(text: string): Omit<LlmJudgeReport, "version" | "model" | "runFile" | "scenarioFile"> & { + version?: number; +} { + const raw = stripJsonFence(text); + const parsed = JSON.parse(raw) as Record<string, unknown>; + const overall = Boolean(parsed.overall_transcript_pass); + const criteriaIn = parsed.criteria; + const criteria: LlmCriterionJudgment[] = []; + if (Array.isArray(criteriaIn)) { + for (const c of criteriaIn) { + if (typeof c !== "object" || c === null) continue; + const o = c as Record<string, unknown>; + criteria.push({ + criterion: String(o.criterion ?? o.id ?? "unnamed"), + pass: Boolean(o.pass), + evidence: String(o.evidence ?? ""), + }); + } + } + const exec = parsed.execution_in_transcript; + let execution_in_transcript: LlmJudgeReport["execution_in_transcript"] = { + pass: null, + note: "Not specified by judge.", + }; + if (typeof exec === "object" && exec !== null) { + const e = exec as Record<string, unknown>; + execution_in_transcript = { + pass: typeof e.pass === "boolean" ? e.pass : null, + note: String(e.note ?? ""), + }; + } + return { + overall_transcript_pass: overall, + execution_in_transcript: execution_in_transcript, + criteria, + summary: String(parsed.summary ?? ""), + }; +} + +const JUDGE_SYSTEM = `You are an expert evaluator for Hookdeck Outpost onboarding documentation and API usage. +You judge whether an AI assistant's replies satisfy the scenario's Success criteria (markdown checklist from the scenario spec). +Be strict: a criterion passes only if the transcript (including code the model wrote via tools) clearly satisfies it. +You cannot run shell or HTTP — do not claim execution passed; use execution_in_transcript.pass = null and explain in note. 
+Output ONLY valid JSON (no markdown fences, no commentary outside JSON) matching this shape: +{ + "overall_transcript_pass": boolean, + "execution_in_transcript": { "pass": null, "note": "string explaining you did not execute code" }, + "criteria": [ + { "criterion": "short label from checklist", "pass": boolean, "evidence": "1-3 sentences; quote or paraphrase assistant" } + ], + "summary": "2-4 sentences overall" +} +Map each major bullet/checkbox line from Success criteria to one criteria[] entry (merge tiny sub-bullets if needed).`; + +export async function llmJudgeRun(options: { + readonly runPath: string; + readonly scenarioMdPath: string; + readonly apiKey: string; + readonly model?: string; +}): Promise<LlmJudgeReport> { + const model = options.model?.trim() || process.env.EVAL_SCORE_MODEL?.trim() || DEFAULT_SCORE_MODEL; + const rawRun = await readFile(options.runPath, "utf8"); + const data = JSON.parse(rawRun) as RunJson; + const scenarioFile = data.meta?.scenarioFile ?? "unknown.md"; + const scenarioMd = await readFile(options.scenarioMdPath, "utf8"); + const criteriaBlock = extractSuccessCriteriaMarkdown(scenarioMd); + + let transcript = extractTranscriptScoringText(data.messages); + if (transcript.length > MAX_TRANSCRIPT_CHARS) { + transcript = + transcript.slice(0, MAX_TRANSCRIPT_CHARS) + + "\n\n[… transcript truncated for judge context …]\n"; + } + + const userContent = `## Success criteria (from scenario spec — your rubric) + +${criteriaBlock} + +--- + +## Transcript for review (assistant text plus tool-written file contents and tool inputs from the run JSON) + +${transcript} + +--- + +Judge the transcript against the Success criteria. 
Remember: execution (running curl against a live API) is NOT evidenced here unless the transcript explicitly describes successful HTTP results; normally set execution_in_transcript.pass to null.`; + + const res = await fetch(ANTHROPIC_MESSAGES_URL, { + method: "POST", + headers: { + "content-type": "application/json", + "x-api-key": options.apiKey, + "anthropic-version": "2023-06-01", + }, + body: JSON.stringify({ + model, + max_tokens: 8192, + system: JUDGE_SYSTEM, + messages: [{ role: "user", content: userContent }], + }), + }); + + if (!res.ok) { + const errText = await res.text(); + throw new Error(`Anthropic API ${res.status}: ${errText.slice(0, 2000)}`); + } + + const body = (await res.json()) as { + content?: readonly { type?: string; text?: string }[]; + }; + const textBlock = body.content?.find((c) => c.type === "text"); + const text = textBlock?.text ?? ""; + let judged: ReturnType<typeof parseJudgeJson>; + try { + judged = parseJudgeJson(text); + } catch { + throw new Error( + `Judge did not return parseable JSON. First 800 chars:\n${text.slice(0, 800)}`, + ); + } + + return { + version: 1, + model, + runFile: options.runPath, + scenarioFile, + overall_transcript_pass: judged.overall_transcript_pass, + execution_in_transcript: judged.execution_in_transcript, + criteria: judged.criteria, + summary: judged.summary, + }; +} + +export function scenarioMdPathFromRun( + evalRoot: string, + scenarioFile: string | undefined, +): string { + if (!scenarioFile?.trim()) { + throw new Error("Run JSON meta.scenarioFile is missing"); + } + return join(evalRoot, "scenarios", scenarioFile); +} + +export function formatLlmReportHuman(r: LlmJudgeReport): string { + const lines: string[] = [ + `LLM judge (${r.model})`, + `Transcript: ${r.runFile}`, + `Scenario: ${r.scenarioFile}`, + ]; + if (basename(r.runFile) === "transcript.json") { + lines.push(`Run directory: ${dirname(r.runFile)}`); + } + lines.push( + "", + `Overall transcript pass: ${r.overall_transcript_pass ? 
"YES" : "NO"}`, + `Execution (from transcript only): pass=${String(r.execution_in_transcript.pass)} — ${r.execution_in_transcript.note}`, + "", + "Per criterion:", + ); + for (const c of r.criteria) { + lines.push(` [${c.pass ? "PASS" : "FAIL"}] ${c.criterion}`); + lines.push(` ${c.evidence}`); + } + lines.push(""); + lines.push(`Summary: ${r.summary}`); + return lines.join("\n"); +} diff --git a/docs/agent-evaluation/src/run-agent-eval.ts b/docs/agent-evaluation/src/run-agent-eval.ts new file mode 100644 index 000000000..26781248a --- /dev/null +++ b/docs/agent-evaluation/src/run-agent-eval.ts @@ -0,0 +1,957 @@ +/** + * Automated Outpost onboarding agent evals via the Claude Agent SDK. + * + * Requires ANTHROPIC_API_KEY (and EVAL_TEST_DESTINATION_URL). Does not call Outpost. + * For a full eval, humans (or a separate verifier) run generated artifacts using OUTPOST_API_KEY — see README. + * + * @see https://platform.claude.com/docs/en/agent-sdk/overview + */ + +import { writeFileSync } from "node:fs"; +import { mkdir, readdir, readFile, writeFile } from "node:fs/promises"; +import { basename, dirname, join, resolve, sep } from "node:path"; +import { fileURLToPath } from "node:url"; +import { parseArgs } from "node:util"; +import dotenv from "dotenv"; +import { + query, + type HookInput, + type Options, + type SDKMessage, + type SDKSystemMessage, +} from "@anthropic-ai/claude-agent-sdk"; +import { applyEvalHarness, parseEvalHarness } from "./eval-harness.js"; +import { llmJudgeRun, scenarioMdPathFromRun } from "./llm-judge.js"; +import { scoreRunFile } from "./score-transcript.js"; + +const __filename = fileURLToPath(import.meta.url); +const __dirname = dirname(__filename); + +/** `docs/agent-evaluation/` */ +const EVAL_ROOT = join(__dirname, ".."); + +dotenv.config({ path: join(EVAL_ROOT, ".env") }); +/** Outpost repository root */ +const REPO_ROOT = join(EVAL_ROOT, "..", ".."); +const PROMPT_MDX = join( + REPO_ROOT, + 
"docs/content/quickstarts/hookdeck-outpost-agent-prompt.mdoc", +); +const SCENARIOS_DIR = join(EVAL_ROOT, "scenarios"); +const RUNS_DIR = join(EVAL_ROOT, "results", "runs"); + +/** + * Harness-only status files next to the run folder (not inside `runDir`) so the agent sandbox cannot Read them. + * Example: `…/runs/2026-…-scenario-08/transcript.json` vs `…/runs/2026-…-scenario-08.eval-started.json`. + */ +function harnessSidecarPaths(runDir: string): { + started: string; + failure: string; + aborted: string; +} { + const stem = basename(runDir); + return { + started: join(RUNS_DIR, `${stem}.eval-started.json`), + failure: join(RUNS_DIR, `${stem}.eval-failure.json`), + aborted: join(RUNS_DIR, `${stem}.eval-aborted.json`), + }; +} + +/** Paths for SIGTERM/SIGINT abort sidecar while a scenario is in progress (not SIGKILL). */ +let activeHarnessAbortContext: { readonly path: string; readonly runDirectory: string } | null = null; + +function registerEvalSignalHandlers(): void { + const recordAbort = (signal: string) => { + const ctx = activeHarnessAbortContext; + if (!ctx) return; + try { + writeFileSync( + ctx.path, + `${JSON.stringify( + { + abortedAt: new Date().toISOString(), + signal, + pid: process.pid, + runDirectory: ctx.runDirectory, + note: "Process exited before transcript.json was written; long agent turns often print little to stdout.", + }, + null, + 2, + )}\n`, + "utf8", + ); + } catch { + // best-effort + } + }; + process.once("SIGTERM", () => { + recordAbort("SIGTERM"); + process.exit(143); + }); + process.once("SIGINT", () => { + recordAbort("SIGINT"); + process.exit(130); + }); +} + +function isInitSystemMessage(m: SDKMessage): m is SDKSystemMessage { + return m.type === "system" && m.subtype === "init"; +} + +function extractTemplateFromMdx(mdx: string): string { + const idx = mdx.indexOf("## Template"); + if (idx === -1) { + throw new Error( + "Could not find ## Template in hookdeck-outpost-agent-prompt.mdoc", + ); + } + const after = mdx.slice(idx); 
+ const fenceStart = after.indexOf("```"); + if (fenceStart === -1) { + throw new Error("No opening code fence after ## Template"); + } + const contentStart = after.indexOf("\n", fenceStart) + 1; + const fenceEnd = after.indexOf("```", contentStart); + if (fenceEnd === -1) { + throw new Error("No closing code fence for ## Template"); + } + return after.slice(contentStart, fenceEnd).trim(); +} + +function envFlagTruthy(v: string | undefined): boolean { + if (!v) return false; + const s = v.trim().toLowerCase(); + return s === "1" || s === "true" || s === "yes"; +} + +/** Wall-clock heartbeat while the SDK stream is quiet (e.g. long Bash / blocked subprocess). */ +function evalProgressIntervalMs(): number { + const n = Number(process.env.EVAL_PROGRESS_INTERVAL_MS ?? "30000"); + if (!Number.isFinite(n) || n < 5000) { + return 30000; + } + return n; +} + +/** When docs are not published yet, point the agent at MDX/OpenAPI paths in this repo. */ +function localDocumentationBlock(repoRoot: string, llmsFullUrl: string | undefined): string { + const f = (...parts: string[]) => join(repoRoot, ...parts); + const languageSdkBlock = `### Language → SDK vs HTTP + +Map what the user says (they rarely name packages): + +- **Simplest / minimal / least setup** and no language named → **curl** quickstart + OpenAPI; one shell script; **no SDK**. Read the **entire** curl quickstart (it covers REST responses and any shell portability notes for scripts). +- **TypeScript** or **Node** → TypeScript quickstart + \`@hookdeck/outpost-sdk\` as in that doc. +- **Python** → Python quickstart + \`outpost_sdk\`; \`publish.event(request={{...}})\` as in that doc — not TS-style kwargs. +- **Go** → Go quickstart + official Go SDK as in that doc. +- Explicit **curl** / **HTTP only** / **REST** → curl quickstart + OpenAPI. + +**Small app (option 2):** Next.js → TS SDK server-side; FastAPI → Python SDK; Go net/http → Go SDK — use that language’s quickstart for Outpost shapes. 
+ +**Existing app (option 3):** Official SDK for the repo’s language (or REST if they refuse SDK). + +Do **not** mix TS call shapes into Python.`; + + let block = `### Documentation (local repository — unpublished) + +Do **not** rely on live public documentation URLs for this session. Read these files from the Outpost checkout (for example with the **Read** tool). Paths are absolute from the repository root: + +Follow **Language → SDK vs HTTP** below for mapping user intent to the **single** right quickstart. Prefer language quickstarts over \`sdks.mdoc\` (TS-heavy). + +- **Concepts** (tenants, destinations as subscriptions, topics, how this fits a SaaS/platform): \`${f("docs/content/concepts.mdoc")}\` +- **Building your own UI** (screen structure: list destinations, create flow type → topics → config): \`${f("docs/content/guides/building-your-own-ui.mdoc")}\` +- **Topics** (destination topic subscriptions, fan-out): \`${f("docs/content/features/topics.mdoc")}\` +- Getting started (curl / HTTP only): \`${f("docs/content/quickstarts/hookdeck-outpost-curl.mdoc")}\` +- TypeScript quickstart (TS SDK): \`${f("docs/content/quickstarts/hookdeck-outpost-typescript.mdoc")}\` +- Python quickstart (Python SDK): \`${f("docs/content/quickstarts/hookdeck-outpost-python.mdoc")}\` +- Go quickstart (Go SDK): \`${f("docs/content/quickstarts/hookdeck-outpost-go.mdoc")}\` +- Docs content (browse for feature pages): \`${f("docs/content/")}\` +- OpenAPI spec (machine-readable): \`${f("docs/apis/openapi.yaml")}\` +- **Destination types** (summary + links): \`${f("docs/content/overview.mdoc")}\` — *Supported destinations*; per-type detail in \`docs/content/destinations/*.mdoc\` (e.g. \`${f("docs/content/destinations/webhook.mdoc")}\`) +- SDKs overview (TS-heavy): \`${f("docs/content/sdks.mdoc")}\` — prefer the language quickstart over this for Python/Go/TS code. 
+ +${languageSdkBlock}`; + if (llmsFullUrl) { + block += `\n- Full docs bundle: ${llmsFullUrl}`; + } + return block; +} + +function applyPlaceholders( + template: string, + env: NodeJS.ProcessEnv, + repoRoot: string, +): string { + const apiBase = + env.EVAL_API_BASE_URL ?? "https://api.outpost.hookdeck.com/2025-07-01"; + const topics = env.EVAL_TOPICS_LIST ?? "- user.created"; + const testUrl = env.EVAL_TEST_DESTINATION_URL?.trim(); + if (!testUrl) { + throw new Error( + "Set EVAL_TEST_DESTINATION_URL to your Hookdeck Console Source URL (same value the dashboard injects as {{TEST_DESTINATION_URL}})", + ); + } + const docsUrl = env.EVAL_DOCS_URL ?? "https://hookdeck.com/docs/outpost"; + const llms = env.EVAL_LLMS_FULL_URL?.trim() ?? ""; + const useLocalDocs = envFlagTruthy(env.EVAL_LOCAL_DOCS); + + let base = template; + if (useLocalDocs) { + const docSection = /^### Documentation\n\n[\s\S]*?(?=\n### What to do\b)/m; + if (!docSection.test(base)) { + throw new Error( + "EVAL_LOCAL_DOCS is set but the prompt template has no ### Documentation section before ### What to do", + ); + } + base = base.replace( + docSection, + localDocumentationBlock(repoRoot, llms || undefined), + ); + } + + let out = base + .replaceAll("{{API_BASE_URL}}", apiBase) + .replaceAll("{{TOPICS_LIST}}", topics) + .replaceAll("{{TEST_DESTINATION_URL}}", testUrl) + .replaceAll("{{DOCS_URL}}", docsUrl) + .replaceAll("{{LLMS_FULL_URL}}", llms); + + if (!llms) { + out = out + .split("\n") + .filter((line) => !/Full docs bundle/i.test(line)) + .join("\n"); + } + + return out; +} + +interface ParsedTurn { + readonly num: number; + readonly title: string; + readonly body: string; + readonly optional: boolean; +} + +function parseScenarioTurns(markdown: string): ParsedTurn[] { + const lines = markdown.split(/\r?\n/); + const turns: ParsedTurn[] = []; + let i = 0; + + while (i < lines.length) { + const line = lines[i]; + const m = line.match(/^### Turn (\d+)\s*(.*)$/); + if (m) { + const num = 
Number(m[1]); + const restOfTitle = m[2] ?? ""; + const title = `Turn ${m[1]}${restOfTitle ? ` ${restOfTitle}` : ""}`; + const optional = /optional/i.test(title); + i++; + const bodyLines: string[] = []; + while (i < lines.length) { + const L = lines[i]; + if (/^### /.test(L)) { + break; + } + if (/^## /.test(L)) { + break; + } + bodyLines.push(L); + i++; + } + turns.push({ + num, + title, + body: bodyLines.join("\n").trim(), + optional, + }); + continue; + } + i++; + } + + return turns.sort((a, b) => a.num - b.num); +} + +function extractUserMessage(turnBody: string): string { + const quoted: string[] = []; + for (const line of turnBody.split(/\r?\n/)) { + const q = line.match(/^\s*>\s?(.*)$/); + if (q) { + quoted.push(q[1]); + } + } + const fromBlockquote = quoted.join("\n").trim(); + if (fromBlockquote) { + return fromBlockquote; + } + return turnBody.replace(/^\s*$/gm, "").trim(); +} + +function serializeMessage(message: SDKMessage): unknown { + try { + return JSON.parse( + JSON.stringify(message, (_, v) => (typeof v === "bigint" ? v.toString() : v)), + ); + } catch { + return { _nonSerializable: String(message) }; + } +} + +async function listScenarioFiles(): Promise<string[]> { + const names = await readdir(SCENARIOS_DIR); + return names + .filter((n) => /^\d{2}-.*\.md$/.test(n)) + .sort(); +} + +function idFromFilename(file: string): string { + return file.slice(0, 2); +} + +async function runScenarioQuery( + prompt: string, + options: Options, + progress?: { readonly phaseLabel: string }, +): Promise<{ messages: unknown[]; sessionId?: string }> { + const messages: unknown[] = []; + let sessionId: string | undefined; + const progressOn = envFlagTruthy(process.env.EVAL_PROGRESS); + const label = progress?.phaseLabel ??
"agent query"; + let msgCount = 0; + let interval: ReturnType | undefined; + + if (progressOn && progress) { + const maxTurns = options.maxTurns; + console.error( + `[eval] ${label}: starting (EVAL_PROGRESS=1; heartbeat every ${evalProgressIntervalMs()}ms; maxTurns=${String(maxTurns)})`, + ); + interval = setInterval(() => { + console.error( + `[eval] ${label}: still running (${msgCount} SDK message(s) so far — subprocess or model may be busy with no new stream events)`, + ); + }, evalProgressIntervalMs()); + } + + try { + const q = query({ prompt, options }); + for await (const message of q) { + msgCount += 1; + messages.push(serializeMessage(message)); + if (isInitSystemMessage(message)) { + sessionId = message.session_id; + } + if (progressOn && progress && msgCount > 0 && msgCount % 25 === 0) { + console.error(`[eval] ${label}: ${msgCount} SDK message(s) received`); + } + } + } finally { + if (interval !== undefined) { + clearInterval(interval); + } + } + + if (progressOn && progress) { + console.error(`[eval] ${label}: finished this query (${msgCount} SDK message(s))`); + } + + return { messages, sessionId }; +} + +async function runOneScenario( + scenarioFile: string, + filledTemplate: string, + opts: { + skipOptional: boolean; + baseOptions: Options; + /** When set, avoids a second read of the scenario file (same content as harness parse). */ + scenarioMarkdown?: string; + }, +): Promise<{ + scenarioId: string; + scenarioFile: string; + turns: Array<{ label: string; messageCount: number }>; + sessionId?: string; + allMessages: unknown[]; +}> { + const path = join(SCENARIOS_DIR, scenarioFile); + const md = opts.scenarioMarkdown ?? 
(await readFile(path, "utf8")); + const parsed = parseScenarioTurns(md); + + const userTurns = parsed + .filter((t) => t.num >= 1) + .filter((t) => !t.optional || !opts.skipOptional) + .map((t) => ({ + label: t.title, + text: extractUserMessage(t.body), + })) + .filter((t) => t.text.length > 0); + + const prompts = [filledTemplate, ...userTurns.map((t) => t.text)]; + + const allMessages: unknown[] = []; + let sessionId: string | undefined; + const turnStats: Array<{ label: string; messageCount: number }> = []; + + for (let i = 0; i < prompts.length; i++) { + const label = i === 0 ? "Turn 0 (dashboard prompt)" : userTurns[i - 1]?.label ?? `Turn ${i}`; + const before = allMessages.length; + const { messages, sessionId: sid } = await runScenarioQuery( + prompts[i]!, + { + ...opts.baseOptions, + resume: sessionId, + }, + { phaseLabel: label }, + ); + if (sid) { + sessionId = sid; + } + allMessages.push(...messages); + turnStats.push({ + label, + messageCount: allMessages.length - before, + }); + } + + return { + scenarioId: idFromFilename(scenarioFile), + scenarioFile, + turns: turnStats, + sessionId, + allMessages, + }; +} + +/** True if resolved `filePath` is `runDir` or a path inside it (never outside). */ +function filePathIsInsideRunDir(runDir: string, filePath: string): boolean { + const root = resolve(runDir); + const target = resolve(filePath); + if (target === root) return true; + const prefix = root.endsWith(sep) ? root : root + sep; + return target.startsWith(prefix); +} + +function resolveMaybeRelativePath(p: string, agentCwd: string): string { + if (p.startsWith(sep) || /^[A-Za-z]:[\\/]/.test(p)) { + return resolve(p); + } + return resolve(agentCwd, p); +} + +/** Read/Glob/Grep may touch the run directory, or (with local docs) only `repoRoot/docs`. 
*/ +function pathAllowedForReadTool( + absPath: string, + runDir: string, + repoRoot: string, + localDocs: boolean, +): boolean { + const p = resolve(absPath); + if (filePathIsInsideRunDir(runDir, p)) return true; + if (localDocs && filePathIsInsideRunDir(join(repoRoot, "docs"), p)) return true; + return false; +} + +/** + * Bash: block commands that reference the Outpost repo root unless the reference stays under + * `runDir` or (local docs) `repoRoot/docs`. + */ +function bashCommandAllowed(command: string, runDir: string, repoRoot: string, localDocs: boolean): boolean { + const rr = resolve(repoRoot); + const rd = resolve(runDir); + const docRoot = localDocs ? resolve(join(repoRoot, "docs")) : null; + if (!command.includes(rr)) return true; + if (command.includes(rd)) return true; + if (docRoot && command.includes(docRoot)) return true; + if (localDocs && command.includes(join(repoRoot, "docs"))) return true; + return false; +} + +function toolInputWritePath(toolName: string, toolInput: unknown): string | undefined { + if (toolName !== "Write" && toolName !== "Edit" && toolName !== "NotebookEdit") { + return undefined; + } + if (typeof toolInput !== "object" || toolInput === null) return undefined; + const input = toolInput as Record<string, unknown>; + for (const k of ["file_path", "path", "notebook_path"] as const) { + const v = input[k]; + if (typeof v === "string" && v.length > 0) return v; + } + return undefined; +} + +function toolInputReadFilePath(toolInput: unknown): string | undefined { + if (typeof toolInput !== "object" || toolInput === null) return undefined; + const v = (toolInput as Record<string, unknown>).file_path; + return typeof v === "string" && v.length > 0 ?
v : undefined; +} + +function preToolDeny(reason: string) { + return { + hookSpecificOutput: { + hookEventName: "PreToolUse" as const, + permissionDecision: "deny" as const, + permissionDecisionReason: reason, + }, + }; +} + +/** + * Appended to Turn 0 so the model does not treat the Hookdeck Outpost monorepo as the integration target. + */ +function buildWorkspaceBoundaryAppendix( + runDir: string, + agentCwd: string, + repoRoot: string, + localDocs: boolean, +): string { + const docsPath = join(repoRoot, "docs"); + const docBullet = localDocs + ? `\n- You **may** use Read/Glob/Grep only under **\`${docsPath}\`** when following the **Documentation (local repository)** paths in this prompt—not elsewhere under **\`${repoRoot}\`** (no \`sdks/\`, \`internal/\`, \`go.mod\` at repo root, etc.).` + : `\n- Do **not** read or search the Hookdeck Outpost checkout on disk outside **\`${runDir}\`**; use the documentation URLs already listed above.`; + + return ` + +### Workspace boundary (automated eval session) + +- The **integration target** is **only** under **\`${runDir}\`** (shell cwd: **\`${agentCwd}\`**). Install dependencies, add SDK usage, routes, UI, and env/README notes **there**. +- Do **not** use Read, Glob, Grep, or Bash to explore **\`${repoRoot}\`** except:${docBullet} +- Do **not** use the **Agent** tool to spider the monorepo or another tree; implement the integration directly in this workspace. +`; +} + +/** + * PreToolUse hook: Write/Edit only under run dir; Read/Glob/Grep/Bash constrained to run dir (+ docs/ when EVAL_LOCAL_DOCS). + * \`EVAL_DISABLE_WORKSPACE_READ_GUARD=1\` — allow Read/Glob/Grep/Bash/Agent outside the sandbox. + * \`EVAL_DISABLE_WORKSPACE_WRITE_GUARD=1\` — allow Write/Edit outside the run directory (read sandbox unchanged unless also disabled above). 
+ */ +function createRunDirPreToolHook(ctx: { + allowedRootDir: string; + agentCwd: string; + runDir: string; + repoRoot: string; + localDocs: boolean; + readGuardOn: boolean; + writeGuardOn: boolean; +}) { + const { allowedRootDir, agentCwd, runDir, repoRoot, localDocs, readGuardOn, writeGuardOn } = ctx; + + return async (input: HookInput) => { + if (input.hook_event_name !== "PreToolUse") return {}; + + if (readGuardOn && input.tool_name === "Agent" && !envFlagTruthy(process.env.EVAL_ALLOW_AGENT_TOOL)) { + return preToolDeny( + "Outpost agent-eval: the Agent subagent is disabled for fair scoring (set EVAL_ALLOW_AGENT_TOOL=1 to allow).", + ); + } + + if (readGuardOn && input.tool_name === "Read") { + const raw = toolInputReadFilePath(input.tool_input); + if (raw) { + const abs = resolveMaybeRelativePath(raw, agentCwd); + if (!pathAllowedForReadTool(abs, runDir, repoRoot, localDocs)) { + return preToolDeny( + `Outpost agent-eval: Read must stay under the scenario run directory or (with EVAL_LOCAL_DOCS) ${join(repoRoot, "docs")}. Refused: ${abs}`, + ); + } + } + return {}; + } + + if (readGuardOn && input.tool_name === "Glob") { + const inp = input.tool_input; + if (typeof inp === "object" && inp !== null) { + const pathRaw = (inp as Record<string, unknown>).path; + if (typeof pathRaw === "string" && pathRaw.length > 0) { + const abs = resolveMaybeRelativePath(pathRaw, agentCwd); + if (!pathAllowedForReadTool(abs, runDir, repoRoot, localDocs)) { + return preToolDeny( + `Outpost agent-eval: Glob path must stay under the run directory or repo docs/. 
Refused: ${abs}`, + ); + } + } + } + return {}; + } + + if (readGuardOn && input.tool_name === "Grep") { + const inp = input.tool_input; + if (typeof inp === "object" && inp !== null) { + const pathRaw = (inp as Record<string, unknown>).path; + if (typeof pathRaw === "string" && pathRaw.length > 0) { + const abs = resolveMaybeRelativePath(pathRaw, agentCwd); + if (!pathAllowedForReadTool(abs, runDir, repoRoot, localDocs)) { + return preToolDeny( + `Outpost agent-eval: Grep path must stay under the run directory or repo docs/. Refused: ${abs}`, + ); + } + } + } + return {}; + } + + if (readGuardOn && input.tool_name === "Bash") { + const inp = input.tool_input; + if (typeof inp === "object" && inp !== null) { + const cmd = (inp as Record<string, unknown>).command; + if (typeof cmd === "string" && cmd.trim().length > 0) { + if (!bashCommandAllowed(cmd, runDir, repoRoot, localDocs)) { + return preToolDeny( + `Outpost agent-eval: Bash must not traverse the Outpost monorepo outside this run (or docs/ when EVAL_LOCAL_DOCS=1). Refused command prefix: ${cmd.slice(0, 120)}${cmd.length > 120 ? "…" : ""}`, + ); + } + } + } + return {}; + } + + if (writeGuardOn) { + const candidate = toolInputWritePath(input.tool_name, input.tool_input); + if (candidate && !filePathIsInsideRunDir(allowedRootDir, candidate)) { + return preToolDeny( + `Outpost agent-eval: ${input.tool_name} must target only the scenario run directory tree. Use a path under ${allowedRootDir}. Refused: ${resolve(candidate)}`, + ); + } + } + return {}; + }; +} + +function defaultEvalTools(env: NodeJS.ProcessEnv): string { + if (env.EVAL_TOOLS?.trim()) { + return env.EVAL_TOOLS.trim(); + } + // dontAsk + allowedTools: only listed tools are pre-approved; others are denied. + // Write/Edit: materialize scripts and apps into the per-run directory (agent cwd). + // Bash: npm/npx/go mod/pip/uv for app scenarios (05–07) and installs for 02–04. + // WebFetch: omitted when EVAL_LOCAL_DOCS uses repo paths + Read instead. 
+ return envFlagTruthy(env.EVAL_LOCAL_DOCS) + ? "Read,Glob,Grep,Write,Edit,Bash" + : "Read,Glob,Grep,WebFetch,Write,Edit,Bash"; +} + +function buildBaseOptions(ctx: { + agentCwd: string; + writeGuardRoot: string; + runDir: string; + repoRoot: string; + localDocs: boolean; +}): Options { + const { agentCwd, writeGuardRoot, runDir, repoRoot, localDocs } = ctx; + const toolsRaw = defaultEvalTools(process.env); + const allowedTools = toolsRaw + .split(",") + .map((s) => s.trim()) + .filter(Boolean); + + const mode = (process.env.EVAL_PERMISSION_MODE ?? "dontAsk") as NonNullable< + Options["permissionMode"] + >; + + const maxTurns = Number(process.env.EVAL_MAX_TURNS ?? "80"); + const persistSession = process.env.EVAL_PERSIST_SESSION !== "false"; + + const o: Options = { + cwd: agentCwd, + allowedTools, + permissionMode: mode, + maxTurns: Number.isFinite(maxTurns) ? maxTurns : 80, + persistSession, + env: { + ...process.env, + CLAUDE_AGENT_SDK_CLIENT_APP: "outpost-docs-agent-eval/1.0.0", + } as Record, + }; + + const readGuardOn = !envFlagTruthy(process.env.EVAL_DISABLE_WORKSPACE_READ_GUARD); + const writeGuardOn = !envFlagTruthy(process.env.EVAL_DISABLE_WORKSPACE_WRITE_GUARD); + if (readGuardOn || writeGuardOn) { + o.hooks = { + PreToolUse: [ + { + hooks: [ + createRunDirPreToolHook({ + allowedRootDir: writeGuardRoot, + agentCwd, + runDir, + repoRoot, + localDocs, + readGuardOn, + writeGuardOn, + }), + ], + }, + ], + }; + } + + if (process.env.EVAL_MODEL?.trim()) { + o.model = process.env.EVAL_MODEL.trim(); + } + + return o; +} + +async function main(): Promise { + const { values } = parseArgs({ + options: { + scenario: { type: "string" }, + scenarios: { type: "string" }, + all: { type: "boolean", default: false }, + "skip-optional": { type: "boolean", default: false }, + "dry-run": { type: "boolean", default: false }, + "no-score": { type: "boolean", default: false }, + "no-score-llm": { type: "boolean", default: false }, + help: { type: "boolean", short: "h", default: 
false }, + }, + allowPositionals: false, + }); + + if (values.help) { + console.log(` +Outpost agent evaluation (Claude Agent SDK) + +Usage: + npm run eval -- --scenario 01 + npm run eval -- --scenarios 01,02,05 + npm run eval -- --all # deliberate: every scenario (costly) + npm run eval -- --skip-optional + npm run eval -- --no-score # skip heuristic-score.json + npm run eval -- --no-score-llm # skip llm-score.json (no Success-criteria judge) + npm run eval -- --no-score --no-score-llm # transcripts only + npm run eval -- --dry-run + +You must pass --scenario, --scenarios, or --all so the set of runs is explicit (cost and scope). +After each scenario: transcript + heuristic-score.json + llm-score.json (judge uses ## Success criteria) unless disabled above. +Exit 1 if any enabled score fails. + +Environment: + Values can be set in docs/agent-evaluation/.env (loaded automatically) or exported in the shell. + ANTHROPIC_API_KEY Required + EVAL_TEST_DESTINATION_URL Required — Hookdeck Console Source URL (fed into {{TEST_DESTINATION_URL}}) + EVAL_API_BASE_URL Optional (default: managed production URL) + EVAL_TOPICS_LIST Optional + EVAL_DOCS_URL Optional (ignored for doc links when EVAL_LOCAL_DOCS is set) + EVAL_LOCAL_DOCS Set to 1/true/yes to replace Documentation URLs with repo file paths (unpublished docs) + EVAL_LLMS_FULL_URL Optional (omit docs line if unset) + EVAL_TOOLS Optional, comma-separated (default: Read,Glob,Grep[,WebFetch],Write,Edit,Bash — see README) + EVAL_MODEL Optional + EVAL_MAX_TURNS Optional (default: 80; npm/go mod installs can exceed 40; lower only for smoke — may not finish 08–10) + EVAL_PROGRESS Set to 1/true/yes — log heartbeats to stderr during each agent query (see EVAL_PROGRESS_INTERVAL_MS) + EVAL_PROGRESS_INTERVAL_MS Optional (default: 30000, min 5000) — wall-clock heartbeat while the SDK stream is quiet + EVAL_PERMISSION_MODE Optional (default: dontAsk) + EVAL_PERSIST_SESSION Set to "false" to disable session persistence (breaks 
multi-turn resume) + EVAL_DISABLE_WORKSPACE_WRITE_GUARD Set to 1 to allow Write/Edit outside the run dir (not recommended) + EVAL_DISABLE_WORKSPACE_READ_GUARD Set to 1 to allow Read/Glob/Grep/Bash/Agent outside the run dir (+ docs/ when local) + EVAL_ALLOW_AGENT_TOOL Set to 1 to allow the Agent subagent (default: denied for fair scoring) + EVAL_SKIP_HARNESS_PRE_STEPS Set to 1 to skip ## Eval harness preSteps (git_clone, etc.); see scenario markdown + +Outputs under docs/agent-evaluation/results/runs/ (gitignored): each scenario gets + results/runs/-scenario-NN/transcript.json + heuristic-score.json and llm-score.json unless disabled (see above). +Also set EVAL_NO_SCORE_HEURISTIC=1 or EVAL_NO_SCORE_LLM=1 in .env to skip scoring without flags. + +Agent cwd is usually the run directory. Scenarios may define ## Eval harness (JSON) to clone a baseline into a subfolder first. +`); + process.exit(0); + } + + if (!process.env.ANTHROPIC_API_KEY?.trim()) { + console.error("Missing ANTHROPIC_API_KEY"); + process.exit(1); + } + + const mdx = await readFile(PROMPT_MDX, "utf8"); + const template = extractTemplateFromMdx(mdx); + const filledTemplate = applyPlaceholders(template, process.env, REPO_ROOT); + + const allFiles = await listScenarioFiles(); + let selected: string[]; + + if (values.all) { + selected = allFiles; + } else if (values.scenarios) { + const ids = values.scenarios.split(",").map((s) => s.trim()); + selected = allFiles.filter((f) => ids.includes(idFromFilename(f))); + const missing = ids.filter((id) => !selected.some((f) => idFromFilename(f) === id)); + if (missing.length) { + console.error("Unknown scenario id(s):", missing.join(", ")); + process.exit(1); + } + } else if (values.scenario) { + const id = values.scenario.padStart(2, "0"); + selected = allFiles.filter((f) => idFromFilename(f) === id); + if (selected.length === 0) { + console.error("Unknown scenario:", values.scenario); + process.exit(1); + } + } else { + console.error( + "Choose which scenarios to 
run (cost is proportional): --scenario , --scenarios id,id, or --all for the full set.", + ); + console.error(`Available: ${allFiles.map((f) => idFromFilename(f)).join(", ")}`); + process.exit(1); + } + + if (values["dry-run"]) { + const localDocs = envFlagTruthy(process.env.EVAL_LOCAL_DOCS); + const sampleRun = join(RUNS_DIR, "dry-run-example-scenario"); + const sampleAgent = join(sampleRun, "app-baseline"); + const boundarySample = buildWorkspaceBoundaryAppendix(sampleRun, sampleAgent, REPO_ROOT, localDocs); + console.log("Dry run: would execute", selected.join(", ")); + console.log( + "Turn 0 base template (chars):", + filledTemplate.length, + "| + workspace boundary (~chars):", + boundarySample.length, + ); + process.exit(0); + } + + await mkdir(RUNS_DIR, { recursive: true }); + const stamp = new Date().toISOString().replace(/[:.]/g, "-"); + + const wantScore = + !values["no-score"] && + !envFlagTruthy(process.env.EVAL_NO_SCORE_HEURISTIC); + const wantLlm = + !values["no-score-llm"] && + !envFlagTruthy(process.env.EVAL_NO_SCORE_LLM); + + let anyScoreFailure = false; + + console.error( + `Running ${selected.length} scenario(s): ${selected.join(", ")} (heuristic=${String(wantScore)}, llm=${String(wantLlm)})`, + ); + + registerEvalSignalHandlers(); + + for (const file of selected) { + const scenarioIdEarly = idFromFilename(file); + const runDir = join(RUNS_DIR, `${stamp}-scenario-${scenarioIdEarly}`); + await mkdir(runDir, { recursive: true }); + + const scenarioPath = join(SCENARIOS_DIR, file); + const scenarioMd = await readFile(scenarioPath, "utf8"); + const harnessConfig = parseEvalHarness(scenarioMd); + const { agentCwd, writeGuardRoot } = await applyEvalHarness(runDir, harnessConfig); + const localDocs = envFlagTruthy(process.env.EVAL_LOCAL_DOCS); + const baseOptions = buildBaseOptions({ + agentCwd, + writeGuardRoot, + runDir, + repoRoot: REPO_ROOT, + localDocs, + }); + const turn0Prompt = + filledTemplate + buildWorkspaceBoundaryAppendix(runDir, agentCwd, 
REPO_ROOT, localDocs); + console.error(`\n>>> Scenario ${file} (run dir ${runDir}, agent cwd ${agentCwd}) ...`); + if (scenarioIdEarly === "08" || scenarioIdEarly === "09" || scenarioIdEarly === "10") { + console.error( + "Note: Scenarios 08–10 clone a full baseline and install deps — often 30–90+ min wall time with sparse console output until transcript.json. Ctrl+C aborts (writes *.eval-aborted.json). Set EVAL_PROGRESS=1 for stderr heartbeats. See README § Wall time.", + ); + } + + const sidecars = harnessSidecarPaths(runDir); + activeHarnessAbortContext = { path: sidecars.aborted, runDirectory: runDir }; + await writeFile( + sidecars.started, + `${JSON.stringify( + { + startedAt: new Date().toISOString(), + pid: process.pid, + scenarioFile: file, + scenarioId: scenarioIdEarly, + runDirectory: runDir, + harnessSidecars: { + started: sidecars.started, + failure: sidecars.failure, + aborted: sidecars.aborted, + }, + note: "Transcript and score JSON live under runDirectory. Harness *.eval-*.json paths are siblings of the run folder (not inside it) so the agent cannot read eval metadata.", + }, + null, + 2, + )}\n`, + "utf8", + ); + + try { + const result = await runOneScenario(file, turn0Prompt, { + skipOptional: values["skip-optional"] ?? 
false, + baseOptions, + scenarioMarkdown: scenarioMd, + }); + + const outPath = join(runDir, "transcript.json"); + const payload = { + meta: { + scenarioId: result.scenarioId, + scenarioFile: result.scenarioFile, + runDirectory: runDir, + agentWorkspaceCwd: agentCwd, + evalHarness: { + preStepCount: harnessConfig.preSteps.length, + agentCwd: harnessConfig.agentCwd, + }, + repositoryRoot: REPO_ROOT, + completedAt: new Date().toISOString(), + sessionId: result.sessionId, + turns: result.turns, + }, + messages: result.allMessages, + }; + + await writeFile(outPath, JSON.stringify(payload, null, 2), "utf8"); + console.error(`Wrote ${outPath}`); + + if (wantScore) { + const report = await scoreRunFile(outPath); + const scorePath = join(runDir, "heuristic-score.json"); + await writeFile(scorePath, `${JSON.stringify(report, null, 2)}\n`, "utf8"); + console.error(`Wrote ${scorePath} (transcript: ${report.transcript.passed}/${report.transcript.total}, overallTranscriptPass=${String(report.overallTranscriptPass)})`); + if (report.overallTranscriptPass === false) { + anyScoreFailure = true; + } + } + + if (wantLlm) { + const scenarioPathForJudge = scenarioMdPathFromRun(EVAL_ROOT, result.scenarioFile); + const llmReport = await llmJudgeRun({ + runPath: outPath, + scenarioMdPath: scenarioPathForJudge, + apiKey: process.env.ANTHROPIC_API_KEY!.trim(), + }); + const llmPath = join(runDir, "llm-score.json"); + await writeFile(llmPath, `${JSON.stringify(llmReport, null, 2)}\n`, "utf8"); + console.error( + `Wrote ${llmPath} (LLM overall_transcript_pass=${String(llmReport.overall_transcript_pass)})`, + ); + if (!llmReport.overall_transcript_pass) { + anyScoreFailure = true; + } + } + } catch (err) { + const message = err instanceof Error ? err.message : String(err); + const stack = err instanceof Error ? 
err.stack : undefined; + await writeFile( + sidecars.failure, + `${JSON.stringify({ failedAt: new Date().toISOString(), message, stack, runDirectory: runDir }, null, 2)}\n`, + "utf8", + ); + console.error(`Eval scenario failed (${file}):`, err); + throw err; + } finally { + activeHarnessAbortContext = null; + } + } + + if (anyScoreFailure) { + process.exit(1); + } +} + +main().catch((err) => { + console.error(err); + process.exit(1); +}); diff --git a/docs/agent-evaluation/src/score-eval.ts b/docs/agent-evaluation/src/score-eval.ts new file mode 100644 index 000000000..4c720060d --- /dev/null +++ b/docs/agent-evaluation/src/score-eval.ts @@ -0,0 +1,183 @@ +/** + * CLI: score a transcript JSON from npm run eval. + * + * Usage: + * npm run score -- --run results/runs/2026-...-scenario-01.json + * npm run score -- --latest + * npm run score -- --latest --scenario 01 + * npm run score -- --run .json --llm --write # Anthropic judge → .llm-score.json + */ + +import { readFile, writeFile } from "node:fs/promises"; +import { join, dirname } from "node:path"; +import { fileURLToPath } from "node:url"; +import { parseArgs } from "node:util"; +import dotenv from "dotenv"; +import { + formatLlmReportHuman, + llmJudgeRun, + scenarioMdPathFromRun, + type LlmJudgeReport, +} from "./llm-judge.js"; +import { + findLatestRunFile, + formatScoreReportHuman, + resolveTranscriptJsonPath, + scoreRunFile, + scoreSidecarPaths, + type ScoreReport, +} from "./score-transcript.js"; + +const __dirname = dirname(fileURLToPath(import.meta.url)); +const EVAL_ROOT = join(__dirname, ".."); +dotenv.config({ path: join(EVAL_ROOT, ".env") }); + +const RUNS_DIR = join(EVAL_ROOT, "results", "runs"); + +async function main(): Promise { + const { values, positionals } = parseArgs({ + options: { + run: { type: "string" }, + latest: { type: "boolean", default: false }, + scenario: { type: "string" }, + json: { type: "boolean", default: false }, + write: { type: "boolean", default: false }, + llm: { type: 
"boolean", default: false }, + "no-heuristic": { type: "boolean", default: false }, + help: { type: "boolean", short: "h", default: false }, + }, + allowPositionals: true, + }); + + if (values.help) { + console.log(` +Score an eval transcript. + + npm run score -- --run results/runs/-scenario-01/transcript.json + npm run score -- --run results/runs/-scenario-01 # directory ok + npm run score -- --latest [--scenario 01] + npm run score -- --write # heuristic-score.json + llm-score.json in run dir + npm run score -- --llm [--write] # Anthropic judge (needs ANTHROPIC_API_KEY) + npm run score -- --llm --no-heuristic # LLM only (no regex heuristic) + +Heuristic: src/score-transcript.ts. LLM: reads scenarios/*.md Success criteria + assistant text; model from EVAL_SCORE_MODEL (default claude-sonnet-4-20250514). + +Options: + --run transcript.json, a run directory, or legacy flat *-scenario-NN.json + --latest Newest transcript (nested run dir or legacy flat file) + --scenario With --latest, filter scenario-0 + --json Print machine-readable JSON only (last scorer: heuristic or LLM if --llm-only) + --write Write sidecar file(s) for enabled scorers + --llm Call Anthropic Messages API to judge against Success criteria + --no-heuristic Skip regex heuristic (use with --llm for API-only scoring) +`); + process.exit(0); + } + + let runPath: string | null = values.run ?? 
null; + if (values.latest) { + runPath = await findLatestRunFile(RUNS_DIR, values.scenario); + if (!runPath) { + console.error("No matching run JSON in", RUNS_DIR); + process.exit(1); + } + } + + if (!runPath && positionals[0]) { + runPath = positionals[0]; + } + + if (!runPath) { + console.error("Provide --run or --latest"); + process.exit(1); + } + + let transcriptPath: string; + try { + transcriptPath = await resolveTranscriptJsonPath(runPath); + } catch (e) { + console.error(String(e)); + process.exit(1); + } + + const doHeuristic = !values["no-heuristic"]; + const doLlm = values.llm; + + if (!doHeuristic && !doLlm) { + console.error("Nothing to run: enable heuristic (default) or pass --llm"); + process.exit(1); + } + + let heuristicReport: ScoreReport | null = null; + let llmReport: LlmJudgeReport | null = null; + let fail = false; + + if (doHeuristic) { + heuristicReport = await scoreRunFile(transcriptPath); + if (heuristicReport.overallTranscriptPass === false) { + fail = true; + } + } + + if (doLlm) { + const key = process.env.ANTHROPIC_API_KEY?.trim(); + if (!key) { + console.error("Missing ANTHROPIC_API_KEY for --llm"); + process.exit(1); + } + const raw = await readFile(transcriptPath, "utf8"); + const meta = JSON.parse(raw) as { meta?: { scenarioFile?: string } }; + const scenarioPath = scenarioMdPathFromRun(EVAL_ROOT, meta.meta?.scenarioFile); + llmReport = await llmJudgeRun({ + runPath: transcriptPath, + scenarioMdPath: scenarioPath, + apiKey: key, + }); + if (!llmReport.overall_transcript_pass) { + fail = true; + } + } + + if (values.json) { + if (doLlm && values["no-heuristic"]) { + console.log(JSON.stringify(llmReport, null, 2)); + } else if (doHeuristic && !doLlm) { + console.log(JSON.stringify(heuristicReport, null, 2)); + } else { + console.log( + JSON.stringify({ heuristic: heuristicReport, llm: llmReport }, null, 2), + ); + } + } else { + if (heuristicReport) { + console.log(formatScoreReportHuman(heuristicReport)); + console.log(""); + } + if 
(llmReport) { + console.log(formatLlmReportHuman(llmReport)); + } + } + + if (values.write) { + const { heuristic: heuristicOut, llm: llmOut } = scoreSidecarPaths(transcriptPath); + if (heuristicReport) { + await writeFile(heuristicOut, `${JSON.stringify(heuristicReport, null, 2)}\n`, "utf8"); + if (!values.json) { + console.error(`Wrote ${heuristicOut}`); + } + } + if (llmReport) { + await writeFile(llmOut, `${JSON.stringify(llmReport, null, 2)}\n`, "utf8"); + if (!values.json) { + console.error(`Wrote ${llmOut}`); + } + } + } + + process.exit(fail ? 1 : 0); +} + +main().catch((e) => { + console.error(e); + process.exit(1); +}); diff --git a/docs/agent-evaluation/src/score-transcript.ts b/docs/agent-evaluation/src/score-transcript.ts new file mode 100644 index 000000000..2dbfb3d59 --- /dev/null +++ b/docs/agent-evaluation/src/score-transcript.ts @@ -0,0 +1,1233 @@ +/** + * Heuristic transcript scoring for agent eval runs. + * Maps to human checklist items in scenarios/*.md — not a substitute for execution verification. + */ + +import { readFile, readdir, stat } from "node:fs/promises"; +import { basename, dirname, join } from "node:path"; + +export interface CheckResult { + readonly id: string; + readonly pass: boolean; + readonly detail: string; +} + +export interface TranscriptScore { + readonly passed: number; + readonly total: number; + readonly checks: readonly CheckResult[]; + readonly fraction: number; +} + +export interface ScoreReport { + readonly runFile: string; + readonly scenarioId: string; + readonly scenarioFile: string; + readonly transcript: TranscriptScore; + /** Automated harness does not run Outpost; use `scripts/execute-ci-artifacts.sh` or CI for live 01/02 smoke. 
*/ + readonly execution: { readonly status: "not_automated"; readonly note: string }; + /** null when no automated transcript rubric exists for this scenario yet */ + readonly overallTranscriptPass: boolean | null; +} + +interface RunJson { + meta?: { + scenarioId?: string; + scenarioFile?: string; + turns?: readonly { label?: string; messageCount?: number }[]; + }; + messages?: unknown[]; +} + +export function extractAssistantText(messages: unknown[] | undefined): string { + if (!messages?.length) return ""; + let out = ""; + for (const m of messages) { + if (typeof m !== "object" || m === null) continue; + const o = m as Record<string, unknown>; + if (o.type !== "assistant") continue; + const inner = o.message; + if (typeof inner !== "object" || inner === null) continue; + const msg = inner as Record<string, unknown>; + const content = msg.content; + if (!Array.isArray(content)) continue; + for (const block of content) { + if (typeof block !== "object" || block === null) continue; + const b = block as Record<string, unknown>; + if (b.type === "text" && typeof b.text === "string") { + out += b.text; + } + } + } + return out; +} + +const MAX_TOOL_SCORING_CHARS = 600_000; + +/** + * Assistant-visible text plus tool inputs and Write/Edit file bodies from the transcript. + * Heuristics use this so scored content includes material that only appeared in tool calls/results. 
+ */ +export function extractTranscriptScoringText(messages: unknown[] | undefined): string { + const assistant = extractAssistantText(messages); + if (!messages?.length) return assistant; + const chunks: string[] = []; + let budget = MAX_TOOL_SCORING_CHARS; + + const push = (s: string) => { + if (budget <= 0) return; + const take = s.slice(0, budget); + chunks.push(take); + budget -= take.length; + }; + + for (const m of messages) { + if (typeof m !== "object" || m === null) continue; + const o = m as Record<string, unknown>; + + if (o.type === "assistant") { + const inner = o.message; + if (typeof inner !== "object" || inner === null) continue; + const content = (inner as Record<string, unknown>).content; + if (!Array.isArray(content)) continue; + for (const block of content) { + if (typeof block !== "object" || block === null) continue; + const b = block as Record<string, unknown>; + if (b.type !== "tool_use") continue; + const input = b.input; + if (input !== undefined) { + try { + push(`\n[tool_use ${String(b.name ?? "?")}]\n${JSON.stringify(input)}\n`); + } catch { + push(`\n[tool_use ${String(b.name ?? 
"?")}]\n`); + } + } + } + continue; + } + + if (o.type === "user") { + const tur = o.tool_use_result; + if (typeof tur === "object" && tur !== null) { + const t = tur as Record<string, unknown>; + if (typeof t.content === "string") { + push(`\n[tool_result content]\n${t.content}\n`); + } + if (typeof t.newContent === "string") { + push(`\n[tool_result newContent]\n${t.newContent}\n`); + } + } + const inner = o.message; + if (typeof inner === "object" && inner !== null) { + const content = (inner as Record<string, unknown>).content; + if (Array.isArray(content)) { + for (const block of content) { + if (typeof block !== "object" || block === null) continue; + const b = block as Record<string, unknown>; + if (b.type === "tool_result" && typeof b.content === "string") { + push(`\n[tool_result]\n${b.content}\n`); + } + } + } + } + } + } + + return `${assistant}\n\n--- tool corpus ---\n${chunks.join("")}`; +} + +function hadOptionalSecondUserTurn(meta: RunJson["meta"]): boolean { + const turns = meta?.turns ?? []; + return turns.some((t) => { + const l = (t.label ?? "").toLowerCase(); + return l.includes("turn 2") || l.includes("optional"); + }); +} + +/** Likely pasted API key (not env var reference). */ +function containsLikelyLeakedKey(text: string): boolean { + if (/Bearer\s+sk-ant-api/i.test(text)) return true; + if (/Bearer\s+[a-zA-Z0-9_-]{40,}/.test(text)) return true; + return false; +} + +/** + * Option 3 (08–10): corpus should show publish on a real domain path, not only a synthetic + * “test event” / publish-test helper. Multiple publish sites, or one publish without test-only + * markers, passes. Weak signal — confirm with scenario Success criteria + execution smoke. 
+ */ +function corpusSuggestsPublishBeyondTestOnly(corpus: string): boolean { + const t = corpus; + const publishHits = t.match(/publish\.event|Publish\.Event|PublishEvent/gi); + if (!publishHits?.length) return false; + if (publishHits.length >= 2) return true; + const lower = t.toLowerCase(); + const testish = + /publish-test|publish_test|publishtest|test_publish|send test|synthetic.*(event|publish)|test event/.test( + lower, + ); + if (!testish) return true; + const domainish = + /signup|register|user\.created|item\.|order\.|after_commit|post_save|on_.*create|createuser|create.?item|router\.(post|put|patch)|@router\.(post|put|patch)|handler\.|func.*create|def create_/.test( + lower, + ) && /publish|outpost/.test(lower); + return domainish; +} + +function scoreScenario01(corpus: string, assistant: string, meta: RunJson["meta"]): TranscriptScore { + const t = corpus; + const lower = t.toLowerCase(); + const checks: CheckResult[] = []; + + const managed = + t.includes("api.outpost.hookdeck.com/2025-07-01") || + /\$OUTPOST_API_BASE_URL/.test(t); + // Self-hosted snippet must not be what the assistant told the user to run (tool corpus can quote docs). + const selfHostedInUserGuidance = /\blocalhost:3333\/api\/v1\b/.test(assistant); + checks.push({ + id: "managed_base_url", + pass: managed && !selfHostedInUserGuidance, + detail: !managed + ? "Expected api.outpost.hookdeck.com/2025-07-01 or $OUTPOST_API_BASE_URL" + : selfHostedInUserGuidance + ? "Assistant guidance includes localhost:3333/api/v1 (self-hosted) as primary" + : "Uses managed API base (or OUTPOST_API_BASE_URL); no self-hosted path in assistant guidance", + }); + + const tenantPut = + /PUT|put/i.test(t) && + (t.includes("/tenants/") || t.includes("/tenants/$") || t.includes("/tenants/${")); + checks.push({ + id: "tenant_put", + pass: tenantPut, + detail: tenantPut ? 
"PUT …/tenants/… present" : "Expected PUT with /tenants/ path", + }); + + const dest = + lower.includes("webhook") && + (t.includes("/destinations") || t.includes("/destinations\"")) && + (lower.includes("post") || t.includes("-X POST") || t.includes("-X post")); + checks.push({ + id: "destination_webhook", + pass: dest, + detail: dest ? "POST destinations with webhook" : "Expected POST …/destinations with webhook type", + }); + + const publish = + (t.includes("/publish") || t.includes("/publish\"")) && + (lower.includes("post") || t.includes("-X POST")); + checks.push({ + id: "publish_post", + pass: publish, + detail: publish ? "POST …/publish present" : "Expected POST publish", + }); + + const afterPublish = t.split(/\/publish/i).pop() ?? t; + // Tool corpus JSON-stringifies Write bodies, so bash-escaped keys look like \"data\": not "data": + const wrongPayload = + /"payload"\s*:/.test(afterPublish) || /\\"payload\\"\s*:/.test(afterPublish); + const hasData = + /"data"\s*:/.test(afterPublish) || /\\"data\\"\s*:/.test(afterPublish); + checks.push({ + id: "publish_body_data_not_payload", + pass: publish && !wrongPayload && hasData, + detail: !publish + ? "N/A (no publish block)" + : wrongPayload + ? 'Found "payload" after /publish — Outpost expects "data"' + : hasData + ? 'Publish section uses "data"' + : 'Missing "data" in publish JSON (check manually)', + }); + + checks.push({ + id: "no_key_in_reply", + pass: !containsLikelyLeakedKey(assistant), + detail: containsLikelyLeakedKey(assistant) + ? "Possible raw API key in assistant-visible text" + : "No obvious raw Bearer secret in assistant text", + }); + + const verifyTurn = hadOptionalSecondUserTurn(meta); + if (verifyTurn) { + const verify = + lower.includes("hookdeck") && + (lower.includes("console") || lower.includes("dashboard") || lower.includes("log")); + checks.push({ + id: "verification_console_or_logs", + pass: verify, + detail: verify + ? 
"Turn 2+ mentions Hookdeck Console / dashboard / logs" + : "Optional verify turn ran but no Console/dashboard/logs mention found", + }); + } + + const passed = checks.filter((c) => c.pass).length; + const total = checks.length; + return { + passed, + total, + checks, + fraction: total ? passed / total : 0, + }; +} + +function scoreScenario02(corpus: string, assistant: string): TranscriptScore { + const t = corpus; + const checks: CheckResult[] = []; + + const sdk = /@hookdeck\/outpost-sdk\b/.test(t); + checks.push({ + id: "ts_sdk_dependency", + pass: sdk, + detail: sdk ? "References @hookdeck/outpost-sdk" : "Expected @hookdeck/outpost-sdk in code or package.json", + }); + + const client = /new\s+Outpost\s*\(|Outpost\s*\(\s*\{/.test(t); + checks.push({ + id: "outpost_client", + pass: client, + detail: client ? "Constructs Outpost client" : "Expected new Outpost(…) or Outpost({ … })", + }); + + const envKey = /process\.env\.OUTPOST_API_KEY|OUTPOST_API_KEY/.test(t); + checks.push({ + id: "env_api_key", + pass: envKey, + detail: envKey ? "Uses OUTPOST_API_KEY from env" : "Expected process.env.OUTPOST_API_KEY (or documented env)", + }); + + const upsert = /tenants\.upsert|tenants\?\.upsert/.test(t); + checks.push({ + id: "tenants_upsert", + pass: upsert, + detail: upsert ? "Calls tenants.upsert" : "Expected tenants.upsert", + }); + + const dest = /destinations\.create|destinations\?\.create/.test(t); + checks.push({ + id: "destinations_create", + pass: dest, + detail: dest ? "Calls destinations.create" : "Expected destinations.create", + }); + + const pub = /publish\.event|publish\?\.event/.test(t); + checks.push({ + id: "publish_event", + pass: pub, + detail: pub ? "Calls publish.event" : "Expected publish.event", + }); + + const hookUrl = /OUTPOST_TEST_WEBHOOK_URL/.test(t); + checks.push({ + id: "webhook_env", + pass: hookUrl, + detail: hookUrl ? 
"Uses OUTPOST_TEST_WEBHOOK_URL" : "Expected OUTPOST_TEST_WEBHOOK_URL for webhook URL", + }); + + const run = /npx\s+tsx\b|tsx\s+\S+\.ts\b|ts-node\b|node\s+.*\.ts\b/.test(t); + checks.push({ + id: "run_instructions", + pass: run, + detail: run ? "Mentions npx tsx / ts-node / running .ts" : "Expected run instructions (e.g. npx tsx …)", + }); + + checks.push({ + id: "no_key_in_reply", + pass: !containsLikelyLeakedKey(assistant), + detail: containsLikelyLeakedKey(assistant) + ? "Possible raw API key in assistant-visible text" + : "No obvious raw Bearer secret in assistant text", + }); + + const passed = checks.filter((c) => c.pass).length; + const total = checks.length; + return { passed, total, checks, fraction: total ? passed / total : 0 }; +} + +function scoreScenario03(corpus: string, assistant: string): TranscriptScore { + const t = corpus; + const checks: CheckResult[] = []; + + const imp = /from\s+outpost_sdk\s+import|import\s+outpost_sdk/.test(t); + checks.push({ + id: "python_sdk_import", + pass: imp, + detail: imp ? "Imports outpost_sdk" : "Expected `from outpost_sdk import …` or import outpost_sdk", + }); + + const client = /Outpost\s*\(/.test(t); + checks.push({ + id: "outpost_client", + pass: client, + detail: client ? "Constructs Outpost(…)" : "Expected Outpost(…) client", + }); + + const upsert = /tenants\.upsert|tenants\?\.upsert/.test(t); + checks.push({ + id: "tenants_upsert", + pass: upsert, + detail: upsert ? "Calls tenants.upsert" : "Expected tenants.upsert", + }); + + const dest = /destinations\.create|destinations\?\.create/.test(t); + checks.push({ + id: "destinations_create", + pass: dest, + detail: dest ? "Calls destinations.create" : "Expected destinations.create", + }); + + const pub = /publish\.event|publish\?\.event/.test(t); + checks.push({ + id: "publish_event", + pass: pub, + detail: pub ? 
"Calls publish.event" : "Expected publish.event", + }); + + const env = /os\.environ|getenv\s*\(\s*["']OUTPOST_API_KEY/.test(t); + checks.push({ + id: "env_api_key", + pass: env, + detail: env ? "Reads API key from environment" : "Expected os.environ or getenv for OUTPOST_API_KEY", + }); + + const hookUrl = /OUTPOST_TEST_WEBHOOK_URL/.test(t); + checks.push({ + id: "webhook_env", + pass: hookUrl, + detail: hookUrl ? "Uses OUTPOST_TEST_WEBHOOK_URL" : "Expected OUTPOST_TEST_WEBHOOK_URL", + }); + + checks.push({ + id: "no_key_in_reply", + pass: !containsLikelyLeakedKey(assistant), + detail: containsLikelyLeakedKey(assistant) + ? "Possible raw API key in assistant-visible text" + : "No obvious raw Bearer secret in assistant text", + }); + + const passed = checks.filter((c) => c.pass).length; + const total = checks.length; + return { passed, total, checks, fraction: total ? passed / total : 0 }; +} + +function scoreScenario04(corpus: string, assistant: string): TranscriptScore { + const t = corpus; + const checks: CheckResult[] = []; + + const mod = /hookdeck\/outpost.*outpost-go|outpost-go|outpostgo/.test(t); + checks.push({ + id: "go_sdk_module", + pass: mod, + detail: mod ? "References outpost-go / outpostgo" : "Expected github.com/hookdeck/outpost/.../outpost-go or outpostgo", + }); + + const newClient = /outpostgo\.New\s*\(|\bNew\s*\(\s*context\./.test(t); + checks.push({ + id: "go_client_new", + pass: newClient, + detail: newClient ? "Creates client with New(…)" : "Expected outpostgo.New(…) or similar", + }); + + const sec = /WithSecurity|WithServerURL/.test(t); + checks.push({ + id: "go_client_options", + pass: sec, + detail: sec ? "Uses WithSecurity or WithServerURL" : "Expected WithSecurity (and optional WithServerURL)", + }); + + const upsert = /Tenants\.Upsert|\.Upsert\s*\(/.test(t); + checks.push({ + id: "tenants_upsert", + pass: upsert, + detail: upsert ? 
"Calls Tenants.Upsert" : "Expected Tenants.Upsert", + }); + + const dest = /Destinations\.Create|CreateDestinationCreateWebhook/.test(t); + checks.push({ + id: "destinations_create", + pass: dest, + detail: dest ? "Creates webhook destination" : "Expected Destinations.Create / CreateDestinationCreateWebhook", + }); + + const pub = /Publish\.Event|\.Event\s*\(/.test(t); + checks.push({ + id: "publish_event", + pass: pub, + detail: pub ? "Calls Publish.Event" : "Expected Publish.Event", + }); + + const envKey = /Getenv\s*\(\s*["']OUTPOST_API_KEY["']/.test(t); + checks.push({ + id: "env_api_key", + pass: envKey, + detail: envKey ? "Reads OUTPOST_API_KEY via os.Getenv" : "Expected os.Getenv(\"OUTPOST_API_KEY\")", + }); + + const hookUrl = /OUTPOST_TEST_WEBHOOK_URL/.test(t); + checks.push({ + id: "webhook_env", + pass: hookUrl, + detail: hookUrl ? "Uses OUTPOST_TEST_WEBHOOK_URL" : "Expected OUTPOST_TEST_WEBHOOK_URL", + }); + + checks.push({ + id: "no_key_in_reply", + pass: !containsLikelyLeakedKey(assistant), + detail: containsLikelyLeakedKey(assistant) + ? "Possible raw API key in assistant-visible text" + : "No obvious raw Bearer secret in assistant text", + }); + + const passed = checks.filter((c) => c.pass).length; + const total = checks.length; + return { passed, total, checks, fraction: total ? passed / total : 0 }; +} + +function scoreScenario05(corpus: string, assistant: string, meta: RunJson["meta"]): TranscriptScore { + const t = corpus; + const lower = t.toLowerCase(); + const checks: CheckResult[] = []; + + const next = + /"next"\s*:\s*"/.test(t) || + /next\/dev|next\s+dev|next\.config/.test(t) || + /\bnext@\d/.test(t); + checks.push({ + id: "nextjs_signals", + pass: next, + detail: next ? "Next.js dependency or dev command present" : "Expected next in package.json or next dev / next.config", + }); + + const sdk = /@hookdeck\/outpost-sdk\b/.test(t); + checks.push({ + id: "outpost_ts_sdk", + pass: sdk, + detail: sdk ? 
"Uses @hookdeck/outpost-sdk" : "Expected @hookdeck/outpost-sdk in dependencies or imports", + }); + + const api = + /app\/api\/[^"'\s]+\/route\.(t|j)sx?/.test(t) || + /pages\/api\//.test(t) || + /["']\/api\/(destination|destinations|event|publish)/.test(t); + checks.push({ + id: "api_routes_layer", + pass: api, + detail: api ? "App/Pages API route layer present" : "Expected app/api/.../route or pages/api or /api/… fetches", + }); + + const twoFlows = + (/destination|webhook|subscribe/i.test(t) && /publish|event|send/i.test(t) && /\/api\//.test(t)) || + (t.includes("/api/destination") && t.includes("/api/event")); + checks.push({ + id: "destination_and_publish_surface", + pass: twoFlows, + detail: twoFlows + ? "Distinct destination + publish flows (URLs or labels)" + : "Expected separate destination registration and publish (e.g. two API routes or actions)", + }); + + const serverEnv = + /route\.(t|j)sx?[\s\S]{0,12000}process\.env\.OUTPOST_API_KEY|OUTPOST_API_KEY[\s\S]{0,800}(route\.(t|j)sx?|api\/)/i.test( + t, + ) || (/process\.env\.OUTPOST_API_KEY/.test(t) && /app\/api\//.test(t)); + checks.push({ + id: "server_env_outpost_key", + pass: serverEnv, + detail: serverEnv + ? "OUTPOST_API_KEY read server-side (e.g. API route)" + : "Expected process.env.OUTPOST_API_KEY in API route context", + }); + + const leakClient = /NEXT_PUBLIC_OUTPOST_API_KEY/.test(t); + checks.push({ + id: "no_next_public_api_key", + pass: !leakClient, + detail: leakClient + ? "NEXT_PUBLIC_OUTPOST_API_KEY would expose key to browser" + : "No NEXT_PUBLIC_OUTPOST_API_KEY", + }); + + const readme = /README/i.test(t) && /OUTPOST_API_KEY/.test(t); + checks.push({ + id: "readme_env", + pass: readme, + detail: readme ? 
"README mentions OUTPOST_API_KEY" : "Expected README with OUTPOST_API_KEY", + }); + + const managed = + !/\blocalhost:3333\/api\/v1\b/.test(t) && + (!/localhost:\d{2,5}\s*\/\s*api\/v1/.test(t) || /OUTPOST_API_BASE_URL/.test(t)); + checks.push({ + id: "managed_base_not_selfhosted", + pass: managed, + detail: managed + ? "No self-hosted localhost API path as default" + : "Avoid localhost:3333/api/v1 unless user asked for self-hosted", + }); + + checks.push({ + id: "no_key_in_reply", + pass: !containsLikelyLeakedKey(assistant), + detail: containsLikelyLeakedKey(assistant) + ? "Possible raw API key in assistant-visible text" + : "No obvious raw Bearer secret in assistant text", + }); + + const stressTurn = (meta?.turns?.length ?? 0) >= 4; + if (stressTurn) { + const hookdeckHint = + lower.includes("hookdeck") && + (lower.includes("console") || lower.includes("source") || lower.includes("dashboard")); + checks.push({ + id: "stress_public_url_hint", + pass: hookdeckHint, + detail: hookdeckHint + ? "Turn 3+ stress: mentions Hookdeck Console/Source/dashboard for webhook URL" + : "Stress turn present but no Hookdeck Console/Source hint found", + }); + } + + const passed = checks.filter((c) => c.pass).length; + const total = checks.length; + return { passed, total, checks, fraction: total ? passed / total : 0 }; +} + +function scoreScenario06(corpus: string, assistant: string): TranscriptScore { + const t = corpus; + const lower = t.toLowerCase(); + const checks: CheckResult[] = []; + + const fast = /FastAPI|from\s+fastapi\s+import/.test(t); + checks.push({ + id: "fastapi_framework", + pass: fast, + detail: fast ? "Uses FastAPI" : "Expected FastAPI import or class", + }); + + const sdk = /from\s+outpost_sdk\s+import|import\s+outpost_sdk|outpost_sdk/.test(t); + checks.push({ + id: "python_outpost_sdk", + pass: sdk, + detail: sdk ? 
"Uses outpost_sdk" : "Expected outpost_sdk import or usage", + }); + + const uv = /uvicorn/.test(lower); + checks.push({ + id: "uvicorn_documented", + pass: uv, + detail: uv ? "Mentions uvicorn" : "Expected uvicorn run command or import", + }); + + const envKey = /OUTPOST_API_KEY/.test(t) && (/os\.environ|getenv/.test(t) || /Depends?\(/.test(t)); + checks.push({ + id: "server_env_api_key", + pass: envKey, + detail: envKey ? "API key from environment on server" : "Expected OUTPOST_API_KEY via os.environ/getenv or settings", + }); + + const two = + (/destination|webhook/i.test(t) && /publish|event/i.test(t)) || + (/@app\.(get|post)|APIRouter/.test(t) && /publish/i.test(t) && /destination|webhook/i.test(t)); + checks.push({ + id: "register_and_publish_flow", + pass: two, + detail: two ? "Both destination/webhook and publish/event surfaced" : "Expected register webhook + publish flows", + }); + + const readme = /README/i.test(t) && /OUTPOST_API_KEY/.test(t); + checks.push({ + id: "readme_env", + pass: readme, + detail: readme ? "README mentions OUTPOST_API_KEY" : "Expected README with OUTPOST_API_KEY", + }); + + const hookOrDoc = /OUTPOST_TEST_WEBHOOK_URL|TEST_WEBHOOK|webhook\s*url/i.test(t); + checks.push({ + id: "webhook_url_documented", + pass: hookOrDoc, + detail: hookOrDoc ? "Webhook URL env or field documented" : "Expected OUTPOST_TEST_WEBHOOK_URL or webhook URL docs", + }); + + checks.push({ + id: "no_key_in_reply", + pass: !containsLikelyLeakedKey(assistant), + detail: containsLikelyLeakedKey(assistant) + ? "Possible raw API key in assistant-visible text" + : "No obvious raw Bearer secret in assistant text", + }); + + const passed = checks.filter((c) => c.pass).length; + const total = checks.length; + return { passed, total, checks, fraction: total ? 
passed / total : 0 }; +} + +function scoreScenario07(corpus: string, assistant: string): TranscriptScore { + const t = corpus; + const lower = t.toLowerCase(); + const checks: CheckResult[] = []; + + const httpLib = /"net\/http"|net\/http/.test(t) || /\bhttp\.HandleFunc\b/.test(t); + checks.push({ + id: "stdlib_http", + pass: httpLib, + detail: httpLib ? "Uses net/http" : "Expected net/http or http.HandleFunc", + }); + + const sdk = /hookdeck\/outpost.*outpost-go|outpostgo|CreateDestinationCreateWebhook/.test(t); + checks.push({ + id: "go_outpost_sdk", + pass: sdk, + detail: sdk ? "Uses Outpost Go SDK patterns" : "Expected outpost-go / CreateDestinationCreateWebhook", + }); + + const createWebhook = /CreateDestinationCreateWebhook/.test(t); + checks.push({ + id: "create_destination_webhook", + pass: createWebhook, + detail: createWebhook ? "CreateDestinationCreateWebhook present" : "Expected CreateDestinationCreateWebhook wrapper", + }); + + const htmlUi = / c.pass).length; + const total = checks.length; + return { passed, total, checks, fraction: total ? passed / total : 0 }; +} + +/** Option 3 — integrate Outpost into an existing SaaS-style codebase (Next.js baseline). */ +function scoreScenario08(corpus: string, assistant: string): TranscriptScore { + const t = corpus; + const lower = t.toLowerCase(); + const checks: CheckResult[] = []; + + const baseline = + /leerob\/next-saas-starter|next-saas-starter/.test(t) || + (/git\s+clone\b/.test(lower) && /github\.com/.test(t)); + checks.push({ + id: "baseline_or_clone", + pass: baseline, + detail: baseline + ? "References next-saas-starter baseline or git clone from GitHub" + : "Expected clone/setup of the documented baseline (e.g. leerob/next-saas-starter)", + }); + + const sdk = /@hookdeck\/outpost-sdk\b/.test(t); + checks.push({ + id: "outpost_ts_sdk", + pass: sdk, + detail: sdk ? 
"Uses @hookdeck/outpost-sdk" : "Expected @hookdeck/outpost-sdk", + }); + + const integration = + /publish\.event|destinations\.create|tenants\.upsert/.test(t) || + /\/api\/.*outpost|outpost.*publish/i.test(t); + checks.push({ + id: "outpost_integration_calls", + pass: integration, + detail: integration + ? "Server-side Outpost client usage (publish / destinations / tenants)" + : "Expected publish.event, destinations.create, or tenants.upsert (or clear API wrapper)", + }); + + const topic = /user\.created|topic|TOPIC/.test(t); + checks.push({ + id: "topic_or_event_hook", + pass: topic, + detail: topic ? "Topic or event hook documented" : "Expected topic from prompt or explicit event naming", + }); + + const serverKey = + /process\.env\.OUTPOST_API_KEY/.test(t) && + !/NEXT_PUBLIC_OUTPOST_API_KEY/.test(t); + checks.push({ + id: "server_env_key_only", + pass: serverKey, + detail: serverKey + ? "OUTPOST_API_KEY read server-side; no NEXT_PUBLIC_ key" + : "Expected process.env.OUTPOST_API_KEY and no NEXT_PUBLIC_OUTPOST_API_KEY", + }); + + const destDoc = + /destination|webhook\s*url|register.*webhook/i.test(t) && /tenant|customer|team/i.test(lower); + checks.push({ + id: "destination_per_customer_doc", + pass: destDoc, + detail: destDoc + ? "Documents webhook destination registration per tenant/customer (or team)" + : "Expected how operators register webhook URLs per customer/tenant", + }); + + const beyondTest = corpusSuggestsPublishBeyondTestOnly(t); + checks.push({ + id: "publish_beyond_test_only", + pass: beyondTest, + detail: beyondTest + ? 
"Publish appears beyond a synthetic test-only path (or multiple publish sites)" + : "Expected domain publish (not only publish-test / send test) — see scenario Success criteria", + }); + + const fullStackSignals = + /(attempt|retry|list\s*attempt|destination[_-]?scoped|\/activity|\/attempts|events?\s*\(|list\s*events|manual\s*retry)/i.test( + t, + ) && /(outpost|destination|tenant)/i.test(t); + checks.push({ + id: "delivery_activity_signals", + pass: fullStackSignals, + detail: fullStackSignals + ? "Transcript mentions delivery visibility (attempts/events/retry/activity) with Outpost context" + : "Scenario 8 expects destination-scoped activity UI — see Building your own UI checklists + success criteria", + }); + + const testPublishSeparate = + /(test\s*publish|publish\s*test|send\s*test\s*event|\/api\/.*test|test.?event)/i.test(t); + checks.push({ + id: "separate_test_publish_signal", + pass: testPublishSeparate, + detail: testPublishSeparate + ? "Separate test publish / test event control mentioned" + : "Expected distinct test-publish path or control (see scenario 8 success criteria)", + }); + + checks.push({ + id: "no_key_in_reply", + pass: !containsLikelyLeakedKey(assistant), + detail: containsLikelyLeakedKey(assistant) + ? "Possible raw API key in assistant-visible text" + : "No obvious raw Bearer secret in assistant text", + }); + + const passed = checks.filter((c) => c.pass).length; + const total = checks.length; + return { passed, total, checks, fraction: total ? passed / total : 0 }; +} + +/** Option 3 — existing FastAPI SaaS baseline. 
*/ +function scoreScenario09(corpus: string, assistant: string): TranscriptScore { + const t = corpus; + const lower = t.toLowerCase(); + const checks: CheckResult[] = []; + + const baseline = + /philipokiokio\/fastapi_saas_template|fastapi_saas_template|FastAPI_SAAS/i.test(t) || + /fastapi\/full-stack-fastapi-template|full-stack-fastapi-template|full_stack_fastapi_template/i.test( + t, + ) || + (/git\s+clone\b/.test(lower) && /github\.com/.test(t)); + checks.push({ + id: "baseline_or_clone", + pass: baseline, + detail: baseline + ? "References FastAPI baseline (full-stack template or legacy SaaS template) or git clone" + : "Expected clone/setup of fastapi/full-stack-fastapi-template (or documented alternative)", + }); + + const sdk = /from\s+outpost_sdk\s+import|import\s+outpost_sdk/.test(t); + checks.push({ + id: "python_outpost_sdk", + pass: sdk, + detail: sdk ? "Imports outpost_sdk" : "Expected outpost_sdk import", + }); + + const integration = + /publish\.event|destinations\.create|tenants\.upsert/.test(t); + checks.push({ + id: "outpost_integration_calls", + pass: integration, + detail: integration ? "Uses tenants/destinations/publish APIs" : "Expected SDK API calls for Outpost", + }); + + const hook = + /signal|event|webhook|post_save|after_create|lifecycle|router\.(post|put)/i.test(t) && + /publish|outpost/i.test(lower); + checks.push({ + id: "domain_event_hook", + pass: hook, + detail: hook + ? "Hooks Outpost publish into an application event or route" + : "Expected tying publish to a domain event or HTTP handler", + }); + + const env = /OUTPOST_API_KEY/.test(t) && (/os\.environ|getenv|settings|Depends/.test(t)); + checks.push({ + id: "env_api_key", + pass: env, + detail: env ? 
"API key from environment / settings" : "Expected OUTPOST_API_KEY from env", + }); + + const clientKeyLeak = + /NEXT_PUBLIC_OUTPOST_API_KEY\s*[=:]/.test(t) || + /VITE_OUTPOST_API_KEY\s*[=:]/.test(t) || + /process\.env\.NEXT_PUBLIC_OUTPOST_API_KEY\b/.test(t) || + /import\.meta\.env\.(?:VITE_OUTPOST_API_KEY|NEXT_PUBLIC_OUTPOST_API_KEY)\b/.test(t); + checks.push({ + id: "no_client_bundled_outpost_key", + pass: !clientKeyLeak, + detail: clientKeyLeak + ? "Corpus suggests Outpost API key wired into client-visible env — keep server-side only" + : "No client env assignment/access for OUTPOST_API_KEY (NEXT_PUBLIC_/VITE_) in corpus", + }); + + const beyondTest = corpusSuggestsPublishBeyondTestOnly(t); + checks.push({ + id: "publish_beyond_test_only", + pass: beyondTest, + detail: beyondTest + ? "Publish appears beyond a synthetic test-only path (or multiple publish sites)" + : "Expected domain publish (not only publish-test / send test) — see scenario Success criteria", + }); + + const readmeOrEnvDocs = + /OUTPOST_API_KEY/.test(t) && + /README|development\.md|\.env\.example|backend\/readme/i.test(t); + checks.push({ + id: "readme_or_env_docs", + pass: readmeOrEnvDocs, + detail: readmeOrEnvDocs + ? "README / development.md / .env.example (or similar) touches OUTPOST_API_KEY" + : "Expected operator docs listing OUTPOST env vars (see scenario Success criteria)", + }); + + const fullStackSignals09 = + /(attempt|retry|list\s*attempt|destination[_-]?scoped|\/activity|\/attempts|events?\s*\(|list\s*events|manual\s*retry)/i.test( + t, + ) && /(outpost|destination|tenant)/i.test(t); + checks.push({ + id: "delivery_activity_signals", + pass: fullStackSignals09, + detail: fullStackSignals09 + ? 
"Transcript mentions delivery visibility (attempts/events/retry/activity) with Outpost context" + : "Scenario 9 expects full-stack activity UI — see Building your own UI checklists + success criteria", + }); + + checks.push({ + id: "no_key_in_reply", + pass: !containsLikelyLeakedKey(assistant), + detail: containsLikelyLeakedKey(assistant) + ? "Possible raw API key in assistant-visible text" + : "No obvious raw Bearer secret in assistant text", + }); + + const passed = checks.filter((c) => c.pass).length; + const total = checks.length; + return { passed, total, checks, fraction: total ? passed / total : 0 }; +} + +/** Option 3 — existing Go SaaS/API baseline. */ +function scoreScenario10(corpus: string, assistant: string): TranscriptScore { + const t = corpus; + const lower = t.toLowerCase(); + const checks: CheckResult[] = []; + + const baseline = + /devinterface\/startersaas-go-api|startersaas-go-api|StarterSaaS/.test(t) || + (/git\s+clone\b/.test(lower) && /github\.com/.test(t)); + checks.push({ + id: "baseline_or_clone", + pass: baseline, + detail: baseline + ? "References StarterSaaS Go API baseline or git clone" + : "Expected clone/setup of devinterface/startersaas-go-api (or documented alternative)", + }); + + const sdk = /hookdeck\/outpost.*outpost-go|outpostgo\.|github\.com\/hookdeck\/outpost/.test(t); + checks.push({ + id: "go_outpost_sdk", + pass: sdk, + detail: sdk ? "Uses Outpost Go module" : "Expected outpost-go / outpostgo import path", + }); + + const integration = /Publish\.Event|Tenants\.|Destinations\./.test(t); + checks.push({ + id: "outpost_integration_calls", + pass: integration, + detail: integration ? "Uses Outpost Go client operations" : "Expected Publish / Tenants / Destinations usage", + }); + + const hook = + /handler|middleware|OnUser|event|CreateUser|signup|register/i.test(t) && /publish|outpost/i.test(lower); + checks.push({ + id: "domain_event_hook", + pass: hook, + detail: hook + ? 
"Hooks publish into a handler or domain flow" + : "Expected publish tied to a concrete code path", + }); + + const envKey = /Getenv\s*\(\s*["']OUTPOST_API_KEY["']/.test(t); + checks.push({ + id: "env_api_key", + pass: envKey, + detail: envKey ? "Reads OUTPOST_API_KEY via os.Getenv" : "Expected os.Getenv(\"OUTPOST_API_KEY\")", + }); + + const beyondTest = corpusSuggestsPublishBeyondTestOnly(t); + checks.push({ + id: "publish_beyond_test_only", + pass: beyondTest, + detail: beyondTest + ? "Publish appears beyond a synthetic test-only path (or multiple publish sites)" + : "Expected domain publish (not only publish-test / send test) — see scenario Success criteria", + }); + + checks.push({ + id: "no_key_in_reply", + pass: !containsLikelyLeakedKey(assistant), + detail: containsLikelyLeakedKey(assistant) + ? "Possible raw API key in assistant-visible text" + : "No obvious raw Bearer secret in assistant text", + }); + + const passed = checks.filter((c) => c.pass).length; + const total = checks.length; + return { passed, total, checks, fraction: total ? passed / total : 0 }; +} + +/** Scenarios with a non-empty regex rubric in this file (used for exit / overallTranscriptPass). 
*/ +export const SCENARIO_IDS_WITH_HEURISTIC_RUBRIC: ReadonlySet<string> = new Set([ + "01", + "02", + "03", + "04", + "05", + "06", + "07", + "08", + "09", + "10", +]); + +function scoreByScenarioId( + scenarioId: string, + corpus: string, + assistant: string, + meta: RunJson["meta"], +): TranscriptScore { + switch (scenarioId) { + case "01": + return scoreScenario01(corpus, assistant, meta); + case "02": + return scoreScenario02(corpus, assistant); + case "03": + return scoreScenario03(corpus, assistant); + case "04": + return scoreScenario04(corpus, assistant); + case "05": + return scoreScenario05(corpus, assistant, meta); + case "06": + return scoreScenario06(corpus, assistant); + case "07": + return scoreScenario07(corpus, assistant); + case "08": + return scoreScenario08(corpus, assistant); + case "09": + return scoreScenario09(corpus, assistant); + case "10": + return scoreScenario10(corpus, assistant); + default: + return { + passed: 0, + total: 0, + checks: [], + fraction: 0, + }; + } +} + +export async function scoreRunJson( + runPath: string, + raw: string, +): Promise<ScoreReport> { + const data = JSON.parse(raw) as RunJson; + const scenarioId = data.meta?.scenarioId ?? "unknown"; + const scenarioFile = data.meta?.scenarioFile ?? `${scenarioId}-unknown.md`; + const assistantOnly = extractAssistantText(data.messages); + const corpus = extractTranscriptScoringText(data.messages); + const transcript = scoreByScenarioId(scenarioId, corpus, assistantOnly, data.meta); + + const hasRubric = SCENARIO_IDS_WITH_HEURISTIC_RUBRIC.has(scenarioId); + const overallTranscriptPass = hasRubric + ? transcript.total > 0 && transcript.passed === transcript.total + : null; + + return { + runFile: runPath, + scenarioId, + scenarioFile, + transcript, + execution: { + status: "not_automated", + note: + "Execution (live Outpost) is not scored here. 
After running curls/code with OUTPOST_API_KEY, mark the Execution row in scenarios/*.md or results/RUN-RECORDING.template.md.", + }, + overallTranscriptPass, + }; +} + +export async function scoreRunFile(runPath: string): Promise<ScoreReport> { + const raw = await readFile(runPath, "utf8"); + return scoreRunJson(runPath, raw); +} + +/** Resolve a run directory or legacy flat JSON path to transcript.json path. */ +export async function resolveTranscriptJsonPath(input: string): Promise<string> { + let st; + try { + st = await stat(input); + } catch { + throw new Error(`Path not found: ${input}`); + } + if (st.isDirectory()) { + const t = join(input, "transcript.json"); + try { + await stat(t); + } catch { + throw new Error(`No transcript.json in directory: ${input}`); + } + return t; + } + return input; +} + +/** Sidecar score paths: nested run dir vs legacy flat *-scenario-NN.json */ +export function scoreSidecarPaths(transcriptPath: string): { + heuristic: string; + llm: string; +} { + if (basename(transcriptPath) === "transcript.json") { + const dir = dirname(transcriptPath); + return { + heuristic: join(dir, "heuristic-score.json"), + llm: join(dir, "llm-score.json"), + }; + } + return { + heuristic: transcriptPath.replace(/\.json$/i, ".score.json"), + llm: transcriptPath.replace(/\.json$/i, ".llm-score.json"), + }; +} + +export async function findLatestRunFile( + runsDir: string, + scenarioId?: string, +): Promise<string | null> { + const entries = await readdir(runsDir, { withFileTypes: true }); + /** Mutable holder so TS control flow tracks updates across async `consider` calls. 
*/ + const latest = { path: null as string | null, mtime: -Infinity }; + + const consider = async (transcriptPath: string) => { + try { + const st = await stat(transcriptPath); + if (st.mtimeMs > latest.mtime) { + latest.path = transcriptPath; + latest.mtime = st.mtimeMs; + } + } catch { + /* skip */ + } + }; + + for (const ent of entries) { + const name = ent.name; + if (ent.isDirectory()) { + if (!/-scenario-\d{2}$/i.test(name)) continue; + if ( + scenarioId && + !name.endsWith(`scenario-${scenarioId.padStart(2, "0")}`) + ) { + continue; + } + await consider(join(runsDir, name, "transcript.json")); + continue; + } + if ( + ent.isFile() && + /-scenario-\d{2}\.json$/i.test(name) && + !name.endsWith(".score.json") && + !name.endsWith(".llm-score.json") + ) { + if ( + scenarioId && + !name.includes(`scenario-${scenarioId.padStart(2, "0")}`) + ) { + continue; + } + await consider(join(runsDir, name)); + } + } + + return latest.path; +} + +export function formatScoreReportHuman(r: ScoreReport): string { + const lines: string[] = [ + `Transcript: ${r.runFile}`, + `Scenario: ${r.scenarioId} (${r.scenarioFile})`, + ]; + if (basename(r.runFile) === "transcript.json") { + lines.push(`Run directory (agent workspace): ${dirname(r.runFile)}`); + } + lines.push(""); + if (r.transcript.total === 0) { + lines.push("Transcript checks: (no automated rubric — add scorers in src/score-transcript.ts)"); + } else { + lines.push( + `Transcript checks: ${r.transcript.passed}/${r.transcript.total} passed (${Math.round(r.transcript.fraction * 100)}%)`, + ); + } + for (const c of r.transcript.checks) { + lines.push(` [${c.pass ? "PASS" : "FAIL"}] ${c.id}: ${c.detail}`); + } + lines.push(""); + lines.push(`Execution: ${r.execution.status} — ${r.execution.note}`); + lines.push(""); + lines.push( + `Overall transcript pass: ${ + r.overallTranscriptPass === null ? "N/A (no rubric)" : r.overallTranscriptPass ? 
"YES" : "NO" + }`, + ); + return lines.join("\n"); +} diff --git a/docs/agent-evaluation/tsconfig.json b/docs/agent-evaluation/tsconfig.json new file mode 100644 index 000000000..80fcf22d3 --- /dev/null +++ b/docs/agent-evaluation/tsconfig.json @@ -0,0 +1,15 @@ +{ + "compilerOptions": { + "target": "ES2022", + "module": "NodeNext", + "moduleResolution": "NodeNext", + "lib": ["ES2022"], + "strict": true, + "skipLibCheck": true, + "noEmit": true, + "esModuleInterop": true, + "verbatimModuleSyntax": true, + "resolveJsonModule": true + }, + "include": ["src/**/*.ts"] +} diff --git a/docs/apis/openapi.yaml b/docs/apis/openapi.yaml index d5c114488..ba3309cc6 100644 --- a/docs/apis/openapi.yaml +++ b/docs/apis/openapi.yaml @@ -7,7 +7,7 @@ info: contact: name: Outpost Support email: support@hookdeck.com - url: https://outpost.hookdeck.com/docs + url: https://hookdeck.com/docs/outpost security: - AdminApiKey: [] - TenantJwt: [] @@ -2012,7 +2012,10 @@ components: properties: key: type: string - description: The config key used to store and retrieve the field value. Matches the key in the destination's config or credentials object. + description: >- + Property name for this value inside the destination `config` or `credentials` object + on create/update (for example `url` for a webhook endpoint URL). This is the key used + to store and retrieve the field value in the destination's config or credentials object. example: "url" type: type: string @@ -3860,6 +3863,7 @@ paths: instructions: "Enter the URL..." 
config_fields: [ { + key: "url", type: "text", label: "URL", description: "The URL to send the webhook to.", @@ -3869,6 +3873,7 @@ paths: ] credential_fields: [ { + key: "secret", type: "text", label: "Secret", description: "Optional signing secret.", @@ -3883,30 +3888,35 @@ paths: config_fields: [ { + key: "brokers", type: "text", label: "Brokers", description: "Comma-separated list of Kafka broker addresses.", required: true, }, { + key: "topic", type: "text", label: "Topic", description: "The Kafka topic to publish messages to.", required: true, }, { + key: "tls", type: "checkbox", label: "TLS", description: "Enable TLS for the connection.", default: "true", }, { + key: "partition_key_template", type: "text", label: "Partition Key Template", description: "JMESPath template to extract the partition key from the event payload.", required: false, }, { + key: "sasl_mechanism", type: "select", label: "SASL Mechanism", description: "SASL authentication mechanism.", @@ -3921,12 +3931,14 @@ paths: credential_fields: [ { + key: "username", type: "text", label: "Username", description: "SASL username for authentication.", required: true, }, { + key: "password", type: "text", label: "Password", description: "SASL password for authentication.", @@ -3942,12 +3954,14 @@ paths: config_fields: [ { + key: "queue_url", type: "text", label: "Queue URL", description: "The URL of the SQS queue.", required: true, }, { + key: "endpoint", type: "text", label: "Endpoint", description: "Optional custom AWS endpoint URL.", @@ -3957,6 +3971,7 @@ paths: credential_fields: [ { + key: "key", type: "text", label: "Key", description: "AWS Access Key ID.", @@ -3964,6 +3979,7 @@ paths: sensitive: true, }, { + key: "secret", type: "text", label: "Secret", description: "AWS Secret Access Key.", @@ -3971,6 +3987,7 @@ paths: sensitive: true, }, { + key: "session", type: "text", label: "Session", description: "Optional AWS Session Token.", @@ -4015,6 +4032,7 @@ paths: # remote_setup_url is optional, 
omitted here config_fields: [ { + key: "url", type: "text", label: "URL", description: "The URL to send the webhook to.", @@ -4024,6 +4042,7 @@ paths: ] credential_fields: [ { + key: "secret", type: "text", label: "Secret", description: "Optional signing secret.", diff --git a/docs/content/concepts.mdoc b/docs/content/concepts.mdoc index 02b8a1b93..270764d7d 100644 --- a/docs/content/concepts.mdoc +++ b/docs/content/concepts.mdoc @@ -3,6 +3,23 @@ title: "Concepts" description: "Core concepts and architecture of Outpost: tenants, destinations, topics, events, and delivery attempts." --- +## How this fits your product + +If you run a **SaaS**, **platform**, or **API product** and want each of **your customers** to receive webhooks or other event destinations, Outpost gives you a **multi-tenant** control plane for that. + +At a high level, the same mental model as a single-tenant webhook product still applies: something happens in your system (**event**), it belongs to a category (**topic**), and the consumer cares about **where** it should be delivered (**URL**, queue, and so on). Outpost adds one layer: those subscriptions live **per customer** in your product, which maps to a **tenant** in Outpost. + +**Typical flow:** + +1. **Map your customer to a tenant** — Each organization, team, or account in your app should have a stable **tenant id** in Outpost (often the same id you already use internally). Create or upsert that tenant when the customer is ready to use outbound events (onboarding, first visit to integrations, and so on). +2. **Each tenant has zero or more destinations** — A **destination** is a concrete subscription: it combines a **destination type** (webhook, SQS, Hookdeck, …), one or more **topics** the customer wants to receive, and **type-specific configuration** (for a webhook, the HTTPS **endpoint URL** and signing secret; for a queue, the queue identifier; and so on). 
One tenant may have several destinations (for example production vs staging endpoints, or different systems). +3. **Your backend publishes events** — When something happens, your **server** calls the publish API (or SDK) with **`tenant_id`**, **`topic`**, and payload metadata. Outpost does **not** infer the tenant from the browser; publishing uses your **platform** credentials and explicit tenant scope. +4. **Outpost delivers to matching destinations** — For that tenant, every destination whose **topic subscription** includes the event’s topic gets a delivery attempt. A single publish can fan out to **many** destinations or to **none** if no destination subscribes to that topic. + +**What to build in your UI (conceptually):** screens or flows scoped to the **current customer** (tenant): list their **destinations**, **create or edit** a destination (choose type → choose topics → enter URL or other config), and surfaces for **events and delivery attempts** when you want users to inspect what was sent and how delivery behaved. Your UI talks to Outpost **through your backend** (recommended) or via **per-tenant JWT**, never by embedding your platform API key in the browser. See the [Building your own UI](/docs/outpost/guides/building-your-own-ui) guide for screen-level structure and API patterns. + +For topic subscription behavior (wildcard `*`, multiple topics, fan-out), see [Topics](/docs/outpost/features/topics). + ## Models - **Tenants** — A tenant represents a user, team, or organization in your product. All destinations and events are scoped to a tenant. @@ -86,6 +103,12 @@ The following destination types are available for your tenants to configure: - [Kafka](/docs/outpost/destinations/kafka) - Amazon EventBridge (planned) +**Hookdeck Outpost** is the same [open-source Outpost](https://github.com/hookdeck/outpost) project, operated on Hookdeck’s infrastructure. We do not maintain a separate hosted fork; what we run tracks the public codebase. 
+ +If there is an event destination type you would like to see supported, [open a feature request on GitHub](https://github.com/hookdeck/outpost/issues/new?assignees=&labels=enhancement&projects=&template=feature_request.md&title=%F0%9F%9A%80+Feature%3A+). + +For a diagram of how the API, delivery, and log services connect in **self-hosted** deployments, see [Self-hosting architecture](/docs/outpost/self-hosting/architecture). + ## Observability Outpost provides observability metrics via OpenTelemetry and the Metrics API. Those metrics can be used to monitor your Outpost deployment and provide metrics to your end-users such as events failure rates by destination or events per topic. diff --git a/docs/content/guides/building-your-own-ui.mdoc b/docs/content/guides/building-your-own-ui.mdoc index e152347f4..59ea3ae51 100644 --- a/docs/content/guides/building-your-own-ui.mdoc +++ b/docs/content/guides/building-your-own-ui.mdoc @@ -3,376 +3,234 @@ title: "Building Your Own UI" description: "Build your own UI for users to manage their destinations and view their events using the Outpost API." --- -While Outpost offers a Tenant User Portal, you may want to build your own UI for users to manage their destinations and view their events. +While Outpost offers a Tenant User Portal, you may want to build your own UI so your customers can manage their destinations and view delivery activity. -The portal is built using the Outpost API with JWT authentication. You can leverage the same API to build your own UI. +This page is for **teams shipping that experience**—usually product engineers and anyone designing settings, integrations, or support tooling around webhooks and other destination types. It is framework-agnostic: screens, flows, and how they map to Outpost. 
If you use an **AI coding assistant** with Hookdeck’s optional [integration prompt](/docs/outpost/quickstarts/hookdeck-outpost-agent-prompt), that document carries workflow-specific instructions; this guide stays focused on what your **customers** should see and what your **backend** should enforce. -Within this guide, we will use the User Portal as a reference implementation for a simple UI. You can find the full source code for the User Portal [here](https://github.com/hookdeck/outpost/tree/main/internal/portal). +The portal uses the same Outpost API you can call from your product. Its source is a useful reference ([`internal/portal`](https://github.com/hookdeck/outpost/tree/main/internal/portal), React); you are not required to match its stack. -In this guide, we will assume you are using React (client-side) to build your own UI, but the same principles can be applied to any other framework. +For paths, query parameters, request and response JSON, status codes, and authentication, use the [OpenAPI specification](/docs/outpost/api) as the authoritative contract. If anything here disagrees with OpenAPI, trust the spec. + +**Prefer official SDKs on the server** where Hookdeck provides them for your backend language—see the [SDK overview](/docs/outpost/sdks) and the **curl**, **TypeScript**, **Python**, or **Go** quickstart in this documentation for runnable examples. The SDKs wrap the same API: less boilerplate, typed clients, and fewer raw HTTP mistakes. Use **OpenAPI** as the contract for **wire JSON** (especially when your browser or BFF returns JSON that should match the HTTP API), for generated clients, or when you integrate from a stack without a first-party SDK. + +### Working from OpenAPI + +Map each surface in your product to named operations in the spec (list destinations, create destination, list events, and so on). 
Use the published schemas for request bodies and list rows, and implement those operations with the **official SDK** on your backend when available. + +Destination type labels, icons, and dynamic form fields come from `GET /destination-types`—specifically `config_fields` and `credential_fields` (see [Destination type metadata and dynamic config](#destination-type-metadata-and-dynamic-config)). That response is the source for field keys and types, not guesses from older examples. Each field object includes a **`key`**: the property name inside the destination’s `config` or `credentials` object (for example `url` for a webhook). This is documented on **`DestinationSchemaField`** in [OpenAPI](/docs/outpost/api). + +Whether the browser uses a **tenant JWT** or talks only to **your** API, the operations are the ones in OpenAPI; see [Authentication](#authentication) for how credentials and `tenant_id` are applied. + +The portal shows full UI code for complex forms; this page avoids long framework-specific snippets so the spec stays the single place for shapes and validation. + +## UI structure and flow + +The tenant portal illustrates how screens map to tenant → destinations → topics → delivery target. Following that shape helps your customers understand subscriptions and targets instead of a single anonymous “webhook URL.” + +**Tenant context** + +- Everything below applies to one tenant at a time: the signed-in account in your SaaS (your customer). Use that account’s `tenant_id` when listing or creating destinations and when publishing from your backend. +- With a tenant JWT, the token is scoped to that tenant. If you proxy through your API, resolve the signed-in account to `tenant_id` and forward it on list, create, and publish calls. 
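The tenant-context bullets above can be sketched as follows. This is a minimal illustration, assuming your Outpost tenant id is the same as your internal account id. The `Session` shape and the `tenantIdFor` and `tenantPath` helpers are hypothetical names, not part of Outpost or its SDKs, and the `/tenants/{tenant_id}/...` path shape should be confirmed against OpenAPI.

```typescript
// Hypothetical session shape; substitute whatever your auth layer provides.
interface Session {
  accountId: string;
}

// Assumption: the Outpost tenant id equals your internal account id.
function tenantIdFor(session: Session): string {
  if (!session.accountId) {
    throw new Error("No signed-in account; refusing to call Outpost");
  }
  return session.accountId;
}

// Build a tenant-scoped URL under the configured Outpost base URL.
function tenantPath(baseUrl: string, tenantId: string, resource: string): string {
  const root = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash in config
  return `${root}/tenants/${encodeURIComponent(tenantId)}/${resource}`;
}
```

Resolving the tenant on the server for every request, rather than trusting a tenant id sent by the browser, keeps one customer from reading another customer's destinations.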
+ +**Recommended areas / screens** + +| Area | Purpose | +| ---- | ------- | +| Destinations list | All destinations for the current tenant (type, human-readable target such as webhook URL, queue name, or Hookdeck label, plus subscribed topics). Entry point to edit, disable, or remove. | +| Create destination | Multi-step flow: (1) choose destination type, (2) select topics from your Outpost project configuration, (3) fill type-specific config from the type schema. Optional: instructions or remote setup URL from the schema. | +| Events and delivery attempts | Default pattern: open activity from a destination (events, then attempts, then retry in that context). Optional: a tenant-wide activity view with a destination filter for support or power users. See [Default information architecture](#default-information-architecture-multi-destination-products) and [Events, attempts, and retries](#events-attempts-and-retries). | + +### Default information architecture (multi-destination products) + +When a tenant can have many destinations—of any [destination type](/docs/outpost/overview#supported-destinations) your project enables—the primary path is destination → activity: people ask “what was delivered to this subscription?” rather than seeing all traffic in one undifferentiated list. The same API applies for webhooks, queues, and other types; only create/edit forms differ, driven by [destination type metadata and dynamic config](#destination-type-metadata-and-dynamic-config). + +For list events and list attempts, reuse the same endpoints everywhere: vary query parameters (for example `destination_id`, cursors) rather than inventing parallel client-side contracts. Pagination and auth details are defined in [OpenAPI](/docs/outpost/api); [Events, attempts, and retries](#events-attempts-and-retries) below summarizes how those endpoints support common UI needs. 
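One way to reuse a single list endpoint for both the per-destination and tenant-wide views is to vary only the query string. The sketch below assumes a hypothetical `withQuery` helper; the URL you pass in and any cursor parameter names must come from OpenAPI, and only `destination_id` is taken from this guide.

```typescript
// Append only the filters that are set, so one endpoint serves every view.
function withQuery(url: string, params: Record<string, string | undefined>): string {
  const pairs: string[] = [];
  for (const [key, value] of Object.entries(params)) {
    if (value === undefined || value === "") continue; // skip unset filters
    pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(value)}`);
  }
  return pairs.length > 0 ? `${url}?${pairs.join("&")}` : url;
}
```

A destination activity page would pass `{ destination_id: id }`, while a tenant-wide view would pass an empty object or leave the filter undefined.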
+ +**Example routes** (rename to fit your product—integrations, event destinations, webhooks, etc.): + +| Example route | What it does | Spec | +| ------------- | ------------ | ---- | +| `…/destinations` or `…/integrations` | Hub: list destinations; create or drill down | [Listing destinations](#listing-configured-destinations) · [List destinations](/docs/outpost/api/destinations#list-destinations) | +| `…/destinations/new` (or wizard) | Create destination: choose type ([types](/docs/outpost/overview#supported-destinations); `GET /destination-types` in [OpenAPI](/docs/outpost/api)), then topics and config | [Creating a destination](#creating-a-destination) | +| `…/destinations/:destinationId` | Detail: edit config, enable/disable, topics | [OpenAPI](/docs/outpost/api) — Destinations | +| `…/destinations/:destinationId/activity` | Activity for this destination: events, attempts, retry | [Events, attempts, and retries](#events-attempts-and-retries) · [List events](/docs/outpost/api/events#list-events) · [List attempts](/docs/outpost/api/attempts#list-attempts) | +| `…/activity` (optional) | Tenant-wide activity; optional filter by `destination_id` | Same list-events operation with different query params ([OpenAPI](/docs/outpost/api)) | + +For the conceptual model, see [Outpost Concepts](/docs/outpost/concepts), especially “How this fits your product.” + +## OpenAPI: core operations for a tenant dashboard + +| Goal | OpenAPI entry point | In the UI | +| ---- | ------------------- | --------- | +| Types, labels, icons, dynamic form defs | [Destination types / schema](/docs/outpost/api/schemas) — `GET /destination-types` | Type picker; join list rows on `destination.type` (the type id is `type`, not a separate `id` on the type object). | +| Topics for subscriptions | [Topics](/docs/outpost/api/topics#list-topics) — `GET /topics` | Checkboxes or multi-select on create/update. 
| +| List destinations | [List destinations](/docs/outpost/api/destinations#list-destinations) | Main table; show `target` / `target_url` per schema. | +| Create destination | [Create destination](/docs/outpost/api/destinations#create-destination) | Body: `type`, `topics`, type-specific `config` / credentials per spec. | +| Get / update / delete | [OpenAPI](/docs/outpost/api) — Destinations | Detail and edit flows. | +| Tenant JWT (optional browser calls) | [Tenant JWT](/docs/outpost/api/tenants#get-tenant-jwt-token) | Short-lived token; BFF is often simpler if you need to hide capabilities. | +| Events, attempts, retry | [Events](/docs/outpost/api/events#list-events), [Attempts](/docs/outpost/api/attempts#list-attempts), [Retry](/docs/outpost/api/attempts#retry-attempt) | Activity and recovery; see below. | ## Authentication -To perform API calls on behalf of your tenants, you can either generate a JWT token, which can be used client-side to make Outpost API calls, or you can proxy any API requests to the Outpost API through your own API. When proxying through your own API, you can ensure the API call is made for the currently authenticated tenant using the API `tenant_id` parameter. +You can issue a tenant JWT for client-side calls to Outpost, or proxy requests through your own API. With a proxy, attach your platform’s Outpost API key on the server and scope each call to the authenticated tenant (for example via `tenant_id` on admin-key routes). + +Proxying is useful when you want to restrict which Outpost features are exposed or to keep the admin key off the client entirely. + +### Browser, your API, and Outpost (BFF pattern) + +In a typical **backend-for-frontend** arrangement, the customer’s browser calls **your** product API only. Your servers call Outpost with the **platform** API key and the correct **`tenant_id`** for the signed-in account. Teams refer to this as a **BFF**, an **Outpost proxy**, or a server-side integration layer—the pattern is the same. 
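The BFF arrangement described above might look like the following on the server. This is a sketch under stated assumptions: `OutpostConfig`, `ALLOWED_OPERATIONS`, and `buildProxyRequest` are hypothetical names, the operation list is illustrative, and the `/tenants/{tenant_id}/destinations` path should be checked against OpenAPI. Only the `Authorization: Bearer` header form is taken from this guide.

```typescript
interface OutpostConfig {
  baseUrl: string; // loaded from server config or env, no trailing slash needed
  apiKey: string; // platform API key; stays on the server
}

interface ProxyRequest {
  method: "GET" | "POST";
  url: string;
  headers: Record<string, string>;
}

// Hypothetical allowlist: only these operations are reachable from your UI.
const ALLOWED_OPERATIONS = new Map<string, { method: "GET" | "POST"; resource: string }>([
  ["listDestinations", { method: "GET", resource: "destinations" }],
  ["createDestination", { method: "POST", resource: "destinations" }],
]);

// Describe the outbound Outpost call for an operation the browser asked for.
function buildProxyRequest(cfg: OutpostConfig, tenantId: string, operation: string): ProxyRequest {
  const op = ALLOWED_OPERATIONS.get(operation);
  if (op === undefined) {
    throw new Error(`Operation not exposed to the UI: ${operation}`);
  }
  const root = cfg.baseUrl.replace(/\/+$/, "");
  return {
    method: op.method,
    url: `${root}/tenants/${encodeURIComponent(tenantId)}/${op.resource}`,
    headers: { Authorization: `Bearer ${cfg.apiKey}` }, // attached server-side only
  };
}
```

Keeping the allowlist explicit is one way to limit which Outpost capabilities the UI may invoke; the returned description can then be handed to whatever HTTP client or SDK your backend already uses.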
+ +The alternative is for the browser to call Outpost **directly** using a short-lived **tenant JWT** ([Generating a JWT Token](#generating-a-jwt-token-optional) below). Many products prefer a proxy so the admin key never ships to the client and so they can limit which Outpost capabilities the UI may invoke. + +### API base URL (managed and self-hosted) + +Use one configurable base URL for Outpost (no trailing slash), for example `API_URL` or `OUTPOST_API_BASE_URL`. Paths in this guide match [OpenAPI](/docs/outpost/api) (`/tenants/...`, `/topics`, `/destination-types`, …). + +- **Managed Hookdeck Outpost:** use the base URL from your project (see the [curl quickstart](/docs/outpost/quickstarts/hookdeck-outpost-curl)). +- **Self-hosted:** use your deployment’s public origin plus any path prefix (often `/api/v1`). Local development should still read host and port from configuration or environment so the same code works in staging and production. -Proxying through your own API can be useful if you want to limit access to some configuration or functionality of Outpost. +In your product, treat the base URL like any other external service: load it from config or env, not from literals baked into client bundles. ### Generating a JWT Token (Optional) -You can generate a JWT token by using the [Tenant JWT Token API](/docs/api/tenants#get-tenant-jwt-token). +See the [Tenant JWT Token API](/docs/outpost/api/tenants#get-tenant-jwt-token). ```bash -curl --location 'localhost:3333/api/v1/tenants//token' \ - --header 'Content-Type: application/json' \ - --header 'Authorization: Bearer ' \ -``` +export OUTPOST_API_BASE_URL="https://api.outpost.hookdeck.com/2025-07-01" # or your self-hosted root, e.g. …/api/v1 +TENANT_ID="" -## Fetching Destination Type Schema - -The destination type schema can be fetched using the [Destination Types Schema API](/docs/api/schemas). It can be used to render destination information such as the destination type icon and label. 
Additionally, the schema includes the destination type configuration fields, which can be used to render the destination configuration UI. - -## Listing Configured Destinations - -Destinations are listed using the [List Destinations API](/docs/api/destinations#list-destinations). Destinations can be listed by type and topic. Since each destination type has different configuration, the `target` field can be used to display a recognizable label for the destination, such as the Webhook URL, the SQS queue URL, or Hookdeck Source Name associated with the destination. Each destination type will return a sensible `target` value to display. - -```tsx -// React example to fetch and render a list of destinations - -const [destinations, setDestinations] = useState([]); - -const [destination_types, setDestinationTypes] = useState([]); - -const fetchDestinations = async () => { - // Get the tenant destinations - const response = await fetch(`${API_URL}/api/v1/tenants/destinations`, { - headers: { - Authorization: `Bearer ${token}`, - }, - }); - - const destinations = await response.json(); - setDestinations(destinations); -}; - -const fetchDestinationTypes = async () => { - // Get the destination types schemas - const response = await fetch(`${API_URL}/api/v1/destination-types`, { - headers: { - Authorization: `Bearer ${token}`, - }, - }); - - const destination_types = await response.json(); - setDestinationTypes(destination_types); -}; - -useEffect(() => { - fetchDestinations(); - fetchDestinationTypes(); -}, []); - -if (!destination_types || !destinations) { - return
Loading...
; -} - -const destination_type_map = destination_types.reduce((acc, type) => { - acc[type.id] = type; - return acc; -}, {}); - -return ( -
    - {destinations.map((destination) => ( -
  • - -

    {destination_type_map[destination.type].label}

    - {destination.target_url ? ( - - {destination.target_url} - - ) : ( -

    {destination.target}

    - )} -
  • - ))} -
-); +curl --request GET "$OUTPOST_API_BASE_URL/tenants/$TENANT_ID/token" \ + --header "Authorization: Bearer " ``` -You can find the source code of the `DestinationList.tsx` component of the User Portal here: [DestinationList.tsx](https://github.com/hookdeck/outpost/blob/main/internal/portal/src/scenes/DestinationsList/DestinationList.tsx) - -## Creating a Destination - -To create a destination, the form will require three steps: one to choose the destination type, one to select the topics (optional), and one to configure the destination. - -### Choosing the Destination Type - -The list of available destination types is rendered from the list of destination types fetched from the API. - -```tsx -const [destination_types, setDestinationTypes] = useState([]); - -const fetchDestinationTypes = async () => { - // Get the destination types schemas - const response = await fetch(`${API_URL}/api/v1/destination-types`, { - headers: { - Authorization: `Bearer ${token}`, - }, - }); - - const destination_types = await response.json(); - setDestinationTypes(destination_types); -}; - -useEffect(() => { - fetchDestinationTypes(); -}, []); - -const handleSubmit = (e: React.FormEvent) => { - e.preventDefault(); - const formData = new FormData(e.target as HTMLFormElement); - const destination_type = formData.get("type"); - goToNextStep(destination_type); -}; - -if (!destination_types) { - return
Loading...
; -} - -return ( -
-

Choose a destination type

-
- {destinations?.map((destination) => ( - - ))} -
-
-); -``` +## Destination type metadata and dynamic config -You can find the source code of the `CreateDestination.tsx` component of the User Portal here: [CreateDestination.tsx](https://github.com/hookdeck/outpost/blob/main/internal/portal/src/scenes/CreateDestination/CreateDestination.tsx) - -### Selecting Topics - -Available topics are returned from the [List Topics API](/docs/api/topics#list-topics). You can display the list of topics as a list of checkboxes to capture the user input. - -```tsx -const [topics, setTopics] = useState([]); - -const fetchTopics = async () => { - const response = await fetch(`${API_URL}/api/v1/topics`, { - headers: { - Authorization: `Bearer ${token}`, - }, - }); - - const topics = await response.json(); - setTopics(topics); -}; - -useEffect(() => { - fetchTopics(); -}, []); - -if (!topics) { - return
Loading...
; -} - -return ( -
-

Select topics

-
- {topics.map((topic) => ( - - ))} -
-); -``` +`GET /destination-types` returns everything needed to render type pickers and config forms. See the [Destination Types Schema API](/docs/outpost/api/schemas). -You can find the source code of the `TopicPicker.tsx` component of the User Portal here: [TopicPicker.tsx](https://github.com/hookdeck/outpost/blob/main/internal/portal/src/common/TopicPicker/TopicPicker.tsx) +Each entry typically includes (confirm names and optionality in OpenAPI): -### Configuring the Destination +- `type` — Stable identifier (e.g. `webhook`). Matches `destination.type` on list rows; not named `id` on the type object. +- `label`, `description`, `icon` — Display metadata; `icon` is often an SVG string (some older code used the name `svg`). Sanitize if you render inline HTML. +- `config_fields`, `credential_fields` — Field definitions for the config step (snake_case in JSON). Include every field from both arrays on create and edit. +- `instructions` — Markdown for complex setup (for example cloud resources). +- `remote_setup_url` — Optional external setup flow before or instead of inline fields. -Using the destination type schema for the selected destination type, you can render a form to create and manage destinations configuration. The configuration fields are found in the `configuration_fields` and `credentials_fields` arrays of the destination type schema. +### Dynamic field shape (for forms) -To render your form, you should render all fields from both arrays. Note that some of the `credentials_fields` will be obfuscated once the destination is created, and in order to edit the input, the value must be cleared first. +Field objects are fully described in OpenAPI (`DestinationSchemaField`), including **`key`** (where to place the value in `config` / `credentials` on create/update). 
Each field has `label`, `type` (text vs checkbox vs select vs key-value map), `required`, optional `description`, validation (`minlength`, `maxlength`, `pattern`), `default`, `disabled`, and `sensitive` (password-style; values may be masked after create—clear to edit). On submit, map each value to the **`key`** Outpost expects inside `config` / `credentials`, regardless of how property names were transformed earlier in your stack—see [Wire JSON, SDK responses, and your UI](#wire-json-sdk-responses-and-your-ui). -The input field schema is as follows: +**Reference:** [DestinationConfigFields.tsx](https://github.com/hookdeck/outpost/blob/main/internal/portal/src/common/DestinationConfigFields/DestinationConfigFields.tsx) maps schema fields to inputs. -```ts -type InputField = { - type: "text" | "checkbox"; // Only text and checkbox fields are supported - required: boolean; // If true, the field will be required - description?: string; // Field description, to use as a tooltip - sensitive?: boolean; // If true, the field will be obfuscated once the destination is created and should be treated as a password input - default?: string; // Default value for the field - minlength?: number; // Minimum length for the field - maxlength?: number; // Maximum length for the field - pattern?: string; // Regex validation pattern, to use with the input's pattern attribute -}; -``` +### Wire JSON, SDK responses, and your UI -#### Remote Setup URL - -Some destination type schemas have a `remote_setup_url` property that contains a URL to a page where the destination can be configured. Destinations that support remote URLs have a simplified setup flow that doesn't require instructions. For example, with the Hookdeck destination, the user is taken through a setup flow managed by Hookdeck to configure the destination. - -The URL is optional but provides a better user experience than following sometimes lengthy instructions to configure the destination. 
- -#### Instructions - -Each destination type schema has an `instructions` property that contains instructions to configure the destination as a markdown string. These instructions should be displayed to the user to help them configure the destination, as for some destination types, such as AWS, the necessary configuration can be complex and require multiple steps by the user within AWS. - -Example of a destination configuration form: - -```tsx -const DestinationConfigForm = ({ - destination_type, -}: { - destination_type: string; -}) => { - const [destination_types, setDestinationTypes] = useState([]); - //... Fetch the destination type schema - - if (!destination_types) { - return
Loading...
; - } - - const type_schema = destination_types.find( - (type) => type.id === destination_type - ); - - return ( - <> - {destination_type_schema.remote_setup_url ? ( - - Setup in {destination_type_schema.label} - - ) : ( - - )} - - {[...type_schema.config_fields, ...type_schema.credential_fields].map( - (field) => ( -
- - {field.type === "text" && ( - <> - - - )} - {field.type === "checkbox" && ( - - )} - {field.description &&

{field.description}

} -
- ) - )} - - - ); -}; -``` +This section matters whether you use an **official SDK** on the server (recommended when available) or raw HTTP: the **HTTP API** always follows [OpenAPI](/docs/outpost/api), while SDKs present language-native types to your backend code. -You can find the source code of the `DestinationConfigForm.tsx` component of the User Portal here: [DestinationConfigForm.tsx](https://github.com/hookdeck/outpost/blob/main/internal/portal/src/common/DestinationConfigFields/DestinationConfigFields.tsx#L14) - -## Listing Events - -Events are listed using the [List Events API](/docs/api/events#list-events). You can use the `topic` parameter to filter the events by topic or the `destination_id` parameter to filter the events by destination. - -```tsx -const [events, setEvents] = useState([]); - -const fetchEvents = async () => { - const response = await fetch(`${API_URL}/api/v1/tenants/events`, { - headers: { - Authorization: `Bearer ${token}`, - }, - }); -}; - -useEffect(() => { - fetchEvents(); -}, []); - -if (!events) { - return
Loading...
; -} - -return ( -
-

Events

-
    - {events.map((event) => ( -
  • -

    {event.id}

    -

    {event.created_at}

    -

    {event.payload}

    -
  • - ))} -
-
-); -``` +HTTP responses from Outpost on the wire use JSON property names that match OpenAPI—typically **snake_case** (for example `config_fields`, `credential_fields`, and `remote_setup_url` on `GET /destination-types`). + +Official **SDKs** deserialize into language-native structures; names often differ from the wire format (for example TypeScript may expose **camelCase** such as `configFields` and `credentialFields`). Mutations use each SDK’s documented request types, which may not mirror OpenAPI field names literally. + +When a **browser** loads destination-type metadata via **your** backend, it receives whatever JSON your server returns. Options include forwarding the **raw** Outpost response body (so the client matches OpenAPI) or translating once on the server and treating that as your product’s contract. In all cases, create and update bodies must still place each value under the schema field’s **`key`** inside `config` and `credentials` as defined in OpenAPI. + +**Shape mismatches** between layers often appear as missing dynamic fields or create errors referencing absent `config.*` keys (for example `config.url` for webhooks). Comparing the **actual** JSON your UI receives with the property names your rendering code expects (`config_fields` versus `configFields`, and similar) usually isolates the problem. + +### Remote setup URL + +When `remote_setup_url` is present, you can link users through an external setup flow (for example Hookdeck-managed configuration) instead of only inline fields. + +### Instructions + +Render `instructions` as markdown when the destination type needs context beyond simple fields. + +## Listing configured destinations + +Use the [List Destinations API](/docs/outpost/api/destinations#list-destinations). OpenAPI describes variants for admin API key (tenant in path or query) versus tenant JWT (tenant inferred from the token); choose the operations that match how you authenticate. 
+ +- Call list and render `type`, `target`, `target_url` when present, and subscribed topics. +- Optionally fetch `GET /destination-types` in parallel and map `type` string → schema row for `label` and `icon`. +- Link each row to destination detail and destination-scoped activity ([Default information architecture](#default-information-architecture-multi-destination-products)). + +**Reference:** [DestinationList.tsx](https://github.com/hookdeck/outpost/blob/main/internal/portal/src/scenes/DestinationsList/DestinationList.tsx) + +## Creating a destination + +The product flow is three steps; the API is typically one [create destination](/docs/outpost/api/destinations#create-destination) request once you have `type`, `topics`, and `config` (plus credentials if required). OpenAPI defines the body. + +### Step 1 — Choose destination type + +- Data: `GET /destination-types` ([schemas](/docs/outpost/api/schemas)). +- Show each type’s `label`, `description`, and `icon`; store the chosen `type` string. + +**Reference:** [CreateDestination.tsx](https://github.com/hookdeck/outpost/blob/main/internal/portal/src/scenes/CreateDestination/CreateDestination.tsx) + +### Step 2 — Select topics + +- Data: `GET /topics` ([list topics](/docs/outpost/api/topics#list-topics)). +- Collect topic strings, or `*` for all topics, as allowed by the create schema. + +**Reference:** [TopicPicker.tsx](https://github.com/hookdeck/outpost/blob/main/internal/portal/src/common/TopicPicker/TopicPicker.tsx) + +### Step 3 — Configure the destination + +- Read `config_fields` and `credential_fields` for the selected type from `GET /destination-types` (or a single-type endpoint if you use one—see OpenAPI). +- If `remote_setup_url` is set, consider that flow first. +- Otherwise render fields per [Dynamic field shape](#dynamic-field-shape-for-forms) and submit via [Create destination](/docs/outpost/api/destinations#create-destination). 
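
The configure step above can be sketched as follows: a minimal, non-authoritative example of placing each form value under its schema field's `key` inside `config` and `credentials`. The interfaces are trimmed to the properties the sketch uses, the helper names `collectByKey` and `buildCreateBody` are hypothetical, and the sample schema (including the `secret` credential field) is invented for illustration. Confirm the real field names, value types, and the accepted form of `topics` in OpenAPI.

```typescript
// Trimmed, hypothetical view of one GET /destination-types entry (wire JSON, snake_case).
// Only `key` and `required` are used here; confirm names and optionality in OpenAPI.
interface SchemaField {
  key: string;
  required: boolean;
}

interface DestinationTypeSchema {
  type: string;
  config_fields: SchemaField[];
  credential_fields: SchemaField[];
}

// Assumed shape of the create body; values are kept as strings for simplicity.
interface CreateDestinationBody {
  type: string;
  topics: string[];
  config: Record<string, string>;
  credentials: Record<string, string>;
}

// Copy form values into an object keyed by each schema field's `key`.
// Empty optional fields are skipped; empty required fields throw.
function collectByKey(
  fields: SchemaField[],
  values: Record<string, string>
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const field of fields) {
    const value = values[field.key] ?? "";
    if (value === "") {
      if (field.required) {
        throw new Error(`Missing required field: ${field.key}`);
      }
      continue;
    }
    out[field.key] = value;
  }
  return out;
}

// Hypothetical helper: assemble the create body from the three steps
// (chosen type, selected topics, values entered in the dynamic form).
function buildCreateBody(
  schema: DestinationTypeSchema,
  topics: string[],
  values: Record<string, string>
): CreateDestinationBody {
  return {
    type: schema.type,
    topics,
    config: collectByKey(schema.config_fields, values),
    credentials: collectByKey(schema.credential_fields, values),
  };
}

// Example schema invented for illustration; real schemas come from the API.
const webhookSchema: DestinationTypeSchema = {
  type: "webhook",
  config_fields: [{ key: "url", required: true }],
  credential_fields: [{ key: "secret", required: false }],
};

const body = buildCreateBody(webhookSchema, ["user.created"], {
  url: "https://example.com/hooks",
});
console.log(JSON.stringify(body));
```

Keying the form state by `key` from the start means the submit step does not depend on whether an intermediate layer renamed `config_fields` to `configFields`; only the array lookup changes.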
+ +**Reference:** [DestinationConfigFields.tsx](https://github.com/hookdeck/outpost/blob/main/internal/portal/src/common/DestinationConfigFields/DestinationConfigFields.tsx) + +## Events, attempts, and retries + +This section connects what your customers see (what was delivered, what failed, how to retry) to the API. Request and response shapes live in [OpenAPI](/docs/outpost/api); the [portal](https://github.com/hookdeck/outpost/tree/main/internal/portal) shows one full implementation. + +### How the pieces fit + +1. **Destinations list** — Each row is a subscription. By default, link into destination-scoped activity ([Default information architecture](#default-information-architecture-multi-destination-products)). An optional tenant-wide activity route should still call the same list endpoints with different query parameters, not a separate unofficial API contract. +2. **Events** — Your backend published each event (topic + payload). [List events](/docs/outpost/api/events#list-events) is paginated. Common filters: `destination_id` for a per-destination screen; `topic`, time ranges, and `limit` / `next` / `prev` for broader views. With a tenant JWT, results are limited to that tenant; with an admin key, supply `tenant_id` (your backend usually injects it for the signed-in account). +3. **Attempts** — One row per delivery try (status, HTTP code, timing, optional response). Tie attempts to events with `event_id` and `destination_id`. Tenant-wide: [list attempts](/docs/outpost/api/attempts#list-attempts). Destination-scoped routes are under [OpenAPI](/docs/outpost/api) (tenant destination attempts). +4. **Retry** — Outpost [retries automatically](/docs/outpost/features/event-delivery) with backoff. [Manual retry](/docs/outpost/api/attempts#retry-attempt) is `POST /retry` with `event_id` and `destination_id` after the customer fixes their endpoint. The destination must be enabled and subscribed to the event’s topic. 
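
As a rough illustration of the manual retry step in the list above, the sketch below builds a `POST /retry` request carrying `event_id` and `destination_id` and turns the response status into a message for the dashboard. The helper names, the `Bearer` authorization header, the way the base URL is joined, and the message wording are all assumptions; the authoritative path, headers, and error responses are in OpenAPI.

```typescript
// Minimal fetch-like signature so the sketch does not depend on a particular runtime.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ status: number }>;

interface RetryRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the POST /retry request (hypothetical helper).
// The Bearer header and base URL handling are assumptions, not confirmed API details.
function buildRetryRequest(
  baseUrl: string,
  token: string,
  eventId: string,
  destinationId: string
): RetryRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/retry`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ event_id: eventId, destination_id: destinationId }),
    },
  };
}

// Turn the HTTP status into something a dashboard can show.
// 202 is treated as success; every other status is reported as not accepted.
function describeRetryStatus(status: number): { accepted: boolean; message: string } {
  if (status === 202) {
    return { accepted: true, message: "Retry accepted." };
  }
  return {
    accepted: false,
    message:
      `Retry was not accepted (HTTP ${status}). ` +
      "Check that the destination is enabled and subscribed to the event's topic.",
  };
}

// Send the retry using whatever fetch implementation the caller provides.
async function retryDelivery(
  fetchImpl: FetchLike,
  baseUrl: string,
  token: string,
  eventId: string,
  destinationId: string
): Promise<{ accepted: boolean; message: string }> {
  const { url, init } = buildRetryRequest(baseUrl, token, eventId, destinationId);
  const response = await fetchImpl(url, init);
  return describeRetryStatus(response.status);
}
```

In a proxy or BFF setup this would run on the server with the platform key, and the browser would only see the `accepted` flag and message.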
+ +### What to expose in your dashboard UI + +| User need | API direction | +| --------- | ------------- | +| “What was delivered here?” (this destination) | List events with `destination_id`, then list attempts for the chosen `event_id` (and destination as needed)—same idea for webhooks, queues, and other types. | +| “Why did it fail?” | Surface attempt status, code, and response when present; link to your docs on URLs, auth, or timeouts. | +| “Send it again” | Retry on failed attempts → `POST /retry`; handle 202 vs errors such as disabled destination. | + +### Implementation notes + +- Event and attempt lists use cursor pagination; pass through `next` and `prev` (or “load more”) for busy tenants. +- If the browser never holds the admin key, proxy these calls through your backend with the platform key and the correct `tenant_id`, same as destination CRUD. +- **Reference:** [Events.tsx](https://github.com/hookdeck/outpost/blob/main/internal/portal/src/scenes/Destination/Events/Events.tsx) for destination-scoped activity layout. + +## Implementation checklists + +Use these lists before launch, in design or code review, or when comparing your tenant experience to the patterns above. They do not replace OpenAPI, security review, or testing against your deployment. + +For **customer-facing** destination and delivery UI, work through **Planning and contract**, **Destinations experience**, and **Activity, attempts, and retries** at minimum. Skip rows that clearly do not apply (for example, if you only expose destinations through your own API and have no in-app activity screens—document how customers verify delivery instead). + +### Planning and contract + +- [ ] Every call is scoped to the correct tenant (`tenant_id` on admin-key routes, or tenant inferred from JWT). +- [ ] Outpost base URL comes from configuration or environment for dev, staging, and production (not a single hardcoded host in app code). 
+- [ ] Server-side Outpost calls use an **official SDK** when Hookdeck ships one for your language; raw HTTP or generated OpenAPI clients are fine when they fit better. +- [ ] You chose an auth approach (browser JWT, server-side proxy/BFF, or mix) and use the matching OpenAPI operations and headers consistently. +- [ ] Dynamic destination UI (labels, icons, form fields) is driven by `GET /destination-types`, not copied field lists from examples. + +### Destinations experience + +- [ ] List view shows type, human-readable target, and subscribed topics; each row reaches detail edit and destination-scoped activity. +- [ ] Create flow covers: pick type → select topics (`GET /topics`) → collect `config` and credentials per the selected type’s `config_fields` and `credential_fields`. +- [ ] When a type exposes `instructions` or `remote_setup_url`, the UI surfaces them (markdown / external flow) so customers are not blocked on opaque fields. +- [ ] Detail supports lifecycle your product needs: view, update, delete, enable/disable—per OpenAPI and your product policy. + +### Activity, attempts, and retries + +- [ ] Default path is destination → events → attempts; optional tenant-wide activity still uses the same list endpoints with different query parameters. +- [ ] Cursor pagination is implemented for busy tenants (`next` / `prev` or equivalent “load more”). +- [ ] Failed deliveries show enough context (status, HTTP code, response when present) for customers to fix their side. +- [ ] Manual retry is available where appropriate; errors such as disabled destination are handled with a clear message. -For each event, you can retrieve all its associated delivery attempts using the [List Event Attempts API](/docs/api/events#list-event-attempts). 
+### Content from the API -You can find the source code of the `Events.tsx` component of the User Portal here: [Events.tsx](https://github.com/hookdeck/outpost/blob/main/internal/portal/src/scenes/Destination/Events/Events.tsx) +- [ ] Inline icons or `instructions` markdown are rendered safely if they contain HTML or untrusted strings. +- [ ] Sensitive credential fields respect masking and “clear to edit” behavior described in the spec. diff --git a/docs/content/nav.json b/docs/content/nav.json index 9e7da01a2..f7147e05b 100644 --- a/docs/content/nav.json +++ b/docs/content/nav.json @@ -127,7 +127,31 @@ "structure": [ { "label": "Quickstarts", - "sections": [[{ "slug": "quickstarts/overview", "title": "Overview" }]] + "sections": [ + [ + { "slug": "quickstarts/overview", "title": "Overview" }, + { + "slug": "quickstarts/hookdeck-outpost-curl", + "title": "Hookdeck Outpost: curl" + }, + { + "slug": "quickstarts/hookdeck-outpost-typescript", + "title": "Hookdeck Outpost: TypeScript" + }, + { + "slug": "quickstarts/hookdeck-outpost-python", + "title": "Hookdeck Outpost: Python" + }, + { + "slug": "quickstarts/hookdeck-outpost-go", + "title": "Hookdeck Outpost: Go" + }, + { + "slug": "quickstarts/hookdeck-outpost-agent-prompt", + "title": "Hookdeck Outpost: agent prompt" + } + ] + ] } ] }, diff --git a/docs/content/overview.mdoc b/docs/content/overview.mdoc index 2e2f141df..0f1c23cfb 100644 --- a/docs/content/overview.mdoc +++ b/docs/content/overview.mdoc @@ -41,9 +41,9 @@ Outpost delivers events to any of the following destination types: - **[Azure Service Bus](/docs/outpost/destinations/azure-service-bus)** — Send to a Service Bus queue or topic - **[GCP Pub/Sub](/docs/outpost/destinations/gcp-pubsub)** — Publish to a Pub/Sub topic - **[RabbitMQ (AMQP)](/docs/outpost/destinations/rabbitmq)** — Send to a remote RabbitMQ exchange -- **[Kafka](/docs/outpost/destinations/kafka)** — Send to a remote Kafka topic +- **[Kafka 
(planned)](https://github.com/hookdeck/outpost/issues/141)** — Send to a remote Kafka topic -If you'd like to see more destination types added, submit an issue or PR: GITHUB LINK +If you'd like to see more destination types added, [open an issue or PR](https://github.com/hookdeck/outpost/issues). ## Get Started @@ -53,12 +53,12 @@ Hookdeck Outpost is available through [Hookdeck](https://hookdeck.com/get-starte Once your account has been created, follow the documentation or in-product get started guide: 1. [Concepts](/docs/outpost/concepts) — Understand the core models -2. [Publishing Events](/docs/outpost/publishing-events) — Publish your first event +2. [Publishing Events](/docs/outpost/publishing/events) — Publish your first event 3. [Destinations](/docs/outpost/destinations/webhook) — Configure where events are delivered -4. [Tenant Portal](/docs/outpost/features/tenant-portal) — Embed the portal in your product +4. [Tenant Portal](/docs/outpost/features/tenant-user-portal) — Embed the portal in your product {% /tab %} {% tab label="Self-Hosted" %} -Follow one of our sel-fhosted quickstart to deploy Outpost: +Follow one of our self-hosted quickstarts to deploy Outpost: - [Deploy on Railway](/docs/outpost/self-hosting/quickstarts/railway) — One-click deployment with a Railway template - [Deploy with Docker](/docs/outpost/self-hosting/quickstarts/docker) — Docker Compose with RabbitMQ or SQS @@ -66,7 +66,7 @@ Follow one of our sel-fhosted quickstart to deploy Outpost: Then explore: - [Concepts](/docs/outpost/concepts) — Architecture and core models -- [Publishing Events](/docs/outpost/publishing-events) — Publish your first event -- [Configuration Reference](/docs/outpost/references/configuration) — All environment variables +- [Publishing Events](/docs/outpost/publishing/events) — Publish your first event +- [Configuration Reference](/docs/outpost/self-hosting/configuration) — All environment variables {% /tab %} {% /tabs %} diff --git 
a/docs/content/quickstarts/hookdeck-outpost-agent-prompt.mdoc b/docs/content/quickstarts/hookdeck-outpost-agent-prompt.mdoc new file mode 100644 index 000000000..7422c5e38 --- /dev/null +++ b/docs/content/quickstarts/hookdeck-outpost-agent-prompt.mdoc @@ -0,0 +1,190 @@ +--- +title: "Hookdeck Outpost — agent prompt template" +description: "Copy-paste template for AI coding agents. Dashboard teams should inject the placeholders server-side or client-side." +--- + +This page is a **reference template** for the Hookdeck Outpost onboarding flow. Replace `{{PLACEHOLDERS}}` with values from the operator’s project (or render them in the dashboard). **Do not** put the API key in the prompt; the operator sets `OUTPOST_API_KEY` separately (for example in a project **`.env`** file loaded by their shell or app—never pasted into chat). API keys are created under the Outpost project: **Settings → Secrets** (the same Outpost API key used by the REST API and SDKs). + +## Template + +``` +## Hookdeck Outpost integration + +You are helping integrate Hookdeck Outpost into a platform to deliver events to the platform's customers via **event destinations** (webhook URLs, cloud queues, Hookdeck, and other supported types—see **{{DOCS_URL}}/destinations**). + +### Credentials + +- API base URL: {{API_BASE_URL}} +- API key (Outpost API key from the project **Settings → Secrets**): load from the `OUTPOST_API_KEY` environment variable — typically a **`.env`** file in the operator’s project (or another secrets mechanism their tooling loads); never ask the user to paste the key into chat + +### Configured topics + +{{TOPICS_LIST}} + +These names must **exist in the Outpost project** (dashboard) for publishes and destination subscriptions to work. + +**Naming:** In typical B2B SaaS, lifecycle topics like **`user.created`** mean an **end-user of the tenant’s account** (your customer’s customer—e.g. a team member), **not** your platform’s internal operator or staff. 
Use topic names that match **your product’s domain** (`order.shipped`, `item.deleted`, …) when those are the real events. + +**Reconciliation (default):** Derive **`topic` strings in code** from **real state changes** in the app. If **Configured topics** above is missing a name the app should emit, **do not** bend the product model to fit the list—tell the operator to **add that topic in the Outpost project** (Hookdeck) and to **refresh `{{TOPICS_LIST}}`** in the dashboard so a regenerated prompt matches the project. Only narrow or rename domain publishes when the operator **explicitly** asks for a minimal wiring demo with a fixed topic set. + +### Test destination + +Use this **Hookdeck Console Source** URL to verify event delivery (the webhook `config.url`, or `OUTPOST_TEST_WEBHOOK_URL` in the SDK quickstarts). Your dashboard supplies it for this project: + +{{TEST_DESTINATION_URL}} + +### Documentation + +**Core (read for every path):** + +- Getting started (curl / HTTP only, no SDK): {{DOCS_URL}}/quickstarts/hookdeck-outpost-curl +- TypeScript quickstart (TypeScript SDK): {{DOCS_URL}}/quickstarts/hookdeck-outpost-typescript +- Python quickstart (Python SDK): {{DOCS_URL}}/quickstarts/hookdeck-outpost-python +- Go quickstart (Go SDK): {{DOCS_URL}}/quickstarts/hookdeck-outpost-go +- API reference and OpenAPI (REST JSON shapes and status codes): {{DOCS_URL}}/api +- **Concepts — how tenants, destinations (subscriptions), topics, and publish fit a SaaS/platform:** {{DOCS_URL}}/concepts +- Full docs bundle (when available on the public site): {{LLMS_FULL_URL}} +- SDK overview: {{DOCS_URL}}/sdks — use **only** for high-level context; for **TypeScript, Python, or Go** code, follow that language’s **quickstart** for correct method signatures (e.g. Python `publish.event` uses `request={{...}}`, not TypeScript-style spreads as Python kwargs). 
+ +**SDK vs OpenAPI (BFF / dashboard UI):** **Prefer the official server SDK** when Hookdeck provides one for the repo’s backend language (**{{DOCS_URL}}/sdks**). Keep these invariants: (1) **Wire JSON** matches **OpenAPI** (often **snake_case**). **SDKs** rename fields in language types (e.g. TypeScript **camelCase**). (2) The **browser** should consume the same JSON shape your BFF actually returns—or the server should **normalize** (e.g. forward raw `GET /destination-types`). (3) On create/update, each schema field’s **`key`** maps into `config` / `credentials` per OpenAPI. **Calling** Outpost: use **SDK** types when using the SDK; use **OpenAPI** for raw `fetch` / curl. Detail: **{{DOCS_URL}}/guides/building-your-own-ui#authentication** and **{{DOCS_URL}}/guides/building-your-own-ui#wire-json-sdk-responses-and-your-ui**. + +**When you build customer-facing UI or integrate into an existing product (not for quick path only):** + +- **Building your own UI — screen structure and flow** (list destinations—**any type**; create: choose **type** → topics → type-specific config; **events** / **attempts** / **manual retry**; tenant scope; default **destination → activity**): {{DOCS_URL}}/guides/building-your-own-ui +- Destination types: {{DOCS_URL}}/destinations +- Topics and destination subscriptions (fan-out, `*`): {{DOCS_URL}}/features/topics + +### Scope: choose the right depth (read before you build) + +Operators often give **short** answers (“TypeScript example,” “show me in Go”). **You** infer **how much** to build from their words—not from habit, and **not** from the language alone. + +**Three paths** (dashboard or chat may use other labels—“try it out,” “small demo app,” “our existing codebase,” or “Option 1 / 2 / 3”—map them to the same three): + +1. **Quick path** — Smallest runnable artifact: one **shell script** (curl) or **one source file** run with `npx tsx`, `python`, `go run`, etc., exactly as that language’s **quickstart** describes. 
No application framework, no multi-route server, no dev-server “project,” unless they clearly asked for an app. +2. **New minimal application** — They want a **new** small service or UI (pages, forms, a demo they can open in a browser). Use the **official SDK on the server** for whatever stack they name; stay **framework-agnostic** unless they specify a framework—do not impose one. +3. **Existing application** — They are changing **their current codebase**. Same SDK-on-server rules; integrate on **real** domain paths. Use the **full-stack** guidance in **Existing application (full-stack products)** below when the repo already has customer-facing UI. + +**Default when scope is ambiguous:** Prefer **Quick path**. If they only name a language, say “example,” “quickstart,” “try it,” “just show me,” or similar—and they do **not** ask for an app, UI, pages, a server project, or changes **inside their repo**—deliver **only** the quickstart-shaped artifact for that language (or curl if they gave no language). **Brief user messages are normal;** map them to the **smallest** matching path. + +**Language ≠ architecture:** **TypeScript**, **Python**, and **Go** select **which quickstart and SDK** to use. They do **not** mean “build a web application.” If they want an app or a full integration, they will signal it (“small dashboard,” “add to our backend,” “we use X in production,” etc.)—or ask **one** clarifying question if truly unclear. + +**Do not over-build:** + +- **Quick path** → **No** framework scaffold (no app router, no `create-*-app`, no Express/FastAPI/Go HTTP **project** just to demo Outpost). One file or one shell script is enough. +- **Quick path** → Do **not** default to a large stack because the language was TypeScript or Node; a **single `.ts` file** per the TypeScript quickstart is the right shape unless they asked for more. 
+- **New minimal application** → Do **not** ship full portal depth (events UI, retry flows, every destination type) unless they asked for that level; grow into **Building your own UI** when they want customer-grade destination management. +- **Existing application** → Do **not** stop at a throwaway demo route when they asked for real integration; follow **Minimum integration depth** under that section. + +### If the operator said… (mapping hints) + +| They said (examples) | Likely path | +|----------------------|-------------| +| “Example,” “quickstart,” “fastest,” “simplest,” “just show me,” or **only** a language name with no app/repo context | **Quick path** | +| “Small app,” “UI,” “page,” “form,” “demo site,” “dashboard” (greenfield, not their production repo) | **New minimal application** | +| “Our app,” “existing code,” “add to my API,” “integrate into this repo,” “we already run …” | **Existing application** | + +Use judgment; when two paths seem possible, prefer **Quick path** unless they clearly want UI or repo integration. + +### Language → SDK vs HTTP + +**You** map their words to the right doc—**after** you have chosen **scope** above. + +- **No language named** + simplest / minimal / “just show me” / no framework → **curl quickstart** + OpenAPI. One runnable shell script. **No SDK.** +- **TypeScript** or **Node** → **TypeScript quickstart** + **`@hookdeck/outpost-sdk`** as that doc shows. They do not need to say “SDK.” +- **Python** → **Python quickstart** + **`outpost_sdk`** (e.g. Python `publish.event` uses `request={{...}}` — **not** TypeScript-style kwargs). +- **Go** → **Go quickstart** + official Go SDK as that doc shows. +- **curl**, **HTTP only**, or **REST** without a language SDK → **curl quickstart** + OpenAPI. + +Do **not** mix argument styles across languages (e.g. do not apply TypeScript `publish.event({ ... })` shapes to Python). 
+ +### Quick path — how to deliver + +Goal: tenant → **one destination** (often webhook to `{{TEST_DESTINATION_URL}}` / `OUTPOST_TEST_WEBHOOK_URL`) → **publish** → clear success (event id, HTTP 2xx, log line). + +- Default to **curl** when they want the absolute minimum and did not name a language. +- When they name **TypeScript**, **Python**, or **Go**, produce **only** what that language’s **quickstart** describes—typically **one file** (plus `package.json` / `go.mod` / venv if the quickstart needs it), not a full application tree. +- Ask only for env vars and details the quickstart still needs. + +### New minimal application + +When they want a **new** small app (not quick path): use the **official SDK on the server** for **their** stack. **Do not** treat any single framework as the default—follow what they name (or ask once). Prefer each language’s **quickstart** for Outpost call shapes, then add routes/pages as their stack requires. + +**Before** designing screens or forms, read **Concepts** and **Building your own UI** (under Documentation): reflect **tenant scope**, **multiple destinations per tenant**, and **destination = topic subscription + delivery target** (not one anonymous webhook field unless they ask for that simplification). + +For a **tiny** demo, keep **tenant** in scope, **create destination** as **topics + delivery target**, and a **separate** way to **publish a test event** so they can verify delivery—avoid one giant form unless they insist. Events / attempts UI is optional for the smallest demo; add it when matching **Building your own UI**. + +### Existing application + +Use the **official SDK for the repo’s backend language** on the **server** (or REST/OpenAPI if they refuse SDKs). Read that language’s quickstart for call shapes; integrate on **real** domain paths (signup, entities, workflows), not throwaway demos only. 
+ +**Minimum integration depth:** (1) **Topic reconciliation** — every **`topic` in `publish`** appears under **Configured topics** above **or** the operator is told to **add that topic in the Outpost project** (prefer fixing the project to match the domain, not retargeting domain logic to a stale list). (2) **Domain publish** — at least one **`publish` on a real state-change path**, not only a synthetic “test event” route. (3) **Same tenant mapping** everywhere you call Outpost for that customer. + +### Existing application (full-stack products) + +If the codebase already has **customer-facing UI** (dashboard, settings, integrations) **or** a client that talks to **your** API, operators usually want customers to **manage destinations** (every **destination type** the project enables; see **{{DOCS_URL}}/destinations** and **`GET /destination-types`** in OpenAPI) **inside the product**: + +- **Backend:** **`OUTPOST_API_KEY`** and all Outpost SDK usage **server-side only**. **Tenant** upsert/sync where it fits, **publish** on real domain events, and **authenticated routes** (backend-for-frontend / BFF, server handlers, server actions—whatever matches **their** stack) to list/create/update/delete destinations for the **signed-in customer’s** tenant. Handlers call Outpost with platform credentials; responses expose only what the customer should see (ids, targets, topics—**never** the platform API key). +- **Frontend:** **Logged-in** clients call **your** backend (session, JWT, existing API client)—**not** Hookdeck’s API directly; **not** the Outpost SDK in the browser. Reuse their design system and routing. **Before** building screens, read **Concepts** and **Building your own UI**: **tenant scope**, **multiple destinations**, **destination = topics + delivery target** (avoid one undifferentiated “webhook” field unless they want that simplification). 
+- **Events and retries:** Surface **events** (filter by **destination** when useful) and **attempts** per event; offer **manual retry** for failed attempts (server-side retry API with `event_id` and `destination_id`) after they fix downstream—see **Building your own UI** (default **destination → activity**). +- **Test publish (recommended when shipping destination UI):** A **separate** control that **publishes a test event** for the signed-in tenant (server-side `publish` to a configured topic). Complementary to domain publishes; **does not replace** a real domain `publish`. +- **API-only products:** Document how tenants manage destinations via **your** API; keep the platform key on the server. + +### What to do + +1. **Infer scope** from **Scope** + **If the operator said…** (default **Quick path** when unclear). +2. **Map language** under **Language → SDK vs HTTP**. +3. **Execute** the matching section: **Quick path**, **New minimal application**, or **Existing application** (+ **full-stack** subsection when applicable). +4. Read the **single** language-appropriate quickstart (and OpenAPI for raw HTTP) before coding. For existing apps with UI, read **Building your own UI** before destination-management screens. + +### Before you stop (verify) + +Apply **only** the items below that fit the task; **skip** any that do not apply (e.g. skip existing-repo items for a standalone script or curl-only flow). + +**Always (when you produced or changed runnable code):** + +- [ ] **Ran** the smallest end-to-end check that fits this task (e.g. run the script or shell flow once, exercise one new API path, or smoke the UI/API flow you added) and saw a clear success signal (e.g. event id, HTTP 2xx, or expected output). +- [ ] **Secrets:** The platform Outpost API key remains **server-side** / **environment** only — not in client bundles, not hard-coded in committed source. 
+- [ ] **Repeatable:** Env vars, how to run, and how to verify with the test destination above are stated briefly (README, comments, or chat — match the task size; a one-file script may need only inline or chat notes). + +**When editing an existing application repository (Existing application or equivalent):** + +- [ ] **Topic reconciliation:** Every **`topic`** in `publish` is either in **Configured topics** above **or** README/chat tells the operator exactly which topics to **add in Hookdeck**—**domain-first**; do not retarget real features to wrong topic names to match an incomplete **Configured topics** list unless the operator explicitly asked for a minimal demo scope. +- [ ] **Domain publish:** At least one **`publish` on a real application path** (entity create/update, signup, etc.), not solely a synthetic “test event” endpoint—unless the operator explicitly scoped the task to wiring-only. +- [ ] **Test publish (if you added one):** Kept as a **separate** control from domain logic; does not satisfy the domain-publish item by itself. +- [ ] **Build integrity:** Generated outputs, route or module registries, and dependency lockfiles are **consistent** with new or edited source so a **clean** install + typecheck or build (or the repo’s documented CI step) would pass. 
+ +**When you added or changed customer-facing destination management in an existing full-stack product** (dashboard, settings, or integrations UI—per **Existing application (full-stack products)** above): + +- [ ] **Full-stack UI bar:** Walked **Planning and contract**, **Destinations experience**, and **Activity, attempts, and retries** in **{{DOCS_URL}}/guides/building-your-own-ui#implementation-checklists** and confirmed the implementation matches: list rows reach **detail** and **destination-scoped activity** (events → attempts → manual retry as appropriate), **dynamic** create (and edit if you expose it) is driven by **`GET /destination-types`** (including each field’s **`key`** in `config` / `credentials`), and a **separate server-side test publish** control exists when customers can manage destinations. *Skip this item if the product is **API-only** (no customer UI for destinations) or the operator explicitly excluded activity / test UI—then document verification instead (README, Outpost dashboard, or curl to list events/attempts).* + +**Files on disk:** When your environment supports it, **write runnable artifacts into the operator’s project workspace** (real files: scripts, app source, `package.json`, `go.mod`, README) rather than only pasting long code in chat—so they can run, diff, and commit. Keep everything for one task in the same directory. For **new minimal application**, scaffold and install dependencies as you normally would (`npm` / `npx`, `go mod`, `pip` or `uv`). For **existing** full-stack products, change both **backend and frontend** (or equivalent UI layer) when the repo already includes customer-facing UI—do not stop at OpenAPI-only unless the product is genuinely API-only or the operator asks to skip UI work. + +**Concepts:** Read **{{DOCS_URL}}/concepts** for tenants, destinations as subscriptions, topics, and how **publish** fans out. 
Use **{{DOCS_URL}}/guides/building-your-own-ui** for recommended screens and implementation checklists. **Configured topics** above lists this project’s topic names (dashboard); **`user.*`** naming semantics are explained under **Configured topics** in this prompt. +``` + +## Placeholder reference + +| Placeholder | Example | Notes | +|-------------|---------|--------| +| `{{API_BASE_URL}}` | `https://api.outpost.hookdeck.com/2025-07-01` | Safe to embed in the prompt | +| `{{TOPICS_LIST}}` | Bullet list or comma-separated topic names | From dashboard config — operators should keep this aligned with what the integrated app will **publish** and what destinations subscribe to | +| `{{TEST_DESTINATION_URL}}` | HTTPS URL of a Hookdeck Console **Source** | **Required.** Created for this onboarding flow and fed in by the dashboard. | +| `{{DOCS_URL}}` | `https://hookdeck.com/docs/outpost` | Production **Outpost** docs base on Hookdeck (no trailing slash). Template paths append the same segments as **`/docs/outpost/…`** on the docs site (see `docs/content/nav.json`). For unpublished docs, evals can set **`EVAL_LOCAL_DOCS=1`** so the Documentation section is replaced with repository file paths (see `docs/agent-evaluation/README.md`). | +| `{{LLMS_FULL_URL}}` | `https://hookdeck.com/outpost/docs/llms-full.txt` | Optional; omit the line if not live yet | + +### Building your own UI — where the detail lives + +Product guidance is consolidated in **[Building your own UI](/docs/outpost/guides/building-your-own-ui)**: + +- **[Implementation checklists](/docs/outpost/guides/building-your-own-ui#implementation-checklists)** — ship/review rows for destinations and activity (referenced from **Before you stop (verify)** in the template above; not duplicated here). +- **[Authentication](/docs/outpost/guides/building-your-own-ui#authentication)** — browser vs your API vs Outpost (**BFF** pattern) and JWT option.
+- **[Wire JSON, SDK responses, and your UI](/docs/outpost/guides/building-your-own-ui#wire-json-sdk-responses-and-your-ui)** — snake_case wire vs SDK names, `key` in `config` / `credentials`, shape mismatches. + +That page is written for **teams integrating Outpost** (engineers, PMs, reviewers). **Agent evaluation** in the Outpost repository (`docs/agent-evaluation/scenarios/`, scenarios **8–10** for existing-app baselines) uses the same implementation checklist when a run includes **customer-facing** destination UI—see each scenario’s success criteria for links. + +## Operator checklist (dashboard UI) + +- Show **API base URL** and **topics** next to the copyable prompt. +- **`{{TOPICS_LIST}}`:** Should match what the **integrated product** will publish (domain-first). If the baseline app emits events the project does not list yet, **add topics in Hookdeck** and refresh this list—avoid expecting the agent to **reshape the app** to fit a stale default (e.g. only `user.created` when the real model is `item.*`). +- Feed **`{{TEST_DESTINATION_URL}}`** from a Hookdeck Console **Source** URL you create for the operator (same value can be shown for `OUTPOST_TEST_WEBHOOK_URL` in env UI). Explain **Settings → Secrets** for `OUTPOST_API_KEY` (recommend a project **`.env`** or env-injection pattern, not pasting into the agent). Optional `OUTPOST_API_BASE_URL`. +- Keep the **API key out of the prompt text** to reduce exposure via model logs and chat history. 
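Review note on the placeholder reference and operator checklist above: the substitution step could be sketched roughly as follows. This is a minimal illustration, not the dashboard's actual implementation. `render_prompt`, `OPTIONAL`, and the sample template string are hypothetical names introduced here; the only details taken from the document are the `{{NAME}}` token form, the rule that the `{{LLMS_FULL_URL}}` line is omitted when that URL is not live, and the rule that the API key stays out of the prompt text.

```python
import re

# Matches tokens such as {{API_BASE_URL}} or {{TOPICS_LIST}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")

# Assumption: placeholders whose whole line is dropped when no value is supplied.
OPTIONAL = {"LLMS_FULL_URL"}


def render_prompt(template: str, values: dict) -> str:
    """Fill {{PLACEHOLDER}} tokens line by line (hypothetical helper)."""
    # The operator checklist asks that the API key never appear in the prompt.
    if "OUTPOST_API_KEY" in values:
        raise ValueError("Do not pass the API key into the prompt text")

    rendered = []
    for line in template.splitlines():
        names = PLACEHOLDER.findall(line)
        # Omit the line entirely when an optional value is not available.
        if any(name in OPTIONAL and name not in values for name in names):
            continue
        # Fail loudly instead of shipping a prompt with unfilled tokens.
        missing = [name for name in names if name not in values]
        if missing:
            raise KeyError("No value for placeholder(s): " + ", ".join(missing))
        rendered.append(PLACEHOLDER.sub(lambda m: values[m.group(1)], line))
    return "\n".join(rendered)


if __name__ == "__main__":
    sample = (
        "API base URL: {{API_BASE_URL}}\n"
        "Configured topics: {{TOPICS_LIST}}\n"
        "Full docs: {{LLMS_FULL_URL}}"
    )
    print(
        render_prompt(
            sample,
            {
                "API_BASE_URL": "https://api.outpost.hookdeck.com/2025-07-01",
                "TOPICS_LIST": "user.created, order.completed",
            },
        )
    )
```

Raising on an unfilled required token, rather than leaving `{{...}}` in place, is one way to keep a half-rendered prompt from reaching the agent; a real dashboard may prefer to validate earlier or report the problem differently.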
diff --git a/docs/content/quickstarts/hookdeck-outpost-curl.mdoc b/docs/content/quickstarts/hookdeck-outpost-curl.mdoc new file mode 100644 index 000000000..69cabaeea --- /dev/null +++ b/docs/content/quickstarts/hookdeck-outpost-curl.mdoc @@ -0,0 +1,103 @@ +--- +title: "Hookdeck Outpost Quickstart: curl" +--- + +[Hookdeck Outpost](https://outpost.hookdeck.com) is Hookdeck’s managed [Outpost](https://github.com/hookdeck/outpost) service: a control plane and delivery layer for event destinations (webhooks, queues, and more) scoped per **tenant**—each tenant is one of your platform’s customers. + +This quickstart uses the REST API with `curl`. Topics are assumed to be configured already in the Hookdeck dashboard; use a topic name that exists there when you publish. + +## Prerequisites + +- A Hookdeck account with an Outpost project +- An **API key** (Outpost API key) from your project: **Settings → Secrets** +- **Topics** already configured in the dashboard (for example `user.created`, `order.completed`) +- API base URL: `https://api.outpost.hookdeck.com/2025-07-01` + +## Set up credentials + +In the Hookdeck Dashboard, open your Outpost project, go to **Settings → Secrets**, and create or copy an API key. That value is the same Outpost API key you use for the REST API and the SDKs. + +Store the API key and base URL in your shell (or in a `.env` file you `source`): + +```sh +export OUTPOST_API_BASE_URL="https://api.outpost.hookdeck.com/2025-07-01" +export OUTPOST_API_KEY="your_api_key" +``` + +Use them in the requests below as `$OUTPOST_API_BASE_URL` and `$OUTPOST_API_KEY`. + +## Create a tenant + +Each tenant maps to one of your customers. Pick a stable ID from your own system (for example a team or account ID). 
+ +```sh +TENANT_ID="customer_acme_001" + +curl --request PUT "$OUTPOST_API_BASE_URL/tenants/$TENANT_ID" \ + --header "Authorization: Bearer $OUTPOST_API_KEY" +``` + +## Create a webhook destination + +Subscribe the tenant to one or more topics you configured in the dashboard. Set `config.url` to an HTTPS endpoint you control. + +If you do not have your own endpoint yet, open [Hookdeck Console](https://console.hookdeck.com?ref=outpost-docs), create a **Source**, and paste that Source URL as the webhook URL below (or any HTTPS URL you own). Replace `REPLACE_WITH_YOUR_WEBHOOK_URL` accordingly. + +Replace `user.created` with a topic that exists in your project if needed. + +```sh +curl --request POST "$OUTPOST_API_BASE_URL/tenants/$TENANT_ID/destinations" \ + --header "Authorization: Bearer $OUTPOST_API_KEY" \ + --header "Content-Type: application/json" \ + --data '{ + "type": "webhook", + "topics": ["user.created"], + "config": { + "url": "REPLACE_WITH_YOUR_WEBHOOK_URL" + } + }' +``` + +To receive every configured topic on this destination, set `"topics": ["*"]` instead. + +## Publish a test event + +Use the same tenant ID and a `topic` that matches both your dashboard configuration and the destination’s `topics`. + +```sh +curl --request POST "$OUTPOST_API_BASE_URL/publish" \ + --header "Authorization: Bearer $OUTPOST_API_KEY" \ + --header "Content-Type: application/json" \ + --data '{ + "tenant_id": "'"$TENANT_ID"'", + "topic": "user.created", + "eligible_for_retry": true, + "metadata": { + "source": "quickstart" + }, + "data": { + "user_id": "user_123" + } + }' +``` + +A `202` response means the event was accepted for delivery. + +## Shell scripts: status codes and portability + +If you combine API response bodies with `curl --write-out '\n%{http_code}'`: + +- **Publish** success is **HTTP 202** (not only 200/201). Treat **202** as success in conditional checks. 
+- **Portability:** GNU `head -n -1` (“all lines but the last”) is **not** available on macOS BSD `head`. Prefer splitting with **`sed '$d'`** (body) and **`tail -n 1`** (status), or another POSIX-friendly approach, so the same script runs on Linux and macOS. + +## Verify delivery + +- In **Hookdeck Console**, inspect the connection or destination you used (for example the Source you created) and confirm the webhook request and payload look correct. +- In the **Hookdeck Dashboard**, open **your Outpost project** and review **logs** (and any deliveries or event views your project exposes) to confirm the event was processed and delivered. + +## Next steps + +- [Destination types](/docs/outpost/overview#supported-destinations) — webhooks, AWS SQS, RabbitMQ, Hookdeck, and more +- [Tenant user portal](/docs/outpost/features/tenant-user-portal) — optional UI for tenants to manage their own destinations +- [SDKs](/docs/outpost/sdks) — TypeScript, Python, Go, and others +- [API reference](/docs/outpost/api) — full REST API diff --git a/docs/content/quickstarts/hookdeck-outpost-go.mdoc b/docs/content/quickstarts/hookdeck-outpost-go.mdoc new file mode 100644 index 000000000..1b8f999d4 --- /dev/null +++ b/docs/content/quickstarts/hookdeck-outpost-go.mdoc @@ -0,0 +1,163 @@ +--- +title: "Hookdeck Outpost Quickstart: Go" +--- + +[Hookdeck Outpost](https://outpost.hookdeck.com) is Hookdeck’s managed [Outpost](https://github.com/hookdeck/outpost) service. Use **tenants** for each customer, **destinations** for delivery targets, and **topics** aligned with your dashboard configuration. 
+ +## Prerequisites + +- A Hookdeck account with an Outpost project +- An **API key** (Outpost API key) from your project: **Settings → Secrets** +- **Topics** already configured in the dashboard +- [Go](https://go.dev/) 1.22+ recommended +- API base URL: `https://api.outpost.hookdeck.com/2025-07-01` + +## Install the SDK + +```sh +go get github.com/hookdeck/outpost/sdks/outpost-go +``` + +## Set up credentials + +In the Hookdeck Dashboard, open your Outpost project, go to **Settings → Secrets**, and create or copy an API key. Export it (and optionally the base URL) in your shell: + +```sh +export OUTPOST_API_KEY="your_api_key" +export OUTPOST_API_BASE_URL="https://api.outpost.hookdeck.com/2025-07-01" +``` + +If `OUTPOST_API_BASE_URL` is unset, the SDK uses its default production server URL. + +## Set environment variables + +Set these in the shell where you run `go run .` (or inject them the way your deployment platform expects). + +1. **`OUTPOST_API_KEY`** — **Required.** From **Settings → Secrets**. The program exits if it is missing. + +2. **`OUTPOST_API_BASE_URL`** — **Optional.** When set, the client is configured with `WithServerURL`. Otherwise the Go SDK uses its default Hookdeck Outpost production URL. + +3. **`OUTPOST_TEST_WEBHOOK_URL`** — **Required for this walkthrough.** Webhook destination URL (HTTPS). Use your own server or a [Hookdeck Console](https://console.hookdeck.com?ref=outpost-docs) **Source** URL for a quick test. + +## Create and run the quickstart program + +Use `main.go` in a small module (after `go get github.com/hookdeck/outpost/sdks/outpost-go`). + +The program (**1)** configures the client with your API key, (**2)** upserts a tenant, (**3)** creates a webhook destination for your topic, (**4)** publishes one event, and (**5)** prints ids. 
+ +```go +package main + +import ( + "context" + "fmt" + "log" + "os" + + outpostgo "github.com/hookdeck/outpost/sdks/outpost-go" + "github.com/hookdeck/outpost/sdks/outpost-go/models/components" +) + +func main() { + ctx := context.Background() + + // + // --- 1. Authenticated client (API key from Settings → Secrets) --- + // + + apiKey := os.Getenv("OUTPOST_API_KEY") + if apiKey == "" { + log.Fatal("Set OUTPOST_API_KEY") + } + + opts := []outpostgo.SDKOption{outpostgo.WithSecurity(apiKey)} + if base := os.Getenv("OUTPOST_API_BASE_URL"); base != "" { + opts = append(opts, outpostgo.WithServerURL(base)) + } + + s := outpostgo.New(opts...) + + // + // --- 2. Tenant id, topic name, and webhook URL (from env) --- + // + // tenantID = one of your customers in Outpost. + // topic = must match a topic configured in the dashboard. + // + + tenantID := "customer_acme_001" + topic := "user.created" + + webhookURL := os.Getenv("OUTPOST_TEST_WEBHOOK_URL") + if webhookURL == "" { + log.Fatal("Set OUTPOST_TEST_WEBHOOK_URL (e.g. a Hookdeck Console Source URL)") + } + + // + // --- 3. Create or update the tenant --- + // + + if _, err := s.Tenants.Upsert(ctx, tenantID, nil); err != nil { + log.Fatal(err) + } + + // + // --- 4. Webhook destination: events on `topic` are POSTed to this URL --- + // + + destBody := components.CreateDestinationCreateWebhook( + components.DestinationCreateWebhook{ + Topics: components.CreateTopicsArrayOfStr([]string{topic}), + Config: components.WebhookConfig{URL: webhookURL}, + }, + ) + + createRes, err := s.Destinations.Create(ctx, tenantID, destBody) + if err != nil { + log.Fatal(err) + } + + if createRes != nil && createRes.GetDestinationWebhook() != nil { + fmt.Println("Destination id:", createRes.GetDestinationWebhook().GetID()) + } + + // + // --- 5. 
Publish one event --- + // + + pubRes, err := s.Publish.Event(ctx, components.PublishRequest{ + TenantID: outpostgo.String(tenantID), + Topic: outpostgo.String(topic), + EligibleForRetry: outpostgo.Bool(true), + Metadata: map[string]string{"source": "quickstart"}, + Data: map[string]any{"user_id": "user_123"}, + }) + + if err != nil { + log.Fatal(err) + } + + if pubRes != nil && pubRes.GetPublishResponse() != nil { + fmt.Println("Published event id:", pubRes.GetPublishResponse().GetID()) + } +} +``` + +Run: + +```sh +go run . +``` + +For all topics on that destination, use `components.CreateTopicsTopicsEnum(components.TopicsEnumWildcard)` instead of `CreateTopicsArrayOfStr`. + +## Verify delivery + +- In **Hookdeck Console**, confirm the webhook hit your test URL. +- In the **Hookdeck Dashboard**, open **your Outpost project** and review **logs** to confirm the event was processed and delivered. + +## Next steps + +- [Destination types](/docs/outpost/overview#supported-destinations) +- [Tenant user portal](/docs/outpost/features/tenant-user-portal) +- [SDKs](/docs/outpost/sdks) +- [API reference](/docs/outpost/api) diff --git a/docs/content/quickstarts/hookdeck-outpost-python.mdoc b/docs/content/quickstarts/hookdeck-outpost-python.mdoc new file mode 100644 index 000000000..37a627001 --- /dev/null +++ b/docs/content/quickstarts/hookdeck-outpost-python.mdoc @@ -0,0 +1,134 @@ +--- +title: "Hookdeck Outpost Quickstart: Python" +--- + +[Hookdeck Outpost](https://outpost.hookdeck.com) is Hookdeck’s managed [Outpost](https://github.com/hookdeck/outpost) service. Each **tenant** is one of your customers; **destinations** receive events; **topics** must match what you configured in the dashboard. 
+ +## Prerequisites + +- A Hookdeck account with an Outpost project +- An **API key** (Outpost API key) from your project: **Settings → Secrets** +- **Topics** already configured in the dashboard +- Python 3.9+ recommended +- API base URL: `https://api.outpost.hookdeck.com/2025-07-01` + +## Install the SDK + +```sh +pip install outpost_sdk +``` + +## Set up credentials + +In the Hookdeck Dashboard, open your Outpost project, go to **Settings → Secrets**, and create or copy an API key. Export it (and optionally the base URL) in your shell: + +```sh +export OUTPOST_API_KEY="your_api_key" +export OUTPOST_API_BASE_URL="https://api.outpost.hookdeck.com/2025-07-01" +``` + +The SDK defaults to the production API base URL when `server_url` is omitted. + +## Set environment variables + +Set these in the same shell before you run the script (or load them with your preferred `.env` helper). + +1. **`OUTPOST_API_KEY`** — **Required.** From **Settings → Secrets**. Without it the script exits, because every API call must be authenticated. + +2. **`OUTPOST_API_BASE_URL`** — **Optional.** Passed through as `server_url` on the client. Omit it to use the SDK default production URL for Hookdeck Outpost. + +3. **`OUTPOST_TEST_WEBHOOK_URL`** — **Required for this walkthrough.** Webhook destinations need an HTTPS URL. Use your own endpoint or a [Hookdeck Console](https://console.hookdeck.com?ref=outpost-docs) **Source** URL for a quick, no-server test. + +## Create and run the quickstart script + +Save as `outpost_quickstart.py`. + +The script (**1)** creates an authenticated client, (**2)** upserts a tenant, (**3)** creates a webhook destination subscribed to your topic, (**4)** publishes one test event, and (**5)** prints the event id. + +```python +import os + +from outpost_sdk import Outpost + +# +# --- 1. 
Authenticated client (API key from Settings → Secrets) --- +# + +api_key = os.environ.get("OUTPOST_API_KEY") +if not api_key: + raise SystemExit("Set OUTPOST_API_KEY") + +base_url = os.environ.get("OUTPOST_API_BASE_URL") +client = Outpost(api_key=api_key, server_url=base_url) + +# +# --- 2. Tenant id, topic name, and webhook URL (from env) --- +# +# tenant_id = one of your customers in Outpost. +# topic = must match a topic configured in the dashboard. +# + +tenant_id = "customer_acme_001" +topic = "user.created" + +webhook_url = os.environ.get("OUTPOST_TEST_WEBHOOK_URL") +if not webhook_url: + raise SystemExit( + "Set OUTPOST_TEST_WEBHOOK_URL (e.g. a Hookdeck Console Source URL)" + ) + +# +# --- 3. Create or update the tenant --- +# + +client.tenants.upsert(tenant_id=tenant_id) + +# +# --- 4. Webhook destination: events on `topic` are POSTed to this URL --- +# + +client.destinations.create( + tenant_id=tenant_id, + body={ + "type": "webhook", + "topics": [topic], + "config": {"url": webhook_url}, + }, +) + +# +# --- 5. Publish one event --- +# + +published = client.publish.event( + request={ + "tenant_id": tenant_id, + "topic": topic, + "eligible_for_retry": True, + "metadata": {"source": "quickstart"}, + "data": {"user_id": "user_123"}, + } +) + +print("Published event id:", published.id) +``` + +Run: + +```sh +python outpost_quickstart.py +``` + +Use `topics: ["*"]` on the destination to receive all configured topics. + +## Verify delivery + +- In **Hookdeck Console**, confirm the webhook hit your test URL. +- In the **Hookdeck Dashboard**, open **your Outpost project** and review **logs** to confirm the event was processed and delivered. 
+ +## Next steps + +- [Destination types](/docs/outpost/overview#supported-destinations) +- [Tenant user portal](/docs/outpost/features/tenant-user-portal) +- [SDKs](/docs/outpost/sdks) +- [API reference](/docs/outpost/api) diff --git a/docs/content/quickstarts/hookdeck-outpost-typescript.mdoc b/docs/content/quickstarts/hookdeck-outpost-typescript.mdoc new file mode 100644 index 000000000..c58381103 --- /dev/null +++ b/docs/content/quickstarts/hookdeck-outpost-typescript.mdoc @@ -0,0 +1,135 @@ +--- +title: "Hookdeck Outpost Quickstart: TypeScript" +--- + +[Hookdeck Outpost](https://outpost.hookdeck.com) is Hookdeck’s managed [Outpost](https://github.com/hookdeck/outpost) service. Each **tenant** represents one of your platform’s customers; **destinations** are where events are delivered; **topics** route events to the right destinations. + +This quickstart uses the official TypeScript SDK. Configure **topics** in the Hookdeck dashboard before publishing—use a topic name that exists there in the code below. + +## Prerequisites + +- A Hookdeck account with an Outpost project +- An **API key** (Outpost API key) from your project: **Settings → Secrets** +- **Topics** already configured in the dashboard +- [Node.js](https://nodejs.org/) 18+ recommended +- API base URL: `https://api.outpost.hookdeck.com/2025-07-01` + +## Install the SDK + +```sh +npm install @hookdeck/outpost-sdk +``` + +## Set up credentials + +In the Hookdeck Dashboard, open your Outpost project, go to **Settings → Secrets**, and create or copy an API key. Export it (and optionally the base URL) in your shell: + +```sh +export OUTPOST_API_KEY="your_api_key" +export OUTPOST_API_BASE_URL="https://api.outpost.hookdeck.com/2025-07-01" +``` + +The SDK defaults to the production API base URL, so `OUTPOST_API_BASE_URL` is only needed if you want to be explicit or point at another environment. 
+ +## Set environment variables + +Before you run the quickstart script, define these in the same terminal session (or load them from a `.env` file if your tooling supports it). + +1. **`OUTPOST_API_KEY`** — **Required.** Copy the Outpost API key from **Settings → Secrets** in your project. The script passes this to the SDK as the Bearer token. Without it, the script stops with an error. + +2. **`OUTPOST_API_BASE_URL`** — **Optional.** Only set this if you need to override the API host. For Hookdeck Outpost you can omit it entirely: the SDK already uses `https://api.outpost.hookdeck.com/2025-07-01`. + +3. **`OUTPOST_TEST_WEBHOOK_URL`** — **Required for this walkthrough.** The script creates a webhook destination, which must point at an HTTPS URL. Easiest path: open [Hookdeck Console](https://console.hookdeck.com?ref=outpost-docs), create a **Source**, copy its URL, and assign it to this variable so you can see the webhook payload without deploying your own server. + +## Create and run the quickstart script + +Save the following as `outpost-quickstart.ts`. + +The script (**1)** builds an authenticated SDK client, (**2)** ensures a tenant exists, (**3)** adds a webhook destination subscribed to your topic, (**4)** publishes one test event, and (**5)** prints the event id. + +```typescript +import { Outpost } from "@hookdeck/outpost-sdk"; + +// +// --- 1. Authenticated client (API key from Settings → Secrets) --- +// + +const apiKey = process.env.OUTPOST_API_KEY; +if (!apiKey) { + throw new Error("Set OUTPOST_API_KEY"); +} + +const outpost = new Outpost({ + apiKey, + ...(process.env.OUTPOST_API_BASE_URL + ? { serverURL: process.env.OUTPOST_API_BASE_URL } + : {}), +}); + +// +// --- 2. Tenant id, topic name, and webhook URL (from env) --- +// +// tenantId = one of your customers in Outpost. +// topic = must match a topic configured in the dashboard. 
+// + +const tenantId = "customer_acme_001"; +const topic = "user.created"; + +const webhookUrl = process.env.OUTPOST_TEST_WEBHOOK_URL; +if (!webhookUrl) { + throw new Error( + "Set OUTPOST_TEST_WEBHOOK_URL to an HTTPS endpoint (e.g. a Hookdeck Console Source URL)", + ); +} + +// +// --- 3. Create or update the tenant --- +// + +await outpost.tenants.upsert(tenantId); + +// +// --- 4. Webhook destination: Outpost delivers events on `topic` to this URL --- +// + +await outpost.destinations.create(tenantId, { + type: "webhook", + topics: [topic], + config: { url: webhookUrl }, +}); + +// +// --- 5. Publish one event (delivered to destinations subscribed to `topic`) --- +// + +const published = await outpost.publish.event({ + tenantId, + topic, + eligibleForRetry: true, + metadata: { source: "quickstart" }, + data: { user_id: "user_123" }, +}); + +console.log("Published event id:", published.id); +``` + +Run: + +```sh +npx tsx outpost-quickstart.ts +``` + +To subscribe the destination to all topics, pass `topics: ["*"]` instead of `[topic]`. + +## Verify delivery + +- In **Hookdeck Console**, inspect the Source or connection you used for `OUTPOST_TEST_WEBHOOK_URL` and confirm the webhook request arrived as expected. +- In the **Hookdeck Dashboard**, open **your Outpost project** and review **logs** to confirm the event was processed and delivered. + +## Next steps + +- [Destination types](/docs/outpost/overview#supported-destinations) +- [Tenant user portal](/docs/outpost/features/tenant-user-portal) +- [SDKs](/docs/outpost/sdks) +- [API reference](/docs/outpost/api) diff --git a/docs/content/quickstarts/overview.mdoc b/docs/content/quickstarts/overview.mdoc index aae1e4e94..ef794eba9 100644 --- a/docs/content/quickstarts/overview.mdoc +++ b/docs/content/quickstarts/overview.mdoc @@ -1,22 +1,22 @@ --- -title: "Quickstart" -description: "A 0 to guide to implement Outpost in your application to send webhooks and events to your end-users." 
+title: "Outpost Quickstarts" +description: "Hookdeck managed Outpost and self-hosted deployment quickstarts." --- -TODO: Add quickstart. Explain the following steps +## Hookdeck Outpost (managed) -NOTICE: Ask you agent to do the work with prompt +Use Hookdeck’s hosted Outpost API with your dashboard API key and preconfigured topics: -- Deploy or create a Hookdeck account/project -- Set your topics (optional) -- Create your first tenant - - Bring your own ID - - Tip: Set a metadata `name` to set a display name in the dashboard -- Create your first destination -- Publish your first event +- [curl](/docs/outpost/quickstarts/hookdeck-outpost-curl) +- [TypeScript](/docs/outpost/quickstarts/hookdeck-outpost-typescript) +- [Python](/docs/outpost/quickstarts/hookdeck-outpost-python) +- [Go](/docs/outpost/quickstarts/hookdeck-outpost-go) +- [Agent prompt template](/docs/outpost/quickstarts/hookdeck-outpost-agent-prompt) (for AI-assisted integration) -- Whats next - - Building a UI or exposing the portal to your users - - Set your delivery configuration (user-agent, http headers, signature format, etc) - - Import your existing tenants and webhooks - - Production checklist +## Self-hosted + +Run Outpost in your own infrastructure: + +- [Docker with RabbitMQ or AWS SQS via LocalStack](/docs/outpost/self-hosting/quickstarts/docker) +- [Kubernetes with RabbitMQ](/docs/outpost/self-hosting/quickstarts/kubernetes) +- [Railway](/docs/outpost/self-hosting/quickstarts/railway) diff --git a/docs/content/redirects.json b/docs/content/redirects.json index 37ad58938..cc0570e92 100644 --- a/docs/content/redirects.json +++ b/docs/content/redirects.json @@ -21,7 +21,7 @@ }, { "from": "/docs/outpost/destinations", - "to": "/docs/outpost/overview" + "to": "/docs/outpost/overview#supported-destinations" }, { "from": "/docs/outpost/guides", diff --git a/docs/content/self-hosting/configuration.mdoc b/docs/content/self-hosting/configuration.mdoc index ec8e9126c..b7af25c83 100644 --- 
a/docs/content/self-hosting/configuration.mdoc +++ b/docs/content/self-hosting/configuration.mdoc @@ -102,6 +102,13 @@ Choose one for event log persistence: | `ALERT_CONSECUTIVE_FAILURE_COUNT` | `20` | Consecutive failures before alert triggers | | `ALERT_AUTO_DISABLE_DESTINATION` | `true` | Auto-disable destination when failure count reaches 100% | +## Destinations + +| Variable | Default | Description | +|----------|---------|-------------| +| `MAX_DESTINATIONS_PER_TENANT` | `20` | Maximum destinations each tenant may create. Set as low as is practical for your product to limit abuse and load; lowering this value later does **not** remove destinations that already exist. | +| `DESTINATIONS_METADATA_PATH` | — | Optional. Filesystem path to a directory of [custom destination metadata](https://github.com/hookdeck/outpost/tree/main/internal/destregistry/metadata/providers) (per-type `metadata.json` and `instructions.md`). Non-core fields such as `label`, `description`, `icon`, and `instructions` can be customized; `config_fields` and `credential_fields` cannot be overridden. | + ## Webhook Behavior | Variable | Default | Description | diff --git a/examples/azure/README.md b/examples/azure/README.md index b2434da4f..12d9a8e9a 100644 --- a/examples/azure/README.md +++ b/examples/azure/README.md @@ -368,7 +368,7 @@ For most users, `azure-deploy.sh` offers a balance of automation, reliability, a If you are not using the `dependencies.sh` and `local-deploy.sh` scripts to provision your infrastructure, you will need to create the `.env.outpost` and `.env.runtime` files manually. -See the [Configure Azure Service Bus as the Outpost Internal Message Queue](https://outpost.hookdeck.com/docs/guides/service-bus-internal-mq) guide for more details on the environment variables required for Outpost and how to create the values. 
+See the [Configure Azure Service Bus as the Outpost Internal Message Queue](https://hookdeck.com/docs/outpost/self-hosting/guides/service-bus-internal-mq) guide for more details on the environment variables required for Outpost and how to create the values. ### `.env.outpost` diff --git a/examples/demos/dashboard-integration/README.md b/examples/demos/dashboard-integration/README.md index a8026b4e9..f085ac596 100644 --- a/examples/demos/dashboard-integration/README.md +++ b/examples/demos/dashboard-integration/README.md @@ -49,7 +49,7 @@ A Next.js application demonstrating how to integrate Outpost with an API platfor TOPICS=user.created,user.updated,order.completed,payment.processed,subscription.created ``` - For a full list of Outpost configuration options, see [Outpost Configuration](https://outpost.hookdeck.com/docs/references/configuration) + For a full list of Outpost configuration options, see [Outpost Configuration](https://hookdeck.com/docs/outpost/self-hosting/configuration) 4. **Start the complete stack** (PostgreSQL, Redis, RabbitMQ, and Outpost): diff --git a/examples/kubernetes/README.md b/examples/kubernetes/README.md index 14519f954..5cfe9d147 100644 --- a/examples/kubernetes/README.md +++ b/examples/kubernetes/README.md @@ -1 +1 @@ -See https://outpost.hookdeck.com/docs/quickstarts/kubernetes \ No newline at end of file +See https://hookdeck.com/docs/outpost/self-hosting/quickstarts/kubernetes \ No newline at end of file