feat: Hookdeck Outpost quickstarts and agent onboarding prompt (#815)
Merged
alexbouchardd merged 48 commits into feat/refactor-docs on Apr 12, 2026
Conversation
Add self-contained quickstarts for curl, TypeScript, Python, and Go against the managed API, with Settings → Secrets, env-based examples, and verification via Hookdeck Console and project logs. Nest Quickstarts nav under Hookdeck Outpost (above Self-Hosted) and add an agent prompt template page for dashboard copy/paste. Include TEMP-hookdeck-outpost-onboarding-status.md for GA tracking. Made-with: Cursor
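The env-based pattern the quickstarts follow can be sketched as below. The `/publish` path, payload shape, and helper name are illustrative assumptions, not the documented API contract; only `OUTPOST_API_BASE_URL` and `OUTPOST_API_KEY` come from this PR.

```typescript
// Sketch only: build a publish request from environment configuration.
// The `/publish` path and body shape are assumptions for illustration.
type PublishRequest = { url: string; headers: Record<string, string>; body: string };

function buildPublishRequest(env: Record<string, string | undefined>): PublishRequest {
  const base = env.OUTPOST_API_BASE_URL;
  const key = env.OUTPOST_API_KEY;
  if (!base || !key) {
    throw new Error("OUTPOST_API_BASE_URL and OUTPOST_API_KEY must be set");
  }
  return {
    url: `${base}/publish`,
    headers: {
      Authorization: `Bearer ${key}`,
      "Content-Type": "application/json",
    },
    // The curl quickstart notes an HTTP 202 acknowledgement on success.
    body: JSON.stringify({ topic: "user.created", data: { example: true } }),
  };
}
```

After publishing, the quickstarts verify delivery via the Hookdeck Console and project logs rather than inspecting the response body.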
- Claude Agent SDK runner with explicit --scenario/--scenarios/--all, per-run workspace
- Heuristic + LLM scoring vs scenario Success criteria; score-transcript 01-10
- Scenarios: basics, minimal apps, existing-app integration baselines
- CI slice (eval:ci), SCENARIO-RUN-TRACKER, prompt template "Files on disk" guidance
- Allow committing docs/**/.env.example under docs/.gitignore
- TEMP status and README updates

Made-with: Cursor
…tracker
- Agent prompt: language implies SDK; simplest path defaults to curl; option 2/3 framework mapping; warn on sdks.mdx vs per-language quickstarts.
- Curl quickstart: shell script notes (HTTP 202, portable body/status split).
- run-agent-eval: PreToolUse write guard, default EVAL_MAX_TURNS 80, local docs block aligned with prompt; scenario heuristic fix for publish data key escaping.
- Scenarios 01-10: realistic short user turns; success-criteria fixes where needed.
- SCENARIO-RUN-TRACKER: cleared run results for a fresh pass; action items reset.
- README and .env.example updates for the eval harness as applicable.

Made-with: Cursor
…lock
- Point local EVAL_LOCAL_DOCS guidance at the full curl quickstart instead
- Reword scenario 01 execution criteria to reference the quickstart/OpenAPI
Made-with: Cursor
- Expand concepts with the SaaS/platform flow; refine building-your-own-ui (API root, paths, no localhost:3333 in examples)
- Agent prompt: link concepts, UI guide, topics; tighten option-2 guidance
- Eval harness: local docs list includes concepts, building-your-own-ui, topics
- SCENARIO-RUN-TRACKER: scenario 05 assessment for 17-21-22 run, heuristic notes
- Minor scenario 05 doc tweak

Made-with: Cursor
GET /topics returns a JSON array of topic names (OpenAPI). The React snippet incorrectly treated items as objects with id and name, which misled readers and agent integrations. Use the string as key, value, and label to match the API and TypeScript SDK (topicsList → Array<string>). Made-with: Cursor
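The corrected mapping can be sketched in plain TypeScript (the actual change is a React snippet; `toTopicOptions` is a hypothetical helper name used here for illustration):

```typescript
// GET /topics (and the TypeScript SDK's topicsList) yield Array<string>,
// so the topic string itself serves as key, value, and label.
type TopicOption = { key: string; value: string; label: string };

function toTopicOptions(topics: string[]): TopicOption[] {
  // Wrong (the old snippet): treating items as { id, name } objects.
  // Right: items are plain strings.
  return topics.map((t) => ({ key: t, value: t, label: t }));
}
```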
- Add eval-harness.ts to parse eval-harness fenced JSON (git_clone + agentCwd). - Runner applies pre-steps per scenario, sets agent cwd and write guard to the run directory, passes scenario markdown once into runOneScenario. - Transcript meta includes evalHarness summary; document EVAL_SKIP_HARNESS_PRE_STEPS. Made-with: Cursor
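As a sketch, a scenario's fenced harness block might look like the following. The fence name and the two keys come from these commits; the exact schema is an assumption, and the clone URL shown is the scenario 09 baseline pinned later in this PR:

```eval-harness
{
  "git_clone": "https://github.com/fastapi/full-stack-fastapi-template",
  "agentCwd": "full-stack-fastapi-template"
}
```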
- Add ## Eval harness JSON (git_clone + agentCwd) for Next.js, FastAPI, Go baselines. - Turn 1 stays in-user voice (repo present) without naming the eval harness. - Align Automated eval and success criteria with pre-cloned workspace model. Made-with: Cursor
Expand the copy-paste agent template so existing apps with a product UI wire backend (BFF, server SDK) and frontend (calls own API only). Point to Concepts and Building your own UI before destination screens; allow API-only path when there is no customer UI. Made-with: Cursor
Pin scenario 09 to fastapi/full-stack-fastapi-template (React + Pydantic v2). Update scoreScenario09 baseline check, README index, TEMP onboarding status, and SCENARIO-RUN-TRACKER notes. Optional clone URL override: EVAL_FASTAPI_BASELINE_URL. Made-with: Cursor
Run 2026-04-09T22-16-54-750Z-scenario-09: heuristic 6/6, LLM pass. Point execution notes to prior Docker smoke on 20-48 stamp. Made-with: Cursor
Reword for customer-facing UI builders: clearer tenant/auth framing, configurable API base URL, less internal jargon and emphasis noise. Add implementation checklists for planning, destinations, activity, and safe rendering without duplicating the OpenAPI mapping tables. Made-with: Cursor
- Agent prompt: topic reconciliation, domain vs test publish, full-stack UI guidance; remove eval-flavored Turn 0 / next-run wording in the template.
- score-transcript: publish_beyond_test_only for 08/09/10 (domain publish).
- Scenarios + README: success criteria and Turn 1 nudges match the prompt.
- SCENARIO-RUN-TRACKER: scenario 09 review notes marked resolved.

Made-with: Cursor
Rewrite Turn 1 blockquotes as natural operator speech; drop Option 3, Turn 0, and prompt-section references. Align success-criteria wording with configured onboarding topics. Tracker references user-turn scripts. Made-with: Cursor
Add no_client_bundled_outpost_key and readme_or_env_docs checks to scoreScenario09 (align with full-stack success criteria). Made-with: Cursor
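A heuristic like no_client_bundled_outpost_key could look roughly like this; the function and path conventions below are hypothetical, not the real scoreScenario09 code:

```typescript
// Sketch: flag client-side source files that appear to reference the
// Outpost API key, which should stay server-side (BFF / server SDK).
function hasClientBundledKey(files: Record<string, string>): boolean {
  const isClientPath = (p: string) =>
    p.includes("/frontend/") || p.includes("/client/") || p.includes("/src/pages/");
  return Object.entries(files).some(
    ([path, source]) => isClientPath(path) && source.includes("OUTPOST_API_KEY"),
  );
}
```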
Write eval-run-started.json at scenario start; eval-failure.json on uncaught errors; eval-aborted.json on SIGTERM/SIGINT. Register signal handlers so interrupted runs leave a trace (SIGKILL still silent). Made-with: Cursor
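The interruption trace could be sketched as follows; the sidecar file name matches the commit, while the payload fields and function names are assumptions:

```typescript
import { writeFileSync } from "node:fs";
import { join } from "node:path";

// Sketch: leave a trace when a run is interrupted. SIGKILL cannot be
// trapped, so a kill -9 still leaves no sidecar.
function buildAbortPayload(signal: string, runDirectory: string) {
  return { signal, runDirectory, abortedAt: new Date().toISOString() };
}

function registerAbortSidecar(resultsDir: string, runDirectory: string): void {
  for (const signal of ["SIGTERM", "SIGINT"]) {
    process.on(signal, () => {
      writeFileSync(
        join(resultsDir, "eval-aborted.json"),
        JSON.stringify(buildAbortPayload(signal, runDirectory), null, 2),
      );
      process.exit(1); // exit after writing so the interrupted run still terminates
    });
  }
}
```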
Add docs/agent-evaluation/AGENTS.md (anti-leakage checklist), root AGENTS.md pointer, and a Cursor rule scoped to docs/agent-evaluation/. Document run sidecars, re-scoring, integration verification wording, and scenario 09 heuristic summary. Fix placeholder fixtures markdown. Made-with: Cursor
Restrict PreToolUse Read/Glob/Grep to the run directory (and docs/ when EVAL_LOCAL_DOCS). Block Bash that touches the monorepo root outside those areas; deny Agent unless EVAL_ALLOW_AGENT_TOOL. Split read vs write guard env vars. Write eval-started, eval-failure, and eval-aborted next to the run folder under results/runs/ so the agent cannot read harness metadata. SIGTERM/SIGINT abort payload includes runDirectory. Made-with: Cursor
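A read-guard predicate along these lines (names hypothetical, not the actual hook) keeps reads inside the run directory, with docs/ allowed only in local-docs mode:

```typescript
import { resolve, sep } from "node:path";

// Sketch of a path guard: a target is allowed only if it resolves inside
// the run directory (or docs/ when local docs mode is enabled). Appending
// `sep` prevents prefix tricks like /runs/r1-evil matching /runs/r1.
function isReadAllowed(
  target: string,
  runDir: string,
  opts: { localDocs?: boolean; docsDir?: string } = {},
): boolean {
  const within = (dir: string) => {
    const abs = resolve(target);
    const base = resolve(dir);
    return abs === base || abs.startsWith(base + sep);
  };
  if (within(runDir)) return true;
  if (opts.localDocs && opts.docsDir && within(opts.docsDir)) return true;
  return false;
}
```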
Describe sibling *.eval-*.json harness files and expanded PreToolUse permissions (read guard, bash, Agent tool). Made-with: Cursor
Record 2026-04-10 run, quickstart.sh artifact, execution smoke test, and sibling harness sidecar layout. Made-with: Cursor
Update README, OpenAPI contact URL, entrypoint migration hint, and example READMEs so public links match Outpost docs on Hookdeck. Made-with: Cursor
- Default EVAL_DOCS_URL to https://hookdeck.com/docs/outpost
- Replace the invalid destinations directory path with the overview + webhook mdoc
- Document placeholder examples in the agent prompt and fixtures

Made-with: Cursor
- Point scenario and script links at docs/content paths (.mdoc)
- Update SCENARIO-RUN-TRACKER for the latest heuristic-pass runs
- Revise README and AGENTS for the current layout
- Remove SKILL-UPSTREAM-NOTES (obsolete)

Made-with: Cursor
Log 2026-04-10T22-14-20-704Z-scenario-10 with heuristic/LLM/execution results and execution notes (Go baseline, signup smoke, Hookdeck probe). Made-with: Cursor
Add docs-agent-eval-ci.yml: scenarios 01+02 with EVAL_LOCAL_DOCS, heuristic + LLM judge, then execute-ci-artifacts.sh (curl + TypeScript) using OUTPOST_API_KEY. Trigger on docs content/apis, agent-evaluation harness (ignoring tracker/results README noise), TypeScript SDK, and workflow edits. Ignore .env.ci for local secret template; document secrets and execution in README. Made-with: Cursor
GitHub rejects paths + paths-ignore on the same event; drop paths-ignore. README: manual workflow_dispatch; note broader path matches. Made-with: Cursor
Node parseArgs treats a bare -- as starting positionals; --scenarios then failed with ERR_PARSE_ARGS_UNEXPECTED_POSITIONAL in CI. Made-with: Cursor
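For reference, this is the parseArgs behavior in question, with one possible workaround (a sketch; the runner's actual fix may differ). With strict parsing, the default, and no positionals allowed, everything after a bare `--` is treated as a positional and throws:

```typescript
import { parseArgs } from "node:util";

// Workaround sketch: strip a leading bare `--` before handing argv to
// parseArgs, so flags after it are still parsed as options.
function parseScenarioFlags(argv: string[]) {
  const args = argv[0] === "--" ? argv.slice(1) : argv;
  return parseArgs({
    args,
    options: {
      scenario: { type: "string" },
      scenarios: { type: "string" },
      all: { type: "boolean" },
    },
  });
}
```

Alternatively, setting `allowPositionals: true` accepts the tokens after `--` as positionals instead of throwing, at the cost of no longer treating `--scenarios` there as an option.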
- execute-ci-artifacts: EVAL_TEST_DESTINATION_URL fallback for the webhook URL; default OUTPOST_API_BASE_URL with `:=` (an empty .env no longer strips the version path); clearer errors on shell/TS failure
- Add smoke-test-execute-ci-artifacts.sh + `npm run smoke:execute-ci` (topics `*`, loads .env then .env.ci)
- CI execution step: OUTPOST_API_BASE_URL + OUTPOST_CI_PUBLISH_TOPIC
- README troubleshooting (404) and .env.example OUTPOST_CI_PUBLISH_TOPIC

Made-with: Cursor
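The `:=` detail matters because shell `${VAR=default}` applies only when the variable is unset, while `${VAR:=default}` also covers the empty string a blank .env line produces. The same unset-versus-empty distinction shows up in TypeScript (`??` vs `||`); the default URL below is illustrative:

```typescript
// Empty vs unset: `??` falls back only on null/undefined, so an empty
// OUTPOST_API_BASE_URL (e.g. from a blank .env line) would survive and
// strip the version path. `||` also falls back on "", matching shell `:=`.
function apiBaseUrl(env: Record<string, string | undefined>): string {
  const fallback = "https://example.test/api/v1"; // illustrative default
  return env.OUTPOST_API_BASE_URL || fallback;
}
```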
TL;DR
- Quickstarts and a copy-paste agent prompt template (`{{PLACEHOLDERS}}` for API base, topics, test destination, docs URL).
- `docs/agent-evaluation/`: 10 scenarios (basics → app stacks → integrate-into-existing), Claude Agent SDK runner, heuristic transcript scoring + LLM judge, lifecycle sidecars, read/bash sandbox, authoring rules (`AGENTS.md`, Cursor rule).
- CI slice runs scenarios against local docs (`EVAL_LOCAL_DOCS=1`), then executes generated curl + TypeScript against live Outpost; supports `workflow_dispatch` for manual runs.
- OpenAPI spec documents `DestinationSchemaField.key` so the published spec matches the API.

Stacking: This PR targets `feat/refactor-docs`. The broader docs platform / content restructure (Markdoc layout, nav, redirects migration, etc.) is not introduced here; it lives on the base branch. This branch adds quickstarts, the onboarding prompt, the eval harness, CI, and targeted doc updates on top of that foundation.

Goals
Hookdeck quickstarts and agent prompt
- `docs/content/quickstarts/` (curl, TypeScript, Python, Go, overview).
- `docs/content/quickstarts/hookdeck-outpost-agent-prompt.mdoc`: copy-paste template with explicit rules (no API key in chat, topics reconciliation, test destination, links into the rest of the docs).
- `{{DOCS_URL}}` and related copy aligned with production Hookdeck docs URLs where intended.
- Topics snippet aligned with the `string[]` API shape.

Agent evaluation (`docs/agent-evaluation/`)

- Scenarios `01`–`10` under `scenarios/`: basics (curl/TS/Python/Go), app templates (Next.js, FastAPI, Go HTTP), existing-app integration (08–10).
- Runner (`run-agent-eval.ts`), harness (`eval-harness.ts`), declarative Eval sections / pre-steps, sandboxed read/bash.
- Scoring: `score-transcript.ts` (per-scenario heuristics, including 08–10 `publish_beyond_test_only`), optional `llm-judge.ts` against each scenario's Success criteria.
- `SCENARIO-RUN-TRACKER.md`, `results/` layout + templates, `.env.example`, `AGENTS.md`, `.cursor/rules/agent-evaluation-authoring.mdc`.
- `execute-ci-artifacts.sh` runs generated curl + TypeScript against secrets; `smoke-test-execute-ci-artifacts.sh` + `npm run smoke:execute-ci` for local verification.

Local runs are opt-in per scenario (`--scenario`/`--scenarios`/`--all`); the full suite and 08–10 are slow by design (clones, installs, multi-turn agents).

Here are some examples of what was built:
Basic with some UI
Integration into SaaS templates
CI
- `.github/workflows/docs-agent-eval-ci.yml`: Docs agent eval (CI slice).
- Triggers: `push` to `main`, `pull_request` (same-repo only for the eval job), and `workflow_dispatch`.
- `ci-eval.sh` → `npm run eval:ci` (scenarios 01, 02 with heuristic + LLM judge), then `execute-ci-artifacts.sh` with `OUTPOST_API_KEY`, the webhook URL from `EVAL_TEST_DESTINATION_URL`, and explicit `OUTPOST_API_BASE_URL`/`OUTPOST_CI_PUBLISH_TOPIC`.

API / SDK docs

- `docs/apis/openapi.yaml`: document `DestinationSchemaField.key` (and related spec consistency).

How to review
- Read `hookdeck-outpost-agent-prompt.mdoc` and one language quickstart end-to-end; check placeholders and links.
- Read `docs/agent-evaluation/README.md` + `AGENTS.md`; spot-check a scenario file's Success criteria vs `score-transcript.ts` / judge behavior.
- CI needs `ANTHROPIC_API_KEY`, `EVAL_TEST_DESTINATION_URL`, and `OUTPOST_API_KEY`; optional manual `workflow_dispatch` after merge if needed.
- Diff against `feat/refactor-docs` in the PR "Files changed" tab for the exact delta (this description highlights the onboarding/eval work; other touched files should match reviewer expectations).

Follow-ups (optional)

- Keep `SCENARIO-RUN-TRACKER.md` updated as scenarios or the prompt change.