feat(scenarios): complete Tier 1-3 scenario catalog #12

Merged
That1Drifter merged 3 commits into master from feat/scenario-catalog-complete
Apr 10, 2026

@That1Drifter
Owner

Summary

  • Fills in the five stub scenarios with full manifests + intros (doc-qa-rag, pipeline-automation, workflow-agent, legacy-migration, incident-response), closing the core content gap called out at the top of TODO.md.
  • Adds a launch_vs_testing_conflict stakeholder-conflict surprise to support-triage (Priya pushes launch for a quarterly OKR, Marcus pushes for more testing), triggered deterministically when taxonomy_defined reaches attempted.
  • Updates TODO.md to mark the six items done.
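
The deterministic trigger for launch_vs_testing_conflict could look roughly like the sketch below. This is an illustration under assumptions, not the repository's code: the status values other than `attempted`, the `StateTrigger` shape, and the `shouldFire` function are hypothetical names introduced here. Only `taxonomy_defined`, `attempted`, and `launch_vs_testing_conflict` come from this PR.

```typescript
// Hypothetical status ladder; only "attempted" is named in this PR.
type MilestoneStatus = "pending" | "attempted" | "completed";
const STATUS_ORDER: MilestoneStatus[] = ["pending", "attempted", "completed"];

// Hypothetical trigger shape: fire once a milestone reaches a given status.
interface StateTrigger {
  milestone: string;
  reaches: MilestoneStatus;
}

// Returns true when the surprise should fire on this turn.
// No randomness is involved, so the same session state always gives the same answer.
function shouldFire(
  trigger: StateTrigger,
  milestones: Record<string, MilestoneStatus>,
  alreadyFired: boolean,
): boolean {
  if (alreadyFired) return false; // each surprise fires at most once
  const current = milestones[trigger.milestone] ?? "pending";
  return STATUS_ORDER.indexOf(current) >= STATUS_ORDER.indexOf(trigger.reaches);
}

// Assumed trigger for launch_vs_testing_conflict.
const launchVsTesting: StateTrigger = {
  milestone: "taxonomy_defined",
  reaches: "attempted",
};
```

Keeping the check a pure function of session state is what makes the surprise reproducible in a dry run: replaying the same actions hits the conflict on the same turn.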

Scenarios

| Tier | Scenario | Company | Highlights |
| --- | --- | --- | --- |
| 1 | doc-qa-rag | Helix Biotherapeutics | RAG over GxP docs; phantom_policy_citation, restricted_leak, contradictory_wiki_pages |
| 2 | pipeline-automation | Keelwater Brokerage | Carrier rate feeds; schema_drift_carrier_rename is the centerpiece |
| 2 | workflow-agent | Axleflow Telematics | CRM + billing + provisioning onboarding; billing_500_mid_workflow, provisioning_rate_limit, audit_log_mandate |
| 3 | legacy-migration | Ironridge Mutual | 17-year-old SOAP gateway, three-way political arc; ratebureau_complaint, hector_public_pushback, compliance_audit_notice |
| 3 | incident-response | Lumenvest | turn_budget: 12, multi-cause RCA under time pressure; vip_public_complaint, downstream_auth_500s, imani_shift_handoff |

All manifests follow support-triage as the reference structure and stay within the <$2/session cost target.

Test plan

  • pnpm validate-scenarios — 7/7 scenarios parse and validate against scenario.schema.json
  • pnpm typecheck — 6/6 workspace packages clean
  • pnpm test — 34 tests across core, rubric, scenarios all pass
  • Smoke-run one of the new scenarios via pnpm --filter @fieldwork/cli dev -- dryrun <path> --trainee perfect before the first real play session

🤖 Generated with Claude Code

That1Drifter and others added 3 commits April 10, 2026 16:09
Fill in all five stub scenarios (doc-qa-rag, pipeline-automation,
workflow-agent, legacy-migration, incident-response) with full
manifests and intros, plus a stakeholder-conflict surprise in
support-triage.

- doc-qa-rag (T1): Helix Biotherapeutics RAG buildout, hallucination
  and restricted-doc leak surprises
- pipeline-automation (T2): Keelwater Brokerage carrier feeds with
  schema drift as the core test
- workflow-agent (T2): Axleflow Telematics 3-system onboarding with
  mid-workflow billing failure
- legacy-migration (T3): Ironridge Mutual 17-year-old SOAP gateway,
  stakeholder archaeology
- incident-response (T3): Lumenvest fintech page with turn_budget: 12
- support-triage: launch_vs_testing_conflict surprise (Priya vs Marcus)

All manifests validate against scenario.schema.json; pnpm typecheck
and pnpm test pass across the workspace.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…atterns

Surfaced by staging smoke test: the action_pattern surprise trigger in
surprises.ts constructed `new RegExp(pattern)` with no flags and no
try/catch, so a manifest pattern using Perl-style `(?i)` (which JS
regex doesn't support) crashed the turn handler mid-session.

- surprises.ts: build regex with `'i'` flag, wrap in try/catch like
  score.ts does — a malformed pattern now skips the rule instead of
  throwing through the request.
- score.ts: also add `'i'` flag to payload_regex for consistency with
  payload_contains, which is already documented case-insensitive.
- doc-qa-rag + workflow-agent: strip the now-redundant `(?i)` prefix
  from six patterns so manifests stay JS-regex-portable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
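
The regex fix described in the second commit can be sketched as follows. This is a minimal illustration, assuming helper names (`compileActionPattern`, `matchesAction`) that do not appear in the PR; the real code in surprises.ts and score.ts is not shown here and may be structured differently. It shows the two behaviours the commit message names: the pattern is compiled with the `'i'` flag, and a malformed pattern (including one with a Perl-style `(?i)` prefix, which JavaScript rejects) skips the rule instead of throwing.

```typescript
// Hypothetical helper: compile a manifest pattern case-insensitively.
// Returns null instead of throwing when the pattern is not valid JS regex.
function compileActionPattern(pattern: string): RegExp | null {
  try {
    return new RegExp(pattern, "i");
  } catch {
    // Malformed pattern, e.g. "(?i)refund" or "[unclosed": skip the rule.
    return null;
  }
}

// Hypothetical matcher: a rule with an invalid pattern never matches,
// so the turn handler keeps running.
function matchesAction(pattern: string, action: string): boolean {
  const re = compileActionPattern(pattern);
  return re !== null && re.test(action);
}

// Usage: the flag makes the inline "(?i)" prefix unnecessary.
const hit = matchesAction("refund", "Issue REFUND to customer");
const skipped = matchesAction("(?i)refund", "Issue REFUND to customer");
console.log(hit, skipped);
```

Returning `null` keeps the failure local to one rule. A stricter alternative would be to reject such patterns at manifest validation time, so a bad pattern never reaches a live session.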
@That1Drifter That1Drifter merged commit e69d252 into master Apr 10, 2026
1 check passed
@That1Drifter That1Drifter deleted the feat/scenario-catalog-complete branch April 10, 2026 21:29