TH-3418: add end-to-end agent testing and production quality monitoring cookbooks #583

Merged
hadarishav merged 2 commits into astro from feature/th-3418-prod-monitoring-end-to-end
Apr 8, 2026

@KarthikAvinashFI (Contributor) commented Apr 8, 2026

Summary

Adds two new use-case cookbooks under src/pages/docs/cookbook/use-cases/:

  • end-to-end-agent-testing: full agent lifecycle walkthrough using FutureAGI's complete stack — define the agent, use Simulate to generate 100 diverse conversations, run Evals to score quality, diagnose with Agent Compass + Fix My Agent, run Optimize to rewrite the system prompt based on the failures, add Protect guardrails, and wire it into Observe.
  • production-quality-monitoring: monitoring pipeline using Observe for tracing, inline Evals for quality scoring, Alerts for latency/error spikes, Agent Compass for failure clustering, and Protect as the final safety gate.

Both cookbooks are written as narrative walkthroughs with real screenshots and analysis from actual runs, not placeholder data.

Test plan

  • Astro preview renders both pages without MDX errors
  • All image and video URLs (S3) load correctly
  • Step numbering, code blocks, and tables render as expected
  • Internal links to other cookbook pages resolve
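The internal-link item in the test plan could be partly automated. The following is a minimal sketch, not the check used for this PR: it extracts inline Markdown link targets from MDX text and reports internal paths that match no known route. The function names, the sample text, and the `/docs/cookbook/use-cases/...` route shape are assumptions for illustration (the routes are inferred from the `src/pages/` file location), and the regex covers only inline `[text](target)` links, not reference-style links or JSX `href` attributes.

```python
import re

# Matches inline Markdown links and images: [text](target) or ![alt](target),
# with an optional "title" after the target.
LINK_RE = re.compile(r'!?\[[^\]]*\]\(([^)\s]+)(?:\s+"[^"]*")?\)')


def extract_links(mdx_text):
    """Return every inline link target found in the text, in document order."""
    return LINK_RE.findall(mdx_text)


def split_links(targets):
    """Separate site-absolute internal paths from external http(s) URLs."""
    internal = [t for t in targets if t.startswith("/")]
    external = [t for t in targets if t.startswith(("http://", "https://"))]
    return internal, external


def broken_internal_links(mdx_text, known_routes):
    """List internal targets whose route is not in known_routes.

    Fragments (#...) and trailing slashes are ignored when comparing.
    """
    internal, _ = split_links(extract_links(mdx_text))
    broken = []
    for target in internal:
        route = target.split("#", 1)[0].rstrip("/") or "/"
        if route not in known_routes:
            broken.append(target)
    return broken


# Hypothetical sample page content and route table (assumptions, not repo data).
SAMPLE = """
See the [testing cookbook](/docs/cookbook/use-cases/end-to-end-agent-testing).
Also the [missing page](/docs/cookbook/use-cases/does-not-exist#setup).
![screenshot](https://example.com/img.png)
"""
ROUTES = {
    "/docs/cookbook/use-cases/end-to-end-agent-testing",
    "/docs/cookbook/use-cases/production-quality-monitoring",
}

if __name__ == "__main__":
    for link in broken_internal_links(SAMPLE, ROUTES):
        print("broken internal link:", link)
```

A check along these lines would only flag unresolved internal paths; the S3 image and video URLs would still need a separate HTTP check, which is left out here to keep the sketch free of network access.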

Two use-case cookbooks demonstrating FutureAGI's full agent and observability stack:

- end-to-end-agent-testing: simulate 100 conversations, evaluate quality,
  diagnose with Agent Compass + Fix My Agent, optimize prompt, add Protect
  guardrails, and wire into Observe
- production-quality-monitoring: define agent, trace every call with Observe,
  attach inline Evals, configure Alerts, cluster failures with Agent Compass,
  block unsafe outputs with Protect
- Add new Use Cases group to cookbook tab (between Quickstart and Getting Started)
- Replace 4 broken sibling links to other use-case cookbooks with valid links
  to quickstart cookbooks and the existing sibling
@hadarishav merged commit deb84d4 into astro on Apr 8, 2026
1 check passed
@hadarishav deleted the feature/th-3418-prod-monitoring-end-to-end branch on April 8, 2026 10:41
2 participants