Agent Eval is a skill for evaluating agentic AI pipeline systems at both the component level and end-to-end level. It helps you define what to measure, build or sample eval cases, run repeatable tests, track regressions over time, and turn results into grounded takeaways about what improved, what regressed, and what to change next.
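The loop described above (define cases, run them repeatably, compute a pass rate) can be sketched in plain Python. This is a hypothetical illustration of the general pattern, not the agent-eval skill's actual interface; `EvalCase`, `fake_agent`, and `run_eval` are invented names for the sketch.

```python
from dataclasses import dataclass

@dataclass
class EvalCase:
    """One eval case: an input and the behavior we expect to see."""
    prompt: str
    expected_substring: str

def fake_agent(prompt: str) -> str:
    # Stand-in for the pipeline component under test.
    return prompt.upper()

def run_eval(cases: list[EvalCase]) -> float:
    """Run every case against the agent and return the pass rate (0.0-1.0)."""
    passed = sum(
        1 for case in cases
        if case.expected_substring in fake_agent(case.prompt)
    )
    return passed / len(cases)

cases = [
    EvalCase("hello agent", "HELLO"),
    EvalCase("run tests", "TESTS"),
]
print(run_eval(cases))  # → 1.0
```

Tracking this pass rate across runs is what makes regressions visible: a drop after a pipeline change points at exactly which cases broke.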
Manual

```shell
git clone https://github.com/fsilavong/agent-eval.git ~/.claude/skills/agent-eval
```

Install with the Vercel Skills CLI:

```shell
npx skills add fsilavong/agent-eval
```