Add eval validate and create commands #13

Merged
HartBrook merged 2 commits into main from feature/improve-eval-dx
Jan 18, 2026

@HartBrook HartBrook commented Jan 18, 2026

Summary

Improves developer experience for creating and maintaining behavioral evals:

  • stag eval validate - Validate eval YAML files before running (saves API credits)

    • Checks assertion types, required fields, YAML structure, naming conventions
    • Provides helpful suggestions for common typos (e.g., llm_rubric → llm-rubric)
    • Distinguishes between errors (blocking) and warnings (non-blocking)
  • stag eval create - Create new evals from templates with an interactive wizard

    • Four built-in templates: security, quality, language, blank
    • --template flag to skip the wizard and use a template directly
    • --from flag to copy and customize existing evals
    • --project flag to save to .staghorn/evals/
    • --team flag to save to ./evals/ for team/community sharing
  • Example evals - Added example/team-repo/evals/ with team eval patterns
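
The typo-suggestion and error/warning behavior described above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Issue` and `Severity` types, the function names, the set of known assertion types, and the kebab-case naming rule are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Severity distinguishes blocking errors from non-blocking warnings.
type Severity int

const (
	SeverityWarning Severity = iota // reported, but does not fail validation
	SeverityError                   // fails validation
)

// Issue is a hypothetical validation finding (not staghorn's actual type).
type Issue struct {
	Severity Severity
	Message  string
}

// knownAssertionTypes is an illustrative set, not staghorn's real list.
var knownAssertionTypes = map[string]bool{
	"llm-rubric": true,
	"contains":   true,
	"regex":      true,
}

// suggestAssertionType returns a known type that matches name after
// lowercasing and replacing underscores with hyphens, or "" if none does.
func suggestAssertionType(name string) string {
	candidate := strings.ReplaceAll(strings.ToLower(name), "_", "-")
	if candidate != name && knownAssertionTypes[candidate] {
		return candidate
	}
	return ""
}

// validateAssertionType reports an error for an unknown assertion type and
// appends a "did you mean" hint when a close match exists.
func validateAssertionType(name string) []Issue {
	if knownAssertionTypes[name] {
		return nil
	}
	msg := fmt.Sprintf("unknown assertion type %q", name)
	if s := suggestAssertionType(name); s != "" {
		msg += fmt.Sprintf(" (did you mean %q?)", s)
	}
	return []Issue{{Severity: SeverityError, Message: msg}}
}

// validateName warns (without blocking) when an eval name is not lowercase
// kebab-case. The convention itself is an assumption for this sketch.
func validateName(name string) []Issue {
	if name != strings.ToLower(name) || strings.ContainsAny(name, " _") {
		msg := fmt.Sprintf("eval name %q is not lowercase kebab-case", name)
		return []Issue{{Severity: SeverityWarning, Message: msg}}
	}
	return nil
}

// hasErrors reports whether any issue is blocking.
func hasErrors(issues []Issue) bool {
	for _, issue := range issues {
		if issue.Severity == SeverityError {
			return true
		}
	}
	return false
}

func main() {
	issues := append(validateName("My_Eval"), validateAssertionType("llm_rubric")...)
	for _, issue := range issues {
		fmt.Println(issue.Message)
	}
	fmt.Println("blocking:", hasErrors(issues))
}
```

Keeping severity on each issue lets a caller print every finding but fail only when at least one of them is an error.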

Test plan

  • go build ./... passes
  • go test ./... passes (22 new tests for validation and templates)
  • stag eval validate validates all evals correctly
  • stag eval validate catches invalid assertion types with suggestions
  • stag eval create --template security --name test --description "test" creates valid eval
  • stag eval create --team saves to ./evals/
  • Example evals in example/team-repo/evals/ are valid YAML
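
For reference, the destination rules exercised by the `--project` and `--team` checks can be sketched like this. The two directories come from the summary above. Everything else is an assumption for illustration: the function names, the `.yaml` extension, the caller-supplied fallback directory, and treating the two flags as mutually exclusive.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

const (
	projectEvalDir = ".staghorn/evals" // --project destination per the summary
	teamEvalDir    = "evals"           // --team destination ("./evals/") per the summary
)

// resolveEvalDir picks the output directory from the two flags.
// Rejecting both flags together and using a caller-supplied fallback when
// neither is set are assumptions, not behavior confirmed by this PR.
func resolveEvalDir(project, team bool, fallback string) (string, error) {
	switch {
	case project && team:
		return "", errors.New("--project and --team cannot be combined")
	case project:
		return projectEvalDir, nil
	case team:
		return teamEvalDir, nil
	default:
		return fallback, nil
	}
}

// evalPath builds the file path for a named eval inside dir.
// The ".yaml" extension is assumed.
func evalPath(dir, name string) string {
	return filepath.Join(dir, name+".yaml")
}

func main() {
	dir, err := resolveEvalDir(false, true, "")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(evalPath(dir, "test"))
}
```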

@HartBrook HartBrook merged commit 488c98d into main Jan 18, 2026
4 checks passed