Create evals for the structured-context agent skill

## Context

The agent skill (`skills/structured-context/`) needs automated evals to verify it remains accurate as the tool and schemas evolve. Without evals, regressions in skill quality are invisible.

## Use cases to cover

1. **Validate a space** — agent runs `validate` on a space and correctly interprets errors
2. **Schema design and authoring** — agent writes or modifies a schema file using correct field names and structure
3. **Troubleshoot a validation error** — agent diagnoses a common error (e.g. missing field, broken wikilink) from validate output
4. **Qualitative content assessment** — agent retrieves rules via `schemas show` and applies them to review content (depends on #37)
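One possible way to encode the four use cases above as data is sketched below. Every name here (`EvalCase`, `mustMention`, `grade`, the case ids and prompts) is hypothetical and not taken from the repo, and substring matching stands in for whatever grading the real suite ends up using.

```typescript
// Hypothetical shape for one eval case; none of these names come from the repo.
type EvalCase = {
  id: string;
  prompt: string;        // task handed to the agent
  mustMention: string[]; // substrings the agent's answer is expected to contain
};

// One placeholder case per use case listed in this issue.
const cases: EvalCase[] = [
  { id: "validate-space", prompt: "Validate the fixture space and explain any errors.", mustMention: ["validate"] },
  { id: "schema-authoring", prompt: "Add a field to the fixture schema.", mustMention: ["schema"] },
  { id: "troubleshoot-error", prompt: "Diagnose this validate output: broken wikilink.", mustMention: ["wikilink"] },
  { id: "content-assessment", prompt: "Review this note against the schema rules.", mustMention: ["schemas show"] },
];

// Returns the expected substrings that are absent from the transcript (case-insensitive).
function missingMentions(c: EvalCase, transcript: string): string[] {
  const haystack = transcript.toLowerCase();
  return c.mustMention.filter((m) => !haystack.includes(m.toLowerCase()));
}

// A case passes when nothing expected is missing.
function grade(c: EvalCase, transcript: string): { id: string; pass: boolean; missing: string[] } {
  const missing = missingMentions(c, transcript);
  return { id: c.id, pass: missing.length === 0, missing };
}
```

Substring checks are crude, so a real suite would likely need stronger graders for the schema-authoring and qualitative cases, for example running `validate` on the agent's output.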

## Notes

- Evals should run against real or fixture spaces/schemas where practical
- Consider reusing the `bun run test` infrastructure or adding a separate `evals/` directory
- This tracks ongoing skill health, not just initial correctness
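To make the notes above concrete, here is a minimal sketch of a runner that could sit behind either a test command or a standalone `evals/` script. The `Agent`, `Check`, `runEvals`, and `exitCode` names are assumptions for illustration only. The agent is modelled as a synchronous function so it can be stubbed; a real agent call would almost certainly be asynchronous.

```typescript
// Hypothetical types; a real agent call would likely be async and hit a model.
type Agent = (prompt: string) => string;
type Check = { id: string; prompt: string; expect: RegExp };
type Report = { passed: string[]; failed: string[] };

// Runs every check against the agent and sorts check ids into passed/failed.
function runEvals(agent: Agent, checks: Check[]): Report {
  const report: Report = { passed: [], failed: [] };
  for (const check of checks) {
    const answer = agent(check.prompt);
    (check.expect.test(answer) ? report.passed : report.failed).push(check.id);
  }
  return report;
}

// Maps a report to a process exit code so CI can fail on regressions.
function exitCode(report: Report): number {
  return report.failed.length === 0 ? 0 : 1;
}

// Stub agent used only to exercise the runner; it answers wikilink prompts and nothing else.
const stubAgent: Agent = (prompt) =>
  prompt.includes("wikilink") ? "The wikilink target does not exist." : "";

const checks: Check[] = [
  { id: "troubleshoot-wikilink", prompt: "Diagnose this validate output: broken wikilink", expect: /wikilink/i },
  { id: "validate-space", prompt: "Validate the fixture space", expect: /validate/i },
];

const report = runEvals(stubAgent, checks);
```

Keeping the agent behind a function type lets the same runner be driven by a stub in unit tests and by the real skill in CI or a manual run.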

Close once the eval suite is running in CI or is documented as a manual process.

Create evals for the structured-context agent skill #38
