refactor(docs): rename docs/skills → project, retain academic uses (4/4)#1245

Merged
christso merged 1 commit into main from refactor/rename-pr4-docs on May 15, 2026

@christso (Collaborator)

Summary

PR 4 of 4 in the benchmark → project rename. Stacks on #1244 (PR 3). Aligns documentation with the rename and adds a durable naming-convention note to AGENTS.md.

Renames (registry/workspace → "project")

  • `apps/web/src/content/docs/docs/tools/studio.mdx` — "Benchmarks Dashboard" → "Projects Dashboard"; "Registering/Removing Benchmarks" → Projects; CLI flag descriptions; `~/.agentv/benchmarks.yaml` → `projects.yaml`; YAML example top-level key; `/api/benchmarks` URLs; "Add Benchmark"/"single-benchmark"/"multi-benchmark" UI strings.
  • `apps/web/src/content/docs/docs/evaluation/running-evals.mdx` — `benchmarks.yaml` in the lightweight-config-files list → `projects.yaml`.
  • `AGENTS.md` Wire Format Convention section — `benchmarks.yaml` → `projects.yaml`, `benchmark_id` → `project_id`, `BenchmarkEntry`/`BenchmarkEntryYaml` code example → `Project*`, file pointer `packages/core/src/benchmarks.ts` → `projects.ts`.
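
To illustrate the `ProjectEntry`/`ProjectEntryYaml` boundary mentioned above, here is a minimal sketch. It assumes the wire shape uses snake_case keys and the in-memory shape uses camelCase. The fields `path` and `added_at` and the helpers `fromYaml`/`toYaml` are hypothetical names chosen for illustration, not the actual definitions in `packages/core/src/projects.ts`.

```typescript
// Hypothetical shapes: only `project_id` is named in the PR description;
// every other field and both helper names are illustrative assumptions.

/** Wire / on-disk shape (snake_case keys), e.g. an entry in projects.yaml. */
interface ProjectEntryYaml {
  project_id: string;
  path: string;
  added_at: string;
}

/** In-memory shape (camelCase keys) used inside TypeScript code. */
interface ProjectEntry {
  projectId: string;
  path: string;
  addedAt: string;
}

/** Convert a wire-format entry into the in-memory shape. */
function fromYaml(raw: ProjectEntryYaml): ProjectEntry {
  return { projectId: raw.project_id, path: raw.path, addedAt: raw.added_at };
}

/** Convert an in-memory entry back into the wire format. */
function toYaml(entry: ProjectEntry): ProjectEntryYaml {
  return { project_id: entry.projectId, path: entry.path, added_at: entry.addedAt };
}
```

Keeping the conversion in one pair of functions means the key-casing decision lives in one place, so a rename like `benchmark_id` → `project_id` touches only the boundary.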

Added

  • AGENTS.md "Naming Convention: Project vs Benchmark" section between TypeScript Guidelines and Wire Format Convention. Codifies the distinction:
    • Project = registry/workspace container (the top-level unit Studio organizes around). Matches Phoenix/Langfuse/Braintrust/Weave/LangSmith terminology.
    • Benchmark = a curated eval suite (academic ML sense: MMLU, HumanEval, SWE-bench). Example dirs use this sense.
    • `benchmark.json` per-run artifact = a third, unrelated concept (Agent Skills compat).
    • Mentions the auto-migration from `benchmarks.yaml` → `projects.yaml` so contributors don't get confused by the legacy filename.
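
The auto-migration mentioned in the last bullet could look roughly like the sketch below. This is an assumption about its shape, not the implementation that landed in the core package: the function names `pickRegistryFile` and `migrateRegistryDoc` are hypothetical, and the sketch works on an already-parsed document so it needs no YAML library or file access.

```typescript
// Hypothetical sketch of a benchmarks.yaml -> projects.yaml migration.
// File names and the top-level keys come from the PR description;
// everything else here is an illustrative assumption.

type RegistryDoc = Record<string, unknown>;

const LEGACY_FILE = "benchmarks.yaml";
const CURRENT_FILE = "projects.yaml";

/** Decide which registry file to load, preferring the new name. */
function pickRegistryFile(
  existingFiles: ReadonlySet<string>,
): { file: string; needsMigration: boolean } | null {
  if (existingFiles.has(CURRENT_FILE)) return { file: CURRENT_FILE, needsMigration: false };
  if (existingFiles.has(LEGACY_FILE)) return { file: LEGACY_FILE, needsMigration: true };
  return null; // no registry yet
}

/** Rename the legacy top-level `benchmarks` key to `projects`. */
function migrateRegistryDoc(doc: RegistryDoc): RegistryDoc {
  // Already migrated, or nothing to migrate: return unchanged.
  if ("projects" in doc || !("benchmarks" in doc)) return doc;
  const { benchmarks, ...rest } = doc;
  return { ...rest, projects: benchmarks };
}
```

A caller would pass the set of file names found under `~/.agentv/`, parse whichever file is picked, and write the migrated document to `projects.yaml` when `needsMigration` is true.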

Intentionally NOT renamed

  • `benchmark.json` per-run artifact and all references (Agent Skills compatibility output — different concept).
  • `examples/*-benchmark/` directory names (`benchmark-tooling`, `multi-model-benchmark`, `offline-grader-benchmark`, `bug-fix-benchmark`) — these really are eval suites in the academic sense.
  • "benchmark agents" / "benchmark datasets" / "grader benchmarks" / "Snapshot MCP for benchmarks" usages (verb / academic ML sense).

Test plan

  • `bun run typecheck` — passes
  • `bun run lint` — clean
  • `bun run test` — 2374 tests pass (1768 core + 67 eval + 539 cli, 0 fail)
  • `bun run build` — all packages
  • Pre-push hooks green (validate:examples over 56 example evals)
  • Cross-checked that all remaining "benchmark" references in docs are either: `benchmark.json` artifact mentions, example-dir paths, academic verb usage, or the new AGENTS.md section explaining the distinction.
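
The cross-check in the last item was described as a manual review. As a rough aid, the sketch below shows how the mechanical part could be automated. The function `classifyBenchmarkMention` and its category names are hypothetical, and the academic/verb sense cannot be detected by pattern matching, so those lines are only flagged for a human to review.

```typescript
// Hypothetical helper for triaging leftover "benchmark" mentions in docs.
// Categories mirror the allowed cases listed in the PR description.

type Mention = "none" | "artifact" | "example-dir" | "needs-review";

function classifyBenchmarkMention(line: string): Mention {
  if (!/benchmark/i.test(line)) return "none";
  // Per-run artifact kept for Agent Skills compatibility.
  if (/benchmark\.json/.test(line)) return "artifact";
  // Example directories such as examples/multi-model-benchmark/.
  if (/examples\/[\w-]*benchmark[\w-]*/.test(line)) return "example-dir";
  // Could be academic/verb usage or a missed rename: a human decides.
  return "needs-review";
}
```

A line that mixes an allowed mention with a missed rename would be classified by its first match, so this is a triage aid rather than a replacement for reading the diff.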

Cumulative state of the rename (PRs 1–4)

With this PR landed, the full rename is done:

| Layer | Term used | PR |
|---|---|---|
| Core types/registry/migration | project | #1242 |
| HTTP API routes + JSON keys | project | #1243 |
| Studio frontend (routes, components, hooks) | project | #1244 |
| Docs + AGENTS.md naming convention | project | this PR |
| Per-run `benchmark.json` artifact (Agent Skills) | benchmark | unchanged |
| `examples/*-benchmark/` example dirs | benchmark | unchanged |

🤖 Generated with Claude Code

cloudflare-workers-and-pages Bot commented May 14, 2026

Deploying agentv with Cloudflare Pages

Latest commit: edcc9d2
Status: ⚡️ Build in progress...

@christso christso marked this pull request as ready for review May 15, 2026 00:29
@christso christso force-pushed the refactor/rename-pr3-studio branch from 2f530a0 to 56265a9 Compare May 15, 2026 00:43
Base automatically changed from refactor/rename-pr3-studio to main May 15, 2026 00:43
…ic uses

PR 4 of 4 in the benchmark → project rename. Aligns documentation and skill
cards with the rename that landed in PRs 1–3, and adds a naming-convention
note to AGENTS.md so the project/benchmark distinction is durable.

Renamed (registry/workspace concept → "project"):
- apps/web/src/content/docs/docs/tools/studio.mdx
  - "## Benchmarks Dashboard" → "## Projects Dashboard"
  - "### Registering Benchmarks" → "### Registering Projects"
  - "### Removing a Benchmark" → "### Removing a Project"
  - CLI flag descriptions ("Register a benchmark by path", etc.)
  - All `~/.agentv/benchmarks.yaml` references → `projects.yaml`
  - YAML example `benchmarks:` top-level key → `projects:`
  - `/api/benchmarks` URLs → `/api/projects`
  - "Add Benchmark" / "single-benchmark view" / "multi-benchmark" text
- apps/web/src/content/docs/docs/evaluation/running-evals.mdx
  - `benchmarks.yaml` in the "Lightweight config and cache files" list → `projects.yaml`
- AGENTS.md (Wire Format Convention section)
  - YAML file list updated to `projects.yaml`
  - HTTP response field examples updated to `project_id`
  - TypeScript boundary example: BenchmarkEntry/BenchmarkEntryYaml → ProjectEntry/ProjectEntryYaml
  - "Reading back" pointer: `packages/core/src/benchmarks.ts` → `projects.ts`

Added (new):
- AGENTS.md "Naming Convention: Project vs Benchmark" section between
  TypeScript Guidelines and Wire Format Convention. Codifies the
  distinction so future contributors don't re-conflate the registry
  concept with academic eval-suite terminology.

Intentionally kept (academic / artifact / verb usages):
- benchmark.json per-run artifact and all references to it
  (Agent Skills compatibility — different concept, different rename if ever).
- examples/*-benchmark/ directory names (benchmark-tooling, multi-model-benchmark,
  offline-grader-benchmark, bug-fix-benchmark) — they really are eval suites.
- "benchmark agents" / "benchmark datasets" / "grader benchmarks" usages
  (verb / academic ML sense).
- "Snapshot MCP for benchmarks" reference in AGENTS.md (academic).

Stacks on refactor/rename-pr3-studio. With this PR landed, the codebase
consistently uses "project" for the registry/workspace concept and
"benchmark" only for eval-suite or per-run-artifact usages.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@christso christso force-pushed the refactor/rename-pr4-docs branch from bccc35d to edcc9d2 Compare May 15, 2026 00:43
@christso christso merged commit 538acfb into main May 15, 2026
3 of 4 checks passed
@christso christso deleted the refactor/rename-pr4-docs branch May 15, 2026 00:43