feat(studio): support runtime benchmark discovery without server restart #1144

@christso

Summary

Studio currently treats benchmark/project discovery as a one-time startup bootstrap step. That makes a deployed Studio process too rigid for long-running environments where repos are added or removed over time.

We need Studio to stay up continuously and reflect benchmark repo additions/removals at runtime without requiring an agentv serve restart.

Problem

Current deployment behavior effectively assumes:

  1. clone repos
  2. register repos into ~/.agentv/projects.yaml
  3. start agentv serve

That model breaks down for a 24/7 Studio deployment because:

  • repos may be cloned/populated after Studio has already started
  • repos may be removed or replaced while Studio is running
  • startup bootstrap can be slow and couples server availability to repo sync work
  • if startup bootstrap fails or is delayed, Studio comes up empty or never becomes useful

In our OpenShift deployment this surfaced as multiple operational issues, but the deeper product issue is that Studio currently appears to rely on startup-time registration rather than runtime discovery/refresh.

Desired behavior

Studio should be able to run continuously while benchmark repos are added or removed dynamically.

Concretely:

  • agentv serve should be able to start immediately and remain running
  • Studio should continuously discover projects from configured roots, or otherwise reload its registry at runtime
  • when a repo with .agentv/ appears under a configured discovery root, it should appear in Studio without restarting agentv serve
  • when such a repo is removed, it should disappear from Studio without restarting agentv serve
  • /api/benchmarks and the UI should reflect the current project set while Studio is live
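
As a minimal sketch of what "discover projects from configured roots" could mean, the function below scans each root for direct child directories that contain a `.agentv/` directory. It is illustrative only and not Studio's actual implementation: the name `discover_projects`, the one-level scan depth, and treating a missing root as "no projects" are all assumptions.

```python
from pathlib import Path


def discover_projects(roots: list[Path], marker: str = ".agentv") -> set[Path]:
    """Return direct children of each root that contain a `marker` directory.

    Hypothetical helper: the scan depth (one level) and the marker name
    default are assumptions made for illustration.
    """
    found: set[Path] = set()
    for root in roots:
        if not root.is_dir():
            # A root that does not exist yet contributes zero projects
            # instead of raising, so the server can start before any clone.
            continue
        for child in root.iterdir():
            if (child / marker).is_dir():
                found.add(child.resolve())
    return found
```

Calling this on every request, or on a timer, would let the project list follow the filesystem instead of a snapshot taken at startup.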

Why startup-only bootstrap is insufficient

Starting Studio only after cloning/registration makes the system too inflexible for end users:

  • the Studio process is not truly long-running if repo churn requires app restarts
  • background repo sync or sidecar population becomes awkward because Studio may not re-read project state
  • users cannot treat Studio as a continuously available service with dynamic project visibility

Proposed direction

Any implementation that achieves runtime correctness is fine, but the solution likely needs at least one of:

  • live discovery from configured roots (watcher or polling)
  • runtime reload of the project registry when projects.yaml changes
  • an explicit refresh mechanism exposed by the server/UI/API that updates the in-memory project list without process restart

A good implementation should not require deployments to restart the Studio process whenever repos are added or removed.
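
One way the polling and explicit-refresh options could share code is an in-memory registry whose `refresh()` swaps in the latest scan result and reports what changed. This is a sketch under assumptions, not the project's design: `ProjectRegistry`, `poll_until_stopped`, and the injected `scan` callable are hypothetical names, and the scan itself could be a directory walk or a re-read of projects.yaml.

```python
import threading
from typing import Callable


class ProjectRegistry:
    """Holds the current project set and updates it without a restart."""

    def __init__(self, scan: Callable[[], set[str]]) -> None:
        self._scan = scan  # hypothetical: returns the project ids visible right now
        self._lock = threading.Lock()
        self._projects: set[str] = set()

    def refresh(self) -> tuple[set[str], set[str]]:
        """Re-scan and return (added, removed) relative to the previous state."""
        current = set(self._scan())
        with self._lock:
            added = current - self._projects
            removed = self._projects - current
            self._projects = current
        return added, removed

    def snapshot(self) -> list[str]:
        """Sorted copy of the current project set, safe to serve to callers."""
        with self._lock:
            return sorted(self._projects)


def poll_until_stopped(
    registry: ProjectRegistry, interval_s: float, stop: threading.Event
) -> None:
    """Refresh every `interval_s` seconds until `stop` is set."""
    while not stop.wait(interval_s):
        try:
            registry.refresh()
        except OSError:
            # A failed scan keeps the last known project set; the server stays up.
            pass
```

An explicit refresh endpoint would call `registry.refresh()` directly, while a background thread running `poll_until_stopped` would cover the polling option. Either way the server process never restarts.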

Suggested acceptance criteria

  • Studio can start with zero projects and stay healthy
  • adding a new repo containing .agentv/ under a configured discovery root causes it to appear in Studio without restarting the server
  • removing that repo causes it to disappear from Studio without restarting the server
  • /api/benchmarks updates accordingly while the same agentv serve process remains running

E2E verification

Red

  1. Start agentv serve with runtime discovery configured and no projects initially present.
  2. Confirm Studio is running.
  3. Add a repo containing .agentv/ under the configured discovery root.
  4. Observe current behavior: Studio does not show the repo until the process is restarted.

Green

  1. Start agentv serve with runtime discovery configured and no projects initially present.
  2. Confirm Studio is running.
  3. Add a repo containing .agentv/ under the configured discovery root.
  4. Verify the repo appears in /api/benchmarks and the Studio UI without restarting agentv serve.
  5. Remove the repo.
  6. Verify it disappears from /api/benchmarks and the UI without restarting agentv serve.
  7. Throughout the test, verify the same Studio process remains up continuously.
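
Steps 4 and 6 of the Green flow need to wait for a change to become visible rather than assert immediately, because discovery may lag by a polling interval. A small helper like the one below could wrap those checks. It is only a sketch: `wait_until` is a hypothetical name, and the predicate that queries /api/benchmarks is left to the caller because the response shape is not described in this issue.

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout_s: float = 10.0,
    interval_s: float = 0.2,
) -> bool:
    """Poll `predicate` until it returns True or `timeout_s` elapses.

    Returns True on success and False on timeout. The predicate is always
    evaluated at least once, even with a zero timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For step 7, the test harness could record the server's process ID at startup and confirm it is unchanged after each wait, so a silent restart does not pass as runtime discovery.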

Notes

This request is about Studio/runtime project discovery behavior, not just deployment scripts. A deployment-level workaround (clone/register before startup) is not enough for the 24/7 use case described above.

Metadata

Assignees: No one assigned
Labels: in-progress (Claimed by an agent; do not duplicate work)
Type: No type
Project status: Backlog
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests