
feat: implement bundle list/show and run list/show commands #59

Merged: placerda merged 4 commits into develop from feature/browse-commands on Apr 13, 2026

@Dongbumlee
Collaborator

Summary

Implements 4 read-only browse commands that were previously planned stubs:

  • agentops bundle list — list all bundles with evaluators and threshold counts
  • agentops bundle show — display full bundle detail
  • agentops run list — list all past evaluation runs with status
  • agentops run show — display full run detail with metrics, thresholds, Foundry URL

All commands are read-only, require no Azure credentials, and have no side effects.

Changes

New files

  • src/agentops/services/browse.py — service layer with list_bundles, show_bundle, list_runs, show_run
  • tests/unit/test_browse.py — 16 tests (service + CLI)
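
For readers unfamiliar with the pattern, here is a minimal sketch of what a read-only browse service could look like. It is not the code in browse.py: the on-disk layout (one directory per run containing a results.json file), the RunSummary fields, and the function signatures are all assumptions made for illustration.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RunSummary:
    """Hypothetical summary row for one past run."""
    run_id: str
    status: str
    bundle: str


def list_runs(runs_dir: Path) -> list[RunSummary]:
    """Return one summary per run directory, ordered by directory name.

    Only reads from disk; a missing directory yields an empty list.
    """
    summaries: list[RunSummary] = []
    if not runs_dir.is_dir():
        return summaries
    # Assumed layout: <runs_dir>/<run_id>/results.json
    for result_file in sorted(runs_dir.glob("*/results.json")):
        data = json.loads(result_file.read_text(encoding="utf-8"))
        summaries.append(
            RunSummary(
                run_id=result_file.parent.name,
                status=data.get("status", "unknown"),
                bundle=data.get("bundle", "unknown"),
            )
        )
    return summaries


def show_run(runs_dir: Path, run_id: str) -> dict:
    """Return the full stored result for one run, or raise if it is absent."""
    result_file = runs_dir / run_id / "results.json"
    if not result_file.is_file():
        raise FileNotFoundError(f"run not found: {run_id}")
    return json.loads(result_file.read_text(encoding="utf-8"))
```

Keeping these functions free of any CLI or network concerns is what lets them be tested against a temporary directory without credentials.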

Modified

  • src/agentops/cli/app.py — replaced 4 planned stubs with working implementations
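
As a rough illustration of how the four commands map onto service calls, the sketch below wires `bundle list/show` and `run list/show` with the standard-library argparse module. This is a swapped-in technique chosen so the example is self-contained; it is not the CLI framework used in app.py, and the positional argument name `name` and the `handlers` mapping are hypothetical.

```python
from __future__ import annotations

import argparse
from typing import Callable, Dict, Tuple

# (group, action) -> callable; "show" handlers take one name argument.
Handlers = Dict[Tuple[str, str], Callable[..., object]]


def build_parser() -> argparse.ArgumentParser:
    """Build a two-level parser: agentops {bundle,run} {list,show}."""
    parser = argparse.ArgumentParser(prog="agentops")
    groups = parser.add_subparsers(dest="group", required=True)
    for group_name in ("bundle", "run"):
        group = groups.add_parser(group_name)
        actions = group.add_subparsers(dest="action", required=True)
        actions.add_parser("list")
        show = actions.add_parser("show")
        show.add_argument("name")  # hypothetical argument name
    return parser


def dispatch(argv: list[str], handlers: Handlers) -> object:
    """Parse argv and call the matching service-layer handler."""
    args = build_parser().parse_args(argv)
    handler = handlers[(args.group, args.action)]
    if args.action == "show":
        return handler(args.name)
    return handler()
```

The CLI layer here only parses and delegates, so the read-only guarantee rests entirely on the handlers it is given.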

Testing

  • 16 new unit tests (all pass)
  • 96 total tests pass
  • Smoke-tested against live workspace with 21 past runs and 10 bundles

- Add services/browse.py with list_bundles, show_bundle, list_runs, show_run
- Replace planned stubs with working implementations in cli/app.py
- bundle list: shows all bundles with evaluators and threshold count
- bundle show: displays full bundle detail (evaluators, thresholds, metadata)
- run list: shows all past runs with status, bundle, dataset, duration
- run show: displays full run detail (metrics, thresholds, items, Foundry URL)
- Add 16 unit tests (service + CLI) in test_browse.py
- All commands are read-only, no side effects, no Azure API calls

Split app.py (487 lines) into focused command modules:

- app.py (114 lines) — root app, global callback, init, sub-app registration
- eval_commands.py (108 lines) — eval run, eval compare
- report_commands.py (66 lines) — report, report show/export stubs
- browse_commands.py (152 lines) — bundle list/show, run list/show/view
- config_commands.py (56 lines) — config cicd, config validate/show stubs
- planned.py (57 lines) — dataset, monitor, trace, model, agent stubs
- _planned.py (12 lines) — shared planned command helper

No behavior changes. All 96 tests pass.

- Move dataset stubs to dataset_commands.py (ready for Tier 2 implementation)
- Inline monitor/trace/model/agent stubs in app.py (1-2 commands each)
- Delete planned.py — no more catch-all stub file
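
For context on the shared planned-command helper kept in _planned.py, a helper of that kind might look like the sketch below. The name `planned`, the exception type, and the message text are all assumptions; the actual 12-line helper is not shown in this PR description and may behave differently (for example, printing a notice and exiting).

```python
from typing import Callable


class PlannedCommandError(RuntimeError):
    """Raised when a command that is only planned gets invoked."""


def planned(command_name: str) -> Callable[..., None]:
    """Return a stub callable that reports the command is not implemented."""

    def stub(*args: object, **kwargs: object) -> None:
        raise PlannedCommandError(
            f"'{command_name}' is planned but not implemented yet."
        )

    # Give each stub a distinct name so it is identifiable in tracebacks.
    stub.__name__ = command_name.replace(" ", "_")
    return stub
```

With a factory like this, a group with one or two stubs can define them inline, which fits the decision to drop the catch-all stub file.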
@placerda merged commit ba9a465 into develop on Apr 13, 2026
6 of 7 checks passed

2 participants