nas: 4 new CLI commands + --group on model list (G4) #474
Wraps the new public API routes from the agentic-surface-area onsite plan:
roboflow train cancel <project>/<version> [--continue-if-no-refund]
roboflow train stop <project>/<version>
roboflow train results <project>/<version>
roboflow model star <model-id> [--unstar]
Plus extends `roboflow model list -p <project>` with `-g/--group <modelGroup>`,
the canonical "list NAS models per run" path. When --group is set, the list
command hits the public /models endpoint (full enriched projection: hardware,
latency, map5095, paretoOptimalFor, recommended ★) instead of walking versions
via the SDK.
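The `--group` switch described above can be sketched roughly as follows. This is a minimal illustration, not the CLI's actual handler: `models_query`, `list_models`, and the two injected callables are hypothetical names, and the request path shape is an assumption (only the `/models` endpoint and the `?group=` parameter come from the text).

```python
from typing import Callable, List, Optional
from urllib.parse import urlencode


def models_query(group: Optional[str] = None) -> str:
    """Query string for the /models request; empty when no group is given."""
    return "?" + urlencode({"group": group}) if group else ""


def list_models(
    project: str,
    group: Optional[str],
    fetch_public: Callable[[str], List[dict]],
    walk_versions: Callable[[str], List[dict]],
) -> List[dict]:
    """Choose the data source depending on whether a group was requested."""
    if group:
        # Hypothetical path shape; only "/models" and "?group=" come from the PR text.
        return fetch_public(f"/{project}/models{models_query(group)}")
    # No group: fall back to the version walk (stand-in for the SDK path).
    return walk_versions(project)
```

Injecting the two fetchers keeps the branch testable without network access, which is one way the "--group endpoint switch" test case could be exercised.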
Adapter additions in roboflow/adapters/rfapi.py:
cancel_version_training, stop_version_training, get_training_results,
list_project_models (with optional group), get_model_by_url,
favorite_nas_model
Backend companions:
- roboflow#11603 (G1, validator)
- roboflow#11605 (G6, projection + ?group=)
- roboflow#11610 (G2, public train cancel/stop + favorite)
- roboflow#11612 (G3, training results)
Tests: +13 cases across test_train_handler.py and test_model_handler.py
covering register, success paths, 409 + MODEL_NOT_NAS hint surfacing,
unstar flow, and the --group endpoint switch. All 298 CLI tests pass
locally; ruff check + ruff format clean.
CLI-COMMANDS.md updated with two new sections (train lifecycle + NAS
list/star/deploy).
E2E: driven against staging (api.roboflow.one) on
peter-robicheaux/beer-can-hackathon:
- `train results .../410` returned full NAS bundle (52 models,
recommendedByHardware, modelGroup)
- `model list -p ... -g <modelGroup>` rendered 53-row leaderboard
table with HARDWARE / LATENCY / MAP50 / MAP5095 / REC columns
- `model star 14CwSGmGetWh6rB0EnjL` → success, favorites reflected
- `model star --unstar` flips state
- `train cancel .../318` (finished version) → 409 surfaces hint
"Cancel only applies to in-flight runs."
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    str,
    typer.Argument(
        help=(
            "NAS-trained model id (Firestore document id). "
I don't think we should mention Firestore here. This text is shown to the user.
    output_error(args, msg, hint=hint, exit_code=3)
    return

    output(
[Medium] `train cancel` may report success even when the server did not cancel
- File: `roboflow/cli/handlers/train.py:338-342` (refund semantics documented at L92-95). `_cancel` always emits `status: "cancelled"` and `Training cancelled ...` after any 2xx, but the documented default behavior is for the server to reply `refund:false` without cancelling unless `--continue-if-no-refund` is passed.
- Why it matters: agents/scripts will treat the CLI result as confirmation that an in-flight paid run was cancelled when the server may have only returned a refund-window check. This is a behavioral correctness issue on a destructive command.
- Fix: derive `status` from the payload. If `cancelled:false` or `refund:false`, output a distinct `status: "not_cancelled"` and surface a hint to rerun with `--continue-if-no-refund`. Add a unit test for that exact response.
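A minimal sketch of the suggested fix, assuming the response carries boolean `cancelled` and/or `refund` fields as the comment implies. `summarize_cancel` is a hypothetical helper, not the handler's real code, and treating `refund: false` as "cancelled anyway" when the flag was passed is an assumption about the server's behavior.

```python
from typing import Any, Dict


def summarize_cancel(
    payload: Dict[str, Any], continue_if_no_refund: bool = False
) -> Dict[str, Any]:
    """Derive the CLI status from the server payload instead of assuming success."""
    explicitly_not_cancelled = payload.get("cancelled") is False
    # Assumption: refund:false without the flag means the server only ran the
    # refund-window check and left the run in flight.
    refund_blocked = payload.get("refund") is False and not continue_if_no_refund
    if explicitly_not_cancelled or refund_blocked:
        return {
            "status": "not_cancelled",
            "hint": "Rerun with --continue-if-no-refund to cancel without a refund.",
        }
    return {"status": "cancelled"}
```

Checking `is False` rather than falsiness means a payload that simply omits a field is not misread as a refusal.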
    return response.json()


def get_model_by_url(api_key: str, workspace_url: str, model_url: str):
Claude-only: [Medium] `get_model_by_url` is dead code
- File: `roboflow/adapters/rfapi.py:154-161`.
- No CLI command calls it. Either remove it or comment it as scaffolding for a follow-on PR.
The public favorite endpoint now accepts the model URL slug (roboflow#11646), so the CLI can drop the Firestore-doc-id wart.
Changes:
- star_model argument is now `model_url`, accepting either the bare slug (when -w is set / a default workspace exists) or the workspace-prefixed form `<ws>/<slug>`, the same shape as `model get`.
- rfapi.favorite_nas_model parameter renamed `model_id` → `model_url` with urllib.parse.quote() for safety, since the slug is now what appears in the path.
- Hints updated to point at models[].modelUrl instead of modelId, and the workspace fallback hint mentions the prefix form.
Tests: +2 cases for the new parsing (workspace-prefixed URL vs bare slug + -w fallback). 22/22 model handler tests pass; 36/36 across model + train.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
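The parsing described in this commit could look roughly like the sketch below. `split_model_ref` and `favorite_path` are hypothetical names and the route in `favorite_path` is an assumed shape for illustration only. What comes from the commit message is the two accepted forms (bare slug with a workspace fallback, or `<ws>/<slug>`) and the use of `urllib.parse.quote()` on the path segment.

```python
from typing import Optional, Tuple
from urllib.parse import quote


def split_model_ref(
    model_ref: str, default_workspace: Optional[str] = None
) -> Tuple[str, str]:
    """Accept '<ws>/<slug>' or a bare slug plus a workspace fallback."""
    if "/" in model_ref:
        workspace, slug = model_ref.split("/", 1)
    else:
        workspace, slug = default_workspace, model_ref
    if not workspace or not slug:
        raise ValueError("No workspace: pass -w or use the <ws>/<slug> form.")
    return workspace, slug


def favorite_path(workspace: str, slug: str) -> str:
    """Build a request path (hypothetical route) with each segment percent-encoded."""
    # safe="" also encodes "/", so a stray slash cannot add a path segment.
    return f"/{quote(workspace, safe='')}/models/{quote(slug, safe='')}/favorite"
```

For example, `split_model_ref("beer-can-hackathon-410-nas-gpu-066866", "peter-robicheaux")` resolves the bare slug against the fallback workspace.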
Mirrors the backend cleanup in roboflow#11646. The training-results fixture now uses the public shape (trainingId is workspace/project/version, models[].modelUrl, recommendedByHardware values are URL slugs). No behavior change in the CLI handler: it passes the response through.
✅ E2E verified against Branch installed via
{
"trainingId": "peter-robicheaux/beer-can-hackathon/410",
"versionId": "410",
"status": "finished",
"jobType": "nas",
"modelGroup": "pVYKOWUB6AUIVJMgPc7u-410-rfdetrNasGroup",
"modelCount": 52,
"recommendedByHardware": { "gpu": "beer-can-hackathon-410-nas-gpu-ec8a0e" },
"models": [{ "modelUrl": "beer-can-hackathon-410-nas-gpu-066866", "modelType": "rfdetr-nas", "metrics": {...} }]
}
Mirrors the wire rename in roboflow#11646. The public API field for the opaque model identifier is now `modelId` (the value is still the URL slug; that's an implementation detail callers shouldn't have to reason about).
Changes:
- `roboflow model star` argument: `model_url` → `model_id`. Help text and error hints updated to point at `models[].modelId`.
- `rfapi.favorite_nas_model(model_url=...)` → `favorite_nas_model(model_id=...)`. Internal local var becomes `public_model_id` to keep the call-site readable.
- Test fixtures: `model_url` arg → `model_id`, `models[].modelUrl` → `models[].modelId`.
36/36 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Description
Wraps the new public API routes from the agentic-surface-area onsite plan as CLI commands. Companion backend PRs (all live on staging):
New CLI commands
Plus `roboflow model list -p <project> -g/--group <modelGroup>`: the canonical "list NAS models per run" path. When `--group` is set, the list command hits the public `/models` endpoint with the enriched projection (hardware/latency/map5095/paretoOptimalFor/recommended) instead of walking versions via the SDK.
Adapter additions
In `roboflow/adapters/rfapi.py`:
- `cancel_version_training`, `stop_version_training`
- `get_training_results`
- `list_project_models` (with optional `group=`), `get_model_by_url`
- `favorite_nas_model`
How tested
Unit tests
+13 cases across `test_train_handler.py` and `test_model_handler.py` covering register, happy paths, the 409/`CANNOT_CANCEL` hint surfacing, the `MODEL_NOT_NAS` hint, the unstar flow, and the `--group` endpoint switch. 298/298 CLI tests pass locally; `ruff check` + `ruff format` clean.
`CLI-COMMANDS.md` got two new sections: "Train, monitor, cancel, stop" and "NAS models — list, star, deploy".
E2E verified live on staging
Driven against `api.roboflow.one` on `peter-robicheaux/beer-can-hackathon` with `API_URL=https://api.roboflow.one`:
`train results` on a NAS parent (v410, 52 children):
`model list --group` rendered a 53-row leaderboard:
`model star` on a real NAS child:

    $ roboflow --json -w peter-robicheaux model star 14CwSGmGetWh6rB0EnjL
    {"success": true, "model": {"id": "14CwSGmGetWh6rB0EnjL", "favorites": {...}}}
    $ roboflow --json -w peter-robicheaux model star 14CwSGmGetWh6rB0EnjL --unstar
    {"success": true, ...}

`train cancel` on a finished version (text mode shows the actionable hint):

    $ roboflow train cancel peter-robicheaux/beer-can-hackathon/318
    Error: Cannot cancel non-running train job.
    Hint: Cancel only applies to in-flight runs. Check status with 'roboflow train results <project>/<version>'.

E2E exposed one error-shape mismatch on the backend (flat `{error: "Conflict", message: ...}` lost the descriptive message through the CLI's parser); fixed in roboflow#11610 (nested `{error: {message, code, type}}`).
🤖 Generated with Claude Code
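To make the error-shape mismatch above concrete, here is a hedged sketch of a tolerant extractor that reads both shapes. The actual fix landed on the backend, so this is not the CLI's parser; `extract_error` is a hypothetical helper, and the field names follow the two shapes quoted in the description.

```python
from typing import Any, Dict, Optional, Tuple


def extract_error(body: Dict[str, Any]) -> Tuple[str, Optional[str]]:
    """Return (message, code) from either the nested or the flat error shape."""
    err = body.get("error")
    if isinstance(err, dict):
        # Nested shape: {"error": {"message": ..., "code": ..., "type": ...}}
        return err.get("message") or "Unknown error", err.get("code")
    # Flat shape: {"error": "Conflict", "message": "..."}; prefer the
    # descriptive message over the generic error label.
    label = err if isinstance(err, str) else None
    return body.get("message") or label or "Unknown error", None
```

A parser that only reads `body["error"]` as a string would report "Conflict" for the flat shape and drop the descriptive message, which matches the symptom described.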