pearl th-2b5f63: th cast models — list live model groups on configured provider #62

Merged. brentrager merged 1 commit into main from th-2b5f63-cast-models on May 22, 2026.

@brentrager
Contributor

Summary

Adds th cast models [--provider NAME] [--json] [--filter PATTERN] — hits GET /v1/models on the configured LiteLLM provider (e.g. llm.smoo.ai) and prints the alphabetized list of model groups with the gradient wordmark header and a N models on URL footer.
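
The sort, filter, and footer steps described above could look roughly like the sketch below. It is not the code from this PR: the function names are hypothetical, and it assumes `--filter` is a case-insensitive substring match, which the description does not specify.

```rust
/// Hypothetical helper: keep ids matching `pattern`, then sort alphabetically.
/// Assumption: the filter is a case-insensitive substring match.
fn sort_and_filter(mut ids: Vec<String>, pattern: Option<&str>) -> Vec<String> {
    if let Some(p) = pattern {
        let needle = p.to_lowercase();
        ids.retain(|id| id.to_lowercase().contains(&needle));
    }
    ids.sort();
    ids.dedup(); // drop duplicate ids, which are adjacent after sorting
    ids
}

/// Hypothetical helper: build the `N models on URL` footer line.
fn footer(count: usize, url: &str) -> String {
    format!("{count} models on {url}")
}
```

Calling `sort_and_filter(ids, Some("smooth"))` on the sample list would keep only the `smooth-*` groups, in alphabetical order.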

  • Default provider is the one backing the default routing slot (what th routing show highlights); --provider overrides on multi-provider setups.
  • Tolerant body parser: strips ASCII control chars (0x00-0x1F) before strict JSON parse, with a byte-scan fallback that recovers complete "id":"NAME" entries from truncated responses. When strict + lossy counts disagree, the footer surfaces a ! warning.
  • Exits 2 if no provider is configured; prints status + first 200 chars of body on non-200.
  • cmd_cast_models runs on spawn_blocking so reqwest's blocking client doesn't panic inside the tokio runtime.
  • Drive-by: adds under_test_model: None to a TuiTaskConfig literal in smooth-bench/src/main.rs (pre-existing build break on main that was blocking pnpm pre-commit-check).
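
As a rough illustration of the tolerant parser bullet, here is a minimal std-only sketch. The function names are hypothetical, the strict JSON parse is left out (its count is passed in), and the scan assumes entries appear exactly as `"id":"NAME"` with no whitespace or escaped quotes. Treating a failed strict parse as a disagreement is also an assumption.

```rust
/// Remove ASCII control bytes (0x00-0x1F) before attempting a strict parse.
fn strip_control_chars(body: &[u8]) -> Vec<u8> {
    body.iter().copied().filter(|b| *b > 0x1F).collect()
}

/// Byte-scan fallback: collect every complete `"id":"NAME"` entry.
/// An entry cut off before its closing quote is ignored.
fn lossy_ids(body: &[u8]) -> Vec<String> {
    const KEY: &[u8] = b"\"id\":\"";
    let mut ids = Vec::new();
    let mut pos = 0;
    while pos < body.len() {
        if body[pos..].starts_with(KEY) {
            let start = pos + KEY.len();
            match body[start..].iter().position(|&b| b == b'"') {
                Some(len) => {
                    let name = String::from_utf8_lossy(&body[start..start + len]);
                    ids.push(name.into_owned());
                    pos = start + len + 1;
                }
                None => break, // truncated in the middle of a name
            }
        } else {
            pos += 1;
        }
    }
    ids
}

/// Warn when the strict count is missing or differs from the lossy count.
/// Assumption: `None` means the strict parse failed outright.
fn needs_warning(strict_count: Option<usize>, lossy_count: usize) -> bool {
    strict_count != Some(lossy_count)
}
```

Scanning bytes rather than parsing means a response cut off mid-entry still yields every model id that arrived intact.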

Sample output

  Smooth cast · models
  https://llm.smoo.ai/v1

  claude-haiku-4-5
  claude-opus-4-7
  claude-sonnet-4-6
  gemini-2.5-flash
  smooth-coding
  smooth-fast
  smooth-judge
  …

  94 models on https://llm.smoo.ai/v1

Test plan

  • 12 colocated unit tests covering URL helper, control-char stripper, lossy parser, strict vs lossy reconciliation, sort + filter, and an end-to-end run against a hand-rolled TcpListener mock server (cargo test -p smooai-smooth-cli cast_models → 12 passed)
  • cargo fmt -- --check
  • cargo clippy --workspace --all-targets clean (0 errors)
  • Manual smoke test: th cast models, th cast models --filter smooth, th cast models --json, th cast models --provider nonexistent (exits 2)
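
For readers unfamiliar with the hand-rolled `TcpListener` mock server mentioned in the first item, a minimal std-only sketch follows. It is not the test code from this PR: `spawn_mock_server` and `fetch_raw` are hypothetical names, and the `{"data":[{"id":...}]}` body shape is assumed from the OpenAI-compatible models endpoint.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Serve one canned HTTP 200 response on an ephemeral local port.
fn spawn_mock_server(body: &'static str) -> SocketAddr {
    let listener = TcpListener::bind("127.0.0.1:0").expect("bind");
    let addr = listener.local_addr().expect("local_addr");
    thread::spawn(move || {
        if let Ok((mut stream, _)) = listener.accept() {
            // Read until the end of the request headers.
            let mut request = Vec::new();
            let mut buf = [0u8; 512];
            while !request.windows(4).any(|w| w == b"\r\n\r\n") {
                match stream.read(&mut buf) {
                    Ok(0) | Err(_) => break,
                    Ok(n) => request.extend_from_slice(&buf[..n]),
                }
            }
            let response = format!(
                "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\
                 Content-Length: {}\r\nConnection: close\r\n\r\n{}",
                body.len(),
                body
            );
            let _ = stream.write_all(response.as_bytes());
        } // stream is dropped here, which closes the connection
    });
    addr
}

/// Send a bare GET and return the raw response (status line, headers, body).
fn fetch_raw(addr: SocketAddr, path: &str) -> String {
    let mut stream = TcpStream::connect(addr).expect("connect");
    let request = format!("GET {path} HTTP/1.1\r\nHost: {addr}\r\nConnection: close\r\n\r\n");
    stream.write_all(request.as_bytes()).expect("write");
    let mut response = String::new();
    stream.read_to_string(&mut response).expect("read");
    response
}
```

Binding to port 0 lets the OS pick a free port, so tests using a server like this can run in parallel without colliding.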

Pearl: th-2b5f63

…d provider

Adds `th cast models [--provider NAME] [--json] [--filter PATTERN]` —
hits `GET /v1/models` on the configured LiteLLM provider (e.g.
llm.smoo.ai) and prints the alphabetized list of model groups with
the gradient wordmark header and a `N models on URL` footer.

The default provider is the one backing the `default` routing slot
(what `th routing show` highlights); `--provider` overrides on
multi-provider setups.

The body parser is tolerant: it strips ASCII control chars (0x00-
0x1F) before strict JSON parse, and a byte-scan fallback extracts
complete `"id":"NAME"` entries when the response is truncated.
When strict and lossy counts disagree, the footer surfaces a `!`
warning so deploys returning partial bodies don't fail silently.

- Exits 2 if no provider is configured.
- Prints status + first 200 chars of body on non-200.
- `cmd_cast_models` runs on a `spawn_blocking` thread so reqwest's
  blocking client doesn't panic inside the tokio runtime.

Twelve tests cover the URL helper, control-char stripper, lossy
parser, strict vs lossy reconciliation, sort + filter, and an
end-to-end run against a hand-rolled TcpListener mock server.

Also fixes a pre-existing build break in smooth-bench:
TuiTaskConfig literal was missing `under_test_model` (added field
not yet wired in), which was failing `cargo clippy --workspace
--all-targets` and blocking the pre-commit hook.

changeset-bot Bot commented May 22, 2026

🦋 Changeset detected

Latest commit: 1f6e846

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

| Name | Type |
| --- | --- |
| @smooai/smooth | Patch |

brentrager merged commit a9668dc into main on May 22, 2026 (1 of 2 checks passed).

brentrager added a commit that referenced this pull request on May 22, 2026:

Root cause for every score-tui 0/X result this session. The driver
model was being fed the FULL tmux pane snapshot — including its own
prior 'You: ...' turns — and continuing the user-side narrative by
inventing both the questions AND the agent's responses. Sessions like
~/.smooth/coding-sessions/c503794b-... showed long chains of:

  [user] Please read INSTRUCTIONS.md and tell me what it says.
  [user] Okay, I will read INSTRUCTIONS.md for you.       ← invented
  [user] Okay, I have read INSTRUCTIONS.md. It says: ...  ← invented file contents
  [user] Please read tests/acronym.rs ...
  [user] Okay, I will read tests/acronym.rs ...           ← invented
  [user] Okay, I have read tests/acronym.rs. It contains ... ← invented

Net effect: the under-test agent never got real chances to fire tools
(0 tool_role messages in 12-msg sessions; only ~2 actual assistant
turns); pytest ran against unedited workspaces; the bench reported
0/6 even for claude-sonnet-4-6 — a gold-standard tool-use model.
Total cost ~$0.03/task for claude across the matrix because the
agent was barely ever called.

Two fixes:

1. **Strip 'You:' lines from the pane before showing to driver** —
   new  collapses everything between
   each  marker and the next //
   marker. The driver now sees only the assistant's output, with
   no template to auto-complete.

2. **Tighten system prompt** with a CRITICAL rule explicitly
   forbidding first-person action narration ('Okay, I will...',
   'Okay, I have read...', 'I'll edit the file now'). Plus a new
    token so the driver has an explicit way to let the agent
   keep working instead of filling silence with hallucinated turns.
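
The first fix can be pictured with the sketch below. It is an illustration only: the function name is hypothetical, and the real marker names did not survive in the text above, so `Agent:` and `Tool:` stand in as assumed stop markers. Only `You:` comes from the commit message.

```rust
/// User-turn marker named in the commit message.
const USER_MARKER: &str = "You:";
/// Hypothetical stop markers; the original names are not recoverable here.
const STOP_MARKERS: [&str; 2] = ["Agent:", "Tool:"];

/// Drop each user turn: the `You:` line and everything up to the next stop marker.
fn strip_user_turns(pane: &str) -> String {
    let mut kept = Vec::new();
    let mut in_user_turn = false;
    for line in pane.lines() {
        let trimmed = line.trim_start();
        if trimmed.starts_with(USER_MARKER) {
            in_user_turn = true;
            continue;
        }
        if in_user_turn && STOP_MARKERS.iter().any(|m| trimmed.starts_with(m)) {
            in_user_turn = false; // a non-user turn begins; resume keeping lines
        }
        if !in_user_turn {
            kept.push(line);
        }
    }
    kept.join("\n")
}
```

With the user turns removed, the snapshot handed to the driver contains no `You:` pattern for it to continue.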

Drive-by: fix the rebase-leftover conflict marker in main.rs
(under_test_model field of TuiTaskConfig was uncleanly resolved
in PR #62's drive-by build fix).

Pearl th-driver-hallucination.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>