v0.6.0 — model listing & text token estimator
Two additive features, no core type breakage, no MAESTRO_DIVERGENCES.md rows. Adds maestro-cms as a third consumer of the toolkit alongside Maestro and Morris.
llms.ModelLister + per-provider LatestInFamily (ADR-0012)
Optional capability for listing a provider's catalog and detecting newer models in the same family as a pinned ID. Surfaces "newer model available, upgrade?" — never auto-updates.
lister, ok := client.(llms.ModelLister)
if !ok { return } // provider doesn't expose a list (e.g. future vLLM)
models, err := lister.ListModels(ctx)
if err != nil { return } // listing failed; skip the upgrade hint
newer, found := anthropic.LatestInFamily(currentID, models)
if found {
fmt.Printf("Newer model: %s (released %s)\n", newer.ID, newer.Created.Format("2006-01-02"))
}
// Or one-shot:
newer, found, err = client.LatestInFamily(ctx, currentID)

- Anthropic: family claude-{opus|sonnet|haiku}, crosses generations (claude-3-5-sonnet-… and claude-sonnet-4-5-… are both claude-sonnet). Ordered by CreatedAt.
- OpenAI: family = ID with -YYYY-MM-DD stripped. Self-filtering by family-prefix means embedding/image models in the catalog don't collide with gpt-* queries. Ordered by Created (Unix).
- Google: family gemini-{pro|flash|nano|ultra}. genai exposes no created date, so ordered by parsed numeric version in the ID.
- Ollama: ListModels only (local pulls have no canonical family). Created is local pull time, not provider release time.
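The family rules above can be pictured with a small sketch. This is not the toolkit's implementation: openAIFamily and anthropicFamily are hypothetical helper names, the regular expression and the tier scan are assumptions about one way to derive a family key, and the model IDs are sample inputs chosen only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// dateSuffix matches a trailing -YYYY-MM-DD snapshot date.
var dateSuffix = regexp.MustCompile(`-\d{4}-\d{2}-\d{2}$`)

// openAIFamily (hypothetical) returns the ID with any trailing
// -YYYY-MM-DD removed. IDs without a date suffix are their own family.
func openAIFamily(id string) string {
	return dateSuffix.ReplaceAllString(id, "")
}

// anthropicFamily (hypothetical) scans the dash-separated parts of a
// claude-* ID for a tier name and ignores the generation numbers, so
// IDs from different generations map to the same family.
func anthropicFamily(id string) (string, bool) {
	if !strings.HasPrefix(id, "claude-") {
		return "", false
	}
	for _, part := range strings.Split(id, "-") {
		switch part {
		case "opus", "sonnet", "haiku":
			return "claude-" + part, true
		}
	}
	return "", false
}

func main() {
	// Sample IDs for illustration only.
	fmt.Println(openAIFamily("gpt-4o-2024-08-06"))      // gpt-4o
	fmt.Println(openAIFamily("text-embedding-3-small")) // text-embedding-3-small

	for _, id := range []string{"claude-3-5-sonnet-20241022", "claude-sonnet-4-5-20250929"} {
		family, ok := anthropicFamily(id)
		fmt.Println(id, "->", family, ok)
	}
}
```

Because an embedding model's family key differs from any gpt-* key, a lookup keyed on the pinned ID's family never sees it, which is the self-filtering behavior the OpenAI bullet describes.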
Permissive family parsing by design. Callers wanting major-version pinning filter the list themselves.
llms.EstimateTextTokens (ADR-0013)
Exported free function for budget-aware text chunking. Zero new dependencies.
budget := llms.EstimateTextTokens(s) // approx token count
// Directly assignable as func(string) int, e.g. for maestro-cms chunk injection.

- Neutral bias (~4 chars/token, rune-counted), intentionally distinct from the middleware TokenEstimator's high bias.
- Two estimators, two purposes: over-reserving at the limiter is safe; over-estimating at chunk time wastes API calls. ADR-0013 makes the split binding.
- Tokenizer-backed / model-aware variant deferred to a future ADR when a consumer needs the fidelity.
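To make the chunking use case concrete, here is a minimal sketch of a rune-counted ~4 chars/token estimate injected into a chunker as a func(string) int. It is not the toolkit's code: estimateTokens is a local stand-in for llms.EstimateTextTokens, rounding up is an assumption (the notes only state the ratio and that runes are counted), and chunkByBudget is a hypothetical word-boundary chunker.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// estimateTokens is a local stand-in: runes divided by 4, rounded up
// (the rounding rule is an assumption made for this sketch).
func estimateTokens(s string) int {
	return (utf8.RuneCountInString(s) + 3) / 4
}

// chunkByBudget (hypothetical) packs whitespace-separated words into
// chunks whose estimate stays within budget. A single word that alone
// exceeds the budget still becomes its own chunk.
func chunkByBudget(text string, budget int, estimate func(string) int) []string {
	var chunks []string
	cur := ""
	for _, w := range strings.Fields(text) {
		next := w
		if cur != "" {
			next = cur + " " + w
		}
		if cur != "" && estimate(next) > budget {
			chunks = append(chunks, cur) // close the current chunk
			cur = w                      // start a new one with this word
			continue
		}
		cur = next
	}
	if cur != "" {
		chunks = append(chunks, cur)
	}
	return chunks
}

func main() {
	// Rune counting: 5 runes (15 bytes in UTF-8) estimate to 2 tokens.
	fmt.Println(estimateTokens("日本語です"))

	// The estimator is passed as a plain func(string) int.
	for i, c := range chunkByBudget("aaaa bbbb cccc dddd", 3, estimateTokens) {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```

Taking the estimator as a parameter keeps the chunker independent of which estimate is used, so a tokenizer-backed variant could be swapped in later without touching the chunking logic.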
Compatibility
Additive new surface throughout. Existing TokenEstimator, ChatClient, rate-limiter behavior all unchanged. govulncheck clean (bumped golang.org/x/net to v0.55.0 to resolve GO-2026-5026 along the way).
Pre-1.0; v0.x minor releases may include breaking changes.
🤖 Generated with Claude Code