Release v0.6.0 — model listing & text token estimator · SnapdragonPartners/maestro-llms

Two additive features, no core type breakage, no MAESTRO_DIVERGENCES.md rows. Adds maestro-cms as a third consumer of the toolkit alongside Maestro and Morris.

`llms.ModelLister` + per-provider `LatestInFamily` (ADR-0012)

Optional capability for listing a provider's catalog and detecting newer models in the same family as a pinned ID. Surfaces "newer model available, upgrade?" — never auto-updates.

lister, ok := client.(llms.ModelLister)
if !ok { return } // provider doesn't expose a list (e.g. future vLLM)

models, err := lister.ListModels(ctx)
if err != nil { return } // listing failed; nothing to compare against
newer, found := anthropic.LatestInFamily(currentID, models)
if found {
    fmt.Printf("Newer model: %s (released %s)\n", newer.ID, newer.Created.Format("2006-01-02"))
}

// Or one-shot:
newer, found, err = client.LatestInFamily(ctx, currentID)

Anthropic — family claude-{opus|sonnet|haiku}, crosses generations (claude-3-5-sonnet-… and claude-sonnet-4-5-… are both claude-sonnet). Ordered by CreatedAt.
OpenAI — family = ID with -YYYY-MM-DD stripped. Self-filtering by family-prefix means embedding/image models in the catalog don't collide with gpt-* queries. Ordered by Created (Unix).
Google — family gemini-{pro|flash|nano|ultra}. genai exposes no created date, so ordered by parsed numeric version in the ID.
Ollama — ListModels only (local pulls have no canonical family). Created is local pull time, not provider release time.
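As an illustration of the OpenAI rule above (family = ID with the trailing -YYYY-MM-DD stripped), here is a minimal sketch. `familyOf` is a hypothetical helper written for this note, not the library's implementation, and it assumes the date always appears as a suffix.

```go
package main

import (
	"fmt"
	"regexp"
)

// dateSuffix matches a trailing "-YYYY-MM-DD" snapshot date on a model ID.
var dateSuffix = regexp.MustCompile(`-\d{4}-\d{2}-\d{2}$`)

// familyOf is a hypothetical helper (an assumption, not the toolkit's code):
// it derives a family name by removing a trailing date suffix, if present.
func familyOf(id string) string {
	return dateSuffix.ReplaceAllString(id, "")
}

func main() {
	ids := []string{
		"gpt-4o-2024-08-06",
		"gpt-4o-mini-2024-07-18",
		"gpt-4o",
		"text-embedding-3-small",
	}
	for _, id := range ids {
		fmt.Printf("%s -> %s\n", id, familyOf(id))
	}
}
```

Because an embedding ID such as `text-embedding-3-small` maps to its own family, it never matches a query for a `gpt-*` family, which is the self-filtering behavior described above.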

Permissive family parsing by design. Callers wanting major-version pinning filter the list themselves.
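A caller that wants major-version pinning could filter the listed models before choosing the newest one. The sketch below is only an assumption about how that might look: `Model` is a local stand-in carrying just the `ID` and `Created` fields used in the example above (not the toolkit's actual type), `latestWithPrefix` is a hypothetical helper, and the model IDs are fictitious.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Model is a local stand-in for a listed model (assumption: only ID and
// Created are needed for this filter).
type Model struct {
	ID      string
	Created time.Time
}

// latestWithPrefix returns the most recently created model whose ID starts
// with prefix. The bool is false when nothing matches.
func latestWithPrefix(models []Model, prefix string) (Model, bool) {
	var best Model
	found := false
	for _, m := range models {
		if !strings.HasPrefix(m.ID, prefix) {
			continue
		}
		if !found || m.Created.After(best.Created) {
			best, found = m, true
		}
	}
	return best, found
}

// day parses a YYYY-MM-DD date and panics on malformed input (sample data only).
func day(s string) time.Time {
	t, err := time.Parse("2006-01-02", s)
	if err != nil {
		panic(err)
	}
	return t
}

// sampleModels returns fictitious catalog entries for the demonstration.
func sampleModels() []Model {
	return []Model{
		{ID: "acme-large-3-a", Created: day("2024-01-10")},
		{ID: "acme-large-4-a", Created: day("2024-06-01")},
		{ID: "acme-large-4-b", Created: day("2024-09-15")},
		{ID: "acme-large-5-a", Created: day("2025-02-01")},
	}
}

func main() {
	// Pin to major version 4: the newer "5" generation is ignored.
	if m, ok := latestWithPrefix(sampleModels(), "acme-large-4-"); ok {
		fmt.Printf("Newest within pinned major: %s (released %s)\n",
			m.ID, m.Created.Format("2006-01-02"))
	}
}
```

Including the trailing hyphen in the prefix keeps a pin on major version 4 from also matching a hypothetical version 40.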

`llms.EstimateTextTokens` (ADR-0013)

Exported free function for budget-aware text chunking. Zero new dependencies.

budget := llms.EstimateTextTokens(s) // approx token count
// Directly assignable as func(string) int — e.g. for maestro-cms chunk injection.

Neutral bias (~4 chars/token, rune-counted) — intentionally distinct from the middleware TokenEstimator's high bias.
Two estimators, two purposes: over-reserving at the limiter is safe, whereas over-estimating at chunk time yields smaller chunks than necessary and so wastes API calls. ADR-0013 makes the split binding.
Tokenizer-backed / model-aware variant deferred to a future ADR when a consumer needs the fidelity.
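To make the "~4 chars/token, rune-counted" heuristic concrete, here is a rough sketch. `approxTokens` is a hypothetical stand-in, not the body of `llms.EstimateTextTokens`, and rounding up is an assumption, since the notes do not state the rounding rule.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// approxTokens estimates a token count as roughly one token per four runes.
// Assumption: the result is rounded up; the real function may round differently.
func approxTokens(s string) int {
	return (utf8.RuneCountInString(s) + 3) / 4
}

func main() {
	// The function value is assignable wherever a func(string) int is expected,
	// e.g. as an injected estimator in a chunker.
	var estimate func(string) int = approxTokens

	fmt.Println(estimate("hello world")) // 11 runes -> 3
	fmt.Println(estimate("héllo wörld")) // still 11 runes (13 bytes) -> 3
}
```

Counting runes rather than bytes keeps the estimate stable for non-ASCII text, where a byte count would inflate the result.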

Compatibility

Additive new surface throughout. Existing TokenEstimator, ChatClient, rate-limiter behavior all unchanged. govulncheck clean (bumped golang.org/x/net to v0.55.0 to resolve GO-2026-5026 along the way).

Pre-1.0; v0.x minor versions may include breaking changes.

🤖 Generated with Claude Code
