Provider-Agnostic LLM, Ordered Item Conversations, the Weather Agent

@bw19 bw19 released this 03 Jul 19:06

v1.44.0 reworks the LLM subsystem around two ideas: a caller should not have to name a provider, and a conversation is an ordered log of typed items rather than a flat list of messages. A Chat/ChatLoop caller now passes llmapi.ProviderAny plus a capability tier — ModelFast, ModelDefault, or ModelSmart — and llm.core resolves which configured provider serves the request at runtime, so a single API key of any brand (Claude, ChatGPT, or Gemini) is enough to run every LLM example. The per-provider model-name constants are gone; models are named by tier, by provider family, or by a concrete vendor-prefixed string, and each provider builds its alias table from its own live models-list API. The conversation model is now an ordered, append-only []llmapi.Item log — message, tool-call, tool-result, or reasoning — the neutral shape every provider translates to and from its native wire format; the ChatGPT and LiteLLM providers now speak the OpenAI Responses API. Alongside the subsystem work, the release ships the weather.example agent — the canonical "how to build an agent" example, an agent modeled as a workflow — adds a caller-facing reasoning Effort option, standardizes on "reasoning" over "thinking" in the token accounting, and hardens actor-token verification on the receive path. The upgrade skill performs the config-, const-, and field-renames mechanically and guides the []Item migration.

Highlights

  • Provider-agnostic LLM calls. Pass llmapi.ProviderAny (an empty or "any" provider) plus a tier alias (ModelFast/ModelDefault/ModelSmart), a provider-family alias, or a concrete model, and llm.core resolves the provider at runtime via a new OnResolveProvider outbound event. One key of any brand works; ChatOut.ResolvedProvider lets a caller pin the resolution for stickiness.
  • Live model-alias resolution. Each provider builds its tier/family alias table from its own /v1/models API — an eager startup warm, a 6-hour refresh ticker, and a lazy first-use fetch — instead of shipped constants. The per-provider model-name consts are removed.
  • Ordered []Item conversation model. The flat []llmapi.Message list is replaced by an append-only []llmapi.Item log (message / tool-call / tool-result / reasoning), mirroring the OpenAI Responses "items" shape. It represents interleaved reasoning and tool-call ordering that the flat model could not.
  • ChatGPT and LiteLLM speak the Responses API. Both providers now use OpenAI's /v1/responses instead of Chat Completions; the Turn bus contract is unchanged, so nothing downstream is affected.
  • Reasoning Effort option. A new Effort string on ChatOptions/TurnOptions forwards verbatim to each provider's native reasoning field. Usage.ThinkingTokens is renamed Usage.ReasoningTokens.
  • The weather.example agent. A port of the Pydantic AI weather-agent example: an AskAgent workflow whose single task runs llm.core's ChatLoop as a subgraph, exposing two tools the model chains to answer natural-language weather questions.
  • Provider endpoint-config renames. CompletionURL becomes MessagesURL (claudellm), ModelsURL (geminillm), or ResponsesURL (chatgptllm, litellm); a new ModelsURL config drives alias resolution.
  • Actor tokens are verified on presence. Any present actor token is now signature-verified regardless of requiredClaims, closing a path where an in-handler IfActor/ParseActor read could trust a forged actor.
  • RAD scaffolding. genservice now scaffolds a compiling handler stub and a per-feature test stub for every declared feature, and syncs feature godoc to the handler; a new cmd/geninit absorbs project bootstrap.
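The verify-on-presence rule for actor tokens can be illustrated with a small sketch. This is not the framework's code: the ActorToken type, the detached-signature layout, and the checkActor, signActor, and newDemoKey helpers are all hypothetical, and only the rule itself (a present token is always signature-verified, whether or not claims are required) comes from the notes above.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"errors"
	"fmt"
)

// ActorToken is a hypothetical token: a payload plus a detached signature.
type ActorToken struct {
	Payload   []byte
	Signature []byte
}

var (
	errForged  = errors.New("actor token signature is invalid")
	errMissing = errors.New("actor token is required but absent")
)

// checkActor verifies any token that is present, whether or not the endpoint
// requires claims. Only an absent token is subject to the "required" check.
func checkActor(tok *ActorToken, pub ed25519.PublicKey, claimsRequired bool) error {
	if tok == nil {
		if claimsRequired {
			return errMissing
		}
		return nil
	}
	if !ed25519.Verify(pub, tok.Payload, tok.Signature) {
		return errForged
	}
	return nil
}

// signActor is a demo helper that mints a signed token.
func signActor(priv ed25519.PrivateKey, payload string) *ActorToken {
	return &ActorToken{
		Payload:   []byte(payload),
		Signature: ed25519.Sign(priv, []byte(payload)),
	}
}

// newDemoKey generates a throwaway key pair for the demo.
func newDemoKey() (ed25519.PublicKey, ed25519.PrivateKey) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	return pub, priv
}

func main() {
	pub, priv := newDemoKey()
	good := signActor(priv, `{"sub":"alice"}`)
	// A forged token reuses a valid signature over a different payload.
	forged := &ActorToken{Payload: []byte(`{"sub":"admin"}`), Signature: good.Signature}

	fmt.Println(checkActor(good, pub, false))   // <nil>
	fmt.Println(checkActor(forged, pub, false)) // rejected even though no claims are required
	fmt.Println(checkActor(nil, pub, false))    // <nil>
}
```

The point of the sketch is the second call: before this change, an endpoint with no required claims could have skipped verification and let a handler read the forged actor.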

LLM Provider Portability

A caller no longer hardcodes which LLM microservice answers. It passes llmapi.ProviderAny — an empty string or "any" — together with a capability-tier alias, and llm.core picks a configured provider that serves the model:

result, _, _, err := llmapi.NewClient(svc).Chat(ctx, llmapi.ProviderAny, llmapi.ModelDefault, items, tools, nil)

The three global tiers are ModelFast, ModelDefault, and ModelSmart. Resolution runs through a new OnResolveProvider(model) outbound event: each provider sinks it and answers ok = configured && serves(model); llm.core reads the verified host from a responder's frame, picks one, and returns 503 if none is configured — there is no silent fallback to a simulated provider. Resolution happens once per Chat call, and once in the ChatLoop workflow via its persisted InitChat step so a retry does not re-resolve. ChatOut.ResolvedProvider reports which provider answered, so a caller that wants turn-to-turn stickiness can pin it.

Explicit provider selection still works: pass a concrete hostname such as "chatbox.example". It remains the only way to reach a provider that is walled off from "any" resolution.
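The resolution rule described in this section can be sketched in isolation. The Provider type, the hostnames, the lowercase tier strings, and the first-responder selection policy below are assumptions for illustration; only the rule ok = configured && serves(model), the explicit-host bypass, and the 503 on no responder come from the notes.

```go
package main

import (
	"errors"
	"fmt"
)

// Provider is a hypothetical stand-in for one LLM provider microservice.
type Provider struct {
	Host       string
	Configured bool            // for example, an API key is set
	Models     map[string]bool // models and aliases this provider serves
}

// answers mirrors the rule from the notes: ok = configured && serves(model).
func (p Provider) answers(model string) bool {
	return p.Configured && p.Models[model]
}

// ErrNoProvider stands in for the 503 returned when no provider responds.
var ErrNoProvider = errors.New("503: no configured provider serves the model")

// resolveProvider returns an explicit host unchanged; for "" or "any" it
// returns the first provider that answers ok. Picking the first responder is
// an assumption of this sketch.
func resolveProvider(providers []Provider, requested, model string) (string, error) {
	if requested != "" && requested != "any" {
		return requested, nil // explicit selection bypasses resolution
	}
	for _, p := range providers {
		if p.answers(model) {
			return p.Host, nil
		}
	}
	return "", ErrNoProvider // no silent fallback
}

func main() {
	providers := []Provider{
		{Host: "one.llm.example", Configured: false, Models: map[string]bool{"default": true}},
		{Host: "two.llm.example", Configured: true, Models: map[string]bool{"default": true}},
	}
	fmt.Println(resolveProvider(providers, "", "default"))                // two.llm.example <nil>
	fmt.Println(resolveProvider(providers, "any", "smart"))               // error: nobody serves "smart"
	fmt.Println(resolveProvider(providers, "chatbox.example", "default")) // chatbox.example <nil>
}
```

A caller that wants stickiness would feed the host returned on the first turn back in as the explicit host on later turns, which is the role ChatOut.ResolvedProvider plays.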

Live Model-Alias Resolution, No Model Constants

The per-provider model-name constants (claudellmapi/chatgptllmapi/geminillmapi models.go) are removed. A model is named three ways: a global tier alias, a provider-family alias, or a concrete vendor-prefixed model string. Each provider builds its alias table from its own live models-list API rather than a shipped default, populated three ways — an eager OnStartup warm, a 6-hour RefreshModels ticker, and a lazy fetch on first alias resolve. Until a fetch succeeds, only concrete (vendor-prefixed) names resolve; a concrete name never depends on the fetch. Per-family "latest" access is namespaced (claude-<family>-latest, gpt-*-latest, Gemini's real gemini-*-latest pointers). LiteLLM treats each operator-defined model_name in the proxy's /v1/models as an alias, so the portable tiers work when the operator names entries fast/default/smart.
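A minimal sketch of the lazy part of alias resolution, under stated assumptions: the aliasTable type, its fetch callback, the "acme-" vendor prefix, and the model names are hypothetical, and the prefix check is a simplified way to tell a concrete name from an alias. The startup warm and the 6-hour ticker are omitted; they would call the same fetch.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
)

var errModelsUnavailable = errors.New("models list unavailable")

// aliasTable maps aliases (tiers, family names) to concrete model IDs.
type aliasTable struct {
	mu           sync.Mutex
	aliases      map[string]string                 // nil until a fetch succeeds
	fetch        func() (map[string]string, error) // hypothetical models-list call
	vendorPrefix string                            // marks a concrete model name
}

// resolve tries the alias table first, fetching lazily on first use.
// A concrete vendor-prefixed name passes through without needing the fetch.
func (t *aliasTable) resolve(name string) (string, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.aliases == nil {
		if m, err := t.fetch(); err == nil {
			t.aliases = m
		}
	}
	if concrete, ok := t.aliases[name]; ok { // reading a nil map is safe
		return concrete, nil
	}
	if strings.HasPrefix(name, t.vendorPrefix) {
		return name, nil
	}
	return "", fmt.Errorf("alias %q cannot be resolved yet", name)
}

func main() {
	calls := 0
	t := &aliasTable{
		vendorPrefix: "acme-",
		fetch: func() (map[string]string, error) {
			calls++
			if calls <= 2 { // simulate the models API being down at first
				return nil, errModelsUnavailable
			}
			return map[string]string{"smart": "acme-large-1"}, nil
		},
	}
	fmt.Println(t.resolve("smart"))        // fails: no table yet
	fmt.Println(t.resolve("acme-small-1")) // concrete name resolves anyway
	fmt.Println(t.resolve("smart"))        // fetch succeeds: acme-large-1 <nil>
}
```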

Ordered Item Conversation Model

The conversation is now an ordered, append-only []llmapi.Item log instead of a flat []llmapi.Message. An Item is exactly one of: a message, a tool call, a tool result, or a reasoning block. This is the neutral shape each provider translates to and from its native wire format, and it can represent interleaved reasoning/tool-call ordering and opaque provider reasoning payloads that the flat model could not.

Constructors wrap a typed value into an Item with AsItem():

items := []llmapi.Item{
    llmapi.NewMessage("system", "You are a helpful weather assistant…").AsItem(),
    llmapi.NewMessage("user", question).AsItem(),
}
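To make the "exactly one of" shape concrete, here is a self-contained sketch of a one-of item type. It is not the llmapi definition: the field names, the Kind and Valid helpers, and the payload structs are assumptions; only the four variants and the AsItem wrapping idea come from the notes.

```go
package main

import "fmt"

// Hypothetical payload types; the real llmapi types may differ.
type Message struct{ Role, Content string }
type ToolCall struct{ ID, Name, Arguments string }
type ToolResult struct{ CallID, Output string }
type Reasoning struct{ Opaque string } // opaque provider payload

// Item carries exactly one non-nil variant.
type Item struct {
	Message    *Message
	ToolCall   *ToolCall
	ToolResult *ToolResult
	Reasoning  *Reasoning
}

func (m Message) AsItem() Item    { return Item{Message: &m} }
func (c ToolCall) AsItem() Item   { return Item{ToolCall: &c} }
func (r ToolResult) AsItem() Item { return Item{ToolResult: &r} }
func (r Reasoning) AsItem() Item  { return Item{Reasoning: &r} }

// Kind names the single populated variant, or "invalid" if zero or several are set.
func (it Item) Kind() string {
	kind, n := "invalid", 0
	if it.Message != nil {
		kind, n = "message", n+1
	}
	if it.ToolCall != nil {
		kind, n = "tool-call", n+1
	}
	if it.ToolResult != nil {
		kind, n = "tool-result", n+1
	}
	if it.Reasoning != nil {
		kind, n = "reasoning", n+1
	}
	if n != 1 {
		return "invalid"
	}
	return kind
}

// Valid reports whether exactly one variant is set.
func (it Item) Valid() bool { return it.Kind() != "invalid" }

func main() {
	// An ordered log can interleave reasoning and tool traffic between messages.
	log := []Item{
		Message{Role: "user", Content: "Weather in Paris?"}.AsItem(),
		Reasoning{Opaque: "..."}.AsItem(),
		ToolCall{ID: "c1", Name: "LatLng", Arguments: `{"place":"Paris"}`}.AsItem(),
		ToolResult{CallID: "c1", Output: `{"lat":48.86,"lng":2.35}`}.AsItem(),
	}
	for i, it := range log {
		fmt.Println(i, it.Kind())
	}
}
```

A flat message list has no natural slot for the reasoning entry at index 1; in the ordered log its position relative to the tool call is preserved.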

Chat takes and returns []Item — the full conversation (input items plus produced items), matching ChatOut's contract and the error-resume pattern, so a caller that hits an error mid-loop can resume from the returned items rather than restart. Inside ChatLoop, CallLLM is the sole owner of the items key (written as a plain replace each turn), and the only append-reduced key is toolResults: each ExecuteTool branch contributes one result, the fan-in concatenates them, and the next CallLLM folds them into the conversation — removing the fragile "write only a delta" rule the old items-reducer design forced on every tool branch.
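The ownership split between items and toolResults can be modeled with a toy state machine. Everything here is a simplified assumption: the State struct, the string-based items, and the fanOut and callLLM functions stand in for workflow steps and are not the llm.core implementation. What the sketch preserves is that tool branches only ever contribute a result, and a single step rewrites the conversation.

```go
package main

import (
	"fmt"
	"sync"
)

type ToolCall struct{ ID, Name string }
type ToolResult struct{ CallID, Output string }

// State is a hypothetical slice of the workflow state.
type State struct {
	Items       []string     // owned by the CallLLM step, replaced wholesale
	ToolResults []ToolResult // append-reduced across tool branches
}

// executeTool is one fan-out branch; it contributes exactly one result.
func executeTool(c ToolCall) ToolResult {
	return ToolResult{CallID: c.ID, Output: "result of " + c.Name}
}

// fanOut runs one branch per call concurrently and concatenates the results
// in call order, which plays the role of the fan-in.
func fanOut(calls []ToolCall) []ToolResult {
	results := make([]ToolResult, len(calls))
	var wg sync.WaitGroup
	for i, c := range calls {
		wg.Add(1)
		go func(i int, c ToolCall) {
			defer wg.Done()
			results[i] = executeTool(c) // each branch writes only its own slot
		}(i, c)
	}
	wg.Wait()
	return results
}

// callLLM folds pending tool results into the conversation, writes the items
// as a plain replace, and leaves the results empty for the next round.
func callLLM(s State) State {
	items := append([]string{}, s.Items...)
	for _, r := range s.ToolResults {
		items = append(items, "tool-result["+r.CallID+"]: "+r.Output)
	}
	return State{Items: items}
}

func main() {
	s := State{Items: []string{"user: compare Paris and Rome"}}
	calls := []ToolCall{{ID: "c1", Name: "LatLng"}, {ID: "c2", Name: "LatLng"}}

	s.ToolResults = append(s.ToolResults, fanOut(calls)...) // append-reduce
	s = callLLM(s)
	for _, it := range s.Items {
		fmt.Println(it)
	}
}
```

Because no branch touches Items, a branch cannot corrupt the conversation by writing a full copy where a delta was expected.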

The ChatGPT and LiteLLM Providers Move to the Responses API

Both OpenAI-shaped providers now speak the Responses API (/v1/responses) instead of Chat Completions: the system message folds into instructions, assistant tool calls become function_call items, tool results become function_call_output items correlated by call_id, and reasoning replay rides on encrypted_content gated by observed reasoning tokens. The Turn bus contract (the llmapi types) is unchanged, so nothing downstream of the providers is affected. A latent LiteLLM bug — omitting stopReason from its Turn, which made llm.core read every turn as StopReasonUnknown — is fixed in the same pass.
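The translation into Responses API input can be sketched as follows. The neutral Item struct here is a flattened, hypothetical stand-in for llmapi.Item, and the model name is a placeholder. The wire field names (instructions, input, function_call, function_call_output, call_id) follow the public Responses API; reasoning replay via encrypted_content is left out of this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Item is a flattened, hypothetical neutral item used only by this sketch.
type Item struct {
	Kind      string // "message", "tool-call", or "tool-result"
	Role      string // for messages
	Text      string // for messages
	CallID    string // for tool calls and results
	Name      string // for tool calls
	Arguments string // JSON-encoded arguments, for tool calls
	Output    string // for tool results
}

type responsesRequest struct {
	Model        string           `json:"model"`
	Instructions string           `json:"instructions,omitempty"`
	Input        []map[string]any `json:"input"`
}

// toResponses folds system messages into instructions and maps the remaining
// items onto Responses API input items.
func toResponses(model string, items []Item) responsesRequest {
	req := responsesRequest{Model: model, Input: []map[string]any{}}
	for _, it := range items {
		switch it.Kind {
		case "message":
			if it.Role == "system" {
				if req.Instructions != "" {
					req.Instructions += "\n"
				}
				req.Instructions += it.Text
				continue
			}
			req.Input = append(req.Input, map[string]any{
				"type": "message", "role": it.Role, "content": it.Text,
			})
		case "tool-call":
			req.Input = append(req.Input, map[string]any{
				"type": "function_call", "call_id": it.CallID,
				"name": it.Name, "arguments": it.Arguments,
			})
		case "tool-result":
			// Correlated with its call through call_id.
			req.Input = append(req.Input, map[string]any{
				"type": "function_call_output", "call_id": it.CallID,
				"output": it.Output,
			})
		}
	}
	return req
}

func main() {
	req := toResponses("placeholder-model", []Item{
		{Kind: "message", Role: "system", Text: "You are a helpful weather assistant."},
		{Kind: "message", Role: "user", Text: "Weather in Paris?"},
		{Kind: "tool-call", CallID: "c1", Name: "LatLng", Arguments: `{"place":"Paris"}`},
		{Kind: "tool-result", CallID: "c1", Output: `{"lat":48.86,"lng":2.35}`},
	})
	b, _ := json.MarshalIndent(req, "", "  ")
	fmt.Println(string(b))
}
```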

Reasoning Effort and Reasoning Tokens

A new Effort string on ChatOptions/TurnOptions is forwarded through llm.core into each provider's native reasoning field verbatim, with no normalization: Anthropic output_config.effort, OpenAI/LiteLLM reasoning.effort (gated on the reasoning-model check), Gemini thinkingConfig.thinkingLevel. An unsupported value returns the provider's own 400 by design. To standardize on "reasoning" as the canonical neutral term, Usage.ThinkingTokens is renamed Usage.ReasoningTokens (JSON reasoningTokens). Separately, claudellm now defaults max_tokens to the resolved model's own output cap (captured from the same models-list fetch) instead of a hardcoded 4096 that truncated long responses; a caller-supplied MaxTokens still wins.
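As a rough illustration of "forwarded verbatim", the sketch below builds the request fragment each provider family would receive. The family labels ("anthropic", "openai", "gemini"), the function names, and the 64000 cap are hypothetical, and where each fragment nests inside the full request body is not shown. The field paths and the max-tokens precedence follow the paragraph above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// effortFragment places the caller's effort string, unmodified, under the
// provider family's native reasoning field. An empty effort adds nothing.
func effortFragment(family, effort string) map[string]any {
	if effort == "" {
		return nil
	}
	switch family {
	case "anthropic":
		return map[string]any{"output_config": map[string]any{"effort": effort}}
	case "openai": // also the LiteLLM shape
		return map[string]any{"reasoning": map[string]any{"effort": effort}}
	case "gemini":
		return map[string]any{"thinkingConfig": map[string]any{"thinkingLevel": effort}}
	}
	return nil
}

// resolveMaxTokens lets a caller-supplied value win; otherwise it falls back
// to the resolved model's own output cap.
func resolveMaxTokens(callerMax, modelOutputCap int) int {
	if callerMax > 0 {
		return callerMax
	}
	return modelOutputCap
}

func main() {
	for _, family := range []string{"anthropic", "openai", "gemini"} {
		b, _ := json.Marshal(effortFragment(family, "high"))
		fmt.Println(family, string(b))
	}
	fmt.Println(resolveMaxTokens(0, 64000), resolveMaxTokens(1024, 64000)) // 64000 1024
}
```

No value is validated or translated on the way through, so an effort string one provider accepts may draw a 400 from another.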

The Weather Agent Example

weather.example is the suite's canonical answer to "how do I build an agent?" — a port of the Pydantic AI weather-agent example. It answers natural-language weather questions by exposing two of its own endpoints as LLM tools and letting the model chain them: LatLng geocodes a place name, Forecast returns that coordinate's conditions, both returning deterministic mock data so the example needs no third-party weather account.
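One way to get deterministic mock data without any external service is to derive it from a hash of the input. The sketch below is an assumption about how such tools could be written, not the example's actual code: the function signatures, value ranges, and summaries are invented for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// hash32 returns a stable 32-bit hash of s.
func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// LatLng is a hypothetical geocoder: the same place name always maps to the
// same made-up coordinate.
func LatLng(place string) (lat, lng float64) {
	sum := hash32(strings.ToLower(strings.TrimSpace(place)))
	lat = float64(sum%18001)/100 - 90          // -90.00 .. 90.00
	lng = float64((sum/18001)%36001)/100 - 180 // -180.00 .. 180.00
	return lat, lng
}

// Forecast is a hypothetical forecaster: the same coordinate always yields
// the same made-up conditions.
func Forecast(lat, lng float64) (tempC int, summary string) {
	sum := hash32(fmt.Sprintf("%.2f,%.2f", lat, lng))
	summaries := []string{"sunny", "cloudy", "rainy", "windy"}
	tempC = int(sum%45) - 10 // -10 .. 34
	summary = summaries[(sum/45)%uint32(len(summaries))]
	return tempC, summary
}

func main() {
	// The model would chain these two tools: place name -> coordinate -> conditions.
	lat, lng := LatLng("London")
	tempC, summary := Forecast(lat, lng)
	fmt.Printf("London (%.2f, %.2f): %d°C, %s\n", lat, lng, tempC, summary)
}
```

Determinism keeps tests and demos repeatable: the same question produces the same tool outputs on every run.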

The teaching point is the shape: the agent is the AskAgent workflow, not a function, because agents in Microbus are workflows. AskAgent is a single-node graph whose one task, Answer, runs llm.core's ChatLoop as a subgraph — one node, but real depth, since ChatLoop is itself a multi-step workflow. A synchronous Ask function runs the identical loop in-process via a single Chat call to give the guided tour one browser-clickable URL. The example takes no foremanapi dependency: a microservice that provides a workflow must not depend on the execution engine to run it. The full walkthrough is in the weather package reference. Running it end-to-end needs one real LLM provider key; the tests mock ChatLoop.

Other Changes

  • examples/ is renamed exampleservices/. The example-microservice directory and its Go import paths change from .../fabric/examples/<name> to .../fabric/exampleservices/<name>. The .example hostnames are unchanged — this is a directory and import-path rename only.
  • cmd/geninit. A new generator absorbs the mechanical project scaffolding the init-project skill previously drove by hand — main.go, config/env files, CLAUDE.md, .gitignore, VS Code launch, and .claude/settings.json — and mints the Ed25519 token-signing key in Go (no openssl dependency). It is idempotent, so it is safe to re-run.
  • genservice scaffolds handler and test stubs. For every declared feature that lacks one, genservice now emits a compiling handler stub in service.go (signature and godoc projected from definition.go, body a TODO plus a naked return) and a per-feature integration-test stub in service_test.go. Both are append-only and keyed off the handler/test name, so a covered directory regenerates to no change. It also syncs each feature's godoc onto its service.go handler.
  • foreman NumShards is startup-only. Sharding is a heavyweight topology change (it opens and migrates database instances), so NumShards is applied once in OnStartup and a change now takes effect only on restart rather than hot-reloading on a live config edit. Growth-only semantics still apply at startup.
  • OpenAPI body-level descriptions. A jsonschema_description tag on an HTTPRequestBody/HTTPResponseBody magic field is now rendered onto the request/response node — the only way to describe a non-struct body such as []string.

Breaking Changes

The upgrade skill handles each of these — the config, const, and field renames are mechanical; the []Item migration is grep-guided.

  • The []llmapi.Message conversation model is replaced by []llmapi.Item. Chat, ChatLoop, and a provider's Turn now take and return []Item. Wrap a constructed message, tool call, tool result, or reasoning block into an item with AsItem(). The AppendItems helper is replaced by AsItem.
  • The per-provider model-name constants are removed. The claudellmapi/chatgptllmapi/geminillmapi models.go consts no longer exist. Name a model by tier (ModelFast/ModelDefault/ModelSmart), by provider family, or by a concrete vendor-prefixed string.
  • Usage.ThinkingTokens is renamed Usage.ReasoningTokens (JSON reasoningTokens).
  • The provider endpoint-config CompletionURL is renamed: MessagesURL on claudellm, ModelsURL on geminillm, and ResponsesURL on chatgptllm and litellm. Default URLs are unchanged.
  • The examples/ import path is renamed exampleservices/. Update any github.com/microbus-io/fabric/examples/<name> import to .../exampleservices/<name>.
  • foreman NumShards is no longer hot-reloadable. A change to it takes effect on restart, not on a live config edit.

Migration

From inside a Microbus project, ask Claude Code to upgrade Microbus:

Get the latest version of Microbus.

The upgrade bumps go.mod to v1.44.0 and runs the versioned upgrade-v1-44-0 routine. The routine applies the pure-shell renames (CompletionURL → MessagesURL/ModelsURL/ResponsesURL, the removed per-provider model consts, and Usage.ThinkingTokens → ReasoningTokens) and grep-guides the []Message → ordered []Item conversation rewrite, which is a judgment call rather than a mechanical substitution. The orchestrator then regenerates every microservice with genservice, runs go mod tidy && go vet ./... && go test ./..., and you review the diff.

Documentation