Releases: skundu42/adk-rs
v0.6.0
Highlights
- Token-level streaming: RunConfig { streaming_mode: StreamingMode::Sse } now streams real token deltas through the
agent loop as partial events, aggregated into one persisted final event. Previously the setting was accepted but
ignored.
- Lifecycle callbacks are live. All eight hooks (before/after_agent, before/after_model, on_model_error,
before/after_tool, on_tool_error) are registrable on LlmAgentBuilder and fire in the run loop: rewrite requests,
short-circuit calls, recover from errors.
- Agent transfer that just works: declaring .sub_agent(...) now auto-registers the transfer_to_agent tool and
advertises sub-agents to the model. Transfer resolves through the invocation's agent-tree root, so siblings and
ancestors are reachable. A hallucinated agent name returns a recoverable tool error instead of aborting the run.
- Anthropic extended thinking: thinking_config.thinking_budget maps to the Messages API thinking parameter; thinking
blocks round-trip with their cryptographic signature, redacted_thinking payloads are preserved, and streaming handles
thinking_delta/signature_delta. Gemini thoughtSignature (including on function calls) now survives round-trips too.
- Python ADK eval-set wire compatibility: eval files written by Python's adk eval now load unmodified
(eval_set_id/eval_id, optional final_response, (author, parts) intermediate responses, session_input fixtures), and
session_input.state actually seeds the eval session.
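The streaming highlight above can be pictured with a small sketch: deltas become partial events, and one aggregated final event is the only thing persisted. The SketchEvent type and both function names are hypothetical stand-ins for illustration, not the adk-rs event API.

```rust
/// Hypothetical event shape for illustration; not the adk-rs `Event` type.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchEvent {
    pub text: String,
    /// `true` for streamed deltas, `false` for the aggregated final event.
    pub partial: bool,
}

/// Turns a sequence of token deltas into partial events followed by one
/// final event whose text is the concatenation of every delta.
pub fn aggregate_deltas(deltas: &[&str]) -> Vec<SketchEvent> {
    let mut events = Vec::with_capacity(deltas.len() + 1);
    let mut full = String::new();
    for delta in deltas {
        full.push_str(delta);
        events.push(SketchEvent {
            text: (*delta).to_string(),
            partial: true,
        });
    }
    events.push(SketchEvent {
        text: full,
        partial: false,
    });
    events
}

/// In this sketch only non-partial events would reach the session store.
pub fn persisted(events: &[SketchEvent]) -> Vec<&SketchEvent> {
    events.iter().filter(|e| !e.partial).collect()
}
```

A consumer rendering live output would read the partial events, while session history only ever sees the final one.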
Fixed
- Tool-written state was silently discarded. ToolContext::state_delta / artifact_delta now land on the tool-response
event's actions and persist; skip_summarization ends the turn with the tool response as the final answer.
- Streams died at 60 seconds. The provider HTTP clients no longer apply a total timeout to SSE bodies; timeout now
governs non-streaming requests only (streams get a 10 s connect timeout).
- Concurrent invocations on one session lost events (in-memory backend): the store now merges appended events instead
of replacing the whole session per writer.
- Circular $refs crashed the process. Both the OpenAPI spec converter and the #[tool] schema converter now return an
error at nesting depth 64 instead of overflowing the stack.
- #[tool] failed to compile downstream unless the user also depended on async-trait/schemars/serde_json; the expansion
now goes through adk_rs::__private re-exports. A trybuild suite pins the misuse diagnostics.
- Forward-compatibility: unknown Gemini finishReason/harm-category values and unknown Anthropic content-block types no
longer fail the whole response; refusal maps to Safety and pause_turn to Stop on both parse paths. Added
FinishReason::ImageSafety / UnexpectedToolCall.
- Eval scoring: multi-turn cases report the average across invocations (per-invocation scores in
details.per_invocation) instead of silently keeping only the last turn; ResponseMatch matches whole tokens ("cat" no
longer matches "concatenate"); metric defaults are strict (ResponseMatch 0.8, TrajectoryMatch 1.0) instead of
pass-everything 0.0.
- Anthropic non-streaming refusal responses, OpenAI/Azure quirks, A2A push-notification docs, and various stale doc
comments.
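To illustrate the eval-scoring fix, here is a minimal sketch of whole-token matching and per-invocation averaging. It assumes a simple recall-style score over lowercased alphanumeric tokens; the actual ResponseMatch metric in adk-rs may be computed differently, and both function names are hypothetical.

```rust
use std::collections::HashSet;

/// Splits on non-alphanumeric characters and lowercases each token.
fn tokens(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Fraction of expected tokens that appear as whole tokens in `actual`.
/// Returns 1.0 when `expected` contains no tokens.
pub fn whole_token_recall(expected: &str, actual: &str) -> f64 {
    let expected_tokens = tokens(expected);
    if expected_tokens.is_empty() {
        return 1.0;
    }
    let actual_set: HashSet<String> = tokens(actual).into_iter().collect();
    let hits = expected_tokens
        .iter()
        .filter(|t| actual_set.contains(*t))
        .count();
    hits as f64 / expected_tokens.len() as f64
}

/// Mean of per-invocation scores; 0.0 for an empty slice.
pub fn average(scores: &[f64]) -> f64 {
    if scores.is_empty() {
        return 0.0;
    }
    scores.iter().sum::<f64>() / scores.len() as f64
}
```

Because matching works on token sets rather than substrings, "cat" scores zero against "concatenate", and a half-matching answer falls below a 0.8 threshold.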
Full Changelog: v0.5.0...v0.6.0
v0.5.0
- Native SSE streaming for Anthropic & OpenAI: the single-shot fallback is
gone. Text/thinking deltas stream as partial chunks, tool-call arguments are
reassembled across fragments, and the final chunk carries the stop reason and
usage. All three providers now share one chunk contract.
- Multimodal input for Anthropic & OpenAI: inline images map to base64
image blocks / data: URI image_url parts, inline PDFs to Anthropic
document blocks, https:// file references to URL sources. Unsupported
parts are dropped with a warning, never silently.
- Anthropic prompt caching: the same ContextCacheConfig that drives
Gemini's explicit cache becomes a cache_control breakpoint on the system
block (or last tool); ttl_seconds >= 3600 selects the 1-hour tier. Cache
activity surfaces via cache_metadata and cached_content_token_count.
- Reasoning-model fix: max_output_tokens is sent as
max_completion_tokens for o-series / gpt-5 models (which reject the
deprecated max_tokens), keeping older OpenAI-compatible servers on the
legacy field.
- examples/compat_check: live wire-compatibility smoke test against the
real Anthropic and OpenAI APIs (generation, streaming, tools, structured
output, images, caching). Run it with your keys; exits non-zero on failure.
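The reasoning-model fix amounts to choosing a request field by model name. A rough sketch follows; the name-matching heuristic is an assumption made for illustration, and the detection adk-rs actually uses may differ.

```rust
/// Heuristic for illustration only: treats names shaped like `o<digit>...`
/// and `gpt-5...` as reasoning models. This is an assumption, not the
/// crate's real detection logic.
pub fn is_reasoning_model(model: &str) -> bool {
    let m = model.to_ascii_lowercase();
    let o_series = m
        .strip_prefix('o')
        .and_then(|rest| rest.chars().next())
        .map_or(false, |c| c.is_ascii_digit());
    o_series || m.starts_with("gpt-5")
}

/// Picks the JSON field that should carry the output-token limit.
pub fn max_tokens_field(model: &str) -> &'static str {
    if is_reasoning_model(model) {
        "max_completion_tokens"
    } else {
        // Older OpenAI-compatible servers stay on the legacy field.
        "max_tokens"
    }
}
```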
Full Changelog: v0.4.0...v0.5.0
v0.4.0
🔁 Automatic provider retries
All provider clients (Gemini, Anthropic, OpenAi) — including the new embedding
clients — now retry transient failures with exponential backoff and full jitter:
- Retries 429 (honouring Retry-After up to 60s), 408/409/5xx, and
connect/timeout transport errors. Other 4xx fail immediately.
- Defaults mirror the official SDKs: 2 retries, 500ms initial backoff, 8s cap.
- Tune or opt out per client via the new retry: RetryConfig field on each provider
config (RetryConfig::disabled() restores the old fail-fast behaviour).
- For streaming calls, retry covers connection establishment; mid-stream failures
still surface as stream errors.
→ https://adk-rs.vercel.app/docs/providers
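The retry policy above can be sketched with the standard library alone. BackoffSketch and its methods are hypothetical names, not the RetryConfig API; the random sample is passed in by the caller so the example needs no external crate, and capping Retry-After at 60 s is one reading of "up to 60s".

```rust
use std::time::Duration;

/// Hypothetical mirror of the documented defaults; not the real RetryConfig.
pub struct BackoffSketch {
    pub max_retries: u32,
    pub initial: Duration,
    pub cap: Duration,
}

impl Default for BackoffSketch {
    fn default() -> Self {
        Self {
            max_retries: 2,
            initial: Duration::from_millis(500),
            cap: Duration::from_secs(8),
        }
    }
}

impl BackoffSketch {
    /// Full jitter: a delay drawn from [0, min(cap, initial * 2^attempt)].
    /// `unit` is a caller-supplied random sample in [0.0, 1.0].
    pub fn delay(&self, attempt: u32, unit: f64) -> Duration {
        let exponential = self.initial.saturating_mul(2u32.saturating_pow(attempt));
        let ceiling = exponential.min(self.cap);
        ceiling.mul_f64(unit.clamp(0.0, 1.0))
    }
}

/// Statuses the notes list as retryable: 408, 409, 429 and any 5xx.
pub fn is_retryable(status: u16) -> bool {
    matches!(status, 408 | 409 | 429) || (500..=599).contains(&status)
}

/// Assumption: a server-provided Retry-After hint is capped at 60 seconds.
pub fn retry_after_delay(seconds: u64) -> Duration {
    Duration::from_secs(seconds.min(60))
}
```

Full jitter spreads concurrent clients across the whole backoff window, which avoids synchronized retry bursts after a shared outage.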
🧠 Semantic memory
Memory search graduates from substring matching to embedding-based retrieval:
- New Embedder trait in adk_rs::core (plus a cosine_similarity helper).
- VectorMemoryService (adk_rs::services::mem) embeds entries at ingest time and
ranks searches by cosine similarity, with with_top_k(n) (default 5) and
with_min_score(s) builders. Drop-in replacement for InMemoryMemoryService:
the load_memory / preload_memory tools work unchanged.
- Two embedders in the box: GeminiEmbedder (batchEmbedContents, e.g.
gemini-embedding-001) and OpenAiEmbedder (/embeddings, e.g.
text-embedding-3-small; reaches Azure/Ollama via OPENAI_BASE_URL).
- MockEmbedder under the testing feature: a deterministic hashed bag-of-words
embedder for offline tests of semantic-memory flows.
→ https://adk-rs.vercel.app/docs/memory
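Ranking by cosine similarity with a top-k limit and a minimum score could look roughly like this. The function signatures are assumptions for illustration; the real cosine_similarity helper and VectorMemoryService may use different types.

```rust
/// Cosine similarity of two vectors; 0.0 for mismatched lengths or a zero
/// vector. A sketch, not the adk-rs helper itself.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Scores every stored entry against the query embedding, drops entries
/// below `min_score`, and returns the best `top_k` in descending order.
pub fn rank<'a>(
    query: &[f32],
    entries: &'a [(String, Vec<f32>)],
    top_k: usize,
    min_score: f32,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = entries
        .iter()
        .map(|(text, embedding)| (text.as_str(), cosine_similarity(query, embedding)))
        .filter(|(_, score)| *score >= min_score)
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);
    scored
}
```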
🧾 Structured output on every provider
.output_schema(...) is now server-enforced on all three providers, not just Gemini:
- OpenAI-compatible: strict structured outputs
(response_format: {"type": "json_schema", "strict": true, ...}). Optional
properties are sent as required-but-nullable to satisfy strict mode while
preserving the original semantics. A JSON mime type without a schema still maps
to plain json_object mode.
- Anthropic: the Messages API's native
output_config: {"format": {"type": "json_schema", ...}}. Requires a
structured-outputs-capable Claude model (Claude 4.5-generation and newer).
- A shared transform converts the OpenAPI-flavoured Schema to each provider's
JSON Schema dialect: lowercased types, nullable → ["type", "null"] unions,
additionalProperties: false on objects, unsupported keywords stripped.
→ https://adk-rs.vercel.app/docs/structured-output
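The required-but-nullable rewrite for strict mode can be shown on a toy schema model. PropertySketch, ObjectSketch, and to_strict are hypothetical and much simpler than the crate's Schema type; the sketch only covers a flat object.

```rust
/// Toy property model for illustration; not the adk-rs `Schema` type.
#[derive(Debug, Clone, PartialEq)]
pub struct PropertySketch {
    pub name: String,
    /// JSON Schema `type` entries, e.g. ["string"] or ["string", "null"].
    pub types: Vec<String>,
}

/// Toy flat object schema.
#[derive(Debug, Clone, PartialEq)]
pub struct ObjectSketch {
    pub properties: Vec<PropertySketch>,
    pub required: Vec<String>,
    pub additional_properties: bool,
}

/// Makes every property required. Properties that were optional gain a
/// "null" type so callers can still express "absent" as null.
pub fn to_strict(mut schema: ObjectSketch) -> ObjectSketch {
    for prop in &mut schema.properties {
        if !schema.required.contains(&prop.name) {
            if !prop.types.iter().any(|t| t == "null") {
                prop.types.push("null".to_string());
            }
            schema.required.push(prop.name.clone());
        }
    }
    // Strict mode also expects closed objects.
    schema.additional_properties = false;
    schema
}
```

Properties that were already required keep their original type list, so only the formerly optional ones accept null.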
🎙 Gemini Live API (feature = "live")
Bidirectional WebSocket streaming over BidiGenerateContent:
- Gemini::connect_live(LiveConfig) → LiveSession with send_text,
send_audio (realtime PCM with server-side VAD), send_audio_stream_end,
send_tool_response, recv, and close.
- LiveEvent covers incremental text, decoded PCM audio, input/output
transcriptions, tool calls and cancellations, barge-in Interrupted,
GenerationComplete, TurnComplete, GoAway, and usage metadata.
- LiveConfig selects response modality (TEXT or AUDIO), prebuilt voice, system
instruction, tools, and transcription flags.
- The transport-security policy carries over: wss:// always, plain ws:// only
to loopback hosts.
- Live sessions are a model-level surface today (alongside generate_content);
runner-loop integration is planned.
→ https://adk-rs.vercel.app/docs/live
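The transport-security rule (wss:// always, ws:// only to loopback) can be sketched as a small URL check. The function name and the hand-rolled parsing are assumptions for illustration; adk-rs may rely on a proper URL parser.

```rust
use std::net::IpAddr;

/// Returns true for any `wss://` URL, and for `ws://` URLs whose host is
/// `localhost` or a loopback IP address. Everything else is rejected.
pub fn is_allowed_ws_url(url: &str) -> bool {
    if url.starts_with("wss://") {
        return true;
    }
    let Some(rest) = url.strip_prefix("ws://") else {
        return false;
    };
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    // Drop any `user:pass@` prefix.
    let host_port = authority.rsplit('@').next().unwrap_or(authority);
    let host = if let Some(inner) = host_port.strip_prefix('[') {
        // Bracketed IPv6 literal such as `[::1]:9000`.
        inner.split(']').next().unwrap_or("")
    } else {
        host_port.split(':').next().unwrap_or("")
    };
    host.eq_ignore_ascii_case("localhost")
        || host.parse::<IpAddr>().map_or(false, |ip| ip.is_loopback())
}
```

Comparing the whole host, rather than a prefix, keeps names like localhost.evil.com from slipping through.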
Full Changelog: v0.3.0...v0.4.0
v0.3.0
Tool confirmation (HITL) & resumable invocations
- FunctionTool::require_confirmation(true) (plus .with_confirmation_hint(..)) and per-tool ConfirmationPolicy on
McpToolset gate tool calls behind explicit user approval.
- Gated calls pause the invocation with an adk_request_confirmation request (wire-compatible with Python ADK's
ToolConfirmation payloads, including actions.requested_tool_confirmations); reply with a FunctionResponse carrying
{"confirmed": true/false} to approve or deny. Denials surface to the model as a rejection result, and the tool never runs.
- Runner::builder().resumable(true) + Runner::resume(user, session, invocation_id, content, config) resume a paused
invocation in place: SequentialAgent records checkpoints and skips completed steps; LoopAgent suspends on pause.
Approved tools execute exactly once.
- A2A tasks that pause now end in input-required (instead of completed), unblocking cross-runtime HITL with Python
google-adk clients.
Context caching & event compaction
- LlmAgent::static_instruction(..) pins a byte-stable system prefix; dynamic instructions ride in the request contents
so the prefix stays cacheable.
- Runner::builder().context_cache_config(ContextCacheConfig { ttl_seconds, cache_intervals, min_tokens }) enables
explicit Gemini cachedContents management: create-once, reuse across calls, TTL/use-count refresh, and graceful
fallback (with cache invalidation) if the server rejects an entry. Responses carry cache_metadata (cache_name,
cache_hit).
- Runner::builder().compaction(EventsCompactionConfig::new(model)) summarizes old events after every N invocations via
LlmEventSummarizer (pluggable EventSummarizer trait). History assembly transparently swaps compacted ranges for the
summary.
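The create-once, reuse, and TTL/use-count refresh cycle can be modelled as a small decision function. All types here are hypothetical, and the reading of the fields is an assumption: cache_intervals as the number of uses before a refresh, min_tokens as the prompt size below which caching is skipped.

```rust
/// Hypothetical mirror of the three config fields named above.
pub struct CacheConfigSketch {
    pub ttl_seconds: u64,
    pub cache_intervals: u32,
    pub min_tokens: u32,
}

/// Hypothetical bookkeeping for one cached prefix.
pub struct CacheEntrySketch {
    pub created_at_secs: u64,
    pub uses: u32,
}

#[derive(Debug, PartialEq)]
pub enum CacheAction {
    /// Prompt is too small to be worth caching.
    Skip,
    /// No entry exists yet.
    Create,
    /// Entry is fresh and under its use budget.
    Reuse,
    /// Entry expired or exhausted its use budget.
    Refresh,
}

/// Decides what to do with the cache before a model call.
pub fn decide(
    cfg: &CacheConfigSketch,
    entry: Option<&CacheEntrySketch>,
    prompt_tokens: u32,
    now_secs: u64,
) -> CacheAction {
    if prompt_tokens < cfg.min_tokens {
        return CacheAction::Skip;
    }
    match entry {
        None => CacheAction::Create,
        Some(e) => {
            let expired = now_secs.saturating_sub(e.created_at_secs) >= cfg.ttl_seconds;
            let exhausted = e.uses >= cfg.cache_intervals;
            if expired || exhausted {
                CacheAction::Refresh
            } else {
                CacheAction::Reuse
            }
        }
    }
}
```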
Structured output & instruction templating
- LlmAgent::output_key("answer") writes the final response into session state (the backbone of Sequential pipelines);
with output_schema(..) the model is forced to JSON and the parsed value is stored.
- include_contents(IncludeContents::None) for stateless pipeline steps.
- Python-compatible {state_key} templating in static instructions: {key}, {key?} (optional),
{app:key}/{user:key}/{temp:key}, and {artifact.name}.
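A minimal renderer for the {key}, {key?}, and scoped {app:key} forms might look like the sketch below. It is an illustration under assumptions, not the crate's implementation: state is a plain string map, a missing required key is an error, braces that do not hold a valid key are kept literally, and {artifact.name} is left untouched.

```rust
use std::collections::HashMap;

/// True for names shaped like `[A-Za-z_][A-Za-z0-9_]*`.
fn is_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_')
        && chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// True for a bare identifier or one prefixed with `app:`, `user:`, `temp:`.
fn is_state_key(key: &str) -> bool {
    let bare = ["app:", "user:", "temp:"]
        .iter()
        .find_map(|p| key.strip_prefix(*p))
        .unwrap_or(key);
    is_identifier(bare)
}

/// Substitutes state placeholders in `template`.
pub fn render(template: &str, state: &HashMap<String, String>) -> Result<String, String> {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        out.push_str(&rest[..open]);
        let after = &rest[open + 1..];
        let Some(close) = after.find('}') else {
            // Unmatched brace: keep the remainder as literal text.
            out.push_str(&rest[open..]);
            return Ok(out);
        };
        let inner = &after[..close];
        let (key, optional) = match inner.strip_suffix('?') {
            Some(k) => (k, true),
            None => (inner, false),
        };
        if is_state_key(key) {
            match state.get(key) {
                Some(value) => out.push_str(value),
                None if optional => {} // optional and missing: render nothing
                None => return Err(format!("missing state key `{key}`")),
            }
        } else {
            // Not a placeholder (for example JSON): keep it verbatim.
            out.push('{');
            out.push_str(inner);
            out.push('}');
        }
        rest = &after[close + 1..];
    }
    out.push_str(rest);
    Ok(out)
}
```

Leaving non-identifier braces alone means instructions that embed JSON snippets pass through unchanged.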
adk api_server-compatible dev server (adk-web UI support)
- The axum server now implements Python ADK's wire contract: camelCase event/session JSON, GET /list-apps, /health,
/version, full session CRUD under /apps/{app}/users/{user}/sessions (409 on duplicate, PATCH state-delta, {"detail":
...} errors), POST /run (bare event array) and POST /run_sse (data: framing, artifact-split quirk included),
artifact and memory endpoints, and graceful trace/eval stubs.
- AgentRunRequest accepts both camelCase and snake_case, supports stateDelta, and invocationId for resume.
- CORS via AppState::with_allow_origins(..) / adk web --allow-origins http://localhost:4200; point adk-web's
backendUrl at adk-rs and go.
🐛 Fixes
- Resumed tool calls are now replayed exactly once, by the agent that owns them. Previously the resume bookkeeping was
invocation-global and never consumed, so in multi-agent pipelines a user-confirmed (or OAuth-resumed) tool could
execute twice downstream, or kill the pipeline with an unknown-tool error.
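The fix described above comes down to two properties: decisions are keyed by the owning agent, and they are consumed on replay. A sketch with a hypothetical ledger type (not the runner's real bookkeeping):

```rust
use std::collections::HashMap;

/// Hypothetical resume ledger: decisions are keyed by the owning agent and
/// the function-call id rather than stored invocation-wide.
#[derive(Default)]
pub struct ResumeLedgerSketch {
    pending: HashMap<(String, String), bool>,
}

impl ResumeLedgerSketch {
    /// Records the user's decision for one gated tool call.
    pub fn record(&mut self, agent: &str, call_id: &str, confirmed: bool) {
        self.pending
            .insert((agent.to_string(), call_id.to_string()), confirmed);
    }

    /// Removes and returns the decision. A second call for the same key,
    /// or a call from a different agent, yields `None`.
    pub fn take(&mut self, agent: &str, call_id: &str) -> Option<bool> {
        self.pending
            .remove(&(agent.to_string(), call_id.to_string()))
    }
}
```

Because take removes the entry, a downstream agent in the same pipeline cannot replay a confirmation it does not own.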
v0.2.0
Session State & Storage
- Added scope-routed session state handling for app:, user:, and temp: keys.
- SessionService::create_session(state: Some(_)) now partitions state by scope automatically.
- Session.state now exposes a merged overlay of app, user, and session state.
- InMemorySessionService now includes dedicated app_state and user_state stores.
- SqlSessionService now includes app_state and user_state tables for both PostgreSQL and SQLite backends.
- Added new public helper: State::partition_by_scope(&delta)
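Scope routing by key prefix can be sketched as follows. The ScopedSketch type, the string-valued map, and the choice to strip the prefix before storing are all assumptions for illustration; the real State::partition_by_scope signature and return type are not shown in these notes.

```rust
use std::collections::HashMap;

/// Hypothetical result type: one bucket per scope.
#[derive(Debug, Default, PartialEq)]
pub struct ScopedSketch {
    pub app: HashMap<String, String>,
    pub user: HashMap<String, String>,
    pub temp: HashMap<String, String>,
    pub session: HashMap<String, String>,
}

/// Routes each key in `delta` to a bucket based on its prefix.
/// Assumption: the prefix is stripped before the key is stored.
pub fn partition_by_scope(delta: &HashMap<String, String>) -> ScopedSketch {
    let mut out = ScopedSketch::default();
    for (key, value) in delta {
        let (bucket, bare) = if let Some(k) = key.strip_prefix("app:") {
            (&mut out.app, k)
        } else if let Some(k) = key.strip_prefix("user:") {
            (&mut out.user, k)
        } else if let Some(k) = key.strip_prefix("temp:") {
            (&mut out.temp, k)
        } else {
            // Unprefixed keys stay session-scoped.
            (&mut out.session, key.as_str())
        };
        bucket.insert(bare.to_string(), value.clone());
    }
    out
}
```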
Auth Resume & Function Calls
- Added deterministic auth-resume flow after OAuth/credential consent.
- Gemini function calls without IDs now receive synthesized stable IDs: adk-fc-<uuid>
- AuthPreprocessor now rejects empty or missing function_call_id values for adk_request_credential.
OpenAPI Tooling
- Expanded OpenAPI REST tool support for:
- request bodies
- parameter styles
- security scheme binding
- richer spec parsing
- operation parameter resolution
Concurrency & State Consistency Fixes
- Fixed TOCTOU race in in-memory app_slot() / user_slot().
- Fixed concurrent state-write clobbering in PostgreSQL and SQLite backends.
- Added state merge locking using SELECT ... FOR UPDATE (Postgres) and BEGIN IMMEDIATE (SQLite).
- Fixed app/user scope leakage across sessions.
- Fixed Gemini auth-resume failures caused by missing function-call IDs.
- Fixed multiple runner/agent flow ordering and persistence issues.
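A check-then-insert race on a shared slot map is typically closed by doing the lookup and the insert under one lock. The sketch below shows that pattern with hypothetical types; it is not the in-memory service's actual code.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// One shared state map per app.
type Slot = Arc<Mutex<HashMap<String, String>>>;

/// Hypothetical store illustrating race-free get-or-create.
#[derive(Default)]
pub struct AppStateSketch {
    slots: Mutex<HashMap<String, Slot>>,
}

impl AppStateSketch {
    /// Looks up or creates the slot while holding a single lock, so two
    /// callers can never both observe "missing" and create separate slots.
    pub fn app_slot(&self, app: &str) -> Slot {
        let mut slots = self.slots.lock().unwrap();
        Arc::clone(slots.entry(app.to_string()).or_default())
    }
}
```

The racy variant checks for the key under one lock acquisition and inserts under another, which lets a second writer replace a slot that a first writer is already using.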
Database Changes
Added new tables:
- app_state(app_name, key, value)
- user_state(app_name, user_id, key, value)
Updated:
- 0001_init.sql for SQLite
- 0001_init.sql for PostgreSQL
Cargo Features
No changes to storage backend feature flags:
- sqlite
- postgres
v0.1.0
Initial release.
Runtime spine of Python ADK ported to Rust as a single feature-gated crate.
Ships the agent framework, three LLM providers, the core service abstractions, the tool framework, the runner, MCP support, telemetry, evaluation, a dev server, and CLI scaffolding.