Release v1.6.0 — Jingle Bills: Usage Tracking & Stream Cancellation · VRSEN/agency-swarm

This release focuses on usage + cost tracking and stream cancellation (FastAPI + terminal demo), with improved hot reload, more reliable event/history handling (timestamps, orphan cleanup, LiteLLM ID normalization), and docs/test stabilization.

If you’re new to Agency Swarm, here’s what that means:

Streaming cancellation: When you stream a response (token-by-token), you can stop it early—useful when you already have the answer or want to avoid spending extra tokens.
Hot reload (terminal demo): Like FastAPI, the terminal demo now auto-restarts on local .py or .md file changes, so you can iterate faster.
Usage + cost tracking: FastAPI responses and the terminal demo now include token totals and an estimated USD cost (multi-agent runs included). We use the real model prices, including models supported by LiteLLM.
More reliable history: Messages keep the timestamp from when they were emitted, orphaned tool calls/outputs are removed, and LiteLLM placeholder IDs are normalized.

Features

Add Token Usage + Cost Tracking (FastAPI + terminal demo; multi-agent aware) by @ArtemShatokhin in #489
- What you get: A usage summary with request count, token breakdown, and estimated cost.
- Where you’ll see it: Returned by FastAPI endpoints and tracked in the terminal demo (use /cost, and it’s also printed on exit).
- Multi-agent aware: Includes tokens/cost from the main agent and any sub-agent calls made during the run.
Add Stream Cancellation support for FastAPI and terminal demo hot reload by @ArtemShatokhin in #469
- FastAPI streaming: The stream emits a run_id early; clients can cancel an in-flight run via a cancel endpoint. If the client disconnects, the run is cancelled automatically.
- Cancel modes: immediate (stop right away) and after_turn (finish the current turn, then stop).
- Terminal demo hot reload: Watches for .py and .md changes and restarts automatically during local development.
Add Cancel Feature to the Terminal Demo by @ArtemShatokhin in #474
- How to use it: Press ESC during streaming to cancel the current response.
- Cleaner history after cancel: The demo filters out duplicate/orphaned tool call messages so the saved chat stays valid and replayable.

Improvements

Improve Terminal Tool Formatting by @ArtemShatokhin in #471
Capture and assign timestamps at event emission time by @ArtemShatokhin in #485
- What changed: Each event/message now gets its timestamp when it’s emitted during streaming (not later during persistence).
- Why it matters: Sorting and replaying history is more consistent across streaming, saved threads, and UIs.
- Tool timelines: Hosted tool result preservation messages inherit the tool call’s timestamp for more accurate “when did this happen?” debugging.
- Orphan cleanup: Before saving history, the framework drops orphaned tool calls/outputs (e.g. a tool call without its matching output) to avoid replay errors.
Add user_context forwarding to FastAPI endpoints by @bonk1t in #470
Added include_search_results assignment to automatically created tools by @ArtemShatokhin in #468
Add nest asyncio patch to IPython tool to simplify async function execution by @ArtemShatokhin in #494

Fixes

LiteLLM fake id handling by @bonk1t in #479
- Problem: Some LiteLLM / Chat Completions streams reuse a placeholder id (__fake_id__) across many output items, which breaks consumers that key by item id.
- Fix: Placeholder ids are rewritten into stable per-item ids (tool calls prefer call_id), so deduplication and message linking behave predictably.
fix: normalize LiteLLM placeholder ids by @bonk1t in #491
fix: normalize agent template names and harden file cleanup by @bonk1t in #480
Updated mcp error handling to match default function tool behavior by @ArtemShatokhin in #487
Fix copilot demo packaging by @bonk1t in #476
fix: restore copilot demo streaming by @bonk1t in #484
fix: make sync APIs safe inside running event loop by @bonk1t in #496

Refactors / Chores

refactor: enforce codebase consistency across imports, types, and structure by @bonk1t in #483
Refactor: Consolidate File Utilities and Cleanup Imports by @bonk1t in #482
set default truncation="auto" and update openai-agents to 0.6.2 by @bonk1t in #475
chore: update default model to gpt-5.1 by @bonk1t in #477
chore: deprecate agent send_message_tool_class by @bonk1t in #472
Update model references to gpt-5.2 by @bonk1t in #495

Documentation

Improve docs by @VRSEN in #467
updated pricing section with new plans and credits system by @MykhailoShchuka in #486

Tests

These test changes focus on stability and determinism to reduce intermittent failures in CI.

Fix context sharing integration test determinism by @bonk1t in #481
Tests: stabilization and fixes by @bonk1t in #488

Full Changelog: v1.5.0...v1.6.0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v1.6.0 — Jingle Bills: Usage Tracking & Stream Cancellation

Choose a tag to compare

Sorry, something went wrong.