Skip to content

v1.6.0 — Jingle Bills: Usage Tracking & Stream Cancellation

Choose a tag to compare

@nicko-ai nicko-ai released this 24 Dec 16:34
· 666 commits to main since this release

This release focuses on usage + cost tracking and stream cancellation (FastAPI + terminal demo), with improved hot reload, more reliable event/history handling (timestamps, orphan cleanup, LiteLLM ID normalization), and docs/test stabilization.

If you’re new to Agency Swarm, here’s what that means:

  • Streaming cancellation: When you stream a response (token-by-token), you can stop it early—useful when you already have the answer or want to avoid spending extra tokens.
  • Hot reload (terminal demo): Like FastAPI, the terminal demo now auto-restarts on local .py or .md file changes, so you can iterate faster.
  • Usage + cost tracking: FastAPI responses and the terminal demo now include token totals and an estimated USD cost (multi-agent runs included). We use the real model prices, including models supported by LiteLLM.
  • More reliable history: Messages keep the timestamp from when they were emitted, orphaned tool calls/outputs are removed, and LiteLLM placeholder IDs are normalized.

Features

  • Add Token Usage + Cost Tracking (FastAPI + terminal demo; multi-agent aware) by @ArtemShatokhin in #489
    • What you get: A usage summary with request count, token breakdown, and estimated cost.
    • Where you’ll see it: Returned by FastAPI endpoints and tracked in the terminal demo (use /cost, and it’s also printed on exit).
    • Multi-agent aware: Includes tokens/cost from the main agent and any sub-agent calls made during the run.
  • Add Stream Cancellation support for FastAPI and terminal demo hot reload by @ArtemShatokhin in #469
    • FastAPI streaming: The stream emits a run_id early; clients can cancel an in-flight run via a cancel endpoint. If the client disconnects, the run is cancelled automatically.
    • Cancel modes: immediate (stop right away) and after_turn (finish the current turn, then stop).
    • Terminal demo hot reload: Watches for .py and .md changes and restarts automatically during local development.
  • Add Cancel Feature to the Terminal Demo by @ArtemShatokhin in #474
    • How to use it: Press ESC during streaming to cancel the current response.
    • Cleaner history after cancel: The demo filters out duplicate/orphaned tool call messages so the saved chat stays valid and replayable.

Improvements

  • Improve Terminal Tool Formatting by @ArtemShatokhin in #471
  • Capture and assign timestamps at event emission time by @ArtemShatokhin in #485
    • What changed: Each event/message now gets its timestamp when it’s emitted during streaming (not later during persistence).
    • Why it matters: Sorting and replaying history is more consistent across streaming, saved threads, and UIs.
    • Tool timelines: Hosted tool result preservation messages inherit the tool call’s timestamp for more accurate “when did this happen?” debugging.
    • Orphan cleanup: Before saving history, the framework drops orphaned tool calls/outputs (e.g. a tool call without its matching output) to avoid replay errors.
  • Add user_context forwarding to FastAPI endpoints by @bonk1t in #470
  • Added include_search_results assignment to automatically created tools by @ArtemShatokhin in #468
  • Add nest asyncio patch to IPython tool to simplify async function execution by @ArtemShatokhin in #494

Fixes

  • LiteLLM fake id handling by @bonk1t in #479
    • Problem: Some LiteLLM / Chat Completions streams reuse a placeholder id (__fake_id__) across many output items, which breaks consumers that key by item id.
    • Fix: Placeholder ids are rewritten into stable per-item ids (tool calls prefer call_id), so deduplication and message linking behave predictably.
  • fix: normalize LiteLLM placeholder ids by @bonk1t in #491
  • fix: normalize agent template names and harden file cleanup by @bonk1t in #480
  • Updated mcp error handling to match default function tool behavior by @ArtemShatokhin in #487
  • Fix copilot demo packaging by @bonk1t in #476
  • fix: restore copilot demo streaming by @bonk1t in #484
  • fix: make sync APIs safe inside running event loop by @bonk1t in #496

Refactors / Chores

  • refactor: enforce codebase consistency across imports, types, and structure by @bonk1t in #483
  • Refactor: Consolidate File Utilities and Cleanup Imports by @bonk1t in #482
  • set default truncation="auto" and update openai-agents to 0.6.2 by @bonk1t in #475
  • chore: update default model to gpt-5.1 by @bonk1t in #477
  • chore: deprecate agent send_message_tool_class by @bonk1t in #472
  • Update model references to gpt-5.2 by @bonk1t in #495

Documentation


Tests

These test changes focus on stability and determinism to reduce intermittent failures in CI.

  • Fix context sharing integration test determinism by @bonk1t in #481
  • Tests: stabilization and fixes by @bonk1t in #488

Full Changelog: v1.5.0...v1.6.0