GenAI Processors v2.0

@aelissee released this 10 Mar 17:50

Today we are releasing GenAI Processors v2.0. Given the number of features added, it warrants a major version bump.

We've overhauled function calling support, adding client-side async function calling and MCP tool support. The new ContentStream concept makes agent code simpler, and the new tracing infrastructure and fully overhauled documentation microsite make development and triage much easier.

This release also marks our initial efforts to optimize for Antigravity, though we are only scratching the surface in this direction.

🔌 Async Function Calling & MCP

  • Non-blocking tool execution: Lets the agent run tools in the background without interrupting the conversation. Because it is implemented client-side, it is highly customizable. The implementation has been significantly improved since its initial release in 1.1.1.
  • Async Generators: If defined as generators, async tools can stream their responses over multiple turns.
  • MCP Session Support: Integrate Model Context Protocol (MCP) sessions as tools within processors. For real-time processors, MCP calls are executed asynchronously in the background.
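
To illustrate the idea behind non-blocking tools and async-generator tools, here is a minimal sketch that uses only the standard library. It is not the genai_processors API: `slow_lookup`, `run_tool`, and `conversation` are hypothetical names, and plain strings stand in for real parts.

```python
import asyncio
from collections.abc import AsyncIterator


async def slow_lookup(query: str) -> AsyncIterator[str]:
    """Hypothetical async-generator tool: yields partial results over time."""
    for step in ("searching", "ranking", f"result for {query!r}"):
        await asyncio.sleep(0.01)  # stand-in for real I/O
        yield step


async def run_tool(tool: AsyncIterator[str], outbox: asyncio.Queue) -> None:
    """Drains a tool in the background and forwards each chunk to the agent."""
    async for chunk in tool:
        await outbox.put(f"[tool] {chunk}")


async def conversation() -> list:
    outbox: asyncio.Queue = asyncio.Queue()
    # Start the tool without awaiting it, so the conversation is not blocked.
    task = asyncio.create_task(run_tool(slow_lookup("weather"), outbox))
    # The agent can answer right away while the tool is still running.
    await outbox.put("[agent] Looking that up. Anything else meanwhile?")
    # A real agent would keep serving turns here instead of simply waiting.
    await task
    events = []
    while not outbox.empty():
        events.append(outbox.get_nowait())
    return events


if __name__ == "__main__":
    for event in asyncio.run(conversation()):
        print(event)
```

The agent's reply is queued before any tool output arrives, and the tool's chunks then trickle in one by one, which is the behaviour the bullets above describe at a much smaller scale.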

✨ Core Processor & Stream Enhancements

  • ContentStream Class: A big improvement to our data model: instead of a plain AsyncIterable[ProcessorPart], processors now work with ContentStream objects, which provide extra syntactic sugar. For example, you can get the text output of a model with await model('Hello world!').text(). For multimodal output, there is .gather(), and for constrained decoding, there is .get_dataclass(MyDataclass).
  • Much easier to call: Processor.call used to require its input to be an AsyncIterator. Now any ProcessorContentTypes will do: processors can be invoked with str, PIL.Image, ProcessorContent, or lists of these.
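
As a rough mental model of the two points above, the following toy sketch wraps an async iterator in a small class with `.text()` and `.gather()` helpers and accepts either a string or a list of strings as input. It is an assumption-laden illustration, not the library's ContentStream: `ToyStream`, `_as_stream`, and `shout` are hypothetical, and parts are plain strings rather than ProcessorPart objects.

```python
import asyncio
from collections.abc import AsyncIterable, AsyncIterator, Iterable
from typing import Union


class ToyStream:
    """Hypothetical stand-in for a ContentStream; parts are plain strings."""

    def __init__(self, parts: AsyncIterable[str]):
        self._parts = parts

    def __aiter__(self) -> AsyncIterator[str]:
        # Still usable as a plain async iterable.
        return self._parts.__aiter__()

    async def gather(self) -> list:
        """Collects every part of the stream into a list."""
        return [part async for part in self._parts]

    async def text(self) -> str:
        """Concatenates all parts into a single string."""
        return "".join(await self.gather())


async def _as_stream(content: Union[str, Iterable[str]]) -> AsyncIterator[str]:
    """Normalizes a str or an iterable of str into an async stream of parts."""
    items = [content] if isinstance(content, str) else list(content)
    for item in items:
        yield item


def shout(content: Union[str, Iterable[str]]) -> ToyStream:
    """Toy processor: upper-cases each part of its input."""

    async def run() -> AsyncIterator[str]:
        async for part in _as_stream(content):
            yield part.upper()

    return ToyStream(run())


async def main() -> None:
    print(await shout("hello").text())
    print(await shout(["a", "b"]).gather())


if __name__ == "__main__":
    asyncio.run(main())
```

The design point is that the caller never has to build an async iterator by hand or write an `async for` loop just to read a result.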

🔍 Tracing

  • One-line Tracing: Enable pipeline tracing with a single line of code.
  • Visual Debugging: Generates an HTML summary of the pipeline execution, including cancellations and exceptions.
  • Multimodal support: Replay audio or view images directly within the trace, specifically designed for debugging real-time agents.
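
The kind of information such a trace captures can be sketched with the standard library alone. The example below is not the library's tracing API: `span` and `render_html` are hypothetical helpers that record each step's outcome (ok, cancelled, or error) and render a bare HTML table.

```python
import asyncio
import html
import time
from contextlib import asynccontextmanager


@asynccontextmanager
async def span(name: str, trace: list):
    """Records one pipeline step, including cancellations and exceptions."""
    record = {"name": name, "status": "ok", "start": time.monotonic()}
    trace.append(record)
    try:
        yield record
    except asyncio.CancelledError:
        record["status"] = "cancelled"
        raise
    except Exception as exc:
        record["status"] = f"error: {exc!r}"
        raise
    finally:
        record["seconds"] = time.monotonic() - record["start"]


def render_html(trace: list) -> str:
    """Renders the recorded steps as a minimal HTML table."""
    rows = "".join(
        f"<tr><td>{html.escape(r['name'])}</td>"
        f"<td>{html.escape(r['status'])}</td></tr>"
        for r in trace
    )
    return f"<table>{rows}</table>"


async def demo() -> str:
    trace: list = []
    async with span("load", trace):
        await asyncio.sleep(0)
    try:
        async with span("model", trace):
            raise ValueError("bad input")
    except ValueError:
        pass  # the failure is already recorded in the trace
    return render_html(trace)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```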

📚 Documentation & Examples

  • New Docs Site: Added comprehensive documentation on GitHub Pages covering design principles and core concepts.
  • New Examples:
    • Widgets: An agent utilizing async function calling to enrich its output with custom UI widgets.
    • Critic-reviser: Improves the model response by iteratively refining it.
  • Documentation for AI coding agents: Added specific guardrails, instructions, and extensive docstring style improvements to help AI coding agents correctly leverage the genai_processors library.
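
The critic-reviser pattern mentioned in the examples list can be summarized as a short loop. This is a generic sketch, not the code of the bundled example: `refine`, `critic`, and `reviser` are hypothetical, and simple synchronous functions stand in for what would be model calls.

```python
from typing import Callable, Optional


def refine(
    draft: str,
    critic: Callable[[str], Optional[str]],
    reviser: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    """Alternates critique and revision until the critic has no feedback."""
    for _ in range(max_rounds):
        feedback = critic(draft)
        if feedback is None:  # the critic is satisfied
            break
        draft = reviser(draft, feedback)
    return draft


def critic(draft: str) -> Optional[str]:
    """Toy critic: only checks that the draft ends with a period."""
    return None if draft.endswith(".") else "end with a period"


def reviser(draft: str, feedback: str) -> str:
    """Toy reviser: applies the only feedback the toy critic can give."""
    return draft + "."


if __name__ == "__main__":
    print(refine("Hello", critic, reviser))
```

Capping the number of rounds keeps the loop from running forever when the critic is never satisfied.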

🛠️ New processors

  • VideoExtract: Transforms a video into a sequence of audio and image Parts. Useful for emulating streaming during tests or for applying a rolling Window on a long input.
  • Hugging Face Transformers support: Run agents on top of local transformers models.
  • GlobSource: No longer blocks the asyncio event loop.
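
One common way to keep filesystem globbing off the event loop is to run it in a worker thread. The release notes do not say how GlobSource achieves this, so the sketch below is only an illustration of the general technique; `list_files` and `demo` are hypothetical names.

```python
import asyncio
import glob
import os
import tempfile


async def list_files(pattern: str) -> list:
    """Runs the blocking glob.glob call in a thread and returns sorted paths."""
    # asyncio.to_thread (Python 3.9+) lets other tasks run during the scan.
    return sorted(await asyncio.to_thread(glob.glob, pattern))


async def demo() -> list:
    with tempfile.TemporaryDirectory() as root:
        for name in ("b.txt", "a.txt", "c.log"):
            with open(os.path.join(root, name), "w"):
                pass  # create an empty file
        paths = await list_files(os.path.join(root, "*.txt"))
        return [os.path.basename(path) for path in paths]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```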

🌐 Websocket Server

  • Live Server Module: Moved the generic logic that turns any processor into a websocket server from the Live Commentator example into a live_server.py module.
  • AI Studio Integration: Simplifies building real-time agent demos with custom UIs when combined with the AI Studio applet.