v0.23.0

Released by @Pouyanpi on 01 Jul 16:22 (commit dc046e4)

What's Changed

This release expands tool calling and observability in IORails. Tool calling now works for streaming and non-streaming requests, including local rails that validate model-emitted tool calls and application-returned tool results. The OpenAI-compatible server also supports tool calling and adds a new /v1/checks endpoint for running input or output rails without generating a new model response.

NeMo Guardrails 0.23.0 also adds lightweight Hugging Face classifier rails, context bloat detection, and a Polygraf integration for PII detection and masking. Exact NumPy search replaces Annoy as the default embedding index, removing the native C++ dependency while preserving existing similarity-threshold semantics. Distribution wheels are now approximately ten times smaller.
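For readers unfamiliar with the difference between an approximate index such as Annoy and exact search, the idea can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the NeMo Guardrails implementation: the function name `exact_cosine_search`, the choice of cosine similarity, and the `threshold` and `k` parameters are all assumptions made for the example.

```python
import numpy as np


def exact_cosine_search(index_vectors, query, threshold=0.0, k=5):
    """Brute-force search: return (row, similarity) pairs, best match first.

    Hypothetical helper for illustration. Assumes every vector is non-zero.
    """
    index = np.asarray(index_vectors, dtype=np.float64)
    q = np.asarray(query, dtype=np.float64)

    # Normalise rows and query so the dot product equals cosine similarity.
    index_unit = index / np.linalg.norm(index, axis=1, keepdims=True)
    q_unit = q / np.linalg.norm(q)

    # One matrix-vector product scores the query against every stored row.
    sims = index_unit @ q_unit

    # Sort by descending similarity, keep the top k, then apply the threshold.
    order = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in order if sims[i] >= threshold]


if __name__ == "__main__":
    vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row, score in exact_cosine_search(vectors, [1.0, 0.0], threshold=0.5):
        print(row, round(score, 4))
```

Because every row is scored, the result is exact and needs no compiled extension, at the cost of work that grows linearly with the number of stored vectors.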

IORails OpenTelemetry support now includes opt-in content capture and richer request, response, and token-usage attributes. LangChain integrations add support for the OpenAI Responses API and Harmony response format models. This release requires Pydantic >=2.5,<3.0; environments pinned to Pydantic 1.x must upgrade.
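Before upgrading, it may help to confirm that an environment falls inside the new Pydantic range. The sketch below uses only the standard library; `parse_release`, `satisfies_requirement`, and `installed_pydantic_ok` are hypothetical helper names, and the version parsing is deliberately simplified (it ignores pre-release suffixes rather than implementing full PEP 440 ordering).

```python
from importlib import metadata


def parse_release(version):
    """Turn '2.7.1' into (2, 7, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if len(digits) != len(piece):
            # Something like '0b1': keep the numeric prefix and stop.
            break
    return tuple(parts)


def satisfies_requirement(version):
    """True when the version is >=2.5 and <3.0 (simplified comparison)."""
    return (2, 5) <= parse_release(version) < (3, 0)


def installed_pydantic_ok():
    """Check the installed Pydantic, returning False if it is missing."""
    try:
        return satisfies_requirement(metadata.version("pydantic"))
    except metadata.PackageNotFoundError:
        return False


if __name__ == "__main__":
    print("Pydantic requirement satisfied:", installed_pydantic_ok())
```

For anything beyond a quick check, the package installer's own dependency resolution is the authoritative answer.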

🚀 Features

  • (library) Add lightweight Hugging Face classifier rails for input, output, and retrieval, with local Transformers, vLLM, KServe, and FMS backends (#1853)
  • (embeddings) Replace Annoy with exact NumPy search and add migration benchmarks (#1957, #1958)
  • (iorails) Add opt-in OpenTelemetry content capture, request and response attributes, token usage, and span reference documentation (#1972, #2009, #2098, #2083)
  • (library) Add context bloat detection for oversized, repetitive, low-entropy, or padded input and retrieved content (#1941)
  • (iorails) Add streaming and non-streaming tool calling and local rails for validating tool calls and results (#2016, #2024, #2030, #2058, #2099)
  • (server) Add OpenAI-compatible tool calling parameters and response handling (#1942)
  • (server) Add the /v1/checks endpoint for standalone input and output rail validation, with passed, modified, or blocked status reporting (#2013)
  • (examples) Add NIM-based notebooks for content safety, topic control, GLiNER PII detection, and combined guardrails, replacing superseded examples (#1906)
  • (library) Add Polygraf PII detection and masking for input, output, and retrieval rails (#1693)
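To give a sense of what the context bloat signals listed above could look like in practice, here is a minimal, self-contained sketch. It is not the library's detector: the function names, the character-level entropy measure, the repeated-line ratio, and every threshold value are assumptions chosen for illustration.

```python
import math
from collections import Counter


def char_entropy(text):
    """Shannon entropy in bits per character; 0.0 for empty text."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def repeated_line_ratio(text):
    """Fraction of non-empty lines that duplicate an earlier line."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return 1.0 - len(set(lines)) / len(lines)


def bloat_signals(text, max_chars=20000, min_entropy=2.0, max_repeat=0.5):
    """Return one boolean per heuristic. All thresholds are illustrative."""
    return {
        "oversized": len(text) > max_chars,
        "low_entropy": bool(text) and char_entropy(text) < min_entropy,
        "repetitive": repeated_line_ratio(text) > max_repeat,
    }


if __name__ == "__main__":
    print(bloat_signals("ignore this\n" * 50))
```

A real rail would likely combine such signals with token counts and tuned thresholds; the point here is only that each signal is cheap to compute before content reaches the model.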

🐛 Bug Fixes

  • (library) Fix regex detection during output streaming so matches block correctly without raising TypeError (#1932, #1937)
  • (actions) Avoid an empty-string crash in create_event (#1701)
  • (iorails) Make OpenTelemetry recording best-effort so telemetry failures do not mask request, provider, or cancellation errors (#1997)
  • (iorails) Apply request-time llm_params on top of configured model parameters (#2020)
  • (llm) Handle multiline bot say responses when continuing a flow (#1650)
  • (generation) Use the correct task-specific stop tokens in generate_value (#1699)
  • (colang) Reject an incomplete "or" continuation at the end of a Colang 1.0 file instead of raising an index error (#1947)
  • (iorails) Provide a no-op events_history_cache for compatibility when IORails is selected (#2072)
  • (llmrails) Load library files deterministically so prompt and flow overrides do not depend on filesystem traversal order (#1975)
  • (embeddings) Preserve store_config when EmbeddingsCache is serialized and restored (#1951)
  • (eval) Use safe YAML loading and dumping in evaluation utilities (#2082)
  • (streaming) Pass user content correctly to output rails, avoid reusing resolved action parameters across chunks or requests, and emit usage metadata only once (#2081, #1943, #2079)
  • (llmrails) Preserve tool calls after tool output rails run (#2073)
  • (langchain) Support OpenAI Responses API and Harmony response formats in streaming and non-streaming mode (#2102)
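The deterministic loading fix above addresses a general pitfall: directory listings come back in an order that depends on the filesystem, so any "last file wins" override logic can behave differently across machines. The sketch below shows one common way to make traversal order stable with the standard library. It is an assumption about the general technique, not the code from the fix, and `iter_files_sorted` and its default suffixes are hypothetical.

```python
import os


def iter_files_sorted(root, suffixes=(".co", ".yml", ".yaml")):
    """Yield matching file paths under root in a stable, sorted order."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place controls the order os.walk descends in.
        dirnames.sort()
        for name in sorted(filenames):
            if name.endswith(suffixes):
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    for path in iter_files_sorted("."):
        print(path)
```

With a stable order, whichever file is meant to override another does so consistently, regardless of how the underlying filesystem enumerates entries.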

💼 Other

  • (build) Stop bundling examples and repository-only development files in the wheel, making it approximately ten times smaller (#2069)

🚜 Refactor

  • Refine the Guardrails public API with a common BaseGuardrails interface and first-class passthrough_fn, while deprecating direct access to internal LLMRails attributes (#1933)
  • [breaking] Require Pydantic >=2.5,<3.0 and migrate validators and model APIs to Pydantic 2 (#967)

Full Changelog: v0.22.0...v0.23.0