Merged
3 changes: 2 additions & 1 deletion docs/docs.json
@@ -403,6 +403,7 @@
"group": "Release Notes",
"pages": [
"releases/index",
"releases/v0.17.4",
"releases/v0.17.3",
"releases/v0.17.2",
"releases/v0.17.1",
@@ -450,7 +451,7 @@
"navbar": {
"links": [
{
"label": "v0.17.3 \u00b7 Lemonade 10.0.0",
"label": "v0.17.4 \u00b7 Lemonade 10.0.0",
"href": "https://github.com/amd/gaia/releases"
},
{
2 changes: 1 addition & 1 deletion docs/plans/email-triage-agent.mdx
@@ -2598,7 +2598,7 @@ Choices the spec implies but does not resolve:
- [Whittaker & Sidner, "Email Overload" (CHI 1996)](https://dl.acm.org/doi/10.1145/238386.238530)
- [Bellotti et al., "Taking Email to Task" / Taskmaster (CHI 2003)](https://www.semanticscholar.org/paper/Taking-email-to-task/8a28a1ee766d87ca9acbd741a7c1972d69217359)
- [Aberdeen, Pacovsky & Slater, "Gmail Priority Inbox" (NIPS 2010)](https://research.google/pubs/pub36955/)
- [Cohen, Carvalho & Mitchell, "Email Speech Acts" (EMNLP 2004)](https://www.cs.cmu.edu/~tom/EMNLP2004_final.pdf)
- [Cohen, Carvalho & Mitchell, "Learning to Classify Email into 'Speech Acts'" (EMNLP 2004)](https://aclanthology.org/W04-3240/)
- [Vellum, "Levels of Agentic Behavior"](https://www.vellum.ai/blog/levels-of-agentic-behavior)
- [Knight Institute, "Levels of Autonomy for AI Agents"](https://knightcolumbia.org/content/levels-of-autonomy-for-ai-agents-1)

50 changes: 50 additions & 0 deletions docs/releases/v0.17.4.mdx
@@ -0,0 +1,50 @@
---
title: "v0.17.4"
description: "Custom-agent model selection, C++ null-safety, and docs link fix"
---

# GAIA v0.17.4 Release Notes

GAIA v0.17.4 is a patch release covering two correctness fixes in the Agent UI custom-agent path, a null-safety fix in the C++ library for smaller LLMs, and a fix for a broken docs citation.

**Why upgrade:**
- **Custom agents use their declared model** — If a custom agent sets a model via `kwargs.setdefault("model_id", ...)`, the Agent UI now respects that setting when the session is at the DB default, instead of falling back to the session model.
- **Compatibility with smaller LLMs in the C++ library** — The C++ JSON parser now tolerates `null` values in the `"tool"` and `"answer"` fields, which some smaller models emit in place of omitting the field.

---

## What's New

### Custom Agent `model_id` Respected in the Agent UI

`_chat_helpers.py` previously passed `model_id=<session model>` explicitly to `registry.create_agent()`, which defeated `kwargs.setdefault("model_id", ...)` in custom agents — `setdefault` only fires when the key is absent (PR [#841](https://github.com/amd/gaia/pull/841)). The Agent UI now builds `create_kwargs` conditionally, omitting `model_id` when the session is at the DB default so the agent's `__init__` setdefault governs. Three-branch precedence is now explicit: `custom_model` setting > session-explicit model > agent's own `setdefault`.
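The pitfall is plain `dict.setdefault` semantics, and can be shown in a few lines of standalone Python (`make_agent` here is a hypothetical stand-in for a custom agent's `__init__`, not GAIA code):

```python
def make_agent(**kwargs):
    # Mirrors a custom agent's __init__: only supplies a model when the
    # caller did not pass one at all.
    kwargs.setdefault("model_id", "Qwen3.5-4B-GGUF")
    return kwargs["model_id"]

# Key absent -> setdefault fires and the agent's declared model wins.
assert make_agent() == "Qwen3.5-4B-GGUF"

# Key present -- even when set to the DB default -- setdefault is a no-op,
# which is exactly how the old explicit model_id kwarg defeated custom agents.
assert make_agent(model_id="Qwen3.5-35B-A3B-GGUF") == "Qwen3.5-35B-A3B-GGUF"
```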

A follow-up fix (PR [#842](https://github.com/amd/gaia/pull/842)) restored the pre-construction `model_id` as the agent-cache key. The initial landing of PR #841 had switched `_store_agent` to use the post-construction `_effective_model(agent, model_id)` while `_get_cached_agent` still looked up with `model_id`, so the keys never matched for custom-model agents and the agent was rebuilt on every turn. A two-turn cache-hit regression test and a static guard on `_store_agent` call sites were added alongside the fix.
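The regression mechanics reduce to a store/lookup key mismatch; a toy sketch (the cache shape and names are illustrative, not the real `_chat_helpers` API):

```python
cache = {}

def store_agent(session_id, model_key, agent):
    cache[(session_id, model_key)] = agent

def get_cached_agent(session_id, model_key):
    return cache.get((session_id, model_key))

session, pre_model = "s1", None    # session at DB default: model_id omitted
post_model = "Qwen3.5-4B-GGUF"     # filled in by the agent's setdefault

# Buggy: store under the post-construction (effective) model...
store_agent(session, post_model, "agent-object")
# ...but look up under the pre-construction key -> miss on every turn.
assert get_cached_agent(session, pre_model) is None

# Fixed: store and look up under the same pre-construction key.
store_agent(session, pre_model, "agent-object")
assert get_cached_agent(session, pre_model) == "agent-object"
```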

Supporting refactor: extracted `_build_create_kwargs()` and `_effective_model()` helpers in `src/gaia/ui/_chat_helpers.py` to deduplicate the three-branch logic across streaming and non-streaming paths, and exported `SESSION_DEFAULT_MODEL` from `database.py` as the single source of truth.

---

### C++ Library: Null-Safety in LLM Response Parsing

`parseLlmResponse()` in `cpp/src/json_utils.cpp` now guards `.get<std::string>()` calls on the `"tool"` and `"answer"` JSON fields with `.is_string()` / `.is_null()` checks (PR [#780](https://github.com/amd/gaia/pull/780)). This fixes a crash (`json.exception.type_error.302: type must be string, but is null`) when smaller LLMs (for example `qwen3.5:9b`) return `null` for those fields instead of omitting them. `json.contains()` returns `true` for `null` values, so the existing presence checks were insufficient.
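The same presence-vs-type pitfall can be sketched in Python, where JSON `null` maps to `None` and the `in` operator (like `json.contains()` in nlohmann/json) is satisfied by a key holding `null`; `get_string_field` below is a hypothetical analogue of the added `.is_string()` guard, not the C++ code itself:

```python
import json

# JSON null maps to Python None; key presence says nothing about the type.
payload = json.loads('{"tool": null, "answer": "done"}')

assert "tool" in payload          # the presence check passes...
assert payload["tool"] is None    # ...but the value is not a string

def get_string_field(obj, key):
    # Analogue of the guarded access: only accept actual strings,
    # treating null (None) or a missing key as "not provided".
    value = obj.get(key)
    return value if isinstance(value, str) else None

assert get_string_field(payload, "tool") is None
assert get_string_field(payload, "answer") == "done"
```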

---

## Bug Fixes

- **Email-triage agent plan: broken CMU citation link** (PR [#817](https://github.com/amd/gaia/pull/817)) — Swapped the failing `www.cs.cmu.edu/~tom/EMNLP2004_final.pdf` URL in `docs/plans/email-triage-agent.mdx` for the canonical ACL Anthology record at [W04-3240](https://aclanthology.org/W04-3240/). The CMU URL was failing DNS resolution in CI, breaking the `Verify external URLs` check on every open docs PR. Restored the paper's full title ("Learning to Classify Email into 'Speech Acts'") for consistency with other citations in the same references list.

---

## Full Changelog

**5 commits** since v0.17.3:

- `8fc43f3f` — fix(cpp): add null-safety checks for JSON string fields in LLM response parsing (#780)
- `62722de2` — fix(ui): honor custom agent model_id when session is at DB default (#841)
- `4acfd400` — fix(ui): extract _build_create_kwargs/_effective_model, import SESSION_DEFAULT_MODEL
- `8f5c7621` — fix(ui): restore intent-key for agent cache store to fix miss regression (#842)
- `a0fdb109` — docs(plans): fix broken CMU link to EMNLP 2004 Email Speech Acts paper (#817)

Full Changelog: [v0.17.3...v0.17.4](https://github.com/amd/gaia/compare/v0.17.3...v0.17.4)
2 changes: 1 addition & 1 deletion src/gaia/apps/webui/package.json
@@ -1,6 +1,6 @@
{
"name": "@amd-gaia/agent-ui",
"version": "0.17.3",
"version": "0.17.4",
"type": "module",
"productName": "GAIA Agent UI",
"description": "Privacy-first agentic AI interface with document Q&A - runs 100% locally on AMD Ryzen AI",
5 changes: 4 additions & 1 deletion src/gaia/mcp/mcp_bridge.py
@@ -628,7 +628,10 @@ def handle_jsonrpc(self, data):
400,
{
"jsonrpc": "2.0",
"error": {"code": -32600, "message": "Invalid Request: expected JSON object"},
"error": {
"code": -32600,
"message": "Invalid Request: expected JSON object",
},
"id": None,
},
)
104 changes: 87 additions & 17 deletions src/gaia/ui/_chat_helpers.py
@@ -23,7 +23,7 @@
import time as _time
from pathlib import Path

from .database import ChatDatabase
from .database import SESSION_DEFAULT_MODEL, ChatDatabase
from .models import ChatRequest
from .sse_handler import (
_ANSWER_JSON_SUB_RE,
@@ -73,6 +73,9 @@ def get_agent_registry():
_agent_cache_lock = threading.Lock()
_MAX_CACHED_AGENTS = 10

# Alias so call-sites read naturally; the canonical value lives in database.py.
_DB_DEFAULT_MODEL = SESSION_DEFAULT_MODEL

# Last known MCP runtime status — updated after each agent setup so
# GET /api/mcp/status can return it without needing a running chat.
_mcp_status_cache: list[dict] = []
@@ -84,6 +87,56 @@ def get_agent_registry():
model_load_lock = threading.Lock()


def _build_create_kwargs(
*,
custom_model: str | None,
model_id: str | None,
streaming: bool = False,
) -> dict:
"""Return the kwargs dict for registry.create_agent().

Precedence (high → low):
1. custom_model setting (explicit user override from db)
2. session-explicit model (differs from SESSION_DEFAULT_MODEL)
3. omit model_id — lets the agent's kwargs.setdefault govern (fix #841)

Note: if registry.resolve_model() already promoted model_id before this
call, it is forwarded as-is via branch 2 (resolve_model result ≠ default).
"""
suffix = " (streaming)" if streaming else ""
kwargs: dict = {"silent_mode": not streaming, "debug": False}
if streaming:
kwargs["streaming"] = True

if custom_model:
kwargs["model_id"] = custom_model
logger.info("create_agent: custom_model override -> %s%s", custom_model, suffix)
elif model_id and model_id != _DB_DEFAULT_MODEL:
kwargs["model_id"] = model_id
logger.info("create_agent: session-explicit model -> %s%s", model_id, suffix)
else:
# Omit model_id so kwargs.setdefault in the agent's __init__ fires.
# setdefault only works when the key is ABSENT. Passing the DB default
# (or None / empty) explicitly defeats it — this is the fix for #841.
logger.info(
"create_agent: omitting model_id kwarg (session at DB default %s); "
"agent's kwargs.setdefault or AgentConfig fallback will govern%s",
_DB_DEFAULT_MODEL,
suffix,
)
return kwargs


def _effective_model(agent, fallback: str | None) -> str | None:
"""Return agent.model_id if set, else fallback.

Uses explicit None check (not `or`) to avoid treating empty-string
model_id as missing — which would silently load the wrong model.
"""
effective = getattr(agent, "model_id", None)
return effective if effective is not None else fallback


def get_cached_mcp_status() -> list[dict]:
"""Return the last known MCP server connection status from any cached agent."""
with _mcp_status_lock:
@@ -556,17 +609,23 @@ def _do_chat():
)
agent = registry.create_agent(
agent_type,
model_id=model_id,
silent_mode=True,
debug=False,
**_build_create_kwargs(
custom_model=custom_model, model_id=model_id
),
)
logger.info(
"chat: Invoking agent %s for session %s, model=%s",
agent_type,
session_id[:8],
_effective_model(agent, model_id),
)
_store_agent(
session_id,
model_id,
document_ids,
agent,
agent_type,
)
_store_agent(session_id, model_id, document_ids, agent, agent_type)

# Restore conversation history (limited to prevent context overflow).
# Always re-inject from DB so the history is consistent with what was
@@ -585,8 +644,11 @@
agent.conversation_history.append({"role": "user", "content": u})
agent.conversation_history.append({"role": "assistant", "content": a})

# Pre-flight: same fix as the streaming path — see _maybe_load_expected_model.
_maybe_load_expected_model(model_id)
# Pre-flight on agent's ACTUAL effective model. When model_id kwarg was
# omitted, the agent's __init__ set model_id via kwargs.setdefault —
# a value invisible pre-construction. Using _effective_model preserves
# the existing 100-900s silent-hang protection for all code paths.
_maybe_load_expected_model(_effective_model(agent, model_id))

result = agent.process_query(request.message)
if isinstance(result, dict):
@@ -915,17 +977,18 @@ def _run_agent():
t_construct = _time.monotonic()
agent = registry.create_agent(
agent_type,
model_id=model_id,
streaming=True,
silent_mode=False,
debug=False,
**_build_create_kwargs(
custom_model=custom_model,
model_id=model_id,
streaming=True,
),
)
agent.console = sse_handler
logger.info(
"chat: Invoking agent %s for session %s, model=%s took=%.3fs",
agent_type,
session_id[:8],
model_id,
_effective_model(agent, model_id),
_time.monotonic() - t_construct,
)

@@ -937,7 +1000,11 @@
_index_rag_with_progress(agent, rag_file_paths, sse_handler)

_store_agent(
session_id, model_id, document_ids, agent, agent_type
session_id,
model_id,
document_ids,
agent,
agent_type,
)

sse_handler._emit(
@@ -987,10 +1054,13 @@
if sse_handler.cancelled.is_set():
return

# Pre-flight: ensure a chat-capable LLM is active before sending the query.
# Lemonade silently hangs when no model is loaded or the embedding model is
# active — no error is returned, so _execute_with_auto_download never fires.
_maybe_load_expected_model(model_id, sse_handler)
# Pre-flight on agent's ACTUAL effective model. When model_id kwarg was
# omitted, the agent's __init__ set model_id via kwargs.setdefault — a value
# invisible pre-construction. Using agent.model_id preserves the existing
# 100-900s silent-hang protection for all code paths including setdefault.
_maybe_load_expected_model(
_effective_model(agent, model_id), sse_handler
)

# -- Phase 5: Query processing --
t_query = _time.monotonic()
6 changes: 5 additions & 1 deletion src/gaia/ui/database.py
@@ -20,6 +20,10 @@

DEFAULT_DB_PATH = Path.home() / ".gaia" / "chat" / "gaia_chat.db"

# Default model for new sessions — kept in sync with the SQL schema DEFAULT and
# any code that reads session["model"] and falls back when the field is NULL.
SESSION_DEFAULT_MODEL = "Qwen3.5-35B-A3B-GGUF"

SCHEMA_SQL = """
-- Global document library
CREATE TABLE IF NOT EXISTS documents (
@@ -230,7 +234,7 @@ def create_session(
"""Create a new chat session."""
session_id = str(uuid.uuid4())
now = self._now()
model = model or "Qwen3.5-35B-A3B-GGUF"
model = model or SESSION_DEFAULT_MODEL
title = title or "New Chat"
agent_type = agent_type or "chat"

91 changes: 91 additions & 0 deletions tests/integration/test_chat_ui_integration.py
@@ -1591,3 +1591,94 @@ def test_delete_messages_from_session_not_found(self, client):
"""DELETE .../and-below returns 404 for non-existent session."""
resp = client.delete("/api/sessions/nonexistent/messages/1/and-below")
assert resp.status_code == 404


# ── Issue #841 regression: custom agent model_id honored through API ──────────


class TestCustomAgentModelChoice:
"""Verify that a custom Python agent's kwargs.setdefault model_id reaches the
registry.create_agent call without model_id being passed as an explicit kwarg.

This is the integration-layer pin for issue #841. It exercises the full
path: HTTP POST → session → _get_chat_response → registry.create_agent.
"""

def test_custom_agent_model_id_honored_through_api(self, tmp_path):
import textwrap

agents_dir = tmp_path / ".gaia" / "agents" / "smallbot"
agents_dir.mkdir(parents=True)
(agents_dir / "agent.py").write_text(textwrap.dedent("""
from gaia.agents.base.agent import Agent

class SmallBot(Agent):
AGENT_ID = "smallbot"
AGENT_NAME = "SmallBot"

def __init__(self, **kwargs):
kwargs.setdefault("model_id", "Qwen3.5-4B-GGUF")
super().__init__(skip_lemonade=True, **kwargs)

def _get_system_prompt(self):
return "x"

def _register_tools(self):
pass
"""))

# HOME patch must wrap the full lifespan: discover() fires on __enter__.
with patch("gaia.agents.registry.Path.home", return_value=tmp_path):
app = create_app(db_path=":memory:")

with TestClient(app) as client:
# Spy on create_agent AFTER lifespan fires (registry exists now).
captured = {}
original_create = app.state.agent_registry.create_agent

def _spy(agent_id, **kwargs):
if agent_id == "smallbot":
captured["model_id_kwarg"] = kwargs.get("model_id", "<omitted>")
agent = original_create(agent_id, **kwargs)
if agent_id == "smallbot":
captured["agent_model_id"] = getattr(agent, "model_id", None)
return agent

app.state.agent_registry.create_agent = _spy

# Create a session typed to our custom agent.
sess_resp = client.post(
"/api/sessions",
json={"title": "841-test", "agent_type": "smallbot"},
)
assert sess_resp.status_code == 200, sess_resp.text
sid = sess_resp.json()["id"]

# Send a chat message, bypassing Lemonade and LLM.
with (
patch("gaia.ui._chat_helpers._maybe_load_expected_model"),
patch(
"gaia.ui._chat_helpers._agent_registry",
app.state.agent_registry,
),
):
chat_resp = client.post(
"/api/chat/send",
json={
"session_id": sid,
"message": "hi",
"stream": False,
},
)

assert chat_resp.status_code == 200, chat_resp.text

assert captured, "create_agent spy was never called for smallbot"
assert captured["model_id_kwarg"] == "<omitted>", (
f"Issue #841: model_id kwarg must be omitted when session is at DB default; "
f"got model_id_kwarg={captured['model_id_kwarg']!r}"
)
assert captured["agent_model_id"] == "Qwen3.5-4B-GGUF", (
f"Issue #841: agent.model_id must reflect kwargs.setdefault value; "
f"got {captured['agent_model_id']!r}"
)