Release v0.17.4 by itomek · Pull Request #855 · amd/gaia

itomek · 2026-04-23T23:08:10Z

GAIA v0.17.4 Release Notes

GAIA v0.17.4 is a patch release covering two correctness fixes in the Agent UI custom-agent path, a null-safety fix in the C++ library for smaller LLMs, and a broken docs citation.

Why upgrade:

Custom agents use their declared model — If a custom agent sets a model via kwargs.setdefault("model_id", ...), the Agent UI now respects that setting when the session is at the DB default, instead of falling back to the session model.
Compatibility with smaller LLMs in the C++ library — The C++ JSON parser now tolerates null values in "tool" and "content" fields, which some smaller models emit in place of omitting the field.

What's New

Custom Agent `model_id` Respected in the Agent UI

_chat_helpers.py previously passed model_id=<session model> explicitly to registry.create_agent(), which defeated kwargs.setdefault("model_id", ...) in custom agents — setdefault only fires when the key is absent (PR #841). The Agent UI now builds create_kwargs conditionally, omitting model_id when the session is at the DB default so the agent's __init__ setdefault governs. Three-branch precedence is now explicit: custom_model setting > session-explicit model > agent's own setdefault.
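A minimal sketch of why the explicit keyword defeated the agent's default, and of the three-branch precedence described above. `CustomAgent`, `build_create_kwargs`, the model names, and the placeholder value of `SESSION_DEFAULT_MODEL` are hypothetical stand-ins for illustration, not the actual GAIA code.

```python
# Illustrative only: class, function, and model names are assumptions.
SESSION_DEFAULT_MODEL = "db-default-model"  # placeholder for the real DB default


class CustomAgent:
    def __init__(self, **kwargs):
        # setdefault writes the value only when "model_id" is absent.
        kwargs.setdefault("model_id", "agent-preferred-model")
        self.model_id = kwargs["model_id"]


def build_create_kwargs(custom_model=None, session_model=None):
    """Pick kwargs by precedence: custom_model > session-explicit > omit."""
    if custom_model:
        return {"model_id": custom_model}
    if session_model and session_model != SESSION_DEFAULT_MODEL:
        return {"model_id": session_model}
    return {}  # key absent, so the agent's setdefault governs


# Old behavior: the session model is always passed, so setdefault is a no-op.
old = CustomAgent(model_id=SESSION_DEFAULT_MODEL)

# New behavior: the key is omitted when the session is at the DB default.
new = CustomAgent(**build_create_kwargs(session_model=SESSION_DEFAULT_MODEL))

# A session that explicitly picked another model still wins over the agent.
explicit = CustomAgent(**build_create_kwargs(session_model="user-picked-model"))
```

In this sketch `old.model_id` stays at the session default, while `new.model_id` takes the agent's own value because the key never reaches `__init__`.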

A follow-up fix (PR #842) restored the pre-construction model_id as the agent-cache key. The initial PR #841 landing had switched _store_agent to use the post-construction _effective_model(agent, model_id) while _get_cached_agent still looked up with model_id, so keys never matched for custom-model agents and the agent was rebuilt on every turn. A two-turn cache-hit regression test and a static guard on _store_agent call sites were added alongside the fix.
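The cache miss can be reproduced with a small dict-backed sketch. The cache layout, `get_or_create`, and the model names are assumptions made for illustration; only the idea of storing under one key while looking up under another comes from the description above.

```python
# Illustrative only: a dict-backed cache with hypothetical names.
_cache = {}
build_count = 0


class Agent:
    def __init__(self, **kwargs):
        kwargs.setdefault("model_id", "agent-preferred-model")
        self.model_id = kwargs["model_id"]


def get_or_create(session_id, model_id, store_with_effective_key):
    """Look up by the pre-construction model_id; store by either key."""
    global build_count
    cached = _cache.get((session_id, model_id))
    if cached is not None:
        return cached
    agent = Agent()  # model_id omitted, so setdefault picks the agent's model
    build_count += 1
    key_model = agent.model_id if store_with_effective_key else model_id
    _cache[(session_id, key_model)] = agent
    return agent


def run_two_turns(store_with_effective_key):
    """Return how many agents were built across two turns of one session."""
    global build_count
    _cache.clear()
    build_count = 0
    for _ in range(2):
        get_or_create("s1", "db-default-model", store_with_effective_key)
    return build_count


mismatched_builds = run_two_turns(store_with_effective_key=True)   # regression
matched_builds = run_two_turns(store_with_effective_key=False)     # after the fix
```

Storing under the post-construction model builds the agent on both turns; storing under the same key the lookup uses builds it once.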

Supporting refactor: extracted _build_create_kwargs() and _effective_model() helpers in src/gaia/ui/_chat_helpers.py to deduplicate the three-branch logic across streaming and non-streaming paths, and exported SESSION_DEFAULT_MODEL from database.py as the single source of truth.

C++ Library: Null-Safety in LLM Response Parsing

parseLlmResponse() in cpp/src/json_utils.cpp now guards .get<std::string>() calls on the "tool" and "answer" JSON fields with .is_string() / .is_null() checks (PR #780). This fixes a crash (json.exception.type_error.302: type must be string, but is null) when smaller LLMs (for example qwen3.5:9b) return null for those fields instead of omitting them. json.contains() returns true for null values, so the existing presence checks were insufficient.
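The same pitfall exists in Python, where a key-presence test also succeeds for a JSON null. The sketch below is an analogy using a hypothetical `parse_llm_response` helper, not the C++ code in json_utils.cpp.

```python
import json


def parse_llm_response(raw):
    """Return (tool, answer), treating null or non-string values like missing fields."""
    data = json.loads(raw)
    # "tool" in data is True even when the value is null, so check the type too.
    tool = data["tool"] if isinstance(data.get("tool"), str) else None
    answer = data["answer"] if isinstance(data.get("answer"), str) else None
    return tool, answer


null_fields = parse_llm_response('{"tool": null, "answer": "done"}')
missing_fields = parse_llm_response('{"answer": "done"}')
key_present = "tool" in json.loads('{"tool": null}')
```

Both inputs yield the same result here, which is the behavior the type guards are meant to provide: a null field is handled like an omitted one.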

Bug Fixes

Email-triage agent plan: broken CMU citation link (PR #817) — Swapped the failing www.cs.cmu.edu/~tom/EMNLP2004_final.pdf URL in docs/plans/email-triage-agent.mdx for the canonical ACL Anthology record at W04-3240. The CMU URL was failing DNS resolution in CI, breaking the Verify external URLs check on every open docs PR. Restored the paper's full title ("Learning to Classify Email into 'Speech Acts'") for consistency with other citations in the same references list.

Full Changelog

5 commits since v0.17.3:

8fc43f3f — fix(cpp): add null-safety checks for JSON string fields in LLM response parsing (#780)
62722de2 — fix(ui): honor custom agent model_id when session is at DB default (#841)
4acfd400 — fix(ui): extract _build_create_kwargs/_effective_model, import SESSION_DEFAULT_MODEL
8f5c7621 — fix(ui): restore intent-key for agent cache store to fix miss regression (#842)
a0fdb109 — docs(plans): fix broken CMU link to EMNLP 2004 Email Speech Acts paper (#817)

Full Changelog: v0.17.3...v0.17.4

Release checklist

util/validate_release_notes.py docs/releases/v0.17.4.mdx --tag v0.17.4 passes
src/gaia/version.py → 0.17.4
src/gaia/apps/webui/package.json → 0.17.4
Navbar label in docs/docs.json → v0.17.4 · Lemonade 10.0.0
All 5 commits in the range (v0.17.3..HEAD) are represented in the notes
Review from @kovtcharov-amd addressed

Previously, _chat_helpers.py always passed model_id=<session model> explicitly to registry.create_agent(), defeating kwargs.setdefault("model_id", ...) in custom agents — which only fires when the key is absent.

Fix: build create_kwargs conditionally, omitting model_id when the session is at the DB default so the agent's __init__ setdefault governs. Also use agent.model_id (post-construction) for both _store_agent cache key and the pre-flight _maybe_load_expected_model call.

Three-branch precedence: custom_model setting > session-explicit > omit kwarg.

Closes #841

…N_DEFAULT_MODEL

Addresses code review feedback on PR #842:

- Export SESSION_DEFAULT_MODEL from database.py (single source of truth) instead of duplicating the string literal in _chat_helpers.py
- Extract _build_create_kwargs() helper to eliminate the duplicate three-branch create_kwargs logic across non-streaming and streaming code paths
- Extract _effective_model() helper using explicit None check (not `or`) to safely read agent.model_id post-construction without treating empty string as missing
- Fix static regression guard regex to use [^()]* so nested helper calls inside create_agent() are not falsely flagged
- Update unit test to import SESSION_DEFAULT_MODEL instead of hardcoding

…ion (#842)

_store_agent was changed by the #842 fix to use _effective_model(agent, model_id) as the cache key — the post-construction value set by kwargs.setdefault. _get_cached_agent still looks up using the pre-construction model_id variable. For custom agents whose setdefault model differs from the session model, the keys never match and the agent is rebuilt on every turn.

Revert the two _store_agent call sites to use model_id (the pre-construction intent key), matching what the lookup uses. _effective_model stays at the two _maybe_load_expected_model sites (Lemonade pre-flight needs the actual model) and in log statements (observability).

Add two regression guards:

- test_cache_hit_on_second_turn_for_setdefault_agent: two-turn cache-hit test with four assertions (call count, object identity, stored-key equality, agent.model_id). Covers the builder/template.py setdefault pattern.
- test_no_effective_model_in_store_agent_calls: static grep guard that asserts _store_agent never receives _effective_model(...) as a positional arg, preventing this pattern from silently returning in a future cleanup pass.

#817)

## Summary

One-line fix: swap the failing `www.cs.cmu.edu/~tom/EMNLP2004_final.pdf` URL in `docs/plans/email-triage-agent.mdx:2601` for the canonical ACL Anthology record at [W04-3240](https://aclanthology.org/W04-3240/).

The CMU URL fails DNS resolution in CI (see [recent run](https://github.com/amd/gaia/actions/runs/24595902571/job/72072156929)), breaking the `Verify external URLs` check for every open PR that touches docs. ACL Anthology is the permanent archive for ACL/EMNLP papers — stable URL, no more link rot.

Also restored the paper's actual full title ("Learning to Classify Email into 'Speech Acts'") for consistency with the other full-title citations in the same references list.

## Test plan

- [x] `curl -sI https://aclanthology.org/W04-3240/` returns 200
- [ ] After merge, `Verify external URLs` check should go green on downstream PRs

Patch release: custom agents now honor their declared model, and the C++ library no longer crashes on null JSON fields from smaller LLMs.

- Custom Agent UI agents honor kwargs.setdefault("model_id", ...) when the session is at the DB default (#841, follow-up #842 restores cache hits).
- C++ library adds null-safety guards in parseLlmResponse() to tolerate smaller LLMs that return null for "tool" or "content" (#780).
- Docs: swap broken CMU link for canonical ACL Anthology URL (#817).

github-actions · 2026-04-23T23:14:34Z

Summary

PR #855 is the v0.17.4 release: a version bump plus the _chat_helpers.py fix for issue #841 (custom-agent model_id ignored), its #842 follow-up (agent-cache key divergence), and new regression tests. The refactor is clean — _build_create_kwargs() / _effective_model() de-duplicate the three-branch precedence across the streaming and non-streaming paths, and the tests pin both the behavior and the source-level antipattern. Ship it. Two tiny doc nits and one optional readability suggestion below.

Issues Found

🟢 Minor — Release notes changelog undercounts commits (`docs/releases/v0.17.4.mdx:82`)

The "Full Changelog" claims 5 commits since v0.17.3, but git log v0.17.3..HEAD on main shows 7 non-release commits including 3b51ca92 style(mcp): apply Black formatting to mcp_bridge.py (CI lint fix) — which is present in this PR's diff (src/gaia/mcp/mcp_bridge.py). Either bump the count and add the lint-fix bullet, or phrase it as "notable commits" so the omission isn't a contradiction with the diff.

**6 commits** since v0.17.3:

- `8fc43f3f` — fix(cpp): add null-safety checks for JSON string fields in LLM response parsing (#780)
- `62722de2` — fix(ui): honor custom agent model_id when session is at DB default (#841)
- `4acfd400` — fix(ui): extract _build_create_kwargs/_effective_model, import SESSION_DEFAULT_MODEL
- `8f5c7621` — fix(ui): restore intent-key for agent cache store to fix miss regression (#842)
- `a0fdb109` — docs(plans): fix broken CMU link to EMNLP 2004 Email Speech Acts paper (#817)
- `3b51ca92` — style(mcp): apply Black formatting to mcp_bridge.py (CI lint fix)

🟢 Minor — Documented edge case: user explicitly picking the DB default (`src/gaia/ui/_chat_helpers.py:114`)

elif model_id and model_id != _DB_DEFAULT_MODEL treats a session whose model happens to equal SESSION_DEFAULT_MODEL as "unset", and falls through to branch 3 where the agent's setdefault governs. That's the intended fix for #841, but it does mean a user who explicitly chose Qwen3.5-35B-A3B-GGUF will silently have a custom agent's setdefault override them. The code comment at line 118-120 explains the mechanism but not this user-visible consequence. Consider adding one sentence to the branch-3 log line so the behavior is discoverable in support logs, e.g.:

        # Omit model_id so kwargs.setdefault in the agent's __init__ fires.
        # setdefault only works when the key is ABSENT. Passing the DB default
        # (or None / empty) explicitly defeats it — this is the fix for #841.
        # Note: a session whose model coincidentally equals the DB default is
        # indistinguishable from "unset" here and will surrender to setdefault.
        logger.info(
            "create_agent: omitting model_id kwarg (session at DB default %s); "
            "agent's kwargs.setdefault or AgentConfig fallback will govern%s",
            _DB_DEFAULT_MODEL,
            suffix,
        )

🟢 Minor — Test helper uses deprecated `asyncio.get_event_loop()` (`tests/unit/chat/ui/test_chat_helpers_model_resolution.py:27`)

asyncio.get_event_loop() emits DeprecationWarning under Python 3.12+ when no loop is running (GAIA requires >=3.10). There's existing precedent in tests/unit/chat/ui/test_history_limits.py:42 so this isn't a regression, but asyncio.run(coro) is the modern drop-in. Not blocking for the release — worth a follow-up issue to migrate both.
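A minimal sketch of the suggested migration for a synchronous test helper. The `run` helper and `fetch_value` coroutine are hypothetical names and do not come from the GAIA test suite.

```python
import asyncio


async def fetch_value():
    # Stand-in coroutine for whatever the test needs to await.
    await asyncio.sleep(0)
    return 42


def run(coro):
    """Run a coroutine to completion on a fresh event loop and return its result."""
    return asyncio.run(coro)


result = run(fetch_value())
```

Note that `asyncio.run()` creates and closes a new event loop on each call and raises `RuntimeError` when invoked from inside a running loop, so it suits synchronous test helpers rather than code that is already async.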

Strengths

Regression pins at three layers are what make this PR durable: a unit-level functional test for the kwarg selection, a source-level re.search guard against reintroducing create_agent(..., model_id=model_id, ...) (test_chat_helpers_model_resolution.py:759), and a matching guard against _store_agent(..., _effective_model(...)) to catch the #842 cache-key divergence (line 776). Future refactors that re-introduce either antipattern will fail loudly in <5ms.
_effective_model uses explicit None check, not or (_chat_helpers.py:137) — correctly avoids the footgun of treating empty-string model_id as missing, and the docstring names the silent-wrong-model failure it's preventing. This is the kind of small decision worth documenting.
Cache-hit regression test is object-identity based (test_chat_helpers_model_resolution.py:731) — second_agent is first_agent plus the create_agent.call_count == 1 assertion gives two independent signals that the cache actually hit. Much stronger than asserting on the stored key alone.
SESSION_DEFAULT_MODEL exported from database.py as a single source of truth (line 26) — correctly preferred over duplicating the literal in _chat_helpers.py. The _DB_DEFAULT_MODEL alias at _chat_helpers.py:77 keeps call-sites readable without inventing a second canonical value.
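The difference between the explicit None check and `or` can be seen in a small sketch. Only the `_effective_model(agent, model_id)` call shape appears in the PR; the function bodies, `FakeAgent`, and the fallback value below are assumptions for illustration.

```python
class FakeAgent:
    def __init__(self, model_id):
        self.model_id = model_id


def effective_model_with_or(agent, fallback):
    # Falsy values such as "" are replaced by the fallback.
    return getattr(agent, "model_id", None) or fallback


def effective_model_explicit(agent, fallback):
    # Only a missing attribute or None triggers the fallback.
    value = getattr(agent, "model_id", None)
    return fallback if value is None else value


empty = FakeAgent("")
unset = FakeAgent(None)
```

With `empty`, the `or` variant returns the fallback while the explicit variant preserves the empty string; with `unset`, both return the fallback.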

Verdict

Approve with suggestions. No blockers. The one concrete suggestion worth taking before merge is the commit count in the release notes — the rest are nits that can land in a follow-up.

itomek and others added 6 commits April 20, 2026 18:50

style(mcp): apply Black formatting to mcp_bridge.py (CI lint fix)

3b51ca9

itomek requested a review from kovtcharov-amd April 23, 2026 23:08

itomek self-assigned this Apr 23, 2026

itomek added the release label Apr 23, 2026

github-actions Bot added the documentation, mcp, and tests labels Apr 23, 2026

docs(release): tone down v0.17.4 notes, drop install block, match v0.17.1 style

1e19ce1

itomek enabled auto-merge April 23, 2026 23:54

kovtcharov-amd approved these changes Apr 24, 2026

itomek added this pull request to the merge queue Apr 24, 2026

Merged via the queue into main with commit 9a74298 Apr 24, 2026
40 of 43 checks passed

itomek deleted the v0.17.4-release branch April 24, 2026 00:20

itomek merged 7 commits into main from v0.17.4-release
