docs(plans): fix broken CMU link to EMNLP 2004 Email Speech Acts paper#817

Merged
kovtcharov merged 1 commit into main from kalin/fix-emnlp-link
Apr 20, 2026

@kovtcharov
Collaborator

Summary

One-line fix: swap the failing www.cs.cmu.edu/~tom/EMNLP2004_final.pdf URL in docs/plans/email-triage-agent.mdx:2601 for the canonical ACL Anthology record at W04-3240. The CMU URL fails DNS resolution in CI (see recent run), breaking the Verify external URLs check for every open PR that touches docs. ACL Anthology is the permanent archive for ACL/EMNLP papers — stable URL, no more link rot.

Also restored the paper's actual full title ("Learning to Classify Email into 'Speech Acts'") for consistency with the other full-title citations in the same references list.

Test plan

  • curl -sI https://aclanthology.org/W04-3240/ returns 200
  • After merge, Verify external URLs check should go green on downstream PRs

Swap the `cs.cmu.edu/~tom/EMNLP2004_final.pdf` URL (DNS-failing in CI, see
runs/24595902571) for the canonical ACL Anthology record at W04-3240.
Also restore the paper's actual title ("Learning to Classify Email into
'Speech Acts'") for consistency with the other full-title citations in
this list.

Unblocks the `Verify external URLs` check for every open PR.
@github-actions
Contributor

Summary

One-line doc fix that swaps a dead www.cs.cmu.edu URL for the canonical ACL Anthology record (W04-3240) and restores the paper's full title in docs/plans/email-triage-agent.mdx:2601. This unblocks the Verify external URLs check that was failing on every PR touching docs, and uses a permanent archive instead of a personal faculty page — exactly the right move.

Issues Found

None.

Strengths

  • 🟢 Picks the right URL. ACL Anthology is the canonical, archival home for ACL/EMNLP papers — more stable than faculty pages, CiteSeerX, or arXiv mirrors. This kind of link won't rot.
  • 🟢 Corrects the citation title opportunistically ("Learning to Classify Email into 'Speech Acts'"), matching the full-title style used by the surrounding references (Whittaker & Sidner, "Email Overload", Bellotti et al., "Taking Email to Task"). Keeps the list internally consistent.
  • 🟢 Scope-clean. Exactly one line changed, and the PR title and body explain the why (DNS failure in CI, link-rot prevention) rather than the what. Matches the CLAUDE.md PR-description guidance.

Verdict

Approve — ready to merge. Trivial, correct, and it unblocks CI for other open PRs.

@kovtcharov kovtcharov enabled auto-merge April 20, 2026 09:15
@kovtcharov kovtcharov added this pull request to the merge queue Apr 20, 2026
Merged via the queue into main with commit 21e1211 Apr 20, 2026
23 checks passed
@kovtcharov kovtcharov deleted the kalin/fix-emnlp-link branch April 20, 2026 21:42
itomek pushed a commit that referenced this pull request Apr 22, 2026
#817)

## Summary

One-line fix: swap the failing `www.cs.cmu.edu/~tom/EMNLP2004_final.pdf`
URL in `docs/plans/email-triage-agent.mdx:2601` for the canonical ACL
Anthology record at [W04-3240](https://aclanthology.org/W04-3240/). The
CMU URL fails DNS resolution in CI (see [recent
run](https://github.com/amd/gaia/actions/runs/24595902571/job/72072156929)),
breaking the ``Verify external URLs`` check for every open PR that
touches docs. ACL Anthology is the permanent archive for ACL/EMNLP
papers — stable URL, no more link rot.

Also restored the paper's actual full title ("Learning to Classify Email
into 'Speech Acts'") for consistency with the other full-title citations
in the same references list.

## Test plan

- [x] `curl -sI https://aclanthology.org/W04-3240/` returns 200
- [ ] After merge, `Verify external URLs` check should go green on
downstream PRs
@itomek itomek mentioned this pull request Apr 23, 2026
6 tasks
pull Bot pushed a commit to bhardwajRahul/gaia that referenced this pull request Apr 24, 2026
# GAIA v0.17.4 Release Notes

GAIA v0.17.4 is a patch release covering two correctness fixes in the
Agent UI custom-agent path, a null-safety fix in the C++ library for
smaller LLMs, and a broken docs citation.

**Why upgrade:**
- **Custom agents use their declared model** — If a custom agent sets a
model via `kwargs.setdefault("model_id", ...)`, the Agent UI now
respects that setting when the session is at the DB default, instead of
falling back to the session model.
- **Compatibility with smaller LLMs in the C++ library** — The C++ JSON
parser now tolerates `null` values in `"tool"` and `"content"` fields,
which some smaller models emit in place of omitting the field.

---

## What's New

### Custom Agent `model_id` Respected in the Agent UI

`_chat_helpers.py` previously passed `model_id=<session model>`
explicitly to `registry.create_agent()`, which defeated
`kwargs.setdefault("model_id", ...)` in custom agents — `setdefault`
only fires when the key is absent (PR
[amd#841](amd#841)). The Agent UI now builds
`create_kwargs` conditionally, omitting `model_id` when the session is
at the DB default so the agent's `__init__` setdefault governs.
Three-branch precedence is now explicit: `custom_model` setting >
session-explicit model > agent's own `setdefault`.
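
The precedence described above can be illustrated with a minimal Python sketch. It is not the code from `_chat_helpers.py`: `CustomAgent`, `build_create_kwargs_sketch`, and the model-name strings are hypothetical stand-ins, and only the `setdefault` behavior and the three-branch ordering are taken from the notes.

```python
# Hypothetical placeholder; the real value lives in database.py.
SESSION_DEFAULT_MODEL = "default-model"


class CustomAgent:
    """Hypothetical custom agent that declares its model via setdefault."""

    def __init__(self, **kwargs):
        # setdefault only writes the key when the caller did not pass it.
        kwargs.setdefault("model_id", "agent-declared-model")
        self.model_id = kwargs["model_id"]


def build_create_kwargs_sketch(session_model, custom_model=None):
    """Illustrative stand-in for the three-branch precedence."""
    if custom_model:
        # 1. An explicit custom_model setting wins.
        return {"model_id": custom_model}
    if session_model != SESSION_DEFAULT_MODEL:
        # 2. A session-explicit model comes next.
        return {"model_id": session_model}
    # 3. Omit model_id so the agent's own setdefault governs.
    return {}


# Old behavior: always passing model_id defeats the agent's setdefault.
old = CustomAgent(model_id=SESSION_DEFAULT_MODEL)

# New behavior: model_id is omitted when the session is at the default.
new = CustomAgent(**build_create_kwargs_sketch(SESSION_DEFAULT_MODEL))
```

In this sketch `old.model_id` ends up as the session default, while `new.model_id` is the model the agent declared for itself.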

A follow-up fix (PR [amd#842](amd#842))
restored the pre-construction `model_id` as the agent-cache key. The
initial PR amd#841 landing had switched `_store_agent` to use the
post-construction `_effective_model(agent, model_id)` while
`_get_cached_agent` still looked up with `model_id`, so keys never
matched for custom-model agents and the agent was rebuilt on every turn.
A two-turn cache-hit regression test and a static guard on
`_store_agent` call sites were added alongside the fix.
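
A rough sketch of the key mismatch, under stated assumptions: the cache is modeled as a plain dict keyed by `(session_id, model_id)`, and `AgentCacheSketch`, `buggy_store_key`, and `fixed_store_key` are hypothetical names. The real `_store_agent` and `_get_cached_agent` may be shaped differently.

```python
class AgentCacheSketch:
    """Hypothetical cache: lookup always uses the pre-construction model_id."""

    def __init__(self, store_key_fn):
        self._agents = {}
        self._store_key_fn = store_key_fn
        self.builds = 0  # counts how often an agent had to be constructed

    def get_or_build(self, session_id, model_id):
        agent = self._agents.get((session_id, model_id))
        if agent is None:
            self.builds += 1
            # The custom agent picks its own model during construction.
            agent = {"effective_model": "agent-declared-model"}
            key = self._store_key_fn(session_id, model_id, agent)
            self._agents[key] = agent
        return agent


def buggy_store_key(session_id, model_id, agent):
    # Stores under the post-construction model, so lookups never match.
    return (session_id, agent["effective_model"])


def fixed_store_key(session_id, model_id, agent):
    # Stores under the same pre-construction key that lookup uses.
    return (session_id, model_id)


buggy = AgentCacheSketch(buggy_store_key)
fixed = AgentCacheSketch(fixed_store_key)
for _ in range(2):  # two chat turns in the same session
    buggy.get_or_build("session-1", "default-model")
    fixed.get_or_build("session-1", "default-model")
```

After two turns, `buggy.builds` is 2 (rebuilt every turn) and `fixed.builds` is 1 (second turn is a cache hit), which is the shape of the two-turn regression test mentioned above.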

Supporting refactor: extracted `_build_create_kwargs()` and
`_effective_model()` helpers in `src/gaia/ui/_chat_helpers.py` to
deduplicate the three-branch logic across streaming and non-streaming
paths, and exported `SESSION_DEFAULT_MODEL` from `database.py` as the
single source of truth.

---

### C++ Library: Null-Safety in LLM Response Parsing

`parseLlmResponse()` in `cpp/src/json_utils.cpp` now guards
`.get<std::string>()` calls on the `"tool"` and `"answer"` JSON fields
with `.is_string()` / `.is_null()` checks (PR
[amd#780](amd#780)). This fixes a crash
(`json.exception.type_error.302: type must be string, but is null`) when
smaller LLMs (for example `qwen3.5:9b`) return `null` for those fields
instead of omitting them. `json.contains()` returns `true` for `null`
values, so the existing presence checks were insufficient.
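
The same pitfall can be shown with a Python analogue. This is only an illustration, not the C++ fix: `parse_llm_response_sketch` is a hypothetical helper, and the fallback values (`None` for the tool, an empty string for the answer) are assumptions chosen for the example.

```python
import json


def parse_llm_response_sketch(raw):
    """Hypothetical parser that tolerates null in "tool" and "answer"."""
    data = json.loads(raw)
    tool = data.get("tool")
    answer = data.get("answer")
    return {
        # A type check, not a presence check, decides whether to use the value.
        "tool": tool if isinstance(tool, str) else None,
        "answer": answer if isinstance(answer, str) else "",
    }


raw = '{"tool": null, "answer": "done"}'
data = json.loads(raw)

# The key is present even though its value is null, so a presence
# check alone would still hand a null to code expecting a string.
key_present = "tool" in data
value_is_null = data["tool"] is None

parsed = parse_llm_response_sketch(raw)
```

Here `key_present` and `value_is_null` are both true, which mirrors why the presence check in the C++ code was not enough on its own.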

---

## Bug Fixes

- **Email-triage agent plan: broken CMU citation link** (PR
[amd#817](amd#817)) — Swapped the failing
`www.cs.cmu.edu/~tom/EMNLP2004_final.pdf` URL in
`docs/plans/email-triage-agent.mdx` for the canonical ACL Anthology
record at [W04-3240](https://aclanthology.org/W04-3240/). The CMU URL
was failing DNS resolution in CI, breaking the `Verify external URLs`
check on every open docs PR. Restored the paper's full title ("Learning
to Classify Email into 'Speech Acts'") for consistency with other
citations in the same references list.

---

## Full Changelog

**5 commits** since v0.17.3:

- `8fc43f3f` — fix(cpp): add null-safety checks for JSON string fields
in LLM response parsing (amd#780)
- `62722de2` — fix(ui): honor custom agent model_id when session is at
DB default (amd#841)
- `4acfd400` — fix(ui): extract _build_create_kwargs/_effective_model,
import SESSION_DEFAULT_MODEL
- `8f5c7621` — fix(ui): restore intent-key for agent cache store to fix
miss regression (amd#842)
- `a0fdb109` — docs(plans): fix broken CMU link to EMNLP 2004 Email
Speech Acts paper (amd#817)

Full Changelog:
[v0.17.3...v0.17.4](amd/gaia@v0.17.3...v0.17.4)

---

## Release checklist
- [x] `util/validate_release_notes.py docs/releases/v0.17.4.mdx --tag
v0.17.4` passes
- [x] `src/gaia/version.py` → `0.17.4`
- [x] `src/gaia/apps/webui/package.json` → `0.17.4`
- [x] Navbar label in `docs/docs.json` → `v0.17.4 · Lemonade 10.0.0`
- [x] All 5 commits in the range (v0.17.3..HEAD) are represented in the
notes
- [ ] Review from @kovtcharov-amd addressed

---------

Co-authored-by: Tomasz Iniewicz <tomasz.iniewicz@amd.com>
Co-authored-by: Kalin Ovtcharov <kalin@extropolis.ai>

Labels

documentation Documentation changes
