feat(email): LLM-assisted triage classification when heuristic confidence is low #1107

@kovtcharov-amd

Description

The heuristic classifier tops out at roughly 70% accuracy. Messages where `confident=False` currently default to `informational`, which is wrong. When the heuristic is not confident, the LLM should classify the message directly instead of falling back to a guess.

Tasks

  • Wire LLM classification into `triage_inbox_impl` — when `confident=False`, send email body to LLM with classification prompt
  • Add structured triage prompt template: `{category, confidence, reasoning}` JSON output
  • If LLM classification fails, raise an actionable error (do not silently default to `informational`)
  • Benchmark heuristic-only vs heuristic+LLM on test corpus
  • Update integration test `xfail` to `pass` (`test_heuristic_triage_meets_baseline_minus_tolerance`)
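
A minimal sketch of how the prompt, JSON parsing, and failure handling from the tasks above could fit together. Everything named here is an assumption for illustration: the `CATEGORIES` set (the issue only names `informational`), the `TriageClassificationError` class, the `llm` callable (any function that takes a prompt string and returns the raw reply), and `classify_with_fallback` itself. None of it is the existing `triage_inbox_impl` code.

```python
import json
from typing import Callable

# Hypothetical category set; only `informational` is named in this issue.
CATEGORIES = {"urgent", "actionable", "informational"}

# Structured prompt asking for {category, confidence, reasoning} as JSON.
TRIAGE_PROMPT = (
    "Classify the email below into exactly one category from: {categories}.\n"
    'Respond with JSON only: {{"category": str, "confidence": float, "reasoning": str}}\n\n'
    "Email:\n{body}"
)


class TriageClassificationError(RuntimeError):
    """Raised when the LLM reply cannot be turned into a valid classification."""


def parse_triage_response(raw: str) -> dict:
    """Validate the raw LLM reply and return the parsed JSON object."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise TriageClassificationError(
            f"LLM returned non-JSON output: {raw[:80]!r}"
        ) from exc
    if not isinstance(data, dict):
        raise TriageClassificationError("LLM reply is not a JSON object")
    missing = {"category", "confidence", "reasoning"} - data.keys()
    if missing:
        raise TriageClassificationError(f"LLM reply is missing fields: {sorted(missing)}")
    if data["category"] not in CATEGORIES:
        raise TriageClassificationError(f"Unknown category: {data['category']!r}")
    return data


def classify_with_fallback(
    body: str,
    heuristic_category: str,
    confident: bool,
    llm: Callable[[str], str],  # hypothetical: prompt in, raw reply text out
) -> str:
    """Keep the heuristic result when confident; otherwise ask the LLM."""
    if confident:
        return heuristic_category
    prompt = TRIAGE_PROMPT.format(categories=", ".join(sorted(CATEGORIES)), body=body)
    # Any parse or validation failure propagates as TriageClassificationError
    # rather than silently defaulting to `informational`.
    return parse_triage_response(llm(prompt))["category"]
```

Passing the LLM in as a plain callable keeps the sketch independent of any particular client and makes it easy to stub in tests.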

Key files: `src/gaia/agents/email/tools/triage_heuristics.py`, `src/gaia/agents/email/tools/read_tools.py`
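
For the benchmark task, one possible shape is a small accuracy helper run once per classifier over the same labelled corpus. This is only a sketch: the `accuracy` function, the `(body, expected_category)` corpus layout, and the classifier callables are assumptions, not code from the files above.

```python
from typing import Callable, Iterable, Tuple


def accuracy(
    classify: Callable[[str], str],
    corpus: Iterable[Tuple[str, str]],  # hypothetical layout: (email body, expected category)
) -> float:
    """Return the fraction of corpus entries the classifier labels correctly."""
    entries = list(corpus)
    if not entries:
        raise ValueError("corpus is empty")
    correct = sum(1 for body, expected in entries if classify(body) == expected)
    return correct / len(entries)
```

Calling `accuracy` with the heuristic-only classifier and then with the heuristic+LLM classifier on the same corpus gives two directly comparable numbers.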

Metadata

Assignees: No one assigned

Labels: p1 medium priority