feat(email): voice/style-matched drafting from Sent history (#1607) #1925
Draft replies came out as neutral boilerplate regardless of how the user actually writes. Now `build_voice_profile` samples recent Sent mail, derives a local style profile (usual greeting, sign-off, typical length, formality signals), and stores it in the agent's SQLite state.db; the system prompt carries that guidance every turn, so LLM-composed draft bodies come out in the user's own voice. `clear_voice_profile` forgets it. Local-first by construction: only derived features are persisted — never raw Sent content — and greeting/sign-off values are matched against a closed vocabulary, so hostile text in a Sent body cannot reach the system prompt. Drafts stay approval-gated (#1264); building the profile and drafting trigger no send side-effect, with a regression test.
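To make the closed-vocabulary idea concrete, here is a minimal sketch of an analyzer that returns only vocabulary matches and numbers. It is not the code in this PR: the function names, the vocabularies, and the profile keys are all assumptions chosen for illustration.

```python
import re
from collections import Counter
from statistics import median

# Hypothetical closed vocabularies; the PR does not list the real ones.
GREETINGS = ("hi", "hello", "hey", "dear", "good morning")
SIGN_OFFS = ("thanks", "thank you", "best", "best regards", "regards", "cheers")

# Contractions with a straight or curly apostrophe (can't, we'll, it's).
CONTRACTION = re.compile(r"\b\w+['\u2019](?:t|s|re|ve|ll|d|m)\b", re.IGNORECASE)


def match_vocab(line, vocab, whole_line=False):
    """Return the vocabulary phrase that opens the line, else None."""
    text = line.strip().lower()
    tail = r"[\s,.!]*$" if whole_line else r"\b"
    # Longest phrases first so "best regards" wins over "best".
    for phrase in sorted(vocab, key=len, reverse=True):
        if re.match(re.escape(phrase) + tail, text):
            return phrase
    return None


def most_common(values):
    """Most frequent non-None value, or None when there is none."""
    counts = Counter(v for v in values if v is not None)
    return counts.most_common(1)[0][0] if counts else None


def analyze_sent_bodies(bodies):
    """Derive a style profile; only vocabulary matches and numbers are returned."""
    greetings, sign_offs, lengths = [], [], []
    contractions = exclamations = 0
    for body in bodies:
        # Skip blank lines and quoted reply text, which is not the user's own voice.
        lines = [ln for ln in body.splitlines()
                 if ln.strip() and not ln.lstrip().startswith(">")]
        if not lines:
            continue
        greetings.append(match_vocab(lines[0], GREETINGS))
        # Search the last few lines bottom-up for a sign-off on its own line.
        hits = (match_vocab(ln, SIGN_OFFS, whole_line=True)
                for ln in reversed(lines[-4:]))
        sign_offs.append(next((hit for hit in hits if hit), None))
        text = " ".join(lines)
        lengths.append(len(text.split()))
        contractions += len(CONTRACTION.findall(text))
        exclamations += text.count("!")
    total_words = sum(lengths)
    return {
        "greeting": most_common(greetings),
        "sign_off": most_common(sign_offs),
        "median_words": median(lengths) if lengths else 0,
        "contraction_rate": contractions / total_words if total_words else 0.0,
        "exclamation_rate": exclamations / total_words if total_words else 0.0,
    }
```

Because every string in the returned dict comes from `GREETINGS` or `SIGN_OFFS`, a Sent body containing instruction-like text can change the counts but cannot contribute its own words to the profile.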
Verdict: Approve ✅

This teaches the email agent to draft in the user's own voice by learning greeting/sign-off/length/formality from their Sent mail, gated behind approval as before.

The design is genuinely local-first: only derived features (fixed-vocabulary phrases + a few numbers) are persisted, never raw Sent content, and there's a sentinel test proving a private sentence from a Sent body never lands in the stored profile.

The headline risk for a feature like this is a hostile Sent email smuggling text into the system prompt (prompt injection). That's cleanly avoided here: the analyzer only emits matches from a closed set of greeting/sign-off words plus word-count/contraction/exclamation stats; arbitrary body text has no path into the prompt. Verified against the code and the tests. No blocking issues; one optional logging nit below.

🔍 Technical details
Strengths
🟢 Minor (optional): expected errors log full tracebacks (
Note (no action needed):
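The "no path into the prompt" property can also be enforced where the guidance is rendered. The sketch below is an illustration under assumptions, not the PR's code: `render_voice_guidance`, the allowed sets, the profile keys, and the tone threshold are all hypothetical. It re-checks stored values against the closed vocabulary before formatting them, so even a tampered profile cannot inject free text.

```python
# Hypothetical closed vocabularies; the real lists are not shown in this PR page.
ALLOWED_GREETINGS = frozenset({"hi", "hello", "hey", "dear", "good morning"})
ALLOWED_SIGN_OFFS = frozenset(
    {"thanks", "thank you", "best", "best regards", "regards", "cheers"}
)
CASUAL_CONTRACTION_RATE = 0.02  # hypothetical threshold for "casual" tone


def _is_number(value):
    """True for int/float, but not bool (which is an int subclass)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def render_voice_guidance(profile):
    """Turn a stored profile into prompt text; return "" when nothing is usable."""
    if not profile:
        return ""
    hints = []
    greeting = profile.get("greeting")
    if isinstance(greeting, str) and greeting in ALLOWED_GREETINGS:
        hints.append(f'Open with "{greeting.capitalize()}".')
    sign_off = profile.get("sign_off")
    if isinstance(sign_off, str) and sign_off in ALLOWED_SIGN_OFFS:
        hints.append(f'Close with "{sign_off.capitalize()}".')
    words = profile.get("median_words")
    if _is_number(words) and words > 0:
        hints.append(f"Aim for about {int(round(words))} words.")
    rate = profile.get("contraction_rate")
    if _is_number(rate):
        tone = "casual" if rate >= CASUAL_CONTRACTION_RATE else "formal"
        hints.append(f"Keep the tone {tone}.")
    if not hints:
        return ""
    return "Match the user's writing voice when drafting replies. " + " ".join(hints)
```

Strings outside the allowed sets are dropped silently and numeric fields are formatted as numbers only, so the rendered guidance is drawn from a fixed set of templates.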
Closes #1607.
Draft replies came out as neutral boilerplate no matter how the user actually writes. Now the agent can learn the user's voice from their own Sent mail (usual greeting, sign-off, typical length, formality), and every drafted reply comes out sounding like them, while staying approval-gated (#1264): building the profile and drafting trigger zero send side-effects, with a regression test. Local-first by construction: only derived features are persisted to the agent's SQLite `state.db` (never raw Sent content), and greeting/sign-off extraction matches a closed vocabulary, so hostile text in a Sent body cannot reach the system prompt.

Per the issue notes, the profile plugs into the draft-composition prompt (the LLM composes `draft_reply` bodies in the agent loop), not into `draft_reply` itself, so the sidecar REST contract is unchanged.

Test plan

- `PYTHONPATH=$PWD/hub/agents/python/email python -m pytest hub/agents/python/email/tests/test_email_voice_profile.py -q`: 16 new tests covering the analyzer (greeting/sign-off/length/formality, quoted-text stripping, curly apostrophes, long signature blocks), persistence round-trip + restart survival, SENT-label-only sampling, prompt injection before/after build, the no-raw-content-persisted sentinel, and the no-send-side-effect acceptance test.
- `tests/unit/agents/test_email_agent*.py` + `tests/integration/test_never_auto_send.py` + `tests/unit/agents/email/`: 748 passed; the 2 failures are pre-existing environment skew and fail identically on a clean tree.
- `black`/`isort` clean on all touched files; `util/lint.py --all` passes.

Deliberately deferred

- `style_match` eval AC: needs the email eval harness extended (draft-quality judging against the corpus from #1230, "feat(eval): generate + commit labelled email-triage corpus + gemma4 baseline") plus a Strix Halo hardware run. Shipping an unrunnable scenario file would be half-finished work. Proposing a follow-up issue for it.
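For readers unfamiliar with the persistence round-trip and restart-survival cases in the test plan, here is a minimal sketch of how a derived-features profile could be stored in SQLite. The table name, the single-row layout, and the function signatures are assumptions; only the name `clear_voice_profile` and the use of a SQLite `state.db` come from this PR.

```python
import json
import sqlite3
from contextlib import closing

# Hypothetical schema: one row holding the profile as JSON.
_SCHEMA = (
    "CREATE TABLE IF NOT EXISTS voice_profile ("
    "id INTEGER PRIMARY KEY CHECK (id = 1), data TEXT NOT NULL)"
)


def _connect(db_path):
    """Open the database and make sure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(_SCHEMA)
    return conn


def save_voice_profile(db_path, profile):
    """Persist the derived features only; raw Sent text is never passed in."""
    with closing(_connect(db_path)) as conn:
        conn.execute(
            "INSERT OR REPLACE INTO voice_profile (id, data) VALUES (1, ?)",
            (json.dumps(profile),),
        )
        conn.commit()


def load_voice_profile(db_path):
    """Return the stored profile dict, or None if none has been built."""
    with closing(_connect(db_path)) as conn:
        row = conn.execute("SELECT data FROM voice_profile WHERE id = 1").fetchone()
    return json.loads(row[0]) if row else None


def clear_voice_profile(db_path):
    """Forget the stored profile (signature assumed for this sketch)."""
    with closing(_connect(db_path)) as conn:
        conn.execute("DELETE FROM voice_profile")
        conn.commit()
```

Each call opens and closes its own connection, so a load after a save behaves like a read after a process restart, which is the property a restart-survival test would check.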