Skip to content

fix(reflect): prevent directive leakage on empty banks#1190

Merged
nicoloboschi merged 1 commit intomainfrom
fix/reflect-directive-leak
Apr 22, 2026
Merged

fix(reflect): prevent directive leakage on empty banks#1190
nicoloboschi merged 1 commit intomainfrom
fix/reflect-directive-leak

Conversation

@nicoloboschi
Copy link
Copy Markdown
Collaborator

Summary

  • When a bank has directives but no memories/facts, the reflect agent echoes directive content verbatim as its answer instead of saying it has no information
  • Root cause: the LLM short-circuits (returns text, no tool calls) and parrots the "MANDATORY" directive text from its system prompt
  • Fix: when directives are present but no evidence has been gathered, skip the text-response path and fall through to the final-prompt path which uses FINAL_SYSTEM_PROMPT (no directives) and says "No data was retrieved"

Test plan

  • Integration test with real LLM: create bank with directive, no memories, reflect — assert directive text not in answer
  • Verified red-green: test fails without fix, passes with fix
  • All existing test_reflect_agent.py tests pass (no regressions)
  • Lint passes

…mpty banks

When a bank has directives but no memories, the LLM short-circuits the
reflect agent loop by returning text directly (no tool calls). Because
the system prompt includes directives marked as MANDATORY, the LLM
echoes the directive text verbatim as its answer.

Fix: when directives are present but no evidence has been gathered,
skip accepting the text response and fall through to the final-prompt
path, which uses FINAL_SYSTEM_PROMPT (no directives) and handles
"no data" gracefully.
@nicoloboschi nicoloboschi merged commit 3d877b0 into main Apr 22, 2026
53 of 54 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant