fix: Add tool-honesty instructions to prevent workers from fabricating results by PureWeen · Pull Request #371 · PureWeen/PolyPilot

PureWeen · 2026-03-13T18:38:53Z

Problem

Workers in multi-agent orchestration could fabricate results when their CLI tools failed to run, instead of honestly reporting the failure. The user expectation is: if tools don't run, workers should fail and report that — never fall back to self-evaluation or making up results.

Root Cause

Three prompt construction paths lacked tool-usage honesty instructions:

Worker prompt (ExecuteWorkerAsync): No instruction about tool failures
Synthesis prompt (BuildSynthesisPrompt): No guidance to verify tool usage in worker outputs
Evaluator prompts (both BuildEvaluatorPrompt paths): No tool-verification scoring dimension

Fix

Worker prompt: Added CRITICAL: Tool Usage & Honesty Policy section requiring actual tool execution, honest failure reporting, and TOOL_FAILURE sentinel for unrecoverable errors
Synthesis prompt: Added instructions to flag fabricated results and preserve TOOL_FAILURE reports as-is
Evaluator prompts: Added Tool Verification as a 5th scoring dimension (multi-agent evaluator) and verification check (reflection cycle evaluator)
Synthesis-with-eval prompt: Added Tool Verification assessment dimension

Tests

Added 7 unit tests in WorkerToolHonestyTests.cs covering all modified prompt paths:

Worker prompt contains tool-honesty instructions
Synthesis prompt contains tool-verification guidance
Synthesis prompt preserves TOOL_FAILURE reports
Multi-agent evaluator includes Tool Verification dimension
Synthesis-with-eval prompt includes Tool Verification
ReflectionCycle evaluator includes tool verification
ReflectionCycle evaluator preserves existing functionality

All 2544 passing tests continue to pass (4 pre-existing font-sizing failures unrelated).

…r prompts Workers in multi-agent orchestration could fabricate results when CLI tools failed to run, instead of reporting the failure. This adds explicit instructions across all orchestration prompt paths: - Worker prompt: CRITICAL tool usage policy requiring actual tool execution, honest failure reporting, and TOOL_FAILURE sentinel for unrecoverable errors - Synthesis prompt: Instructions to flag fabricated results and preserve TOOL_FAILURE reports as-is without guessing missing results - Evaluator prompts (both multi-agent and reflection cycle): New 'Tool Verification' scoring dimension that penalizes results lacking evidence of actual tool execution - Synthesis-with-eval prompt: Tool Verification assessment dimension Includes 7 unit tests covering all modified prompt paths. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The test was constructing its own string and asserting against it, providing no real regression protection. Now extracts the prompt building into a BuildWorkerPrompt static method and tests it via reflection, matching the pattern of the other 6 tests in the file. Co-authored-by: GitHub Copilot <copilot@github.com>

PureWeen and others added 2 commits March 13, 2026 13:38

PureWeen merged commit 79417d5 into main Mar 13, 2026

PureWeen deleted the fix/session-uitestvalidationskill-orchestrat-20260313-1820 branch March 13, 2026 21:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Add tool-honesty instructions to prevent workers from fabricating results#371

fix: Add tool-honesty instructions to prevent workers from fabricating results#371
PureWeen merged 2 commits intomainfrom
fix/session-uitestvalidationskill-orchestrat-20260313-1820

PureWeen commented Mar 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

PureWeen commented Mar 13, 2026

Problem

Root Cause

Fix

Tests

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant