
fix(prompts): LLM prompting audit fixes (8 findings)#218

Merged
dennys246 merged 4 commits into main from bug/prompting-audit-fixes
May 3, 2026

Conversation

@dennys246
Owner

Summary

Fixes from the LLM-prompting audit (8 findings, all closed). Net: 4 commits, +422/−350 LOC, 6307 tests pass (+10 new regression tests for the contracts that were silently relied on).

  • Acting Coach prompt section bumped to CRITICAL — was IMPORTANT while the bio signals it synthesises (causal_context, valence_context, body_state) were CRITICAL, so the interpreted view dropped before its source data under token pressure.
  • Observation section bumped to IMPORTANT — was NICE_TO_HAVE, so it dropped before conversation history even though tools need the current state to be safely invoked.
  • sense_presence.input_schema migrated from the legacy description-as-value pattern to JSONSchema, so the export marks context as optional (matching the implementation) instead of required (matching the prose). Strict MCP / Anthropic clients no longer reject calls that omit context.
  • Auto-sense blanket `except (KeyError, Exception): pass` in agent_loop.py split: KeyError (tool not registered) stays silent, anything else routes through log_swallowed_exception so silent blindness can no longer hide a real bug.
  • Dead PromptAssembler scaffold deleted (−326 LOC). The substrate plan's B1 ("PromptAssembler as single composition point") was archived as shipped but the migration was never completed; prompts/assembler.py was abandoned scaffold whose only consumer-of-a-consumer (MemoryHub.get_memory_summary) had zero production callers. CLAUDE.md "legacy" label corrected.
  • Hardcoded conversation-turn cap (3 vs 12) replaced with min(12, max(3, n_ctx // 2000)) for embodied / min(20, max(6, n_ctx // 800)) for non-embodied. Embodied agents on 32K-context models no longer get clipped to 3 turns regardless of context.
  • Motor programs truncation rewritten to drop full entries (name + Steps + Known risks together) instead of slicing by line count, which previously left orphan continuation lines whose owning name had been dropped.
  • AgentPool.remove() now drops module-level bio_integration stash entries (_episode_ticks, _latest_pain_intensity, _latest_substrate_nodes) so a future agent reusing the same id can't inherit stale state. No bug today — multi-agent runs use unique ids — but cheap insurance against NPC respawn / pool recycling patterns.
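The proportional turn-cap formulas above can be sketched as a small helper (the function name is hypothetical; the formulas and floors/ceilings are the ones from the change):

```python
def conversation_turn_cap(n_ctx: int, embodied: bool) -> int:
    """Scale the conversation-history turn cap with model context size.

    Floors keep a minimum of history on small-context models; ceilings
    stop huge-context models from flooding the prompt with old turns.
    """
    if embodied:
        return min(12, max(3, n_ctx // 2000))
    return min(20, max(6, n_ctx // 800))

# 4K-ctx embodied:  4096 // 2000 = 2, floored to 3 turns
# 32K-ctx embodied: 32768 // 2000 = 16, capped at 12 turns
```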

Plus regression tests pinning the deliberation-transcript suppression contract (bio_enrichment_context and working_memory_thoughts are intentionally suppressed when a transcript is present) so future producer/consumer divergence surfaces loudly instead of as a stale-prompt regression.

Test plan

  • Full fast suite (6307 passed, 15 skipped, 40 deselected)
  • ruff check + ruff format on every touched file
  • New regression tests for transcript suppression (7), motor program truncation (3), AgentPool stash cleanup (1), sense_presence schema export (1)
  • Sanity-run a sim with MAXIM_LOG_FILE to confirm Acting Coach now appears under token pressure on a 4K-ctx model
  • Verify on a real Anthropic / MCP client that sense_presence calls without context no longer get rejected

🤖 Generated with Claude Code

dennys246 and others added 4 commits May 3, 2026 09:18
From the prompting audit:

#2 Acting Coach was at SectionPriority.IMPORTANT while its source signals
(causal_context, valence_context, body_state) were CRITICAL. Under token
pressure the interpreted view dropped before the raw signals it
synthesises — backwards. Bumped to CRITICAL so the agent's safety
annotations survive alongside the bio data they summarise.
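A minimal sketch of the priority mechanics this fix relies on — the enum values, `Section` type, and `fit_sections` helper are hypothetical stand-ins for the project's real types, but the invariant (lowest-priority sections drop first under token pressure) is the one the fix restores:

```python
from dataclasses import dataclass
from enum import IntEnum

class SectionPriority(IntEnum):  # hypothetical mirror of the real enum
    NICE_TO_HAVE = 0
    IMPORTANT = 1
    CRITICAL = 2

@dataclass
class Section:
    name: str
    text: str
    priority: SectionPriority

def fit_sections(sections, budget_tokens, estimate_tokens):
    """Keep highest-priority sections first until the budget is spent,
    then restore the original prompt order for the kept sections."""
    kept, used = [], 0
    for s in sorted(sections, key=lambda s: s.priority, reverse=True):
        cost = estimate_tokens(s.text)
        if used + cost <= budget_tokens:
            kept.append(s)
            used += cost
    kept.sort(key=sections.index)
    return kept
```

With Acting Coach at IMPORTANT, it competed with (and lost to) sections its own content depends on; at CRITICAL it is retained in the same tier as its source signals.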

#4 Added regression tests for the deliberation_transcript suppression
contract in _add_perception_sections + _add_working_memory_section.
When transcript is present, bio_enrichment_context and
working_memory_thoughts are intentionally suppressed (transcript is
assumed to subsume them). The risk is silent staleness: a fresh percept
arriving after transcript construction would have its bio_enrichment
dropped. The tests pin both branches so any future producer/consumer
divergence surfaces loudly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… auto-sense logging

#7 Bumped current-observation section from NICE_TO_HAVE to IMPORTANT in
PromptBuilder._add_perception_sections. Tools cannot be safely invoked
without knowing the current state, so the observation should outlast
conversation history under token pressure.

#3 SensePresenceTool.input_schema migrated from the legacy
description-as-value pattern to JSONSchema directly. The old form
emitted required: ["context"] in the export despite the prose saying
"Optional", which strict MCP / Anthropic clients reject. execute()
already handles a missing context gracefully and the auto-sense caller
never passes one. Added regression test in test_tool_discovery.py.
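The shape of the migrated schema, sketched — the description text is an assumption, but the key point is that `context` sits in `properties` and not in `required`:

```python
# Sketch of the migrated export: strict MCP / Anthropic clients accept
# calls that omit "context" because it is no longer listed as required.
SENSE_PRESENCE_INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "context": {
            "type": "string",
            "description": "Optional free-text context for the presence check.",
        },
    },
    "required": [],  # context is optional, matching execute()
}
```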

Auto-sense logging in agent_loop.py:1017-1057 — split the blanket
except (KeyError, Exception): pass into:
- KeyError on registry.get(): silent (tool not registered is a
  legitimate config, e.g. agents without entity_map).
- Anything else from execute(): log_swallowed_exception so silent
  blindness can no longer hide a real bug.
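A sketch of the split (the helper body and `auto_sense` signature are hypothetical; the two-branch structure is the change):

```python
import logging

log = logging.getLogger(__name__)

def log_swallowed_exception(tool_name: str, exc: Exception) -> None:
    # Hypothetical stand-in for the project's helper of the same name.
    log.warning("auto-sense: swallowed exception from %s", tool_name,
                exc_info=exc)

def auto_sense(registry: dict, tool_name: str, **kwargs):
    """A missing tool is a legitimate config and stays silent;
    a failure inside execute() is logged, not hidden."""
    try:
        tool = registry[tool_name]   # KeyError: tool not registered
    except KeyError:
        return None                  # e.g. agents without entity_map
    try:
        return tool.execute(**kwargs)
    except Exception as exc:         # anything else: log, don't swallow silently
        log_swallowed_exception(tool_name, exc)
        return None
```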

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e plan B1

The substrate plan's B1 ("PromptAssembler — single composition point")
was archived as shipped, but the migration was never completed. Exit
criterion (grep -r system_message=f outside the assembler returns
nothing) was never met. agents/prompt_builder.py is the actual
production builder at 1658 lines; prompts/assembler.py was a 184-line
abandoned scaffold.

What was dead:
- PromptAssembler class (no production caller, never instantiated)
- compose_memory_section, compose_observation_section methods
- MemorySummary.from_atl classmethod
- MemoryHub.get_memory_summary (only consumer of from_atl; itself had
  zero production callers, only one test asserting it returned None
  when substrate disabled)

Removed: prompts/assembler.py, prompts/__init__.py exports for the
three dead symbols, MemoryHub.get_memory_summary, related tests in
test_substrate_recognition.py (TestMemorySummary, TestPromptAssembler,
test_get_memory_summary_none_when_disabled), and the stale "PromptAssembler
will replace this" comment in dm_runtime.py.

Updated CLAUDE.md "Prompt composition" key files row: prompt_builder.py
is the canonical builder, no longer mislabelled "legacy" — the file
that had the misleading "(legacy)" annotation is the one in production,
and the file labelled canonical is what got deleted.

If a future plan wants to replace prompt_builder.py with a structured
composition layer, start fresh — the previous attempt is not a useful
foundation to build on.

Test count: 6303 passed (was 6312; 9 tests removed for the dead code).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…or truncation + AgentPool stash cleanup

#5 Two prompt-builder budget fixes:

- Conversation history turn cap was hardcoded `3 if acting_coach else 12`,
  ignoring n_ctx entirely. Embodied agents on 32K-context models still got
  3 turns; non-embodied on 4K still got 12. Replaced with proportional
  formulas that scale with n_ctx and floor at sensible minimums:
  embodied min(12, max(3, n_ctx // 2000)),
  non-embodied min(20, max(6, n_ctx // 800)).

- Motor programs section truncated by line count (`m // 15`), which left
  orphan continuation lines (Steps:/Known risks:) when the owning name
  line got dropped. Each motor program is 1-3 lines; line-count slicing
  doesn't respect that structure. Replaced with an entry-aware
  truncate_fn that drops full entries from the tail using a char-budget
  estimate (`max_tok * 4`, same heuristic as the deliberation-transcript
  truncator). Also stops once only the header remains.
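The entry-aware approach can be sketched as follows — a hypothetical `truncate_fn`, not the project's exact implementation, but it preserves the key property that a `Steps:` / `Known risks:` continuation line never survives without its owning name line:

```python
def truncate_motor_programs(lines: list[str], max_tok: int) -> list[str]:
    """Drop whole entries (name + continuation lines together) from the
    tail until the section fits a char budget of max_tok * 4."""
    header, body = lines[0], lines[1:]
    # Group lines into entries: a name line plus its continuations.
    entries, current = [], []
    for line in body:
        if line.strip().startswith(("Steps:", "Known risks:")) and current:
            current.append(line)
        else:
            if current:
                entries.append(current)
            current = [line]
    if current:
        entries.append(current)
    # Keep entries head-first until the budget is exhausted; a partial
    # entry is never emitted, so no orphan continuation lines remain.
    budget = max_tok * 4
    out, used = [header], len(header)
    for entry in entries:
        cost = sum(len(l) + 1 for l in entry)
        if used + cost > budget:
            break
        out.extend(entry)
        used += cost
    return out
```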

The deliberation-transcript budget at line 1277 (`min(2000,
available * 0.3)`) was flagged in the audit but is actually a correct
proportional pattern (cap + fraction); left unchanged.

#6 AgentPool.remove() now drops the per-agent bio_integration stash
entries (`_episode_ticks`, `_latest_pain_intensity`,
`_latest_substrate_nodes`) so a future agent reusing the same id
doesn't inherit stale tick counters / pain intensity / substrate
node refs from the removed one. No bug today since multi-agent
runs use unique ids, but the cleanup is cheap insurance against
NPC respawn or pool recycling patterns.

Tests: +4 (test_remove_clears_per_agent_stash + three motor program
truncation cases pinning no-orphans + tail-drop + head-preservation).
Total: 6307 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dennys246 dennys246 merged commit a815c71 into main May 3, 2026
5 checks passed
@dennys246 dennys246 deleted the bug/prompting-audit-fixes branch May 3, 2026 19:20