feat(0.9.1): Wire 2 — Pavlovian percept aversion#256

Merged
dennys246 merged 2 commits into main from feat/0-9-1-wire-2-pavlovian-aversion
May 17, 2026

@dennys246
Owner

Summary

Stage 3 of release_0_9_1.md (lifted from bio_emergent_persona_foundations.md § Wire 2). The only 0.9.1 wire with new persistence: a percept (independent of action context) acquires learned valence when a pain signal fires while that percept's entity is in scope. A dragon that burned the agent once now carries elevated salience for every subsequent dragon percept — Pavlovian fear conditioning, not prompt-driven caution.

What ships

| Component | Where | LOC |
|---|---|---|
| NAc._percept_valences + record/get/decay_percept_valences + get_percept_aversions | src/maxim/decisions/nac.py | +290 |
| _NAC_FORMAT_VERSION = "1.1" + percept_valences in dump/load_state | same | (in above) |
| create_percept_valence_subscriber + auto-wire in build_pain_bus | src/maxim/proprioception/pain_bus.py | +110 |
| entity_name published alongside entity_type in PainSignal context | src/maxim/embodiment/body.py | +14 |
| GatingContext.learned_aversions field (CC3 audit-gated) + TextSalienceScorer Pavlovian modulation + _match_learned_aversion helper | src/maxim/runtime/gating.py | +120 |
| BioEnrichmentPipeline._snapshot_learned_aversions + _emit_wire_2_aversion_event for Roy-3 disambiguation | src/maxim/integration/bio_enrichment.py | +75 |
| decay_percept_valences wired into agent_loop section 8.5 | src/maxim/runtime/agent_loop.py | +6 |
| Tests (66 tests, 13 layers) | tests/unit/test_wire_2_percept_aversion.py | +900 (new file) |
| Existing test updates (subscriber counts, NAc format-version, persistence-compat) | tests/unit/test_pain_bus.py, tests/integration/test_persistence_compat.py | ±31 |

Total: ~1,780 LOC source + tests. 66 unit tests + 442 module regression passing. Full fast suite: 6759 passed, 9 skipped.

Frozen contract impact (CC3 audit)

  • GatingContext.learned_aversions: dict | None = None — additive optional field at end of field list. Docstring updated. Non-breaking per CC3 shape-frozen rules.
  • NAc dump payload adds percept_valences key; _format_version bumped from "1.0" → "1.1" via new _NAC_FORMAT_VERSION constant. Backward-compat reader (load_state) returns empty dict for missing field. Non-breaking for old payloads.
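
A minimal sketch of the backward-compatible read described above, assuming the dump payload is a plain dict. The function name load_percept_valences and both sample payloads are hypothetical; only the "1.1" version string and the percept_valences key come from this PR, and the real load_state may differ.

```python
from typing import Any

# Version string taken from the PR; everything else here is illustrative.
_NAC_FORMAT_VERSION = "1.1"


def load_percept_valences(payload: dict[str, Any]) -> dict[str, float]:
    """Read the percept-valence map, tolerating payloads that predate it."""
    # A "1.0" payload has no percept_valences key, so fall back to an empty map.
    raw = payload.get("percept_valences") or {}
    return {str(key): float(value) for key, value in raw.items()}


# Hypothetical payloads: one written before this PR, one written after it.
old_payload = {"_format_version": "1.0"}
new_payload = {
    "_format_version": _NAC_FORMAT_VERSION,
    "percept_valences": {"dragon": -0.105},  # placeholder key, not the real encoding
}
```

Because the missing key resolves to an empty map rather than an error, an old payload loads cleanly and simply starts with no learned aversions.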

Latent-bridge × subscriber trap — plan-flagged risk

Per docs/plans/release_0_9_1.md risk register + pain_bus_unification.md:

  • Wire-A writes _cluster_reward_bias via update_cluster_reward from record_outcome, keying on (agent_id, cluster_id, tool_signature).
  • Wire 2 writes _percept_valences via create_percept_valence_subscriber from PainBus, keying on (agent_id, entity_class, failure_mode).

Different maps. Different key shapes. No double-attribution by construction. Pinned by TestLatentBridgeSubscriberTrap (3 tests) and TestPerceptValenceCallCount.test_record_percept_valence_called_exactly_once_per_pain (call-count spy regression).
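
The call-count regression can be sketched roughly as follows, assuming a toy bus and a toy NAc. TinyNAc, TinyBus, and the context values are hypothetical stand-ins, not the project's classes; the point is only the spy pattern, which wraps the real method so the write still happens while calls are counted.

```python
from unittest import mock


class TinyNAc:
    """Hypothetical stand-in for NAc with two separately keyed maps."""

    def __init__(self) -> None:
        # Wire-A map, keyed (agent_id, cluster_id, tool_signature).
        self.cluster_reward_bias: dict[tuple[str, str, str], float] = {}
        # Wire 2 map, keyed (agent_id, entity_class, failure_mode).
        self.percept_valences: dict[tuple[str, str, str], float] = {}

    def record_percept_valence(self, *, agent_id: str, entity_class: str,
                               failure_mode: str, intensity: float) -> None:
        key = (agent_id, entity_class, failure_mode)
        self.percept_valences[key] = self.percept_valences.get(key, 0.0) - intensity


class TinyBus:
    """Hypothetical publish/subscribe bus that fans one signal out to callbacks."""

    def __init__(self) -> None:
        self._subscribers = []

    def subscribe(self, callback) -> None:
        self._subscribers.append(callback)

    def publish(self, context: dict) -> None:
        for callback in self._subscribers:
            callback(context)


nac = TinyNAc()
bus = TinyBus()

# wraps= keeps the real behavior while the mock records every call.
with mock.patch.object(nac, "record_percept_valence",
                       wraps=nac.record_percept_valence) as spy:
    # The lambda looks the method up at call time, so it goes through the spy.
    bus.subscribe(lambda ctx: nac.record_percept_valence(
        agent_id=ctx["agent_id"], entity_class=ctx["entity_name"],
        failure_mode=ctx["failure_mode"], intensity=ctx["intensity"]))
    bus.publish({"agent_id": "a1", "entity_name": "dragon",
                 "failure_mode": "burn", "intensity": 0.3})
    call_count = spy.call_count
```

If a second write path were ever added for the same publish, call_count would exceed one and the assertion would fail, which is the invariant the regression is meant to pin.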

Two-lens pre-merge review

Per feedback_review_before_ship.md. Both reviews ran in parallel before opening this PR; 1 Critical + 5 Important + 1 Nice-to-have folded into commit 1c1bf8d:

Architecture findings (1 Critical, 3 Important, 1 Nice-to-have):

| Finding | Fix |
|---|---|
| C1 (CRITICAL: production wire was dead). entity_type from SEM YAML is the CATEGORY (creature, weapon), not the noun (dragon, rusty_sword). Storing aversion under creature means the salience scorer's fragment-match never lifts on percept text containing dragon. The initial test fixtures used entity_type="dragon", which masked the bug entirely; production would have shipped a dead wire. | body.py now publishes BOTH entity_name (noun) and entity_type (category). Subscriber prefers entity_name with fallback to entity_type for legacy producers. Regression: TestProductionWireShape (2 tests) walks full producer→subscriber→scorer pipeline with realistic YAML-derived context shape. |
| I1. with_format_version raises on existing-but-different version; future load→mutate→save would hit it. | TestNacPersistence.test_load_modify_save_round_trip pins the contract. |
| I3. Trap class asserted current behavior, not the invariant that record_percept_valence is called exactly once per pain publish. | TestPerceptValenceCallCount.test_record_percept_valence_called_exactly_once_per_pain spies the real NAc method and counts calls. |
| N1. _match_learned_aversion split on _ and : only; LLM-generated dragon-whelp-style hyphenated names wouldn't fragment. | Added - to the split character set. Regression: test_hyphen_split_match. |
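
To make C1 and N1 concrete, here is a rough sketch of fragment matching, assuming aversions arrive as a dict of entity_class to magnitude. The function match_learned_aversion and its tie-breaking rule (strongest magnitude wins) are assumptions for illustration; the real _match_learned_aversion helper may behave differently.

```python
import re

# Keys are split on underscore, colon, and hyphen (the hyphen is the N1 addition).
_KEY_SPLIT = re.compile(r"[_:\-]")


def match_learned_aversion(percept_text: str, aversions: dict[str, float]):
    """Return (entity_class, magnitude) for the strongest matching aversion, or None."""
    words = set(re.findall(r"[a-z0-9]+", percept_text.lower()))
    best = None
    for entity_class, magnitude in aversions.items():
        fragments = {part for part in _KEY_SPLIT.split(entity_class.lower()) if part}
        # A key matches when any of its fragments appears as a word in the percept.
        if fragments & words and (best is None or magnitude > best[1]):
            best = (entity_class, magnitude)
    return best
```

With this shape, "the rusty blade" matches a key of rusty_sword through the shared fragment rusty, while a key of creature never matches text that only mentions a dragon, which is why storing the category instead of the noun left the wire dead.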

Bio-fidelity findings (HEADLINE + 2 Important):

| Finding | Fix |
|---|---|
| B1 (HEADLINE: Wire 3 precedent applied). Wire 2 had no Roy-3 disambiguation emission. Post-hoc JSONL analysis couldn't distinguish "Pavlovian aversion lifted salience" from "the percept was inherently salient via NAc link signal." Wire 3's bio-fold closed exactly the same shape of bug. | BioEnrichmentPipeline._emit_wire_2_aversion_event emits WIRE_2_AVERSION sim_log event when an aversion match would lift salience. Fields: matched_entity_class, aversion_magnitude, salience_combined, salience_base, percept_words_sample, n_aversions. Best-effort (cold-start produces no event). Regression: TestWire2AversionEmission (2 tests). |
| B2. Sharing reward_bias_decay_tau = 50.0 with action-outcome reward bias contradicted Wire 2's thesis. Pavlovian fear conditioning is biologically slower-extincting. | Separate NACConfig.percept_valence_decay_tau = 200.0 (4x slower). Pinned by test_decay_uses_separate_tau_from_reward_biases. |
| B3. α=0.20 + floor=0.05 left a single intensity=0.3 pain with only one tick of margin above the aversion floor. Single-trial fear conditioning is well-attested biologically; the previous α was fragile. | Raised percept_valence_alpha from 0.20 → 0.35. Single intensity=0.3 pain now writes magnitude=0.105 (2x margin). Regression: test_single_moderate_pain_stays_above_floor. |
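
The B3 numbers can be checked with a few lines of arithmetic, assuming the first-trial write from an empty map is simply alpha times intensity. That update rule is inferred from the figures quoted here (0.20 × 0.3 = 0.06 and 0.35 × 0.3 = 0.105), not taken from the NAc source, and the names below are hypothetical.

```python
AVERSION_FLOOR = 0.05  # floor value quoted in B3


def first_trial_magnitude(alpha: float, intensity: float) -> float:
    """Magnitude written by one pain event on an empty map (assumed rule)."""
    return alpha * intensity


old_magnitude = first_trial_magnitude(0.20, 0.3)  # pre-fold alpha
new_magnitude = first_trial_magnitude(0.35, 0.3)  # post-fold alpha

# Ratio of each magnitude to the floor; above 1.0 means the aversion survives pruning.
old_margin = old_magnitude / AVERSION_FLOOR
new_margin = new_magnitude / AVERSION_FLOOR
```

Under this assumption the old alpha left the single-trial magnitude at roughly 1.2 times the floor, and the new alpha puts it at roughly 2.1 times the floor.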

Cross-confirmed concerns — both lenses independently flagged the observability gap (arch as "debuggability hole," bio as "Roy-3 disambiguation"), giving the high-signal cross-confirmation per feedback_cross_confirmed_review_findings.md.

Deferred (4): positive-valence asymmetric read (B4), string-fragment-vs-EC-cluster identity (B5, post-1.0 architectural), CI grep enforcement for _NAC_FORMAT_VERSION (I2, speculative), N2-N4 minor gaps. None block 0.9.1.

Behavioral signal + Roy-3 measurement

The bio thesis Wire 2 closes: an agent burned by dragon_fire once now carries elevated salience for subsequent dragon percepts even BEFORE any action choice — bio-grounded fear, not prompt-driven caution. Substrate-attributable through the persisted _percept_valences dict.

Roy-3 disambiguation is now structurally possible via the WIRE_2_AVERSION JSONL emission — Roy analyzers can count exactly which percepts had aversion-lifted salience each turn, separating Wire 2's behavioral contribution from the existing NAc-link-driven salience signal.
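
A Roy-side count of aversion lifts might look roughly like this, assuming each sim_log line is a JSON object whose event type lives under an "event" key. That key name and the sample records are assumptions; only the WIRE_2_AVERSION and WIRE_3_FILTER event names and the matched_entity_class field come from this PR.

```python
import json
from collections import Counter
from typing import Iterable


def count_aversion_lifts(lines: Iterable[str]) -> Counter:
    """Count WIRE_2_AVERSION events per matched entity class."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # "event" is a hypothetical key name for the event type.
        if record.get("event") == "WIRE_2_AVERSION":
            counts[record.get("matched_entity_class", "<unknown>")] += 1
    return counts


# Hypothetical log excerpt: two aversion lifts and one unrelated event.
sample_lines = [
    '{"event": "WIRE_2_AVERSION", "matched_entity_class": "dragon", "aversion_magnitude": 0.105}',
    '{"event": "WIRE_3_FILTER"}',
    '{"event": "WIRE_2_AVERSION", "matched_entity_class": "dragon", "aversion_magnitude": 0.09}',
]
lifts = count_aversion_lifts(sample_lines)
```

Percepts that were salient without a matching aversion produce no WIRE_2_AVERSION record, so they drop out of the count automatically.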

Test plan

  • python -m pytest tests/unit/test_wire_2_percept_aversion.py -v → 66 passed (13 layers).
  • python -m pytest tests/unit/test_pain_bus.py tests/unit/test_nac.py tests/unit/test_gating.py tests/unit/test_bio_enrichment.py tests/integration/test_persistence_compat.py tests/unit/test_embodiment_failures.py tests/unit/test_embodiment_sem.py -q → 295 passed (no regression in adjacent surface).
  • python -m pytest tests/ -x -q -m "not slow" --ignore=tests/integration/test_memory_hub.py → 6759 passed, 9 skipped, 40 deselected (post-fold; pre-fold was 6748).
  • ruff check + ruff format clean on touched files.
  • mypy: pre-existing errors only (no new errors introduced).
  • Next: Wire 1 (Stage 4: risk-sensitive action annotation), then Roy-3 validation (Stage 5).

What's next in 0.9.1

Per release_0_9_1.md:

  1. ✅ Stage 0a (Roy-2c probe)
  2. ✅ Stages 0b + 0c (telemetry)
  3. ✅ Stage 1 (Wire 3 — embodiment filter)
  4. ✅ Stage 2 (Wire-A — cluster-bias annotation)
  5. Stage 3 (Wire 2 — Pavlovian percept aversion) — this PR
  6. ⏳ Stage 4 (Wire 1 — risk-sensitive action annotation)
  7. ⏳ Stage 5 (Roy-3 validation)

🤖 Generated with Claude Code

dennys246 and others added 2 commits May 17, 2026 09:00
Stage 3 of release_0_9_1.md (lifted from bio_emergent_persona_foundations.md
§ Wire 2).  The only 0.9.1 wire with new persistence: a percept (independent
of action context) acquires learned valence when a pain signal fires while
that percept's entity_class is in scope.  A dragon that burned the agent
once now carries elevated salience for every subsequent dragon percept,
even before any action choice.

Wires:

- NAc._percept_valences: dict[(agent_id, entity_class, failure_mode), float]
  with record_percept_valence / get_percept_valence / get_percept_aversions
  + decay_percept_valences.  Required keyword-only agent_id, validated
  non-empty per the CLAUDE.md per-agent stash rule.  Range
  [-max_percept_valence, +max_percept_valence] (1.0); accumulation rate
  percept_valence_alpha (0.20); decay tau shared with reward-bias cycles.
- NAc dump _format_version bumped to "1.1" via new _NAC_FORMAT_VERSION
  constant.  Backward-compat: 1.0 payloads (no percept_valences key) load
  cleanly to empty dict.  Entity-class / failure_mode values containing
  ":" (drive specs like "drive:hunger") round-trip via the \x1f
  separator encoding mirroring _cluster_reward_bias.
- decay_percept_valences wired into agent_loop section 8.5 alongside the
  three other reward-bias decay cycles.
- create_percept_valence_subscriber: new PainBus subscriber, auto-wired
  via build_pain_bus when nac is not None.  Interactive-mode gated;
  reads agent_id / entity_type / failure_mode from signal.context;
  empty fields are a silent DEBUG no-op (out-of-spec producer signal,
  not a fatal error).
- GatingContext.learned_aversions: dict[entity_class, magnitude] | None
  additive optional field per CC3 frozen-contract rules (docstring
  audit gate declares the addition).  TextSalienceScorer reads this
  on every score() call; word-fragment matching splits underscore-
  and colon-separated entity_class keys so "the rusty blade" matches
  an aversion keyed "rusty_sword".
- BioEnrichmentPipeline._snapshot_learned_aversions: pipeline-level
  read of NAc.get_percept_aversions(agent_id=self._agent_id), injected
  into the GatingContext passed to the scorer.  Falls back to None on
  no-NAc / no-agent_id / NAc-raises (DEBUG-logged, non-fatal).
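
A sketch of the separator encoding mentioned above, assuming the triple key is flattened to a single string for the dump. The helper names encode_key and decode_key are hypothetical; the only detail taken from this commit is that \x1f is the separator, so values containing ":" survive the round trip.

```python
_SEP = "\x1f"  # ASCII unit separator, unlikely to appear in entity or drive names


def encode_key(agent_id: str, entity_class: str, failure_mode: str) -> str:
    """Flatten the (agent_id, entity_class, failure_mode) triple into one string."""
    parts = (agent_id, entity_class, failure_mode)
    if any(_SEP in part for part in parts):
        raise ValueError("key parts must not contain the separator")
    return _SEP.join(parts)


def decode_key(encoded: str) -> tuple[str, str, str]:
    """Recover the triple; colons inside a part are left untouched."""
    agent_id, entity_class, failure_mode = encoded.split(_SEP)
    return agent_id, entity_class, failure_mode


round_tripped = decode_key(encode_key("a1", "drive:hunger", "starvation"))
```

Joining on ":" instead would split "drive:hunger" into two parts on decode, which is the failure the separator choice avoids.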

Latent-bridge x subscriber trap (the plan-flagged concern from
docs/plans/pain_bus_unification.md):
  Wire-A's _cluster_reward_bias write path goes through
  update_cluster_reward from record_outcome on the
  (agent_id, cluster_id, tool_signature) triple.
  Wire 2's _percept_valences write path goes through the new subscriber
  on the (agent_id, entity_class, failure_mode) triple.
  Different maps, different key shapes - NO double-attribution by
  construction.  Pinned by TestLatentBridgeSubscriberTrap (3 tests).

Test coverage: 55 new tests across 10 layers in
tests/unit/test_wire_2_percept_aversion.py:
  1. record/get_percept_valence (11 tests) - clamp, per-agent isolation,
     agent_id validation.
  2. get_percept_aversions (6 tests) - aggregation, aversion-side only,
     floor pruning, per-agent isolation.
  3. decay_percept_valences (3 tests) - single-tick shape, prune
     threshold.
  4. NAc persistence (6 tests) - _format_version=1.1, dump/load round-
     trip, save/load file round-trip, backward-compat 1.0 read,
     drive-spec colon round-trip.
  5. create_percept_valence_subscriber (6 tests) - intensity gating,
     context derivation, interactive-mode suppression, NAc-raise
     surfaces via logger.exception.
  6. build_pain_bus auto-wire (3 tests) - direct_pain_subscribers=3
     when both subjects wired; nac=None skips it; end-to-end publish
     writes the map.
  7. Latent-bridge x subscriber trap (3 tests) - Wire-A and Wire 2
     write paths stay disjoint by key shape and producer surface.
  8. GatingContext + TextSalienceScorer (10 tests) - fragment match
     semantics, saturating-mix modulation, opt-out (None/empty).
  9. BioEnrichmentPipeline snapshot (4 tests) - fallbacks on no-NAc /
     no-agent_id / NAc-raises.
  10. Multi-agent isolation (1 test) - two distinct pain events with
      separate agent_ids stay in their own buckets.

Existing test updates:
- tests/unit/test_pain_bus.py: TestBuildPainBus
  direct_pain_subscribers counts bumped (nac-only: 1->2; both: 2->3).
- tests/integration/test_persistence_compat.py: NAc resave format
  version assertion updated to _NAC_FORMAT_VERSION.

Frozen contract impact (per CC3 audit):
- GatingContext.learned_aversions: dict | None = None additive optional
  field with default at end of field list.  Docstring updated.  Non-
  breaking per CC3 shape-frozen rules.
- NAc dump payload adds percept_valences key; backward-compat reader
  handles missing field as empty dict.  Non-breaking.

Test plan:
- [x] python -m pytest tests/unit/test_wire_2_percept_aversion.py -v
       -> 55 passed.
- [x] Targeted regression (pain_bus + nac + gating + bio_enrichment +
       all 0.9.1 stage tests + persistence_compat + build_bio_stack)
       -> 366 passed.
- [x] python -m pytest tests/ -x -q -m "not slow"
       --ignore=tests/integration/test_memory_hub.py
       -> 6748 passed, 9 skipped, 40 deselected.
- [x] ruff check + ruff format clean on touched files.
- [x] mypy: pre-existing errors only (no new errors introduced).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two-lens pre-merge review on commit e89a07a surfaced 1 Critical + 5
Important + 1 Nice-to-have. All actionable findings folded into this
commit before PR; no Critical or Important findings are deferred.

## Architecture lens findings (folded)

**C1 (CRITICAL — production wire was dead).** PainSignal context's
``entity_type`` from SEM YAML is the CATEGORY (``creature``,
``weapon``), not the noun (``dragon``, ``rusty_sword``).  Storing
aversion under ``creature`` means the salience scorer's fragment-match
never lifts on percept text containing ``dragon`` — the whole wire
was structurally dead in production.  My test fixtures used
``entity_type="dragon"`` which masked the bug entirely.

Fix: ``body.py::_publish_pain`` and ``_publish_drive_pain`` now publish
BOTH ``entity_name`` (the YAML noun, ``entity.name``) AND
``entity_type`` (the YAML category, ``entity.entity_type``).  The
Wire 2 subscriber prefers ``entity_name`` with fallback to
``entity_type`` for legacy producers.  Regression guard:
``TestProductionWireShape`` (2 tests) walks the full producer →
subscriber → scorer pipeline using the post-fold context shape;
``TestPerceptValenceSubscriber.test_subscriber_prefers_entity_name_over_entity_type``
pins the precedence directly.
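
The precedence rule can be sketched as below, assuming the signal context is a plain dict. The helper name resolve_entity_class is hypothetical and the real subscriber may validate its inputs differently.

```python
def resolve_entity_class(context: dict) -> str:
    """Prefer the noun (entity_name); fall back to the category (entity_type)."""
    for key in ("entity_name", "entity_type"):
        value = context.get(key)
        if isinstance(value, str) and value.strip():
            return value.strip()
    # Neither field is usable: the caller treats this as a no-op signal.
    return ""
```

A post-fold producer yields "dragon" here, a legacy producer that only sends entity_type still yields "creature", and an empty context yields an empty string.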

**I1 (Important — load-modify-save round-trip).** ``with_format_version``
raises ``ValueError`` on a pre-existing ``_format_version`` that doesn't
match the writer's version.  Fold-safe today because ``dump()`` doesn't
pre-stamp the field, but a future load→mutate→save cycle would hit it.

Fix: ``TestNacPersistence.test_load_modify_save_round_trip`` pins the
contract — must not raise.

**I3 (Important — write-disjoint invariant).** The 3-test latent-bridge
trap class asserted CURRENT disjoint behavior, not the INVARIANT that
``record_percept_valence`` is called exactly once per pain publish.

Fix: ``TestPerceptValenceCallCount.test_record_percept_valence_called_exactly_once_per_pain``
spies on the real NAc method and counts calls through a real bus
publish.  If a future refactor lifts ``record_percept_valence`` into
``ToolPainBridge._on_embodiment_pain``, this fails LOUDLY.

**N1 (Nice-to-have — hyphen split).** ``_match_learned_aversion`` split
on ``_`` and ``:`` only.  LLM-generated entity names like
``dragon-whelp`` wouldn't fragment cleanly.  Fix: added ``-`` to the
split character set.  Regression: ``test_hyphen_split_match``.

## Bio-fidelity lens findings (folded)

**B1 (HEADLINE — Wire 3 precedent applied).** The bio reviewer's
highest-priority finding (cross-confirmed with arch lens's
observability concerns): Wire 2 had no Roy-3 disambiguation emission.
Post-hoc JSONL analysis cannot distinguish "Pavlovian aversion lifted
salience" from "the percept was inherently salient via NAc link signal"
without a structured event.  Wire 3's bio-fold closed exactly the same
class of bug with ``WIRE_3_FILTER``.

Fix: ``BioEnrichmentPipeline._emit_wire_2_aversion_event`` emits a
``WIRE_2_AVERSION`` sim_log event when an aversion match would lift
salience.  Fields match the Wire 3 shape: matched_entity_class,
aversion_magnitude, salience_combined, salience_base,
percept_words_sample, n_aversions.  Best-effort emission (cold-start
agents and non-matching percepts produce no event).  Regression:
``TestWire2AversionEmission`` (2 tests: emission-when-match,
no-emission-when-cold-start).

**B2 (Important — decay tau bio-fidelity).** Sharing
``reward_bias_decay_tau = 50.0`` with action-outcome reward bias
contradicted the wire's thesis ("burned-by-dragon once → wary of
dragons for the session").  Pavlovian fear conditioning is
biologically slower-extincting than action-outcome reward bias.

Fix: new ``NACConfig.percept_valence_decay_tau = 200.0`` (4x slower).
``decay_percept_valences`` uses the new tau.  Regression:
``test_decay_uses_separate_tau_from_reward_biases`` pins the
relationship.
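
To see what the two tau values imply, here is a small sketch that ASSUMES exponential decay of the form exp(-ticks / tau). The commit does not state the decay form, so treat the function and the resulting fractions as illustrative only.

```python
import math


def remaining_fraction(ticks: int, tau: float) -> float:
    """Fraction of a value left after `ticks` steps under assumed exponential decay."""
    return math.exp(-ticks / tau)


# Compare the shared reward-bias tau with the new percept-valence tau after 50 ticks.
reward_bias_left = remaining_fraction(50, tau=50.0)
percept_valence_left = remaining_fraction(50, tau=200.0)
```

Under that assumption, after 50 ticks a reward bias keeps about 37% of its value while a percept valence keeps about 78%, which matches the intent that fear conditioning extinguishes more slowly.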

**B3 (Important — alpha + floor brittleness).** With α=0.20 and
floor=0.05, a single intensity=0.3 pain wrote magnitude=0.06 — one
tick above the floor.  Single-trial fear conditioning is well-attested
biologically; the previous α left the wire fragile.

Fix: raised ``percept_valence_alpha`` from 0.20 → 0.35.  Single
intensity=0.3 pain now writes magnitude=0.105 (2x margin above floor).
Regression: ``test_single_moderate_pain_stays_above_floor``.

## Cross-confirmed concerns

Both lenses independently flagged the salience-lift path's observability
gap and bio-grounding of the decay constants — exactly the
high-signal cross-confirmation per
``.claude/projects/.../feedback_cross_confirmed_review_findings.md``.

## Deferred

- B4 (positive-valence asymmetric read): docstring acknowledges
  positive valences are excluded from ``get_percept_aversions``; a
  future positive-conditioning producer should add a companion
  ``get_percept_attractions`` read.  Not folded — current ABI is
  forward-compatible.
- B5 (string-fragment identity vs EC-cluster identity): bio reviewer's
  long-term call to route through EC pattern completion on the percept
  side instead of string-fragment match.  Substrate-fidelity gap
  acknowledged in helper docstring; full fix is post-1.0 architectural
  work.
- I2 (CI grep for ``_NAC_FORMAT_VERSION`` bump enforcement):
  speculative, no current bug; defer.
- N2-N4: complexity / propagation gaps the reviewer marked acceptable.

## Updated test count

Pre-fold: 55 tests.  Post-fold: 66 tests (+11 net new across the C1
production wire shape, B1 JSONL emission, B2 decay tau, B3 alpha,
N1 hyphen, I1 load-modify-save, I3 call-count layers).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dennys246 dennys246 merged commit 704acb0 into main May 17, 2026
5 checks passed
@dennys246 dennys246 deleted the feat/0-9-1-wire-2-pavlovian-aversion branch May 17, 2026 16:26