# Technical Report — Affective State and Behavior of ENTITY (01/10/2025)

## Executive Summary

This report analyzes the affective state transitions and behavioral patterns of **ENTITY** during execution. Based on log traces and the `EntityBrain` framework, ENTITY’s mood evolution follows a clear cycle — **curious → bored → irritated** — driven by silence levels, tension metrics, and temporal gating.

Speech decisions emerge from a probability function combining **base rate, tension, silence memory, emotion bias, temporal boosts, and random noise**, clipped within defined thresholds. At low exploration settings (`temperature=0.2`), ENTITY tends to repeat stable formulas (“I am your shadow…”), which are often rejected by the validator, resulting in “no valid phrase” events.

The report identifies three main areas of control: **generation parameters** (variety, repetition penalties), **validation strictness** (length, ok_ratio, duplicate filtering), and **temporal rhythm** (speech rate, cooldowns, emotion thresholds). Suggested presets — *Tense Silence*, *Balanced*, *Intrusive* — provide ready-to-use configurations.

Overall, ENTITY displays a predictable affective trajectory modulated by silence and tension, while speech behavior can be tuned through probabilistic weights, validator rules, and preset profiles to achieve the desired narrative style.

---

## Execution Context

* Generator used: **Groq / llama-3.3-70b-versatile** (chat completions).
* Default parameters in code: `temperature=0.2`, `top_p=0.9`. Length is constrained to **10–14 words** by an internal validator, which also blocks recent duplicates (`last_responses`).
* The **temporal gate** decides *whether* to speak, combining silence, tension, affective state, speech rhythm, and a random factor (`TemporalController` with `TemporalConfig`).

## Evidence from the Log (significant extract)

* Scene start: **short silence** (`raw_silence=1.64s`) → **emo=curious**, `p≈0.33`, **speak=False**.
* With growing silence (EMA across scales) → **curious → bored**. Multiple **speak=False** decisions.
* At higher silence levels (sS≈0.75, sM≈0.54) → **bored → irritated**; `p≈0.36`, **speak=True**.
* Generation repeatedly emits the same phrase (“*I am your shadow…*”); validator flags **duplicate** / **low ok_ratio**; with no fresh candidates → “**No valid phrase generated**”.
* Silence rises again and emotion reverts to **bored**, with further **speak=False**.

## Behavioral Reading

1. **Curious → Bored → Irritated**
   State shifts follow the implemented heuristic: *high silence* and/or *medium-high tension* push toward **irritated**; *low tension + high silence* push toward **bored**; recent speech or very low silence resets to **curious**.
   This is implemented in `_maybe_update_emotion()` with temporal hysteresis (`emotion_min_dwell`) and thresholds over the silence/tension EMAs. Each emotion contributes a bias to `p`: **curious +0.12**, **bored −0.08**, **irritated +0.22**.

2. **Why does it speak right there?**
   Probability `p` combines:

   * **base_prob** (from `respond_prob`),
   * **tension** weight (`w_tension`), **silence memory** (`w_memory` over mid/long EMA),
   * **emotion bias** (`w_emotion` · emotion_bias),
   * **temporal boost** on minimal gaps (`w_temporal`),
   * **rate penalty/bonus** depending on deviation from `target_speak_rate_per_min`,
   * **random noise** (`w_random`).

   The result is clipped between `prob_floor` and `prob_ceil`, and a **hard cooldown** applies after each utterance.

3. **Repetitions (“I am your shadow…”) and refusals**

   * `temperature=0.2` + strict constraints → **low exploration**, stable formulaic outputs (“I am…”, “Your existence…”).
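   * **Validator** (threshold `ok_ratio ≥ 0.65`, duplicate filter, 10–14 word length) discards outputs that are duplicates, fall below the `ok_ratio` threshold, or miss the length range; if every candidate fails, the system records “**No valid phrase generated**”.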
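
The combination described in item 2 can be sketched as follows. This is a minimal illustration, not the `TemporalController` code: the additive form, the names `speak_probability` and `GateWeights`, the treatment of the cooldown as a zero probability, and every default value except the emotion biases, `w_random=0.05`, and `hard_cooldown=5.0` are assumptions made for the example.

```python
import random
from dataclasses import dataclass

# Emotion biases as listed in this report.
EMOTION_BIAS = {"curious": 0.12, "bored": -0.08, "irritated": 0.22}


@dataclass
class GateWeights:
    # Hypothetical placeholder values; only w_random and hard_cooldown
    # are taken from the report.
    w_tension: float = 0.25
    w_memory: float = 0.25
    w_emotion: float = 0.25
    w_temporal: float = 0.10
    w_random: float = 0.05
    prob_floor: float = 0.05
    prob_ceil: float = 0.95
    hard_cooldown: float = 5.0


def speak_probability(base_prob, tension, silence_memory, emotion,
                      temporal_boost, rate_adjust, since_last_speech,
                      weights=None, rng=random):
    """Return the speak probability, or 0.0 while the hard cooldown is active."""
    w = weights if weights is not None else GateWeights()
    if since_last_speech < w.hard_cooldown:
        return 0.0  # assumed handling: no speech at all during the cooldown
    noise = rng.uniform(-1.0, 1.0)
    p = (base_prob
         + w.w_tension * tension
         + w.w_memory * silence_memory
         + w.w_emotion * EMOTION_BIAS[emotion]
         + w.w_temporal * temporal_boost
         + rate_adjust  # negative when speaking above the target rate
         + w.w_random * noise)
    # Clip into [prob_floor, prob_ceil].
    return min(w.prob_ceil, max(w.prob_floor, p))


def decide(p, rng=random):
    """Bernoulli draw: speak with probability p."""
    return rng.random() < p
```

Passing a seeded `random.Random` instance as `rng` makes individual decisions reproducible, which helps when comparing a run against the logged `p` values.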

---

# Parameters to Expose / Tune for ENTITY

Below are the main **knobs** (with initial suggestions). Names match those in code.

## 1) Generation (variety, repetition)

* **`temperature`** (controls variety): **0.6–0.9** (recommend 0.7 as default).
* **`top_p`**: 0.9–0.95 (keep 0.9 initially).
* **Repetition penalties** (add to Groq call):

  * `presence_penalty: 0.6`
  * `frequency_penalty: 0.3`

  *(Not in code yet; pass both alongside `temperature`/`top_p` in `chat.completions.create`.)*
* **Anti-pattern starts** (prompt/validator logic): ban openers like **“I am… / I exist… / You are…”**; add a **blacklist of trigrams** from last N utterances. *(Current code blocks exact duplicates via `last_responses`, but not near-forms).*
* **`num_candidates`**: 2–3 (already supported). Slightly higher improves chances of a non-duplicate candidate.
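
One way to keep the sampling settings in a single place is a small helper that builds the request arguments. The sketch below assumes an OpenAI-style chat-completions interface; the helper name `build_generation_kwargs` is hypothetical, and whether the provider honors `presence_penalty` and `frequency_penalty` for this model should be confirmed in its API reference before relying on them.

```python
def build_generation_kwargs(messages,
                            temperature=0.7,
                            top_p=0.9,
                            presence_penalty=0.6,
                            frequency_penalty=0.3,
                            model="llama-3.3-70b-versatile"):
    """Collect the sampling parameters for one chat-completions request."""
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        # Repetition penalties suggested in this report (not in code yet).
        "presence_penalty": presence_penalty,
        "frequency_penalty": frequency_penalty,
    }


# Intended use (not executed here; `client` stands for an already
# configured chat-completions client object):
#   response = client.chat.completions.create(**build_generation_kwargs(msgs))
```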

## 2) Validator (strict vs. flexible)

* **`ok_ratio`**: lower from **0.65 → 0.60** for more permissiveness; keep 0.65 for stricter “poetic” tone.
* **Duplicate window**: currently ~20 utterances; raise to **32–40** for longer scenes.
* **Length**: 10–14 words works stylistically; optional: diversify by emotion (*irritated*: 10–12; *curious*: 12–14).
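
The length, duplicate-window, banned-opener, and trigram ideas above can be combined in one validator. The following is a sketch under stated assumptions rather than the existing validator: the class `PhraseValidator`, its method names, and the trigram-overlap rule are hypothetical, and `ok_ratio` is left out because its definition is not given in this report.

```python
from collections import deque

# Openers the report suggests banning.
BANNED_OPENERS = ("i am", "i exist", "you are")


def _normalize(text):
    return " ".join(text.lower().split())


def _trigrams(text):
    words = text.split()
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


class PhraseValidator:
    def __init__(self, min_words=10, max_words=14, window=20,
                 max_shared_trigrams=0):
        self.min_words = min_words
        self.max_words = max_words
        self.max_shared_trigrams = max_shared_trigrams
        # Sliding window of recently accepted phrases (normalized).
        self.recent = deque(maxlen=window)

    def check(self, phrase):
        """Return (ok, reason) without modifying the history."""
        text = _normalize(phrase)
        n_words = len(text.split())
        if not self.min_words <= n_words <= self.max_words:
            return False, "length"
        if any(text.startswith(opener + " ") for opener in BANNED_OPENERS):
            return False, "banned_opener"
        if text in self.recent:
            return False, "duplicate"
        seen = set()
        for past in self.recent:
            seen |= _trigrams(past)
        if len(_trigrams(text) & seen) > self.max_shared_trigrams:
            return False, "near_duplicate"
        return True, "ok"

    def accept(self, phrase):
        """Record a phrase that was actually spoken."""
        self.recent.append(_normalize(phrase))
```

Raising `window` to 32–40 corresponds to the longer duplicate window suggested for long scenes; a nonzero `max_shared_trigrams` loosens the near-duplicate rule.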

## 3) Temporal autonomy / rhythm (when to speak)

Defined in `TemporalConfig` and `TemporalController`.

* **Target speech rate** (`target_speak_rate_per_min`):

  * *quiet*: **0.8–1.0**
  * *balanced*: **1.5** *(current)*
  * *intrusive*: **2.0–2.5**
* **Base probability** (`base_prob` from `respond_prob`):

  * *quiet*: **0.45**
  * *default*: **0.55–0.60**
  * *aggressive*: **0.65–0.70**
* **Weights**: `w_tension, w_memory, w_emotion, w_temporal, w_random`:

  * Stronger tie to **silence accumulation**: raise `w_memory` → **0.30–0.35**.
  * Stronger tie to **affective state**: raise `w_emotion` → **0.30–0.35**.
  * More unpredictability: `w_random` → **0.08–0.12** (currently 0.05).
* **Clipping range**: `prob_floor, prob_ceil`.
* **Cooldown** (`hard_cooldown` after speech, currently 5.0s):

  * Avoid bursts: **6–8s**; tighter rhythm: **2–3s**.
* **Minimum gap windows** (`min_gap_micro/short/long`, now 2/20/120s).
* **Emotions**:

  * Bias in `_emotion_bias()`: **curious +0.12**, **bored −0.08**, **irritated +0.22**. Increasing **irritated** makes it more intrusive.
  * Thresholds in `_maybe_update_emotion()` (tension `tM`, silence `sS/sM/sL`) and hysteresis `emotion_min_dwell` (now 5s). Adjusting alters how fast it shifts to **irritated** or reverts to **curious**.
* **EMA half-lives** (`hl_short/mid/long`: 4/15/90s). Higher values extend “long-tail” memory of silence/tension.
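
To make the half-life and dwell parameters concrete, here is a minimal sketch of a time-aware EMA and a dwell gate. It assumes the standard half-life formulation (the old value keeps half its weight after one half-life); the names `HalfLifeEMA` and `gated_emotion` are hypothetical, and the actual threshold logic of `_maybe_update_emotion()` is not reproduced.

```python
class HalfLifeEMA:
    """Exponential moving average whose memory is set by a half-life in seconds."""

    def __init__(self, half_life_s, initial=0.0):
        self.half_life_s = half_life_s
        self.value = initial

    def update(self, sample, dt_s):
        # After dt_s seconds the previous value retains 0.5 ** (dt / half-life)
        # of its weight; the new sample receives the remainder.
        decay = 0.5 ** (dt_s / self.half_life_s)
        self.value = decay * self.value + (1.0 - decay) * sample
        return self.value


def gated_emotion(current, proposed, since_change_s, min_dwell_s=5.0):
    """Hold the current emotion until it has lasted at least min_dwell_s."""
    if proposed != current and since_change_s < min_dwell_s:
        return current
    return proposed


# Three trackers with the half-lives quoted above (4/15/90 s),
# fed a constant silence signal of 1.0 once per second for 10 s.
short, mid, long_ = HalfLifeEMA(4.0), HalfLifeEMA(15.0), HalfLifeEMA(90.0)
for _ in range(10):
    short.update(1.0, 1.0)
    mid.update(1.0, 1.0)
    long_.update(1.0, 1.0)
```

After the loop the short tracker is closest to 1.0 and the long tracker has barely moved, which is the "long-tail" memory effect: a larger half-life reacts more slowly and forgets more slowly.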

---

# Three Preset Profiles (guides)

**A) “Tense Silence” (rare, sharp presence)**

* `temperature=0.65`, `presence_penalty=0.7`, `frequency_penalty=0.35`
* `target_speak_rate_per_min=0.9`, `base_prob=0.5`, `hard_cooldown=7.0`
* `w_memory=0.35`, `w_emotion=0.25`, `w_random=0.06`
* Keep default biases; `emotion_min_dwell=7.0`

**B) “Balanced” (narrative default)**

* `temperature=0.7`, `presence_penalty=0.6`, `frequency_penalty=0.3`
* `target_speak_rate_per_min=1.5`, `base_prob=0.58`, `hard_cooldown=5.0`
* `w_memory=0.30`, `w_emotion=0.30`, `w_random=0.08`

**C) “Intrusive” (possessed, unsettling)**

* `temperature=0.8`, `presence_penalty=0.5`, `frequency_penalty=0.25`
* `target_speak_rate_per_min=2.2`, `base_prob=0.68`, `hard_cooldown=2.5`
* `w_emotion=0.35`, raise **irritated** bias to **+0.28**
* Reduce `emotion_min_dwell` to **3.5s** for faster flips
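
The three profiles can be stored as plain data and merged over the code defaults. The sketch below uses the values listed above; the dictionary layout, the key `irritated_bias`, and the helper `resolve_preset` are hypothetical. Settings a profile does not mention are left out, so the existing defaults stay in effect.

```python
PRESETS = {
    "tense_silence": {
        "temperature": 0.65, "presence_penalty": 0.7, "frequency_penalty": 0.35,
        "target_speak_rate_per_min": 0.9, "base_prob": 0.5, "hard_cooldown": 7.0,
        "w_memory": 0.35, "w_emotion": 0.25, "w_random": 0.06,
        "emotion_min_dwell": 7.0,
    },
    "balanced": {
        "temperature": 0.7, "presence_penalty": 0.6, "frequency_penalty": 0.3,
        "target_speak_rate_per_min": 1.5, "base_prob": 0.58, "hard_cooldown": 5.0,
        "w_memory": 0.30, "w_emotion": 0.30, "w_random": 0.08,
    },
    "intrusive": {
        "temperature": 0.8, "presence_penalty": 0.5, "frequency_penalty": 0.25,
        "target_speak_rate_per_min": 2.2, "base_prob": 0.68, "hard_cooldown": 2.5,
        "w_emotion": 0.35,
        "irritated_bias": 0.28,  # hypothetical key for the raised irritated bias
        "emotion_min_dwell": 3.5,
    },
}


def resolve_preset(name, overrides=None):
    """Return a copy of a preset with optional per-scene overrides applied."""
    settings = dict(PRESETS[name])
    if overrides:
        settings.update(overrides)
    return settings
```

For example, `resolve_preset("intrusive", {"hard_cooldown": 3.0})` keeps the intrusive profile but softens its cooldown without modifying the shared `PRESETS` table.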

---

# Practical Tuning Checklist

1. **Increase variety**: raise `temperature` to **0.7** and add `presence_penalty/frequency_penalty`.
2. **Avoid flat openers**: ban “I am… / I exist… / You are…” in validator.
3. **Limit bursts**: if interventions cluster too closely, raise `hard_cooldown` or lower `prob_ceil`.
4. **Balance moods**: if shifting to **irritated** too early, raise silence thresholds or reduce irritated bias.
5. **Loosen/tighten filtering**: adjust `ok_ratio` (0.60–0.70); extend `last_responses` window for long sessions.

---

## Built-in Metrics (instrumentation)

Optional logging to **`entita_metrics.csv`** records `event, p, speak, api_ms, decision_ms, toks_in/out, tension_*, silence_*, emotion`, which is useful for correlating *affective state → p → speak → generation outcome*. Logging is controlled via `ENTITA_METRICS`/`ENTITA_METRICS_PATH` in `_Metrics`.
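
As an example of the correlation mentioned above, the following sketch groups metric rows by emotion and reports the mean `p` and the fraction of `speak` decisions. It assumes that each row carries `p`, `speak`, and `emotion`, and that `speak` is written as `True`/`False`; the function name `summarize_by_emotion` is hypothetical, and the sample rows are synthetic, not taken from a real run.

```python
import csv
import io
from collections import defaultdict


def summarize_by_emotion(csv_file):
    """Return {emotion: (mean_p, speak_fraction, n_rows)} from a metrics CSV."""
    totals = defaultdict(lambda: [0.0, 0, 0])  # [sum of p, speak count, rows]
    for row in csv.DictReader(csv_file):
        acc = totals[row["emotion"]]
        acc[0] += float(row["p"])
        acc[1] += row["speak"].strip().lower() in ("true", "1")
        acc[2] += 1
    return {emo: (sum_p / n, spoke / n, n)
            for emo, (sum_p, spoke, n) in totals.items()}


# Synthetic rows for demonstration only.
sample = io.StringIO(
    "event,p,speak,emotion\n"
    "decision,0.30,False,curious\n"
    "decision,0.34,False,curious\n"
    "decision,0.36,True,irritated\n"
)
summary = summarize_by_emotion(sample)
for emotion, (mean_p, speak_frac, n) in sorted(summary.items()):
    print(f"{emotion}: mean p={mean_p:.2f}, speak fraction={speak_frac:.2f}, n={n}")
```

With a real log, the same function can be called on `open(path, newline="")`; rows that lack a `p` value would need to be filtered out first.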


