# Experiment 1 — Memory Comparison (No memory vs STM vs STM+LTM)

This notebook evaluates how memory changes an agent’s ability to maintain user preferences and retrieve relevant context.

**System under test:** Digital Self Memory & Personalization Engine  
**Memory layers implemented:**
- Session memory (cleared on exit)
- Short-term memory (recent interactions with decay)
- Long-term memory (Chroma vector store + embeddings + metadata filters)

**Key question:** What changes when we enable long-term semantic memory?


## Setup

I run the same prompt sequence and compare behaviors across memory configurations.

### Prompt sequence
1. `remember that I prefer concise answers and no fluff`
2. `explain your memory system`
3. `what tone should you use when answering me?`

### Conditions
- **No memory:** `--memory_mode no_memory`
- **Short-term only:** `--memory_mode stm`
- **Short + long-term:** `--memory_mode stm_ltm`
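
The three mode names can be read as switches over the memory layers. The following is a minimal sketch of that mapping, assuming `run.py` parses its flags with `argparse`. The `LAYERS` table and the `parse_args` helper are hypothetical names for illustration, since the actual parser is not shown here. Session memory is left out of the table because the conditions do not say whether it is toggled.

```python
import argparse

# Hypothetical mapping from mode name to enabled layers.
# Only the three mode names come from the experiment; the table is an assumption.
LAYERS = {
    "no_memory": {"stm": False, "ltm": False},
    "stm": {"stm": True, "ltm": False},
    "stm_ltm": {"stm": True, "ltm": True},
}


def parse_args(argv=None):
    """Parse the two flags used in this experiment (sketch, not the real run.py)."""
    parser = argparse.ArgumentParser(description="Memory comparison runner (sketch)")
    parser.add_argument("--user_id", required=True)
    parser.add_argument("--memory_mode", required=True, choices=sorted(LAYERS))
    return parser.parse_args(argv)


# Example: the short-term-only condition.
args = parse_args(["--user_id", "exp_stm", "--memory_mode", "stm"])
enabled = LAYERS[args.memory_mode]
```

Restricting `--memory_mode` with `choices` makes a mistyped mode fail at startup instead of silently running the wrong condition.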

### What I record
For each turn:
- Response behavior (qualitative)
- Retrieval log (`ltm_retrieved_count`, retrieved IDs/distances, explanations)
- Personalization state (`tone`, applied rules)


## Commands used

Run each condition with a fresh `user_id` so that stored memories cannot leak from one condition into another:

```bash
python run.py --user_id exp_nomem --memory_mode no_memory
python run.py --user_id exp_stm --memory_mode stm
python run.py --user_id exp_user --memory_mode stm_ltm

```
Note: only the STM + LTM condition was executed in full for this write-up, with its logs captured below.
The other two modes can be reproduced with the same prompt sequence.
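
To make the reproduction less error-prone, the three invocations can be generated from one table. The sketch below only builds the argument lists and checks that no `user_id` is reused. How `run.py` receives the prompts (interactive input or otherwise) is not specified here, so that part is deliberately left out. `CONDITIONS` and `build_command` are hypothetical names.

```python
import sys

# (user_id, memory_mode) pairs taken from the commands above.
CONDITIONS = [
    ("exp_nomem", "no_memory"),
    ("exp_stm", "stm"),
    ("exp_user", "stm_ltm"),
]


def build_command(user_id, memory_mode):
    """Return the argv list for one condition, using the current interpreter."""
    return [sys.executable, "run.py", "--user_id", user_id, "--memory_mode", memory_mode]


# A reused user_id would let one condition read another condition's memories.
user_ids = [user_id for user_id, _ in CONDITIONS]
assert len(set(user_ids)) == len(user_ids), "each condition needs its own user_id"

commands = [build_command(user_id, mode) for user_id, mode in CONDITIONS]
for cmd in commands:
    print(" ".join(cmd[1:]))
```

Each list can be passed to `subprocess.run` as is once the prompt-feeding mechanism is known.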


## Evidence — STM + LTM condition (captured run)

### Turn 1: store preference
**Input:** `remember that I prefer concise answers and no fluff`

**Observed:**
- Personalization set to `tone=concise`
- LTM retrieval count is `0` on the storage turn (self-retrieval avoided)

**Paste terminal output here:**
PASTE YOUR TURN 1 OUTPUT (assistant + retrieval log + personalization)


### Turn 2: ask about memory system
**Input:** `explain your memory system`

**Observed:**
- `ltm_retrieved_count: 1`
- Retrieved memory includes tags `['memory']`
- Personalization remains `tone=concise`

**Paste terminal output here:**
PASTE YOUR TURN 2 OUTPUT


### Turn 3: ask tone explicitly
**Input:** `what tone should you use when answering me?`

**Observed:**
- The stored memory is retrieved from LTM again (evidence of long-term persistence)
- Tone remains concise

**Paste terminal output here:**
PASTE YOUR TURN 3 OUTPUT

## Analysis

### What improved with STM + LTM
- **Preference persistence:** The preference is stored and later retrieved through the long-term semantic layer.
- **Context transparency:** Retrieval logs show what was retrieved and why (`ltm_explanations`).
- **Personalization reliability:** The assistant’s system prompt reflects the stored preference (`tone=concise`).

### Failure modes observed / mitigations
- **Self-retrieval artifact:** If the system retrieves the memory it just stored, similarity can be artificially perfect (distance ≈ 0.0).
  - Mitigation: the pipeline excludes newly stored memory IDs on the same turn (`exclude_ltm_ids`), so LTM retrieval is `0` on the storage turn.
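
The exclusion step can be sketched as a post-filter over the retrieval results. This is an illustration under assumptions, not the pipeline's actual code: only the name `exclude_ltm_ids` comes from the run, while `filter_retrieved`, the `mem_new` ID, and the distance values are hypothetical.

```python
def filter_retrieved(ids, distances, exclude_ltm_ids):
    """Drop hits whose ID was stored during the current turn."""
    excluded = set(exclude_ltm_ids)
    kept = [(i, d) for i, d in zip(ids, distances) if i not in excluded]
    return [i for i, _ in kept], [d for _, d in kept]


# Storage turn: the only hit is the memory just written, at distance 0.0.
ids, distances = filter_retrieved(["mem_new"], [0.0], exclude_ltm_ids=["mem_new"])
ltm_retrieved_count = len(ids)  # 0, matching the Turn 1 observation

# A later turn: nothing was stored this turn, so the same memory is returned.
later_ids, later_distances = filter_retrieved(["mem_new"], [0.31], exclude_ltm_ids=[])
```

Because this filter runs after the query, a caller that wants `k` results may need to request `k + len(exclude_ltm_ids)` from the vector store so that exclusions do not shrink the final list.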

### Expected behavior in other modes (not captured in this run)
- **no_memory:** nothing persists long-term, so preferences are never retrieved via LTM; personalization comes only from instructions given in the current turn.
- **stm:** continuity holds only across recent turns and decays over time; older preferences are not recalled semantically.


## Conclusion

Long-term semantic memory (Chroma + embeddings) enables the agent to:
- store stable user preferences,
- retrieve them selectively when relevant,
- and transparently explain retrieval decisions.

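For the one condition captured here, this is consistent with the central project claim: **relevance + control matter more than raw recall**. Running the `no_memory` and `stm` conditions is still needed to confirm the contrast.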
