# Generative Agents Structure

A generative agent is an orchestrated LLM with an architecture that adds three fundamental modules: Memory, Reflection, and Planning.
Together, these allow the agent to perceive → remember → reason → plan → act → update itself.

---

## 1. Memory Stream

* A chronological database in natural language that records:
  * **Observations** → what the agent perceives or does (“Maria drinks a coffee”).
  * **Reflections** → more abstract summaries (“Maria is passionate about studying”).
  * **Plans** → future actions (“Work on the paper tomorrow at 9”).
* Each record has: text, creation timestamp, and last access timestamp.
* Function: provides continuity and identity to the agent over time.
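
A minimal sketch of such a record and stream, assuming an in-memory list as storage. The names `MemoryRecord`, `MemoryStream`, `add`, and `touch` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Literal


@dataclass
class MemoryRecord:
    text: str                                            # natural-language content
    kind: Literal["observation", "reflection", "plan"]   # record type
    created_at: datetime                                 # creation timestamp
    last_accessed_at: datetime                           # last access timestamp


class MemoryStream:
    """Append-only, chronological list of records (hypothetical helper)."""

    def __init__(self) -> None:
        self.records: List[MemoryRecord] = []

    def add(self, text: str, kind: str, now: datetime) -> MemoryRecord:
        record = MemoryRecord(text, kind, created_at=now, last_accessed_at=now)
        self.records.append(record)
        return record

    def touch(self, record: MemoryRecord, now: datetime) -> None:
        # Called whenever a record is retrieved, so recency can be computed later.
        record.last_accessed_at = now


stream = MemoryStream()
t0 = datetime(2023, 2, 13, 8, 0)
coffee = stream.add("Maria drinks a coffee", "observation", t0)
stream.add("Work on the paper tomorrow at 9", "plan", t0)
stream.touch(coffee, datetime(2023, 2, 13, 9, 30))
```

Keeping the last access time separate from the creation time lets an old memory that is retrieved often stay "fresh".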

### Retrieval (how to select relevant memories)

The system does not pass the entire memory stream to the LLM; it scores each memory on three criteria and selects the most useful ones:

1. **Recency** → recently accessed memories score higher; the score decays exponentially with the time since the memory was last accessed.
2. **Importance** → score 1–10 (assigned by the LLM) to distinguish trivial from salient events (e.g., “have breakfast” = 2; “ask the crush out” = 8).
3. **Relevance** → calculated with embeddings and cosine similarity between the current query and the memories.

The final score is a weighted sum of the three factors, each first normalized to a common range so that no single factor dominates.
The top-ranked memories enter the **prompt context**, from which the LLM produces the response or action.
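
The scoring can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the decay factor `0.995` per hour and the equal weights are tunable defaults, the 2-dimensional embeddings are toy values, and the function names (`cosine`, `min_max`, `retrieve`) are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def min_max(values: List[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def retrieve(memories: List[Dict], query_embedding: Sequence[float],
             now_hours: float, decay: float = 0.995,
             weights=(1.0, 1.0, 1.0), top_k: int = 2) -> List[str]:
    if not memories:
        return []
    # Recency: exponential decay over hours since last access.
    recency = min_max([decay ** (now_hours - m["last_access_hours"]) for m in memories])
    # Importance: the 1-10 score assigned by the LLM when the memory was stored.
    importance = min_max([float(m["importance"]) for m in memories])
    # Relevance: cosine similarity between query and memory embeddings.
    relevance = min_max([cosine(query_embedding, m["embedding"]) for m in memories])
    w_rec, w_imp, w_rel = weights
    scored = [
        (w_rec * r + w_imp * i + w_rel * v, m["text"])
        for r, i, v, m in zip(recency, importance, relevance, memories)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


memories = [
    {"text": "have breakfast", "importance": 2, "last_access_hours": 99.0, "embedding": [1.0, 0.0]},
    {"text": "ask the crush out", "importance": 8, "last_access_hours": 90.0, "embedding": [0.0, 1.0]},
    {"text": "water the plants", "importance": 1, "last_access_hours": 50.0, "embedding": [1.0, 0.2]},
]
top = retrieve(memories, query_embedding=[0.0, 1.0], now_hours=100.0)
```

Here the salient, query-related memory ranks first even though it is not the most recent one, and the very recent but trivial breakfast memory comes second.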

---

## 2. **Reflection**

* Allows the agent to **abstract from the raw data** and form more general thoughts.
* Generated periodically (e.g., when the sum of recent importance scores > 150).
* Process:
  1. The LLM receives the latest ~100 memories.
  2. It is prompted to generate high-level questions (“What is Klaus’s passion?”).
  3. For each question, the most relevant memories are retrieved and the LLM produces **insights** that cite their supporting memories (“Klaus is passionate about research because: [1, 2, 8]”).
* Reflections become **nodes of a tree**: leaves = observations → branches = concepts → root = the most abstract inferences.

Function: creates a **stable identity** and coherent relationships.
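
The threshold trigger and the evidence tree can be sketched like this. The LLM step that writes the insight is not shown (the insight is hard-coded), and `MemoryNode`, `should_reflect`, and `depth` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List

REFLECTION_THRESHOLD = 150  # example threshold quoted in the notes above


@dataclass
class MemoryNode:
    text: str
    importance: int
    evidence: List[int] = field(default_factory=list)  # indices of cited nodes


def should_reflect(nodes: List[MemoryNode], start: int,
                   threshold: int = REFLECTION_THRESHOLD) -> bool:
    """True when the importance accumulated since `start` exceeds the threshold."""
    return sum(n.importance for n in nodes[start:]) > threshold


def depth(nodes: List[MemoryNode], index: int) -> int:
    """0 for a plain observation, 1 + deepest cited node for a reflection."""
    node = nodes[index]
    if not node.evidence:
        return 0
    return 1 + max(depth(nodes, i) for i in node.evidence)


nodes = [
    MemoryNode("Klaus is reading a research paper", 6),
    MemoryNode("Klaus is taking notes in the library", 5),
    MemoryNode("Klaus talks to Maria about his research", 7),
]
# In a real agent the LLM would write this insight and pick the citations.
nodes.append(MemoryNode("Klaus is passionate about research", 8, evidence=[0, 1, 2]))
# Reflections can cite other reflections, which is what makes the structure a tree.
nodes.append(MemoryNode("Klaus would enjoy a research career", 8, evidence=[3]))
```

Storing citations as indices into the stream keeps every insight traceable back to the observations that support it.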

---

## 3. **Planning**

* It helps maintain **temporal coherence** and avoid short-sighted or repetitive behavior.
* Hierarchical structure:
  * **Daily plan in macro-blocks** (5–8 parts: wake up, lessons, lunch, study, dinner).
  * **Hourly breakdown** (e.g., “from 1 to 5 pm I work on the paper” → broken down into subtasks).
  * **Minute-level breakdown** (e.g., 1:00 pm brainstorming → 2:30 pm writing introduction → 3:00 pm break).
* Plans are saved in memory along with observations and reflections.
* **Re-planning**: if something changes (e.g., a party invitation, an unexpected event), the agent updates only the affected part of the plan rather than regenerating the whole day.
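
One possible way to represent the hierarchy and a partial re-plan, assuming times are stored as minutes since midnight. `PlanStep`, `leaf_at`, and `replan_from` are hypothetical names, and the replanning rule (keep what is already finished, replace the rest) is a simplification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanStep:
    start_min: int                 # minutes since midnight
    duration_min: int
    description: str
    substeps: List["PlanStep"] = field(default_factory=list)

    @property
    def end_min(self) -> int:
        return self.start_min + self.duration_min


def leaf_at(step: PlanStep, minute: int) -> Optional[PlanStep]:
    """Most detailed step covering `minute`, or None if outside this step."""
    if not (step.start_min <= minute < step.end_min):
        return None
    for sub in step.substeps:
        found = leaf_at(sub, minute)
        if found is not None:
            return found
    return step


def replan_from(steps: List[PlanStep], minute: int,
                new_steps: List[PlanStep]) -> List[PlanStep]:
    """Keep steps already finished at `minute`; replace everything after."""
    kept = [s for s in steps if s.end_min <= minute]
    return kept + new_steps


paper = PlanStep(13 * 60, 240, "work on the paper", [
    PlanStep(13 * 60, 90, "brainstorming"),
    PlanStep(14 * 60 + 30, 30, "writing introduction"),
    PlanStep(15 * 60, 15, "break"),
])
day = [PlanStep(12 * 60, 60, "lunch"), paper, PlanStep(19 * 60, 60, "dinner")]

current = leaf_at(paper, 14 * 60 + 45)          # what to do at 2:45 pm
updated = replan_from(day, 17 * 60, [PlanStep(18 * 60, 180, "party")])
```

At 2:45 pm the lookup descends to the minute-level step; after the party invitation at 5 pm, only the remaining evening is rewritten.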

---

## 4. **Reaction (adaptation)**

* At each tick, the agent perceives new observations → decides whether to continue with the plan or react.
* Example: John sees Eddy in the garden → instead of ignoring him, he updates the plan and engages him in a dialogue.
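
A sketch of the per-tick decision, assuming the "continue or react" judgment is delegated to a pluggable function. In a real agent that function would be an LLM call; here `rule_based_decider` is a hypothetical stand-in so the example runs on its own.

```python
from typing import Callable, List, Optional

# Takes (current plan step, observation); returns a reaction or None to continue.
Decider = Callable[[str, str], Optional[str]]


def tick(current_step: str, observations: List[str], decide: Decider) -> str:
    """Return the action for this tick: the first reaction, else the planned step."""
    for observation in observations:
        reaction = decide(current_step, observation)
        if reaction is not None:
            return reaction
    return current_step


def rule_based_decider(current_step: str, observation: str) -> Optional[str]:
    # Stand-in for the LLM: react only when Eddy shows up.
    if "Eddy" in observation:
        return "talk to Eddy"
    return None


quiet = tick("water the plants", ["a bird is singing"], rule_based_decider)
busy = tick("water the plants",
            ["a bird is singing", "Eddy is in the garden"], rule_based_decider)
```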

---

## 5. **Dialogue (conversation)**

* Each utterance is generated by conditioning on the **set of relevant memories** about the interlocutor.
* Lines are produced one at a time, but the authors suggest generating them in batches as an optimization (smoother and cheaper dialogues).
* Maintains **relational coherence** (roles: father/son, friends, colleagues).

---

## 6. **Grounding in the environment**

* The world is represented as a **tree of areas and objects** (e.g., House → Kitchen → Stove).
* The agent only knows the subtrees it has explored.
* For each action, the LLM picks the area → sub-area → object; the engine then performs pathfinding and updates the state.
* Example: "making an espresso" → changes the machine state from "idle" to "brewing".
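
The environment tree and a state update can be sketched as below. Pathfinding is left out, and `Area`, `resolve`, and `describe` are hypothetical names; `describe` shows one way the known subtree might be rendered into sentences for the prompt.

```python
from typing import Dict, List, Optional


class Area:
    """A node of the world tree: an area, sub-area, or object with optional state."""

    def __init__(self, name: str, state: Optional[str] = None) -> None:
        self.name = name
        self.state = state
        self.children: Dict[str, "Area"] = {}

    def add(self, child: "Area") -> "Area":
        self.children[child.name] = child
        return child


def resolve(root: Area, path: List[str]) -> Area:
    """Follow area -> sub-area -> object; raises KeyError for unknown places."""
    node = root
    for name in path:
        node = node.children[name]
    return node


def describe(node: Area) -> List[str]:
    """Render the subtree as simple sentences, parents before children."""
    lines: List[str] = []
    if node.children:
        lines.append(f"{node.name} has {', '.join(node.children)}.")
    for child in node.children.values():
        lines.extend(describe(child))
    return lines


house = Area("House")
kitchen = house.add(Area("Kitchen"))
kitchen.add(Area("Stove", state="off"))
machine = kitchen.add(Area("Coffee machine", state="idle"))

# The LLM's choice, already parsed into a path (an assumption of this sketch).
target = resolve(house, ["Kitchen", "Coffee machine"])
target.state = "brewing"   # effect of "making an espresso"
```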

---

## 7. **Optimizations**

* Cache of **agent summaries** (identity, occupation, current state).
* Plans are generated only at a high level up front; details are filled in at run time.
* Possible **batching** of dialogues and **parallelization** of agents for speed.

---

## 8. **Operational Loop**

The cognitive cycle of an agent is:

**Perceive → Record observations → Retrieve from memory stream → Generate action/plan → Update memory with action and reflection → Act in the world → Repeat**.

---

# Generative Agent Framework
## Essential Architecture

### Basic Concept
A **generative agent** = LLM (brain) + orchestrator + persistent memory + action environment.

The LLM generates decisions but does not act directly: a system is needed to translate text into concrete actions.

---

## Main Components

### 1. LLM (Brain)
- Receives context and decides what to do/say
- Produces only textual instructions, not direct actions

### 2. Memory + Retrieval (RRI)
- **Stream**: Complete archive of observations, plans, reflections
- **Retrieval**: Filters by Relevance, Recency, Importance
- Only the most useful information goes to the LLM as context

### 3. Reflection
- Transforms events into insights (identity, preferences, relationships)
- Activates periodically or at importance thresholds

### 4. Planning
- Hierarchical plans: day → hours → minutes
- Partial replanning when conditions change

### 5. World Interface
- **Adapter**: JSON → natural language for the LLM
- **Actuator**: textual decisions → concrete actions
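
A small sketch of both halves of the interface. The JSON field names (`subject`, `action`, `location`) and the `"<tool>: <argument>"` decision format are assumptions made for this example, as are the names `adapt` and `make_actuator`.

```python
import json
from typing import Callable, Dict


def adapt(event_json: str) -> str:
    """Adapter: turn a JSON world event into a sentence for the prompt."""
    event = json.loads(event_json)
    return f"{event['subject']} is {event['action']} in the {event['location']}."


def make_actuator(tools: Dict[str, Callable[[str], str]]) -> Callable[[str], str]:
    """Actuator: map a textual decision onto a registered tool."""

    def act(decision: str) -> str:
        # Assumed decision format: "<tool>: <argument>"
        name, _, argument = decision.partition(":")
        tool = tools.get(name.strip().lower())
        if tool is None:
            return f"rejected: unknown action '{name.strip()}'"
        return tool(argument.strip())

    return act


sentence = adapt('{"subject": "Eddy", "action": "reading", "location": "garden"}')
act = make_actuator({"say": lambda text: f"said: {text}"})
ok = act("say: Hi Eddy!")
refused = act("fly: to the moon")
```

Rejecting unknown tool names in the actuator keeps the LLM's text from triggering anything that was not explicitly registered.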

### 6. Personalization
- **Persona**: tone, style, preferences
- **Policy**: non-negotiable security rules

---

## Operational Cycle

1. **Sense** the environment (JSON → NL)
2. **Record** observations in memory
3. **Retrieve** relevant context (RRI)
4. **Reflect** if necessary (generate insights)
5. **Plan** or update existing plans
6. **Decide** what to do (LLM + persona + policy)
7. **Act** in the real world (tools/API)
8. **Refresh** memory and start over
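
The cycle above can be compressed into a toy loop to show the control flow. Everything here is a simplification: retrieval is "last `top_k` memories", reflection and planning are folded into a single `llm` call, and `run_cycle`, `stub_llm`, and `policy_allows` are hypothetical names.

```python
from typing import Callable, List


def run_cycle(percepts: List[str], memory: List[str],
              llm: Callable[[str], str],
              policy_allows: Callable[[str], bool],
              top_k: int = 3) -> str:
    # 1-2. Sense and record: percepts are assumed to be already in natural language.
    memory.extend(percepts)
    # 3. Retrieve: placeholder that keeps only the most recent entries.
    context = memory[-top_k:]
    # 4-6. Reflect, plan, decide: collapsed into one model call in this sketch.
    decision = llm("\n".join(context))
    # Policy check happens outside the model, so it cannot be talked out of it.
    if not policy_allows(decision):
        decision = "do nothing"
    # 7-8. Act (omitted here) and refresh memory with what was decided.
    memory.append(f"I decided to: {decision}")
    return decision


def stub_llm(prompt: str) -> str:
    # Stand-in for a real model call.
    return "greet Eddy" if "Eddy" in prompt else "continue plan"


memory = ["John is gardening"]
decision = run_cycle(["Eddy is in the garden"], memory, stub_llm,
                     policy_allows=lambda d: "steal" not in d)
```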

---

## Implementation

### Technologies
- **RAG + vector DB** for memory and knowledge
- **LoRA/PEFT** for persistent personalization
- **Cache** for summaries and quick access

### Security
- Full audit log
- Protection against memory/prompt hacking
- Hard-coded constraints (timetables, capabilities, roles)

---

## Evaluation

- **Believability**: consistency of identities and reactions
- **Memory accuracy**: precision/recall without hallucinations
- **Temporal consistency**: no repeated/implausible actions
- **Robustness**: resistance to manipulation and bias
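
For the memory-accuracy criterion, precision and recall of a retrieval step can be computed against a hand-labeled set of relevant memories. A minimal sketch, with hypothetical memory IDs and a hypothetical function name:

```python
from typing import Iterable, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float]:
    """Precision = hits / retrieved; recall = hits / relevant (0.0 when undefined)."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# The agent retrieved three memories; four were labeled as relevant.
precision, recall = precision_recall(["m1", "m2", "m4"], ["m1", "m2", "m3", "m5"])
```

In this example two of the three retrieved memories are relevant (precision 2/3) and two of the four relevant ones were found (recall 1/2).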

---


### What's missing from Entita?

* Persistent memory (episodic/semantic stream + saving of conversation turns).
* Relevance/Recency/Importance (RRI) retrieval to build context from the memory stream.
* Reflections (abstract insights with threshold triggers) saved in memory.
* Planning (block plans + partial replanning as the context changes).
* ReAct orchestrator (perceive → retrieve → reflect/plan → decide → act/speak → update).
* World grounding (JSON→NL adapter for perception, NL actuator→actions/tools).
* Encoded world norms (timetables, capabilities, object usage rules).
* **Embeddings/vector DB** for robust retrieval (beyond bag-of-words matching).
* **Policy/Safety layer** kept separate from the persona/style layer to prevent improper outputs.
* **Observability/robustness** (audit log, memory hygiene, anti-prompt/memory hacking).