# Narrative Generation Experiments

This notebook tests different versions of `generate_narrative()` on previously generated people loaded from a pickle file.

In [1]:
import dill
import copy

from generation import reset_to_stage
from llm_utils import GenerationContext

In [2]:
# Load existing people
with open('test_examples_v3.pkl', 'rb') as f:
    all_people = dill.load(f)

print(f"Loaded {len(all_people)} people")

Loaded 53 people


## Current Prompts (from generation.py)

Copied here for reference and modification.

In [3]:
NARRATIVE_STYLE_BASE = '''
STYLE & VOICE:
Aim for quiet realism. The main failure mode to avoid: at no level, sentence, paragraph, or structure, should you lay it on thick.
- Write in plain, contemporary English
- Do not use subheadings, the entire narrative should read as a continuous text
- Keep figurative language sparse. Prefer direct, concrete description.
- Do not employ archaic inversions or poetic grand flourishes; avoid proverbs unless spoken by a character
- Avoid a moralizing ending or cliche last words, aim for realism
- Be specific about who did what; avoid vague collectives
- Avoid these exact phrases: "life went on", "work continued", "people remembered", "X was known for being"
- Vary sentence rhythm: Mix short and long sentences. Use fragments occasionally.

Trust the reader, be subtler than you think you can. Avoid cliches aggressively.

NARRATOR AUTHORITY:
You are omniscient. State facts with confidence. Avoid "likely," "probably," 
"perhaps," "may have," or "X or Y" constructions. Pick specific details.

Exceptions:
 - It is fine to report that the character themselves doesn't know something.
 - If something is historically unknowable and not clearly fictitious but could instead be construed as a factual claim, then it is fine to note this

HISTORICAL CONTEXT:
 - Include explicit historical framing throughout where relevant to the story, so that the reader can follow
 - Assume the reader is intelligent, well-educated and able to google things if interested, but not an expert on the specific period

PERSONALITY TRAITS:
Personality must be portrayed realistically through actions, particularly for extremes (below 15th or above 85th percentile). Those with low 
intelligence or conscientiousness should visibly struggle with tasks others manage; those with low agreeableness or honesty-humility should create friction and face social consequences. Do not bowdlerize—negative traits should cause real problems, not be reframed as hidden strengths.

EVENT INTEGRATION:
Every historical event or contextual detail should do at least one of:
1. Reveals character (shows a personality trait or relationship dynamic)
2. Builds causally toward the death (creates conditions that lead to or contextualize it)
3. Shows social/historical structure in a way that matters to this life
If an event does none of these things, remove it entirely. Don't include events as context unless they do clear narrative work. When life events include multiple options, choose one specific version and describe it concretely.
'''

NARRATIVE_PROMPT_ERA = {
    'Holocene': """Write a narrative biography for this historical person that brings them to life as a real individual.
- Include the demographic details in the story naturally
- Avoid anachronisms
""",

    'Paleolithic': """Write a narrative biography for this Paleolithic hunter-gatherer.

SETTING THE SCENE:
- Open with the environment and climate (use the "Climate then" vs "Climate now" contrast to orient readers)

WRITING THE PERSON:
- Use the provided name; if the naming category is "unrecoverable", you may briefly acknowledge the name is a placeholder (e.g., "We'll call her X"), but don't belabor this point
- Show personality through actions, choices, habits, and relationships - NOT by stating trait levels
- Avoid mechanical phrasing like "With moderate openness..." or "His low agreeableness suggests..."

WHAT WE CAN'T KNOW:
- Specific beliefs, rituals, or mythology
- Language
- Specific customs or taboos
"""
}

AGE_PROMPTS = {
    "infant": """
LENGTH & FOCUS (under age 3):
- Keep brief: 150-300 words
- The infant cannot express personality - focus on parents, household, circumstances
- Can be told entirely from the parents' perspective
""",
    "child": """
LENGTH & FOCUS (age 3-10):
- Keep relatively brief: 200-400 words
- Personality can show in limited, age-appropriate ways (a habit, a preference, how they played)
- Focus on a few vivid moments rather than an arc
""",
    "adolescent": """
LENGTH & FOCUS (age 11-18):
- Moderate length: 400-700 words
- Show emerging adult roles and relationships
- Make sure to express the person's personality through behavior and choices
- Traits below 15th percentile or above 85th percentile should be particularly visible

""",
    "adult": """
LENGTH & FOCUS (age 19+):
- Full length: 600-1000 words
- Include community relationships, social networks, and changes in their family life
- Structure the piece as a realistic mosaic of everyday episodes
- Make sure to express the person's personality through behavior and choices
- Traits below 15th percentile or above 85th percentile should be particularly visible
"""
}

ALIVE_PROMPT = """
ENDING (person still living):
- End in an ordinary moment, not on a cliffhanger or dramatic note
- The narrative must end no later than late 2025 (the current year) - do not project events into the future
- End with a present-tense snapshot of their current life as of 2025
"""

DEAD_PROMPT = """
ENDING (person has died):
- Include the death, but do not dwell on aftermath
- End concretely, not sentimentally
"""

## generate_narrative function

In [7]:
def generate_narrative(person, ctx, show_prompt=False):
    """
    Generate biographical narrative.
    
    Modifies person in place.
    """
    age_cat = person.age_category()
    
    # Build the prompt
    full_prompt = NARRATIVE_PROMPT_ERA[person.era] + NARRATIVE_STYLE_BASE + AGE_PROMPTS[age_cat]
    full_prompt += ALIVE_PROMPT if person.is_alive() else DEAD_PROMPT
    if show_prompt:
        print(full_prompt)

    # Structured incidents and historical context
    if hasattr(person, 'structured_incidents') and person.structured_incidents:
        incidents_str = "\n".join(f"- {e.get('event', '')} ({e.get('timing', 'unknown')})"
                                  for e in person.structured_incidents)
        full_prompt += "\n\nPersonal incidents to incorporate:\n" + incidents_str

    if hasattr(person, 'historical_context') and person.historical_context:
        context_str = "\n".join(f"- {e.get('event', '')} ({e.get('timing', 'unknown')})"
                                for e in person.historical_context)
        full_prompt += "\n\nHistorical context to incorporate:\n" + context_str

    person.messages.append({"role": "user", "content": full_prompt})

    ctx.log(f"Generating narrative ({age_cat})...")

    person.narrative = ctx.call(person.messages)
    person.messages.append({"role": "assistant", "content": person.narrative})

    ctx.log(f"  Generated {len(person.narrative.split())} words")

In [10]:
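def generate_narrative_fixed_prompt(person, ctx, prompt):
    """
    Generate biographical narrative from a caller-supplied prompt.

    Unlike generate_narrative(), the era, style, age, and ending prompts
    are not assembled here; `prompt` is used as the base text.
    Modifies person in place.
    """
    # Needed only for the log message below
    age_cat = person.age_category()

    # Build the prompt
    full_prompt = prompt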

    # Structured incidents and historical context
    if hasattr(person, 'structured_incidents') and person.structured_incidents:
        incidents_str = "\n".join(f"- {e.get('event', '')} ({e.get('timing', 'unknown')})"
                                  for e in person.structured_incidents)
        full_prompt += "\n\nPersonal incidents to incorporate:\n" + incidents_str

    if hasattr(person, 'historical_context') and person.historical_context:
        context_str = "\n".join(f"- {e.get('event', '')} ({e.get('timing', 'unknown')})"
                                for e in person.historical_context)
        full_prompt += "\n\nHistorical context to incorporate:\n" + context_str

    person.messages.append({"role": "user", "content": full_prompt})

    ctx.log(f"Generating narrative ({age_cat})...")

    person.narrative = ctx.call(person.messages)
    person.messages.append({"role": "assistant", "content": person.narrative})

    ctx.log(f"  Generated {len(person.narrative.split())} words")

## Test on a single person

In [35]:
# Pick a test person
test_idx = 3
original = all_people[test_idx]

print(f"{original.name} | {original.era} | {original.birth_year_str}")
print(f"Age at death: {original.age_at_death} | {original.location.country if original.location else original.region}")
print(f"\nOriginal narrative:\n")
print(original.narrative)

Amina | Holocene | 1064 AD
Age at death: 0 | Afghanistan

Original narrative:

Amina was born on May 27, 1064, in a mudbrick compound above the valley floor near Mahmud Raqi. After sunset the air dropped cold and the slopes around the houses showed only scattered scrub.

Her mother had entered the household through remarriage. Older kin controlled the stores and watched how much flour and oil left the jars. There were stepchildren in the rooms off the courtyard, and the new wife was expected to work first and speak last.

Amina’s father left at first light for other people’s building work. He mixed clay with chopped straw, carried bricks, patched walls, and cleared silt from channels when the water stopped running. Some days he came home with bread or a measure of grain, other days with nothing.

Her mother rose for the dawn prayer, then milked the goats and strained the milk into a skin to sour. She hauled water, fed the birds, and kept Amina wrapped against her side. On Fridays she s

## Experimental version

Modify the prompts above or create a new `generate_narrative_v2` here.

In [None]:
BASE_PROMPT = '''Write a narrative biography for this historical person.

TASK:
- Use the provided name
- Weave demographic details in naturally
- Avoid anachronisms
- Give names to people who recur in the narrative or have ongoing relationships (family, spouse, close associates). Minor figures who appear once can remain unnamed. Use names appropriate to the time, place, and culture.
- You do not need to include every piece of demographic information. Select the details that matter for this particular life and let the rest go.

VOICE:
- Plain contemporary English, no subheadings
- Omniscient narrator: state facts confidently, no hedging ("likely", "perhaps", "may have")
- Vary sentence rhythm. Mix lengths. Fragments sometimes.

PROSE STYLE:
- Write actively and directly. State facts plainly and concretely.
- Avoid passive, abstract, or distanced descriptions.
- Include specific details so the reader can see what you mean.
- No figurative language: no metaphors, similes, or personification.
- No archaic inversions, poetic flourishes, or proverbs.
- State what happened. Do not comment on it, interpret it, or frame it poetically.
- Do not write lines that exist to sound wise or poignant. If a sentence is reaching for literary effect, cut it or replace it with a plain one.
- Describe what was there, not what wasn't. Avoid defining people or situations by negatives. If a detail only matters as an absence, cut it.
- Do not introduce or frame events before describing them. Start with the event itself.

HISTORICAL INTEGRATION:
- Include historical framing throughout so readers can follow
- Assume an intelligent reader who can look things up but isn't a specialist
- Early in the narrative, briefly orient the reader to the political and cultural situation: what polity or power structure governed this area, what ethnic/linguistic group the person belonged to, and how that world related to larger historical forces. Keep it short and concrete—a sentence or two, not a paragraph of background.

PERSONALITY:
- Show traits through action, not summary
- Do not name personality traits. Show the behavior and let the reader infer the trait.
- Extremes (below 15th or above 85th percentile) should be visible
- Don't soften negative traits—low agreeableness causes friction, low conscientiousness causes real failures, low intelligence shows in limited understanding and poor decision-making

AVOID THESE PHRASES:
"life went on", "work continued", "people remembered", "was known for", "in those days", "as was common", "like so many", "he suffered", "it was not X, it was Y"
'''

# Age determines length and focus
AGE_PROMPTS_V2 = {
    "infant": """
LENGTH: 150-300 words

FOCUS:
- The infant cannot express personality—focus on parents, household, circumstances
- Can be told entirely from the parents' perspective
- A few vivid details about the household and the infant's brief life
""",

    "child": """
LENGTH: 200-400 words

FOCUS:
- Personality can show in limited, age-appropriate ways (a habit, a preference, how they played)
- Focus on a few vivid moments rather than a full arc
- Show the household and family context
""",

    "adolescent": """
LENGTH: 400-700 words

FOCUS:
- Show emerging adult roles and relationships
- Personality should be visible through behavior and choices
- Include family dynamics and any work or responsibilities
""",

    "adult": """
LENGTH: 600-1000 words

FOCUS:
- Mosaic of everyday episodes
- Include relationships and family changes
- Show work, community, and how they navigated their world
"""
}

# Alive/dead determines ending only
ALIVE_PROMPT_V2 = """
ENDING:
- End in an ordinary moment, not on a cliffhanger or dramatic note
- The narrative must end no later than late 2025—do not project events into the future
- End with a present-tense snapshot of their current life
"""

DEAD_PROMPT_V2 = """
ENDING:
- Include the death concretely
- Do not dwell on aftermath or sentimentalize
- Avoid: "breathing slowed", "fever rose/burned", "eyes closed", "slipped away", "grew weaker", "stopped breathing"
"""


def generate_narrative_v2(person, ctx):
    """
    Generate biographical narrative using the new prompt structure.
    
    Age category determines length and focus.
    Alive/dead determines ending instructions.

    Modifies person in place.
    """
    age_cat = person.age_category()

    # Build the prompt: base + age-specific + ending
    full_prompt = BASE_PROMPT + AGE_PROMPTS_V2[age_cat]
    full_prompt += ALIVE_PROMPT_V2 if person.is_alive() else DEAD_PROMPT_V2

    # Structured incidents and historical context
    if hasattr(person, 'structured_incidents') and person.structured_incidents:
        incidents_str = "\n".join(f"- {e.get('event', '')} ({e.get('timing', 'unknown')})"
                                  for e in person.structured_incidents)
        full_prompt += "\n\nPersonal incidents to incorporate:\n" + incidents_str

    if hasattr(person, 'historical_context') and person.historical_context:
        context_str = "\n".join(f"- {e.get('event', '')} ({e.get('timing', 'unknown')})"
                                for e in person.historical_context)
        full_prompt += "\n\nHistorical context to incorporate:\n" + context_str

    person.messages.append({"role": "user", "content": full_prompt})

    ctx.log(f"Generating narrative ({age_cat})...")

    person.narrative = ctx.call(person.messages)
    person.messages.append({"role": "assistant", "content": person.narrative})

    ctx.log(f"  Generated {len(person.narrative.split())} words")

In [36]:
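# Regenerate with the fixed experimental prompt
# (NEW_PROMPT_V2 is assumed to already exist in the kernel; no cell shown here defines it)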
test_person = copy.deepcopy(original)
reset_to_stage(test_person, 'narrative')

ctx = GenerationContext(model='gpt-5.2', quiet=False, show_cost=True)
generate_narrative_fixed_prompt(test_person, ctx, NEW_PROMPT_V2)
ctx.finish()

print(test_person.narrative)

Generating narrative (infant)...
  Generated 1242 words

=== Cost Summary: gpt-5.2 ===
Requests: 1
Input tokens: 5,265
Output tokens: 1,584
Total cost: $0.0224
Avg per request: 5265 in, 1584 out
Amina was born on May 27, 1064, in a village near Mahmūd Rāqī in the hills of Kāpīsā, north of Kabul. Ghaznavid officials collected revenue through local intermediaries and kept roads open for soldiers and caravans. Her family spoke Dari Persian and followed Sunni Islam with the village mosque and its calendar of prayers and feasts.

Her mother, Gulbadan, delivered her on a thin mat in a room with packed-earth walls. A clay lamp smoked in the corner. Gulbadan’s hands shook from the work of labor and from the cold that still held the house at night in the high valley. Amina cried hard and long. Gulbadan’s older sister, Rabia, cut the cord with a knife used for bread. Rabia washed the baby with warmed water, wiped her with a scrap of cloth, and wrapped her tight.

The father of the household was 
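
The run above logs 1242 words for an infant, while the infant prompts ask for 150-300 words. A small check like the following could flag such overruns automatically. This is a sketch only: `length_band` and `check_length` are hypothetical helpers, and the regex assumes the range is written as `N-M words`, as it is in the age prompts in this notebook.

```python
import re

def length_band(age_prompt):
    """Extract the (low, high) word range from an age prompt, or None if absent."""
    match = re.search(r"(\d+)\s*-\s*(\d+)\s+words", age_prompt)
    return (int(match.group(1)), int(match.group(2))) if match else None

def check_length(narrative, age_prompt):
    """Return (word_count, band, within_band) for a generated narrative."""
    count = len(narrative.split())  # same word count as the ctx.log() message
    band = length_band(age_prompt)
    within = band is not None and band[0] <= count <= band[1]
    return count, band, within

# Excerpt of the infant prompt; the narrative is a dummy string of 1242 words
infant_prompt = """
LENGTH & FOCUS (under age 3):
- Keep brief: 150-300 words
"""
count, band, ok = check_length("word " * 1242, infant_prompt)
print(count, band, ok)
```

In the notebook this would be called as `check_length(test_person.narrative, AGE_PROMPTS[test_person.age_category()])`, assuming the age prompt was actually part of the prompt that was sent.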

In [33]:
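# Regenerate with the fixed experimental prompt
# (NEW_PROMPT_V2 is assumed to already exist in the kernel; no cell shown here defines it)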
test_person = copy.deepcopy(original)
reset_to_stage(test_person, 'narrative')

ctx = GenerationContext(model='gpt-5.2', quiet=False, show_cost=True)
generate_narrative_fixed_prompt(test_person, ctx, NEW_PROMPT_V2)
ctx.finish()

print(test_person.narrative)

Generating narrative (adult)...
  Generated 1486 words

=== Cost Summary: gpt-5.2 ===
Requests: 1
Input tokens: 13,031
Output tokens: 1,860
Total cost: $0.0349
Avg per request: 13031 in, 1860 out
Shashikala was born in the hot season of 558 in a small settlement in the wooded uplands west of the coastal plains, where farmers cleared patches and kept cattle on the edges of sal and teak forest. The people around her spoke an Eastern Indo-Aryan village speech and marked their days with household offerings to family deities, alongside fear of the spirits that lived in groves and streams. A local chief’s men came when they needed grain or labor, and the name of the distant king changed without changing the work.

Her father, Devadatta, owned fields that took a full day to walk across and cattle enough to lend a bull for plowing when a neighbor’s animal went lame. Shashikala grew up in a nuclear household, not in a long row of brothers and uncles around one hearth. Her mother, Nandini, ran t

In [30]:
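# Regenerate with the fixed experimental prompt
# (NEW_PROMPT_V2 is assumed to already exist in the kernel; no cell shown here defines it)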
test_person = copy.deepcopy(original)
reset_to_stage(test_person, 'narrative')

ctx = GenerationContext(model='gpt-5.2', quiet=False, show_cost=True)
generate_narrative_fixed_prompt(test_person, ctx, NEW_PROMPT_V2)
ctx.finish()

print(test_person.narrative)

Generating narrative (adult)...
  Generated 1489 words

=== Cost Summary: gpt-5.2 ===
Requests: 1
Input tokens: 11,064
  Cached: 9,856 (89.1%)
Output tokens: 1,866
Total cost: $0.0214
Avg per request: 11064 in, 1866 out
Ptolemaios was born in a small village in the uplands above the roads that ran between the Black Sea coast and the interior. The kings who claimed Anatolia changed in distant capitals, but in his valley the rule came as tax collectors, written orders, and men moving in armed groups. His family spoke the old village tongue at home and Greek outside it, at the shrine, and when dealing with officials.

His father, Sokrates, left before dawn for weeks at a time. He hauled timber for an estate steward in older days, then spent seasons burning charcoal in the forested slopes and driving a mule train down toward larger settlements. When Sokrates came back he brought salt, oil in small jars, and sometimes a few bronze coins that he kept in a pouch tied beneath his belt. His mot

In [26]:
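# Variant: clear the message history and pass the person description inline
# (disabled by wrapping the code in a string literal)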
'''test_person = copy.deepcopy(original)
reset_to_stage(test_person, 'narrative')
test_person.messages = []

ctx = GenerationContext(model='gpt-5.2', quiet=False, show_cost=True)
generate_narrative_fixed_prompt(test_person, ctx, NEW_PROMPT_V2 + '\n\n'+test_person.to_prompt_string())
ctx.finish()

print(test_person.narrative)'''

"test_person = copy.deepcopy(original)\nreset_to_stage(test_person, 'narrative')\ntest_person.messages = []\n\nctx = GenerationContext(model='gpt-5.2', quiet=False, show_cost=True)\ngenerate_narrative_fixed_prompt(test_person, ctx, NEW_PROMPT_V2 + '\n\n'+test_person.to_prompt_string())\nctx.finish()\n\nprint(test_person.narrative)"