# Unified LLM Generation Pipeline

This notebook is an interactive front end to the unified generation pipeline.
All logic lives in `generation.py`; the notebook is only for interactive testing and batch runs.

The pipeline handles both Paleolithic and Holocene people automatically based on the sampled birth year.
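
As a rough illustration of dispatching on the sampled birth year, the sketch below classifies a signed year (negative = BCE, matching the `birth_year` values shown later in this notebook). The cutoff constant and the function name are hypothetical and are not taken from `generation.py`; the value assumes the conventional Holocene start of about 9700 BCE, and the real pipeline may draw the line elsewhere.

```python
# Assumed cutoff: the Holocene is conventionally dated from about 9700 BCE.
# This constant is hypothetical; generation.py may use a different boundary.
HOLOCENE_START_YEAR = -9700


def era_for_birth_year(birth_year: int) -> str:
    """Return 'holocene' or 'paleolithic' for a signed year (negative = BCE)."""
    return "holocene" if birth_year >= HOLOCENE_START_YEAR else "paleolithic"


print(era_for_birth_year(-307))    # holocene
print(era_for_birth_year(-25000))  # paleolithic
```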

In [1]:
import copy

import dill
from tqdm import tqdm

from generation import (
    generate_person,
    generate_batch,
    generate_batch_parallel,
    # Individual steps if needed
    generate_geography,
    generate_demographics,
    generate_structured_incidents,
    generate_historical_context,
    generate_name,
    generate_narrative_plan,
    generate_narrative,
    quality_check,
    reset_to_stage,
)
from llm_utils import GenerationContext, extract_json
from person import sample_year, sample_person, Person

# Batch Generation

In [2]:
# Disabled to avoid re-running a paid batch.
# cost prio, $5.03 (prev $1.24)
# test_people_2 = generate_batch_parallel(n=20, model="gpt-5.2", workers=20)

In [3]:
# Disabled: depends on test_people_2 from the cell above and on `olds`,
# which is not defined in this notebook.
# with open('test_examples_2.pkl', 'wb') as f:
#     dill.dump(test_people_2 + olds, f)
# %run export.py test_examples_2.pkl

In [2]:
with open('test_examples_2.pkl', 'rb') as f:
    examples = dill.load(f)

In [3]:
test = copy.deepcopy(examples[0])
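
The deep copy above matters because the stage functions are called without reassigning their argument, so they appear to modify the person in place. A minimal sketch of the difference between a shallow and a deep copy, using a hypothetical `FakePerson` stand-in rather than the real `Person` class:

```python
import copy
from dataclasses import dataclass, field


@dataclass
class FakePerson:
    """Hypothetical stand-in for Person; only the fields needed for the demo."""
    name: str
    narrative_plan: dict = field(default_factory=dict)


original = FakePerson("Rudrasena", {"siblings": ["Gopala"]})

shallow = copy.copy(original)    # shares the nested dict with original
deep = copy.deepcopy(original)   # gets its own nested dict

# Mutating the deep copy leaves the original untouched.
deep.narrative_plan["siblings"].clear()
print(original.narrative_plan)   # {'siblings': ['Gopala']}

# Mutating the shallow copy changes the original as well.
shallow.narrative_plan["siblings"].clear()
print(original.narrative_plan)   # {'siblings': []}
```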

In [4]:
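# Rewind the copy to the narrative-plan stage, then regenerate the plan
# and the narrative with cost reporting enabled.
reset_to_stage(test, 'narrative_plan')
ctx = GenerationContext(model="gpt-5.2", quiet=False, show_cost=True)
generate_narrative_plan(test, ctx)
generate_narrative(test, ctx)
ctx.finish()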

Generating narrative plan (adult)...
  Plan: 6 siblings, 0 children, 4 life phases
Generating narrative (adult)...
  Generated 1574 words

=== Cost Summary: gpt-5.2 ===
Requests: 2
Input tokens: 27,609
  Cached: 13,952 (50.5%)
Output tokens: 4,217
Total cost: $0.0610
Avg per request: 13804 in, 2108 out
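
The derived figures in the summary above can be checked by hand. The sketch below parses the printed text with a regular expression and recomputes the cached share and the per-request averages. The helper `parse_int` is hypothetical and not part of `llm_utils`, and floor division for the averages is an assumption that happens to match the printed values.

```python
import re

SUMMARY = """\
=== Cost Summary: gpt-5.2 ===
Requests: 2
Input tokens: 27,609
  Cached: 13,952 (50.5%)
Output tokens: 4,217
Total cost: $0.0610
"""


def parse_int(label: str, text: str) -> int:
    """Find 'label: 1,234' in text and return 1234; return 0 if the line is absent."""
    match = re.search(rf"{re.escape(label)}:\s*([\d,]+)", text)
    return int(match.group(1).replace(",", "")) if match else 0


requests = parse_int("Requests", SUMMARY)
input_tokens = parse_int("Input tokens", SUMMARY)
cached = parse_int("Cached", SUMMARY)  # 0 when the summary has no Cached line
output_tokens = parse_int("Output tokens", SUMMARY)

cached_share = 100 * cached / input_tokens
print(f"Cached share: {cached_share:.1f}%")
print(f"Avg per request: {input_tokens // requests} in, {output_tokens // requests} out")
```

Returning 0 for a missing label covers summaries that print no `Cached:` line.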


In [5]:
test.narrative_plan

{'siblings': [{'name': 'Gopala',
   'sex': 'M',
   'birth_year': -307,
   'death_year': -248,
   'death_age': 59,
   'narrative_role': 'Eldest brother; manages most household decisions after father ages; pushes for marriage alliances and discipline.'},
  {'name': 'Somadatta',
   'sex': 'M',
   'birth_year': -304,
   'death_year': -304,
   'death_age': 0,
   'narrative_role': 'Infant death remembered in household rituals/ancestor offerings; mother’s caution around later births.'},
  {'name': 'Dharmapala',
   'sex': 'M',
   'birth_year': -302,
   'death_year': -238,
   'death_age': 64,
   'narrative_role': 'Second surviving older brother; practical, field-focused; involved in disputes and later the land-rights pawn during the debt years.'},
  {'name': 'Haridatta',
   'sex': 'M',
   'birth_year': -298,
   'death_year': -298,
   'death_age': 0,
   'narrative_role': 'Infant death; used to underline high childhood mortality and household appeasement rites.'},
  {'name': 'Rudrasena',
   'sex'

In [7]:
print(test.narrative)

Rudrasena was born in a farming hamlet on the southern edge of the Middle Ganga country, where the flat fields broke into low forested ridges. Mauryan authority reached the area through tax collectors, measure-keepers, and village headmen who spoke for the state. At home, his family spoke an Indo-Aryan village dialect and kept the routines of household worship: a small clay lamp at dusk, water poured on the threshold, and food set aside for ancestors before anyone ate.

His father, Rudradatta, worked as a substantial cultivator. He owned plough cattle and kept a storehouse of grain under a thatched roof, with the bins plastered in mud and cow dung. His mother, Bhadra, ran the house. She rose before dawn, ground grain on a stone, and boiled lentils in an earthen pot. She kept track of seed grain and salt, and she knew whose hands had been in each jar. In the yard she fed a few chickens and watched over the calves. When the rains came and mosquitoes thickened, she burned leaves at the do

# Narrative Prompt Experimentation (V2)

In [4]:
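# Regenerate only the narrative for the first 10 people, keeping earlier stages.
# A fresh context per person prints one cost summary per narrative.
for person in examples[:10]:
    reset_to_stage(person, 'narrative')

    ctx = GenerationContext(model="gpt-5.2", quiet=False, show_cost=True)
    generate_narrative(person, ctx)
    ctx.finish()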

Generating narrative (adult)...
  Generated 1564 words

=== Cost Summary: gpt-5.2 ===
Requests: 1
Input tokens: 15,659
Output tokens: 1,972
Total cost: $0.0393
Avg per request: 15659 in, 1972 out
Generating narrative (adult)...
  Generated 1464 words

=== Cost Summary: gpt-5.2 ===
Requests: 1
Input tokens: 17,464
  Cached: 14,976 (85.8%)
Output tokens: 1,855
Total cost: $0.0235
Avg per request: 17464 in, 1855 out
Generating narrative (adult)...
  Generated 1583 words

=== Cost Summary: gpt-5.2 ===
Requests: 1
Input tokens: 16,832
  Cached: 14,976 (89.0%)
Output tokens: 2,032
Total cost: $0.0245
Avg per request: 16832 in, 2032 out
Generating narrative (infant)...
  Generated 266 words

=== Cost Summary: gpt-5.2 ===
Requests: 1
Input tokens: 6,899
Output tokens: 329
Total cost: $0.0119
Avg per request: 6899 in, 329 out
Generating narrative (adult)...
  Generated 1473 words

=== Cost Summary: gpt-5.2 ===
Requests: 1
Input tokens: 17,912
  Cached: 16,000 (89.3%)
Output tokens: 1,842
Total 

In [5]:
with open('test_examples_3.pkl', 'wb') as f:
    dill.dump(examples, f)
%run export.py test_examples_3.pkl

Loading people from test_examples_3.pkl...
Found 37 people
Removing existing markdown files from ../_lives...
  Removed 37 files
  Exported 10 people...
  Exported 20 people...
  Exported 30 people...

Successfully exported 37 people to ../_lives
