# Synthetic Journal Generation

This notebook sets up an experimentation cycle for generating synthetic journal entries using an LLM (defined below).
It uses a configuration file to drive persona and scenario diversity.

In [1]:
import asyncio
import json
import os
import random
import re
import sys
import yaml
import polars as pl

from dataclasses import dataclass
from datetime import datetime, timedelta
from pathlib import Path
from dotenv import load_dotenv
from openai import AsyncOpenAI
from pydantic import BaseModel, Field
from typing import Literal

# Add project root to path for prompts module
PROJECT_ROOT = (
    Path(__file__).parent.parent if "__file__" in dir() else Path.cwd().parent
)
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

# Load environment variables
load_dotenv()

# Check for API Key
if not os.getenv("OPENAI_API_KEY"):
    print("WARNING: OPENAI_API_KEY not found in environment variables.")

In [2]:
# Configuration Loading
CONFIG_PATH = Path("config/synthetic_data.yaml")
if not CONFIG_PATH.exists():
    CONFIG_PATH = Path("../config/synthetic_data.yaml")

SCHWARTZ_VALUES_PATH = Path("config/schwartz_values.yaml")
if not SCHWARTZ_VALUES_PATH.exists():
    SCHWARTZ_VALUES_PATH = Path("../config/schwartz_values.yaml")


def load_config(path: str | Path) -> dict:
    with open(path, "r", encoding="utf-8") as f:
        return yaml.safe_load(f)


config = load_config(CONFIG_PATH)
schwartz_config = load_config(SCHWARTZ_VALUES_PATH)

print("Configs loaded successfully.")
print(f"Available Persona Attributes: {list(config['personas'].keys())}")
print(f"Schwartz Values with elaborations: {list(schwartz_config['values'].keys())}")

Configs loaded successfully.
Available Persona Attributes: ['age_ranges', 'cultures', 'professions', 'schwartz_values']
Schwartz Values with elaborations: ['Self-Direction', 'Stimulation', 'Hedonism', 'Achievement', 'Power', 'Security', 'Conformity', 'Tradition', 'Benevolence', 'Universalism']
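
The rest of the notebook assumes a particular layout for `config/synthetic_data.yaml`. The sketch below is a hypothetical stand-in, written as a Python dict so it runs without the file. The key names come from the printed output above and from the lookups later in the notebook; the list contents and the `missing_config_keys` helper are illustrative assumptions, not part of the project.

```python
# Hypothetical stand-in for config/synthetic_data.yaml after yaml.safe_load().
# Key names mirror the lookups in this notebook; list contents are illustrative.
EXAMPLE_CONFIG = {
    "personas": {
        "age_ranges": ["18-24", "25-34"],
        "cultures": ["South Asian"],
        "professions": ["Entrepreneur"],
        "schwartz_values": ["Tradition", "Universalism"],
    },
    "journal_entries": {
        "tones": ["Defensive", "Self-reflective"],
        "verbosity": ["Short (1-3 sentences)", "Medium (1-2 paragraphs)"],
        "reflection_mode": ["Unsettled", "Grounded", "Neutral"],
    },
}

REQUIRED_KEYS = {
    "personas": ["age_ranges", "cultures", "professions", "schwartz_values"],
    "journal_entries": ["tones", "verbosity", "reflection_mode"],
}


def missing_config_keys(cfg: dict) -> list[str]:
    """Return dotted paths of required keys that are absent or empty."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        block = cfg.get(section) or {}
        for key in keys:
            if not block.get(key):
                missing.append(f"{section}.{key}")
    return missing


print(missing_config_keys(EXAMPLE_CONFIG))
```

Running a check like this right after `load_config` would surface a missing list before `random.choice` fails on it deep inside a pipeline.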


## Data Models
Pydantic models and hand-written strict JSON schemas define the structured outputs, so every response parses into the same shape.

In [3]:
class Persona(BaseModel):
    name: str = Field(description="Full name of the persona")
    age: str
    profession: str
    culture: str
    core_values: list[str] = Field(
        description="Schwartz values assigned to the persona (one or two, sampled in create_random_persona)"
    )
    bio: str = Field(
        description="A short paragraph describing their background, stressors, and goals"
    )


class JournalEntry(BaseModel):
    """LLM-generated journal entry. Metadata (tone, verbosity, etc.) tracked separately."""

    date: str
    content: str


# The Responses API `json_schema` strict mode requires `additionalProperties: false`
# on objects. Pydantic's generated schema may omit that, so we provide an explicit
# strict schema for reliability.
PERSONA_SCHEMA = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "string"},
        "profession": {"type": "string"},
        "culture": {"type": "string"},
        "core_values": {"type": "array", "items": {"type": "string"}},
        "bio": {"type": "string"},
    },
    "required": ["name", "age", "profession", "culture", "core_values", "bio"],
}

JOURNAL_ENTRY_SCHEMA = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "date": {"type": "string"},
        "content": {"type": "string"},
    },
    "required": ["date", "content"],
}

PERSONA_RESPONSE_FORMAT = {
    "type": "json_schema",
    "name": "Persona",
    "schema": PERSONA_SCHEMA,
    "strict": True,
}

JOURNAL_ENTRY_RESPONSE_FORMAT = {
    "type": "json_schema",
    "name": "JournalEntry",
    "schema": JOURNAL_ENTRY_SCHEMA,
    "strict": True,
}
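
The hand-written schemas above set `additionalProperties` to `False` and list every property under `required`. As a minimal sketch, assuming only flat objects and arrays need handling, a hypothetical helper (`make_strict`, not part of this notebook) could apply the same two constraints to any schema dict instead of maintaining them by hand. It does not handle `$defs`, `anyOf`, or other composition keywords.

```python
import copy


def make_strict(schema: dict) -> dict:
    """Return a copy of a JSON schema with strict-mode constraints applied.

    For every object node, sets additionalProperties to False and lists all
    properties as required. Recurses into properties and array items only.
    """
    strict = copy.deepcopy(schema)

    def _walk(node: dict) -> None:
        if node.get("type") == "object":
            props = node.get("properties", {})
            node["additionalProperties"] = False
            node["required"] = list(props)
            for child in props.values():
                _walk(child)
        elif node.get("type") == "array" and isinstance(node.get("items"), dict):
            _walk(node["items"])

    _walk(strict)
    return strict


# A loose schema without the strict-mode keys.
loose = {
    "type": "object",
    "properties": {
        "date": {"type": "string"},
        "content": {"type": "string"},
    },
}
print(make_strict(loose))
```

The input is deep-copied, so the original schema dict is left untouched.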

In [4]:
def build_value_context(values: list[str], schwartz_config: dict) -> str:
    """Build rich context about Schwartz values for persona generation.

    Args:
        values: List of Schwartz value names (e.g., ["Achievement", "Benevolence"])
        schwartz_config: The loaded schwartz_values.yaml config

    Returns:
        Formatted string with value elaborations for prompt injection
    """
    context_parts = []

    for value_name in values:
        if value_name not in schwartz_config["values"]:
            continue

        v = schwartz_config["values"][value_name]

        # Build a focused context block for this value
        context_parts.append(f"""
### {value_name}
**Core Motivation:** {v["core_motivation"].strip()}

**How this manifests in behavior:**
{chr(10).join(f"- {b}" for b in v["behavioral_manifestations"][:5])}

**Life domain expressions:**
- Work: {v["life_domain_expressions"]["work"].strip()}
- Relationships: {v["life_domain_expressions"]["relationships"].strip()}

**Typical stressors for this person:**
{chr(10).join(f"- {s}" for s in v["typical_stressors"][:4])}

**Typical goals:**
{chr(10).join(f"- {g}" for g in v["typical_goals"][:3])}

**Internal conflicts they may experience:**
{v["internal_conflicts"].strip()}

**Narrative guidance:**
{v["persona_narrative_guidance"].strip()}
""")

    return "\n".join(context_parts)


# Test the function
test_context = build_value_context(["Achievement"], schwartz_config)
print("Sample value context for 'Achievement':")
print(test_context[:1500] + "..." if len(test_context) > 1500 else test_context)

Sample value context for 'Achievement':

### Achievement
**Core Motivation:** The fundamental drive to excel, to be competent, and to have that competence recognized. Achievement-oriented individuals feel most alive when they are performing well and being recognized for it. Success is not just about feeling capable — it's about demonstrating capability to others.

**How this manifests in behavior:**
- Sets measurable goals and tracks progress toward them
- Compares self to peers and external benchmarks
- Works hard, sometimes to the point of overwork, to meet standards of excellence
- Seeks feedback, recognition, and credentials that validate competence
- Feels frustrated when effort doesn't translate to recognized results

**Life domain expressions:**
- Work: Career-focused; measures self-worth partly through professional accomplishments. Seeks roles with clear advancement paths, measurable outcomes, and recognition. May be drawn to prestigious organizations, competitive fields, or vi
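
`build_value_context` indexes each value entry directly, so a missing field in `schwartz_values.yaml` raises a `KeyError` mid-run. The sketch below lists the fields the function reads and reports which ones an entry lacks. The field names are taken from the function body above; the `missing_value_fields` helper and the sample entry are hypothetical.

```python
# Fields that build_value_context reads from each entry in schwartz_config["values"].
REQUIRED_VALUE_FIELDS = [
    "core_motivation",
    "behavioral_manifestations",
    "life_domain_expressions",
    "typical_stressors",
    "typical_goals",
    "internal_conflicts",
    "persona_narrative_guidance",
]
REQUIRED_DOMAINS = ["work", "relationships"]


def missing_value_fields(entry: dict) -> list[str]:
    """List the fields whose absence would make build_value_context fail."""
    missing = [f for f in REQUIRED_VALUE_FIELDS if f not in entry]
    domains = entry.get("life_domain_expressions", {})
    missing += [
        f"life_domain_expressions.{d}" for d in REQUIRED_DOMAINS if d not in domains
    ]
    return missing


# Hypothetical, deliberately incomplete entry for illustration.
partial_entry = {
    "core_motivation": "placeholder text",
    "behavioral_manifestations": ["placeholder"],
    "life_domain_expressions": {"work": "placeholder"},
}
print(missing_value_fields(partial_entry))
```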

In [5]:
# Prompt templates are stored in prompts/ folder as YAML files
# See prompts/__init__.py for the loader utility
from prompts import persona_generation_prompt, journal_entry_prompt

## LLM Client Setup

Using `gpt-5-mini`. 

**Note:** GPT-5 models do not support `temperature` or `top_p` parameters. Instead, use the `reasoning` parameter to control how much the model "thinks" before responding.

In [6]:
client = AsyncOpenAI()
MODEL_NAME = "gpt-5-mini-2025-08-07"
# MODEL_NAME = "gpt-5-nano-2025-08-07"

# Type alias for reasoning effort levels
ReasoningEffort = Literal["minimal", "low", "medium", "high"]

# Default reasoning effort - change this to affect all generations
DEFAULT_REASONING_EFFORT: ReasoningEffort = "high"


async def generate_completion(
    prompt: str,
    response_format: dict | None = None,
) -> str | None:
    """Generate a completion using the OpenAI Responses API (async).

    Uses DEFAULT_REASONING_EFFORT to control how much the model "thinks".
    Valid reasoning effort values: "minimal", "low", "medium", "high".
    """
    try:
        kwargs = {
            "model": MODEL_NAME,
            "input": [{"role": "user", "content": prompt}],
            "reasoning": {"effort": DEFAULT_REASONING_EFFORT},
        }

        if response_format:
            kwargs["text"] = {"format": response_format}

        response = await client.responses.create(**kwargs)
        return response.output_text

    except Exception as e:
        print(f"Error generating completion: {e}")
        return None

In [7]:
def _verbosity_targets(verbosity: str) -> tuple[int, int, int]:
    """Returns (min_words, max_words, max_paragraphs) as guidance for the LLM."""
    normalized = verbosity.strip().lower()
    if normalized.startswith("short"):
        return 25, 80, 1
    if normalized.startswith("medium"):
        return 90, 180, 2
    return 160, 260, 3


def _build_banned_pattern(banned_terms: list[str]) -> re.Pattern:
    """Build regex pattern to detect banned Schwartz value terms."""
    escaped = [re.escape(term) for term in banned_terms if term.strip()]
    if not escaped:
        # "$^" can still match an empty string; an empty negative lookahead never matches.
        return re.compile(r"(?!)")
    return re.compile(r"(?i)\b(" + "|".join(escaped) + r")\b")


def generate_date_sequence(
    start_date: str, num_entries: int, min_days: int = 2, max_days: int = 10
) -> list[str]:
    """Generate a sequence of dates with random intervals.

    Args:
        start_date: Starting date in YYYY-MM-DD format
        num_entries: Number of dates to generate
        min_days: Minimum days between entries
        max_days: Maximum days between entries

    Returns:
        List of date strings in YYYY-MM-DD format
    """
    dates = []
    current = datetime.strptime(start_date, "%Y-%m-%d")

    for i in range(num_entries):
        dates.append(current.strftime("%Y-%m-%d"))
        if i < num_entries - 1:
            days_gap = random.randint(min_days, max_days)
            current += timedelta(days=days_gap)

    return dates


# Banned terms include Schwartz value labels AND derivative adjectives
SCHWARTZ_BANNED_TERMS = [
    # Value labels
    "Self-Direction",
    "Stimulation",
    "Hedonism",
    "Achievement",
    "Power",
    "Security",
    "Conformity",
    "Tradition",
    "Benevolence",
    "Universalism",
    # Derivative adjectives and related terms
    "self-directed",
    "autonomous",
    "stimulating",
    "excited",
    "hedonistic",
    "hedonist",
    "pleasure-seeking",
    "achievement-oriented",
    "ambitious",
    "powerful",
    "authoritative",
    "secure",
    "conformist",
    "conforming",
    "traditional",
    "traditionalist",
    "benevolent",
    "kind-hearted",
    "universalistic",
    "altruistic",
    # Meta terms
    "Schwartz",
    "values",
    "core values",
]

BANNED_PATTERN = _build_banned_pattern(SCHWARTZ_BANNED_TERMS)


class JournalEntryResult(BaseModel):
    """Container for journal entry with generation metadata."""

    entry: JournalEntry
    tone: str
    verbosity: str
    reflection_mode: str  # Unsettled/Grounded/Neutral


async def create_random_persona(
    config: dict, schwartz_config: dict, max_attempts: int = 2
) -> tuple[Persona | None, str]:
    """Generate a random persona with Schwartz values shown through life circumstances.

    Args:
        config: Main configuration with personas attributes
        schwartz_config: Schwartz values elaboration config
        max_attempts: Number of retry attempts for validation

    Returns:
        Tuple of (Generated Persona or None, prompt used)
    """
    age = random.choice(config["personas"]["age_ranges"])
    prof = random.choice(config["personas"]["professions"])
    cult = random.choice(config["personas"]["cultures"])
    num_values = random.choice([1, 2])
    vals = random.sample(config["personas"]["schwartz_values"], num_values)

    # Build rich value context from the Schwartz elaborations
    value_context = build_value_context(vals, schwartz_config)

    prompt = persona_generation_prompt.render(
        age=age,
        profession=prof,
        culture=cult,
        values=vals,
        value_context=value_context,
        banned_terms=SCHWARTZ_BANNED_TERMS,
    )

    first_person_pattern = re.compile(r"(?i)\b(i|my|me)\b")
    last_persona: Persona | None = None

    for _ in range(max_attempts):
        raw_json = await generate_completion(
            prompt, response_format=PERSONA_RESPONSE_FORMAT
        )
        if not raw_json:
            continue

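        # Treat malformed JSON or a schema violation as a failed attempt.
        # json.JSONDecodeError and pydantic's ValidationError both subclass ValueError.
        try:
            data = json.loads(raw_json)
            data["core_values"] = vals  # Ensure correct values
            persona = Persona(**data)
        except (ValueError, TypeError):
            continue
        last_persona = persona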

        # Only validate banned terms and first-person usage
        if BANNED_PATTERN.search(persona.bio) or first_person_pattern.search(
            persona.bio
        ):
            continue
        return persona, prompt

    return last_persona, prompt


async def generate_journal_entry(
    persona: Persona,
    config: dict,
    date_str: str,
    previous_entries: list[JournalEntry] | None = None,
    max_attempts: int = 2,
) -> tuple[JournalEntryResult | None, str]:
    """Generate a journal entry for a persona on a given date.

    Args:
        persona: The persona writing the journal
        config: Configuration dict with generation parameters
        date_str: Date for this entry (YYYY-MM-DD format)
        previous_entries: List of previous JournalEntry objects for continuity
        max_attempts: Number of retry attempts for validation

    Returns:
        Tuple of (JournalEntryResult with entry and metadata or None, prompt used)
    """
    tone = random.choice(config["journal_entries"]["tones"])
    verbosity = random.choice(config["journal_entries"]["verbosity"])
    reflection_mode = random.choice(config["journal_entries"]["reflection_mode"])
    min_words, max_words, max_paragraphs = _verbosity_targets(verbosity)

    # Format previous entries for the prompt
    prev_entries_data = None
    if previous_entries:
        prev_entries_data = [
            {"date": e.date, "content": e.content} for e in previous_entries
        ]

    prompt = journal_entry_prompt.render(
        name=persona.name,
        age=persona.age,
        profession=persona.profession,
        culture=persona.culture,
        bio=persona.bio,
        date=date_str,
        tone=tone,
        verbosity=verbosity,
        min_words=min_words,
        max_words=max_words,
        max_paragraphs=max_paragraphs,
        reflection_mode=reflection_mode,
        previous_entries=prev_entries_data,
    )

    last_entry: JournalEntry | None = None

    for _ in range(max_attempts):
        raw_json = await generate_completion(
            prompt, response_format=JOURNAL_ENTRY_RESPONSE_FORMAT
        )
        if not raw_json:
            continue

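        # Treat malformed JSON or a schema violation as a failed attempt.
        # json.JSONDecodeError and pydantic's ValidationError both subclass ValueError.
        try:
            entry = JournalEntry(**json.loads(raw_json))
        except (ValueError, TypeError):
            continue
        last_entry = entry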

        # Only validate banned terms (prevent label leakage)
        if not BANNED_PATTERN.search(entry.content):
            return JournalEntryResult(
                entry=entry,
                tone=tone,
                verbosity=verbosity,
                reflection_mode=reflection_mode,
            ), prompt

    if last_entry:
        return JournalEntryResult(
            entry=last_entry,
            tone=tone,
            verbosity=verbosity,
            reflection_mode=reflection_mode,
        ), prompt
    return None, prompt


@dataclass
class PersonaPipelineResult:
    """Complete results from one persona's generation pipeline."""

    persona_id: int
    persona: Persona | None
    entries: list[JournalEntryResult]
    persona_prompt: str
    entry_prompts: list[str]
    error: str | None = None


async def generate_persona_pipeline(
    persona_id: int,
    config: dict,
    schwartz_config: dict,
    num_entries: int = 3,
    start_date: str = "2023-10-27",
) -> PersonaPipelineResult:
    """Generate one persona and all their journal entries sequentially.

    Captures all prompts and outputs for later display (no printing during execution).

    Args:
        persona_id: Identifier for this persona (1, 2, 3, etc.)
        config: Main configuration dict
        schwartz_config: Schwartz values elaboration config
        num_entries: Number of journal entries to generate
        start_date: Starting date for journal entries (YYYY-MM-DD)

    Returns:
        PersonaPipelineResult with all data for display
    """
    entry_prompts: list[str] = []
    entries: list[JournalEntryResult] = []

    # 1. Generate persona
    persona, persona_prompt = await create_random_persona(config, schwartz_config)

    if not persona:
        return PersonaPipelineResult(
            persona_id=persona_id,
            persona=None,
            entries=[],
            persona_prompt=persona_prompt,
            entry_prompts=[],
            error="Failed to generate persona",
        )

    # 2. Generate journal entries sequentially (each depends on previous)
    dates = generate_date_sequence(start_date, num_entries)
    previous_entries: list[JournalEntry] = []

    for date_str in dates:
        result, prompt = await generate_journal_entry(
            persona, config, date_str, previous_entries=previous_entries
        )
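        # Record the prompt only when an entry was produced, so entry_prompts
        # stays index-aligned with entries (display_persona_results zips them).
        if result:
            entry_prompts.append(prompt)
            entries.append(result)
            previous_entries.append(result.entry)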

    return PersonaPipelineResult(
        persona_id=persona_id,
        persona=persona,
        entries=entries,
        persona_prompt=persona_prompt,
        entry_prompts=entry_prompts,
        error=None,
    )


async def run_parallel_personas(
    num_personas: int,
    config: dict,
    schwartz_config: dict,
    min_entries: int = 3,
    max_entries: int = 10,
    start_date: str = "2023-10-27",
) -> list[PersonaPipelineResult | Exception]:
    """Run multiple persona pipelines in parallel.

    Returns results in order [Persona 1, Persona 2, ...] regardless of completion time.
    Failed pipelines return Exception objects instead of PersonaPipelineResult.

    Args:
        num_personas: Number of personas to generate in parallel
        config: Main configuration dict
        schwartz_config: Schwartz values elaboration config
        min_entries: Minimum journal entries per persona
        max_entries: Maximum journal entries per persona
        start_date: Starting date for journal entries

    Returns:
        List of PersonaPipelineResult or Exception, in persona order
    """
    # Each persona gets a random number of entries for training diversity
    tasks = [
        generate_persona_pipeline(
            i + 1,
            config,
            schwartz_config,
            num_entries=random.randint(min_entries, max_entries),
            start_date=start_date,
        )
        for i in range(num_personas)
    ]

    # return_exceptions=True: failed tasks return Exception instead of raising
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return list(results)


def display_persona_results(result: PersonaPipelineResult | Exception) -> None:
    """Display all prompts and outputs for one persona.

    Args:
        result: PersonaPipelineResult or Exception from a failed pipeline
    """
    if isinstance(result, Exception):
        print(f"\n{'=' * 80}")
        print("PERSONA FAILED WITH EXCEPTION:")
        print(f"{'=' * 80}")
        print(f"{type(result).__name__}: {result}")
        print(f"{'=' * 80}\n")
        return

    print(f"\n{'=' * 80}")
    print(f"PERSONA {result.persona_id}")
    print(f"{'=' * 80}")

    if result.error:
        print(f"\nError: {result.error}")
        print("\n### Persona Generation Prompt:")
        print(f"{'─' * 40}")
        print(result.persona_prompt)
        print(f"{'─' * 40}")
        return

    # Persona details
    p = result.persona
    print(f"\n## Generated Persona: {p.name}")
    print(f"Age: {p.age} | Profession: {p.profession} | Culture: {p.culture}")
    print(f"Values: {', '.join(p.core_values)}")
    print(f"Bio: {p.bio}")

    print("\n### Persona Generation Prompt:")
    print(f"{'─' * 40}")
    print(result.persona_prompt)
    print(f"{'─' * 40}")

    # Journal entries
    for i, (entry_result, prompt) in enumerate(
        zip(result.entries, result.entry_prompts)
    ):
        print(f"\n{'─' * 40}")
        print(f"### Entry {i + 1}: {entry_result.entry.date}")
        print(
            f"Tone: {entry_result.tone} | Verbosity: {entry_result.verbosity} | Mode: {entry_result.reflection_mode}"
        )
        print("\n**Prompt:**")
        print(f"{'─' * 40}")
        print(prompt)
        print(f"{'─' * 40}")
        print("\n**Output:**")
        print(entry_result.entry.content)

    # Summary table for this persona
    if result.entries:
        print(f"\n{'─' * 40}")
        print(f"### Summary Table for {p.name}")
        print(f"{'─' * 40}")

        df = pl.DataFrame(
            {
                "Date": [r.entry.date for r in result.entries],
                "Tone": [r.tone for r in result.entries],
                "Verbosity": [r.verbosity for r in result.entries],
                "Reflection Mode": [r.reflection_mode for r in result.entries],
                "Schwartz Values": [", ".join(p.core_values)] * len(result.entries),
                "Content": [r.entry.content for r in result.entries],
            }
        )

        with pl.Config(fmt_str_lengths=1000, tbl_width_chars=200):
            display(df)
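
The banned-term check above relies on whole-word, case-insensitive matching. The following standalone sketch rebuilds the same kind of pattern on a three-term subset to show what is and is not flagged. The sample sentences are made up for illustration, and `build_banned_pattern` is a local copy for the demo, not the notebook's function.

```python
import re


def build_banned_pattern(banned_terms: list[str]) -> re.Pattern:
    """Case-insensitive, whole-word pattern over the given terms."""
    escaped = [re.escape(t) for t in banned_terms if t.strip()]
    if not escaped:
        return re.compile(r"(?!)")  # empty negative lookahead: never matches
    return re.compile(r"\b(" + "|".join(escaped) + r")\b", re.IGNORECASE)


pattern = build_banned_pattern(["Power", "secure", "values"])

samples = [
    "She wanted more power over the schedule.",  # whole word
    "The empowered team shipped early.",  # substring only
    "He felt insecure about the pitch.",  # substring only
    "Family values came up at dinner.",  # whole word
]
for text in samples:
    print(bool(pattern.search(text)), text)
```

Word boundaries keep "empowered" and "insecure" from tripping the filter, but an everyday use of a listed word is still flagged, which is one reason the generators retry and then fall back to the last attempt.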

## Output Logging System

In [8]:
def get_log_dir() -> Path:
    """Create and return a timestamped log directory."""
    base_dir = Path("logs/synthetic_data")
    if not base_dir.exists():
        base_dir = Path("../logs/synthetic_data")
    base_dir.mkdir(parents=True, exist_ok=True)

    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    log_dir = base_dir / timestamp
    log_dir.mkdir(exist_ok=True)
    return log_dir


def write_config_log(
    log_dir: Path, config: dict, num_personas: int, min_entries: int, max_entries: int
) -> None:
    """Write config.md with run parameters."""
    content = f"""# Run Configuration

**Timestamp**: {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}
**Notebook**: journal_gen.ipynb

## Persona Generation
- Num personas: {num_personas}
- Entries per persona: {min_entries}-{max_entries} (variable)

## Model Settings
- Model: {MODEL_NAME}
- Reasoning effort: {DEFAULT_REASONING_EFFORT}
"""
    (log_dir / "config.md").write_text(content, encoding="utf-8")


def write_persona_log(log_dir: Path, result: PersonaPipelineResult) -> None:
    """Write persona_XXX.md with all entries and summary statistics."""
    if not result.persona:
        return

    p = result.persona
    lines = [
        f"# Persona {result.persona_id:03d}: {p.name}",
        "",
        "## Profile",
        f"- Age: {p.age}",
        f"- Profession: {p.profession}",
        f"- Culture: {p.culture}",
        f"- Core Values: {', '.join(p.core_values)}",
        f"- Bio: {p.bio}",
        "",
        "---",
    ]

    for i, entry_result in enumerate(result.entries, 1):
        lines.extend(
            [
                "",
                f"## Entry {i} - {entry_result.entry.date}",
                "",
                f"**Tone**: {entry_result.tone} | **Verbosity**: {entry_result.verbosity} | **Reflection Mode**: {entry_result.reflection_mode}",
                "",
                entry_result.entry.content,
                "",
                "---",
            ]
        )

    # Summary Statistics
    lines.extend(
        [
            "",
            "## Summary Statistics",
            "",
            "| Metric | Value |",
            "|--------|-------|",
            f"| Total Entries | {len(result.entries)} |",
            f"| Core Values | {', '.join(p.core_values)} |",
        ]
    )

    (log_dir / f"persona_{result.persona_id:03d}.md").write_text(
        "\n".join(lines), encoding="utf-8"
    )


def write_prompts_log(log_dir: Path, results: list[PersonaPipelineResult]) -> None:
    """Write prompts.md with all LLM prompts."""
    lines = ["# Prompts Log", ""]

    for result in results:
        if isinstance(result, Exception) or not result.persona:
            continue

        lines.extend(
            [
                f"## Persona {result.persona_id:03d}: {result.persona.name}",
                "",
                "### Persona Generation Prompt",
                "```",
                result.persona_prompt,
                "```",
                "",
            ]
        )

        for i, prompt in enumerate(result.entry_prompts, 1):
            lines.extend(
                [
                    f"### Entry {i} Prompt",
                    "```",
                    prompt,
                    "```",
                    "",
                ]
            )

        lines.append("---\n")

    (log_dir / "prompts.md").write_text("\n".join(lines), encoding="utf-8")


def save_run_logs(
    results: list[PersonaPipelineResult | Exception],
    config: dict,
    num_personas: int,
    min_entries: int,
    max_entries: int,
) -> Path:
    """Save all logs for a run.

    Returns:
        Path to the log directory
    """
    log_dir = get_log_dir()

    # Filter successful results
    successful = [
        r for r in results if isinstance(r, PersonaPipelineResult) and r.persona
    ]

    write_config_log(log_dir, config, num_personas, min_entries, max_entries)

    for result in successful:
        write_persona_log(log_dir, result)

    write_prompts_log(log_dir, successful)

    print(f"Logs saved to: {log_dir}")
    return log_dir

# Execution Loop

## Parallel Persona Generation

Run multiple personas in parallel. Each persona generates journal entries sequentially (for continuity), but different personas run concurrently.

**Usage:**
- `run_parallel_personas(n, ...)` - Run n personas in parallel
- `generate_persona_pipeline(id, ...)` - Run a single persona (use with `await`)
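
The ordering and failure behavior described above can be seen in isolation with a small sketch. `fake_pipeline` is a hypothetical stand-in that sleeps instead of calling the LLM; the delays are arbitrary.

```python
import asyncio


async def fake_pipeline(persona_id: int, delay: float, fail: bool = False) -> str:
    """Stand-in for generate_persona_pipeline: sleeps instead of calling an LLM."""
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"persona {persona_id} failed")
    return f"persona {persona_id} done"


async def run_all() -> list:
    tasks = [
        fake_pipeline(1, delay=0.03),
        fake_pipeline(2, delay=0.01, fail=True),
        fake_pipeline(3, delay=0.02),
    ]
    # Results come back in submission order, not completion order, and
    # exceptions are returned as values instead of being raised.
    return await asyncio.gather(*tasks, return_exceptions=True)


# In a plain script; inside a notebook cell, use `results = await run_all()`.
results = asyncio.run(run_all())
for r in results:
    print(type(r).__name__, r)
```

Persona 2 finishes first and fails, yet it still occupies the second slot of the result list, which is why the display loop can rely on position.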

In [9]:
# Configuration
NUM_PERSONAS = 3
MIN_ENTRIES = 3
MAX_ENTRIES = 10
START_DATE = "2023-10-27"

print(
    f"Generating {NUM_PERSONAS} personas in parallel, each with {MIN_ENTRIES}-{MAX_ENTRIES} entries..."
)
print(f"Model: {MODEL_NAME} | Reasoning: {DEFAULT_REASONING_EFFORT}")
print(f"Start date: {START_DATE}\n")

# Run all personas in parallel
results = await run_parallel_personas(
    num_personas=NUM_PERSONAS,
    config=config,
    schwartz_config=schwartz_config,
    min_entries=MIN_ENTRIES,
    max_entries=MAX_ENTRIES,
    start_date=START_DATE,
)

# Display results in order (Persona 1, 2, 3, ...)
for result in results:
    display_persona_results(result)

# Save logs
log_dir = save_run_logs(results, config, NUM_PERSONAS, MIN_ENTRIES, MAX_ENTRIES)

# Summary
successful = [r for r in results if isinstance(r, PersonaPipelineResult) and r.persona]
failed = [
    r
    for r in results
    if isinstance(r, Exception) or (isinstance(r, PersonaPipelineResult) and r.error)
]

print(f"\n{'=' * 80}")
print("FINAL SUMMARY")
print(f"{'=' * 80}")
print(f"Successfully generated: {len(successful)}/{NUM_PERSONAS} personas")
if failed:
    print(f"Failed: {len(failed)} persona(s)")

total_entries = sum(len(r.entries) for r in successful)
entry_counts = [len(r.entries) for r in successful]
print(f"Total journal entries: {total_entries}")
if entry_counts:
    print(
        f"Entries per persona: min={min(entry_counts)}, max={max(entry_counts)}, avg={sum(entry_counts) / len(entry_counts):.1f}"
    )
print(f"\nLogs saved to: {log_dir}")

Generating 3 personas in parallel, each with 3-10 entries...
Model: gpt-5-mini-2025-08-07 | Reasoning: high
Start date: 2023-10-27


PERSONA 1

## Generated Persona: Asha Patel
Age: 62 | Profession: Entrepreneur | Culture: South Asian
Values: Tradition, Universalism
Bio: Asha Patel, 62, returned to her hometown after her father's illness and took over the family handloom workshop, keeping alive Patola weaving techniques and the ritual of preparing her mother's halva for Diwali, which she insists apprentices learn to make for festival sales. She runs a social enterprise selling organic-dyed saris, hires displaced weavers at fair wages, partners with a river-cleanup NGO, and channels part of the profits into scholarships for rural girls. Torn between relatives urging her to mass-produce with cheaper dyes and her drive to protect weavers' livelihoods and the river basin, she measures success by the number of apprentices who can reproduce the ancestral weave and by improvements in local wa

Date,Tone,Verbosity,Reflection Mode,Schwartz Values,Content
str,str,str,str,str,str
"""2023-10-27""","""Defensive""","""Medium (1-2 paragraphs)""","""Grounded""","""Tradition, Universalism""","""When the buyer at the fair leaned in and said, 'Switch to chemical dyes, we'll double the order,' I didn't hem and haw. I said no—plain. He wanted me to cut corners. I told him the real prices: organic vats, fair wages, the small river-cleanup fee we add. He closed his notebook. The apprentices were quiet; one of them kept watching my hands. It wasn't a spectacle. No speeches. Later I wrapped a sari slowly and made a little pot of Amma's halva to test a new batch for Diwali sales. The nod from that apprentice when she tied the knot—small, stubborn—felt like the Asha I want to be: blunt enough to refuse, steady enough to keep the old patola weave going. I went home tired but no regret."""
"""2023-11-01""","""Self-reflective""","""Medium (1-2 paragraphs)""","""Unsettled""","""Tradition, Universalism""","""Said yes before I finished my tea. The boutique called, big order for Diwali, wanted 'fast bright' and the relatives were already tallying profit. I told Maya to mix a small batch of synthetic dye in the back and to follow the new recipe; I stood over the vat more because habit than choice. The colour bloomed too quickly, no slow coaxing, no scent of turmeric or indigo—just that sharp chemical smell that made my throat close. I washed my hands, watched the rinse water go down the drain and didn't stop it. We wrapped the saris in plastic the buyer asked for. The apprentices laughed about the extra wages and I let the laugh out of me too. Later I made a small pot of Amma's halva, more out of routine than celebration, and the sweetness didn't quite reach the part of me that was registering how different the day felt. It sits there."""
"""2023-11-05""","""Defensive""","""Medium (1-2 paragraphs)""","""Grounded""","""Tradition, Universalism""","""Maya reached for the sachet of fast-bright; my hand closed over hers before either of us knew what we were doing. She tried to joke—'they'll sign off fast, more money'—and I put the sachet on the top shelf. No lecture. I went to the indigo vat, dipped a scrap, wrung it slow, coaxed the blue out the way Amma taught me—slow heat, a pinch of soda, patient watching. An apprentice crouched beside me and copied the thumb-press; her first clumsy strokes turned into something steadier. We wrapped the test piece, left the sachet on the shelf where it belongs, not thrown away but not to be used lightly. Made a small pot of Amma's halva afterward—quiet, not applause, just the sweetness that steadies a hand."""
"""2023-11-09""","""Self-reflective""","""Long (Detailed reflection)""","""Neutral""","""Tradition, Universalism""","""My thumb still knew the pressure for the final knot before my head did. I wrapped three saris—two indigo, one turmeric-copper—tucked receipts into an envelope and moved on. The order list on the table was plain: names, measurements, tiny margins. Rani practiced the diagonal twill while I signed the cheque for the dye supplier; our talk was short, about who would run to the market. I made Amma's halva in the small pot we use all year—half measure suji, extra ghee because it's nearly Diwali, a pinch of cardamom. The back room quieted while I stirred; I let each apprentice taste a warm spoon before they returned to stretching warp. The halva is routine, a small pause between work and festival. The indigo vat looked the same, slow and heavy, and the sachet of fast-bright still sits on the top shelf, unused. Papers were signed, wages counted into the steel tin, the river NGO left a note about the cleanup next weekend. Hands blue under my nails, palms warm from the halva pot, I washed up—n…"
"""2023-11-13""","""Emotional/Venting""","""Short (1-3 sentences)""","""Grounded""","""Tradition, Universalism""","""I cut the boutique's call mid-sentence, hung up, and went to the loom where Rani's fingers fumbled the diagonal twill. I steadied her thumbs, showed the knot without a lecture, then made a small pot of Amma's halva and handed her the spoon—quiet, the Asha I want to be."""
"""2023-11-18""","""Exhausted""","""Short (1-3 sentences)""","""Neutral""","""Tradition, Universalism""","""Hand moved to the indigo vat before my head finished the list; Rani relearned the diagonal knot, Maya bundled saris in the buyer's plastic, the sachet of fast-bright sits on the top shelf like a small accusation, and I stirred a tiny pot of Amma's halva because habit calms hands. I'm bone tired."""
"""2023-11-25""","""Stream of consciousness""","""Short (1-3 sentences)""","""Unsettled""","""Tradition, Universalism""","""Wrote my initials on the buyer's amendment to omit the river-cleanup fee so the advance would clear; Maya packed saris into the plastic and Rani kept the loom humming. I stirred Amma's halva, ate a spoon, and the signed paper sits on the table."""
"""2023-11-30""","""Brief and factual""","""Long (Detailed reflection)""","""Grounded""","""Tradition, Universalism""","""The amendment with my initials was still on the table; the ledger open, my palms faintly smelling of indigo. I picked up a pen. Where I'd initialed to drop the river-cleanup fee a week ago I wrote, in small letters, 'Please reinstate river-cleanup fee.' I initialed again, folded the page, slipped it into the envelope and set it in the steel tin with the wages. Maya came in carrying the wrapped saris; the plastic crinkled. She glanced at the envelope and asked, quietly, 'Will they take it?' I didn't answer. I warmed Amma's halva—half measure suji, extra ghee, a pinch of cardamom—and handed out spoons. Rani kept the loom moving; the indigo vat breathed slow. We ate, palms blue, and went back to work. No speech, no announcement. Just a small paper changed and a knot shown again until Rani's thumb found the right pressure. That smallness felt right—quiet and stubborn. I went back to the loom, the warp taut under my palms."""
"""2023-12-04""","""Emotional/Venting""","""Long (Detailed reflection)""","""Unsettled""","""Tradition, Universalism""","""When Rajni pushed the school notice across the table I folded it into my palm without looking. Her boy's name in the corner; a due date cruelly close. For weeks I'd been keeping an envelope marked Scholarships—Amma's handwriting, the list of girls. I sat, opened the steel tin where we keep petty cash, and the envelope felt heavier than I expected. I took the notes out and slid them to her. She pressed her forehead to her hands and said, 'I'll pay back,' like it was medicine. I wrote 'loan' in a corner of the receipt and stapled it to the folder. Maya watched from the doorway; the apprentices' feet kept time with the looms. I made Amma's halva because my hands move that way, handed Rajni a spoon, and couldn't make the sweetness match the money. I taped a small note on the scholarship folder—'return before Holi'—and put the envelope back where it lives, lighter now. There will be ways to replace this, I suppose, but for tonight the ledger has that blank. The indigo smell at my nails, t…"



PERSONA 2

## Generated Persona: Maya Chen
Age: 29 | Profession: Artist | Culture: North American
Values: Conformity, Hedonism
Bio: Maya Chen runs a small painting studio in Portland and chose a sunlit loft near a weekly farmers' market so she can eat well and take morning walks before starting commissions. She accepts only projects with clear briefs—community murals, portraits for neighbors and small businesses—and spends extra time reworking compositions and wording to avoid upsetting clients or the neighborhood review board; last year a curator asked her to add a controversial element to a funded mural and she agreed to a toned-down version rather than risk public complaints. To keep her weekends free for ceramics classes and short trips she turned down a steady design job that paid more but meant late nights, and now worries that saying no to gallery organizers or collectors could strain the relationships that supply her reliable commissions.

### Persona Generation Prompt:
──────

Date,Tone,Verbosity,Reflection Mode,Schwartz Values,Content
str,str,str,str,str,str
"""2023-10-27""","""Brief and factual""","""Short (1-3 sentences)""","""Unsettled""","""Conformity, Hedonism""","""I said yes to a collector's last-minute change - a small logo and a brighter palette - because it was easier than arguing and they'd pay upfront, so I spent Saturday at the easel instead of ceramics class. The canvas is wrapped and out the door, and my studio feels lighter and wrong at the same time."""
"""2023-10-29""","""Brief and factual""","""Long (Detailed reflection)""","""Neutral""","""Conformity, Hedonism""","""A smear of ultramarine on my thumb I couldn't rub off before the first coffee. Walked to the farmers' market because the loft needs groceries and I like the walk; picked up a sourdough wedge, a small bunch of kale, and a pear that was too ripe but cheap. Sat for ten minutes on the bench listening to the vendor who always complains about the city permitting process — unrelated conversation but familiar, like a background hum. Came back up the stairs with my tote, sun through the skylight, and put kettled water on for tea. In the studio I photographed two small canvases for the website and then smoothed a stubborn edge on the portrait commission — the client's note about logo placement still nags, so I rewrote the line in the contract to a specific size instead of hedging. Cleaned brushes (three turpentine-soaked rags in the sink, ugh), ordered a new roll of gesso, paid that invoice that's been sitting for a week. Took a call from a neighbor about the building's recycling schedule. Litt…"
"""2023-11-06""","""Stream of consciousness""","""Short (1-3 sentences)""","""Grounded""","""Conformity, Hedonism""","""Phone buzzed—owner of the café on the corner wanted their tiny logo painted into the mural; I typed a short, firm note saying I couldn't alter the composition but offered a painted donor panel or a set of prints instead, hit send, shut the laptop, and rode to the market with my tote and my hands still smelling of linseed."""
"""2023-11-10""","""Brief and factual""","""Short (1-3 sentences)""","""Neutral""","""Conformity, Hedonism""","""Linseed on my palms when I ran down to the farmers' market for eggs and a pear; the sourdough vendor had the longer line so I bought a small bunch of kale from the woman with the succulents and listened to the usual city-permit grumble. Back in the loft I tightened a stretcher, cleaned three brushes, replied to the café about a donor panel, boiled noodles."""
"""2023-11-20""","""Self-reflective""","""Long (Detailed reflection)""","""Neutral""","""Conformity, Hedonism""","""Halfway through rinsing a flat brush I noticed a faint smudge at the portrait's jaw and spent ten minutes feathering it out instead of making the grocery list. The commission is finally banded and boxed — corners taped, fragile sticker crooked — and I keep rewriting that one email about logo placement because phrasing feels like a small permission slip. Printer was out of ink when I tried to print the return label; the building's shared office has a cranky copier, of course. At the market the sourdough line was long so I grabbed a heel, a tiny bunch of kale from the woman with the chipped enamel bowl, and a pear that was almost too soft. The tomato stall guy grumbled about permits as usual; it's background noise now. Walked home with wet tote strap and the pear warm in my palm, boiled water for tea and let a kettle sing while I sorted brushes. Photographed a couple of small studies for the website — detail shots, nothing fancy — but didn't post them. Cleaned three brushes, fished tur…"



PERSONA 3

## Generated Persona: Anna Müller
Age: 22 | Profession: Nurse | Culture: Western European
Values: Tradition, Universalism
Bio: Anna Müller, 22, qualified at the regional hospital and now works on the community outreach team in a midsize city while returning home each Sunday to help her grandmother bake the family's Easter strudel and check on the elderly neighbor. She volunteers at a free clinic for asylum seekers, organizes clothing and medicine drives, and has pushed for reduced single-use plastics on her ward, but long shifts and patients' stories she cannot fix leave her sleepless and worn down. Her parents expect her to move back to the village and take over the small guesthouse, and she is torn between preserving those seasonal family rituals and applying for a master's in public health to tackle the inequalities she sees in care.

### Persona Generation Prompt:
────────────────────────────────────────
You are generating synthetic personas for a journaling dataset.

#

Date,Tone,Verbosity,Reflection Mode,Schwartz Values,Content
str,str,str,str,str,str
"""2023-10-27""","""Defensive""","""Medium (1-2 paragraphs)""","""Unsettled""","""Tradition, Universalism""","""I said yes in the procurement meeting — 'We'll postpone the reusable items pilot' — and heard my own voice agreeing before I could think. Infection control had stats, finance had spreadsheets, the ward sister looked like she hadn't slept and I didn't want an argument on top of a twelve-hour shift. So I folded, said we'd buy time, that patients' immediate needs came first. It felt necessary then, and easier. On the tram I pictured the boxes of disposable gloves in the storeroom and the volunteer from the asylum clinic asking again for reusable packs. I already told Grandma I'd be at her kitchen Sunday to roll the strudel and check on Frau Keller, and I told myself I couldn't carry another fight. This sits wrong — a tightness I can't place. I defended the choice, aloud and to myself, and that defense is part of why it won't settle."""
"""2023-10-30""","""Emotional/Venting""","""Short (1-3 sentences)""","""Unsettled""","""Tradition, Universalism""","""Said yes to Mum — 'I'll come back and help run the guesthouse next year' — because her voice sounded tired and I couldn't argue after a twelve-hour shift. We hung up with plans and it sits wrong."""
"""2023-11-04""","""Emotional/Venting""","""Long (Detailed reflection)""","""Grounded""","""Tradition, Universalism""","""My hands still smelled faintly of sanitizer when the woman arrived at the free clinic; she had a sleeping baby against her chest and a question in halting German about breastfeeding. I was tired; I had meant to catch the next tram home and roll the strudel with Grandma. But I sat, took off my gloves, and let her tell it—no rushing, no translating for her, just listening until she found the words. I showed her, gently and clumsily, how to hold the baby for a better latch, shifted pillows, warmed my palms on her cold wrists. I scribbled the time and address of the nearest support group on the back of an appointment card and circled it. The baby nuzzled and the woman's shoulders loosened into something like a laugh. Small, practical, not a fix for the faults of the system. On the tram I told Mum I'd be later for the guesthouse logistics—straight, no apology—and meant it. No grand decision, just the quiet certainty that choosing that small patience was who I want to be more often. It did…"
"""2023-11-12""","""Self-reflective""","""Medium (1-2 paragraphs)""","""Unsettled""","""Tradition, Universalism""","""When the interpreter line dropped and reception needed someone to keep the queue moving, I said yes to handing the woman a stamped appointment card and a leaflet instead of staying five minutes more to sit with her and work through the words. I stapled the leaflet, told her the number to call, and walked back to the desk as the line shortened. It felt necessary and easier in the moment. It sits wrong — she left holding the paper, eyes uncertain, and the sound of her cough keeps replaying in my head. I keep going over the few seconds I could have given."""
"""2023-11-19""","""Stream of consciousness""","""Medium (1-2 paragraphs)""","""Neutral""","""Tradition, Universalism""","""Halfway through the meds round the infusion pump's gentle double-beep pulled me back and I found myself thinking of Grandma's rolling pin—how she rubs flour between her palms before she folds the strudel. My hands smelled of handrub and burnt toast because I grabbed a sandwich on the go and the kettle hissed like it had things to say. Mrs. Novak asked to open the window then changed her mind, the trainee dropped a tray and apologised breathless, and I gave the kind of practical answers that steady other people's panic: tea? blanket? a different pillow? It was all small, routine. On the tram home my scarf smelled faintly of antiseptic and coffee, someone barked into their phone about a train delay, and I scrolled messages from Mum saying 'remember next week's booking'—I tapped a reply and deleted it twice. Put the donation box on the counter, made tea, re-read the asylum-clinic rota and nodded off for ten minutes with a sock unpaired on the floor. No decisions today, only the small unf…"
"""2023-11-25""","""Exhausted""","""Short (1-3 sentences)""","""Unsettled""","""Tradition, Universalism""","""Sat through the staff meeting and didn't push when someone decided to switch to pre-packaged single-use meal kits for the flu clinic; the ward needed speed and I kept my mouth shut. On the tram my hands still smelled of handrub and sugar from Grandma's apron, and it sits wrong."""
"""2023-12-04""","""Emotional/Venting""","""Long (Detailed reflection)""","""Grounded""","""Tradition, Universalism""","""He had a cough that rattled when he laughed; his fingers were too cold to hold the inhaler cap. The receptionist reached for a leaflet, but I crouched and pulled the donor box from under the bench. I found a spacer, wiped it with a swab, showed him how to fit it—slow numbers, counting with him: one-breathe, two-breathe—no interpreter, so I used my hands and the picture on the old leaflet. He managed a half-smile; his shoulders unclenched. That small unclenching felt like a weight lifting. I walked him to the chemist on Lindenstraße, paid the few euros for a replacement mouthpiece from the clinic fund, and wrote the clinic hours on the back of the receipt in big letters. Mum called while I waited and I told her to hold on—can't do guesthouse plans now—I put my phone back in my pocket. It wasn't heroic. I stayed long enough that he didn't leave with only paper and a number. Later, rolling strudel at Grandma's, flour under my nails, she asked if I'd applied for the master's yet and I mu…"
"""2023-12-06""","""Brief and factual""","""Medium (1-2 paragraphs)""","""Unsettled""","""Tradition, Universalism""","""I signed the early-discharge form for Mr. Kos, even though his sats were wobbling and he said he couldn't climb the stairs at home. The consultant slid the list toward me, the social worker said she'd put in a request, and I ticked the follow-up box, booked a taxi and stapled the leaflet to the discharge papers. They left with a receipt and a promise. It sits wrong. On the tram my hands smelled of handrub and flour from Grandma's apron; the taxi receipt was folded in my pocket. I told Mum I'd be late for strudel and said nothing about the man or the thin blanket he would sleep under."""
"""2023-12-08""","""Exhausted""","""Long (Detailed reflection)""","""Grounded""","""Tradition, Universalism""","""She kept apologising, voice so low I had to lean in, the toddler at her knees sticky with last night's stew. The interpreter line had dropped, reception was on hold, and I'd promised Grandma I'd be home to roll the strudel—I'd been on my feet since morning and I was tired, but I pulled my gloves off and sat on the low chair beside her. We used the clinic's spare pack of sanitary pads, me unwrapping it like it was something precious, her fingers worrying the plastic. I showed her how to fold it into the pants, wrote the drop-in hours of the women's centre on the back of an appointment card and circled it, and put two euros into her hand for the tram. She blinked slow, put the sleeping girl on her chest, and for a minute she wasn't looking terrified. It wasn't a policy change, not even a conversation at the procurement meeting—just a few quiet minutes and two euros drawn from the petty cash tin. On the tram home my scarf still smelled faintly of handrub and cinnamon from Grandma's apro…"
"""2023-12-13""","""Emotional/Venting""","""Short (1-3 sentences)""","""Unsettled""","""Tradition, Universalism""","""When Lina missed the med and the ward sister snapped 'write it up', I opened my mouth to explain she'd been pulled to triage, then closed it and let the form pass across the desk. Lina left stiff and small; it sits wrong."""


Logs saved to: ../logs/synthetic_data/2026-01-09_09-15-13

FINAL SUMMARY
Successfully generated: 3/3 personas
Total journal entries: 24
Entries per persona: min=5, max=10, avg=8.0
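
The figures in the final summary can be checked by hand from the per-persona entry counts. The sketch below is only an illustration of that arithmetic, not the notebook's own reporting code: the function name `summarize_entry_counts` and the dictionary keys are hypothetical, and the first persona's count of 9 is an assumption inferred from the reported total (24 minus the 5 and 10 entries listed for personas 2 and 3).

```python
def summarize_entry_counts(counts: dict[str, int]) -> dict[str, float]:
    """Return total, min, max and mean of journal entries per persona."""
    values = list(counts.values())
    return {
        "total": sum(values),
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),  # plain arithmetic mean
    }


# Hypothetical keys; 9 is inferred from the reported total, not counted here.
counts = {"persona_1": 9, "persona_2": 5, "persona_3": 10}
summary = summarize_entry_counts(counts)

print(f"Total journal entries: {summary['total']}")
print(
    f"Entries per persona: min={summary['min']}, "
    f"max={summary['max']}, avg={summary['avg']:.1f}"
)
```

Under those assumed counts the sketch reproduces the reported total of 24 and an average of 8.0 entries per persona.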
