# WikiGen Agent Workflow Testing

This notebook provides an environment to test the WikiGen agent-based story processing workflow using the new **BaseAgent architecture**. Each agent can be tested individually with different LLM models to demonstrate the flexibility and modularity of the system.

## 🏗️ BaseAgent Architecture

All WikiGen agents now inherit from `BaseAgent`, providing:
- **Default Models**: Each agent has configurable default provider/model
- **Override Flexibility**: Can override provider/model per call when needed  
- **Centralized LLM Logic**: Common error handling and validation
- **Clean API**: Simplified method signatures with optional parameters
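
The default-plus-override behaviour listed above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the WikiGen implementation: `SketchBaseAgent`, `SketchSummarizer`, `resolve_model`, and the provider/model strings are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SketchBaseAgent:
    """Hypothetical stand-in for a BaseAgent-style class (not the real WikiGen API)."""

    default_provider: str
    default_model: str

    def resolve_model(
        self,
        provider: Optional[str] = None,
        model: Optional[str] = None,
    ) -> Tuple[str, str]:
        """Return the (provider, model) pair to use for a single call."""
        # Per-call arguments win; otherwise fall back to the agent's defaults.
        resolved_provider = provider or self.default_provider
        resolved_model = model or self.default_model
        # Shared validation lives in the base class so every agent gets the same check.
        if not resolved_provider or not resolved_model:
            raise ValueError("provider and model must both be set")
        return resolved_provider, resolved_model


class SketchSummarizer(SketchBaseAgent):
    """Example subclass that only supplies its own (made-up) defaults."""

    def __init__(self) -> None:
        super().__init__(
            default_provider="example-provider",
            default_model="example-model",
        )
```

Calling `SketchSummarizer().resolve_model()` falls back to the subclass defaults, while `resolve_model(model="other-model")` overrides only the model for that one call.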

## 📋 WikiGen Agents

1. **ArcSplitter Agent** - Analyzes story structure and determines arc boundaries
2. **WikiPlanner Agent** - Plans wiki structure and article organization  
3. **ArticleWriter Agent** - Generates actual wiki article content
4. **GeneralSummarizer Agent** - Creates summaries of various content types
5. **ChapterBacklinker Agent** - Creates bidirectional links between chapters and articles
6. **WikiGenOrchestrator** - Coordinates the complete workflow

Each agent demonstrates the BaseAgent pattern with different default models to show architectural flexibility.

## ⚙️ Setup

Make sure the Portkey Gateway is running to use the LLM Service:

```bash
docker run \
  --name portkey-gateway \
  -p 8787:8787 \
  portkeyai/gateway:latest
```
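
Before running the cells, it can help to confirm that something is listening on the published port. The sketch below is an assumption-laden convenience check, not part of WikiGen: `is_port_open` is a hypothetical helper that only tests TCP reachability and does not verify that the listener is actually the Portkey Gateway.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # 8787 is the port published by the docker command above.
    print("Gateway port reachable:", is_port_open("localhost", 8787))
```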

The following cells will:
- use the Portkey Gateway to test the LLM Service
- initialize the LLM Service
- import the WikiGen workflow agents
- load a test story from the `tests/resources/pokemon_amber/story` directory


In [1]:
# Additional imports for unified Story model
from src.schemas.story import Story, Chapter

print("✅ Added unified Story model imports")


✅ Added unified Story model imports


In [2]:
# Cell 1: Setup and Configuration
%load_ext autoreload
%autoreload 2

import sys
import os
from pathlib import Path
import asyncio
import logging
from uuid import uuid4
from typing import Dict, Any, List, cast

# Set environment to skip database for testing
os.environ["SKIP_DATABASE"] = "true"
os.environ["PORTKEY_BASE_URL"] = "http://localhost:8787/v1"  # Default Portkey Gateway

# Add the backend/ directory (the parent of src/) to sys.path so that `src.*` imports resolve
notebook_dir = Path.cwd()
if (notebook_dir / 'src').is_dir() and (notebook_dir / 'pyproject.toml').is_file():
    # This means we're likely in the backend/ directory itself
    sys.path.insert(0, str(notebook_dir))
elif (notebook_dir.parent / 'src').is_dir() and (notebook_dir.parent / 'pyproject.toml').is_file():
    # This means we're likely in the backend/notebooks/ directory
    sys.path.insert(0, str(notebook_dir.parent))
else:
    print("Warning: Could not automatically add the backend/ directory to the Python path.")

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[logging.StreamHandler(sys.stdout)]
)
# Reduce noise from third-party loggers
logging.getLogger('httpx').setLevel(logging.WARNING)
logging.getLogger('uvicorn.access').setLevel(logging.WARNING)

print("🚀 WikiGen Agent Workflow Testing Environment")
print("=" * 60)
print(f"Current working directory: {os.getcwd()}")
print(f"Database mode: {'IN-MEMORY' if os.environ.get('SKIP_DATABASE') == 'true' else 'SUPABASE'}")
print(f"Portkey Gateway: {os.environ.get('PORTKEY_BASE_URL', 'Not configured')}")
print("Autoreload enabled. Changes to .py files in src/ will be reloaded.")
print("\n💡 This notebook tests WikiGen agents with real LLM calls.")
print("   Each agent can use different models to demonstrate flexibility.")


🚀 WikiGen Agent Workflow Testing Environment
Current working directory: /home/jimnix/gitrepos/shuscribe/backend/notebooks
Database mode: IN-MEMORY
Portkey Gateway: http://localhost:8787/v1
Autoreload enabled. Changes to .py files in src/ will be reloaded.

💡 This notebook tests WikiGen agents with real LLM calls.
   Each agent can use different models to demonstrate flexibility.


In [3]:
# Cell 2: Import Modules (Updated for Supabase)
from src.config import settings
from src.services.llm.llm_service import LLMService
from src.database.repositories import get_user_repository, get_story_repository
from src.schemas.llm.models import LLMMessage, LLMResponse, ThinkingEffort
from src.core.story_loading import StoryLoaderFactory
from src.core.encryption import encrypt_api_key
from dotenv import dotenv_values


# WikiGen agent imports
from src.agents.wikigen import (
    WikiGenOrchestrator,
    ArcSplitterAgent,
    WikiPlannerAgent,
    ArticleWriterAgent,
    GeneralSummarizerAgent,
    ChapterBacklinkerAgent
)

print("✅ Modules imported successfully.")

# Display current settings
print("\n--- Current Settings ---")
print(f"DEBUG: {settings.DEBUG}")
print(f"ENVIRONMENT: {settings.ENVIRONMENT}")
print(f"SKIP_DATABASE: {settings.SKIP_DATABASE}")
print(f"PORTKEY_BASE_URL: {settings.PORTKEY_BASE_URL}")
if not settings.SKIP_DATABASE:
    print(f"SUPABASE_URL: {settings.SUPABASE_URL}")
    print(f"SUPABASE_KEY: {'***' + settings.SUPABASE_KEY[-4:] if len(settings.SUPABASE_KEY) > 4 else 'Not set'}")
else:
    print("DATABASE_MODE: In-Memory (Supabase skipped)")
print("------------------------")

print("\n🔧 Import Summary:")
print("✅ LLMService - Ready for testing")
print("✅ Repository Factory - Automatic in-memory/Supabase switching") 
print("✅ Pydantic Models - Type-safe user schemas")
print("✅ Encryption - API key encryption/decryption")
print("✅ Supabase Connection - Available when SKIP_DATABASE=false")


2025-06-29 19:27:56,119 - src.config - INFO - Pydantic Settings 'extra' mode set to: 'ignore' for environment: 'development'
✅ Modules imported successfully.

--- Current Settings ---
DEBUG: True
ENVIRONMENT: development
SKIP_DATABASE: True
PORTKEY_BASE_URL: http://localhost:8787/v1
DATABASE_MODE: In-Memory (Supabase skipped)
------------------------

🔧 Import Summary:
✅ LLMService - Ready for testing
✅ Repository Factory - Automatic in-memory/Supabase switching
✅ Pydantic Models - Type-safe user schemas
✅ Encryption - API key encryption/decryption
✅ Supabase Connection - Available when SKIP_DATABASE=false


In [4]:
# Cell 3: Initialize LLM Service and Store Encrypted API Keys

print("🚀 Initializing LLM Service...")

# The factory automatically chooses in-memory or Supabase based on SKIP_DATABASE
llm_service = LLMService(user_repository=get_user_repository())

# Ensure repository is available
if not llm_service.user_repository:
    raise RuntimeError("Failed to initialize user repository")

print("✅ LLM Service initialized successfully!")
print(f"📁 Repository type: {type(llm_service.user_repository).__name__}")
print(f"🛡️  Database mode: {'In-Memory' if settings.SKIP_DATABASE else 'Supabase'}")

if settings.SKIP_DATABASE:
    print("\n💡 Running in database-free mode:")
    print("   • Perfect for testing with direct API keys")
    print("   • No Supabase setup required")
    print("   • User API keys stored in memory only")
else:
    print("\n🗄️  Connected to Supabase:")
    print("   • User API keys stored encrypted in database")
    print("   • Full multi-user support enabled")
    print("   • Row-level security active")

print("\n🔑 Loading and storing encrypted API keys...")

# Load environment variables once
env_values = dotenv_values()

# Create a test user for API key storage
TEST_USER_ID = uuid4()
test_user = await llm_service.user_repository.create({
    "id": TEST_USER_ID,
    "email": "test@example.com",
    "name": "Test User"
})

print(f"👤 Created test user: {test_user.email} (ID: {TEST_USER_ID})")

# Store API keys from environment in the repository (properly encrypted)
stored_keys = 0
available_providers = []
for provider in LLMService.get_all_llm_providers():
    provider_id = provider.provider_id
    api_key = env_values.get(f"{provider_id.upper()}_API_KEY")
    
    if api_key:
        # Properly encrypt the API key before storing
        encrypted_key = encrypt_api_key(api_key)
        
        await llm_service.user_repository.store_api_key(
            user_id=TEST_USER_ID,
            provider=provider_id,
            encrypted_key=encrypted_key,  # Now properly encrypted
            validation_status="pending"
        )
        stored_keys += 1
        available_providers.append(provider.display_name)
        print(f"   🔐 Stored and encrypted {provider.display_name} API key")

print(f"✅ Stored {stored_keys} encrypted API keys in repository")
if available_providers:
    print(f"📋 Available providers: {', '.join(available_providers)}")
print("\n🎯 Ready for testing!")

🚀 Initializing LLM Service...
✅ LLM Service initialized successfully!
📁 Repository type: InMemoryUserRepository
🛡️  Database mode: In-Memory

💡 Running in database-free mode:
   • Perfect for testing with direct API keys
   • No Supabase setup required
   • User API keys stored in memory only

🔑 Loading and storing encrypted API keys...
👤 Created test user: test@example.com (ID: 99d4f6d3-4138-4a6f-9363-4eb90d51a3fb)
   🔐 Stored and encrypted OpenAI API key
   🔐 Stored and encrypted Google API key
   🔐 Stored and encrypted Anthropic API key
✅ Stored 3 encrypted API keys in repository
📋 Available providers: OpenAI, Google, Anthropic

🎯 Ready for testing!


In [9]:
# Cell 4: Load Test Story

print("🚀 Initializing Services and Loading Test Story...")
print("=" * 60)

STORY_REPO = get_story_repository()

print("\n📖 Loading Test Story...")
story_directory_path = Path("../tests/resources/pokemon_amber/story")

try:
    input_story = StoryLoaderFactory.load_story(story_directory_path)
    print(f"✅ Loaded: {input_story.title}")
    print(f"   Author: {input_story.author}")
    print(f"   Chapters: {input_story.total_chapters}")
    print(f"   Genres: {', '.join(input_story.genres)}")
    
    # Store in repository for testing
    stored_story = await STORY_REPO.store_story(input_story, TEST_USER_ID)
    AMBER_STORY_ID = stored_story.id
    print(f"   Stored with ID: {AMBER_STORY_ID}")
    
    # Prepare story content for agent testing
    story_content = "\n\n".join([
        f"# {chapter.title}\n{chapter.content}" 
        for chapter in input_story.chapters
    ])
    
    print(f"\n📊 Test Data Summary:")
    print(f"   Total characters: {len(story_content):,}")
    print(f"   First chapter: {input_story.chapters[0].title}")
    print(f"   Last chapter: {input_story.chapters[-1].title}")
    
except Exception as e:
    print(f"❌ Error loading story: {e}")
    raise



🚀 Initializing Services and Loading Test Story...

📖 Loading Test Story...
✅ Loaded: Pokemon: Ambertwo
   Author: ChronicImmortality
   Chapters: 8
   Genres: Drama, Action, Adventure, Fantasy
   Stored with ID: d6b0d4ce-a257-43fd-922d-2ce0cae543e7

📊 Test Data Summary:
   Total characters: 104,667
   First chapter: [Chapter 1] Truck-kun Strikes Again
   Last chapter: [Chapter 8] Start of an Unpaid Side Quest


# Run individual agents

## ArcSplitter Agent - Streaming Analysis

**🔄 Single LLM Call with Real-time Streaming!**

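The ArcSplitter agent supports **streaming analysis**: it yields response chunks as the model generates them while accumulating the same chunks internally, then parses the accumulated text into the final structured result. The **same LLM call** therefore serves both the live output and the final parse, with no second request needed to obtain structured data.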

- ⚡ **Real-time Feedback** - See analysis progress as it happens
- 🚀 **Single LLM Call** - No wasteful duplicate API calls  
- 📊 **Live Updates** - Stream response chunks as they're generated
- 🎯 **Smart Accumulation** - Parse final accumulated result into structured data
- 💰 **Cost Efficient** - One call gives you both streaming UX and structured output
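
The stream-then-parse idea described above can be sketched in a few lines. This is an illustration under assumptions rather than the ArcSplitter code: `fake_llm_stream`, `stream_and_parse`, and `run_demo` are hypothetical names, and the fake stream simply yields a small JSON document in pieces in place of a real LLM response.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Callable, Dict, List


async def fake_llm_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming LLM call; yields one JSON document in pieces."""
    pieces = [
        '{"arcs": [{"title": ',
        '"Arc 1", "start_chapter": 1, ',
        '"end_chapter": 8}]}',
    ]
    for piece in pieces:
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield piece


async def stream_and_parse(
    stream: AsyncIterator[str],
    on_chunk: Callable[[str], None],
) -> Dict[str, Any]:
    """Forward each chunk as it arrives, then parse the accumulated text exactly once."""
    pieces: List[str] = []
    async for piece in stream:
        on_chunk(piece)       # live feedback for the user
        pieces.append(piece)  # accumulation for the final structured parse
    return json.loads("".join(pieces))


def run_demo() -> Dict[str, Any]:
    """Run the sketch outside a notebook; inside Jupyter, await stream_and_parse directly."""
    return asyncio.run(
        stream_and_parse(fake_llm_stream(), lambda p: print(f"chunk: {len(p)} chars"))
    )
```

The design point is that the stream is consumed only once: the callback provides the live view, and the joined pieces provide the input to the parser, so no second request is needed.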


In [15]:
# Cell 5: Test ArcSplitter Agent - New Streaming Architecture

print("\n🔄 Testing ArcSplitter Agent - NEW STREAMING ARCHITECTURE")
print("=" * 60)
print("This demonstrates the new clean separation of concerns:")
print("📡 Streaming: Raw LLM responses with proper chunk type labels")
print("🎯 State Management: Internal accumulation and parsing")
print("🔗 Result Access: Clean final result via get_final_result()")

# provider = "google"  # Use Google for streaming demo    
# api_key = AVAILABLE_PROVIDERS[provider]['api_key']
# model = LLMService.get_default_test_model_name_for_provider(provider)
# model = "claude-sonnet-4-20250514"
# model = "gemini-2.5-flash-lite-preview-06-17"
# model = "o4-mini"

# print(f"🔧 Streaming with: {provider} / {model}")

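# Fail early with a clear message instead of hitting an undefined STORY later
if AMBER_STORY_ID is None:
    raise RuntimeError("No stored story ID found; run the 'Load Test Story' cell first")
STORY = await STORY_REPO.get_story(AMBER_STORY_ID)
print(f"📊 Story: {STORY.title} ({len(STORY.chapters)} chapters)")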

try:
    # Initialize the agent
    arc_splitter_streaming = ArcSplitterAgent(
        llm_service=llm_service,
        # default_provider=provider,
        # default_model=model,
        temperature=0.7,
        max_tokens=16000,
        thinking=ThinkingEffort.LOW
    )
    
    print(f"\n⚡ Starting streaming analysis...")
    print("📡 Live stream output (by chunk type):")
    print("-" * 40)
    
    # Import ChunkType for proper enum usage
    from src.schemas.llm.models import ChunkType
    
    # Track streaming responses by type - Using enum constants
    chunk_counts = {ChunkType.THINKING: 0, ChunkType.CONTENT: 0, ChunkType.UNKNOWN: 0}
    total_content_chars = 0
    
    async for chunk in arc_splitter_streaming.analyze_story_streaming(
        story=STORY,
        user_id=TEST_USER_ID,
    ):
        # Use the enum directly - no need for .value
        chunk_type = chunk.chunk_type
        chunk_counts[chunk_type] += 1
        
        # Get metadata for debugging
        metadata = chunk.metadata or {}
        chunk_num = metadata.get("chunk_number", "?")
        chapters = metadata.get("chapters", "?")
        
        # Show chunk info - use .value for display
        print(f"[{chunk_type.value.upper()}] Chunk {chunk_num} ({chapters}): {len(chunk.content)} chars")
        
        # For CONTENT chunks, show a preview of the actual content
        if chunk_type == ChunkType.CONTENT and chunk.content:
            total_content_chars += len(chunk.content)
            # Show first bit of content for CONTENT chunks
            preview = chunk.content[:100].replace("\n", "\\n")
            print(f"  📝 Content preview: {preview}...")
        
        # For THINKING chunks, just show that we got thinking
        elif chunk_type == ChunkType.THINKING and chunk.content:
            print(f"  🤔 Thinking: {len(chunk.content)} chars")
    
    print("-" * 40)
    print(f"✅ Streaming completed!")
    print(f"📊 Chunk counts: {chunk_counts}")
    print(f"📝 Total content characters: {total_content_chars}")
    
    # Get internal chunk processing details
    chunk_results = arc_splitter_streaming.get_window_results()
    print(f"\n🔍 Internal Chunk Processing Summary:")
    for chunk_result in chunk_results:
        print(f"  Chunk {chunk_result.window_number}: {chunk_result.chapters_range}")
        raw = chunk_result.raw_content
        print(f"    📝 Accumulated: T={len(raw.thinking)}, C={len(raw.content)}, U={len(raw.unknown)} chars")
        if chunk_result.parsed_result:
            print(f"    ✅ Parsed: {len(chunk_result.parsed_result.arcs)} arcs")
        if chunk_result.error:
            print(f"    ❌ Error: {chunk_result.error}")
    
    # Get the final merged result (no manual parsing needed!)
    print(f"\n🎯 Getting final result...")
    try:
        final_result = arc_splitter_streaming.get_final_result()
        
        print(f"✅ Final result ready!")
        print(f"📖 Total arcs generated: {len(final_result.arcs)}")
        
        # Show all arcs
        for i, arc in enumerate(final_result.arcs):
            print(f"\n🏛️  Arc {i+1}: {arc.title}")
            print(f"   📖 Chapters: {arc.start_chapter}-{arc.end_chapter}")
            print(f"   📝 Summary: {arc.summary[:100]}...")
            print(f"   🔒 Finalized: {arc.is_finalized}")
            
    except Exception as parse_error:
        print(f"❌ Could not get final result: {parse_error}")
        print("💡 Check the chunk processing summary above for errors")
    
except Exception as e:
    print(f"❌ Streaming test failed: {e}")
    import traceback
    traceback.print_exc()


🔄 Testing ArcSplitter Agent - NEW STREAMING ARCHITECTURE
This demonstrates the new clean separation of concerns:
📡 Streaming: Raw LLM responses with proper chunk type labels
🎯 State Management: Internal accumulation and parsing
🔗 Result Access: Clean final result via get_final_result()
📊 Story: Pokemon: Ambertwo (8 chapters)

⚡ Starting streaming analysis...
📡 Live stream output (by chunk type):
----------------------------------------
2025-06-29 19:33:17,408 - src.agents.wikigen.arc_splitter - INFO - 🔄 Starting new analysis: 1f2473ae-c4ff-49d9-9ad0-b9fc60181c02
2025-06-29 19:33:17,409 - src.agents.wikigen.arc_splitter - INFO - 📐 Model Context Window: 1,048,576 tokens
2025-06-29 19:33:17,409 - src.agents.wikigen.arc_splitter - INFO - 📐 Chunk Limit: 1,043,576 tokens (after 5,000 overhead)
2025-06-29 19:33:17,410 - src.agents.wikigen.arc_splitter - INFO - 🔍 ArcSplitter Analysis Starting:
2025-06-29 19:33:17,410 - src.agents.wikigen.arc_splitter - INFO -    📖 Story: Pokemon: Ambertwo
202

In [16]:
print(chunk_results[0].raw_content.content)

I'm currently solidifying the final arc structure and making sure the information presented is the most up-to-date and complete. Given that this is the concluding section, all arcs will have their `is_finalized` value set to `true` to indicate the completion of this initial structure. This approach is intended to provide a robust framework that can readily expand as the story grows in the future, encompassing the core themes and overarching plot threads.




{
  "arc_strategy": "The arc division strategy prioritizes the core principle of fewer, more comprehensive arcs to accommodate significant future growth. With only 8 chapters provided, the focus is on identifying the most pivotal narrative transitions. The story begins with a fundamental reset of the protagonist's existence and ends with a significant reveal about her origins and the overarching antagonists. Arc 1, \"Rebirth and Revelation\" (Chapters 1-6), covers the initial phase of the protagonist's awakening in the Pokemon worl

In [17]:
print(final_result.model_dump_json(indent=2))

{
  "story_prediction": "The story is poised for a long-term narrative focused on the protagonist's journey of self-discovery and her role in combating Team Rocket's ambitions. Future plotlines will likely involve her mastering her abilities as a trainer, exploring different regions of the Pokemon world, and uncovering the secrets behind Mewtwo and other genetically engineered Pokemon. Her unique meta-knowledge will be a recurring advantage, but the reality of the world will challenge her assumptions. Team Rocket's pursuit of powerful Pokemon and their exploitation of Pokemon biology will drive central conflicts, potentially leading to direct confrontations with their leadership and research divisions. Dr. Fuji's complex role as a scientist, father, and Team Rocket operative will likely lead to moral dilemmas and potential betrayals or alliances. The protagonist's connection to Amber's mother, Delia, may also become a significant plot thread.",
  "growth_assessment": "This story has im

## WikiPlanner Agent

In [22]:
#
#
#

## ArticleWriter Agent

In [23]:
#
#
#

## GeneralSummarizer Agent

In [24]:
#
#
#

## WikiGenOrchestrator

In [25]:
#
#
#