# WikiGen Agent Workflow Testing

This notebook provides an environment to test the WikiGen agent-based story processing workflow using the new **BaseAgent architecture**. Each agent can be tested individually with different LLM models to demonstrate the flexibility and modularity of the system.

## 🏗️ BaseAgent Architecture

All WikiGen agents now inherit from `BaseAgent`, which provides:
- **Default Models**: Each agent has a configurable default provider and model
- **Override Flexibility**: The provider and model can be overridden per call when needed
- **Centralized LLM Logic**: Common error handling and validation live in one place
- **Clean API**: Simplified method signatures with optional parameters
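
The default-plus-override pattern can be illustrated with a minimal sketch. The class names `SketchBaseAgent` and `SketchSummarizer`, the `resolve` method, and the provider and model strings are all hypothetical and exist only for this illustration; the real `BaseAgent` may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResolvedModel:
    provider: str
    model: str


class SketchBaseAgent:
    """Hypothetical stand-in for BaseAgent; not the real WikiGen class."""

    def __init__(self, default_provider: str, default_model: str) -> None:
        self.default_provider = default_provider
        self.default_model = default_model

    def resolve(
        self, provider: Optional[str] = None, model: Optional[str] = None
    ) -> ResolvedModel:
        # Per-call arguments win; otherwise fall back to the agent defaults.
        resolved = ResolvedModel(
            provider=provider or self.default_provider,
            model=model or self.default_model,
        )
        # Centralized validation: every subclass gets the same check.
        if not resolved.provider or not resolved.model:
            raise ValueError("provider and model must both be set")
        return resolved


class SketchSummarizer(SketchBaseAgent):
    """Example subclass that only declares its own defaults."""

    def __init__(self) -> None:
        # Placeholder names, not real provider or model identifiers.
        super().__init__(
            default_provider="example-provider", default_model="example-model"
        )
```

Calling `SketchSummarizer().resolve()` uses the agent defaults, while `resolve(model="other-model")` overrides only the model for that call. Keeping the fallback logic in the base class is what lets each subclass stay small.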

## 📋 WikiGen Agents

1. **ArcSplitter Agent** - Analyzes story structure and determines arc boundaries
2. **WikiPlanner Agent** - Plans wiki structure and article organization  
3. **ArticleWriter Agent** - Generates actual wiki article content
4. **GeneralSummarizer Agent** - Creates summaries of various content types
5. **ChapterBacklinker Agent** - Creates bidirectional links between chapters and articles
6. **WikiGenOrchestrator** - Coordinates the complete workflow

Each agent demonstrates the BaseAgent pattern with different default models to show architectural flexibility.

## ⚙️ Setup

Make sure the Portkey Gateway is running to use the LLM Service:

```bash
docker run \
  --name portkey-gateway \
  -p 8787:8787 \
  portkeyai/gateway:latest
```
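
Before running the cells below, it can help to confirm that something is listening on the gateway port. The helper below is a minimal sketch using only the standard library; the function name `gateway_is_reachable` is an assumption for this example, and it checks TCP connectivity only, not that the listener is actually the Portkey Gateway.

```python
import socket
from urllib.parse import urlparse


def gateway_is_reachable(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(base_url)
    host = parsed.hostname or "localhost"
    # Fall back to the scheme's conventional port when none is given.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, or name resolution failure.
        return False
```

For the setup above, the call would be `gateway_is_reachable("http://localhost:8787/v1")`; a `False` result usually means the container is not running or the port mapping differs.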

The following cells will:
- use the Portkey Gateway to test the LLM Service
- initialize the LLM Service
- import the WikiGen workflow agents
- load a test story from the `tests/resources/pokemon_amber/story` directory


In [5]:
# Cell 1: Setup and Configuration
%load_ext autoreload
%autoreload 2

import sys
import os
from pathlib import Path
import asyncio
import logging
from uuid import uuid4
from typing import Dict, Any, List, cast

# Set environment to skip database for testing
os.environ["SKIP_DATABASE"] = "true"
os.environ["PORTKEY_BASE_URL"] = "http://localhost:8787/v1"  # Default Portkey Gateway

# Add the backend/src directory to sys.path
notebook_dir = Path.cwd()
if (notebook_dir / 'src').is_dir() and (notebook_dir / 'pyproject.toml').is_file():
    # This means we're likely in the backend/ directory itself
    sys.path.insert(0, str(notebook_dir / 'src'))
elif (notebook_dir.parent / 'src').is_dir() and (notebook_dir.parent / 'pyproject.toml').is_file():
    # This means we're likely in the backend/notebooks/ directory
    sys.path.insert(0, str(notebook_dir.parent / 'src'))
else:
    print("Warning: Could not automatically add 'src/' to Python path.")

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[logging.StreamHandler(sys.stdout)]
)
# Reduce noise from third-party loggers
logging.getLogger('httpx').setLevel(logging.WARNING)
logging.getLogger('uvicorn.access').setLevel(logging.WARNING)

print("🚀 WikiGen Agent Workflow Testing Environment")
print("=" * 60)
print(f"Current working directory: {os.getcwd()}")
print(f"Database mode: {'IN-MEMORY' if os.environ.get('SKIP_DATABASE') == 'true' else 'SUPABASE'}")
print(f"Portkey Gateway: {os.environ.get('PORTKEY_BASE_URL', 'Not configured')}")
print("Autoreload enabled. Changes to .py files in src/ will be reloaded.")
print("\n💡 This notebook tests WikiGen agents with real LLM calls.")
print("   Each agent can use different models to demonstrate flexibility.")


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
🚀 WikiGen Agent Workflow Testing Environment
Current working directory: /home/jimnix/gitrepos/shuscribe/backend/notebooks
Database mode: IN-MEMORY
Portkey Gateway: http://localhost:8787/v1
Autoreload enabled. Changes to .py files in src/ will be reloaded.

💡 This notebook tests WikiGen agents with real LLM calls.
   Each agent can use different models to demonstrate flexibility.


In [6]:
# Cell 2: Import Modules
from src.config import settings
from src.services.llm.llm_service import LLMService
from src.database.repositories import get_user_repository, get_story_repository
from src.schemas.llm.models import LLMMessage, LLMResponse, ThinkingEffort
from src.core.story_loading import StoryLoaderFactory
from dotenv import dotenv_values

# WikiGen agent imports
from src.agents.wikigen import (
    WikiGenOrchestrator,
    ArcSplitterAgent,
    WikiPlannerAgent,
    ArticleWriterAgent,
    GeneralSummarizerAgent,
    ChapterBacklinkerAgent
)

print("✅ Modules imported successfully.")

# Display current settings
print("\n--- Current Settings ---")
print(f"DEBUG: {settings.DEBUG}")
print(f"ENVIRONMENT: {settings.ENVIRONMENT}")
print(f"SKIP_DATABASE: {settings.SKIP_DATABASE}")
print(f"PORTKEY_BASE_URL: {settings.PORTKEY_BASE_URL}")
print("DATABASE_MODE: In-Memory (Supabase skipped)")
print("------------------------")

print("\n🔧 Import Summary:")
print("✅ LLMService - Ready for testing")
print("✅ WikiGen Agents - All 5 agents + orchestrator imported") 
print("✅ Story Loading - Test story will be loaded")
print("✅ Repository Factory - In-memory repositories for testing")
print("✅ Pydantic Models - Type-safe schemas")


✅ Modules imported successfully.

--- Current Settings ---
DEBUG: True
ENVIRONMENT: development
SKIP_DATABASE: True
PORTKEY_BASE_URL: http://localhost:8787/v1
DATABASE_MODE: In-Memory (Supabase skipped)
------------------------

🔧 Import Summary:
✅ LLMService - Ready for testing
✅ WikiGen Agents - All 5 agents + orchestrator imported
✅ Story Loading - Test story will be loaded
✅ Repository Factory - In-memory repositories for testing
✅ Pydantic Models - Type-safe schemas


In [10]:
# Cell 3: Initialize Services and Load Test Story

print("🚀 Initializing Services and Loading Test Story...")
print("=" * 60)

# Initialize repositories and LLM service
user_repo = get_user_repository()
story_repo = get_story_repository()
llm_service = LLMService(user_repository=user_repo)

print("✅ Services initialized successfully!")
print(f"📁 Repository types: {type(user_repo).__name__}, {type(story_repo).__name__}")

# Load API keys from .env
env_values = dotenv_values()
print("\n🔑 Loading API Keys from .env...")

# Check which providers have API keys available
AVAILABLE_PROVIDERS: dict[str, dict[str, str]] = {}
for provider in LLMService.get_all_llm_providers():
    provider_id = provider.provider_id
    api_key = env_values.get(f"{provider_id.upper()}_API_KEY")
    if api_key:
        AVAILABLE_PROVIDERS[provider_id] = {
            'name': provider.display_name,
            'api_key': api_key
        }
        print(f"✅ {provider.display_name}: API key found")
    else:
        print(f"⏭️  {provider.display_name}: No API key found")

print(f"\n📊 Available Providers: {len(AVAILABLE_PROVIDERS)}")

# Load test story
print("\n📖 Loading Test Story...")
story_directory_path = Path("../tests/resources/pokemon_amber/story")

try:
    input_story = StoryLoaderFactory.load_story(story_directory_path)
    print(f"✅ Loaded: {input_story.metadata.title}")
    print(f"   Author: {input_story.metadata.author}")
    print(f"   Chapters: {input_story.total_chapters}")
    print(f"   Genres: {', '.join(input_story.metadata.genres)}")
    
    # Store in repository for testing
    fake_owner_id = uuid4()
    stored_story = await story_repo.store_input_story(input_story, fake_owner_id)
    print(f"   Stored with ID: {stored_story.id}")
    
    # Prepare story content for agent testing
    story_content = "\n\n".join([
        f"# {chapter.title}\n{chapter.content}" 
        for chapter in input_story.chapters
    ])
    
    print("\n📊 Test Data Summary:")
    print(f"   Total characters: {len(story_content):,}")
    print(f"   First chapter: {input_story.chapters[0].title}")
    print(f"   Last chapter: {input_story.chapters[-1].title}")
    
except Exception as e:
    print(f"❌ Error loading story: {e}")
    raise

print(f"\n🎯 Ready for WikiGen agent testing with {len(AVAILABLE_PROVIDERS)} LLM providers!")


🚀 Initializing Services and Loading Test Story...
✅ Services initialized successfully!
📁 Repository types: InMemoryUserRepository, InMemoryStoryRepository

🔑 Loading API Keys from .env...
✅ OpenAI: API key found
✅ Google: API key found
✅ Anthropic: API key found

📊 Available Providers: 3

📖 Loading Test Story...
✅ Loaded: Pokemon: Ambertwo
   Author: ChronicImmortality
   Chapters: 8
   Genres: Drama, Action, Adventure, Fantasy
   Stored with ID: d2e1f736-a230-4363-8cef-c3def1ba9b0a

📊 Test Data Summary:
   Total characters: 104,667
   First chapter: [Chapter 1] Truck-kun Strikes Again
   Last chapter: [Chapter 8] Start of an Unpaid Side Quest

🎯 Ready for WikiGen agent testing with 3 LLM providers!


# Run individual agents

## ArcSplitter Agent - Streaming Analysis

**🔄 Single LLM Call with Real-time Streaming!**

The ArcSplitter agent now supports **streaming analysis** that provides real-time feedback while accumulating the final structured result. This approach uses **only one LLM call** for both user experience and final parsing.

- ⚡ **Real-time Feedback** - See analysis progress as it happens
- 🚀 **Single LLM Call** - No wasteful duplicate API calls  
- 📊 **Live Updates** - Stream response chunks as they're generated
- 🎯 **Smart Accumulation** - Parse final accumulated result into structured data
- 💰 **Cost Efficient** - One call gives you both streaming UX and structured output
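
The single-call idea can be sketched independently of WikiGen: wrap the stream so each piece is both forwarded to the caller and stored, then parse the stored text once the stream ends. Everything here is an illustrative assumption, including `fake_llm_stream`, `StreamingAccumulator`, and the sample JSON payload; only the method name `get_final_result` mirrors the one used in this notebook, and the real agent's internals may differ.

```python
import asyncio
import json
from typing import AsyncIterator, List


async def fake_llm_stream() -> AsyncIterator[str]:
    """Stand-in for a streaming LLM call; yields a JSON document in pieces."""
    pieces = [
        '{"arcs": [{"title": ',
        '"Example Arc", "start_chapter": 1, ',
        '"end_chapter": 8}]}',
    ]
    for piece in pieces:
        await asyncio.sleep(0)  # yield control, as a real network stream would
        yield piece


class StreamingAccumulator:
    """Forwards chunks to the caller while keeping a copy for final parsing."""

    def __init__(self) -> None:
        self._parts: List[str] = []

    async def stream(self, source: AsyncIterator[str]) -> AsyncIterator[str]:
        async for piece in source:
            self._parts.append(piece)  # accumulate for later parsing
            yield piece  # surface immediately for live display

    def get_final_result(self) -> dict:
        # Parse once, after the stream has ended; no second LLM call is needed.
        return json.loads("".join(self._parts))


async def main() -> dict:
    acc = StreamingAccumulator()
    async for piece in acc.stream(fake_llm_stream()):
        print(f"[CONTENT] {len(piece)} chars")
    return acc.get_final_result()


result = asyncio.run(main())
print(result["arcs"][0]["title"])
```

Inside a notebook cell, where an event loop is already running, `result = await main()` would replace the `asyncio.run(main())` line.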


In [15]:
# Cell 4: Test ArcSplitter Agent - New Streaming Architecture

print("\n🔄 Testing ArcSplitter Agent - NEW STREAMING ARCHITECTURE")
print("=" * 60)
print("This demonstrates the new clean separation of concerns:")
print("📡 Streaming: Raw LLM responses with proper chunk type labels")
print("🎯 State Management: Internal accumulation and parsing")
print("🔗 Result Access: Clean final result via get_final_result()")

if not AVAILABLE_PROVIDERS:
    print("❌ No API keys available. Please add API keys to your .env file.")
else:
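    # api_key is passed to analyze_story_streaming() below, so it must be defined in this cell.
    # Assumption: the agent's built-in default model belongs to this provider.
    provider = "google"  # Use Google for streaming demo
    api_key = AVAILABLE_PROVIDERS[provider]['api_key']
    # Optional model override (also uncomment default_provider/default_model below):
    # model = LLMService.get_default_test_model_name_for_provider(provider)
    # model = "claude-sonnet-4-20250514"
    # model = "gemini-2.5-flash-lite-preview-06-17"
    # model = "o4-mini"

    # print(f"🔧 Streaming with: {provider} / {model}")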
    print(f"📊 Story: {input_story.metadata.title} ({len(input_story.chapters)} chapters)")
    
    try:
        # Initialize the agent
        arc_splitter_streaming = ArcSplitterAgent(
            llm_service=llm_service,
            # default_provider=provider,
            # default_model=model,
            temperature=0.7,
            max_tokens=16000,
            thinking=ThinkingEffort.LOW
        )
        
        print("\n⚡ Starting streaming analysis...")
        print("📡 Live stream output (by chunk type):")
        print("-" * 40)
        
        # Import ChunkType for proper enum usage
        from src.schemas.llm.models import ChunkType
        
        # Track streaming responses by type - Using enum constants
        chunk_counts = {ChunkType.THINKING: 0, ChunkType.CONTENT: 0, ChunkType.UNKNOWN: 0}
        total_content_chars = 0
        
        async for chunk in arc_splitter_streaming.analyze_story_streaming(
            story=input_story,
            user_id=fake_owner_id,
            api_key=api_key,
        ):
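            # Use the enum directly as the dict key - no need for .value.
            # .get() avoids a KeyError if a chunk type outside the three above appears.
            chunk_type = chunk.chunk_type
            chunk_counts[chunk_type] = chunk_counts.get(chunk_type, 0) + 1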
            
            # Get metadata for debugging
            metadata = chunk.metadata or {}
            chunk_num = metadata.get("chunk_number", "?")
            chapters = metadata.get("chapters", "?")
            
            # Show chunk info - use .value for display
            print(f"[{chunk_type.value.upper()}] Chunk {chunk_num} ({chapters}): {len(chunk.content)} chars")
            
            # For CONTENT chunks, show a preview of the actual content
            if chunk_type == ChunkType.CONTENT and chunk.content:
                total_content_chars += len(chunk.content)
                # Show first bit of content for CONTENT chunks
                preview = chunk.content[:100].replace("\n", "\\n")
                print(f"  📝 Content preview: {preview}...")
            
            # For THINKING chunks, just show that we got thinking
            elif chunk_type == ChunkType.THINKING and chunk.content:
                print(f"  🤔 Thinking: {len(chunk.content)} chars")
        
        print("-" * 40)
        print("✅ Streaming completed!")
        print(f"📊 Chunk counts: {chunk_counts}")
        print(f"📝 Total content characters: {total_content_chars}")
        
        # Get internal chunk processing details
        chunk_results = arc_splitter_streaming.get_window_results()
        print("\n🔍 Internal Chunk Processing Summary:")
        for chunk_result in chunk_results:
            print(f"  Chunk {chunk_result.window_number}: {chunk_result.chapters_range}")
            raw = chunk_result.raw_content
            print(f"    📝 Accumulated: T={len(raw.thinking)}, C={len(raw.content)}, U={len(raw.unknown)} chars")
            if chunk_result.parsed_result:
                print(f"    ✅ Parsed: {len(chunk_result.parsed_result.arcs)} arcs")
            if chunk_result.error:
                print(f"    ❌ Error: {chunk_result.error}")
        
        # Get the final merged result (no manual parsing needed!)
        print("\n🎯 Getting final result...")
        try:
            final_result = arc_splitter_streaming.get_final_result()
            
            print("✅ Final result ready!")
            print(f"📖 Total arcs generated: {len(final_result.arcs)}")
            
            # Show all arcs
            for i, arc in enumerate(final_result.arcs):
                print(f"\n🏛️  Arc {i+1}: {arc.title}")
                print(f"   📖 Chapters: {arc.start_chapter}-{arc.end_chapter}")
                print(f"   📝 Summary: {arc.summary[:100]}...")
                print(f"   🔒 Finalized: {arc.is_finalized}")
                
        except Exception as parse_error:
            print(f"❌ Could not get final result: {parse_error}")
            print("💡 Check the chunk processing summary above for errors")
        
    except Exception as e:
        print(f"❌ Streaming test failed: {e}")
        import traceback
        traceback.print_exc()


🔄 Testing ArcSplitter Agent - NEW STREAMING ARCHITECTURE
This demonstrates the new clean separation of concerns:
📡 Streaming: Raw LLM responses with proper chunk type labels
🎯 State Management: Internal accumulation and parsing
🔗 Result Access: Clean final result via get_final_result()
📊 Story: Pokemon: Ambertwo (8 chapters)

⚡ Starting streaming analysis...
📡 Live stream output (by chunk type):
----------------------------------------
2025-06-29 16:17:57,477 - src.agents.wikigen.arc_splitter - INFO - 🔄 Starting new analysis: fe0f32e1-bad3-4b5e-8f49-4b8acf5c4b38
2025-06-29 16:17:57,478 - src.agents.wikigen.arc_splitter - INFO - 📐 Model Context Window: 1,048,576 tokens
2025-06-29 16:17:57,479 - src.agents.wikigen.arc_splitter - INFO - 📐 Chunk Limit: 1,043,576 tokens (after 5,000 overhead)
2025-06-29 16:17:57,480 - src.agents.wikigen.arc_splitter - INFO - 🔍 ArcSplitter Analysis Starting:
2025-06-29 16:17:57,480 - src.agents.wikigen.arc_splitter - INFO -    📖 Story: Pokemon: Ambertwo
202

In [None]:
print(chunk_results[0].raw_content.thinking)

In [None]:
print(final_result.model_dump_json(indent=2))

## WikiPlanner Agent

In [22]:
#
#
#

## ArticleWriter Agent

In [23]:
#
#
#

## GeneralSummarizer Agent

In [24]:
#
#
#

## WikiGenOrchestrator

In [25]:
#
#
#