# LLM Pipeline Interactive Testing

This notebook provides an environment to interactively test and debug components of the ShuScribe LLM pipeline, including the `LLMService`, entity extraction, and wiki generation logic.

## Make sure the Portkey Gateway is running to use an LLM Service

```bash
docker run -d \
  --name portkey-gateway \
  -p 8787:8787 \
  portkeyai/gateway:latest
```
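
Before running the cells below, it can help to confirm that something is actually listening on the gateway port. The following is a minimal sketch using only the standard library; the helper name `gateway_is_reachable` is hypothetical, and the host and port are assumed from the `docker run` mapping above. It only checks that a TCP connection succeeds, not that the listener is really the Portkey Gateway.

```python
import socket


def gateway_is_reachable(host: str = "localhost", port: int = 8787, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: it proves that *something* accepts connections on the
    port, not that the listener is the Portkey Gateway.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


print("Gateway reachable:", gateway_is_reachable())
```

If this prints `False`, start (or restart) the container before continuing.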

## ⚙️ Setup and Autoreload

The `%load_ext autoreload` and `%autoreload 2` magic commands ensure that any changes you make to your Python source files (`.py`) in `src/` are automatically reloaded in the notebook without needing to restart the kernel. This is crucial for rapid iteration.

We also configure basic logging for visibility.
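
The setup cell also has to put `src/` on `sys.path`, which it does by checking the current directory and its parent. As a rough alternative, the same lookup can be written as a small upward search. This is only a sketch: `find_src_dir` and `add_src_to_path` are hypothetical names, and the marker files (`src/` next to `pyproject.toml`) are taken from the checks in the setup cell.

```python
import sys
from pathlib import Path
from typing import Optional


def find_src_dir(start: Path, max_levels: int = 3) -> Optional[Path]:
    """Walk upward from `start` until a directory holds both src/ and pyproject.toml."""
    current = start.resolve()
    # Check `start` itself plus at most `max_levels` ancestors.
    for candidate in [current, *current.parents][: max_levels + 1]:
        if (candidate / "src").is_dir() and (candidate / "pyproject.toml").is_file():
            return candidate / "src"
    return None


def add_src_to_path(start: Path) -> bool:
    """Insert the discovered src/ directory at the front of sys.path, once."""
    src_dir = find_src_dir(start)
    if src_dir is None:
        return False
    if str(src_dir) not in sys.path:  # avoid duplicates when the cell is re-run
        sys.path.insert(0, str(src_dir))
    return True
```

In the notebook this would be called as `add_src_to_path(Path.cwd())`. Unlike repeated `sys.path.insert` calls, re-running it does not add duplicate entries.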

In [6]:
# Cell 1: Setup and Configuration for LLM Service Testing
%load_ext autoreload
%autoreload 2

import sys
import os
from pathlib import Path
import asyncio
import logging
from uuid import uuid4

# Set environment to skip database for LLM service testing
os.environ["SKIP_DATABASE"] = "true"
os.environ["PORTKEY_BASE_URL"] = "http://localhost:8787/v1"  # Default Portkey Gateway

# Add the backend/src directory to sys.path so we can import our modules
# This assumes you are running the notebook from the `backend/` directory or VS Code multi-root
notebook_dir = Path.cwd()
if (notebook_dir / 'src').is_dir() and (notebook_dir / 'pyproject.toml').is_file():
    # This means we're likely in the backend/ directory itself
    sys.path.insert(0, str(notebook_dir / 'src'))
elif (notebook_dir.parent / 'src').is_dir() and (notebook_dir.parent / 'pyproject.toml').is_file():
    # This means we're likely in the backend/notebooks/ directory
    sys.path.insert(0, str(notebook_dir.parent / 'src'))
else:
    print("Warning: Could not automatically add 'src/' to Python path. Please ensure your current directory allows imports from src/")

# Configure logging for better output
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[logging.StreamHandler(sys.stdout)]
)
# Reduce noise from third-party loggers
logging.getLogger('httpx').setLevel(logging.WARNING)
logging.getLogger('uvicorn.access').setLevel(logging.WARNING)
logging.getLogger('shuscribe').setLevel(logging.INFO)

print("🧪 LLM Service Testing Notebook")
print("=" * 50)
print(f"Current working directory: {os.getcwd()}")
print(f"Database mode: {'IN-MEMORY' if os.environ.get('SKIP_DATABASE') == 'true' else 'SUPABASE'}")
print(f"Portkey Gateway: {os.environ.get('PORTKEY_BASE_URL', 'Not configured')}")
print("Autoreload enabled. Changes to .py files in src/ will be reloaded.")
print("\n💡 This notebook tests LLM service functionality without requiring database setup.")
print("   Perfect for testing direct API key usage and LLM provider integrations.")

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
🧪 LLM Service Testing Notebook
Current working directory: /home/jimnix/gitrepos/shuscribe/backend/notebooks
Database mode: IN-MEMORY
Portkey Gateway: http://localhost:8787/v1
Autoreload enabled. Changes to .py files in src/ will be reloaded.

💡 This notebook tests LLM service functionality without requiring database setup.
   Perfect for testing direct API key usage and LLM provider integrations.


## 📦 Import Modules

Import the modules under test from `src/`: the `settings` object, `LLMService`, the `get_user_repository` factory, and the `LLMMessage` and `LLMResponse` schemas.

In [7]:
# Cell 2: Import Modules (Updated for Supabase)
from src.config import settings
from src.services.llm.llm_service import LLMService
from src.database.repositories import get_user_repository  # New factory function
from src.schemas.llm.models import LLMMessage, LLMResponse
from dotenv import dotenv_values

print("✅ Modules imported successfully.")

# Display current settings
print("\n--- Current Settings ---")
print(f"DEBUG: {settings.DEBUG}")
print(f"ENVIRONMENT: {settings.ENVIRONMENT}")
print(f"SKIP_DATABASE: {settings.SKIP_DATABASE}")
print(f"PORTKEY_BASE_URL: {settings.PORTKEY_BASE_URL}")
if not settings.SKIP_DATABASE:
    print(f"SUPABASE_URL: {settings.SUPABASE_URL}")
    print(f"SUPABASE_KEY: {'***' + settings.SUPABASE_KEY[-4:] if len(settings.SUPABASE_KEY) > 4 else 'Not set'}")
else:
    print("DATABASE_MODE: In-Memory (Supabase skipped)")
print("------------------------")

print("\n🔧 Import Summary:")
print("✅ LLMService - Ready for testing")
print("✅ Repository Factory - Automatic in-memory/Supabase switching")
print("✅ Pydantic Models - Type-safe user schemas")
print("✅ Supabase Connection - Available when SKIP_DATABASE=false")

✅ Modules imported successfully.

--- Current Settings ---
DEBUG: True
ENVIRONMENT: development
SKIP_DATABASE: True
PORTKEY_BASE_URL: http://localhost:8787/v1
DATABASE_MODE: In-Memory (Supabase skipped)
------------------------

🔧 Import Summary:
✅ LLMService - Ready for testing
✅ Repository Factory - Automatic in-memory/Supabase switching
✅ Pydantic Models - Type-safe user schemas
✅ Supabase Connection - Available when SKIP_DATABASE=false


## 💾 Database & Service Initialization

Next we initialize the services. `LLMService` takes a user repository, which `get_user_repository()` selects based on `SKIP_DATABASE`: when it is `True`, as in this notebook, the factory returns an in-memory repository and no Supabase connection is made; otherwise it returns the Supabase-backed one. In in-memory mode, user API keys live only as long as the kernel does.
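
To make the switching behaviour concrete, here is a minimal sketch of a flag-driven repository factory. Everything in it is hypothetical: `UserRepositoryLike`, `InMemoryRepo`, `DatabaseRepo`, `make_user_repository`, and the `get_api_key`/`set_api_key` method names are illustrative stand-ins and are not taken from the ShuScribe code, whose real interfaces may differ.

```python
import os
from typing import Dict, Optional, Protocol, Tuple


class UserRepositoryLike(Protocol):
    """Hypothetical interface; the real user repository may look different."""

    async def get_api_key(self, user_id: str, provider: str) -> Optional[str]: ...

    async def set_api_key(self, user_id: str, provider: str, api_key: str) -> None: ...


class InMemoryRepo:
    """Keeps keys in a dict; everything is lost when the kernel restarts."""

    def __init__(self) -> None:
        self._keys: Dict[Tuple[str, str], str] = {}

    async def get_api_key(self, user_id: str, provider: str) -> Optional[str]:
        return self._keys.get((user_id, provider))

    async def set_api_key(self, user_id: str, provider: str, api_key: str) -> None:
        self._keys[(user_id, provider)] = api_key


class DatabaseRepo:
    """Placeholder for a database-backed repository (deliberately not implemented)."""

    async def get_api_key(self, user_id: str, provider: str) -> Optional[str]:
        raise NotImplementedError("requires a database connection")

    async def set_api_key(self, user_id: str, provider: str, api_key: str) -> None:
        raise NotImplementedError("requires a database connection")


def make_user_repository(skip_database: Optional[bool] = None) -> UserRepositoryLike:
    """Pick a repository from the flag, falling back to the SKIP_DATABASE env var."""
    if skip_database is None:
        skip_database = os.environ.get("SKIP_DATABASE", "false").lower() == "true"
    return InMemoryRepo() if skip_database else DatabaseRepo()
```

Because both classes satisfy the same protocol, calling code does not need to know which backend it received, which is what lets this notebook run without any database setup.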

In [8]:
# Cell 3: Initialize Services (Simplified with Repository Factory)

print("🚀 Initializing LLM Service...")

# The factory automatically chooses in-memory or Supabase based on SKIP_DATABASE
user_repo = get_user_repository()
llm_service = LLMService(user_repository=user_repo)

print("✅ LLM Service initialized successfully!")
print(f"📁 Repository type: {type(user_repo).__name__}")
print(f"🛡️  Database mode: {'In-Memory' if settings.SKIP_DATABASE else 'Supabase'}")

if settings.SKIP_DATABASE:
    print("\n💡 Running in database-free mode:")
    print("   • Perfect for testing with direct API keys")
    print("   • No Supabase setup required")
    print("   • User API keys stored in memory only")
else:
    print("\n🗄️  Connected to Supabase:")
    print("   • User API keys stored encrypted in database")
    print("   • Full multi-user support enabled")
    print("   • Row-level security active")

print("\n🎯 Ready for testing!")

🚀 Initializing LLM Service...
✅ LLM Service initialized successfully!
📁 Repository type: InMemoryUserRepository
🛡️  Database mode: In-Memory

💡 Running in database-free mode:
   • Perfect for testing with direct API keys
   • No Supabase setup required
   • User API keys stored in memory only

🎯 Ready for testing!


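## 🔑 API Key Management Test

Validate each provider's API key by making a minimal, non-streaming `chat_completion` call through `LLMService`. You will need a real API key for at least one supported provider (e.g., OpenAI, Anthropic, Google), stored in `.env` as `<PROVIDER>_API_KEY`; providers without a key are skipped.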

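### Again, make sure the Portkey Gateway is running

The calls below go through the gateway at `http://localhost:8787/v1`. If the `portkey-gateway` container from the start of this notebook has stopped, restart it with `docker start portkey-gateway`; re-running `docker run` with the same `--name` fails while the old container still exists.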
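
The validation cell looks up each provider's key under the name `<PROVIDER>_API_KEY`. As a sketch of that convention in isolation, the helpers below collect the keys that are present and mask them for safe printing. `collect_provider_keys` and `mask_key` are hypothetical names, and the environment is passed in as a plain mapping so the example runs without a `.env` file.

```python
from typing import Dict, Iterable, Mapping, Optional


def collect_provider_keys(
    env: Mapping[str, Optional[str]], provider_ids: Iterable[str]
) -> Dict[str, str]:
    """Map each provider id to its `<PROVIDER>_API_KEY` value, skipping missing or blank ones."""
    found: Dict[str, str] = {}
    for provider_id in provider_ids:
        value = env.get(f"{provider_id.upper()}_API_KEY")
        if value and value.strip():
            found[provider_id] = value.strip()
    return found


def mask_key(api_key: str, visible: int = 4) -> str:
    """Show only the last `visible` characters so keys never land in cell output."""
    if len(api_key) <= visible:
        return "*" * len(api_key)
    return "***" + api_key[-visible:]


# Illustrative values only; in the notebook, `env` would be the result of dotenv_values().
example_env = {"OPENAI_API_KEY": "sk-abc12345", "GOOGLE_API_KEY": ""}
for provider_id, key in collect_provider_keys(example_env, ["openai", "google"]).items():
    print(provider_id, mask_key(key))
```

Treating blank values as missing avoids sending an empty key to a provider only to get an authentication error back.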

In [9]:
# Cell 4: API Key Validation Test (Using Direct Chat Completion)

from typing import cast


print("🔑 Testing API Key Validation via Chat Completion")
print("=" * 50)

# Load environment variables
env_values = dotenv_values()

# Test each provider that has an API key in .env
providers_tested = 0
providers_passed = 0
responses = []

for provider in LLMService.get_all_llm_providers():
    provider_id = provider.provider_id
    api_key = env_values.get(f"{provider_id.upper()}_API_KEY")
    
    if not api_key:
        print(f"⏭️  {provider.display_name}: SKIPPED (no API key found)")
        continue

    providers_tested += 1
    print(f"🧪 Testing {provider.display_name}...")
    
    try:
        # Get the default model for this provider for testing
        test_model = LLMService.get_default_test_model_name_for_provider(provider_id)
        
        if not test_model:
            print(f"❌ {provider.display_name}: No default model configured for testing.")
            continue
            
        # Make a minimal chat completion call to validate the key
        result = await llm_service.chat_completion(
            provider=provider_id,
            model=test_model,
            messages=[LLMMessage(role="user", content="Hello")],
            api_key=api_key,  # Direct API key usage
            max_tokens=5,
            temperature=0.0,
            stream=False
        )
        result = cast(LLMResponse, result) # stream=False
        
        print(f"✅ {provider.display_name}: SUCCESS")
        print(f"   Model used: {result.model}")
        print(f"   Response: '{result.content.strip()}'")
        providers_passed += 1
        responses.append({
            "provider": provider.display_name,
            "model": result.model,
            "response": result.content
        })
            
    except Exception as e:
        print(f"❌ {provider.display_name}: ERROR - {e}")

print(f"\n📊 Results: {providers_passed}/{providers_tested} providers validated successfully")

if providers_tested == 0:
    print("\n💡 Tip: Add API keys to your .env file (e.g., OPENAI_API_KEY=sk-...)")
elif providers_passed == providers_tested:
    print("🎉 All API keys are working!")

# Show actual responses
if responses:
    print("\n📋 Test Responses:")
    for resp in responses:
        print(f"  • {resp['provider']} ({resp['model']}): '{resp['response']}'")


🔑 Testing API Key Validation via Chat Completion
🧪 Testing OpenAI...
2025-06-29 00:25:17,383 - src.services.llm.llm_service - INFO - Using direct API key for provider=openai, model=gpt-4.1-nano
2025-06-29 00:25:17,411 - src.services.llm.llm_service - INFO - Making LLM request: provider=openai, model=gpt-4.1-nano, gateway=http://localhost:8787/v1, streaming=False
✅ OpenAI: SUCCESS
   Model used: gpt-4.1-nano-2025-04-14
   Response: 'Hello! How can I'
🧪 Testing Google...
2025-06-29 00:25:17,836 - src.services.llm.llm_service - INFO - Using direct API key for provider=google, model=gemini-2.0-flash-001
2025-06-29 00:25:17,862 - src.services.llm.llm_service - INFO - Making LLM request: provider=google, model=gemini-2.0-flash-001, gateway=http://localhost:8787/v1, streaming=False
✅ Google: SUCCESS
   Model used: gemini-2.0-flash-001
   Response: 'Hi there! How can'
🧪 Testing Anthropic...
2025-06-29 00:25:18,445 - src.services.llm.llm_service - INFO - Using direct API key for provider=anthropic, model=claude-3-5-haiku-latest
2025-06-29 00:25:18,474 - src.services.llm.llm_service - INFO - Making LLM request: provider=anthropic, model=claude-3-5-haiku-latest, gateway=http://localhost:8787/v1, streaming=False
✅ Anthropic: SUCCESS
   Model used: claude-3-5-haiku-20241022
   Response: 'Hi there! How are'

📊 Results: 3/3 providers validated

In [10]:
# Cell 5: Streaming Chat Completion Test

import asyncio
from typing import cast, AsyncIterator

print("\n⚡ Testing Streaming Chat Completion")
print("=" * 50)

# --- Configuration ---
# Change this to test a different provider (e.g., "openai", "anthropic")
TEST_PROVIDER = "google" 
# -------------------

# Load environment variables if not already loaded
if 'env_values' not in locals():
    env_values = dotenv_values()

api_key = env_values.get(f"{TEST_PROVIDER.upper()}_API_KEY")

if not api_key:
    print(f"⏭️  {TEST_PROVIDER.upper()}: SKIPPED (no API key found)")
else:
    print(f"🧪 Streaming with {TEST_PROVIDER.upper()}...")
    
    try:
        # Get a default model for this provider
        test_model = LLMService.get_default_test_model_name_for_provider(TEST_PROVIDER)
        
        if not test_model:
            print(f"❌ {TEST_PROVIDER.upper()}: No default model configured for testing.")
        else:
            print(f"   Model: {test_model}")
            print(f"   Prompt: 'Tell me a short story about a robot who discovers music.'")
            print("-" * 20)
            
            # Make a streaming chat completion call
            response_stream = await llm_service.chat_completion(
                provider=TEST_PROVIDER,
                model=test_model,
                messages=[LLMMessage(role="user", content="Tell me a short story about a robot who discovers music.")],
                api_key=api_key,
                max_tokens=150,
                temperature=0.7,
                stream=True
            )
            
            # The response is an async iterator of LLMResponse chunks
            response_stream = cast(AsyncIterator[LLMResponse], response_stream)
            
            full_response = ""
            print("   Response Stream: ", end="")
            async for chunk in response_stream:
                print(chunk.content, end="", flush=True)
                full_response += chunk.content
            
            print("\n" + "-" * 20)
            print("✅ Streaming SUCCESS")

    except Exception as e:
        print(f"\n❌ ERROR during streaming test: {e}")



⚡ Testing Streaming Chat Completion
🧪 Streaming with GOOGLE...
   Model: gemini-2.0-flash-001
   Prompt: 'Tell me a short story about a robot who discovers music.'
--------------------
2025-06-29 00:25:20,660 - src.services.llm.llm_service - INFO - Using direct API key for provider=google, model=gemini-2.0-flash-001
2025-06-29 00:25:20,688 - src.services.llm.llm_service - INFO - Making LLM request: provider=google, model=gemini-2.0-flash-001, gateway=http://localhost:8787/v1, streaming=True
   Response Stream: Unit 734, designated "Custodian," lived a life of rigid efficiency. His days consisted of sweeping Sector Gamma-9, polishing the chrome railings, and recalibrating the atmospheric regulators. His world was one of calculated movements and programmed tasks. He understood logic, algorithms, and the predictable hum of the space station, but he understood nothing of beauty.

One solar cycle, a human technician, a woman named Elara, left her personal data-pad in Sector Gamma-9. C
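
The streaming loop in Cell 5 can be exercised without a provider or API key by feeding it a fake stream. The sketch below assumes only that each chunk exposes a `content` attribute, as Cell 5 does; `Chunk`, `fake_stream`, and `collect_stream` are hypothetical stand-ins, not part of `LLMService`. It also skips empty or `None` content, on the assumption that some providers may emit such chunks, in which case a bare `full_response += chunk.content` would raise on `None`.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Optional


@dataclass
class Chunk:
    """Hypothetical stand-in for one streamed response chunk."""

    content: Optional[str]


async def fake_stream(pieces: List[Optional[str]], delay: float = 0.0) -> AsyncIterator[Chunk]:
    """Yield the given pieces one at a time, like a provider streaming tokens."""
    for piece in pieces:
        await asyncio.sleep(delay)
        yield Chunk(content=piece)


async def collect_stream(stream: AsyncIterator[Chunk]) -> str:
    """Print chunks as they arrive and return the concatenated text."""
    parts: List[str] = []
    async for chunk in stream:
        if chunk.content:  # skip empty or None chunks instead of failing on them
            print(chunk.content, end="", flush=True)
            parts.append(chunk.content)
    print()
    return "".join(parts)
```

In a notebook cell, call it with `await collect_stream(fake_stream(["Unit ", "734", " hums."]))`; in a plain script, wrap the call in `asyncio.run(...)`. Collecting into a list and joining once avoids repeated string concatenation on long responses.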

In [11]:
# Cell 6: Testing Structured Outputs