# Intent LLM Playground

Use this notebook to exercise the intent classifier and slot filler end to end. Configure your Ollama endpoint, enter some artifact text, and capture the raw responses before wiring them into the worker.

> Tip: keep Ollama running (`ollama serve`) and ensure the requested model is already pulled. You can also swap in a mock responder if you want deterministic outputs (see the optional section below).
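
Before running the cells below, it can help to confirm that the endpoint is reachable and that the model has been pulled. The following is a minimal standard-library sketch. It assumes Ollama's `GET /api/tags` endpoint, which lists locally available models with tagged names such as `llama3.2:latest`; the helper names `fetch_ollama_tags` and `model_is_pulled` are illustrative and not part of Haven.

```python
import json
import urllib.request
from typing import Any, Dict


def fetch_ollama_tags(base_url: str = "http://localhost:11434", timeout: float = 5.0) -> Dict[str, Any]:
    """Return the parsed JSON from Ollama's model listing endpoint (needs a running server)."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/api/tags", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def model_is_pulled(tags_payload: Dict[str, Any], model: str) -> bool:
    """Check whether `model` appears in a /api/tags style payload."""
    names = {entry.get("name") for entry in tags_payload.get("models", [])}
    # Assumption: listed names carry an explicit tag, so a bare name means ":latest".
    wanted = model if ":" in model else f"{model}:latest"
    return wanted in names


# Example with a canned payload, so no network access is needed:
example_payload = {"models": [{"name": "llama3.2:latest"}]}
print(model_is_pulled(example_payload, "llama3.2"))  # True
print(model_is_pulled(example_payload, "mistral"))   # False
```

With a live server, `model_is_pulled(fetch_ollama_tags(), "llama3.2:latest")` should report whether the default model used in this notebook is available.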



In [1]:
import os
INTENT_MODEL = os.getenv("INTENT_MODEL", "llama3.2:latest")
DOCUMENT_ID = "775ecfe1-7508-44af-95a0-3029b8f7fc97"

# ---------------------------------------------------------------------------
# Database connection setup
# ---------------------------------------------------------------------------
# Override DATABASE_URL to use localhost when running locally (outside Docker)
# The default connection string uses 'postgres' hostname which only works inside Docker
if "DATABASE_URL" not in os.environ:
    # Default to localhost for local development
    os.environ["DATABASE_URL"] = "postgresql://postgres:postgres@localhost:5432/haven"
elif "@postgres:" in os.environ.get("DATABASE_URL", ""):
    # If DATABASE_URL uses 'postgres' hostname, replace with 'localhost' for local notebook usage
    os.environ["DATABASE_URL"] = os.environ["DATABASE_URL"].replace("@postgres:", "@localhost:")

DATABASE_URL = os.environ["DATABASE_URL"]
print(DATABASE_URL)

postgresql://postgres:postgres@localhost:5432/haven
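
The string replacement in the cell above only matches URLs that carry an explicit port after the hostname (`@postgres:`). A slightly more general rewrite can be sketched with `urllib.parse`. The helper name `use_localhost` is hypothetical and is not part of the notebook or of Haven.

```python
from urllib.parse import urlsplit, urlunsplit


def use_localhost(database_url: str, docker_host: str = "postgres") -> str:
    """Swap a Docker-only hostname for localhost, leaving everything else intact."""
    parts = urlsplit(database_url)
    if parts.hostname != docker_host:
        return database_url
    # netloc looks like "user:pass@host:port"; rebuild it around the new host.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    _, colon, port = hostport.partition(":")
    new_netloc = f"{userinfo}{at}localhost{colon}{port}"
    return urlunsplit(parts._replace(netloc=new_netloc))


print(use_localhost("postgresql://postgres:postgres@postgres:5432/haven"))
# postgresql://postgres:postgres@localhost:5432/haven
print(use_localhost("postgresql://postgres:postgres@postgres/haven"))
# postgresql://postgres:postgres@localhost/haven
```

Comparing the parsed hostname instead of a substring also avoids touching a password or database name that happens to contain `postgres`.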


In [2]:
from __future__ import annotations

import json
import sys
from pathlib import Path
from typing import Any, Dict, List, Optional, Union
from uuid import uuid4

import httpx

# ---------------------------------------------------------------------------
# Ensure the Haven project root (and src/) are importable before using modules
# ---------------------------------------------------------------------------
def resolve_project_root() -> Path:
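    # Path("") becomes Path("."), which is always truthy, so check the raw env value first
    env_value = os.getenv("HAVEN_PROJECT_ROOT", "")
    if env_value:
        env_root = Path(env_value).expanduser().resolve()
        if (env_root / "src" / "haven").exists():
            return env_root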

    cwd = Path.cwd().resolve()
    if (cwd / "src" / "haven").exists():
        return cwd

    if (cwd.parent / "src" / "haven").exists():
        return cwd.parent

    raise RuntimeError(
        "Unable to locate Haven project root. Set HAVEN_PROJECT_ROOT or launch the notebook from the repo root."
    )


PROJECT_ROOT = resolve_project_root()
SRC_PATH = PROJECT_ROOT / "src"
for candidate in (PROJECT_ROOT, SRC_PATH):
    path_str = str(candidate)
    if path_str not in sys.path:
        sys.path.insert(0, path_str)

from haven.intents.classifier.classifier import ClassifierSettings, classify_artifact
from haven.intents.classifier.taxonomy import IntentTaxonomy, load_taxonomy
from haven.intents.models import ClassificationResult
from haven.intents.slots import SlotFiller, SlotFillerResult, SlotFillerSettings
from haven.intents.utils import resolve_sender_name
from shared.logging import setup_logging
from shared.db import get_connection
from shared.models_v2 import Document
from shared.context_utilities import format_context, get_context_by_document_id
from uuid import UUID
import psycopg
from psycopg.rows import dict_row

# ---------------------------------------------------------------------------
# Configure runtime knobs here
# ---------------------------------------------------------------------------
OLLAMA_BASE_URL = 'http://localhost:11434' # os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
DEFAULT_TAXONOMY_PATH = PROJECT_ROOT / "services/worker_service/taxonomies/taxonomy_v1.0.0.yaml"

# Optional: point to an alternate taxonomy payload if you're testing new definitions
CUSTOM_TAXONOMY_PATH = os.getenv("INTENT_TAXONOMY_PATH")

from shared.db import get_conn_str

setup_logging()
print(f"Project root: {PROJECT_ROOT}")
print(f"Using Ollama endpoint: {OLLAMA_BASE_URL}")
print(f"Model: {INTENT_MODEL}")
print(f"Taxonomy: {CUSTOM_TAXONOMY_PATH or DEFAULT_TAXONOMY_PATH}")
print(f"Database: {DATABASE_URL}")



Project root: /Users/chrispatten/workspace/haven
Using Ollama endpoint: http://localhost:11434
Model: llama3.2:latest
Taxonomy: /Users/chrispatten/workspace/haven/services/worker_service/taxonomies/taxonomy_v1.0.0.yaml
Database: postgresql://postgres:postgres@localhost:5432/haven


In [3]:
context = get_context_by_document_id(DOCUMENT_ID)
print(format_context(context))


Message:
Chris Patten at 2025-11-05T18:05Z - ve got a work thing at Clarys just down the street so I can come grab you and if we have time to kill before the next train we can go back there

Previous Messages:
Chris Patten at 2025-11-05T18:04Z - "__kIMFileTransferGUIDAttributeName)at_0_B2C2CECB-7753-4BDE-968B-35590F83BD6E__kIMMessagePartAttributeNameNSNumber
Chris Patten at 2025-11-05T18:03Z - Dartmouth Street. The bus is super easy, you go past baggage claim and theres TONS of signs for the different busses. This one stops at the Logan express area curbside and is orange with a big BACK BAY painted on it
Stephanie Patten at 2025-11-05T18:02Z - ï¿¼Which location do I get off at if I can even find the bus at the airport
Chris Patten at 2025-11-05T12:07Z - 16;FHJL]_acegiktvxzï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½
Chris Patten at 2025-11-05T12:07Z - https://www.massport.com/logan-airport/getting-to-logan/logan-express/back-bay


Detected Entities:
Clarys - per

In [3]:
def test_database_connection() -> bool:
    """Test database connection and print helpful diagnostics."""
    try:
        print(f"Testing database connection...")
        
        with get_connection() as conn:
            with conn.cursor() as cur:
                cur.execute("SELECT version();")
                version = cur.fetchone()[0]
                print(f"✓ Database connection successful!")
                print(f"  PostgreSQL version: {version.split(',')[0]}")
                return True
    except psycopg.OperationalError as exc:
        print(f"✗ Database connection failed: {exc}")
        print(f"\nTroubleshooting:")
        print(f"  1. Check if database is running: docker compose ps")
        print(f"  2. Start database if needed: docker compose up -d postgres")
        print(f"  3. Verify DATABASE_URL environment variable")
        print(f"  4. For Docker: ensure host is 'localhost' (not 'postgres')")
        return False
    except Exception as exc:
        print(f"✗ Unexpected error: {exc}")
        print(f"  Error type: {type(exc).__name__}")
        return False


def load_intent_taxonomy(path: Optional[Path] = None) -> IntentTaxonomy:
    resolved = Path(path or CUSTOM_TAXONOMY_PATH or DEFAULT_TAXONOMY_PATH).expanduser()
    if not resolved.exists():
        raise FileNotFoundError(f"Taxonomy file not found: {resolved}")
    return load_taxonomy(resolved)


def build_slot_filler() -> SlotFiller:
    return SlotFiller(
        SlotFillerSettings(
            ollama_base_url=OLLAMA_BASE_URL,
            slot_model=INTENT_MODEL,
            request_timeout=30.0,
        )
    )


def fetch_document_by_id(doc_id: Union[str, UUID]) -> Optional[Document]:
    """Fetch a document by its doc_id from the database."""
    try:
        doc_uuid = UUID(str(doc_id))
    except ValueError as exc:
        print(f"Invalid document ID format: {doc_id}")
        return None
    
    try:
        with get_connection() as conn:
            with conn.cursor(row_factory=dict_row) as cur:
                cur.execute(
                    """
                    SELECT *
                    FROM documents
                    WHERE doc_id = %s
                    ORDER BY is_active_version DESC, version_number DESC
                    LIMIT 1
                    """,
                    (doc_uuid,),
                )
                row = cur.fetchone()
                if not row:
                    print(f"Document not found in database: {doc_id}")
                    return None
                return Document(**row)
    except psycopg.OperationalError as exc:
        print(f"Database connection error: {exc}")
        print(f"Current DATABASE_URL: {DATABASE_URL}")
        print("\nTip: Make sure:")
        print("  1. Database is running (check with: docker compose ps)")
        print("  2. DATABASE_URL environment variable is set correctly")
        print("  3. For Docker: use 'localhost' as host")
        print("  4. For local Postgres: ensure host/port/credentials are correct")
        return None
    except Exception as exc:
        print(f"Error fetching document: {exc}")
        print(f"Error type: {type(exc).__name__}")
        return None

def get_sender_name(doc: Document) -> Optional[str]:
    """Resolve the sender name for a document.
    
    If the document's people list has an entry with the sender role, resolve
    that entry's identifier to a name. Returns None when no sender is listed.
    """
    people = doc.people or []
    if people:
        print(f"People: {people}")
        for person in people:
            if person.get("role") == "sender":
                print(f"Sender: {person}")
                return resolve_sender_name(person.get("identifier"), DATABASE_URL)
    return None
    
    

def fetch_thread_context(doc_id: UUID, limit: int = 5) -> Optional[List[Dict[str, Any]]]:
    """Fetch recent messages from the same thread for conversational context.
    
    Returns up to `limit` previous messages from the thread, ordered by timestamp.
    Each message includes text, sender (with resolved person names), and timestamp for pronoun resolution.
    
    This is a wrapper around the shared utility function from haven.intents.utils.
    """
    from haven.intents.utils import fetch_thread_context as _fetch_thread_context
    import os
    
    database_url = os.getenv("DATABASE_URL", "postgresql://postgres:postgres@localhost:5432/haven")
    
    try:
        return _fetch_thread_context(
            database_url=database_url,
            doc_id=doc_id,
            limit=limit,
            time_window_hours=8,  # Match notebook's original behavior
        )
    except Exception as exc:
        print(f"Warning: Failed to fetch thread context: {exc}")
        return None


def fetch_self_person_id() -> Optional[UUID]:
    """Fetch the self_person_id from system_settings using shared function."""
    try:
        from shared.people_repository import get_self_person_id_from_settings
        with get_connection() as conn:
            return get_self_person_id_from_settings(conn)
    except Exception as exc:
        print(f"Warning: Failed to fetch self_person_id: {exc}")
        return None

def run_intent_pipeline(
    *,
    text: str,
    entities: Dict[str, Any],
    source_type: str = "imessage",
    artifact_id: Optional[str] = None,
    client: Optional[httpx.Client] = None,
    slot_filler: Optional[SlotFiller] = None,
    taxonomy: Optional[IntentTaxonomy] = None,
    thread_context: Optional[List[Dict[str, Any]]] = None,
    content_timestamp: Optional[str] = None,
) -> tuple[ClassificationResult, SlotFillerResult]:
    taxonomy = taxonomy or load_intent_taxonomy()
    artifact_id = artifact_id or str(uuid4())
    owns_client = client is None

    if client is None:
        client = httpx.Client(base_url=OLLAMA_BASE_URL, timeout=30.0)
    if slot_filler is None:
        slot_filler = build_slot_filler()

    classifier_settings = ClassifierSettings(
        base_url=OLLAMA_BASE_URL,
        model=INTENT_MODEL,
        timeout=30.0,
        min_confidence=0.35,
    )

    try:
        classification = classify_artifact(
            text=text,
            taxonomy=taxonomy,
            entities=entities,
            settings=classifier_settings,
            client=client,
            thread_context=thread_context,
        )
        slot_result = slot_filler.fill_slots(
            job_text=text,
            classification=classification,
            taxonomy=taxonomy,
            entity_payload=entities,
            artifact_id=artifact_id,
            source_type=source_type,
            thread_context=thread_context,
            content_timestamp=content_timestamp,
        )
    finally:
        if owns_client:
            client.close()

    return classification, slot_result



In [4]:
print(DATABASE_URL)

test_database_connection()

postgresql://postgres:postgres@localhost:5432/haven
Testing database connection...
✓ Database connection successful!
  PostgreSQL version: PostgreSQL 15.14 (Debian 15.14-1.pgdg12+1) on aarch64-unknown-linux-gnu


True

## Sample artifact + entities

Adjust the text/entities below to mirror the document you want to test. Including `channel_context.from` / `channel_context.to` helps the model resolve pronouns for messaging/email content.
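
To make the expected shape concrete, here is a small sketch that assembles a `channel_context` block in the same layout as the sample entities in the next cell. The helper name `build_channel_context` is hypothetical; only the `from` / `to` / `display_name` / `identifier` keys come from this notebook.

```python
from typing import Any, Dict, List, Tuple


def build_channel_context(
    sender: Tuple[str, str],
    recipients: List[Tuple[str, str]],
) -> Dict[str, Any]:
    """Build a channel_context dict from (display_name, identifier) pairs."""

    def party(pair: Tuple[str, str]) -> Dict[str, str]:
        display_name, identifier = pair
        return {"display_name": display_name, "identifier": identifier}

    return {"from": party(sender), "to": [party(r) for r in recipients]}


entities = {
    "channel_context": build_channel_context(
        sender=("Alex", "imessage:+15550987654"),
        recipients=[("Chris", "imessage:+15551234567")],
    )
}
print(entities["channel_context"]["from"]["display_name"])  # Alex
```

Keeping the sender and recipients explicit like this gives the model a way to tell who "you" and "me" refer to in a message.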



In [5]:
sample_text = """
Hey Chris — can you remember to grab dog food for me before tomorrow night? 
I'll be tied up with meetings until late Wednesday.
""".strip()

sample_entities = {
    "people": [
        {
            "normalizedValue": "Chris",
            "identifier": "imessage:+15551234567",
            "role": "recipient",
        }
    ],
    "dates": [
        {
            "normalizedValue": "2025-11-12T21:00:00-05:00",
            "entity": {"text": "tomorrow night"},
        }
    ],
    "channel_context": {
        "from": {
            "display_name": "Alex",
            "identifier": "imessage:+15550987654",
        },
        "to": [
            {
                "display_name": "Chris",
                "identifier": "imessage:+15551234567",
            }
        ],
    },
}

print(sample_text)
print(json.dumps(sample_entities, indent=2))



Hey Chris — can you remember to grab dog food for me before tomorrow night? 
I'll be tied up with meetings until late Wednesday.
{
  "people": [
    {
      "normalizedValue": "Chris",
      "identifier": "imessage:+15551234567",
      "role": "recipient"
    }
  ],
  "dates": [
    {
      "normalizedValue": "2025-11-12T21:00:00-05:00",
      "entity": {
        "text": "tomorrow night"
      }
    }
  ],
  "channel_context": {
    "from": {
      "display_name": "Alex",
      "identifier": "imessage:+15550987654"
    },
    "to": [
      {
        "display_name": "Chris",
        "identifier": "imessage:+15551234567"
      }
    ]
  }
}


In [6]:
# Run full pipeline with comprehensive output and debugging
from haven.intents.slots.extractor import LLMSlotExtractor, SlotExtractorSettings
from haven.intents.slots.filler import SlotFiller, SlotFillerSettings

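RUN_SAMPLE_PIPELINE = False  # Flip to True to run the sample text through the full pipeline

if RUN_SAMPLE_PIPELINE: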
    # Run the full pipeline
    classification, slot_result = run_intent_pipeline(
        text=sample_text,
        entities=sample_entities,
        source_type="imessage",
    )

    # Show detailed slot extraction debugging for the first intent (if there are missing slots)
    if classification.intents and slot_result.assignments:
        taxonomy = load_intent_taxonomy()
        top_intent = classification.intents[0]
        top_assignment = slot_result.assignments[0]
        intent_def = taxonomy.intents.get(top_intent.intent_name)
        
        if intent_def and top_assignment.missing_slots:
            slot_filler = build_slot_filler()
            extractor = slot_filler._extractor
            
            # Build the prompt that would be used for missing slots
            prompt = extractor._build_prompt(
                intent_name=top_intent.intent_name,
                intent_definition=intent_def,
                text=sample_text,
                entities=sample_entities,
                existing_slots=top_assignment.slots,
                missing_slots=top_assignment.missing_slots,
                classification_notes=classification.processing_notes or [],
                thread_context=None,  # Note: thread_context not used in sample cell, but available in doc_id cell
            )
            
            print("=" * 80)
            print("SLOT EXTRACTION DEBUGGING (for missing slots)")
            print("=" * 80)
            print("\nPrompt sent to LLM:")
            print("-" * 80)
            print(prompt)
            
            print("\n" + "-" * 80)
            print("RAW OLLAMA SLOT EXTRACTION RESPONSE")
            print("-" * 80)
            payload = {
                "model": INTENT_MODEL,
                "prompt": prompt,
                "format": "json",
                "stream": False,
            }
            raw_response = extractor._invoke_ollama(payload)
            print(raw_response)
            
            print("\n" + "-" * 80)
            print("PARSED SLOT EXTRACTION JSON")
            print("-" * 80)
            parsed = extractor._parse_response(raw_response)
            print(json.dumps(parsed, indent=2))
            
            if "slots" in parsed:
                print("\n" + "-" * 80)
                print("EXTRACTED SLOTS FROM LLM")
                print("-" * 80)
                print(json.dumps(parsed["slots"], indent=2))
                for slot_name in top_assignment.missing_slots:
                    if slot_name in parsed["slots"]:
                        print(f"\n✓ '{slot_name}' slot found: {repr(parsed['slots'][slot_name])}")
                    else:
                        print(f"\n✗ '{slot_name}' slot NOT found in response")
            
            if "notes" in parsed:
                print("\n" + "-" * 80)
                print("EXTRACTION NOTES FROM LLM")
                print("-" * 80)
                for note in parsed["notes"]:
                    print(f"  • {note}")
            
            print("\n" + "=" * 80)

    # Show formatted classifier and slot filling output
    print("CLASSIFIER OUTPUT")
    print("=" * 80)
    for intent in classification.intents:
        print(f"\n✓ Intent: {intent.intent_name}")
        print(f"  Confidence: {intent.confidence:.2%} (base: {intent.base_confidence:.2%}, prior: {intent.prior_applied:.2f}x)")
        if intent.reasons:
            print(f"  Reasons:")
            for reason in intent.reasons:
                print(f"    • {reason}")

    print("\n" + "=" * 80)
    print("SLOT FILLING OUTPUT")
    print("=" * 80)
    for assignment in slot_result.assignments:
        print(f"\n📋 Intent: {assignment.intent_name} ({assignment.confidence:.2%})")
        
        if assignment.slots:
            print(f"  ✓ Resolved Slots:")
            for slot_name, value in assignment.slots.items():
                source = assignment.slot_sources.get(slot_name, "?")
                if isinstance(value, (dict, list)):
                    print(f"    • {slot_name} (from {source}): {json.dumps(value)}")
                else:
                    print(f"    • {slot_name} (from {source}): {value}")
        
        if assignment.missing_slots:
            print(f"  ⚠ Missing Required Slots: {', '.join(assignment.missing_slots)}")
        
        if assignment.notes:
            print(f"  ℹ Notes:")
            for note in assignment.notes:
                print(f"    • {note}")

    print("\n" + "=" * 80)
    print("SUMMARY")
    print("=" * 80)
    print(f"Text analyzed: '{sample_text[:60]}...'")
    print(f"Source type: imessage")
    print(f"Channel context: from={sample_entities['channel_context']['from']['display_name']}, to={[p['display_name'] for p in sample_entities['channel_context']['to']]}")
    print(f"Intents detected: {len(classification.intents)}")
    print(f"Top intent: {classification.intents[0].intent_name if classification.intents else 'none'}")
    if slot_result.assignments:
        top_assignment = slot_result.assignments[0]
        print(f"Resolved slots: {len(top_assignment.slots)} / {len(top_assignment.slots) + len(top_assignment.missing_slots)}")



In [7]:
fetch_thread_context(doc_id=UUID('4061ec98-f56e-4e05-86ae-4036129631f1'), limit=10)
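
The raw list returned above can be hard to scan. As a rough sketch, the helper below condenses it into one line per message, mirroring the preview loop used in the document cell further down. It assumes each message is a dict with `sender` and `text` keys, as that cell does; the name `preview_thread_context` and the sample messages are made up for illustration.

```python
from typing import Any, Dict, List, Optional


def preview_thread_context(
    messages: Optional[List[Dict[str, Any]]],
    last_n: int = 3,
    width: int = 100,
) -> List[str]:
    """Return one short line per message for the last `last_n` messages."""
    if not messages:
        return []
    lines = []
    for index, msg in enumerate(messages[-last_n:], 1):
        sender = msg.get("sender", "unknown")
        # Guard against a missing or None text field before truncating.
        text = (msg.get("text") or "")[:width]
        lines.append(f"{index}. [{sender}]: {text}")
    return lines


# Hypothetical messages, only to show the output shape:
sample_messages = [
    {"sender": "Alex", "text": "Are you near the station?"},
    {"sender": "Chris", "text": "Five minutes away."},
]
for line in preview_thread_context(sample_messages):
    print(line)
# 1. [Alex]: Are you near the station?
# 2. [Chris]: Five minutes away.
```

Because `fetch_thread_context` returns `None` on failure, the helper treats `None` and an empty list the same way and returns no lines.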

In [8]:
# Set the document_id to process
document_id = DOCUMENT_ID  # Replace with actual UUID

# Fetch the document
doc = fetch_document_by_id(document_id)
if not doc:
    print(f"Document not found: {document_id}")
else:
    print("=" * 80)
    print(f"PROCESSING DOCUMENT: {doc.doc_id}")
    print("=" * 80)
    print(f"External ID: {doc.external_id}")
    print(f"Source Type: {doc.source_type}")
    print(f"Thread ID: {doc.thread_id}")
    print(f"Text preview: {doc.text[:200]}...")
    
    # Extract entities from document metadata
    doc_entities = {}
    
    # Extract channel context from metadata
    metadata = doc.metadata or {}
    channel_meta = metadata.get("channel") or {}
    if channel_meta:
        doc_entities["channel_context"] = {
            "from": channel_meta.get("from") or channel_meta.get("sender"),
            "to": channel_meta.get("to") or [],
        }
            
    
    # Extract dates/entities from metadata if available
    if "dates" in metadata:
        doc_entities["dates"] = metadata["dates"]
    if "places" in metadata:
        doc_entities["places"] = metadata["places"]

    sender_name = get_sender_name(doc)
    if sender_name:
        doc_text = f"From {sender_name}: {doc.text}"
    else:
        doc_text = doc.text
    
    # Fetch thread context if available
    thread_context = None
    if doc.thread_id:
        thread_context = fetch_thread_context(doc.doc_id, limit=10)
        if thread_context:
            print(f"\n✓ Found {len(thread_context)} previous messages in thread")
            print("Thread context preview:")
            for i, msg in enumerate(thread_context[-3:], 1):  # Show last 3
                sender = msg.get("sender", "unknown")
                text_preview = msg.get("text", "")[:100]
                print(f"  {i}. [{sender}]: {text_preview}...")
        else:
            print("\n⚠ No thread context found (document may be first in thread)")
    
    # Run the intent pipeline with thread context
    print("\n" + "=" * 80)
    print("RUNNING INTENT PIPELINE")
    print("=" * 80)
    
    # Format content_timestamp for passing to slot filler
    content_timestamp_str = None
    if doc.content_timestamp:
        if isinstance(doc.content_timestamp, str):
            content_timestamp_str = doc.content_timestamp
        else:
            # Convert datetime to ISO string
            content_timestamp_str = doc.content_timestamp.isoformat()
    
    classification, slot_result = run_intent_pipeline(
        text=doc_text,
        entities=doc_entities,
        source_type=doc.source_type,
        artifact_id=str(doc.doc_id),
        thread_context=thread_context,
        content_timestamp=content_timestamp_str,
    )
    
    # Show detailed slot extraction debugging for the first intent (if there are missing slots)
    if classification.intents and slot_result.assignments:
        taxonomy = load_intent_taxonomy()
        top_intent = classification.intents[0]
        top_assignment = slot_result.assignments[0]
        intent_def = taxonomy.intents.get(top_intent.intent_name)
        
        # Debug: Show what slots were requested for extraction
        print("\n" + "=" * 80)
        print("SLOT EXTRACTION DEBUG INFO")
        print("=" * 80)
        print(f"Intent: {top_intent.intent_name}")
        print(f"Filled slots: {list(top_assignment.slots.keys())}")
        print(f"Missing required slots: {top_assignment.missing_slots}")
        if intent_def:
            all_optional = [name for name, defn in intent_def.slots.items() if not defn.required]
            missing_optional = [name for name in all_optional if name not in top_assignment.slots]
            print(f"Missing optional slots: {missing_optional}")
            if top_intent.intent_name == "schedule.create":
                print(f"Title slot exists in taxonomy: {'title' in intent_def.slots}")
                print(f"Title slot filled: {'title' in top_assignment.slots}")
        print("=" * 80)
        
        # Show what slots should be extracted (including optional ones like title)
        if intent_def:
            all_missing = top_assignment.missing_slots.copy()
            # Add optional slots that aren't filled
            for slot_name, slot_def in intent_def.slots.items():
                if not slot_def.required and slot_name not in top_assignment.slots:
                    if slot_name not in all_missing:
                        all_missing.append(slot_name)
            # For schedule.create, ensure title is included
            if top_intent.intent_name == "schedule.create" and "title" not in top_assignment.slots:
                if "title" not in all_missing:
                    all_missing.append("title")
            
            print(f"\nSlots that SHOULD be extracted: {all_missing}")
            print(f"  Required missing: {top_assignment.missing_slots}")
            print(f"  Optional missing: {[s for s in all_missing if s not in top_assignment.missing_slots]}")
        
        if intent_def and top_assignment.missing_slots:
            slot_filler = build_slot_filler()
            extractor = slot_filler._extractor
            
            # Build the prompt that would be used for missing slots (including optional ones)
            # Reconstruct what the extraction targets should be
            extraction_targets = top_assignment.missing_slots.copy()
            for slot_name, slot_def in intent_def.slots.items():
                if not slot_def.required and slot_name not in top_assignment.slots:
                    if slot_name not in extraction_targets:
                        extraction_targets.append(slot_name)
            # For schedule.create, ensure title is prioritized
            if top_intent.intent_name == "schedule.create" and "title" not in top_assignment.slots:
                if "title" in extraction_targets:
                    extraction_targets.remove("title")
                    extraction_targets.insert(len(top_assignment.missing_slots), "title")  # Insert after required slots
                elif "title" in intent_def.slots:
                    extraction_targets.insert(len(top_assignment.missing_slots), "title")
            
            print(f"\nBuilding prompt for extraction targets: {extraction_targets}")
            
            # Build the prompt that would be used for missing slots
            prompt = extractor._build_prompt(
                intent_name=top_intent.intent_name,
                intent_definition=intent_def,
                text=doc_text,  # Same sender-prefixed text that run_intent_pipeline received
                entities=doc_entities,
                existing_slots=top_assignment.slots,
                missing_slots=extraction_targets,  # Use the full list including optional slots
                classification_notes=classification.processing_notes or [],
                thread_context=thread_context,
                content_timestamp=content_timestamp_str,
            )
            
            print("\n" + "=" * 80)
            print("SLOT EXTRACTION DEBUGGING (for missing slots)")
            print("=" * 80)
            print("\nPrompt sent to LLM:")
            print("-" * 80)
            print(prompt[:2000] + "..." if len(prompt) > 2000 else prompt)
            
            print("\n" + "-" * 80)
            print("RAW OLLAMA SLOT EXTRACTION RESPONSE")
            print("-" * 80)
            payload = {
                "model": INTENT_MODEL,
                "prompt": prompt,
                "format": "json",
                "stream": False,
            }
            raw_response = extractor._invoke_ollama(payload)
            print(raw_response)
            
            print("\n" + "-" * 80)
            print("PARSED SLOT EXTRACTION JSON")
            print("-" * 80)
            parsed = extractor._parse_response(raw_response)
            print(json.dumps(parsed, indent=2))
            
            if "slots" in parsed:
                print("\n" + "-" * 80)
                print("EXTRACTED SLOTS FROM LLM")
                print("-" * 80)
                print(json.dumps(parsed["slots"], indent=2))
                for slot_name in extraction_targets:  # Report on every slot the prompt asked for
                    if slot_name in parsed["slots"]:
                        print(f"\n✓ '{slot_name}' slot found: {repr(parsed['slots'][slot_name])}")
                    else:
                        print(f"\n✗ '{slot_name}' slot NOT found in response")
            
            if "notes" in parsed:
                print("\n" + "-" * 80)
                print("EXTRACTION NOTES FROM LLM")
                print("-" * 80)
                for note in parsed["notes"]:
                    print(f"  • {note}")
            
            print("\n" + "=" * 80)
    
    # Show formatted classifier and slot filling output
    print("CLASSIFIER OUTPUT")
    print("=" * 80)
    for intent in classification.intents:
        print(f"\n✓ Intent: {intent.intent_name}")
        print(f"  Confidence: {intent.confidence:.2%} (base: {intent.base_confidence:.2%}, prior: {intent.prior_applied:.2f}x)")
        if intent.reasons:
            print(f"  Reasons:")
            for reason in intent.reasons:
                print(f"    • {reason}")
    
    print("\n" + "=" * 80)
    print("SLOT FILLING OUTPUT")
    print("=" * 80)
    for assignment in slot_result.assignments:
        print(f"\n📋 Intent: {assignment.intent_name} ({assignment.confidence:.2%})")
        
        if assignment.slots:
            print(f"  ✓ Resolved Slots:")
            for slot_name, value in assignment.slots.items():
                source = assignment.slot_sources.get(slot_name, "?")
                if isinstance(value, (dict, list)):
                    print(f"    • {slot_name} (from {source}): {json.dumps(value)}")
                else:
                    print(f"    • {slot_name} (from {source}): {value}")
        
        if assignment.missing_slots:
            print(f"  ⚠ Missing Required Slots: {', '.join(assignment.missing_slots)}")
        
        if assignment.notes:
            print(f"  ℹ Notes:")
            for note in assignment.notes:
                print(f"    • {note}")
    
    print("\n" + "=" * 80)
    print("SUMMARY")
    print("=" * 80)
    print(f"Document ID: {doc.doc_id}")
    print(f"External ID: {doc.external_id}")
    print(f"Source type: {doc.source_type}")
    print(f"Thread ID: {doc.thread_id}")
    print(f"Thread context messages: {len(thread_context) if thread_context else 0}")
    print(f"Text analyzed: '{doc.text[:60]}...'")
    print(f"Intents detected: {len(classification.intents)}")
    print(f"Top intent: {classification.intents[0].intent_name if classification.intents else 'none'}")
    if slot_result.assignments:
        top_assignment = slot_result.assignments[0]
        print(f"Resolved slots: {len(top_assignment.slots)} / {len(top_assignment.slots) + len(top_assignment.missing_slots)}")
    
    # Show the final intent output that will be stored in the database
    print("\n" + "=" * 80)
    print("FINAL INTENT OUTPUT (for database storage)")
    print("=" * 80)
    
    # Build the intent JSONB structure that matches what will be stored
    intent_payload = {
        "taxonomy_version": classification.taxonomy_version,
        "classification": {
            "intents": [
                {
                    "intent_name": intent.intent_name,
                    "confidence": intent.confidence,
                    "base_confidence": intent.base_confidence,
                    "prior_applied": intent.prior_applied,
                    "reasons": intent.reasons,
                }
                for intent in classification.intents
            ],
            "processing_notes": classification.processing_notes or [],
        },
        "slot_assignments": [
            {
                "intent_name": assignment.intent_name,
                "confidence": assignment.confidence,
                "slots": assignment.slots,
                "missing_slots": assignment.missing_slots,
                "slot_sources": assignment.slot_sources,
                "notes": assignment.notes,
            }
            for assignment in slot_result.assignments
        ],
        "slot_filling_notes": slot_result.notes or [],
    }
    
    print("\nIntent JSONB payload (documents.intent column):")
    print(json.dumps(intent_payload, indent=2, default=str))
   


PROCESSING DOCUMENT: 12c4133d-991d-4afb-8dcc-5076973c0110
External ID: imessage:4C99B3B9-BA1D-4537-92DB-C59D22A0E3A6
Source Type: imessage
Thread ID: 4c14bb3a-45ee-4f86-ba53-162549a97265
Text preview: Hey if u think of it could you grab me a couple more of those tank things?...
People: [{'role': 'sender', 'metadata': {}, 'identifier': 'E:mrwhistler@gmail.com', 'identifier_type': 'email'}, {'role': 'recipient', 'metadata': {}, 'identifier': '+17742531317', 'identifier_type': 'phone'}]
Sender: {'role': 'sender', 'metadata': {}, 'identifier': 'E:mrwhistler@gmail.com', 'identifier_type': 'email'}

⚠ No thread context found (document may be first in thread)

RUNNING INTENT PIPELINE

INTENT CLASSIFICATION PROMPT (sent to Ollama)
You are an intent classification assistant. Given the following artifact text, extracted entities, and optional conversation context, identify which intents from the provided taxonomy are present.

Return a JSON object with this structure:
{
  "intents": [
    {"na