# AI Observability Demonstration Notebook

This notebook demonstrates the AI Observability Logging System, explains the theory behind how it works, and provides practical tests to validate monitoring capabilities.

## What is AI Observability?

AI Observability is the practice of monitoring, logging, and analyzing AI/LLM applications to understand:
- **What** the model is doing (prompts, responses, tool calls)
- **How well** it's performing (latency, token usage, costs)
- **When** things go wrong (errors, failures, edge cases)
- **Why** decisions were made (conversation context, tool usage patterns)

Unlike traditional application logging, AI observability focuses specifically on:
- Model interactions and responses
- Tool/function calling patterns
- Token usage and costs
- Response quality metrics
- Conversation flow tracking


In [2]:
# Setup: Import libraries and configure paths
import os
import sys
from pathlib import Path
import json
from datetime import datetime, timedelta, timezone
import uuid
import time

# Add paths for imports
_notebook_dir = Path.cwd()
# Notebook is at: 05_src/assignment_chat/notebooks/
# So parent.parent is 05_src, and parent is assignment_chat
_05_src_dir = _notebook_dir.parent.parent
_assignment_chat_dir = _notebook_dir.parent

# Add to Python path
if str(_05_src_dir) not in sys.path:
    sys.path.insert(0, str(_05_src_dir))
if str(_assignment_chat_dir) not in sys.path:
    sys.path.insert(0, str(_assignment_chat_dir))

# Import AI observability components
from utils.ai_logger import get_ai_logger, LogCategory, LogSeverity, AILogger, set_ai_logger
from ai_observability.storage import LogStorage

# Import chat engine for testing
from src.core.chat_engine import ChatEngine

# For visualizations
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from collections import defaultdict, Counter

print("✓ Imports successful")
print(f"✓ Working directory: {_notebook_dir}")
print(f"✓ 05_src directory: {_05_src_dir}")


✓ Imports successful
✓ Working directory: /Users/gr-mbp/Sites/deploying-ai/05_src/assignment_chat/notebooks
✓ 05_src directory: /Users/gr-mbp/Sites/deploying-ai/05_src


In [3]:
# Configure AI Logger
# Use a test database for this notebook
test_db_path = str(_notebook_dir / "test_ai_logs.db")

# Create a logger instance for testing
test_logger = AILogger(
    storage_path=test_db_path,
    min_severity=LogSeverity.DEBUG,  # Log everything for demonstration
    max_log_length=5000  # Truncate very long texts
)

# Set as global logger
set_ai_logger(test_logger)

# Initialize storage for querying
storage = LogStorage(test_db_path, create_if_missing=True)

print(f"✓ AI Logger initialized: {test_db_path}")
print(f"✓ Log Storage initialized")

# Display current statistics
stats = storage.get_statistics()
print(f"\nCurrent Log Statistics:")
print(f"  Total logs: {stats['total_logs']}")
print(f"  By category: {stats['by_category']}")
print(f"  By severity: {stats['by_severity']}")


✓ AI Logger initialized: /Users/gr-mbp/Sites/deploying-ai/05_src/assignment_chat/notebooks/test_ai_logs.db
✓ Log Storage initialized

Current Log Statistics:
  Total logs: 0
  By category: {}
  By severity: {}


## Section 2: Theory - How AI Observability Works

### 2.1 Logging Architecture

Our AI observability system uses **structured logging** separate from application logging:

**Key Components:**
1. **Context Variables**: Context-local conversation ID tracking using Python's `contextvars` (each thread or asyncio task keeps its own value)
2. **Structured Data**: All logs stored as structured entries with consistent fields
3. **SQLite Storage**: File-based database with indexes for fast querying
4. **Category System**: Different log types (prompt, response, tool_call, error, etc.)
5. **Severity Levels**: DEBUG, INFO, WARNING, ERROR, CRITICAL
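
As a rough illustration of item 1, the sketch below shows how a `contextvars.ContextVar` can carry a conversation ID implicitly, so that logging calls do not need it as an argument. The names `_conversation_id`, `set_conversation_id`, `clear_conversation_id`, and `log_event` are hypothetical stand-ins for this example, not the actual `AILogger` internals.

```python
import contextvars
import uuid

# Hypothetical module-level variable. Changes made inside one thread or
# asyncio task do not leak into others.
_conversation_id = contextvars.ContextVar("conversation_id", default=None)


def set_conversation_id(conv_id: str) -> None:
    _conversation_id.set(conv_id)


def clear_conversation_id() -> None:
    _conversation_id.set(None)


def log_event(message: str) -> dict:
    """Build a log entry that picks up the current conversation ID implicitly."""
    return {"conversation_id": _conversation_id.get(), "message": message}


conv_id = str(uuid.uuid4())
set_conversation_id(conv_id)
entry = log_event("Prompt sent")            # tagged with conv_id
clear_conversation_id()
orphan = log_event("No active conversation")  # conversation_id is None
```

Any function called between `set_conversation_id` and `clear_conversation_id` sees the same ID without it appearing in a single signature.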

**Why Separate from Application Logging?**
- Application logs focus on system events (startup, errors, configuration)
- AI logs focus on model interactions (prompts, responses, tool calls, costs)
- Different query patterns and analysis needs
- Different retention and privacy requirements
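
The structured entries and indexed SQLite storage from the Key Components list can be sketched with the standard-library `sqlite3` module. This is a minimal illustration under assumptions: the table name `ai_logs`, the index name, and the `insert_log` helper are invented for the example, and only a subset of the columns is shown. The real `LogStorage` schema may differ.

```python
import json
import sqlite3
from datetime import datetime, timezone

# The notebook uses a file-based database; an in-memory one keeps this sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ai_logs (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        timestamp TEXT NOT NULL,
        conversation_id TEXT,
        category TEXT NOT NULL,
        severity TEXT NOT NULL,
        message TEXT NOT NULL,
        metadata TEXT
    )
    """
)
# An index on conversation_id keeps per-conversation lookups fast as the table grows.
conn.execute("CREATE INDEX idx_logs_conversation ON ai_logs (conversation_id)")


def insert_log(conversation_id, category, severity, message, **metadata):
    """Store one structured entry; extra keyword arguments go into the JSON metadata column."""
    conn.execute(
        "INSERT INTO ai_logs (timestamp, conversation_id, category, severity, message, metadata) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), conversation_id, category,
         severity, message, json.dumps(metadata)),
    )


insert_log("conv-1", "prompt", "INFO", "Prompt sent to gpt-4o-mini")
insert_log("conv-1", "response", "INFO", "Response received", latency_ms=150.5)
insert_log("conv-2", "error", "ERROR", "Rate limit exceeded")

rows = conn.execute(
    "SELECT category, message FROM ai_logs WHERE conversation_id = ? ORDER BY id",
    ("conv-1",),
).fetchall()
```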


In [4]:
# Demonstrate the logging architecture
print("Log Categories Available:")
for category in LogCategory:
    print(f"  - {category.value}")

print("\nSeverity Levels Available:")
for severity in LogSeverity:
    print(f"  - {severity.value}")

# Show database schema
print("\nDatabase Schema:")
print("  - id: Unique log entry ID")
print("  - timestamp: When the event occurred")
print("  - conversation_id: Links related events together")
print("  - category: Type of event (prompt, response, tool_call, etc.)")
print("  - severity: Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)")
print("  - message: Human-readable description")
print("  - metadata: Additional structured data (JSON)")
print("  - model_name: Which model was used")
print("  - prompt_text: Full prompt (may be truncated)")
print("  - response_text: Full response (may be truncated)")
print("  - token_count_input: Input tokens used")
print("  - token_count_output: Output tokens generated")
print("  - latency_ms: Response time in milliseconds")
print("  - cost_usd: Estimated cost in USD")
print("  - tool_name: Tool that was called (if applicable)")
print("  - evaluation_scores: Quality metrics (if applicable)")


Log Categories Available:
  - prompt
  - response
  - tool_call
  - tool_result
  - evaluation
  - performance
  - cost
  - error
  - guardrail
  - model_config

Severity Levels Available:
  - DEBUG
  - INFO
  - WARNING
  - ERROR
  - CRITICAL

Database Schema:
  - id: Unique log entry ID
  - timestamp: When the event occurred
  - conversation_id: Links related events together
  - category: Type of event (prompt, response, tool_call, etc.)
  - severity: Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
  - message: Human-readable description
  - metadata: Additional structured data (JSON)
  - model_name: Which model was used
  - prompt_text: Full prompt (may be truncated)
  - response_text: Full response (may be truncated)
  - token_count_input: Input tokens used
  - token_count_output: Output tokens generated
  - latency_ms: Response time in milliseconds
  - cost_usd: Estimated cost in USD
  - tool_name: Tool that was called (if applicable)
  - evaluation_scores: Quality metrics (if applicable)


### 2.2 LangChain Tool Calling Theory

**How LangChain Tool Calling Works:**

1. **Tool Binding**: Tools are bound to the model using `model.bind_tools(tools)`
   - Each tool has a name, description, and parameters schema
   - LangChain converts each tool into the function-calling schema the model provider expects

2. **Function Calling Flow**:
   ```
   User Prompt → Model → Decision: Use Tool? → Tool Execution → Model → Final Response
   ```

3. **Message Types**:
   - **SystemMessage**: Instructions for the model (system prompt)
   - **HumanMessage**: User input
   - **AIMessage**: Model responses (may contain tool calls)
   - **ToolMessage**: Results from tool execution

4. **Tool Execution Lifecycle**:
   - Model decides to call a tool based on prompt
   - Tool is invoked with parameters
   - Tool result is added as ToolMessage
   - Model processes tool result and generates final response
   - May require multiple tool calls in sequence
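
The lifecycle above can be sketched as a small loop. To stay runnable without an API key, this example does not use LangChain: `Message`, `fake_model`, `get_recent_videos`, and `run_turn` are hypothetical stand-ins that only mimic the roles of the message types and of `model.invoke(messages)`.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str                      # "system", "human", "ai", or "tool"
    content: str = ""
    tool_calls: list = field(default_factory=list)


def get_recent_videos(limit: int) -> str:
    """Hypothetical tool; a real one would query an API."""
    return f"Found {limit} recent videos"


TOOLS = {"get_recent_videos": get_recent_videos}


def fake_model(messages: list) -> Message:
    """Stand-in for model.invoke(): request a tool first, answer once a result exists."""
    if any(m.role == "tool" for m in messages):
        return Message("ai", f"Summary: {messages[-1].content}")
    return Message("ai", tool_calls=[{"name": "get_recent_videos", "args": {"limit": 10}}])


def run_turn(user_text: str, max_rounds: int = 5) -> list:
    messages = [Message("system", "You are a helpful assistant."), Message("human", user_text)]
    for _ in range(max_rounds):            # cap the loop so a misbehaving model cannot spin forever
        reply = fake_model(messages)
        messages.append(reply)
        if not reply.tool_calls:           # no tool requested: this is the final answer
            break
        for call in reply.tool_calls:      # run each requested tool and feed the result back
            result = TOOLS[call["name"]](**call["args"])
            messages.append(Message("tool", result))
    return messages


history = run_turn("What videos did I watch recently?")
```

The resulting history holds one message per step of the flow: system, human, AI (with a tool call), tool result, and the final AI answer.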


In [6]:
# Demonstrate tool calling flow with a diagram
print("LangChain Tool Calling Flow:")
print("=" * 60)
print("""
1. User sends message: "What videos did I watch recently?"
   └─> HumanMessage created

2. ChatEngine processes message:
   ├─> Generates conversation_id (UUID)
   ├─> Logs PROMPT event
   └─> Invokes model_with_tools.invoke(messages)

3. Model receives messages:
   ├─> SystemMessage: Instructions
   ├─> HumanMessage: User query
   └─> Decides: "I need to call get_recent_videos tool"

4. Model returns AIMessage with tool_calls:
   └─> Tool: get_recent_videos, Args: {"limit": 10}

5. ChatEngine handles tool call:
   ├─> Logs TOOL_CALL event
   ├─> Executes tool (calls YouTube API)
   ├─> Logs TOOL_RESULT event (success/failure)
   └─> Creates ToolMessage with result

6. Model processes tool result:
   ├─> Receives ToolMessage with video data
   ├─> Generates natural language response
   └─> Returns final AIMessage

7. ChatEngine logs response:
   ├─> Logs RESPONSE event
   ├─> Records token counts, latency, cost
   └─> Returns response to user
""")
print("=" * 60)


LangChain Tool Calling Flow:

1. User sends message: "What videos did I watch recently?"
   └─> HumanMessage created

2. ChatEngine processes message:
   ├─> Generates conversation_id (UUID)
   ├─> Logs PROMPT event
   └─> Invokes model_with_tools.invoke(messages)

3. Model receives messages:
   ├─> SystemMessage: Instructions
   ├─> HumanMessage: User query
   └─> Decides: "I need to call get_recent_videos tool"

4. Model returns AIMessage with tool_calls:
   └─> Tool: get_recent_videos, Args: {"limit": 10}

5. ChatEngine handles tool call:
   ├─> Logs TOOL_CALL event
   ├─> Executes tool (calls YouTube API)
   ├─> Logs TOOL_RESULT event (success/failure)
   └─> Creates ToolMessage with result

6. Model processes tool result:
   ├─> Receives ToolMessage with video data
   ├─> Generates natural language response
   └─> Returns final AIMessage

7. ChatEngine logs response:
   ├─> Logs RESPONSE event
   ├─> Records token counts, latency, cost
   └─> Returns response to user



### 2.3 Chat Engine Operation

**Request Flow in Detail:**

```
User Message
    ↓
Generate conversation_id (UUID)
    ↓
Set conversation context (contextvars)
    ↓
Log PROMPT event
    ├─> prompt_text: User message
    ├─> model_name: Which model
    └─> conversation_id: Links all events
    ↓
Build message history
    ├─> SystemMessage: Instructions
    ├─> Previous messages (if any)
    └─> Current HumanMessage
    ↓
Invoke model_with_tools
    ├─> Start timer
    └─> model.invoke(messages)
    ↓
Model Response
    ├─> Extract token counts
    ├─> Calculate latency
    └─> Check for tool_calls
    ↓
[If tool_calls exist]
    ├─> Log each TOOL_CALL
    ├─> Execute each tool
    ├─> Log each TOOL_RESULT
    └─> Invoke model again with tool results
    ↓
Log RESPONSE event
    ├─> response_text: Model output
    ├─> token_count_input/output
    ├─> latency_ms
    └─> cost_usd (if available)
    ↓
Clear conversation context
    ↓
Return response to user
```

**Key Design Decisions:**
- **Conversation ID**: Generated per request, links all related events
- **Context Variables**: A thread-safe and asyncio-safe way to track the conversation without passing IDs through every call
- **Structured Logging**: All events stored with consistent schema for easy querying
- **Error Handling**: Errors are logged but don't break the flow


### 2.4 Observability Benefits

**1. Cost Tracking**
- Monitor token usage per request
- Track costs over time
- Identify expensive queries
- Optimize prompts to reduce costs

**2. Performance Monitoring**
- Measure latency (time-to-first-token, total time)
- Identify slow queries
- Track tool execution times
- Optimize for better user experience

**3. Quality Assurance**
- Track response quality
- Monitor error rates
- Detect hallucinations or inappropriate responses
- Evaluate tool usage patterns

**4. Debugging and Troubleshooting**
- Trace conversation flow
- Identify where errors occur
- Understand model decision-making
- Reproduce issues with conversation IDs
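
To make the cost and performance points concrete, here is a minimal sketch that estimates cost from token counts and computes a latency percentile. The per-token prices are placeholders chosen for the example, not real pricing for any model, and `estimate_cost` and `percentile` are hypothetical helpers. The sample rows reuse the token and latency values from Tests 1 to 3.

```python
import math

# Placeholder prices in USD per one million tokens; real prices depend on provider and model.
PRICE_PER_M_INPUT = 0.50
PRICE_PER_M_OUTPUT = 1.50


def estimate_cost(tokens_in: int, tokens_out: int) -> float:
    return tokens_in / 1_000_000 * PRICE_PER_M_INPUT + tokens_out / 1_000_000 * PRICE_PER_M_OUTPUT


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


responses = [
    {"token_count_input": 10, "token_count_output": 8, "latency_ms": 150.5},
    {"token_count_input": 45, "token_count_output": 120, "latency_ms": 850.5},
    {"token_count_input": 60, "token_count_output": 180, "latency_ms": 1200.0},
]

total_cost = sum(
    estimate_cost(r["token_count_input"], r["token_count_output"]) for r in responses
)
p95_latency = percentile([r["latency_ms"] for r in responses], 95)
```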


## Section 3: Creating Test Events

Let's create various test scenarios to demonstrate logging capabilities.


### 3.1 Basic Logging Tests

#### Test 1: Simple Prompt/Response (No Tools)


In [7]:
# Test 1: Simple prompt/response logging
ai_logger = get_ai_logger()
conv_id_1 = str(uuid.uuid4())
ai_logger.set_conversation_id(conv_id_1)

# Simulate a simple prompt/response
prompt = "Hello, how are you?"
response = "I'm doing well, thank you for asking!"

# Log the prompt
ai_logger.log_prompt(
    prompt_text=prompt,
    model_name="gpt-4o-mini",
    conversation_id=conv_id_1
)

# Simulate some processing time
time.sleep(0.1)

# Log the response
ai_logger.log_response(
    response_text=response,
    model_name="gpt-4o-mini",
    token_count_input=10,
    token_count_output=8,
    latency_ms=150.5,
    conversation_id=conv_id_1
)

ai_logger.clear_conversation_id()

print(f"✓ Test 1 Complete - Conversation ID: {conv_id_1}")
print(f"  Prompt: {prompt}")
print(f"  Response: {response}")

# Query the logs
logs = storage.get_conversation_logs(conv_id_1)
print(f"\n  Logged {len(logs)} events:")
for log in logs:
    print(f"    - {log['category']}: {log['message']}")


✓ Test 1 Complete - Conversation ID: f9ad6de0-aecf-429a-b41f-9036a567b7cb
  Prompt: Hello, how are you?
  Response: I'm doing well, thank you for asking!

  Logged 2 events:
    - response: Response received from gpt-4o-mini
    - prompt: Prompt sent to gpt-4o-mini
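
The events above appear newest first (the response is listed before the prompt). When tracing a conversation it can be easier to read them in chronological order. A minimal sketch, assuming each row exposes the `timestamp` field from the schema as an ISO 8601 string; the sample rows are hand-written, not real query results.

```python
from datetime import datetime

logs = [
    {"timestamp": "2024-01-01T12:00:00.150000+00:00", "category": "response"},
    {"timestamp": "2024-01-01T12:00:00.000000+00:00", "category": "prompt"},
]

# Parse the timestamps rather than comparing raw strings, so differing formats still sort correctly.
chronological = sorted(logs, key=lambda row: datetime.fromisoformat(row["timestamp"]))
ordered_categories = [row["category"] for row in chronological]
```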


#### Test 2: Prompt with Tool Calls


In [8]:
# Test 2: Prompt with tool calls
conv_id_2 = str(uuid.uuid4())
ai_logger.set_conversation_id(conv_id_2)

prompt = "What videos did I watch recently?"
tool_name = "get_recent_videos"
tool_args = {"limit": 10}

# Log prompt
ai_logger.log_prompt(
    prompt_text=prompt,
    model_name="gpt-4o-mini",
    conversation_id=conv_id_2
)

# Log tool call
ai_logger.log_tool_call(
    tool_name=tool_name,
    tool_args=tool_args,
    conversation_id=conv_id_2
)

# Simulate tool execution
time.sleep(0.2)

# Log tool result
ai_logger.log_tool_result(
    tool_name=tool_name,
    success=True,
    latency_ms=200.0,
    conversation_id=conv_id_2,
    result_preview="Found 10 recent videos..."
)

# Log final response
ai_logger.log_response(
    response_text="You've watched 10 videos recently. Here are some highlights...",
    model_name="gpt-4o-mini",
    token_count_input=45,
    token_count_output=120,
    latency_ms=850.5,
    conversation_id=conv_id_2
)

ai_logger.clear_conversation_id()

print(f"✓ Test 2 Complete - Conversation ID: {conv_id_2}")
logs = storage.get_conversation_logs(conv_id_2)
print(f"\n  Logged {len(logs)} events:")
for log in logs:
    print(f"    - {log['category']}: {log['message']}")
    if log.get('tool_name'):
        print(f"      Tool: {log['tool_name']}")


✓ Test 2 Complete - Conversation ID: f57d96a8-5235-49eb-8594-2026d5d2d3d9

  Logged 4 events:
    - response: Response received from gpt-4o-mini
    - tool_result: Tool get_recent_videos succeeded
      Tool: get_recent_videos
    - tool_call: Tool called: get_recent_videos
      Tool: get_recent_videos
    - prompt: Prompt sent to gpt-4o-mini


#### Test 3: Multiple Tool Calls in Sequence


In [11]:
# Test 3: Multiple tool calls
conv_id_3 = str(uuid.uuid4())
ai_logger.set_conversation_id(conv_id_3)

prompt = "Tell me about my watch statistics and recent videos"

# Log prompt
ai_logger.log_prompt(
    prompt_text=prompt,
    model_name="gpt-4o-mini",
    conversation_id=conv_id_3
)

# First tool call
ai_logger.log_tool_call(
    tool_name="get_statistics",
    tool_args={},
    conversation_id=conv_id_3
)
time.sleep(0.15)
ai_logger.log_tool_result(
    tool_name="get_statistics",
    success=True,
    latency_ms=150.0,
    conversation_id=conv_id_3
)

# Second tool call
ai_logger.log_tool_call(
    tool_name="get_recent_videos",
    tool_args={"limit": 5},
    conversation_id=conv_id_3
)
time.sleep(0.1)
ai_logger.log_tool_result(
    tool_name="get_recent_videos",
    success=True,
    latency_ms=100.0,
    conversation_id=conv_id_3
)

# Final response
ai_logger.log_response(
    response_text="Your watch statistics show... and here are your recent videos...",
    model_name="gpt-4o-mini",
    token_count_input=60,
    token_count_output=180,
    latency_ms=1200.0,
    conversation_id=conv_id_3
)

ai_logger.clear_conversation_id()

print(f"✓ Test 3 Complete - Conversation ID: {conv_id_3}")
print(f"  Prompt: {prompt}")
logs = storage.get_conversation_logs(conv_id_3)
tool_calls = [log for log in logs if log['category'] == 'tool_call']
print(f"\n  Logged {len(logs)} events, {len(tool_calls)} tool calls:")
for log in logs:
    print(f"    - {log['category']}: {log['message']}")
    if log.get('tool_name'):
        print(f"      Tool: {log['tool_name']}")
    if log.get('token_count_input') or log.get('token_count_output'):
        tokens_in = log.get('token_count_input') or 0   # stored value may be None
        tokens_out = log.get('token_count_output') or 0
        total_tokens = tokens_in + tokens_out
        print(f"      Tokens: {total_tokens} (in: {tokens_in}, out: {tokens_out})")
    if log.get('latency_ms'):
        print(f"      Latency: {log.get('latency_ms', 0):.2f}ms")


✓ Test 3 Complete - Conversation ID: d601cdef-f12e-4e9c-87da-6ababca78363
  Prompt: Tell me about my watch statistics and recent videos

  Logged 6 events, 2 tool calls:
    - response: Response received from gpt-4o-mini
      Tokens: 240 (in: 60, out: 180)
      Latency: 1200.00ms
    - tool_result: Tool get_recent_videos succeeded
      Tool: get_recent_videos
      Latency: 100.00ms
    - tool_call: Tool called: get_recent_videos
      Tool: get_recent_videos
    - tool_result: Tool get_statistics succeeded
      Tool: get_statistics
      Latency: 150.00ms
    - tool_call: Tool called: get_statistics
      Tool: get_statistics
    - prompt: Prompt sent to gpt-4o-mini
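
Once several tool results are logged, per-tool latency can be summarized from the rows. A minimal sketch using hand-written rows shaped like the ones above; the extra 200 ms row is borrowed from Test 2, and the assumption that `tool_call` rows carry no latency is inferred from the output, not confirmed.

```python
from collections import defaultdict

logs = [
    {"category": "tool_result", "tool_name": "get_recent_videos", "latency_ms": 100.0},
    {"category": "tool_call", "tool_name": "get_recent_videos", "latency_ms": None},
    {"category": "tool_result", "tool_name": "get_statistics", "latency_ms": 150.0},
    {"category": "tool_call", "tool_name": "get_statistics", "latency_ms": None},
    {"category": "tool_result", "tool_name": "get_recent_videos", "latency_ms": 200.0},
]

# Group latencies by tool, skipping rows that carry no measurement.
latencies = defaultdict(list)
for row in logs:
    if row["category"] == "tool_result" and row["latency_ms"] is not None:
        latencies[row["tool_name"]].append(row["latency_ms"])

average_ms = {tool: sum(vals) / len(vals) for tool, vals in latencies.items()}
```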


#### Test 4: Tool Execution Errors


In [12]:
# Test 4: Tool execution error
conv_id_4 = str(uuid.uuid4())
ai_logger.set_conversation_id(conv_id_4)

prompt = "Get video details for invalid_id"

ai_logger.log_prompt(
    prompt_text=prompt,
    model_name="gpt-4o-mini",
    conversation_id=conv_id_4
)

# Tool call that will fail
ai_logger.log_tool_call(
    tool_name="get_video_details",
    tool_args={"video_id": "invalid_id"},
    conversation_id=conv_id_4
)

# Simulate error
time.sleep(0.05)

# Log failed tool result
ai_logger.log_tool_result(
    tool_name="get_video_details",
    success=False,
    latency_ms=50.0,
    conversation_id=conv_id_4,
    error="Video not found",
    error_type="NotFoundError"
)

# Log error event
ai_logger.log(
    category=LogCategory.ERROR,
    message="Tool execution failed: Video not found",
    severity=LogSeverity.ERROR,
    conversation_id=conv_id_4,
    tool_name="get_video_details",
    error_type="NotFoundError"
)

ai_logger.clear_conversation_id()

print(f"✓ Test 4 Complete - Conversation ID: {conv_id_4}")
logs = storage.get_conversation_logs(conv_id_4)
errors = [log for log in logs if log['severity'] == 'ERROR']
print(f"\n  Logged {len(logs)} events, {len(errors)} errors")


✓ Test 4 Complete - Conversation ID: 10a747ad-4ba5-4c17-a5d7-53063eae3c2a

  Logged 4 events, 2 errors
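
In a real engine the failure record would come from a wrapper around the tool call, not from hand-written arguments. A minimal sketch of such a wrapper, assuming nothing about the actual `ChatEngine`: `run_tool`, `NotFoundError`, and this `get_video_details` are hypothetical, and the returned keys simply mirror the `log_tool_result` arguments used above.

```python
import time


class NotFoundError(Exception):
    """Hypothetical error raised by the example tool."""


def get_video_details(video_id: str) -> dict:
    if video_id == "invalid_id":
        raise NotFoundError("Video not found")
    return {"video_id": video_id, "title": "Example"}


def run_tool(tool, **kwargs) -> dict:
    """Run a tool and return a record of success, error details, and latency."""
    start = time.perf_counter()
    try:
        result = tool(**kwargs)
        outcome = {"success": True, "result": result}
    except Exception as exc:
        # Capture the failure instead of letting it propagate to the caller.
        outcome = {"success": False, "error": str(exc), "error_type": type(exc).__name__}
    outcome["tool_name"] = tool.__name__
    outcome["latency_ms"] = (time.perf_counter() - start) * 1000
    return outcome


failure = run_tool(get_video_details, video_id="invalid_id")
success = run_tool(get_video_details, video_id="abc123")
```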


#### Test 5: Model Errors


In [13]:
# Test 5: Model error
conv_id_5 = str(uuid.uuid4())
ai_logger.set_conversation_id(conv_id_5)

prompt = "This might cause an error"

ai_logger.log_prompt(
    prompt_text=prompt,
    model_name="gpt-4o-mini",
    conversation_id=conv_id_5
)

# Simulate model error
ai_logger.log(
    category=LogCategory.ERROR,
    message="Model invocation failed: Rate limit exceeded",
    severity=LogSeverity.ERROR,
    conversation_id=conv_id_5,
    error_type="RateLimitError",
    error_message="API rate limit exceeded. Please try again later."
)

ai_logger.clear_conversation_id()

print(f"✓ Test 5 Complete - Conversation ID: {conv_id_5}")
logs = storage.get_conversation_logs(conv_id_5)
print(f"\n  Logged {len(logs)} events")
for log in logs:
    if log['severity'] == 'ERROR':
        print(f"    ERROR: {log['message']}")


✓ Test 5 Complete - Conversation ID: 88255a1f-1e50-4436-bfae-6e16616b3850

  Logged 2 events
    ERROR: Model invocation failed: Rate limit exceeded


### 3.2 Conversation Scenarios

#### Scenario 1: Single-Turn Conversation


In [14]:
# Scenario 1: Single-turn conversation
conv_id_s1 = str(uuid.uuid4())
ai_logger.set_conversation_id(conv_id_s1)

# User asks a question
user_message = "What's my total watch time?"
ai_logger.log_prompt(
    prompt_text=user_message,
    model_name="gpt-4o-mini",
    conversation_id=conv_id_s1
)

# Model responds (tool call logging omitted for brevity)
ai_logger.log_response(
    response_text="Your total watch time is 1,250 hours across all videos.",
    model_name="gpt-4o-mini",
    token_count_input=15,
    token_count_output=12,
    latency_ms=320.0,
    conversation_id=conv_id_s1
)

ai_logger.clear_conversation_id()

print(f"✓ Scenario 1 Complete - Single turn conversation")
logs = storage.get_conversation_logs(conv_id_s1)
print(f"  Conversation ID: {conv_id_s1}")
print(f"  Events logged: {len(logs)}")


✓ Scenario 1 Complete - Single turn conversation
  Conversation ID: 88c88155-e13b-4714-b3e5-a5e7a35643ae
  Events logged: 2


#### Scenario 2: Multi-Turn Conversation with Context


In [15]:
# Scenario 2: Multi-turn conversation
conv_id_s2 = str(uuid.uuid4())
ai_logger.set_conversation_id(conv_id_s2)

# Turn 1
ai_logger.log_prompt(
    prompt_text="What videos did I watch this week?",
    model_name="gpt-4o-mini",
    conversation_id=conv_id_s2,
    history_length=0
)
ai_logger.log_response(
    response_text="You watched 15 videos this week, including...",
    model_name="gpt-4o-mini",
    token_count_input=20,
    token_count_output=45,
    latency_ms=450.0,
    conversation_id=conv_id_s2
)

# Turn 2 (follow-up)
ai_logger.log_prompt(
    prompt_text="What about last week?",
    model_name="gpt-4o-mini",
    conversation_id=conv_id_s2,
    history_length=1  # Previous turn in history
)
ai_logger.log_response(
    response_text="Last week you watched 12 videos...",
    model_name="gpt-4o-mini",
    token_count_input=25,  # Includes previous context
    token_count_output=38,
    latency_ms=380.0,
    conversation_id=conv_id_s2
)

# Turn 3 (another follow-up)
ai_logger.log_prompt(
    prompt_text="Which week had more watch time?",
    model_name="gpt-4o-mini",
    conversation_id=conv_id_s2,
    history_length=2
)
ai_logger.log_response(
    response_text="This week had more watch time with 8.5 hours compared to 6.2 hours last week.",
    model_name="gpt-4o-mini",
    token_count_input=30,
    token_count_output=20,
    latency_ms=420.0,
    conversation_id=conv_id_s2
)

ai_logger.clear_conversation_id()

print(f"✓ Scenario 2 Complete - Multi-turn conversation")
logs = storage.get_conversation_logs(conv_id_s2)
prompts = [log for log in logs if log['category'] == 'prompt']
responses = [log for log in logs if log['category'] == 'response']
print(f"  Conversation ID: {conv_id_s2}")
print(f"  Total events: {len(logs)}")
print(f"  Turns: {len(prompts)} prompts, {len(responses)} responses")
# dict.get(key, 0) still returns None when the key exists with a None value (prompt rows), so use "or 0"
print(f"  Total tokens: {sum((log.get('token_count_input') or 0) + (log.get('token_count_output') or 0) for log in logs)}")


✓ Scenario 2 Complete - Multi-turn conversation
  Conversation ID: 3c023f23-7a3a-4220-b78e-7e3be76f3f72
  Total events: 6
  Turns: 3 prompts, 3 responses
  Total tokens: 178


### 3.3 Edge Cases

#### Edge Case 1: Long Text Truncation


In [16]:
# Edge Case 1: Long text truncation
conv_id_e1 = str(uuid.uuid4())
ai_logger.set_conversation_id(conv_id_e1)

# Create a very long prompt (longer than max_log_length)
long_prompt = "Tell me about " + "video " * 2000  # Very long prompt
long_response = "Here is a detailed analysis: " + "data " * 2000  # Very long response

ai_logger.log_prompt(
    prompt_text=long_prompt,
    model_name="gpt-4o-mini",
    conversation_id=conv_id_e1
)

ai_logger.log_response(
    response_text=long_response,
    model_name="gpt-4o-mini",
    token_count_input=5000,
    token_count_output=3000,
    latency_ms=2500.0,
    conversation_id=conv_id_e1
)

ai_logger.clear_conversation_id()

# Check if truncation occurred
logs = storage.get_conversation_logs(conv_id_e1)
for log in logs:
    if log.get('prompt_text'):
        prompt_len = len(log['prompt_text'])
        print(f"  Prompt length: {prompt_len} chars")
        if "[truncated" in log['prompt_text']:
            print(f"  ✓ Truncation detected in prompt")
    if log.get('response_text'):
        response_len = len(log['response_text'])
        print(f"  Response length: {response_len} chars")
        if "[truncated" in log['response_text']:
            print(f"  ✓ Truncation detected in response")

print(f"\n✓ Edge Case 1 Complete - Long text handling")


  Response length: 5026 chars
  ✓ Truncation detected in response
  Prompt length: 5026 chars
  ✓ Truncation detected in prompt

✓ Edge Case 1 Complete - Long text handling
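
For intuition, truncation to `max_log_length` could work roughly as follows. This is a sketch under assumptions: `truncate_for_log` and the exact marker text are hypothetical. The notebook only relies on the stored text containing `[truncated`, so the real `AILogger` marker may be worded differently.

```python
def truncate_for_log(text: str, max_length: int = 5000) -> str:
    """Keep the first max_length characters and append a marker noting what was dropped."""
    if len(text) <= max_length:
        return text
    dropped = len(text) - max_length
    return f"{text[:max_length]}... [truncated {dropped} chars]"


short = truncate_for_log("hello")                 # returned unchanged
long_text = truncate_for_log("video " * 2000)     # 12,000 characters before truncation
```

Recording how many characters were dropped keeps the original size visible even though the text itself is gone.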


#### Edge Case 2: High Token Usage Scenario


In [17]:
# Edge Case 2: High token usage
conv_id_e2 = str(uuid.uuid4())
ai_logger.set_conversation_id(conv_id_e2)

ai_logger.log_prompt(
    prompt_text="Analyze all my watch history in detail",
    model_name="gpt-4o-mini",
    conversation_id=conv_id_e2
)

# Simulate high token usage
ai_logger.log_response(
    response_text="Based on your extensive watch history...",
    model_name="gpt-4o-mini",
    token_count_input=8000,  # Very high input tokens
    token_count_output=4000,  # Very high output tokens
    latency_ms=5000.0,  # 5 seconds
    cost_usd=0.15,  # Estimated cost
    conversation_id=conv_id_e2
)

ai_logger.clear_conversation_id()

logs = storage.get_conversation_logs(conv_id_e2)
for log in logs:
    if log.get('token_count_input'):
        total_tokens = (log.get('token_count_input') or 0) + (log.get('token_count_output') or 0)
        cost = log.get('cost_usd') or 0  # stored value may be None
        print(f"  High token usage detected:")
        print(f"    Input tokens: {log['token_count_input']:,}")
        print(f"    Output tokens: {log['token_count_output']:,}")
        print(f"    Total tokens: {total_tokens:,}")
        print(f"    Estimated cost: ${cost:.4f}")
        print(f"    Latency: {log.get('latency_ms', 0):.2f}ms")

print(f"\n✓ Edge Case 2 Complete - High token usage")


  High token usage detected:
    Input tokens: 8,000
    Output tokens: 4,000
    Total tokens: 12,000
    Estimated cost: $0.1500
    Latency: 5000.00ms

✓ Edge Case 2 Complete - High token usage
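
High-usage requests like this one can also be flagged automatically. A minimal sketch with hypothetical thresholds and a hypothetical `flag_expensive` helper; the rows are hand-written, reusing values from Edge Case 2 and Scenario 1.

```python
# Hypothetical thresholds; tune them to the application's budget and latency targets.
MAX_TOTAL_TOKENS = 10_000
MAX_LATENCY_MS = 3_000.0


def flag_expensive(logs):
    """Return (row, reasons) pairs for rows that exceed a threshold."""
    flagged = []
    for row in logs:
        # Treat missing or None values as zero so rows without measurements are skipped.
        tokens = (row.get("token_count_input") or 0) + (row.get("token_count_output") or 0)
        latency = row.get("latency_ms") or 0.0
        reasons = []
        if tokens > MAX_TOTAL_TOKENS:
            reasons.append(f"tokens={tokens}")
        if latency > MAX_LATENCY_MS:
            reasons.append(f"latency_ms={latency:.0f}")
        if reasons:
            flagged.append((row, reasons))
    return flagged


rows = [
    {"token_count_input": 8000, "token_count_output": 4000, "latency_ms": 5000.0},
    {"token_count_input": 15, "token_count_output": 12, "latency_ms": 320.0},
    {"token_count_input": None, "token_count_output": None, "latency_ms": None},
]
flagged = flag_expensive(rows)
```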
