# Voice AI Data Analysis Demo

This notebook demonstrates how to connect to and analyze data from the Voice AI Agent.

## What you'll learn:
- How to connect to PostgreSQL and Redis
- How to query conversation data
- How to visualize session statistics
- How to test API endpoints

In [7]:
# Setup: Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sqlalchemy import create_engine
import redis
import requests
from datetime import datetime, timedelta

# Set visualization style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

print("✓ Libraries imported successfully")

✓ Libraries imported successfully


## 1. Environment Check

Define the connection settings for the services this notebook uses. The sections that follow check that each service is reachable.

In [8]:
# Connection strings for local access to Docker services
DATABASE_URL = 'postgresql://voiceai:voiceai_dev@localhost:5432/voiceai_db'
REDIS_URL = 'redis://localhost:6379'
API_BASE_URL = 'http://localhost:8000'

print("Connection Configuration:")
print(f"Database: {DATABASE_URL}")
print(f"Redis: {REDIS_URL}")
print(f"API: {API_BASE_URL}")

Connection Configuration:
Database: postgresql://voiceai:voiceai_dev@localhost:5432/voiceai_db
Redis: redis://localhost:6379
API: http://localhost:8000
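
The cell above hard-codes the connection strings and prints the database password in clear text. A minimal sketch of an alternative, assuming the values may be overridden through environment variables: the variable names `DATABASE_URL`, `REDIS_URL`, and `API_BASE_URL` and the helper `mask_password` are hypothetical and not defined by the project.

```python
import os
from urllib.parse import urlsplit, urlunsplit

# Defaults mirror the values used above; the environment variable names are assumptions.
DATABASE_URL = os.environ.get(
    "DATABASE_URL", "postgresql://voiceai:voiceai_dev@localhost:5432/voiceai_db"
)
REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379")
API_BASE_URL = os.environ.get("API_BASE_URL", "http://localhost:8000")


def mask_password(url: str) -> str:
    """Return the URL with any password replaced by '***' (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.password is None:
        return url  # nothing to hide
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username or ''}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(f"Database: {mask_password(DATABASE_URL)}")
print(f"Redis: {mask_password(REDIS_URL)}")
print(f"API: {API_BASE_URL}")
```

Masking only affects what is printed; the unmasked `DATABASE_URL` is still what gets passed to `create_engine`.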


## 2. Database Connection Test

Verify we can connect to PostgreSQL and query conversation data.

In [9]:
# Create database connection
engine = create_engine(DATABASE_URL)

# Test query: Get recent conversation sessions
query = """
SELECT 
    session_id,
    session_type,
    direction,
    status,
    created_at
FROM conversations.conversation_sessions
ORDER BY created_at DESC
LIMIT 10;
"""

try:
    df = pd.read_sql(query, engine)
    print("✓ Database connection successful")
    print(f"\nRecent conversation sessions ({len(df)}):\n")
    display(df)
except Exception as e:
    print(f"✗ Database error: {e}")
    print("\nMake sure Docker services are running: make dev")

✗ Database error: (psycopg2.errors.UndefinedTable) relation "conversations.conversation_sessions" does not exist
LINE 8: FROM conversations.conversation_sessions
             ^

[SQL: 
SELECT 
    session_id,
    session_type,
    direction,
    status,
    created_at
FROM conversations.conversation_sessions
ORDER BY created_at DESC
LIMIT 10;
]
(Background on this error at: https://sqlalche.me/e/20/f405)

Make sure Docker services are running: make dev
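
The error above is an `UndefinedTable`, which means the connection to PostgreSQL succeeded but the relation `conversations.conversation_sessions` was not found. Possible causes include the schema not having been created yet or the table living under a different schema. The sketch below uses SQLAlchemy's inspector to list which schemas, if any, contain the table. `schemas_containing` is a hypothetical helper, not part of the project.

```python
def schemas_containing(inspector, table_name):
    """Return the schemas in which `table_name` exists.

    `inspector` is expected to be a SQLAlchemy Inspector, i.e. the result
    of `sqlalchemy.inspect(engine)`.
    """
    return [
        schema
        for schema in inspector.get_schema_names()
        if table_name in inspector.get_table_names(schema=schema)
    ]


# Usage in this notebook (requires the `engine` created above):
# from sqlalchemy import inspect
# print(schemas_containing(inspect(engine), "conversation_sessions"))
```

An empty list would suggest the table has not been created in this database at all. A result other than `['conversations']` would suggest the query should reference a different schema.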


## 3. Redis Connection Test

Test the connection to Redis, which is used for session state management.

In [None]:
# Connect to Redis
r = redis.from_url(REDIS_URL)

try:
    # Test connection
    r.ping()
    print("✓ Redis connection successful")
    
    # Get some stats
    info = r.info('stats')
    print("\nRedis Stats:")
    print(f"  Total connections: {info.get('total_connections_received', 0)}")
    print(f"  Total commands: {info.get('total_commands_processed', 0)}")
    
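    # Count conversation session keys with SCAN, which iterates incrementally
    # (KEYS walks the whole keyspace in one blocking call)
    session_count = sum(1 for _ in r.scan_iter(match='conversation:session:*'))
    print(f"\nActive conversation sessions: {session_count}")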
    
except Exception as e:
    print(f"✗ Redis error: {e}")
    print("\nMake sure Docker services are running: make dev")
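
Beyond counting keys, it can help to look at what a few session keys actually hold. The sketch below returns the Redis type and remaining time-to-live for a small sample of matching keys. `describe_keys` is a hypothetical helper, and the key pattern is the one used in the cell above.

```python
from itertools import islice


def describe_keys(client, pattern, limit=5):
    """Return up to `limit` (key, type, ttl_seconds) tuples for keys matching `pattern`.

    `client` is expected to be a redis-py client such as `r` above.
    A TTL of -1 means the key has no expiry; -2 means it no longer exists.
    """
    return [
        (key, client.type(key), client.ttl(key))
        for key in islice(client.scan_iter(match=pattern), limit)
    ]


# Usage in this notebook (requires the `r` client created above):
# for key, kind, ttl in describe_keys(r, 'conversation:session:*'):
#     print(key, kind, ttl)
```

With a client created by `redis.from_url(REDIS_URL)` and no `decode_responses=True`, keys and type names come back as `bytes`.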

## 4. API Health Check

Check that the API is reachable by calling its health endpoint.

In [None]:
try:
    response = requests.get(f'{API_BASE_URL}/api/v1/health', timeout=5)
    
    if response.ok:
        print("✓ API is healthy")
        print("\nResponse:")
        print(response.json())
    else:
        print(f"✗ API returned status {response.status_code}")
        
except requests.exceptions.ConnectionError:
    print("✗ Could not connect to API")
    print("\nMake sure Docker services are running: make dev")
except Exception as e:
    print(f"✗ Error: {e}")
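
If the services were only just started, the API may need a few seconds before it accepts connections. The sketch below polls the health endpoint a few times before giving up. `wait_for_health` and its parameters are hypothetical, and the retry count and delay are arbitrary defaults.

```python
import time


def wait_for_health(url, attempts=5, delay=2.0, get=None, sleep=time.sleep):
    """Poll `url` until the response status is below 400.

    Returns the attempt number that succeeded, or None if every attempt failed.
    """
    if get is None:
        import requests  # imported lazily so a substitute `get` can be passed in

        get = requests.get
    for attempt in range(1, attempts + 1):
        try:
            if get(url, timeout=5).ok:
                return attempt
        except OSError:
            # requests' ConnectionError and Timeout both derive from OSError
            pass
        if attempt < attempts:
            sleep(delay)
    return None


# Usage in this notebook:
# attempt = wait_for_health(f'{API_BASE_URL}/api/v1/health')
# print("✓ API is healthy" if attempt else "✗ API did not become healthy")
```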

## 5. Session Statistics

Aggregate conversation sessions per day over the last 30 days, then plot total and completed sessions over time.

In [None]:
# Get session statistics
query = """
SELECT 
    DATE(created_at) as date,
    COUNT(*) as total_sessions,
    COUNT(CASE WHEN status = 'completed' THEN 1 END) as completed_sessions,
    COUNT(CASE WHEN status = 'active' THEN 1 END) as active_sessions,
    COUNT(CASE WHEN direction = 'inbound' THEN 1 END) as inbound_calls,
    COUNT(CASE WHEN direction = 'outbound' THEN 1 END) as outbound_calls
FROM conversations.conversation_sessions
WHERE created_at >= NOW() - INTERVAL '30 days'
GROUP BY DATE(created_at)
ORDER BY date DESC;
"""

try:
    df_sessions = pd.read_sql(query, engine)
    
    if len(df_sessions) > 0:
        print(f"Sessions in last 30 days: {df_sessions['total_sessions'].sum()}")
        display(df_sessions.head(10))
        
        # Visualize
        fig, ax = plt.subplots(figsize=(12, 5))
        ax.plot(df_sessions['date'], df_sessions['total_sessions'], marker='o', linewidth=2, label='Total')
        ax.plot(df_sessions['date'], df_sessions['completed_sessions'], marker='s', linewidth=2, label='Completed')
        ax.set_title('Session Volume Over Time', fontsize=14, fontweight='bold')
        ax.set_xlabel('Date')
        ax.set_ylabel('Number of Sessions')
        ax.legend()
        ax.grid(True, alpha=0.3)
        plt.xticks(rotation=45)
        plt.tight_layout()
        plt.show()
    else:
        print("No session data available yet")
        
except Exception as e:
    print(f"Error querying sessions: {e}")
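
The daily counts above can also be turned into a completion rate. This is a minimal sketch, assuming a DataFrame with the `date`, `total_sessions`, and `completed_sessions` columns produced by the query above. `add_completion_rate` is a hypothetical helper.

```python
import pandas as pd


def add_completion_rate(df: pd.DataFrame) -> pd.DataFrame:
    """Return a date-sorted copy of `df` with a `completion_rate` column.

    The rate is completed / total, left as NaN for days with no sessions.
    """
    out = df.sort_values("date").reset_index(drop=True)
    total = out["total_sessions"]
    out["completion_rate"] = (out["completed_sessions"] / total).where(total > 0)
    return out


# Usage in this notebook (requires `df_sessions` from the cell above):
# display(add_completion_rate(df_sessions))
```

Sorting by date in ascending order also makes the frame easier to read alongside the time-series plot, since the query returns the newest days first.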

## 6. Test TTS API Endpoint

Test the text-to-speech (TTS) endpoint. This step is optional and requires the API to be running.

In [None]:
# Test TTS endpoint
test_text = "Hello! This is a test of the voice synthesis system."

try:
    response = requests.post(
        f'{API_BASE_URL}/api/v1/voice/synthesize',
        json={'text': test_text, 'voice_id': 'default'},
        timeout=30
    )
    
    if response.ok:
        print("✓ TTS synthesis successful")
        print(f"Audio size: {len(response.content)} bytes")
        
        # Save audio file
        with open('test_tts.wav', 'wb') as f:
            f.write(response.content)
        print("Audio saved to: test_tts.wav")
    else:
        print(f"✗ TTS failed with status {response.status_code}")
        print(response.text)
        
except Exception as e:
    print(f"Error testing TTS: {e}")
    print("This is optional - you can skip if the API isn't fully configured")
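
The cell above always saves the response as `test_tts.wav`, which assumes the endpoint returns WAV audio. The audio format is not stated in this notebook, so the sketch below picks a file extension from the response's `Content-Type` header instead. `audio_extension` and the mapping are hypothetical and cover only a few common media types.

```python
# Common audio media types; extend for whatever formats the API returns.
AUDIO_EXTENSIONS = {
    "audio/wav": ".wav",
    "audio/x-wav": ".wav",
    "audio/wave": ".wav",
    "audio/mpeg": ".mp3",
    "audio/ogg": ".ogg",
}


def audio_extension(content_type, default=".bin"):
    """Map a Content-Type header value to a file extension."""
    # Drop parameters such as "; charset=..." and normalize case
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    return AUDIO_EXTENSIONS.get(media_type, default)


# Usage in this notebook (requires `response` from the cell above):
# filename = f"test_tts{audio_extension(response.headers.get('Content-Type'))}"
# with open(filename, 'wb') as f:
#     f.write(response.content)
```

Falling back to `.bin` for unknown types avoids labeling a JSON error body or an unexpected format as audio.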

## 7. Custom Query Section

Use this section for your own custom analysis queries.

In [None]:
# Your custom query here
custom_query = """
-- Example: Count sessions by type
SELECT 
    session_type,
    COUNT(*) as count
FROM conversations.conversation_sessions
GROUP BY session_type
ORDER BY count DESC;
"""

try:
    df_custom = pd.read_sql(custom_query, engine)
    display(df_custom)
except Exception as e:
    print(f"Query error: {e}")

## Next Steps

Explore more notebooks:
- `data_analysis/conversation_analytics.ipynb` - Analyze conversation patterns
- `model_testing/whisper_testing.ipynb` - Analyze transcription data
- `experiments/` - Create your own experiments!

Check `notebooks/README.md` for more examples and connection details.

Happy analyzing! 🚀