# 7. System Integration - Pol.is Math Python Implementation

This notebook walks through the system integration layer of the Pol.is math Python conversion. It shows how the components work together: configuration, database connections, the conversation manager, the HTTP server, the poller, and the system manager that orchestrates them.

In [None]:
import os
import time
import tempfile
import threading
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from polismath.components.config import ConfigManager
from polismath.components.server import Server
from polismath.database.postgres import PostgresManager
from polismath.conversation import ConversationManager
from polismath.poller import Poller
from polismath.system import SystemManager

## 7.1 Configuration Management

The `ConfigManager` provides centralized configuration management for the Pol.is system.

In [None]:
# Create a temporary directory for data storage
data_dir = tempfile.mkdtemp()
print(f"Created temporary data directory: {data_dir}")

# Define a configuration for testing
test_config = {
    # Environment
    'math_env': 'dev',
    
    # Data storage
    'data_dir': data_dir,
    
    # Database settings - disabled for testing
    'database': {
        'enabled': False,
        'host': 'localhost',
        'port': 5432,
        'dbname': 'polis_test',
        'user': 'postgres',
        'password': 'postgres'
    },
    
    # Server settings
    'server': {
        'enabled': True,
        'host': '127.0.0.1',
        'port': 8000,
        'log_level': 'info',
        'workers': 1
    },
    
    # Polling settings - disabled for testing
    'poller': {
        'enabled': False,
        'vote_interval_ms': 1000,
        'mod_interval_ms': 10000,
        'task_interval_ms': 10000,
        'task_process_interval_ms': 1000
    },
    
    # Compute settings
    'compute': {
        'n_clusters': 3,
        'pca_iters': 10,
        'auto_k': True,
        'max_ptpts_per_group': 60000,  # Large enough for testing
        'vote_threshold': 7,
        'mod_threshold': 1
    },
    
    # Logging settings
    'logging': {
        'level': 'INFO',
        'format': '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
    }
}

# Initialize the configuration manager
config = ConfigManager.get_config(test_config)
print("Configuration initialized successfully")
print(f"Math environment: {config['math_env']}")
print(f"Data directory: {config['data_dir']}")
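
This notebook does not show how `ConfigManager` combines a partial configuration with its defaults. One common approach is a recursive merge, sketched below purely as an illustration. `deep_merge` and `DEFAULTS` are hypothetical names and are not part of `polismath`.

```python
from copy import deepcopy


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Return a new dict with `overrides` layered on top of `defaults`."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge them key by key
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and new keys replace the default outright
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults, loosely modeled on the settings shown above
DEFAULTS = {
    "math_env": "dev",
    "server": {"enabled": True, "host": "127.0.0.1", "port": 8000},
    "poller": {"enabled": True, "vote_interval_ms": 1000},
}

overrides = {"server": {"port": 8001}, "poller": {"enabled": False}}
merged = deep_merge(DEFAULTS, overrides)

print(merged["server"])
print(merged["poller"])
```

With a merge like this, a short override such as `{'poller': {'enabled': False}}` leaves the remaining poller settings at their defaults instead of dropping them.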

## 7.2 Database Integration

The `PostgresManager` provides database integration for the Pol.is system. For testing purposes, we won't connect to a real database.

In [None]:
# Initialize the database manager (won't actually connect since enabled=False)
db_manager = PostgresManager(config)
print("Database manager initialized (disabled for testing)")

# Check connection settings (would be used if enabled)
print("Database connection settings:")
print(f"  Host: {config['database']['host']}")
print(f"  Port: {config['database']['port']}")
print(f"  Database: {config['database']['dbname']}")
print(f"  User: {config['database']['user']}")
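
When the database is enabled, settings like these are typically turned into a PostgreSQL connection URI. The sketch below is an assumption about how that could look, not a description of `PostgresManager`. `build_postgres_uri` is a hypothetical helper, and the password value is made up to show why URL-encoding matters.

```python
from urllib.parse import quote


def build_postgres_uri(db_config: dict, mask_password: bool = False) -> str:
    """Build a postgresql:// URI from a settings dict shaped like config['database']."""
    user = quote(db_config["user"], safe="")
    dbname = quote(db_config["dbname"], safe="")
    # Percent-encode the password so characters such as '@' or '/' cannot break the URI
    password = "***" if mask_password else quote(db_config["password"], safe="")
    return f"postgresql://{user}:{password}@{db_config['host']}:{db_config['port']}/{dbname}"


db_settings = {
    "host": "localhost",
    "port": 5432,
    "dbname": "polis_test",
    "user": "postgres",
    "password": "p@ss/word",  # made-up value for illustration
}

uri = build_postgres_uri(db_settings)
safe_uri = build_postgres_uri(db_settings, mask_password=True)
print(safe_uri)  # log the masked form, never the real password
```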

## 7.3 Conversation Manager

The `ConversationManager` manages conversations and integrates with the database when enabled.

In [None]:
# Initialize the conversation manager
conv_manager = ConversationManager(data_dir=config['data_dir'], db_manager=db_manager)
print("Conversation manager initialized")
print(f"Data directory: {conv_manager.data_dir}")
print(f"Using database: {conv_manager.db_manager is not None}")

# Create a test conversation
test_conv_id = "system-test-" + str(int(time.time()))
conv_manager.create_conversation(test_conv_id)
print(f"\nCreated test conversation with ID: {test_conv_id}")

# Get the conversation
test_conv = conv_manager.get_conversation(test_conv_id)
print(f"Retrieved conversation: {test_conv}")
print(f"Initial participant count: {test_conv.participant_count}")
print(f"Initial comment count: {test_conv.comment_count}")

## 7.4 Processing Votes

Let's add some votes to the test conversation and see how the system processes them.

In [None]:
# Define some test votes
import random
random.seed(42)  # For reproducibility

# Generate votes with two distinct opinion groups
num_participants = 100
num_comments = 20
participant_ids = [f"p{i}" for i in range(num_participants)]
comment_ids = [f"c{i}" for i in range(num_comments)]

test_votes = {"votes": []}

for p_idx, pid in enumerate(participant_ids):
    # First group tends to agree with first half of comments, second group with second half
    group = 0 if p_idx < 50 else 1
    
    for c_idx, cid in enumerate(comment_ids):
        # Determine tendency to agree based on group
        if (group == 0 and c_idx < 10) or (group == 1 and c_idx >= 10):
            agree_prob = 0.8  # High probability of agreement
        else:
            agree_prob = 0.2  # Low probability of agreement
        
        # Randomly determine the vote (1 = agree, -1 = disagree); otherwise record no vote
        r = random.random()
        if r < agree_prob:
            vote = 1
        elif r < agree_prob + 0.15:
            vote = -1
        else:
            continue  # No vote recorded for this participant/comment pair
        
        # Add vote
        test_votes["votes"].append({
            "pid": pid,
            "tid": cid,
            "vote": vote
        })

test_votes["lastVoteTimestamp"] = int(time.time() * 1000)  # Current time in milliseconds

print(f"Generated {len(test_votes['votes'])} test votes")

# Process the votes
print("\nProcessing votes...")
updated_conv = conv_manager.process_votes(test_conv_id, test_votes)

print("Votes processed successfully:")
print(f"Updated participant count: {updated_conv.participant_count}")
print(f"Updated comment count: {updated_conv.comment_count}")
print(f"Total votes: {np.sum(~np.isnan(updated_conv.raw_rating_mat.matrix))}")
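
The vote count above relies on missing votes being stored as `NaN` in the rating matrix. The sketch below shows one way a participant-by-comment matrix with that property can be built from vote records. It is a simplified stand-in under that assumption, and `build_rating_matrix` is a hypothetical helper, not the `polismath` implementation.

```python
import numpy as np


def build_rating_matrix(votes):
    """Return (matrix, pids, tids) with one row per participant and one column per comment."""
    pids = sorted({v["pid"] for v in votes})
    tids = sorted({v["tid"] for v in votes})
    row = {pid: i for i, pid in enumerate(pids)}
    col = {tid: j for j, tid in enumerate(tids)}
    # Start with NaN everywhere so that "no vote" stays distinguishable from any vote value
    matrix = np.full((len(pids), len(tids)), np.nan)
    for v in votes:
        matrix[row[v["pid"]], col[v["tid"]]] = v["vote"]
    return matrix, pids, tids


sample_votes = [
    {"pid": "p0", "tid": "c0", "vote": 1},
    {"pid": "p0", "tid": "c1", "vote": -1},
    {"pid": "p1", "tid": "c0", "vote": 1},
    {"pid": "p2", "tid": "c2", "vote": -1},
]

matrix, pids, tids = build_rating_matrix(sample_votes)
n_votes = int(np.sum(~np.isnan(matrix)))
print(matrix)
print(f"Recorded votes: {n_votes} of {matrix.size} cells")
```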

## 7.5 Examining Computation Results

Let's check if PCA, clustering, and representativeness were computed after adding the votes.

In [None]:
# Check computation results
print("Computation Results:")
print(f"PCA computed: {updated_conv.pca is not None}")
print(f"Clustering computed: {updated_conv.group_clusters is not None}")
print(f"Representativeness computed: {updated_conv.repness is not None}")

# If clustering was computed, show the clusters
if updated_conv.group_clusters is not None:
    print("\nCluster Results:")
    for i, cluster in enumerate(updated_conv.group_clusters):
        print(f"Cluster {i}: {len(cluster)} participants")
        print(f"  First 5 members: {cluster[:5]}")

# If representativeness was computed, show the top representative comments
if updated_conv.repness is not None and 'group_repness' in updated_conv.repness:
    print("\nRepresentative Comments:")
    for group_id, comments in updated_conv.repness['group_repness'].items():
        print(f"\nTop Representative Comments for Group {group_id}:")
        
        # Show top agrees
        agrees = sorted([c for c in comments if c['repful'] == 'agree'], 
                       key=lambda x: abs(x['repness_z']), reverse=True)
        print("Top 'Agree' comments:")
        for i, comment in enumerate(agrees[:3]):
            print(f"  {i+1}. Comment {comment['comment_id']}: z-score={comment['repness_z']:.3f}")
        
        # Show top disagrees
        disagrees = sorted([c for c in comments if c['repful'] == 'disagree'], 
                          key=lambda x: abs(x['repness_z']), reverse=True)
        print("Top 'Disagree' comments:")
        for i, comment in enumerate(disagrees[:3]):
            print(f"  {i+1}. Comment {comment['comment_id']}: z-score={comment['repness_z']:.3f}")
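
To make the PCA step more concrete, the sketch below projects a tiny rating matrix onto its first two principal components. It swaps in a plain SVD with column-mean imputation for missing votes. The `polismath` implementation may well differ (the `pca_iters` setting hints at an iterative method), so treat this as an illustration of the idea only.

```python
import numpy as np

# Six participants, four comments; NaN marks a missing vote
ratings = np.array([
    [1.0, 1.0, -1.0, -1.0],
    [1.0, 1.0, -1.0, -1.0],
    [1.0, np.nan, -1.0, -1.0],
    [-1.0, -1.0, 1.0, 1.0],
    [-1.0, -1.0, 1.0, 1.0],
    [-1.0, -1.0, 1.0, 1.0],
])

# Fill each missing vote with the mean of the votes cast on that comment
col_means = np.nanmean(ratings, axis=0)
filled = np.where(np.isnan(ratings), col_means, ratings)

# Center the columns, then take the top two right singular vectors as components
centered = filled - filled.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
projection = centered @ vt[:2].T
first_pc = projection[:, 0]

print(np.round(projection, 3))
```

The first three participants land on one side of the first component and the last three on the other, which is the kind of separation the clustering step then picks up. The sign of a component is arbitrary, so only the relative positions are meaningful.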

## 7.6 Visualizing the Results

Let's visualize the PCA and clustering results if they were computed.

In [None]:
# Visualize PCA and clustering results if available
if updated_conv.pca is not None and 'projection' in updated_conv.pca and updated_conv.group_clusters is not None:
    # Extract the projection coordinates
    proj_dict = updated_conv.pca['projection']
    
    # Create a mapping from participant ID to cluster
    id_to_cluster = {}
    for cluster_idx, cluster_members in enumerate(updated_conv.group_clusters):
        for pid in cluster_members:
            id_to_cluster[pid] = cluster_idx
    
    # Extract the projection coordinates and assigned clusters
    x_coords = []
    y_coords = []
    assigned_clusters = []
    true_groups = []
    labels = []
    
    for p_id, projection in proj_dict.items():
        if len(projection) >= 2:  # Make sure we have at least 2D projection
            x_coords.append(projection[0])
            y_coords.append(projection[1])
            assigned_clusters.append(id_to_cluster.get(p_id, -1))  # -1 if not in any cluster
            labels.append(p_id)
            
            # Determine true group based on participant ID
            p_idx = int(p_id[1:])  # Extract the index from "p{idx}"
            true_groups.append(0 if p_idx < 50 else 1)
    
    # Create scatter plots for true groups and detected clusters
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(18, 8))
    
    # Plot with true groups
    colors_true = ["blue" if g == 0 else "red" for g in true_groups]
    scatter1 = ax1.scatter(x_coords, y_coords, c=colors_true, alpha=0.6, s=50)
    ax1.set_title("Participants Colored by True Groups")
    ax1.set_xlabel("Principal Component 1")
    ax1.set_ylabel("Principal Component 2")
    from matplotlib.lines import Line2D
    legend_elements1 = [
        Line2D([0], [0], marker='o', color='w', markerfacecolor='blue', markersize=10, label='True Group 1'),
        Line2D([0], [0], marker='o', color='w', markerfacecolor='red', markersize=10, label='True Group 2')
    ]
    ax1.legend(handles=legend_elements1)
    ax1.grid(True, linestyle="--", alpha=0.7)
    
    # Plot with detected clusters
    num_clusters = max(assigned_clusters) + 1 if assigned_clusters else 0
    cluster_colors = plt.cm.tab10(np.linspace(0, 1, num_clusters)) if num_clusters > 0 else []
    colors_cluster = [cluster_colors[c] if c >= 0 and c < len(cluster_colors) else (0.7, 0.7, 0.7, 1.0) 
                     for c in assigned_clusters]
    scatter2 = ax2.scatter(x_coords, y_coords, c=colors_cluster, alpha=0.6, s=50)
    ax2.set_title("Participants Colored by Detected Clusters")
    ax2.set_xlabel("Principal Component 1")
    ax2.set_ylabel("Principal Component 2")
    legend_elements2 = []
    for i in range(num_clusters):
        legend_elements2.append(Line2D([0], [0], marker='o', color='w', 
                                       markerfacecolor=cluster_colors[i], 
                                       markersize=10, label=f'Cluster {i}'))
    ax2.legend(handles=legend_elements2)
    ax2.grid(True, linestyle="--", alpha=0.7)
    
    plt.tight_layout()
    plt.show()
else:
    print("PCA and clustering results not available for visualization.")
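
The two plots compare true groups and detected clusters by eye. A simple numeric companion is cluster purity: the fraction of participants whose cluster's majority true group matches their own. The sketch below is an addition for illustration, with `cluster_purity` as a hypothetical helper and made-up sample labels.

```python
from collections import Counter


def cluster_purity(true_groups, assigned_clusters):
    """Fraction of items that belong to the majority true group of their cluster."""
    if not true_groups or len(true_groups) != len(assigned_clusters):
        raise ValueError("inputs must be non-empty and of equal length")
    members = {}
    for truth, cluster in zip(true_groups, assigned_clusters):
        members.setdefault(cluster, []).append(truth)
    # For each cluster, count how many members share its most common true group
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(true_groups)


# Made-up labels: cluster 2 is pure, cluster 0 mixes one participant from group 0
sample_true = [0, 0, 0, 1, 1, 1]
sample_assigned = [2, 2, 0, 0, 0, 0]
purity = cluster_purity(sample_true, sample_assigned)
print(f"Purity: {purity:.3f}")
```

In the notebook, `true_groups` and `assigned_clusters` from the plotting cell could be passed in directly. Note that this sketch treats the `-1` marker for unclustered participants as a cluster of its own.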

## 7.7 Server Component

The `Server` component provides HTTP endpoints for interacting with the Pol.is system. For demonstration purposes, we'll initialize it but not actually start it.

In [None]:
# Initialize the server but don't start it (to avoid port conflicts)
server = Server(config, conv_manager=conv_manager)
print("Server initialized (not started)")
print(f"Server would run at: http://{config['server']['host']}:{config['server']['port']}")

# List the available API endpoints
print("\nAvailable API Endpoints:")
print("- GET /health: Health check")
print("- POST /api/v3/votes/{conversation_id}: Process votes for a conversation")
print("- POST /api/v3/moderation/{conversation_id}: Update moderation settings")
print("- POST /api/v3/math/{conversation_id}: Recompute math results")
print("- GET /api/v3/conversations/{conversation_id}: Get conversation data")
print("- GET /api/v3/conversations: List all conversations")
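
To illustrate what serving and calling a health endpoint involves, the sketch below runs a throwaway HTTP server from the standard library in a background thread and queries `/health`. It is not the `Server` component: the handler, the response body, and the use of port `0` (let the OS pick a free port) are all assumptions made for the example.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the notebook output quiet


# Port 0 asks the OS for a free port, which avoids conflicts with anything already running
httpd = ThreadingHTTPServer(("127.0.0.1", 0), HealthHandler)
port = httpd.server_address[1]
thread = threading.Thread(target=httpd.serve_forever, daemon=True)
thread.start()

try:
    # Bypass any proxy settings from the environment for this local request
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"http://127.0.0.1:{port}/health", timeout=5) as resp:
        status = resp.status
        payload = json.loads(resp.read())
finally:
    httpd.shutdown()       # stop serve_forever
    httpd.server_close()   # release the socket
    thread.join(timeout=5)

print(status, payload)
```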

## 7.8 Poller Component

The `Poller` component polls for new votes, moderation changes, and tasks from the database. For demonstration purposes, we'll initialize it but not actually start it.

In [None]:
# Initialize the poller but don't start it
poller = Poller(config, db_manager=db_manager, conv_manager=conv_manager)
print("Poller initialized (not started)")
print(f"Vote polling interval: {config['poller']['vote_interval_ms']} ms")
print(f"Moderation polling interval: {config['poller']['mod_interval_ms']} ms")
print(f"Task polling interval: {config['poller']['task_interval_ms']} ms")

# Explain the poller's role
print("\nPoller Responsibilities:")
print("1. Poll for new votes from the database and process them")
print("2. Poll for moderation changes and apply them to conversations")
print("3. Poll for tasks (like recomputing math) and execute them")
print("4. Process tasks in the background")
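
The polling pattern itself can be sketched with a background thread and a `threading.Event`, which provides both the interval wait and a prompt way to stop. `IntervalPoller` below is a hypothetical, simplified stand-in for illustration; it does not reflect how `Poller` is implemented or how it talks to the database.

```python
import threading
import time


class IntervalPoller:
    """Call `task` every `interval_ms` milliseconds until stopped."""

    def __init__(self, task, interval_ms):
        self._task = task
        self._interval_s = interval_ms / 1000.0
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait acts as an interruptible sleep: it returns True once stop() is called
        while not self._stop_event.wait(self._interval_s):
            self._task()

    def start(self):
        self._thread.start()

    def stop(self, timeout=2.0):
        self._stop_event.set()
        self._thread.join(timeout)

    @property
    def running(self):
        return self._thread.is_alive()


calls = []
demo_poller = IntervalPoller(lambda: calls.append(time.monotonic()), interval_ms=10)
demo_poller.start()
time.sleep(0.2)  # let a few polling cycles run
demo_poller.stop()

print(f"Task ran {len(calls)} times; still running: {demo_poller.running}")
```

Waiting on the event rather than calling `time.sleep` means a stop request takes effect immediately, even with long intervals such as the 10-second moderation poll configured above.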

## 7.9 System Manager

The `SystemManager` orchestrates all the components and provides a unified interface for starting and stopping the system.

In [None]:
# Initialize the system manager but don't actually start the system
# This just demonstrates the component initialization
print("Initializing System Manager (components won't actually start)...")

# Turn off server and poller in config to avoid starting threads
config['server']['enabled'] = False
config['poller']['enabled'] = False

# Initialize the system manager
system = SystemManager(config)
print("System Manager initialized")

# List the system components
print("\nSystem Components:")
print("- Configuration Manager: Provides centralized configuration")
print(f"- Database Manager: {'Enabled' if config['database']['enabled'] else 'Disabled'} - Connects to PostgreSQL")
print("- Conversation Manager: Manages conversation state and computation")
print(f"- Server: {'Enabled' if config['server']['enabled'] else 'Disabled'} - Provides HTTP API")
print(f"- Poller: {'Enabled' if config['poller']['enabled'] else 'Disabled'} - Polls for updates")

# Explain the system starting process
print("\nSystem Starting Process:")
print("1. Initialize configuration")
print("2. Initialize database connection if enabled")
print("3. Initialize conversation manager")
print("4. Start server in a separate thread if enabled")
print("5. Start poller in a separate thread if enabled")
print("6. Wait for shutdown signal")

# Note about actually starting the system
print("\nTo actually start the system:")
print("system = SystemManager.start(config)  # Static method")
print("# Use the system")
print("SystemManager.stop()  # Static method to stop")
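
The starting process listed above follows a common orchestration pattern: start the enabled components in order, and stop them in reverse order on shutdown. The sketch below illustrates that pattern with hypothetical `Component` and `Orchestrator` classes. It makes no claim about how `SystemManager` is written.

```python
class Component:
    """Stand-in for a server, poller, or database manager."""

    def __init__(self, name, log):
        self.name = name
        self._log = log

    def start(self):
        self._log.append(f"start {self.name}")

    def stop(self):
        self._log.append(f"stop {self.name}")


class Orchestrator:
    def __init__(self, settings, components):
        # Keep only the components whose settings section has enabled=True
        self._active = [
            c for c in components if settings.get(c.name, {}).get("enabled", False)
        ]
        self._started = []

    def start(self):
        for component in self._active:
            component.start()
            self._started.append(component)

    def stop(self):
        # Stop in reverse start order so later components shut down first
        while self._started:
            self._started.pop().stop()


log = []
demo_settings = {
    "database": {"enabled": False},
    "server": {"enabled": True},
    "poller": {"enabled": True},
}
components = [Component(name, log) for name in ("database", "server", "poller")]

orchestrator = Orchestrator(demo_settings, components)
orchestrator.start()
orchestrator.stop()
print(log)
```

Tracking which components were actually started keeps `stop()` safe to call from a `finally` block, even if startup failed partway through.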

## 7.10 Creating a Simplified Working Example

Let's create a simplified working example of the system integration by running a local server in a separate thread for a short time.

In [None]:
# Create a very simple demo showing system initialization and shutdown
def run_demo_system():
    # Create a temporary directory for data
    demo_data_dir = tempfile.mkdtemp()
    print(f"Created demo data directory: {demo_data_dir}")
    
    # Create a demo configuration
    demo_config = {
        'math_env': 'dev',
        'data_dir': demo_data_dir,
        'database': {'enabled': False},
        'server': {
            'enabled': True,
            'host': '127.0.0.1',
            'port': 8001,  # Different port to avoid conflicts
            'log_level': 'info',
            'workers': 1
        },
        'poller': {'enabled': False},
        'compute': {
            'n_clusters': 3,
            'pca_iters': 10,
            'auto_k': True,
            'vote_threshold': 7,
            'mod_threshold': 1
        },
        'logging': {'level': 'INFO'}
    }
    
    try:
        print("Starting demo system...")
        # Initialize the system; the server thread is started explicitly further down
        system = SystemManager.init(demo_config)
        
        # Create a conversation
        demo_conv_id = "demo-system-" + str(int(time.time()))
        print(f"Creating conversation {demo_conv_id}...")
        system.conversation_manager.create_conversation(demo_conv_id)
        
        # Add some votes
        demo_votes = {
            "votes": [
                {"pid": "demo-p1", "tid": "demo-c1", "vote": 1},
                {"pid": "demo-p1", "tid": "demo-c2", "vote": -1},
                {"pid": "demo-p2", "tid": "demo-c1", "vote": 1},
                {"pid": "demo-p2", "tid": "demo-c3", "vote": 1},
                {"pid": "demo-p3", "tid": "demo-c2", "vote": -1},
                {"pid": "demo-p3", "tid": "demo-c3", "vote": -1}
            ]
        }
        print("Processing votes...")
        updated_demo_conv = system.conversation_manager.process_votes(demo_conv_id, demo_votes)
        
        print(f"Conversation updated: {updated_demo_conv.participant_count} participants, "
              f"{updated_demo_conv.comment_count} comments")
        
        # Start server in a thread
        if system.server:
            server_thread = threading.Thread(target=system.server.run)
            server_thread.daemon = True
            server_thread.start()
            print(f"Server started at http://{demo_config['server']['host']}:{demo_config['server']['port']}")
            print("Server would now be available for API requests")
        
        # Wait a short time
        print("Demo system running. Waiting 2 seconds...")
        time.sleep(2)
        
    finally:
        # Clean up
        print("Stopping demo system...")
        SystemManager.stop()
        print("Demo system stopped")

# Run the demo
run_demo_system()

## 7.11 Detailed API Usage Example

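Let's explore how a client would interact with the Pol.is math system through its API by showing example client code for common operations. The code is stored in a string and printed rather than executed, since no server is running in this notebook.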

In [None]:
# API Usage Examples

# Example client code, kept in a string so that the notebook does not issue real requests.
# In a real application, you would run it with an HTTP client library such as requests.

api_examples = """
# Python HTTP Client Example (using requests)
import requests

SERVER_URL = "http://localhost:8000"
BASE_URL = f"{SERVER_URL}/api/v3"

# 1. Health check (served at /health, outside the /api/v3 prefix)
response = requests.get(f"{SERVER_URL}/health")
print("Health check:", response.json())

# 2. Create a new conversation (implicit from first vote)
conversation_id = "my-conversation-123"

# 3. Submit votes
votes_data = {
    "votes": [
        {"pid": "participant1", "tid": "comment1", "vote": 1},
        {"pid": "participant1", "tid": "comment2", "vote": -1},
        {"pid": "participant2", "tid": "comment1", "vote": 1},
        {"pid": "participant2", "tid": "comment3", "vote": 1},
    ]
}
response = requests.post(f"{BASE_URL}/votes/{conversation_id}", json=votes_data)
print("Submit votes:", response.json())

# 4. Apply moderation
moderation_data = {
    "mod_out_tids": ["comment2"],  # Exclude comment2
    "mod_in_tids": ["comment1"],    # Feature comment1
    "meta_tids": [],                # No meta comments
    "mod_out_ptpts": []             # No excluded participants
}
response = requests.post(f"{BASE_URL}/moderation/{conversation_id}", json=moderation_data)
print("Apply moderation:", response.json())

# 5. Force recomputation
response = requests.post(f"{BASE_URL}/math/{conversation_id}")
print("Force recomputation:", response.json())

# 6. Get conversation data
response = requests.get(f"{BASE_URL}/conversations/{conversation_id}")
conversation_data = response.json()
print("Conversation participants:", len(conversation_data.get('participants', [])))
print("Conversation comments:", len(conversation_data.get('comments', [])))
print("Conversation groups:", len(conversation_data.get('groups', [])))

# 7. List all conversations
response = requests.get(f"{BASE_URL}/conversations")
all_conversations = response.json()
print("All conversations:", [conv['conversation_id'] for conv in all_conversations])
"""

print("API Usage Example:")
print(api_examples)
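
Before posting votes, a client may want to check the payload locally. The sketch below validates the shape used in the examples above (`pid`, `tid`, `vote`). `validate_votes_payload` is a hypothetical helper, and the default set of allowed vote values is an assumption based on the values that appear in this notebook; the server may accept others.

```python
def validate_votes_payload(payload, allowed_votes=(-1, 1)):
    """Return a list of problems; an empty list means the payload looks well-formed."""
    votes = payload.get("votes")
    if not isinstance(votes, list):
        return ["'votes' must be a list"]
    problems = []
    for index, vote in enumerate(votes):
        if not isinstance(vote, dict):
            problems.append(f"votes[{index}]: expected an object")
            continue
        missing = [key for key in ("pid", "tid", "vote") if key not in vote]
        if missing:
            problems.append(f"votes[{index}]: missing {', '.join(missing)}")
        elif vote["vote"] not in allowed_votes:
            problems.append(f"votes[{index}]: unsupported vote value {vote['vote']!r}")
    return problems


good_payload = {"votes": [{"pid": "participant1", "tid": "comment1", "vote": 1}]}
bad_payload = {
    "votes": [
        {"pid": "participant1", "tid": "comment1", "vote": 2},
        {"pid": "participant2", "vote": 1},
    ]
}

good_problems = validate_votes_payload(good_payload)
bad_problems = validate_votes_payload(bad_payload)
print(good_problems)
print(bad_problems)
```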

## 7.12 Summary

The Pol.is system integration components provide:

1. Centralized configuration management through the `ConfigManager`
2. Database integration through the `PostgresManager`
3. Conversation state management through the `ConversationManager`
4. HTTP API endpoints through the `Server` component
5. Background polling through the `Poller` component
6. System orchestration through the `SystemManager`

These components work together to provide a complete system for managing Pol.is conversations, processing votes, performing mathematical computations, and exposing the results through a RESTful API.

To run the full system in a production environment, you would typically:

1. Configure the system with appropriate database credentials and other settings
2. Start the system using `SystemManager.start(config)`
3. Let it run continuously, processing votes and moderation changes
4. Interact with it through the HTTP API
5. Shut it down gracefully using `SystemManager.stop()` when needed

Because every component is initialized from the same configuration and can be enabled or disabled independently, the Python conversion can be adapted to different deployment scenarios, as the database-free setup used throughout this notebook shows.