# 🧪 Interactive Test of Graph V2.1

**Version:** V2.1 with Deterministic Rules  
**Date:** 2025-10-14

---

## 📋 System Features

### **Router**
- ✅ Dynamic Top-K: `score2 >= score1 - 0.10` → K=2, otherwise K=1
- ✅ Tracking: visited, hops, MAX_HOPS=2
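
The dynamic Top-K rule can be sketched in a few lines. This is only an illustration under stated assumptions: `select_top_k` and `TOP_K_MARGIN` are hypothetical names, and the real router may rank or break ties differently.

```python
from typing import Dict, List, Tuple

TOP_K_MARGIN = 0.10  # margin taken from the rule above


def select_top_k(scores: Dict[str, float],
                 margin: float = TOP_K_MARGIN) -> List[Tuple[str, float]]:
    """Return the best candidate, plus the runner-up when it is within `margin`."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < 2:
        return ranked
    score1, score2 = ranked[0][1], ranked[1][1]
    # K=2 when score2 >= score1 - margin, otherwise K=1
    k = 2 if score2 >= score1 - margin else 1
    return ranked[:k]


# A clear leader yields K=1; two close scores yield K=2.
print(select_top_k({"content": 0.60, "talent": 0.40, "business": 0.20}))
print(select_top_k({"platform": 0.90, "business": 0.85, "talent": 0.00}))
```

Scores that sit exactly on the margin are subject to floating-point rounding, so a production version might compare with a small tolerance.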

### **Parallel Executor**
- ✅ STOP_AFTER = 0.85: cancels when resolved >= 0.85
- ✅ TOOL_TIMEOUT = 2s per tool
- ✅ BUDGET_TOOLS_PER_TURN = 6 tools maximum
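
One way to picture early cancellation and timeouts is the `asyncio` sketch below. It is not the project's `parallel_executor`: `run_branches` and the idea that each branch returns a single "resolved" score are assumptions, the timeout is applied per branch rather than per tool, and the tool budget is not modeled.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

STOP_AFTER = 0.85   # values from the list above
TOOL_TIMEOUT = 2.0

# Hypothetical branch type: a coroutine function returning a "resolved" score.
Branch = Callable[[], Awaitable[float]]


async def run_branches(branches: Dict[str, Branch],
                       stop_after: float = STOP_AFTER,
                       timeout: float = TOOL_TIMEOUT) -> Dict[str, Optional[float]]:
    """Run branches concurrently and cancel the rest once one reaches `stop_after`."""

    async def guarded(name: str, branch: Branch):
        try:
            return name, await asyncio.wait_for(branch(), timeout)
        except asyncio.TimeoutError:
            return name, None  # the branch exceeded its timeout

    tasks = [asyncio.create_task(guarded(n, b)) for n, b in branches.items()]
    results: Dict[str, Optional[float]] = {}
    try:
        for finished in asyncio.as_completed(tasks):
            name, score = await finished
            results[name] = score
            if score is not None and score >= stop_after:
                break  # good enough: stop waiting for the other branches
    finally:
        for task in tasks:
            if not task.done():
                task.cancel()
        # Wait for cancellations to settle so no task is left pending.
        await asyncio.gather(*tasks, return_exceptions=True)
    return results
```

Branches that were cancelled simply do not appear in the returned dictionary, which is one possible way to derive a "cancelled branch" entry for telemetry.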

### **Aggregator**
- ✅ Calibrated thresholds: talent=0.75, platform=0.70, business=0.70, content=0.68
- ✅ common_graph is never the final winner
- ✅ MAX_HOPS = 2 re-routings
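
The threshold rule can be illustrated as follows. The dictionary mirrors the thresholds listed above, but `pick_winner` is a hypothetical helper written for this note, not the aggregator's actual code; it treats a `None` threshold (and any unknown node) as "may never win".

```python
from typing import Dict, Optional

THRESHOLD_BY_NODE: Dict[str, Optional[float]] = {
    "platform_graph": 0.70,
    "business_graph": 0.70,
    "talent_graph": 0.75,
    "content_graph": 0.68,
    "common_graph": None,  # never the final winner
}


def pick_winner(confidences: Dict[str, float],
                thresholds: Dict[str, Optional[float]] = THRESHOLD_BY_NODE) -> Optional[str]:
    """Return the highest-confidence node that meets its threshold, else None."""
    eligible = [
        (conf, node)
        for node, conf in confidences.items()
        if thresholds.get(node) is not None and conf >= thresholds[node]
    ]
    if not eligible:
        return None  # a caller would re-route or ask for clarification here
    return max(eligible)[1]


print(pick_winner({"talent_graph": 0.80, "content_graph": 0.70}))
print(pick_winner({"common_graph": 0.99, "content_graph": 0.50}))
```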

### **Clarifier**
- ✅ Loop-safe: MAX_HOPS=2
- ✅ Maximum of 2 clarification fields

## 1. Setup - Imports and Configuration

In [1]:
from src.strands.main_router.graph import process_question_advanced

# Simple usage
result = await process_question_advanced(
    question="peliculas de coppola",
    max_hops=2
)

print(result['answer'])

Loading validation data from platform_name_iso

🎯 ADVANCED ROUTER
📝 Question: peliculas de coppola
{
  "primary": "TALENT",
  "confidence": 0.90,
  "candidates": [
    {"category": "TALENT", "confidence": 0.90}
  ]
}[ROUTER] ✅ Selected graph: talent
[ROUTER] 📊 Confidence: 0.90
[ROUTER] 🔢 Candidates: 1


🔍 VALIDATION PREPROCESSOR
📝 Question: peliculas de coppola
[VALIDATION] 🤖 Running validation with the LLM...
To validate the directors for Coppola, I will use the validate_director tool:
Tool #1: validate_director

🔍 SQL QUERY EXECUTED
📝 Operation: director exact search
📄 Query:

WITH q AS (SELECT %s::text AS s)
SELECT 
  d.id, 
  d.name,
  t.n_titles
FROM ms.directors d
CROSS JOIN q
LEFT JOIN LATERAL (
  SELECT COUNT(*)::integer AS n_titles
  FROM ms.directed_by db 
  WHERE db.director_id = d.id
) t ON TRUE
WHERE d.name ILIKE q.s
ORDER BY t.n_titles DESC NULLS LAST, d.name ASC
LIMIT 25

🔧 Parameters: ('Coppola',)

✅ Query retornó 0 filas e

In [2]:
# Simple usage
result = await process_question_advanced(
    question="1",
    max_hops=2
)

print(result['answer'])


🎯 ADVANCED ROUTER
📝 Question: 1
{
  "primary": "COMMON",
  "confidence": 0.75,
  "candidates": [
    {"category": "COMMON", "confidence": 0.75},
    {"category": "PLATFORM", "confidence": 0.5}
  ]
}[ROUTER] ✅ Selected graph: common
[ROUTER] 📊 Confidence: 0.75
[ROUTER] 🔢 Candidates: 2
[ROUTER] ⏭️  Validation not required


🔍 VALIDATION PREPROCESSOR
📝 Question: 1
[VALIDATION] ⏭️  Validation not required for this graph, skipping...

🎬 DOMAIN GRAPH EXECUTOR
[DOMAIN] Running: common
[SUPERVISOR] Evaluating state... tools=0, task=None
[SUPERVISOR] First iteration, classification needed
ADMIN
[ADMIN NODE]
Question: 1
Current state:
   • Task: admin
   • Previous tool calls: 0
   • Accumulated data: 0 characters

[ROUTING] Selecting tool...
   🔍 Router LLM analyzing question...
   📋 Available tools: build_sql, run_sql_adapter, validate_intent
admin_validate_intent   💡 LLM suggests: admin_validate_intent
   ⏱️  Tiempo de respuest

In [1]:
import sys
import asyncio
from pathlib import Path
import json
from datetime import datetime

# Add the working directory to sys.path (so `src` is importable) if it is not already there
if str(Path.cwd()) not in sys.path:
    sys.path.insert(0, str(Path.cwd()))

# System imports
from src.strands.main_router.graph import create_main_graph, process_question
from src.strands.main_router.aggregator import THRESHOLD_BY_NODE, MAX_HOPS
from src.strands.main_router.parallel_executor import TOOL_TIMEOUT, BUDGET_TOOLS_PER_TURN, STOP_AFTER

print("✅ Imports completed")
print("\n📊 Configuration:")
print(f"  • Thresholds: {THRESHOLD_BY_NODE}")
print(f"  • MAX_HOPS: {MAX_HOPS}")
print(f"  • TOOL_TIMEOUT: {TOOL_TIMEOUT}s")
print(f"  • BUDGET: {BUDGET_TOOLS_PER_TURN} tools")
print(f"  • STOP_AFTER: {STOP_AFTER}")

Loading validation data from platform_name_iso
✅ Imports completed

📊 Configuration:
  • Thresholds: {'platform_graph': 0.7, 'business_graph': 0.7, 'talent_graph': 0.75, 'content_graph': 0.68, 'common_graph': None}
  • MAX_HOPS: 2
  • TOOL_TIMEOUT: 2.0s
  • BUDGET: 6 tools
  • STOP_AFTER: 0.85


## 2. Create the Graph

In [2]:
# Create the graph
try:
    graph = create_main_graph()
    print("✅ Graph created successfully")
except Exception as e:
    print(f"❌ Error creating graph: {e}")
    raise

✅ Graph created successfully


## 3. Helper Function for Testing

In [3]:
def print_result(result, question):
    """
    Print a result in a readable format.
    """
    print("\n" + "="*80)
    print(f"📝 QUESTION: {question}")
    print("="*80)

    # Answer
    answer = result.get("answer", "No answer")
    print(f"\n💬 ANSWER:\n{answer}")

    # Router metadata
    print("\n🎯 ROUTER:")
    routing_scores = result.get("routing_scores", {})
    if routing_scores:
        sorted_scores = sorted(routing_scores.items(), key=lambda x: x[1], reverse=True)
        for graph, score in sorted_scores[:3]:
            print(f"  • {graph}: {score:.2f}")

    selected = result.get("selected_candidates", [])
    if selected:
        print(f"  ✅ Selected: {[f'{g}({s:.2f})' for g, s in selected]}")

    # Aggregator metadata
    print("\n⚖️ AGGREGATOR:")
    decision = result.get("aggregator_decision", "N/A")
    print(f"  • Decision: {decision}")

    winning_node = result.get("winning_node")
    if winning_node:
        confidence = result.get("final_confidence", 0)
        threshold = result.get("threshold_used", 0)
        print(f"  • Winner: {winning_node}")
        print(f"  • Confidence: {confidence:.3f} >= {threshold:.3f}")

    # Telemetry
    print("\n📊 TELEMETRY:")
    tools_used = result.get("tools_used", 0)
    budget_exhausted = result.get("budget_exhausted", False)
    print(f"  • Tools used: {tools_used}/{BUDGET_TOOLS_PER_TURN}")
    print(f"  • Budget exhausted: {budget_exhausted}")

    cancelled = result.get("cancelled_branch")
    if cancelled:
        print(f"  • Cancelled branch: {cancelled} (STOP_AFTER)")

    # Compare against None so that a latency of 0.0 is still reported
    lat_parallel = result.get("lat_parallel")
    lat_aggregator = result.get("lat_aggregator")
    if lat_parallel is not None:
        print(f"  • Parallel latency: {lat_parallel:.3f}s")
    if lat_aggregator is not None:
        print(f"  • Aggregator latency: {lat_aggregator:.3f}s")

    # Re-routing
    hops = result.get("rerouting_count", 0)
    visited = result.get("visited_graphs", [])
    print(f"  • Hops: {hops}/{MAX_HOPS}")
    if visited:
        print(f"  • Visited: {visited}")

    # Clarification
    needs_clarification = result.get("needs_clarification", False)
    if needs_clarification:
        reason = result.get("clarification_reason", "unknown")
        print("\n❓ CLARIFICATION:")
        print(f"  • Reason: {reason}")

    print("\n" + "="*80)

print("✅ Helper function defined")

✅ Helper function defined


## 4. Question Tests

### 📝 Question Categories

1. **TALENT** - Actors, directors, filmographies
2. **CONTENT** - Title metadata (year, genre, runtime)
3. **PLATFORM** - Availability, where to watch
4. **BUSINESS** - Prices, rankings, popularity
5. **COMMON** - Statistics, administration

### Test 1: TALENT Question (High Confidence)

In [4]:
# Question about a director (should go to talent_graph)
question = "Informacion de Inception"

result = await process_question(question)
print_result(result, question)


🔍 LIGHTWEIGHT PREPROCESSOR - Cheap Normalization
📝 Question: Informacion de Inception

[PREPROCESSING] Extracted context:
  • Country (ISO-2): N/A
  • Entity types: N/A
  • Tokens: 3


🎯 UNIFIED ROUTER - Candidate Scoring
📝 Question: Informacion de Inception
[ROUTER] Visited: set(), Hops: 0/2
[ROUTER] 🤖 Scoring candidates with the LLM...
{
  "business": 0.7,
  "talent": 0.8,
  "content": 0.9,
  "platform": 0.6,
  "common": 0.3
}{'role': 'assistant', 'content': [{'text': '{\n  "business": 0.7,\n  "talent": 0.8,\n  "content": 0.9,\n  "platform": 0.6,\n  "common": 0.3\n}'}]}
[ERROR] Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
[DEBUG] Extracted JSON: {\n  "business": 0.7,\n  "talent": 0.8,\n  "content": 0.9,\n  "platform": 0.6,\n  "common": 0.3\n}

[SCORES] Candidate scores:
  • content: 0.60
  • talent: 0.40
  • business: 0.20
  • platform: 0.20
  • common: 0.10

[ROUTER] Top-K = 1 (diff=0.2

In [4]:
# Question about a director (should go to talent_graph)
question = "¿Qué películas ha dirigido Christopher Nolan?"

result = await process_question(question)
print_result(result, question)


🔍 LIGHTWEIGHT PREPROCESSOR - Cheap Normalization
📝 Question: ¿Qué películas ha dirigido Christopher Nolan?

[PREPROCESSING] Extracted context:
  • Country (ISO-2): ['AR', 'BO', 'BR', 'CL', 'CO', 'CR', 'CU', 'DO', 'EC', 'GT', 'HN', 'MX', 'NI', 'PA', 'PE', 'PR', 'PY', 'SV', 'UY', 'VE']
  • Entity types: movie
  • Tokens: 6


🎯 UNIFIED ROUTER - Candidate Scoring
📝 Question: ¿Qué películas ha dirigido Christopher Nolan?
[ROUTER] Visited: set(), Hops: 0/2
[ROUTER] 🤖 Scoring candidates with the LLM...
{
  "business": 0.2,
  "talent": 0.9,
  "content": 0.6,
  "platform": 0.1,
  "common": 0.0
}{'role': 'assistant', 'content': [{'text': '{\n  "business": 0.2,\n  "talent": 0.9,\n  "content": 0.6,\n  "platform": 0.1,\n  "common": 0.0\n}'}]}
[ERROR] Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
[DEBUG] Extracted JSON: {\n  "business": 0.2,\n  "talent": 0.9,\n  "content": 0.6,\n  "platform": 0.1,\n  "common": 0.0\n}
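
The `[ERROR]` message in these runs is what `json.loads` reports when the text after `{` begins with a literal backslash, which matches the `[DEBUG]` line where newlines appear as the two characters `\n` rather than real line breaks. A minimal sketch of a more tolerant parser follows. `parse_router_scores` is a hypothetical helper, not part of the project, and it assumes the payload is a flat object of numeric scores with no string values containing `\n`.

```python
import json
from typing import Dict


def parse_router_scores(raw: str) -> Dict[str, float]:
    """Parse a flat {"graph": score} JSON object, tolerating literal backslash-n."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found")
    candidate = raw[start:end + 1]
    try:
        data = json.loads(candidate)
    except json.JSONDecodeError:
        # Replace the two-character sequence backslash + n with a real newline, then retry.
        data = json.loads(candidate.replace("\\n", "\n"))
    return {str(key): float(value) for key, value in data.items()}


escaped = '{\\n  "business": 0.2,\\n  "talent": 0.9\\n}'
print(parse_router_scores(escaped))
```

Reading the text field of the model response directly, instead of its printed representation, would avoid the escaped newlines in the first place; whether that applies here depends on how the router extracts the text, which these logs do not show.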

### Test 2: PLATFORM Question (High Confidence)

In [5]:
# Question about availability (should go to platform_graph)
question = "¿Dónde puedo ver Stranger Things en Argentina?"

result = await process_question(question)
print_result(result, question)


🔍 LIGHTWEIGHT PREPROCESSOR - Cheap Normalization
📝 Question: ¿Dónde puedo ver Stranger Things en Argentina?

[PREPROCESSING] Extracted context:
  • Country (ISO-2): ['AT', 'BE', 'BG', 'CY', 'CZ', 'DE', 'DK', 'EE', 'ES', 'FI', 'FR', 'GR', 'HR', 'HU', 'IE', 'IT', 'LT', 'LU', 'LV', 'MT', 'NL', 'PL', 'PT', 'RO', 'SE', 'SI', 'SK']
  • Entity types: country
  • Tokens: 7


🎯 UNIFIED ROUTER - Candidate Scoring
📝 Question: ¿Dónde puedo ver Stranger Things en Argentina?
[ROUTER] Visited: set(), Hops: 0/2
[ROUTER] 🤖 Scoring candidates with the LLM...
{
  "business": 0.2,
  "talent": 0.1,
  "content": 0.3,
  "platform": 0.8,
  "common": 0.0
}{'role': 'assistant', 'content': [{'text': '{\n  "business": 0.2,\n  "talent": 0.1,\n  "content": 0.3,\n  "platform": 0.8,\n  "common": 0.0\n}'}]}
[ERROR] Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
[DEBUG] Extracted JSON: {\n  "business": 0.2,\n  "talent": 0.1,\n  "content": 

### Test 3: BUSINESS Question (Prices)

In [6]:
# Question about prices (should go to business_graph)
question = "¿Cuánto cuesta Netflix en Argentina?"

result = await process_question(question)
print_result(result, question)


🔍 LIGHTWEIGHT PREPROCESSOR - Cheap Normalization
📝 Question: ¿Cuánto cuesta Netflix en Argentina?

[PREPROCESSING] Extracted context:
  • Country (ISO-2): ['AT', 'BE', 'BG', 'CY', 'CZ', 'DE', 'DK', 'EE', 'ES', 'FI', 'FR', 'GR', 'HR', 'HU', 'IE', 'IT', 'LT', 'LU', 'LV', 'MT', 'NL', 'PL', 'PT', 'RO', 'SE', 'SI', 'SK']
  • Entity types: price, platform
  • Tokens: 5


🎯 UNIFIED ROUTER - Candidate Scoring
📝 Question: ¿Cuánto cuesta Netflix en Argentina?
[ROUTER] Visited: set(), Hops: 0/2
[ROUTER] 🤖 Scoring candidates with the LLM...
{
  "business": 0.8,
  "talent": 0.0,
  "content": 0.0,
  "platform": 0.9,
  "common": 0.0
}{'role': 'assistant', 'content': [{'text': '{\n  "business": 0.8,\n  "talent": 0.0,\n  "content": 0.0,\n  "platform": 0.9,\n  "common": 0.0\n}'}]}
[ERROR] Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
[DEBUG] Extracted JSON: {\n  "business": 0.8,\n  "talent": 0.0,\n  "content": 0.0,\n  "pla

### Test 4: CONTENT Question (Metadata)

In [7]:
# Question about metadata (should go to content_graph)
question = "¿De qué año es la película Inception?"

result = await process_question(question)
print_result(result, question)


🔍 LIGHTWEIGHT PREPROCESSOR - Cheap Normalization
📝 Question: ¿De qué año es la película Inception?

[PREPROCESSING] Extracted context:
  • Country (ISO-2): ['AR', 'BO', 'BR', 'CL', 'CO', 'CR', 'CU', 'DO', 'EC', 'GT', 'HN', 'MX', 'NI', 'PA', 'PE', 'PR', 'PY', 'SV', 'UY', 'VE']
  • Entity types: movie
  • Tokens: 7


🎯 UNIFIED ROUTER - Candidate Scoring
📝 Question: ¿De qué año es la película Inception?
[ROUTER] Visited: set(), Hops: 0/2
[ROUTER] 🤖 Scoring candidates with the LLM...
{
  "business": 0.1,
  "talent": 0.2,
  "content": 0.8,
  "platform": 0.2,
  "common": 0.1
}{'role': 'assistant', 'content': [{'text': '{\n  "business": 0.1,\n  "talent": 0.2,\n  "content": 0.8,\n  "platform": 0.2,\n  "common": 0.1\n}'}]}
[ERROR] Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
[DEBUG] Extracted JSON: {\n  "business": 0.1,\n  "talent": 0.2,\n  "content": 0.8,\n  "platform": 0.2,\n  "common": 0.1\n}

[SCORES] Pun

### Test 5: Ambiguous Question (Top-K = 2)

In [8]:
# Ambiguous question that could go to multiple graphs
question = "¿Cuánto cuesta?"

result = await process_question(question)
print_result(result, question)

# Should show:
# - Top-K = 2 (similar scores)
# - Possible re-routing if confidence < threshold


🔍 LIGHTWEIGHT PREPROCESSOR - Cheap Normalization
📝 Question: ¿Cuánto cuesta?

[PREPROCESSING] Extracted context:
  • Country (ISO-2): ['AT', 'BE', 'BG', 'CY', 'CZ', 'DE', 'DK', 'EE', 'ES', 'FI', 'FR', 'GR', 'HR', 'HU', 'IE', 'IT', 'LT', 'LU', 'LV', 'MT', 'NL', 'PL', 'PT', 'RO', 'SE', 'SI', 'SK']
  • Entity types: price
  • Tokens: 2


🎯 UNIFIED ROUTER - Candidate Scoring
📝 Question: ¿Cuánto cuesta?
[ROUTER] Visited: set(), Hops: 0/2
[ROUTER] 🤖 Scoring candidates with the LLM...
{
  "business": 0.9,
  "talent": 0.1,
  "content": 0.2,
  "platform": 0.2,
  "common": 0.1
}{'role': 'assistant', 'content': [{'text': '{\n  "business": 0.9,\n  "talent": 0.1,\n  "content": 0.2,\n  "platform": 0.2,\n  "common": 0.1\n}'}]}
[ERROR] Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
[DEBUG] Extracted JSON: {\n  "business": 0.9,\n  "talent": 0.1,\n  "content": 0.2,\n  "platform": 0.2,\n  "common": 0.1\n}

[SCORES] Puntuaci

### Test 6: Question with an Ambiguous Name

In [9]:
# Question with a name that could have multiple matches
question = "¿Qué películas ha hecho Nolan?"

result = await process_question(question)
print_result(result, question)

# Should show:
# - talent_graph with high confidence
# - Possibly ambiguous if there are multiple "Nolan" entries (Christopher, Jonathan)


🔍 LIGHTWEIGHT PREPROCESSOR - Cheap Normalization
📝 Question: ¿Qué películas ha hecho Nolan?

[PREPROCESSING] Extracted context:
  • Country (ISO-2): ['AR', 'BO', 'BR', 'CL', 'CO', 'CR', 'CU', 'DO', 'EC', 'GT', 'HN', 'MX', 'NI', 'PA', 'PE', 'PR', 'PY', 'SV', 'UY', 'VE']
  • Entity types: movie
  • Tokens: 5


🎯 UNIFIED ROUTER - Candidate Scoring
📝 Question: ¿Qué películas ha hecho Nolan?
[ROUTER] Visited: set(), Hops: 0/2
[ROUTER] 🤖 Scoring candidates with the LLM...
{
  "business": 0.2,
  "talent": 0.9,
  "content": 0.6,
  "platform": 0.1,
  "common": 0.0
}{'role': 'assistant', 'content': [{'text': '{\n  "business": 0.2,\n  "talent": 0.9,\n  "content": 0.6,\n  "platform": 0.1,\n  "common": 0.0\n}'}]}
[ERROR] Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
[DEBUG] Extracted JSON: {\n  "business": 0.2,\n  "talent": 0.9,\n  "content": 0.6,\n  "platform": 0.1,\n  "common": 0.0\n}

[SCORES] Puntuación de cand

## 5. Custom Test

Write your own question here:

In [10]:
# 👇 Write your question here
question = "Tu pregunta aquí"

result = await process_question(question)
print_result(result, question)


🔍 LIGHTWEIGHT PREPROCESSOR - Cheap Normalization
📝 Question: Tu pregunta aquí

[PREPROCESSING] Extracted context:
  • Country (ISO-2): N/A
  • Entity types: N/A
  • Tokens: 3


🎯 UNIFIED ROUTER - Candidate Scoring
📝 Question: Tu pregunta aquí
[ROUTER] Visited: set(), Hops: 0/2
[ROUTER] 🤖 Scoring candidates with the LLM...
{
  "business": 0.3,
  "talent": 0.7,
  "content": 0.8,
  "platform": 0.4,
  "common": 0.0
}{'role': 'assistant', 'content': [{'text': '{\n  "business": 0.3,\n  "talent": 0.7,\n  "content": 0.8,\n  "platform": 0.4,\n  "common": 0.0\n}'}]}
[ERROR] Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
[DEBUG] Extracted JSON: {\n  "business": 0.3,\n  "talent": 0.7,\n  "content": 0.8,\n  "platform": 0.4,\n  "common": 0.0\n}

[SCORES] Candidate scores:
  • content: 0.60
  • talent: 0.40
  • business: 0.20
  • platform: 0.20
  • common: 0.10

[ROUTER] Top-K = 1 (diff=0.20 > 0.10)

[SE

## 6. Multi-Question Analysis

In [11]:
from collections import Counter

# List of questions to test in batch
test_questions = [
    "¿Qué películas ha dirigido Steven Spielberg?",
    "¿Dónde puedo ver Breaking Bad?",
    "¿Cuánto cuesta Disney+ en México?",
    "¿De qué año es The Matrix?",
    "¿Qué plataformas tienen Game of Thrones?",
    "¿Quién actuó en Titanic?",
    "¿Cuál es la serie más popular?",
]

results = []

for i, q in enumerate(test_questions, 1):
    print(f"\n{'='*80}")
    print(f"TEST {i}/{len(test_questions)}")
    print(f"{'='*80}")

    try:
        result = await process_question(q)
        print_result(result, q)

        # Save for analysis
        results.append({
            "question": q,
            "winning_node": result.get("winning_node"),
            "confidence": result.get("final_confidence"),
            "decision": result.get("aggregator_decision"),
            "hops": result.get("rerouting_count", 0)
        })
    except Exception as e:
        print(f"❌ Error: {e}")
        results.append({
            "question": q,
            "error": str(e)
        })

print(f"\n\n{'='*80}")
print("📊 TEST SUMMARY")
print(f"{'='*80}")
print(f"Total questions: {len(results)}")
print("\nResults by node:")

# Count only the runs that actually produced a winning node
node_counts = Counter(r["winning_node"] for r in results if r.get("winning_node"))
for node, count in node_counts.most_common():
    print(f"  • {node}: {count}")


TEST 1/7

🔍 LIGHTWEIGHT PREPROCESSOR - Cheap Normalization
📝 Question: ¿Qué películas ha dirigido Steven Spielberg?

[PREPROCESSING] Extracted context:
  • Country (ISO-2): ['AR', 'BO', 'BR', 'CL', 'CO', 'CR', 'CU', 'DO', 'EC', 'GT', 'HN', 'MX', 'NI', 'PA', 'PE', 'PR', 'PY', 'SV', 'UY', 'VE']
  • Entity types: movie
  • Tokens: 6


🎯 UNIFIED ROUTER - Candidate Scoring
📝 Question: ¿Qué películas ha dirigido Steven Spielberg?
[ROUTER] Visited: set(), Hops: 0/2
[ROUTER] 🤖 Scoring candidates with the LLM...
{
  "business": 0.2,
  "talent": 0.9,
  "content": 0.6,
  "platform": 0.1,
  "common": 0.0
}{'role': 'assistant', 'content': [{'text': '{\n  "business": 0.2,\n  "talent": 0.9,\n  "content": 0.6,\n  "platform": 0.1,\n  "common": 0.0\n}'}]}
[ERROR] Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
[DEBUG] Extracted JSON: {\n  "business": 0.2,\n  "talent": 0.9,\n  "content": 0.6,\n  "platform": 0.1,\n  "common"

## 7. Export Results

In [12]:
# Export results to JSON
output_file = f"test_results_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json"

with open(output_file, 'w', encoding='utf-8') as f:
    json.dump(results, f, indent=2, ensure_ascii=False)

print(f"✅ Results exported to: {output_file}")

✅ Results exported to: test_results_20251014_235426.json


## 8. Verifying the Deterministic Rules

In [13]:
# Verify that the rules hold
print("🔍 DETERMINISTIC RULES VERIFICATION")
print("="*80)

# 1. Thresholds
print("\n1. Thresholds per node:")
for node, threshold in THRESHOLD_BY_NODE.items():
    if threshold is not None:
        print(f"  ✅ {node}: {threshold}")
    else:
        print(f"  🚫 {node}: None (never a winner)")

# 2. Budget
print("\n2. Budget and Timeout:")
print(f"  ✅ TOOL_TIMEOUT: {TOOL_TIMEOUT}s")
print(f"  ✅ BUDGET_TOOLS_PER_TURN: {BUDGET_TOOLS_PER_TURN}")
print(f"  ✅ STOP_AFTER: {STOP_AFTER}")

# 3. Loop-safe
print("\n3. Loop-safe:")
print(f"  ✅ MAX_HOPS: {MAX_HOPS}")
print("  ✅ MAX_CLARIFICATION_FIELDS: 2")

# 4. Results analysis
if results:
    print("\n4. Results analysis:")

    # Check that common_graph never won
    common_wins = [r for r in results if r.get("winning_node") == "common_graph"]
    if not common_wins:
        print(f"  ✅ common_graph never a winner (0/{len(results)})")
    else:
        print(f"  ❌ common_graph won {len(common_wins)} times (ERROR)")

    # Check hops
    max_hops_used = max(r.get("hops", 0) for r in results)
    if max_hops_used <= MAX_HOPS:
        print(f"  ✅ Max hops respected: {max_hops_used}/{MAX_HOPS}")
    else:
        print(f"  ❌ Max hops exceeded: {max_hops_used} > {MAX_HOPS}")

    # Check confidence against threshold
    threshold_violations = 0
    for r in results:
        node = r.get("winning_node")
        conf = r.get("confidence")
        # Compare against None so that a confidence of 0.0 is still checked
        if node and conf is not None:
            threshold = THRESHOLD_BY_NODE.get(node, 0.65)
            # A None threshold (common_graph) means the node may never win
            if threshold is None or conf < threshold:
                threshold_violations += 1

    if threshold_violations == 0:
        print("  ✅ All winners meet their threshold (0 violations)")
    else:
        print(f"  ❌ {threshold_violations} winners do not meet their threshold")

print("\n" + "="*80)

🔍 DETERMINISTIC RULES VERIFICATION

1. Thresholds per node:
  ✅ platform_graph: 0.7
  ✅ business_graph: 0.7
  ✅ talent_graph: 0.75
  ✅ content_graph: 0.68
  🚫 common_graph: None (never a winner)

2. Budget and Timeout:
  ✅ TOOL_TIMEOUT: 2.0s
  ✅ BUDGET_TOOLS_PER_TURN: 6
  ✅ STOP_AFTER: 0.85

3. Loop-safe:
  ✅ MAX_HOPS: 2
  ✅ MAX_CLARIFICATION_FIELDS: 2

4. Results analysis:
  ✅ common_graph never a winner (0/7)
  ✅ Max hops respected: 2/2
  ✅ All winners meet their threshold (0 violations)
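
The same checks could also be packaged as a function that returns a list of violations, which is easier to assert on than printed output. The sketch below is an assumption-laden illustration: `check_rules` is a hypothetical helper, it expects records shaped like the entries of `results` built in section 6, and the 0.65 default for unknown nodes simply mirrors the fallback used in the cell above.

```python
from typing import Dict, List, Optional


def check_rules(results: List[dict],
                thresholds: Dict[str, Optional[float]],
                max_hops: int,
                default_threshold: float = 0.65) -> List[str]:
    """Return human-readable rule violations; an empty list means every rule held."""
    violations = []
    for r in results:
        label = r.get("question", "<unknown>")
        node = r.get("winning_node")
        conf = r.get("confidence")
        hops = r.get("hops", 0)
        if hops > max_hops:
            violations.append(f"{label}: {hops} hops exceeds {max_hops}")
        if node is None:
            continue  # no winner (error or clarification), nothing else to check
        threshold = thresholds.get(node, default_threshold)
        if threshold is None:
            violations.append(f"{label}: {node} must never win")
        elif conf is None or conf < threshold:
            violations.append(f"{label}: confidence {conf} below {threshold} for {node}")
    return violations


sample = [{"question": "q1", "winning_node": "talent_graph", "confidence": 0.8, "hops": 1}]
print(check_rules(sample, {"talent_graph": 0.75, "common_graph": None}, max_hops=2))
```

In the notebook this would be called as `check_rules(results, THRESHOLD_BY_NODE, MAX_HOPS)`.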



## 9. Notes and Observations

Use this cell to add notes about the tests:

### 📝 Observations:

- [ ] Dynamic Top-K works correctly
- [ ] STOP_AFTER cancels branches when appropriate
- [ ] Calibrated thresholds are appropriate
- [ ] Clarifier is loop-safe
- [ ] Budget is respected

### 🐛 Issues found:

1. ...
2. ...

### 💡 Suggested improvements:

1. ...
2. ...