# 01 - Timeouts, Retries, and Circuit Breakers

**Objective**: Implement resilience patterns (timeouts, retries, and circuit breakers) for agents that depend on external APIs.

## Contents
1. Failure simulation (network, rate limit, malformed output), with a per-request timeout
2. Retry with exponential backoff (tenacity)
3. Circuit breaker
4. Combined pattern: retry + circuit breaker

In [None]:
import os
import json
import time
import random
from dotenv import load_dotenv
from openai import OpenAI
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type

load_dotenv()

client = OpenAI()
MODEL = "gpt-5-mini"

print("=" * 60)
print("TIMEOUTS, RETRIES, AND CIRCUIT BREAKERS")
print("=" * 60)

## 1. Failure Simulation

In production, APIs fail. We simulate three common failure types:
- **Network timeout**: the API does not respond in time
- **Rate limit**: too many requests in a given time window
- **Malformed output**: the response does not have the expected format
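
The first failure type is only detectable if every call has a deadline. The cell below passes `timeout=10` and relies on the client library to enforce it. For a function that has no timeout parameter of its own, one option is to wait on a worker thread with a deadline. This is a minimal standard-library sketch; `call_with_timeout` is a hypothetical helper, not part of this notebook or of any client library, and the abandoned call keeps running in its thread until it finishes.

```python
import concurrent.futures
import time


def call_with_timeout(fn, timeout_s, *args, **kwargs):
    """Run fn in a worker thread and stop waiting after timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Normalize to the built-in TimeoutError for callers
        raise TimeoutError(f"No response within {timeout_s}s") from None
    finally:
        # Do not block on the worker; the abandoned call still runs to completion
        pool.shutdown(wait=False)


def slow_api():
    """Hypothetical stand-in for a call that answers too late."""
    time.sleep(0.3)
    return "late answer"


try:
    call_with_timeout(slow_api, 0.05)
except TimeoutError as exc:
    print(f"Timed out: {exc}")
```

Because the worker thread cannot be cancelled, this approach bounds how long the caller waits, not how long the underlying request runs. A timeout supported by the client itself is preferable when available.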

In [None]:
# ============================================================
# FAILURE SIMULATION
# ============================================================

class SimulatedNetworkError(Exception):
    """Simulates a network timeout."""

class SimulatedRateLimitError(Exception):
    """Simulates a rate limit (HTTP 429)."""

class MalformedOutputError(Exception):
    """The response does not have the expected format."""


def llm_call_unreliable(prompt: str, fail_rate: float = 0.5) -> dict:
    """
    Simulate an LLM call that may fail.

    Args:
        prompt: Text to send.
        fail_rate: Probability of a simulated failure (0-1).

    Returns:
        Dict with the response text, token count, and latency in ms.

    Raises:
        SimulatedNetworkError, SimulatedRateLimitError, MalformedOutputError
    """
    # Inject a random simulated failure before touching the real API
    if random.random() < fail_rate:
        error_type = random.choice(["network", "rate_limit", "malformed"])
        if error_type == "network":
            raise SimulatedNetworkError("Connection timeout after 30s")
        elif error_type == "rate_limit":
            raise SimulatedRateLimitError("Rate limit exceeded: 429")
        else:
            raise MalformedOutputError("Response is not valid JSON")
    
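    # Real call; timeout=10 caps how long this single request may take (seconds)
    t0 = time.perf_counter()
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": "Responde en español, brevemente."},
            {"role": "user", "content": prompt},
        ],
        # max_tokens is deprecated in Chat Completions in favor of
        # max_completion_tokens, which also counts reasoning tokens
        max_completion_tokens=100,
        timeout=10,
    )
    latencia = (time.perf_counter() - t0) * 1000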
    
    return {
        "content": response.choices[0].message.content,
        "tokens": response.usage.total_tokens,
        "latencia_ms": round(latencia, 1),
    }


# Test without simulated failures
print("Call without simulated failures:")
result = llm_call_unreliable("Que es un agente de IA?", fail_rate=0.0)
print(f"  {result['content'][:100]}...")
print(f"  Tokens: {result['tokens']}, Latency: {result['latencia_ms']}ms")

# Test with a high failure rate
print("\n10 calls with a 50% failure rate:")
exitos, fallos = 0, 0
for _ in range(10):
    try:
        llm_call_unreliable("Test", fail_rate=0.5)
        exitos += 1
    except Exception:
        # Counts simulated failures and any real API error alike
        fallos += 1
print(f"  Successes: {exitos}, Failures: {fallos}")
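
The cell above only simulates malformed output by raising an exception at random. In a real agent the check has to be run on the text the model returns. The following is a minimal sketch of such a check, assuming the agent expects a JSON object with known keys; `parse_llm_json` and `required_keys` are hypothetical names, and the exception class is redefined here only so the block runs on its own.

```python
import json


class MalformedOutputError(Exception):
    """The response does not have the expected format."""


def parse_llm_json(raw: str, required_keys: tuple[str, ...]) -> dict:
    """Parse raw model output as a JSON object and check that the required keys exist."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise MalformedOutputError(f"Not valid JSON: {exc.msg}") from exc
    if not isinstance(data, dict):
        raise MalformedOutputError("Expected a JSON object")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise MalformedOutputError(f"Missing keys: {missing}")
    return data


print(parse_llm_json('{"answer": "42", "confidence": 0.9}', ("answer",)))

try:
    parse_llm_json("Sure! Here is the JSON you asked for", ("answer",))
except MalformedOutputError as exc:
    print(f"Rejected: {exc}")
```

Raising a dedicated exception type lets the retry policy decide separately whether a malformed answer is worth another attempt.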

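## 2. Retry with Exponential Backoff

`tenacity` lets you configure retries with exponential backoff. With the settings used below (up to 4 attempts), the wait doubles after each failed attempt, and a fourth failure is re-raised:
- After failed attempt 1: wait 1s
- After failed attempt 2: wait 2s
- After failed attempt 3: wait 4s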
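
The schedule above can be reproduced by hand. This sketch assumes the wait after failed attempt `n` is `multiplier * 2**(n - 1)`, clamped to the `[min, max]` range; that matches the 1s, 2s, 4s sequence listed here, but the exact formula may differ between tenacity versions, so treat it as an approximation. `backoff_delays` is a hypothetical helper.

```python
def backoff_delays(attempts: int, multiplier: float = 1.0,
                   min_s: float = 1.0, max_s: float = 10.0) -> list[float]:
    """Waits between consecutive attempts, clamped to [min_s, max_s]."""
    delays = []
    for n in range(1, attempts):  # no wait after the final attempt
        raw = multiplier * 2 ** (n - 1)
        delays.append(max(min_s, min(raw, max_s)))
    return delays


print(backoff_delays(4))  # [1.0, 2.0, 4.0]
print(backoff_delays(6))  # [1.0, 2.0, 4.0, 8.0, 10.0]
```

The second call shows the effect of the cap: the fifth wait would be 16s, but it is limited to 10s.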

In [None]:
# ============================================================
# RETRY WITH TENACITY
# ============================================================

@retry(
    stop=stop_after_attempt(4),  # at most 4 attempts in total
    wait=wait_exponential(multiplier=1, min=1, max=10),  # waits clamped to 1-10s
    # Only transient errors are retried; MalformedOutputError is raised immediately
    retry=retry_if_exception_type((SimulatedNetworkError, SimulatedRateLimitError)),
    reraise=True,  # surface the original exception instead of tenacity's RetryError
)
def llm_call_with_retry(prompt: str, fail_rate: float = 0.4) -> dict:
    """LLM call with automatic retries."""
    return llm_call_unreliable(prompt, fail_rate=fail_rate)


# Test
print("=" * 60)
print("RETRY WITH EXPONENTIAL BACKOFF")
print("=" * 60)

exitos = 0
for i in range(5):
    try:
        t0 = time.time()
        result = llm_call_with_retry("Que es machine learning?", fail_rate=0.3)
        total_time = (time.time() - t0) * 1000
        print(f"  [{i+1}] Succeeded in {total_time:.0f}ms total (including backoff waits)")
        exitos += 1
    except Exception as e:
        print(f"  [{i+1}] Permanent failure: {type(e).__name__}: {e}")

print(f"\nSuccesses: {exitos}/5")
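
To make the retry behavior less of a black box, here is a hand-written loop that does roughly what the decorator is configured to do. It is a sketch, not tenacity's implementation: `retry_with_backoff` and its parameters are hypothetical, it retries on built-in exception types so the block runs on its own, and it adds random jitter, which the decorator above does not use.

```python
import random
import time


def retry_with_backoff(fn, max_attempts=4, base_s=1.0, max_s=10.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call fn(); on a retryable error, wait with jittered exponential backoff and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            cap = min(max_s, base_s * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))  # jitter spreads out concurrent clients


calls = {"n": 0}


def flaky():
    """Hypothetical call that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated network error")
    return "ok"


waits = []  # record the waits instead of actually sleeping
print(retry_with_backoff(flaky, sleep=waits.append), "after", calls["n"], "attempts")
```

Passing `sleep` as a parameter keeps the demo and any tests fast; in real use the default `time.sleep` applies.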

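## 3. Circuit Breaker

The circuit breaker "opens" (stops forwarding calls) after N consecutive failures, which avoids hammering an API that is down. While open, calls are rejected immediately. Once the cooldown has elapsed, the next call is let through as a trial (half-open): a success closes the circuit, a failure reopens it.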

```
CLOSED ──(N consecutive failures)──▶ OPEN ──(cooldown elapsed)──▶ HALF-OPEN
  ▲                                   ▲                               │
  │                                   └───────────(failure)───────────┤
  └─────────────────────────────(success)─────────────────────────────┘
```
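
The transitions that the `CircuitBreaker` class in the next cell implements can also be written as a small lookup table. This is a minimal sketch for illustration; the event names and `next_state` are hypothetical labels, not identifiers used elsewhere in the notebook.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("closed", "failure_threshold_reached"): "open",
    ("open", "cooldown_elapsed"): "half-open",
    ("half-open", "success"): "closed",
    ("half-open", "failure"): "open",
}


def next_state(state: str, event: str) -> str:
    """Return the state after an event; unlisted pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


state = "closed"
for event in ("failure_threshold_reached", "cooldown_elapsed", "failure",
              "cooldown_elapsed", "success"):
    state = next_state(state, event)
    print(f"{event:28s} -> {state}")
```

Note that there is no direct path from closed to half-open: the trial call only happens after the circuit has been open for the full cooldown.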

In [None]:
# ============================================================
# CIRCUIT BREAKER
# ============================================================

class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Circuit breaker that protects calls to an external API."""

    def __init__(self, max_failures: int = 3, cooldown_seconds: float = 10.0):
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self.failure_count = 0
        self.state = "closed"  # closed, open, half-open
        self.last_failure_time = 0.0
        self.stats = {"calls": 0, "successes": 0, "failures": 0, "rejected": 0}

    def call(self, fn, *args, **kwargs):
        """Run fn under the protection of the circuit breaker."""
        self.stats["calls"] += 1

        # While open, reject calls until the cooldown has elapsed
        if self.state == "open":
            # Monotonic clock: unaffected by wall-clock adjustments
            elapsed = time.monotonic() - self.last_failure_time
            if elapsed < self.cooldown_seconds:
                self.stats["rejected"] += 1
                # Subclass of RuntimeError, so `except RuntimeError` still catches it
                raise CircuitOpenError(f"Circuit OPEN. Cooldown: {self.cooldown_seconds - elapsed:.1f}s remaining")
            else:
                # Cooldown over: let one trial call through
                self.state = "half-open"

        try:
            result = fn(*args, **kwargs)
            self.failure_count = 0
            self.state = "closed"
            self.stats["successes"] += 1
            return result
        except Exception:
            self.failure_count += 1
            self.last_failure_time = time.monotonic()
            self.stats["failures"] += 1

            # failure_count is not reset while open, so a failed half-open
            # trial reopens the circuit immediately
            if self.failure_count >= self.max_failures:
                self.state = "open"
                print(f"  ⚠ Circuit OPENED after {self.failure_count} consecutive failures")

            raise

    def status(self) -> dict:
        return {
            "state": self.state,
            "failure_count": self.failure_count,
            "stats": self.stats,
        }


# Test
cb = CircuitBreaker(max_failures=3, cooldown_seconds=5)

print("=" * 60)
print("CIRCUIT BREAKER")
print("=" * 60)

for i in range(8):
    try:
        result = cb.call(llm_call_unreliable, "Test", fail_rate=0.7)
        print(f"  [{i+1}] Success | State: {cb.state}")
    except RuntimeError as e:
        print(f"  [{i+1}] Rejected (circuit open) | {e}")
    except Exception as e:
        print(f"  [{i+1}] Failure: {type(e).__name__} | State: {cb.state} ({cb.failure_count} failures)")

print(f"\nStats: {cb.status()}")

## 4. Full Pattern: Retry + Circuit Breaker

In [None]:
# ============================================================
# COMBINED PATTERN
# ============================================================

def llamada_resiliente(prompt: str, cb: CircuitBreaker | None = None) -> dict:
    """
    Resilient call: circuit breaker + retry + timeout.

    Args:
        prompt: Text to send.
        cb: Circuit breaker (optional). Pass a shared instance; the default
            creates a fresh breaker per call, which never accumulates failures.

    Returns:
        Dict with the response or a controlled error.
    """
    if cb is None:
        cb = CircuitBreaker(max_failures=3, cooldown_seconds=5)

    try:
        # The breaker wraps the retrying call, so it records one failure per
        # failed llm_call_with_retry call, not one per attempt
        result = cb.call(llm_call_with_retry, prompt, fail_rate=0.3)
        return {"status": "ok", **result}
    except RuntimeError:
        # Raised by the breaker when it rejects the call
        return {"status": "circuit_open", "content": "Service temporarily unavailable."}
    except Exception as e:
        return {"status": "error", "content": f"Error after retries: {type(e).__name__}"}


cb_test = CircuitBreaker(max_failures=3, cooldown_seconds=5)

print("=" * 60)
print("COMBINED PATTERN: Circuit Breaker + Retry")
print("=" * 60)

for i in range(6):
    result = llamada_resiliente("Que es deep learning?", cb=cb_test)
    print(f"  [{i+1}] Status: {result['status']:15s} | {result['content'][:80] if result.get('content') else 'N/A'}")

print(f"\nCircuit breaker: {cb_test.status()}")
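
When timeouts and retries are stacked, it helps to know how long one call can take in the worst case. The arithmetic below is a rough upper bound under stated assumptions: every attempt runs into the 10s timeout, there are 4 attempts, and the waits between them are 1, 2, and 4 seconds. It ignores any retries the client library may perform internally. `worst_case_seconds` is a hypothetical helper.

```python
def worst_case_seconds(timeout_s: float, waits: tuple[float, ...]) -> float:
    """Upper bound for one retried call: every attempt times out, plus all waits."""
    attempts = len(waits) + 1  # one more attempt than there are waits
    return attempts * timeout_s + sum(waits)


print(worst_case_seconds(10, (1, 2, 4)))  # 47
```

A caller that needs an answer within a few seconds would have to lower the timeout or the number of attempts accordingly.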

## Takeaways

1. **Timeouts** keep one slow request from blocking the whole system
2. **Retry with backoff** handles transient failures without hammering the API
3. **Circuit breakers** protect against APIs that are down and help avoid cascading failures
4. Combining all three is a common setup for production systems
5. Always set limits: max retries, max cooldown, max timeout
6. Monitor the circuit breaker's statistics to detect infrastructure problems
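
As a starting point for the last item, the counters kept by the notebook's `CircuitBreaker` (`calls`, `successes`, `failures`, `rejected`) can be turned into rates that are easier to alert on. A minimal sketch; `summarize_breaker` and the sample numbers are hypothetical.

```python
def summarize_breaker(stats: dict) -> dict:
    """Derive failure and rejection rates from CircuitBreaker-style counters."""
    calls = stats["calls"]
    executed = stats["successes"] + stats["failures"]  # calls that reached the API
    return {
        "failure_rate": stats["failures"] / executed if executed else 0.0,
        "rejection_rate": stats["rejected"] / calls if calls else 0.0,
    }


sample = {"calls": 8, "successes": 2, "failures": 3, "rejected": 3}
print(summarize_breaker(sample))  # {'failure_rate': 0.6, 'rejection_rate': 0.375}
```

A sustained high rejection rate means the circuit spends much of its time open, which usually points at the upstream service rather than at individual requests.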