# Chapter 2: DSPy Essentials & First Single Agent

**Production-Ready Multi-Agent Systems with DSPy**

---

## Learning Objectives

By the end of this chapter, you will be able to:

1. ✅ **Understand DSPy's core concepts:**
   - Signatures (structured inputs/outputs)
   - Modules (reusable components)
   - Predictors (LLM wrappers)
   - ChainOfThought (explicit reasoning)

2. ✅ **Implement your first ReAct agent:**
   - Complete environment setup
   - Data models with Pydantic
   - Tool functions
   - A complete, working agent

3. ✅ **Identify the limitations of single agents:**
   - Test on simple cases (success)
   - Test on complex cases (failure)
   - Understand WHY it fails
   - Motivation for multi-agent systems (Chapter 3)

---

## Prerequisites

- Intermediate Python (classes, type hints)
- Basic LLM concepts
- Configured environment (see [Introduction](../../introducao.md))

---

## Estimated Time

- **Reading + running the code:** 60-75 minutes
- **Experimentation:** +30-45 minutes

---

## Chapter Structure

```
1. Theory: DSPy Core Concepts
2. Setup and Configuration
3. Data Models (Pydantic)
4. Tool Functions
5. First Single Agent (ReAct)
6. Tests: Simple Cases (✅ Success)
7. Tests: Complex Cases (❌ Failure)
8. Analysis of Limitations
9. Conclusions and Next Steps
```

---

**Let's get started!** 🚀


## Part 1: DSPy Core Concepts - Fundamental Theory

### What Is DSPy?

**DSPy** (Declarative Self-improving Python) is a framework created by Omar Khattab and team at the Stanford NLP Group that treats **LLM pipelines as programs**, not as magic strings.

**The fundamental difference:**

**Traditional Prompt Engineering:**
```python
prompt = """
You are a helpful assistant. Given user request, you should analyze it carefully,
use available tools, and provide a detailed response. Always be professional...
[20+ lines of instructions]
"""
response = llm(prompt + user_input)
```

**Problems:**
- ❌ Long, brittle prompts
- ❌ Hard to optimize systematically
- ❌ Heavy trial and error
- ❌ Not reusable

**DSPy:**
```python
# Illustrative only: ProcessSignature, optimizer, and examples are assumed to be defined elsewhere
class MyAgent(dspy.Module):
    def __init__(self):
        super().__init__()
        self.process = dspy.ChainOfThought(ProcessSignature)
    
    def forward(self, user_input):
        return self.process(user_input=user_input)

# Automatic optimization
optimized = optimizer.compile(MyAgent(), trainset=examples)
```

**Advantages:**
- ✅ Declarative: you define WHAT, not HOW
- ✅ Built-in optimization (MIPRO, BootstrapFewShot)
- ✅ Modular and reusable
- ✅ Testable and reproducible

---

### Core Concepts

DSPy has 3 fundamental concepts:

```
┌─────────────────────────────────────┐
│          DSPy Program               │
│                                     │
│  ┌───────────────────────────────┐  │
│  │   1. SIGNATURE                │  │
│  │   (WHAT to do)                │  │
│  │   inputs → outputs            │  │
│  └───────────────────────────────┘  │
│              ↓                      │
│  ┌───────────────────────────────┐  │
│  │   2. MODULE                   │  │
│  │   (HOW to do it)              │  │
│  │   processing logic            │  │
│  └───────────────────────────────┘  │
│              ↓                      │
│  ┌───────────────────────────────┐  │
│  │   3. PREDICTOR                │  │
│  │   (WHO executes)              │  │
│  │   LLM wrapper                 │  │
│  └───────────────────────────────┘  │
│                                     │
└─────────────────────────────────────┘
```

Let's explore each one in detail.

---

### 1. Signatures: Structured Inputs and Outputs

A **Signature** defines **WHAT** your program does: which inputs it receives and which outputs it produces.

**Concept:** Similar to type signatures in programming languages.

```python
# In plain Python
def soma(a: int, b: int) -> int:
    return a + b
```

**In DSPy:**
```python
class ProcessRequest(dspy.Signature):
    """Process the user's request and return a response."""
    
    user_request: str = dspy.InputField(
        desc="User request in natural language"
    )
    response: str = dspy.OutputField(
        desc="Processed response for the user"
    )
```

**Components of a Signature:**

1. **Docstring:** Describes the goal (used in the prompt)
2. **InputField:** Input fields with descriptions
3. **OutputField:** Expected output fields

**Why Signatures?**
- ✅ Clear input/output structure
- ✅ Type-annotated fields
- ✅ Self-documenting
- ✅ Optimizers can use the descriptions to improve prompts
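
To make the idea of "declared structure becomes a prompt" concrete, here is a minimal sketch in plain Python. It does not use DSPy and does not reproduce DSPy's real prompt format; `FieldSpec`, `ToySignature`, and `render_prompt` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FieldSpec:
    """Hypothetical stand-in for an input/output field: a name plus a description."""
    name: str
    desc: str


@dataclass
class ToySignature:
    """Hypothetical stand-in for a signature: instructions, inputs, and outputs."""
    instructions: str
    inputs: list[FieldSpec] = field(default_factory=list)
    outputs: list[FieldSpec] = field(default_factory=list)


def render_prompt(sig: ToySignature, **values: str) -> str:
    """Build a plain-text prompt from the declared structure and the given input values."""
    missing = [f.name for f in sig.inputs if f.name not in values]
    if missing:
        raise ValueError(f"missing inputs: {missing}")
    lines = [sig.instructions, ""]
    for f in sig.inputs:
        lines.append(f"{f.name} ({f.desc}): {values[f.name]}")
    for f in sig.outputs:
        # Output fields are left empty for the model to complete.
        lines.append(f"{f.name} ({f.desc}):")
    return "\n".join(lines)


process_request = ToySignature(
    instructions="Process the user's request and return a response.",
    inputs=[FieldSpec("user_request", "User request in natural language")],
    outputs=[FieldSpec("response", "Processed response for the user")],
)

prompt = render_prompt(process_request, user_request="Hello")
print(prompt)
```

The point of the sketch is only that the docstring and field descriptions are data a framework can read, rewrite, and optimize, which a hand-written prompt string does not allow.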

---

### 2. Modules: Reusable Components

A **Module** defines **HOW** to process: the logic of your program.

**Concept:** Similar to `nn.Module` in PyTorch, but for LLMs.

```python
class MyAgent(dspy.Module):
    def __init__(self):
        super().__init__()
        # Initialize sub-modules
        self.process = dspy.ChainOfThought(ProcessRequest)
    
    def forward(self, user_request: str) -> dspy.Prediction:
        # Processing logic
        return self.process(user_request=user_request)
```

**Characteristics:**
- Inherits from `dspy.Module`
- Implements a `forward()` method
- Can compose other modules
- Can be optimized

**Why Modules?**
- ✅ Reusable
- ✅ Testable
- ✅ Composable (modules inside modules)
- ✅ Optimization-ready
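
The composition pattern (modules inside modules, with calls routed to `forward()`) can be sketched without any LLM. The classes below are hypothetical stand-ins written for this illustration, not DSPy classes; the "LLM calls" are replaced by simple string operations so the example runs offline.

```python
class ToyModule:
    """Hypothetical base class: calling the instance runs forward()."""

    def __call__(self, **kwargs):
        return self.forward(**kwargs)

    def forward(self, **kwargs):
        raise NotImplementedError


class Summarize(ToyModule):
    def forward(self, text: str) -> str:
        # Stand-in for an LLM call: keep only the first sentence.
        return text.split(".")[0].strip() + "."


class Shout(ToyModule):
    def forward(self, text: str) -> str:
        # Stand-in for a second LLM call: upper-case the text.
        return text.upper()


class Pipeline(ToyModule):
    """A module composed of two sub-modules, called in sequence."""

    def __init__(self):
        self.summarize = Summarize()
        self.shout = Shout()

    def forward(self, text: str) -> str:
        summary = self.summarize(text=text)
        return self.shout(text=summary)


pipeline = Pipeline()
result = pipeline(text="DSPy is a framework. It has modules.")
print(result)  # DSPY IS A FRAMEWORK.
```

Because each sub-module is an ordinary attribute, it can be tested on its own or swapped for another implementation without touching `Pipeline.forward()`.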

---

### 3. Predictors: LLM Wrappers

A **Predictor** is **WHO** executes: the wrapper that calls the LLM.

DSPy provides several ready-made predictors:

#### `dspy.Predict`

The basic predictor: a direct call to the LLM.

```python
predict = dspy.Predict(ProcessRequest)
result = predict(user_request="Hello")
```

**When to use:** Simple tasks that need no complex reasoning.

---

#### `dspy.ChainOfThought`

Adds **explicit reasoning** before the answer.

```python
cot = dspy.ChainOfThought(ProcessRequest)
result = cot(user_request="Analyze this contract")
# result.reasoning: "Let me think step by step..."  (older DSPy releases named this field `rationale`)
# result.response: "Based on analysis..."
```

**How it works:**
1. The LLM generates a `reasoning` field (named `rationale` in older DSPy releases)
2. It then generates `response`, conditioned on that reasoning

**When to use:** Tasks that benefit from explicit reasoning.

**Reference:** Based on Chain-of-Thought Prompting [Wei et al., 2022]

---

#### `dspy.ReAct`

**Reasoning + Acting:** alternates between reasoning and actions (tool use).

```python
# search and calculator are assumed to be plain Python functions defined elsewhere
react = dspy.ReAct(ProcessRequest, tools=[search, calculator])
result = react(user_request="What's 2+2 and the weather in SF?")
# Alternates: thought → action (tool) → observation → thought → ...
```

**How it works:**
```
1. Thought: "I need to calculate 2+2"
2. Action: calculator(2+2)
3. Observation: 4
4. Thought: "Now I need weather for SF"
5. Action: search_weather("SF")
6. Observation: "Sunny, 72°F"
7. Thought: "I have all information"
8. Answer: "2+2 is 4. Weather in SF is sunny..."
```

**When to use:** Tasks that require external tools or APIs.

**Reference:** Based on the ReAct pattern [Yao et al., 2022]
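
The thought → action → observation cycle can be sketched as a plain loop. This is a toy under stated assumptions, not how `dspy.ReAct` is implemented: the LLM is replaced by a fixed `SCRIPT`, and `calculator`, `search_weather`, and `run_react` are hypothetical helpers with canned data.

```python
from typing import Callable


def calculator(expression: str) -> str:
    """Toy tool: add integers written as 'a+b'."""
    return str(sum(int(part) for part in expression.split("+")))


def search_weather(city: str) -> str:
    """Toy tool: canned weather lookup (hypothetical data)."""
    return {"SF": "Sunny, 72°F"}.get(city, "unknown")


TOOLS: dict[str, Callable[[str], str]] = {
    "calculator": calculator,
    "search_weather": search_weather,
}

# Scripted stand-in for the LLM: each step is (thought, tool_name, tool_argument).
# A tool_name of None means the policy is ready to answer.
SCRIPT = [
    ("I need to calculate 2+2", "calculator", "2+2"),
    ("Now I need weather for SF", "search_weather", "SF"),
    ("I have all information", None, None),
]


def run_react(script, tools, max_steps: int = 5):
    """Alternate thought -> action -> observation until the policy stops or max_steps is hit."""
    trace, observations = [], []
    for thought, tool_name, argument in script[:max_steps]:
        trace.append(f"Thought: {thought}")
        if tool_name is None:
            break
        observation = tools[tool_name](argument)
        trace.append(f"Action: {tool_name}({argument!r})")
        trace.append(f"Observation: {observation}")
        observations.append(observation)
    return trace, observations


trace, observations = run_react(SCRIPT, TOOLS)
print("\n".join(trace))
```

In a real agent the next thought and action are chosen by the model after reading the previous observations, and a step limit such as `max_steps` guards against loops that never terminate.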

---

### Predictor Comparison

| Predictor | Reasoning | Tool Use | Complexity | When to Use |
|-----------|-----------|----------|------------|-------------|
| `Predict` | ❌ | ❌ | Low | Simple tasks, classification |
| `ChainOfThought` | ✅ | ❌ | Medium | Analysis, reasoning |
| `ReAct` | ✅ | ✅ | High | Agents, tool use |

---

### Complete Example: Anatomy of a DSPy Program

```python
import dspy

# 1. SIGNATURE: WHAT to do
class AnalyzeText(dspy.Signature):
    """Analyze text and extract the main insights."""
    text: str = dspy.InputField(desc="Text to analyze")
    insights: str = dspy.OutputField(desc="Extracted insights")

# 2. MODULE: HOW to do it
class TextAnalyzer(dspy.Module):
    def __init__(self):
        super().__init__()
        # 3. PREDICTOR: WHO executes
        self.analyze = dspy.ChainOfThought(AnalyzeText)
    
    def forward(self, text: str):
        return self.analyze(text=text)

# Usage (requires an LM already configured via dspy.configure)
analyzer = TextAnalyzer()
result = analyzer(text="DSPy is a framework...")
print(result.insights)
```

---

### DSPy vs Other Frameworks

**vs LangChain:**
- LangChain: orchestration, many integrations
- DSPy: built-in optimization, more structured

**vs Prompt Engineering:**
- Prompt Engineering: manual, trial and error
- DSPy: automated, optimization-first

**vs Fine-Tuning:**
- Fine-Tuning: changes the model's weights (expensive, slow)
- DSPy Optimization: changes the prompts, i.e. instructions and few-shot examples (fast, cheap)
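
To illustrate what "optimizing prompts instead of weights" means, the sketch below scores candidate instructions against a small training set and keeps the best one. This is a deliberately naive search, not MIPRO or BootstrapFewShot; `fake_lm`, `exact_match`, `score`, and `pick_best_instruction` are hypothetical names, and the "model" is a hard-coded function so the example runs offline.

```python
def fake_lm(instruction: str, question: str) -> str:
    """Hypothetical stand-in for an LLM: replies tersely only when told to."""
    answers = {"2+2": "4", "3+5": "8"}
    answer = answers.get(question, "?")
    if "only the number" in instruction:
        return answer
    return f"The answer is {answer}."


def exact_match(prediction: str, expected: str) -> bool:
    """Metric: the prediction must equal the expected answer exactly."""
    return prediction.strip() == expected


def score(instruction: str, trainset) -> float:
    """Fraction of training examples answered correctly under this instruction."""
    hits = sum(exact_match(fake_lm(instruction, q), a) for q, a in trainset)
    return hits / len(trainset)


def pick_best_instruction(candidates, trainset) -> str:
    """Keep the candidate instruction with the highest metric score."""
    return max(candidates, key=lambda c: score(c, trainset))


trainset = [("2+2", "4"), ("3+5", "8")]
candidates = ["Answer the question.", "Answer with only the number."]

best = pick_best_instruction(candidates, trainset)
print(best, score(best, trainset))
```

The model itself never changes here; only the text sent to it does, which is why this kind of search tends to be much cheaper than fine-tuning.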

---

### References

**Main Paper:**
> Khattab, O., et al. (2023). **DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines.** arXiv:2310.03714.

**Chain-of-Thought:**
> Wei, J., et al. (2022). **Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.** NeurIPS 2022.

**ReAct:**
> Yao, S., et al. (2022). **ReAct: Synergizing Reasoning and Acting in Language Models.** arXiv:2210.03629.

---

**Now that we understand the theory, let's IMPLEMENT our first agent!** 💻


## Part 2: Setup and Configuration

Now let's set up the environment for our first agent.

### Required Imports

Let's import everything we need:


```python
# Core imports
import dspy
from datetime import datetime, timedelta
from typing import List, Optional, Dict, Any
import json
import uuid
import os

# Data validation
from pydantic import BaseModel, Field

# Environment variables
from dotenv import load_dotenv
load_dotenv()

print("✅ Imports complete!")
```


### LLM Configuration

DSPy supports multiple LLM providers. We'll use **Groq** (fast, with a generous free tier).

**Alternatives:**
- OpenAI: `dspy.LM(model="openai/gpt-4")`
- Anthropic: `dspy.LM(model="anthropic/claude-3-5-sonnet-20241022")`
- Local: Ollama, vLLM, etc.

**API Key Configuration:**
```bash
# In the terminal or a .env file
export GROQ_API_KEY="your-key-here"
```

**Get a free key:** https://console.groq.com
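
Before configuring the LM, it can help to fail fast when the key is missing rather than discover it on the first API call. The helpers below, `require_env` and `mask`, are hypothetical names introduced for this sketch and use only the standard library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Export it in the shell or add it to a .env file."
        )
    return value


def mask(secret: str, visible: int = 4) -> str:
    """Show only the last few characters so the key can be logged safely."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


try:
    print("GROQ_API_KEY:", mask(require_env("GROQ_API_KEY")))
except RuntimeError as err:
    print(err)
```

If you load the key from a `.env` file, run this check after `load_dotenv()` so the variable is already in the process environment.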


```python
# Configure the LLM
lm = dspy.LM(
    model="groq/llama-3.3-70b-versatile",
    temperature=0.1  # Low temperature for more deterministic responses
)

# Configure DSPy to use this LLM
dspy.configure(lm=lm)

print(f"✅ LLM configured: {lm.model}")
print(f"   Temperature: {lm.kwargs.get('temperature', 'default')}")
```


## Part 3: Data Models with Pydantic

Let's model our domain: a **flight booking system**.

**Why Pydantic?**
- ✅ Automatic data validation
- ✅ Type safety
- ✅ Easy JSON serialization
- ✅ Integrates with DSPy (Pydantic models can be used as field types in signatures)
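
As a small illustration of automatic validation, the sketch below uses a hypothetical `FareQuote` model (not one of this chapter's domain models) with two constrained fields. It assumes Pydantic is installed and sticks to features available in both Pydantic v1 and v2.

```python
from pydantic import BaseModel, Field, ValidationError


class FareQuote(BaseModel):
    """Hypothetical model used only for this illustration."""
    flight_number: str
    price: float = Field(..., gt=0)          # must be positive
    available_seats: int = Field(..., ge=0)  # cannot be negative


# Valid data: the numeric string is coerced to a float.
ok = FareQuote(flight_number="G3-1001", price="450.00", available_seats=15)
print(ok.price)

# Invalid data: a negative price is rejected before it reaches the rest of the system.
bad_fields = []
try:
    FareQuote(flight_number="G3-1001", price=-10, available_seats=3)
except ValidationError as err:
    bad_fields = [e["loc"][0] for e in err.errors()]

print(bad_fields)
```

Catching bad values at the model boundary matters for agents in particular, because tool arguments produced by an LLM are not guaranteed to be well-formed.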

### Domain Models

Our system needs to represent:

```python
class UserProfile(BaseModel):
    """User profile."""
    name: str
    user_id: str
    email: str
    phone: str
    frequent_flyer_number: Optional[str] = None

class Flight(BaseModel):
    """Information about a single flight."""
    flight_id: str
    flight_number: str
    departure_airport: str
    arrival_airport: str
    departure_time: str
    arrival_time: str
    duration_minutes: int
    price: float
    available_seats: int
    
class Itinerary(BaseModel):
    """Travel itinerary (may contain multiple flights)."""
    itinerary_id: str
    user_id: str
    flights: List[Flight]
    total_price: float
    booking_date: str
    status: str  # "confirmed", "cancelled", "pending"

print("✅ Models defined!")
print("\nExample usage:")
user = UserProfile(
    name="Maria",
    user_id="user_001", 
    email="maria@example.com",
    phone="+55-11-98765-4321"
)
print(f"   User: {user.name} ({user.email})")
```


### Mock Databases

To simulate a real system, let's create in-memory databases:

```python
# User database
users_db = {
    "Maria": UserProfile(
        name="Maria",
        user_id="user_001",
        email="maria@example.com",
        phone="+55-11-98765-4321",
        frequent_flyer_number="FF12345"
    ),
    "João": UserProfile(
        name="João",
        user_id="user_002",
        email="joao@example.com",
        phone="+55-11-98765-5678"
    )
}

# Flight database, keyed by route
flights_db = {
    "GRU-SDU": [  # São Paulo → Rio
        Flight(
            flight_id="f001",
            flight_number="G3-1001",
            departure_airport="GRU",
            arrival_airport="SDU",
            departure_time="08:00",
            arrival_time="09:10",
            duration_minutes=70,
            price=450.00,
            available_seats=15
        ),
        Flight(
            flight_id="f002",
            flight_number="LA-4567",
            departure_airport="GRU",
            arrival_airport="SDU",
            departure_time="14:00",
            arrival_time="15:15",
            duration_minutes=75,
            price=380.00,
            available_seats=8
        )
    ],
    "SDU-GRU": [  # Rio → São Paulo
        Flight(
            flight_id="f003",
            flight_number="G3-1002",
            departure_airport="SDU",
            arrival_airport="GRU",
            departure_time="10:00",
            arrival_time="11:15",
            duration_minutes=75,
            price=420.00,
            available_seats=12
        )
    ]
}

# Booking database (starts empty)
itineraries_db = {}

print("✅ Databases created!")
print(f"   Users: {len(users_db)}")
print(f"   Routes: {len(flights_db)}")
print(f"   Total flights: {sum(len(v) for v in flights_db.values())}")
```
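
Since the flight database is keyed by an `ORIGIN-DESTINATION` string, a lookup reduces to building that key and filtering the resulting list. The sketch below shows one possible shape for such a helper; `find_flights` is a hypothetical name, and `FlightStub` is a simplified dataclass stand-in for the chapter's `Flight` model so the example runs without Pydantic.

```python
from dataclasses import dataclass


@dataclass
class FlightStub:
    """Simplified stand-in for the chapter's Flight model."""
    flight_number: str
    price: float
    available_seats: int


flights_by_route = {
    "GRU-SDU": [
        FlightStub("G3-1001", 450.00, 15),
        FlightStub("LA-4567", 380.00, 8),
    ],
    "SDU-GRU": [FlightStub("G3-1002", 420.00, 12)],
}


def find_flights(db, origin: str, destination: str, min_seats: int = 1):
    """Return flights on a route with enough seats, cheapest first."""
    route_key = f"{origin}-{destination}"
    candidates = db.get(route_key, [])  # unknown routes yield an empty list
    available = [f for f in candidates if f.available_seats >= min_seats]
    return sorted(available, key=lambda f: f.price)


cheapest_first = find_flights(flights_by_route, "GRU", "SDU")
none_found = find_flights(flights_by_route, "GRU", "BSB")

print([f.flight_number for f in cheapest_first])
print(none_found)
```

Returning an empty list for an unknown route, instead of raising, gives an agent a clean observation to reason about when a search finds nothing.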
