# 📋 GoF Patterns in LangExtract

## 🎯 5 Patterns Identified

| # | Pattern | File | GoF | What it does |
|---|---------|------|-----|--------------|
| 1 | **Factory** | `factory.py` | ⚠️ | Creates providers (Gemini/OpenAI/Ollama) via `create_model()` |
| 2 | **Iterator** | `chunking.py` | ✅ | Iterates over large documents with `ChunkIterator.__next__()` |
| 3 | **Facade** | `extraction.py` | ✅ | The `extract()` function hides 6 subsystems |
| 4 | **Builder** | `prompting.py` | ⚠️ | `QAPromptGenerator.render()` builds prompts in 6 steps |
| 5 | **Strategy** | `base_model.py` | ✅ | `BaseLanguageModel` interface with 3 interchangeable algorithms (Gemini/OpenAI/Ollama) |

**Legend:** ✅ Full GoF pattern | ⚠️ Simplified variation

---

## 📝 Evidence

### 1. Factory (`factory.py`)
```python
def create_model(config) -> BaseLanguageModel:
    provider_cls = router.resolve(config.model_id)
    return provider_cls(**config.provider_kwargs)  # Returns Gemini/OpenAI/Ollama
```

### 2. Iterator (`chunking.py`)
```python
class ChunkIterator:
    def __next__(self) -> Chunk:
        if not self._has_more:
            raise StopIteration
        return self._get_next_chunk()
```

### 3. Facade (`extraction.py`)
```python
def extract(...):
    model = create_model(...)      # Factory
    prompt = QAPromptGenerator()   # Builder
    return Annotator(model).annotate(...)  # Strategy
```

### 4. Builder (`prompting.py`)
```python
def render(question):
    lines = []
    lines.append(description)       # Step 1
    lines.append(context)           # Step 2
    lines.append(examples)          # Step 3
    lines.append(question)          # Step 4
    return "\n".join(lines)
```
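
The step-by-step assembly above can be sketched as a small runnable class. This is a minimal illustration, not LangExtract's actual `QAPromptGenerator`: the class name `PromptSketch`, its fields, and the `Q:`/`A:` formatting are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSketch:
    """Hypothetical stand-in for a prompt generator (not the real class)."""
    description: str
    examples: list[str] = field(default_factory=list)

    def render(self, question: str, context: str = "") -> str:
        lines = [self.description]          # Step 1: task description
        if context:
            lines.append(context)           # Step 2: optional context
        lines.extend(self.examples)         # Step 3: few-shot examples
        lines.append(f"Q: {question}")      # Step 4: the question itself
        lines.append("A: ")                 # Step 5: answer prefix (assumed format)
        return "\n".join(lines)


prompt = PromptSketch(
    description="Extract medication names.",
    examples=["Q: Take aspirin daily.\nA: aspirin"],
).render("Patient was given ibuprofen.")
print(prompt)
```

Each step only appends to a list, so the order of the sections is fixed in one place and the caller never concatenates strings by hand.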

### 5. Strategy (`base_model.py` + providers)
```python
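from abc import ABC, abstractmethod

class BaseLanguageModel(ABC):
    @abstractmethod
    def infer(self, batch_prompts, **kwargs): ...

class GeminiLanguageModel(BaseLanguageModel):
    def infer(self, batch_prompts, **kwargs):
        ...  # delegates to the google-genai SDK

class OpenAILanguageModel(BaseLanguageModel):
    def infer(self, batch_prompts, **kwargs):
        ...  # delegates to the openai SDK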

---

## Detailed Documentation

1. `factory_pattern_analysis.md`
2. `iterator_pattern_analysis.md`
3. `facade_pattern_analysis.md`
4. `builder_pattern_analysis.md`
5. `strategy_pattern_analysis.md`

**Total:** 5 GoF patterns, each compared side by side as classic code vs. LangExtract.


# 🏭 Factory Pattern - Comparative Analysis
## LangExtract vs. the Classic Architectural Pattern

---

## 📚 **1. CLASSIC FACTORY PATTERN (GoF)**

### **Theoretical Definition**

> **Factory Method Pattern**: Defines an interface for creating an object, but lets subclasses decide which class to instantiate. Factory Method lets a class defer instantiation to its subclasses.

### **Classic UML Structure**

```
┌─────────────────────┐
│    Creator          │ (Abstract)
├─────────────────────┤
│ + factoryMethod()   │ ◄─── Abstract method
│ + operation()       │
└─────────────────────┘
         △
         │ (inheritance)
         │
┌────────┴────────────┐
│ ConcreteCreatorA    │
├─────────────────────┤
│ + factoryMethod()   │ ◄─── Returns ProductA
└─────────────────────┘

┌─────────────────────┐
│    Product          │ (Interface)
└─────────────────────┘
         △
         │
┌────────┴────────────┬────────────────┐
│   ProductA          │   ProductB     │
└─────────────────────┴────────────────┘
```

### **Classic Example Code**

```python
from abc import ABC, abstractmethod

# 1. PRODUCT - Common interface
class Animal(ABC):
    @abstractmethod
    def speak(self) -> str:
        pass

# 2. CONCRETE PRODUCTS - Specific implementations
class Dog(Animal):
    def speak(self) -> str:
        return "Woof!"

class Cat(Animal):
    def speak(self) -> str:
        return "Meow!"

# 3. CREATOR - Abstract factory
class AnimalFactory(ABC):
    @abstractmethod
    def create_animal(self) -> Animal:
        """Factory Method - implemented by subclasses"""
        pass

    def make_sound(self) -> str:
        """Operation that uses the factory method"""
        animal = self.create_animal()
        return animal.speak()

# 4. CONCRETE CREATORS - Specific factories
class DogFactory(AnimalFactory):
    def create_animal(self) -> Animal:
        return Dog()

class CatFactory(AnimalFactory):
    def create_animal(self) -> Animal:
        return Cat()

# 5. USAGE
factory = DogFactory()
print(factory.make_sound())  # Output: Woof!
```

---

## 🔍 **2. FACTORY IN LANGEXTRACT (factory.py)**

### **Structure of the Real Code**

```python
# ============================================
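# FILE: langextract/factory.py
# ============================================

import dataclasses
import os
from collections.abc import Sequence
from typing import Any

from langextract import providers
from langextract.providers import router
# Imports of BaseLanguageModel and InferenceConfigError are omitted in this excerpt.

@dataclasses.dataclass(slots=True, frozen=True)
class ModelConfig:
    """Configuration for creating a provider"""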
    model_id: str | None = None
    provider: str | None = None
    provider_kwargs: dict[str, Any] = dataclasses.field(default_factory=dict)

# ============================================
# FACTORY METHOD - Main function
# ============================================
def create_model(
    config: ModelConfig,
    examples: Sequence[Any] | None = None,
    use_schema_constraints: bool = False,
    fence_output: bool | None = None,
) -> BaseLanguageModel:
    """
    Factory function that creates provider instances.

    Plays the role of factoryMethod() in the GoF pattern, but as a
    plain function rather than an abstract method.
    """

    # 1. Load the available providers (built-in + plugins)
    providers.load_builtins_once()
    providers.load_plugins_once()

    # 2. RESOLUTION: which class should be instantiated?
    if config.provider:
        # Explicit selection by name
        provider_class = router.resolve_provider(config.provider)
    else:
        # Automatic selection by model_id
        provider_class = router.resolve(config.model_id)

    # 3. PREPARATION: parameters with defaults
    kwargs = _kwargs_with_environment_defaults(
        config.model_id or config.provider or "",
        config.provider_kwargs
    )
    
    if config.model_id:
        kwargs["model_id"] = config.model_id
    
    # 4. INSTANTIATION: create the concrete product
    try:
        model = provider_class(**kwargs)
        return model
    except (ValueError, TypeError) as e:
        raise InferenceConfigError(
            f"Failed to create provider {provider_class.__name__}: {e}"
        ) from e

# ============================================
# HELPER: Adds environment defaults
# ============================================
def _kwargs_with_environment_defaults(
    model_id: str,
    kwargs: dict[str, Any]
) -> dict[str, Any]:
    """Fills in API keys from environment variables"""
    resolved = dict(kwargs)

    if "api_key" not in resolved:
        # Try GEMINI_API_KEY, OPENAI_API_KEY, etc.
        env_vars_by_provider = {
            "gemini": ("GEMINI_API_KEY", "LANGEXTRACT_API_KEY"),
            "gpt": ("OPENAI_API_KEY", "LANGEXTRACT_API_KEY"),
        }
        
        for provider_prefix, env_vars in env_vars_by_provider.items():
            if provider_prefix in model_id.lower():
                for env_var in env_vars:
                    api_key = os.getenv(env_var)
                    if api_key:
                        resolved["api_key"] = api_key
                        break
    
    return resolved
```

### **PRODUCTS - Concrete implementations (providers)**

```python
# ============================================
# FILE: langextract/providers/gemini.py
# ============================================
class GeminiLanguageModel(BaseLanguageModel):
    """Concrete provider for Gemini"""

    def __init__(self, model_id: str = 'gemini-2.5-flash', api_key: str | None = None, **kwargs):
        self.model_id = model_id
        self.api_key = api_key
        self._client = genai.Client(api_key=api_key)
    
    def infer(self, batch_prompts: Sequence[str], **kwargs):
        """Provider-specific inference implementation"""
        for prompt in batch_prompts:
            response = self._client.models.generate_content(
                model=self.model_id,
                contents=prompt
            )
            yield [ScoredOutput(score=1.0, output=response.text)]

# ============================================
# FILE: langextract/providers/openai.py
# ============================================
class OpenAILanguageModel(BaseLanguageModel):
    """Concrete provider for OpenAI"""

    def __init__(self, model_id: str = 'gpt-4o', api_key: str | None = None, **kwargs):
        self.model_id = model_id
        self.api_key = api_key
        self._client = openai.OpenAI(api_key=api_key)
    
    def infer(self, batch_prompts: Sequence[str], **kwargs):
        """Provider-specific inference implementation"""
        for prompt in batch_prompts:
            response = self._client.chat.completions.create(
                model=self.model_id,
                messages=[{"role": "user", "content": prompt}]
            )
            yield [ScoredOutput(score=1.0, output=response.choices[0].message.content)]
```

### **COMMON INTERFACE (Product)**

```python
# ============================================
# FILE: langextract/core/base_model.py
# ============================================
class BaseLanguageModel(ABC):
    """Common interface for all providers"""

    @abstractmethod
    def infer(self, batch_prompts: Sequence[str], **kwargs) -> Iterator[Sequence[ScoredOutput]]:
        """Method every provider must implement"""
        pass
```

---

## 📊 **3. SIDE-BY-SIDE COMPARISON**

### **Comparison Table**

| Aspect | Classic Factory (GoF) | LangExtract Factory |
|--------|-----------------------|---------------------|
| **Factory Method** | `create_animal()` (abstract) | `create_model()` (function) |
| **Creators** | `DogFactory`, `CatFactory` | No separate creator classes |
| **Products** | `Dog`, `Cat` | `GeminiLanguageModel`, `OpenAILanguageModel` |
| **Product Interface** | `Animal` | `BaseLanguageModel` |
| **Resolution** | Polymorphism (which creator?) | Router + Registry (which provider?) |
| **Configuration** | Hardcoded in the factory | `ModelConfig` dataclass |
| **Defaults** | None | `_kwargs_with_environment_defaults()` |
| **Extensibility** | Create a new Creator subclass | Plugin system (entry points) |

---

## 🎯 **4. CONCEPTUAL MAPPING**

### **GoF Pattern Elements → LangExtract**

```
┌─────────────────────────────────────────────────────────────┐
│  GoF PATTERN              →    LANGEXTRACT                  │
├─────────────────────────────────────────────────────────────┤
│  Creator (abstract)       →    None (plain function)        │
│  factoryMethod()          →    create_model()               │
│  ConcreteCreator          →    None (resolved by the Router)│
│  Product (interface)      →    BaseLanguageModel            │
│  ConcreteProductA         →    GeminiLanguageModel          │
│  ConcreteProductB         →    OpenAILanguageModel          │
│  ConcreteProductC         →    OllamaLanguageModel          │
└─────────────────────────────────────────────────────────────┘
```

### **Decision Flow**

```
┌─────────────────────┐
│ Client calls:       │
│ create_model()      │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ ModelConfig         │
│ - model_id          │ ──┐
│ - provider          │   │
└─────────────────────┘   │
                          │
                          ▼
                 ┌─────────────────┐
                 │ Router.resolve()│
                 │ (Registry)      │
                 └────────┬────────┘
                          │
        ┌─────────────────┼─────────────────┐
        ▼                 ▼                 ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│ Gemini       │  │ OpenAI       │  │ Ollama       │
│ LanguageModel│  │ LanguageModel│  │ LanguageModel│
└──────────────┘  └──────────────┘  └──────────────┘
```
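
The decision flow can be sketched in a few lines of standalone Python. This is a simplified model, not the real `langextract.providers.router`: the registry contents, the regex patterns, the placeholder classes, and the `pick_class()` helper are all assumptions chosen for illustration.

```python
import re
from typing import Optional


class GeminiModel: ...
class OpenAIModel: ...
class OllamaModel: ...

# Hypothetical registry: provider name -> (class, model_id patterns).
_REGISTRY = {
    "gemini": (GeminiModel, [r"^gemini"]),
    "openai": (OpenAIModel, [r"^gpt-"]),
    "ollama": (OllamaModel, [r"^llama", r"^gemma"]),
}


def resolve_provider(name: str) -> type:
    """Explicit selection by provider name."""
    try:
        return _REGISTRY[name.lower()][0]
    except KeyError:
        raise ValueError(f"No provider named {name!r}") from None


def resolve(model_id: str) -> type:
    """Automatic selection: first provider whose pattern matches model_id."""
    for cls, patterns in _REGISTRY.values():
        if any(re.search(p, model_id) for p in patterns):
            return cls
    raise ValueError(f"No provider registered for model_id {model_id!r}")


def pick_class(model_id: Optional[str] = None, provider: Optional[str] = None) -> type:
    # An explicit provider wins; otherwise fall back to the model_id patterns.
    if provider:
        return resolve_provider(provider)
    if model_id:
        return resolve(model_id)
    raise ValueError("Either model_id or provider is required")


print(pick_class(model_id="gpt-4o").__name__)
print(pick_class(model_id="gpt-4o", provider="ollama").__name__)
```

The second call shows why the explicit `provider` branch comes first: it lets a caller route a model ID to a provider that the patterns alone would not select.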

---

## 💡 **5. PATTERN VARIATIONS**

### **A) Simple Factory (what LangExtract uses)**

```python
# ❌ NO hierarchy of Creators
# ✅ A single, simple factory function

def create_model(config: ModelConfig) -> BaseLanguageModel:
    # Decision based on config (illustrative; the real function asks the router)
    if config.provider == "gemini":
        return GeminiLanguageModel(...)
    elif config.provider == "openai":
        return OpenAILanguageModel(...)
    # ...
```

**Characteristic:** A single function decides which class to instantiate.

### **B) Factory Method (classic GoF)**

```python
# ✅ HAS a hierarchy of Creators
# ✅ HAS an abstract method

class ModelFactory(ABC):
    @abstractmethod
    def create_model(self) -> BaseLanguageModel:
        pass

class GeminiFactory(ModelFactory):
    def create_model(self) -> BaseLanguageModel:
        return GeminiLanguageModel(...)

class OpenAIFactory(ModelFactory):
    def create_model(self) -> BaseLanguageModel:
        return OpenAILanguageModel(...)
```

**Characteristic:** A hierarchy of factory classes.

### **C) Abstract Factory (multiple families)**

```python
class LLMProviderFactory(ABC):
    @abstractmethod
    def create_language_model(self) -> BaseLanguageModel:
        pass
    
    @abstractmethod
    def create_embedder(self) -> BaseEmbedder:
        pass
    
    @abstractmethod
    def create_tokenizer(self) -> BaseTokenizer:
        pass

class GeminiProviderFactory(LLMProviderFactory):
    # Creates the complete family of Gemini products
    ...
```

**Characteristic:** Creates families of related products.

---

## ✅ **6. VERDICT: WHICH VARIATION IS USED?**

### **LangExtract implements: Simple Factory + Registry Pattern**

```
Simple Factory:
    ✅ Centralized create_model() function
    ✅ No hierarchy of creators
    ✅ The function decides internally which class to create

Registry Pattern (additional):
    ✅ Router maps model_id → provider_class
    ✅ Allows dynamic registration of providers
    ✅ Extensible via plugins
```
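
A minimal sketch of how a Simple Factory and a Registry fit together, assuming a decorator-based registration API. The names `register`, `_PROVIDERS`, `BaseModel`, `GeminiSketch`, and `EchoSketch` are hypothetical and do not mirror LangExtract's actual registry code.

```python
from abc import ABC, abstractmethod

# Hypothetical registry: model_id prefix -> provider class.
_PROVIDERS: dict[str, type] = {}


def register(prefix: str):
    """Class decorator that registers a provider under a model_id prefix."""
    def decorator(cls: type) -> type:
        _PROVIDERS[prefix] = cls
        return cls
    return decorator


class BaseModel(ABC):
    def __init__(self, model_id: str):
        self.model_id = model_id

    @abstractmethod
    def infer(self, prompt: str) -> str: ...


@register("gemini")
class GeminiSketch(BaseModel):
    def infer(self, prompt: str) -> str:
        return f"[gemini:{self.model_id}] {prompt}"


def create_model(model_id: str) -> BaseModel:
    """Simple factory: look up the class in the registry and instantiate it."""
    for prefix, cls in _PROVIDERS.items():
        if model_id.startswith(prefix):
            return cls(model_id)
    raise ValueError(f"No provider registered for {model_id!r}")


# A "plugin" adds a provider later without touching create_model().
@register("echo")
class EchoSketch(BaseModel):
    def infer(self, prompt: str) -> str:
        return prompt


print(create_model("echo-1").infer("hello"))
```

The factory function never names a concrete class, so new providers only need to register themselves. That is the property the plugin mechanism relies on.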

### **Why is it not a pure Factory Method?**

1. **No Creator classes**: there is only the `create_model()` function
2. **No Creator polymorphism**: the Router resolves the class instead
3. **Simpler**: a better fit for this use case

### **Why is it not an Abstract Factory?**

1. **It does not create product families**: it only creates `BaseLanguageModel` instances
2. **There are no multiple related factory methods**

---

## 📈 **7. ADVANTAGES OF THE LANGEXTRACT IMPLEMENTATION**

| Advantage | Description |
|-----------|-------------|
| **Simplicity** | One function instead of a class hierarchy |
| **Flexibility** | `ModelConfig` accepts a `model_id` OR an explicit `provider` |
| **Extensibility** | The plugin system lets you add providers without modifying the factory code |
| **Smart defaults** | API keys are read from environment variables |
| **Separation of concerns** | The Router handles resolution; the Factory handles instantiation |
| **Type Safety** | Returns the abstract type `BaseLanguageModel` |
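
The "smart defaults" row can be illustrated in isolation. The sketch below follows the environment-variable names quoted earlier in this document, but the function name `with_env_defaults` and the injectable `environ` parameter are assumptions added so the behavior can be shown without touching the real process environment.

```python
import os
from typing import Any, Mapping, Optional

# Lookup table modeled on the env vars named in this document.
ENV_VARS_BY_PREFIX = {
    "gemini": ("GEMINI_API_KEY", "LANGEXTRACT_API_KEY"),
    "gpt": ("OPENAI_API_KEY", "LANGEXTRACT_API_KEY"),
}


def with_env_defaults(
    model_id: str,
    kwargs: Mapping[str, Any],
    environ: Optional[Mapping[str, str]] = None,
) -> dict[str, Any]:
    """Return a copy of kwargs with api_key filled from the environment."""
    environ = os.environ if environ is None else environ
    resolved = dict(kwargs)
    if "api_key" in resolved:
        return resolved  # an explicit key always wins
    for prefix, env_vars in ENV_VARS_BY_PREFIX.items():
        if prefix in model_id.lower():
            for name in env_vars:
                if environ.get(name):
                    resolved["api_key"] = environ[name]
                    return resolved
    return resolved


fake_env = {"LANGEXTRACT_API_KEY": "shared", "OPENAI_API_KEY": "oa"}
print(with_env_defaults("gpt-4o", {}, fake_env))            # provider-specific key first
print(with_env_defaults("gemini-2.5-flash", {}, fake_env))  # falls back to the shared key
```

Copying `kwargs` before filling in defaults keeps the caller's dictionary untouched, which matters when the same configuration object is reused for several models.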

---

## 🎓 **8. USAGE COMPARED**

### **Classic Factory**

```python
# The client needs to know which factory to use
factory = DogFactory()  # ← Client decides
animal = factory.create_animal()
print(animal.speak())
```

### **LangExtract Factory**

```python
# The client only passes configuration; the factory decides
config = ModelConfig(model_id="gemini-2.5-flash")
model = create_model(config)  # ← Factory decides internally
result = model.infer(["Hello"])
```

---

## 🔧 **9. COMPLETE COMPARISON CODE**

### **Classic GoF Factory - Full Version**

```python
from abc import ABC, abstractmethod

# ========== PRODUCTS ==========
class Vehicle(ABC):
    @abstractmethod
    def drive(self) -> str:
        pass

class Car(Vehicle):
    def drive(self) -> str:
        return "Driving a car 🚗"

class Truck(Vehicle):
    def drive(self) -> str:
        return "Driving a truck 🚚"

# ========== CREATORS ==========
class VehicleFactory(ABC):
    @abstractmethod
    def create_vehicle(self) -> Vehicle:
        """Factory Method"""
        pass
    
    def deliver(self) -> str:
        """Operation that uses the factory method"""
        vehicle = self.create_vehicle()
        return vehicle.drive()

class CarFactory(VehicleFactory):
    def create_vehicle(self) -> Vehicle:
        return Car()

class TruckFactory(VehicleFactory):
    def create_vehicle(self) -> Vehicle:
        return Truck()

# ========== USAGE ==========
def main():
    # The client chooses the factory
    factory: VehicleFactory = CarFactory()
    print(factory.deliver())  # Driving a car 🚗

    factory = TruckFactory()
    print(factory.deliver())  # Driving a truck 🚚

if __name__ == "__main__":
    main()
```

### **LangExtract Factory - Simplified Version**

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Type

# ========== PRODUCT INTERFACE ==========
class BaseLanguageModel(ABC):
    @abstractmethod
    def infer(self, prompt: str) -> str:
        pass

# ========== CONCRETE PRODUCTS ==========
class GeminiLanguageModel(BaseLanguageModel):
    def __init__(self, model_id: str, api_key: str):
        self.model_id = model_id
        self.api_key = api_key
    
    def infer(self, prompt: str) -> str:
        return f"Gemini ({self.model_id}): {prompt}"

class OpenAILanguageModel(BaseLanguageModel):
    def __init__(self, model_id: str, api_key: str):
        self.model_id = model_id
        self.api_key = api_key
    
    def infer(self, prompt: str) -> str:
        return f"OpenAI ({self.model_id}): {prompt}"

# ========== REGISTRY (Router) ==========
class ProviderRegistry:
    _providers: Dict[str, Type[BaseLanguageModel]] = {
        "gemini": GeminiLanguageModel,
        "openai": OpenAILanguageModel,
    }
    
    @classmethod
    def resolve(cls, model_id: str) -> Type[BaseLanguageModel]:
        if "gemini" in model_id.lower():
            return cls._providers["gemini"]
        elif "gpt" in model_id.lower():
            return cls._providers["openai"]
        raise ValueError(f"Unknown model: {model_id}")

# ========== CONFIGURATION ==========
@dataclass
class ModelConfig:
    model_id: str
    api_key: str

# ========== SIMPLE FACTORY ==========
def create_model(config: ModelConfig) -> BaseLanguageModel:
    """Factory function - Simple Factory Pattern"""
    
    # 1. Resolve which class to use
    provider_class = ProviderRegistry.resolve(config.model_id)

    # 2. Instantiate with parameters
    model = provider_class(
        model_id=config.model_id,
        api_key=config.api_key
    )
    
    return model

# ========== USAGE ==========
def main():
    # The client does not know which provider will be used
    config = ModelConfig(
        model_id="gemini-2.5-flash",
        api_key="fake-key"
    )

    model = create_model(config)  # ← Factory decides
    print(model.infer("Hello!"))  # Gemini (gemini-2.5-flash): Hello!

    # Switching providers is transparent
    config = ModelConfig(
        model_id="gpt-4o",
        api_key="fake-key"
    )
    
    model = create_model(config)
    print(model.infer("Hello!"))  # OpenAI (gpt-4o): Hello!

if __name__ == "__main__":
    main()
```




# 🔄 Iterator Pattern - LangExtract Analysis
## Comparison: chunking.py vs. the GoF Pattern

---

## 📚 **1. ITERATOR PATTERN (GoF) - SUMMARY**

### **Definition**
> Provides a way to access the elements of a collection sequentially without exposing its internal representation.

### **Classic Structure**

```python
from abc import ABC, abstractmethod

# Iterator Interface
class Iterator(ABC):
    @abstractmethod
    def __next__(self):
        pass
    
    @abstractmethod
    def has_next(self) -> bool:
        pass

# Aggregate Interface
class Iterable(ABC):
    @abstractmethod
    def __iter__(self) -> Iterator:
        pass

# Concrete Iterator
class ListIterator(Iterator):
    def __init__(self, collection: list):
        self._collection = collection
        self._position = 0
    
    def __next__(self):
        if self.has_next():
            item = self._collection[self._position]
            self._position += 1
            return item
        raise StopIteration
    
    def has_next(self) -> bool:
        return self._position < len(self._collection)

# Concrete Aggregate
class MyList(Iterable):
    def __init__(self):
        self._items = []
    
    def add(self, item):
        self._items.append(item)
    
    def __iter__(self) -> Iterator:
        return ListIterator(self._items)

# USAGE
my_list = MyList()
my_list.add("A")
my_list.add("B")

for item in my_list:  # Python calls __iter__() once, then __next__() per item
    print(item)
```

---

## 🔍 **2. ITERATOR IN LANGEXTRACT (chunking.py)**

### **A) ChunkIterator - Splits documents into chunks**

```python
# ============================================
# FILE: langextract/chunking.py
# ============================================

class ChunkIterator:
    """Iterates over chunks of tokenized text, respecting max_char_buffer"""
    
    def __init__(
        self,
        text: str | TokenizedText,
        max_char_buffer: int,
        document: Document | None = None,
    ):
        if isinstance(text, str):
            text = TokenizedText(text=text)
        
        self.tokenized_text = text
        self.max_char_buffer = max_char_buffer
        self.sentence_iter = SentenceIterator(self.tokenized_text)
        self.broken_sentence = False
        self.document = document if document else Document(text=text.text)
    
    def __iter__(self) -> Iterator[TextChunk]:
        """Iterator protocol: returns self"""
        return self

    def __next__(self) -> TextChunk:
        """Iterator protocol: returns the next chunk"""
        sentence = next(self.sentence_iter)  # May raise StopIteration

        # Initialize the chunk with the first token
        curr_chunk = create_token_interval(
            sentence.start_index,
            sentence.start_index + 1
        )
        
        # If a single token exceeds the buffer, return it alone
        if self._tokens_exceed_buffer(curr_chunk):
            self.sentence_iter = SentenceIterator(
                self.tokenized_text,
                curr_token_pos=sentence.start_index + 1
            )
            self.broken_sentence = True
            return TextChunk(token_interval=curr_chunk, document=self.document)
        
        # Add tokens until max_char_buffer is reached
        start_of_new_line = -1
        for token_index in range(curr_chunk.start_index, sentence.end_index):
            if self.tokenized_text.tokens[token_index].first_token_after_newline:
                start_of_new_line = token_index
            
            test_chunk = create_token_interval(
                curr_chunk.start_index,
                token_index + 1
            )
            
            if self._tokens_exceed_buffer(test_chunk):
                # Break at a newline if possible
                if start_of_new_line > curr_chunk.start_index:
                    curr_chunk = create_token_interval(
                        curr_chunk.start_index,
                        start_of_new_line
                    )
                
                self.sentence_iter = SentenceIterator(
                    self.tokenized_text,
                    curr_token_pos=curr_chunk.end_index
                )
                self.broken_sentence = True
                return TextChunk(token_interval=curr_chunk, document=self.document)
            else:
                curr_chunk = test_chunk
        
        # Try to append whole sentences while they fit
        if not self.broken_sentence:
            for sentence in self.sentence_iter:
                test_chunk = create_token_interval(
                    curr_chunk.start_index,
                    sentence.end_index
                )
                
                if self._tokens_exceed_buffer(test_chunk):
                    self.sentence_iter = SentenceIterator(
                        self.tokenized_text,
                        curr_token_pos=curr_chunk.end_index
                    )
                    return TextChunk(token_interval=curr_chunk, document=self.document)
                else:
                    curr_chunk = test_chunk
        
        self.broken_sentence = False
        return TextChunk(token_interval=curr_chunk, document=self.document)
    
    def _tokens_exceed_buffer(self, token_interval: TokenInterval) -> bool:
        """Checks whether the interval exceeds the maximum buffer"""
        char_interval = get_char_interval(self.tokenized_text, token_interval)
        return (char_interval.end_pos - char_interval.start_pos) > self.max_char_buffer
```

### **B) SentenceIterator - Iterates over sentences**

```python
class SentenceIterator:
    """Iterates through sentences in tokenized text"""
    
    def __init__(
        self,
        tokenized_text: TokenizedText,
        curr_token_pos: int = 0,
    ):
        self.tokenized_text = tokenized_text
        self.token_len = len(tokenized_text.tokens)
        
        if curr_token_pos < 0 or curr_token_pos > self.token_len:
            raise IndexError(f"Invalid token position: {curr_token_pos}")
        
        self.curr_token_pos = curr_token_pos
    
    def __iter__(self) -> Iterator[TokenInterval]:
        """Iterator protocol: returns self"""
        return self

    def __next__(self) -> TokenInterval:
        """Iterator protocol: returns the next sentence"""
        if self.curr_token_pos == self.token_len:
            raise StopIteration

        # Locate the range of the sentence containing the current token
        sentence_range = tokenizer.find_sentence_range(
            self.tokenized_text.text,
            self.tokenized_text.tokens,
            self.curr_token_pos,
        )
        
        # Adjust so the range starts at the current position
        sentence_range = create_token_interval(
            self.curr_token_pos,
            sentence_range.end_index
        )
        
        self.curr_token_pos = sentence_range.end_index
        return sentence_range
```

---

## 📊 **3. DIRECT COMPARISON**

| Aspect | Classic Iterator | LangExtract ChunkIterator |
|--------|------------------|---------------------------|
| **`__iter__()`** | ✅ Returns an Iterator | ✅ `return self` |
| **`__next__()`** | ✅ Returns the next item | ✅ Returns a `TextChunk` |
| **`StopIteration`** | ✅ Raised when exhausted | ✅ Delegated to `SentenceIterator` |
| **Internal state** | `_position` (index) | `sentence_iter`, `broken_sentence` |
| **Lazy evaluation** | ✅ YES | ✅ YES (processes on demand) |
| **Nested iteration** | ❌ No | ✅ YES (`ChunkIterator` uses `SentenceIterator`) |
| **Complex logic** | ❌ Simple (index++) | ✅ Complex (buffer, newlines, sentences) |
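
The "delegated `StopIteration`" row is worth seeing in a small example: an outer iterator that calls `next()` on an inner iterator ends automatically when the inner one is exhausted. The `PairIterator` class below is a hypothetical illustration and is unrelated to LangExtract's code.

```python
class PairIterator:
    """Groups items from an inner iterator into pairs."""

    def __init__(self, items):
        self._inner = iter(items)  # the nested iterator holds the position

    def __iter__(self):
        return self

    def __next__(self):
        # If the inner iterator is exhausted, next() raises StopIteration,
        # which propagates out of this method and ends the outer loop too.
        first = next(self._inner)
        # A default value avoids StopIteration for a trailing unpaired item.
        second = next(self._inner, None)
        return (first, second)


print(list(PairIterator("abcde")))
```

`ChunkIterator.__next__()` relies on the same mechanism: its first statement is `next(self.sentence_iter)`, so it needs no explicit end-of-data check of its own.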

---

## 🎯 **4. CONCEPTUAL MAPPING**

```
┌──────────────────────────────────────────────────────────┐
│  GoF PATTERN         →    LANGEXTRACT                    │
├──────────────────────────────────────────────────────────┤
│  Iterator            →    ChunkIterator                  │
│  __iter__()          →    return self                    │
│  __next__()          →    Returns TextChunk              │
│  has_next()          →    Implicit (StopIteration)       │
│  Aggregate           →    TokenizedText (collection)     │
│  Item                →    TextChunk (element)            │
│  State (_position)   →    sentence_iter + broken_sentence│
└──────────────────────────────────────────────────────────┘
```
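
Python's protocol has no `has_next()`, but one can be layered on top of any iterator by buffering a single item. The wrapper below is a hypothetical sketch (the name `PeekableIterator` is not part of LangExtract) that shows how the implicit `StopIteration` signal maps back to the explicit GoF method.

```python
_SENTINEL = object()


class PeekableIterator:
    """Wraps any iterator and adds a GoF-style has_next() on top of it."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffered = _SENTINEL

    def has_next(self) -> bool:
        if self._buffered is _SENTINEL:
            # next() with a default returns it instead of raising StopIteration.
            self._buffered = next(self._it, _SENTINEL)
        return self._buffered is not _SENTINEL

    def __iter__(self):
        return self

    def __next__(self):
        if not self.has_next():
            raise StopIteration
        item, self._buffered = self._buffered, _SENTINEL
        return item


it = PeekableIterator(["A", "B"])
while it.has_next():
    print(next(it))
```

The cost of `has_next()` is that the wrapper must consume one element ahead of the caller, which is one reason the Python protocol prefers the exception-based signal.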

---

## 💡 **5. MAIN DIFFERENCES**

### **A) Classic Iterator: Simple**
```python
class SimpleIterator:
    def __init__(self, data):
        self.data = data
        self.index = 0  # ← Simple state
    
    def __next__(self):
        if self.index >= len(self.data):
            raise StopIteration
        item = self.data[self.index]
        self.index += 1  # ← Simple increment
        return item
```

### **B) ChunkIterator: Complex**
```python
class ChunkIterator:
    def __init__(self, text, max_char_buffer):
        self.tokenized_text = text
        self.max_char_buffer = max_char_buffer
        self.sentence_iter = SentenceIterator(...)  # ← Nested iterator
        self.broken_sentence = False  # ← Additional state

    def __next__(self):
        # 1. Fetch a sentence
        sentence = next(self.sentence_iter)

        # 2. Complex logic: buffer, newlines, tokens
        # ... 50+ lines of logic

        # 3. Return the resulting chunk
        return TextChunk(...)
```

**Why?**
- It must respect the character limit (`max_char_buffer`)
- It prefers to break at newlines when possible
- It groups whole sentences when they fit in the buffer
- It handles oversized tokens that exceed the buffer on their own
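
Two of these rules (group whole sentences while they fit, and fall back to a hard split for an oversized unit) can be sketched as a short generator. This is a deliberately reduced model: it works on plain strings rather than token intervals, ignores the newline preference, and the name `pack_sentences` is hypothetical.

```python
from typing import Iterable, Iterator


def pack_sentences(sentences: Iterable[str], max_chars: int) -> Iterator[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    current = ""
    for sentence in sentences:
        if len(sentence) > max_chars:
            # Oversized sentence: flush what we have, then hard-split it.
            if current:
                yield current
                current = ""
            for start in range(0, len(sentence), max_chars):
                yield sentence[start:start + max_chars]
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate      # the sentence still fits in the buffer
        else:
            yield current            # buffer full: emit and start a new chunk
            current = sentence
    if current:
        yield current


chunks = list(pack_sentences(["One.", "Two two.", "Three three three."], 20))
print(chunks)
```

Because it is a generator, this version is also lazy: each chunk is computed only when the consumer asks for it, the same property `ChunkIterator` gets from implementing `__next__()`.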

---

## 🔧 **6. USAGE COMPARED**

### **Classic Iterator**

```python
# Create an iterable collection
numbers = MyList()
numbers.add(1)
numbers.add(2)
numbers.add(3)

# Iterate
for num in numbers:
    print(num)

# Output: 1, 2, 3
```

### **ChunkIterator in LangExtract**

```python
# Long text
text = """
This is a very long document that needs to be split into chunks
for processing by an LLM with limited context window.
Each chunk should be around 200 characters maximum.
"""

# Create the iterator
chunk_iter = ChunkIterator(
    text=text,
    max_char_buffer=200,
    document=Document(text=text)
)

# Iterate over the chunks
for chunk in chunk_iter:
    print(f"Chunk: {chunk.chunk_text[:50]}...")
    print(f"Size: {len(chunk.chunk_text)} chars\n")

# Output shape (illustrative; exact boundaries and sizes depend on the
# tokenizer, and this sample text is shorter than max_char_buffer):
# Chunk: <first 50 characters of the chunk>...
# Size: <length of the chunk> chars
```

---

## ✅ **7. VERIFICATION CHECKLIST**

| Iterator Pattern characteristic | ChunkIterator | SentenceIterator |
|---------------------------------|---------------|------------------|
| ✅ Implements `__iter__()` | ✅ YES | ✅ YES |
| ✅ Implements `__next__()` | ✅ YES | ✅ YES |
| ✅ Raises `StopIteration` | ✅ YES | ✅ YES |
| ✅ Keeps internal state | ✅ YES | ✅ YES |
| ✅ Lazy evaluation | ✅ YES | ✅ YES |
| ✅ Works with a `for` loop | ✅ YES | ✅ YES |
| ✅ Hides internal representation | ✅ YES | ✅ YES |
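
The first rows of this checklist can be verified mechanically. The sketch below uses a hypothetical `Countdown` class rather than the LangExtract classes. It relies on the standard `collections.abc` abstract base classes, which recognise any object that defines `__iter__` and `__next__`.

```python
from collections.abc import Iterable, Iterator


class Countdown:
    """Hypothetical iterator used only to demonstrate the protocol check."""

    def __init__(self, start: int):
        self._current = start  # internal state, hidden from the client

    def __iter__(self):
        return self

    def __next__(self) -> int:
        if self._current <= 0:
            raise StopIteration
        value = self._current
        self._current -= 1
        return value


countdown = Countdown(3)
is_iterator = isinstance(countdown, Iterator)  # has __iter__ and __next__
is_iterable = isinstance(countdown, Iterable)  # has __iter__
values = list(countdown)                       # consumes until StopIteration
print(is_iterator, is_iterable, values)
```

The same `isinstance` checks could be applied to a `ChunkIterator` instance to confirm that it satisfies the protocol.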

---

## 📈 **8. ADVANTAGES OF THE IMPLEMENTATION**

| Advantage | Description |
|-----------|-------------|
| **Memory efficient** | Produces chunks one at a time instead of materializing the full list of chunks |
| **Streaming** | Processes chunks on demand |
| **Flexible** | Complex chunking logic stays encapsulated |
| **Pythonic** | Uses the native protocol (`__iter__`, `__next__`) |
| **Composition** | `ChunkIterator` uses `SentenceIterator` internally |
| **Reusable** | Iterate again by creating a new iterator |
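
The "on demand" behaviour can be observed directly. In this sketch, `lazy_chunks` is a hypothetical generator (not part of LangExtract) that records every chunk it produces, and `itertools.islice` requests only the first two.

```python
from itertools import islice

produced = []  # records which chunks were actually computed


def lazy_chunks(text: str, size: int):
    """Yield fixed-size slices of text, one at a time."""
    for start in range(0, len(text), size):
        chunk = text[start:start + size]
        produced.append(chunk)
        yield chunk


# Ask for only the first two chunks of a ten-character text
first_two = list(islice(lazy_chunks("abcdefghij", 3), 2))
print(first_two, produced)
```

Only the two requested chunks were computed; the remaining slices of the text were never produced.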

---

## 🎓 **9. COMPLETE RUNNABLE CODE**

### **A) Classic Iterator (Simple)**

```python
from abc import ABC, abstractmethod

class Iterator(ABC):
    @abstractmethod
    def __next__(self):
        pass

class BookShelf:
    """Aggregate: a collection of books"""
    def __init__(self):
        self._books = []
    
    def add(self, book: str):
        self._books.append(book)
    
    def __iter__(self):
        """Return a new iterator over the books"""
        return BookIterator(self._books)

class BookIterator(Iterator):
    """Concrete iterator: walks through the books"""
    def __init__(self, books: list):
        self._books = books
        self._index = 0
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self._index >= len(self._books):
            raise StopIteration
        
        book = self._books[self._index]
        self._index += 1
        return book

# USAGE
shelf = BookShelf()
shelf.add("Python Design Patterns")
shelf.add("Clean Code")
shelf.add("Refactoring")

for book in shelf:
    print(f"📚 {book}")
```

### **B) ChunkIterator (Simplified)**

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    start: int
    end: int

class TextChunker:
    """Iterator that splits text into chunks of at most chunk_size characters"""
    
    def __init__(self, text: str, chunk_size: int):
        self.text = text
        self.chunk_size = chunk_size
        self.position = 0
    
    def __iter__(self):
        return self
    
    def __next__(self) -> Chunk:
        if self.position >= len(self.text):
            raise StopIteration
        
        start = self.position
        end = min(self.position + self.chunk_size, len(self.text))
        
        # Try to break at a space so words are not split
        if end < len(self.text) and self.text[end] != ' ':
            # Look for the last space before the end
            last_space = self.text.rfind(' ', start, end)
            if last_space > start:
                end = last_space + 1
        
        chunk_text = self.text[start:end]
        self.position = end
        
        return Chunk(text=chunk_text, start=start, end=end)

# USAGE
text = "The quick brown fox jumps over the lazy dog. Pack my box with five dozen liquor jugs."
chunker = TextChunker(text, chunk_size=30)

for i, chunk in enumerate(chunker, 1):
    print(f"Chunk {i}: '{chunk.text}' [{chunk.start}:{chunk.end}]")

# Output:
# Chunk 1: 'The quick brown fox jumps over' [0:30]
# Chunk 2: ' the lazy dog. Pack my box ' [30:57]
# Chunk 3: 'with five dozen liquor jugs.' [57:85]
```

---

## 📝 **10. CONCLUSION**

### **✅ Is it the Iterator Pattern? YES!**

**ChunkIterator and SentenceIterator fully implement the Iterator pattern:**

1. ✅ Python protocol (`__iter__`, `__next__`, `StopIteration`)
2. ✅ Sequential access without exposing the internal representation
3. ✅ Lazy evaluation (processes on demand)
4. ✅ Internal state managed by the iterator
5. ✅ Allows multiple independent iterations
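
Point 5 deserves a concrete illustration. In this sketch, `Shelf` is a hypothetical collection: each call to `iter()` on it yields a separate iterator with its own position, while a single iterator object can only be consumed once.

```python
class Shelf:
    """Iterable collection: every __iter__ call returns a new iterator."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


shelf = Shelf(["a", "b", "c"])
first_pass = iter(shelf)
second_pass = iter(shelf)

advanced = [next(first_pass), next(first_pass)]  # first_pass has moved on
untouched = next(second_pass)                    # second_pass starts at the beginning

single_use = iter(shelf)
once = list(single_use)   # consumes the iterator
again = list(single_use)  # nothing left to yield
print(advanced, untouched, once, again)
```

Classes that return `self` from `__iter__` behave like `single_use` here: a second pass needs a new instance.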

### **What sets LangExtract apart:**

- **Complex business logic** built into the iterator
- **Composed iterators** (`ChunkIterator` uses `SentenceIterator`)
- **Optimizations** (newlines, buffer, complete sentences)
- **Real-world application** to LLM processing

### **Classification:**
🏆 **Iterator Pattern (GoF) - Advanced Implementation with Domain Logic**




# 🎭 Facade Pattern - LangExtract Analysis
## Comparison: extraction.py vs the GoF Pattern

---

## 📚 **1. FACADE PATTERN (GoF) - SUMMARY**

### **Definition**
> Provides a unified interface to a set of interfaces in a subsystem. Facade defines a higher-level interface that makes the subsystem easier to use.

### **Classic Structure**

```python
# ============================================
# COMPLEX SUBSYSTEMS (multiple classes)
# ============================================

class SubsystemA:
    def operation_a1(self):
        return "SubsystemA: operation A1"
    
    def operation_a2(self):
        return "SubsystemA: operation A2"

class SubsystemB:
    def operation_b1(self):
        return "SubsystemB: operation B1"
    
    def operation_b2(self):
        return "SubsystemB: operation B2"

class SubsystemC:
    def operation_c1(self):
        return "SubsystemC: operation C1"

# ============================================
# FACADE - Simplified interface
# ============================================

class Facade:
    """Hides the complexity of the subsystems"""
    
    def __init__(self):
        self._subsystem_a = SubsystemA()
        self._subsystem_b = SubsystemB()
        self._subsystem_c = SubsystemC()
    
    def simple_operation(self):
        """A single simple call that coordinates the subsystems"""
        results = []
        results.append(self._subsystem_a.operation_a1())
        results.append(self._subsystem_b.operation_b1())
        results.append(self._subsystem_c.operation_c1())
        return "\n".join(results)

# ============================================
# USAGE - The client uses the facade, not the subsystems
# ============================================

# ❌ WITHOUT Facade (the client deals with the complexity):
subsystem_a = SubsystemA()
subsystem_b = SubsystemB()
subsystem_c = SubsystemC()
result1 = subsystem_a.operation_a1()
result2 = subsystem_b.operation_b1()
result3 = subsystem_c.operation_c1()

# ✅ WITH Facade (the client uses a simple interface):
facade = Facade()
result = facade.simple_operation()  # ← Everything in one call!
```

---

## 🔍 **2. FACADE IN LANGEXTRACT (extraction.py)**

### **A) The `extract()` Function - The Main Facade**

```python
# ============================================
# ARQUIVO: langextract/extraction.py
# ============================================

def extract(
    text_or_documents: Any,
    prompt_description: str | None = None,
    examples: Sequence[Any] | None = None,
    model_id: str = "gemini-2.5-flash",
    api_key: str | None = None,
    # ... many other optional parameters
) -> Any:
    """
    FACADE: simplified interface for structured extraction.
    
    Hides the complexity of:
    - Factory (model creation)
    - Prompting (prompt generation)
    - FormatHandler (JSON/YAML parsing)
    - Resolver (string → Extraction conversion)
    - Annotator (extraction pipeline)
    - Chunking (document splitting)
    - Alignment (mapping extractions → text positions)
    """
    
    # ============================================
    # 1. VALIDATION
    # ============================================
    if not examples:
        raise ValueError("Examples are required...")
    
    if prompt_validation_level is not pv.PromptValidationLevel.OFF:
        report = pv.validate_prompt_alignment(examples=examples, ...)
        pv.handle_alignment_report(report, ...)
    
    # ============================================
    # 2. SUBSYSTEM: DOWNLOAD (if needed)
    # ============================================
    if fetch_urls and isinstance(text_or_documents, str) and io.is_url(text_or_documents):
        text_or_documents = io.download_text_from_url(text_or_documents)
    
    # ============================================
    # 3. SUBSYSTEM: PROMPTING
    # ============================================
    prompt_template = prompting.PromptTemplateStructured(
        description=prompt_description
    )
    prompt_template.examples.extend(examples)
    
    # ============================================
    # 4. SUBSYSTEM: FACTORY (model creation)
    # ============================================
    if model:
        language_model = model
    elif config:
        language_model = factory.create_model(
            config=config,
            examples=prompt_template.examples if use_schema_constraints else None,
            use_schema_constraints=use_schema_constraints,
            fence_output=fence_output,
        )
    else:
        # Build a config and use the factory
        config = factory.ModelConfig(
            model_id=model_id,
            provider_kwargs=filtered_kwargs
        )
        language_model = factory.create_model(config=config, ...)
    
    # ============================================
    # 5. SUBSYSTEM: FORMAT HANDLER
    # ============================================
    format_handler, remaining_params = fh.FormatHandler.from_resolver_params(
        resolver_params=resolver_params,
        base_format_type=format_type,
        base_use_fences=language_model.requires_fence_output,
        ...
    )
    
    # ============================================
    # 6. SUBSYSTEM: RESOLVER
    # ============================================
    res = resolver.Resolver(**effective_params)
    
    # ============================================
    # 7. SUBSYSTEM: ANNOTATOR (main pipeline)
    # ============================================
    annotator = annotation.Annotator(
        language_model=language_model,
        prompt_template=prompt_template,
        format_handler=format_handler,
    )
    
    # ============================================
    # 8. EXECUTION (coordinates everything)
    # ============================================
    if isinstance(text_or_documents, str):
        return annotator.annotate_text(
            text=text_or_documents,
            resolver=res,
            max_char_buffer=max_char_buffer,
            batch_length=batch_length,
            additional_context=additional_context,
            extraction_passes=extraction_passes,
            show_progress=show_progress,
            **alignment_kwargs,
        )
    else:
        return annotator.annotate_documents(
            documents=text_or_documents,
            resolver=res,
            max_char_buffer=max_char_buffer,
            batch_length=batch_length,
            extraction_passes=extraction_passes,
            show_progress=show_progress,
            **alignment_kwargs,
        )
```
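
Step 4 of the listing above follows a precedence rule: an explicit `model` wins, then an explicit `config`, and otherwise a config is built from `model_id`. The sketch below isolates that rule. `select_model` is a hypothetical helper, and `ModelConfig` and `create_model` are simple stand-ins for the factory module rather than the real classes.

```python
from dataclasses import dataclass, field


@dataclass
class ModelConfig:
    """Stand-in for the factory's config object (assumption)."""
    model_id: str
    provider_kwargs: dict = field(default_factory=dict)


def create_model(config: ModelConfig) -> str:
    """Stand-in factory: returns a label instead of a real model."""
    return f"model<{config.model_id}>"


def select_model(model=None, config=None,
                 model_id="gemini-2.5-flash", **provider_kwargs):
    """Resolve the model: explicit model > explicit config > default model_id."""
    if model is not None:
        return model
    if config is None:
        config = ModelConfig(model_id=model_id, provider_kwargs=provider_kwargs)
    return create_model(config)


default_model = select_model()
from_config = select_model(config=ModelConfig("custom-model"))
explicit = select_model(model="prebuilt-model")
print(default_model, from_config, explicit)
```

Keeping this resolution inside the facade is what lets simple callers rely on defaults while advanced callers inject their own model or config.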

### **B) Complex Subsystems Hidden by the Facade**

```python
# ============================================
# SUBSYSTEM 1: Factory (factory.py)
# ============================================
class ModelConfig:
    model_id: str
    provider: str | None
    provider_kwargs: dict

def create_model(config: ModelConfig, ...) -> BaseLanguageModel:
    # Complex provider-resolution logic
    providers.load_builtins_once()
    providers.load_plugins_once()
    provider_class = router.resolve(config.model_id)
    kwargs = _kwargs_with_environment_defaults(...)
    return provider_class(**kwargs)

# ============================================
# SUBSYSTEM 2: Prompting (prompting.py)
# ============================================
class PromptTemplateStructured:
    description: str
    examples: list[ExampleData]

class QAPromptGenerator:
    def render(self, question: str, ...) -> str:
        # Builds a complex prompt with examples
        ...

# ============================================
# SUBSYSTEM 3: FormatHandler (format_handler.py)
# ============================================
class FormatHandler:
    def parse_output(self, text: str) -> Sequence[Mapping]:
        # Parses JSON/YAML with fences, wrappers, etc.
        ...

# ============================================
# SUBSYSTEM 4: Resolver (resolver.py)
# ============================================
class Resolver:
    def resolve(self, input_text: str) -> Sequence[Extraction]:
        # Converts string → Extraction objects
        ...
    
    def align(self, extractions, source_text, ...) -> Iterator[Extraction]:
        # Aligns extractions with the source text
        ...

# ============================================
# SUBSYSTEM 5: Annotator (annotation.py)
# ============================================
class Annotator:
    def annotate_text(self, text: str, ...) -> AnnotatedDocument:
        # Full pipeline: chunk → prompt → infer → resolve → align
        ...

# ============================================
# SUBSYSTEM 6: Chunking (chunking.py)
# ============================================
class ChunkIterator:
    def __next__(self) -> TextChunk:
        # Splits text into chunks that respect the buffer
        ...
```

---

## 📊 **3. DIRECT COMPARISON**

| Aspect | Classic Facade | LangExtract `extract()` |
|--------|----------------|-------------------------|
| **Single interface** | ✅ `simple_operation()` | ✅ `extract()` |
| **Hides subsystems** | ✅ SubsystemA, B, C | ✅ Factory, Prompting, Resolver, etc. |
| **Coordinates operations** | ✅ Calls multiple subsystems | ✅ Orchestrates 6+ subsystems |
| **Simplifies usage** | ✅ Client does not see the complexity | ✅ `lx.extract(text_or_documents="...", model_id="...")` |
| **Smart defaults** | ❌ None | ✅ YES (default `model_id`, etc.) |
| **Validation** | ❌ None | ✅ YES (examples, prompts) |
| **Flexibility** | ❌ Fixed interface | ✅ Many optional parameters |

---

## 🎯 **4. CONCEPTUAL MAPPING**

```
┌────────────────────────────────────────────────────────┐
│  GoF PATTERN         →    LANGEXTRACT                  │
├────────────────────────────────────────────────────────┤
│  Facade              →    extract()                    │
│  SubsystemA          →    factory (create_model)       │
│  SubsystemB          →    prompting (render prompts)   │
│  SubsystemC          →    resolver (parse output)      │
│  SubsystemD          →    annotator (pipeline)         │
│  SubsystemE          →    format_handler (JSON/YAML)   │
│  SubsystemF          →    chunking (split text)        │
│  simple_operation()  →    extract(text, model_id, ...) │
│  Client              →    End user                     │
└────────────────────────────────────────────────────────┘
```

### **Dependency Diagram**

```
┌─────────────────────────────────────────┐
│  CLIENT                                 │
│  (LangExtract user)                     │
└────────────────┬────────────────────────┘
                 │
                 │ lx.extract(text_or_documents="...", model_id="gemini")
                 │
                 ▼
┌─────────────────────────────────────────┐
│  FACADE: extract()                      │  ◄─── Simple Interface
│  (extraction.py)                        │
└────────┬────────────────────────────────┘
         │
         │ Coordinates subsystems ↓
         │
    ┌────┴───────────────────────────────┐
    │                                    │
    ▼                                    ▼
┌─────────────┐                  ┌─────────────┐
│  Factory    │                  │  Prompting  │
│  (create    │                  │  (generate  │
│   model)    │                  │   prompts)  │
└──────┬──────┘                  └──────┬──────┘
       │                                │
       ▼                                ▼
┌─────────────┐                  ┌─────────────┐
│  Router     │                  │  Format     │
│  (resolve   │                  │  Handler    │
│   provider) │                  │  (parse)    │
└─────────────┘                  └──────┬──────┘
                                        │
                                        ▼
                                 ┌─────────────┐
                                 │  Resolver   │
                                 │  (align)    │
                                 └──────┬──────┘
                                        │
                                        ▼
                                 ┌─────────────┐
                                 │  Annotator  │
                                 │  (pipeline) │
                                 └──────┬──────┘
                                        │
                                        ▼
                                 ┌─────────────┐
                                 │  Chunking   │
                                 │  (iterator) │
                                 └─────────────┘
```

---

## 💡 **5. COMPARISON: WITH vs WITHOUT FACADE**

### **❌ WITHOUT Facade (the client does everything manually)**

```python
# The client has to know ALL the subsystems
# (assumes `text` and `examples` are already defined)
from langextract import data, factory, prompting, resolver, annotation
from langextract import io, providers
from langextract.core import format_handler as fh

# 1. Download if necessary
if io.is_url(text):
    text = io.download_text_from_url(text)

# 2. Create the prompt template
prompt_template = prompting.PromptTemplateStructured(
    description="Extract entities"
)
prompt_template.examples.extend(examples)

# 3. Configure and create the model
providers.load_builtins_once()
providers.load_plugins_once()
config = factory.ModelConfig(
    model_id="gemini-2.5-flash",
    provider_kwargs={"api_key": "..."}
)
language_model = factory.create_model(config)

# 4. Create the format handler
format_handler = fh.FormatHandler(
    format_type=data.FormatType.JSON,
    use_fences=language_model.requires_fence_output,
    use_wrapper=True,
    wrapper_key="extractions"
)

# 5. Create the resolver
res = resolver.Resolver(format_handler=format_handler)

# 6. Create the annotator
annotator = annotation.Annotator(
    language_model=language_model,
    prompt_template=prompt_template,
    format_handler=format_handler
)

# 7. Run the annotation
result = annotator.annotate_text(
    text=text,
    resolver=res,
    max_char_buffer=1000,
    batch_length=10,
    extraction_passes=1,
    show_progress=True
)

# 😰 Far too much for the user to manage!
```

### **✅ WITH Facade (Simple Interface)**

```python
import langextract as lx

# A single call does everything!
result = lx.extract(
    text_or_documents="Your document text here...",
    prompt_description="Extract entities",
    examples=[...],
    model_id="gemini-2.5-flash"
)

# 😊 Simple and direct!
```

---

## 🔧 **6. COMPLETE PRACTICAL EXAMPLE**

### **A) Classic Facade (Home Theater System)**

```python
# ============================================
# COMPLEX SUBSYSTEMS
# ============================================

class Amplifier:
    def on(self): return "Amplifier on"
    def set_volume(self, level): return f"Volume: {level}"
    def off(self): return "Amplifier off"

class DVDPlayer:
    def on(self): return "DVD on"
    def play(self, movie): return f"Playing: {movie}"
    def stop(self): return "DVD stopped"
    def off(self): return "DVD off"

class Projector:
    def on(self): return "Projector on"
    def wide_screen_mode(self): return "Widescreen mode"
    def off(self): return "Projector off"

class Lights:
    def dim(self, level): return f"Lights: {level}%"

# ============================================
# FACADE - Simple Interface
# ============================================

class HomeTheaterFacade:
    """Simplifies operating the home theater"""
    
    def __init__(self):
        self.amp = Amplifier()
        self.dvd = DVDPlayer()
        self.projector = Projector()
        self.lights = Lights()
    
    def watch_movie(self, movie: str):
        """One call, several coordinated operations"""
        print("Getting ready to watch a movie...")
        print(self.lights.dim(10))
        print(self.projector.on())
        print(self.projector.wide_screen_mode())
        print(self.amp.on())
        print(self.amp.set_volume(5))
        print(self.dvd.on())
        print(self.dvd.play(movie))
    
    def end_movie(self):
        """Turns everything off at once"""
        print("Shutting down the movie...")
        print(self.dvd.stop())
        print(self.dvd.off())
        print(self.amp.off())
        print(self.projector.off())
        print(self.lights.dim(100))

# ============================================
# USAGE
# ============================================

theater = HomeTheaterFacade()
theater.watch_movie("Matrix")  # ← Simple!
# ... watch the movie ...
theater.end_movie()  # ← Simple!
```

### **B) LangExtract Facade (Simplified)**

```python
# ============================================
# SIMPLIFIED VERSION SHOWING THE SUBSYSTEMS
# (pseudocode: PromptTemplate, ModelFactory, FormatHandler,
#  Resolver and Annotator are placeholders, not defined here)
# ============================================

class SimpleLangExtract:
    """Simplified LangExtract facade"""
    
    def extract(
        self,
        text: str,
        prompt: str,
        examples: list,
        model_id: str = "gemini-2.5-flash"
    ):
        """Simple interface that hides 6 subsystems"""
        
        # SUBSYSTEM 1: Prompting
        prompt_template = self._create_prompt(prompt, examples)
        
        # SUBSYSTEM 2: Factory
        model = self._create_model(model_id)
        
        # SUBSYSTEM 3: Format Handler
        format_handler = self._create_format_handler(model)
        
        # SUBSYSTEM 4: Resolver
        resolver = self._create_resolver(format_handler)
        
        # SUBSYSTEM 5: Annotator
        annotator = self._create_annotator(model, prompt_template, format_handler)
        
        # SUBSYSTEM 6: Execution (uses Chunking internally)
        result = annotator.annotate_text(text, resolver)
        
        return result
    
    def _create_prompt(self, description, examples):
        # Complex prompting logic
        return PromptTemplate(description, examples)
    
    def _create_model(self, model_id):
        # Complex factory + router + providers logic
        return ModelFactory.create(model_id)
    
    def _create_format_handler(self, model):
        # JSON/YAML parsing logic
        return FormatHandler(model.format_type)
    
    def _create_resolver(self, format_handler):
        # String → object conversion logic
        return Resolver(format_handler)
    
    def _create_annotator(self, model, prompt, format_handler):
        # Extraction pipeline logic
        return Annotator(model, prompt, format_handler)

# ============================================
# USAGE
# ============================================

facade = SimpleLangExtract()

result = facade.extract(
    text="Romeo loves Juliet",
    prompt="Extract relationships",
    examples=[...],
    model_id="gemini-2.5-flash"
)  # ← A simple interface hides all the complexity!
```
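
The listing above depends on placeholder classes, so it cannot run on its own. A runnable miniature of the same idea is sketched below. Every class in it (`PromptTemplate`, `EchoModel`, `Resolver`, `Annotator`) is a toy stand-in invented for illustration, with no real LLM call and no relation to the actual LangExtract classes.

```python
from dataclasses import dataclass, field


@dataclass
class PromptTemplate:
    """Toy prompting subsystem: holds a description and examples."""
    description: str
    examples: list = field(default_factory=list)


class EchoModel:
    """Toy model: 'infers' by upper-casing the last line of the prompt."""
    def infer(self, prompt: str) -> str:
        return prompt.splitlines()[-1].upper()


class Resolver:
    """Toy resolver: turns the raw model output into a list of tokens."""
    def resolve(self, raw: str) -> list:
        return raw.split()


class Annotator:
    """Toy pipeline: build prompt -> infer -> resolve."""
    def __init__(self, model, template):
        self.model = model
        self.template = template

    def annotate_text(self, text, resolver):
        prompt = f"{self.template.description}\n{text}"
        return resolver.resolve(self.model.infer(prompt))


def extract(text, prompt, examples=None):
    """Facade: one call wires the toy subsystems together."""
    template = PromptTemplate(prompt, list(examples or []))
    annotator = Annotator(EchoModel(), template)
    return annotator.annotate_text(text, Resolver())


result = extract("Romeo loves Juliet", "Extract words")
print(result)
```

The caller touches only `extract()`; swapping `EchoModel` for another stub would not change the call site, which is the point of the pattern.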

---

## ✅ **7. VERIFICATION CHECKLIST**

| Facade Pattern characteristic | `extract()` |
|-------------------------------|-------------|
| ✅ Unified high-level interface | ✅ YES |
| ✅ Hides subsystem complexity | ✅ YES (6+ subsystems) |
| ✅ Coordinates multiple operations | ✅ YES (factory → prompting → resolver → annotator) |
| ✅ Simplifies usage for the client | ✅ YES (one call vs about 50 lines) |
| ✅ Subsystems remain accessible | ✅ YES (advanced users can call them directly) |
| ✅ Reduces coupling | ✅ YES (the client does not depend on the subsystems) |
| ✅ Eases maintenance | ✅ YES (internal changes do not affect the API) |

---

## 📈 **8. ADVANTAGES OF THE IMPLEMENTATION**

| Advantage | Description |
|-----------|-------------|
| **Simplicity** | A single-call API for a complex operation |
| **Smart defaults** | `model_id="gemini-2.5-flash"`, optional parameters |
| **Flexibility** | Advanced users can pass `model`, `config`, etc. |
| **Validation** | Validates inputs before processing |
| **Clear documentation** | The docstring explains every parameter |
| **Backward compatibility** | Warnings for deprecated parameters |
| **Extensibility** | Subsystems can evolve independently |
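
The "Backward compatibility" row can be made concrete with Python's standard `warnings` module. This is a sketch of the general technique only: the `legacy_model_name` parameter is hypothetical and is not taken from LangExtract's real signature.

```python
import warnings


def extract(text, *, model_id="gemini-2.5-flash", legacy_model_name=None):
    """Facade stub that still accepts a deprecated keyword (hypothetical)."""
    if legacy_model_name is not None:
        warnings.warn(
            "'legacy_model_name' is deprecated; use 'model_id' instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller
        )
        model_id = legacy_model_name
    return {"text": text, "model_id": model_id}


# Capture warnings so the behaviour can be inspected
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = extract("Romeo loves Juliet", legacy_model_name="old-model")

print(result["model_id"], [str(w.message) for w in caught])
```

Old call sites keep working while the warning steers users towards the current parameter.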



# 🏗️ Builder Pattern - LangExtract Analysis
## Comparison: prompting.py vs the GoF Pattern

---

## 📚 **1. BUILDER PATTERN (GoF) - SUMMARY**

### **Definition**
> Separates the construction of a complex object from its representation, so that the same construction process can create different representations.

### **Classic Structure**

```python
from abc import ABC, abstractmethod

# ============================================
# PRODUCT - The complex object being built
# ============================================

class Pizza:
    def __init__(self):
        self.dough = None
        self.sauce = None
        self.topping = None
    
    def __str__(self):
        return f"Pizza: {self.dough}, {self.sauce}, {self.topping}"

# ============================================
# BUILDER - Abstract interface
# ============================================

class PizzaBuilder(ABC):
    def __init__(self):
        self.pizza = Pizza()
    
    @abstractmethod
    def build_dough(self):
        pass
    
    @abstractmethod
    def build_sauce(self):
        pass
    
    @abstractmethod
    def build_topping(self):
        pass
    
    def get_pizza(self) -> Pizza:
        return self.pizza

# ============================================
# CONCRETE BUILDERS
# ============================================

class MargheritaBuilder(PizzaBuilder):
    def build_dough(self):
        self.pizza.dough = "Thin crust"
        return self
    
    def build_sauce(self):
        self.pizza.sauce = "Tomato sauce"
        return self
    
    def build_topping(self):
        self.pizza.topping = "Mozzarella"
        return self

class PepperoniBuilder(PizzaBuilder):
    def build_dough(self):
        self.pizza.dough = "Thick crust"
        return self
    
    def build_sauce(self):
        self.pizza.sauce = "Spicy sauce"
        return self
    
    def build_topping(self):
        self.pizza.topping = "Pepperoni"
        return self

# ============================================
# DIRECTOR (optional) - Controls construction
# ============================================

class PizzaDirector:
    def __init__(self, builder: PizzaBuilder):
        self._builder = builder
    
    def make_pizza(self) -> Pizza:
        """Controls the order of construction"""
        return (self._builder
                .build_dough()
                .build_sauce()
                .build_topping()
                .get_pizza())

# ============================================
# USAGE
# ============================================

# With a Director
builder = MargheritaBuilder()
director = PizzaDirector(builder)
pizza = director.make_pizza()
print(pizza)  # Pizza: Thin crust, Tomato sauce, Mozzarella

# Without a Director (manual construction)
builder = PepperoniBuilder()
pizza = builder.build_dough().build_sauce().build_topping().get_pizza()
print(pizza)  # Pizza: Thick crust, Spicy sauce, Pepperoni
```

---

## 🔍 **2. BUILDER IN LANGEXTRACT (prompting.py)**

### **A) QAPromptGenerator - The Builder**

```python
# ============================================
# FILE: langextract/prompting.py
# ============================================

@dataclasses.dataclass
class QAPromptGenerator:
    """
    BUILDER: builds complex prompts incrementally.
    
    The final product is a formatted prompt string with:
    - Description/instructions
    - Additional context (optional)
    - Formatted few-shot examples
    - Question
    - Q:/A: prefixes
    """
    
    # Prompt components (configuration)
    template: PromptTemplateStructured
    format_handler: FormatHandler
    examples_heading: str = "Examples"
    question_prefix: str = "Q: "
    answer_prefix: str = "A: "
    
    def format_example_as_text(self, example: ExampleData) -> str:
        """
        STEP 1: Formats a single example.
        
        Builds a string with:
        - Question (Q: example text)
        - Answer (A: formatted extractions)
        """
        question = example.text
        answer = self.format_handler.format_extraction_example(example.extractions)
        
        return "\n".join([
            f"{self.question_prefix}{question}",
            f"{self.answer_prefix}{answer}\n",
        ])
    
    def render(self, question: str, additional_context: str | None = None) -> str:
        """
        MAIN METHOD: builds the complete prompt step by step.
        
        Builder Pattern applied:
        1. Adds the description
        2. Adds the context (if any)
        3. Adds the examples heading
        4. Adds each formatted example
        5. Adds the question
        6. Adds the answer prefix
        
        Returns: the complete prompt string
        """
        prompt_lines: list[str] = []
        
        # STEP 1: Description/instructions
        prompt_lines.append(f"{self.template.description}\n")
        
        # STEP 2: Additional context (optional)
        if additional_context:
            prompt_lines.append(f"{additional_context}\n")
        
        # STEP 3: Examples section
        if self.template.examples:
            prompt_lines.append(self.examples_heading)
            
            # STEP 4: Each formatted example
            for ex in self.template.examples:
                prompt_lines.append(self.format_example_as_text(ex))
        
        # STEP 5: Current question
        prompt_lines.append(f"{self.question_prefix}{question}")
        
        # STEP 6: Answer prefix
        prompt_lines.append(self.answer_prefix)
        
        # FINAL PRODUCT: the complete string
        return "\n".join(prompt_lines)
    
    def __str__(self) -> str:
        """Renders with an empty question (for inspection)"""
        return self.render("")
        return self.render("")
```
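
The six steps of `render()` can be exercised without the rest of the library. The sketch below is a self-contained miniature under stated assumptions: `Example` and `MiniPromptGenerator` are hypothetical names, and the answer is a plain string rather than output from a format handler.

```python
from dataclasses import dataclass, field


@dataclass
class Example:
    """Hypothetical few-shot example: input text and expected answer."""
    text: str
    answer: str


@dataclass
class MiniPromptGenerator:
    description: str
    examples: list = field(default_factory=list)
    examples_heading: str = "Examples"
    question_prefix: str = "Q: "
    answer_prefix: str = "A: "

    def render(self, question, additional_context=None):
        lines = [f"{self.description}\n"]                 # step 1
        if additional_context:
            lines.append(f"{additional_context}\n")       # step 2
        if self.examples:
            lines.append(self.examples_heading)           # step 3
            for ex in self.examples:                      # step 4
                lines.append(f"{self.question_prefix}{ex.text}")
                lines.append(f"{self.answer_prefix}{ex.answer}\n")
        lines.append(f"{self.question_prefix}{question}")  # step 5
        lines.append(self.answer_prefix)                   # step 6
        return "\n".join(lines)


generator = MiniPromptGenerator("Extract names.", [Example("Romeo wept.", "Romeo")])
prompt = generator.render("Juliet smiled.")
print(prompt)
```

The prompt ends with the bare answer prefix, which leaves the completion to the model.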

### **B) PromptTemplateStructured - Builder Configuration**

```python
@dataclasses.dataclass
class PromptTemplateStructured:
    """
    Input data for the Builder.
    
    Stores:
    - description: instructions for the LLM
    - examples: list of few-shot examples
    """
    description: str
    examples: list[ExampleData] = dataclasses.field(default_factory=list)
```

### **C) Final Product - Prompt String**

````python
# EXAMPLE OF A BUILT PROMPT:
"""
Extract characters, emotions, and relationships in order of appearance.
Use exact text for extractions. Do not paraphrase or overlap entities.

Examples
Q: ROMEO. But soft! What light through yonder window breaks?
A: ```json
{
  "extractions": [
    {
      "character": "ROMEO",
      "character_attributes": {"emotional_state": "wonder"}
    },
    {
      "emotion": "But soft!",
      "emotion_attributes": {"feeling": "gentle awe"}
    }
  ]
}
```

Q: Lady Juliet gazed longingly at the stars, her heart aching for Romeo
A:
"""
# ← The builder assembled this incrementally!
````

---

## 📊 **3. DIRECT COMPARISON**

| Aspect | Classic Builder | QAPromptGenerator |
|---------|------------------|-------------------|
| **Complex product** | `Pizza` (object) | `str` (complete prompt) |
| **Incremental construction** | ✅ `build_*()` methods | ✅ `prompt_lines.append()` |
| **Separate steps** | ✅ dough → sauce → topping | ✅ description → context → examples → question |
| **Order matters** | ✅ YES | ✅ YES (description first, answer prefix last) |
| **Fluent interface** | ✅ `return self` | ❌ No (uses an internal list) |
| **Configurable** | ✅ Different builders | ✅ Parameters (prefixes, headings) |
| **Final product** | `get_pizza()` | `render()` returns a string |
| **Director** | ✅ `PizzaDirector` | ❌ None (`render()` acts as the "director") |

---

## 🎯 **4. CONCEPTUAL MAPPING**

```
┌─────────────────────────────────────────────────────────┐
│  GoF PATTERN         →    LANGEXTRACT                   │
├─────────────────────────────────────────────────────────┤
│  Builder             →    QAPromptGenerator             │
│  Product             →    str (complete prompt)         │
│  build_dough()       →    Adds description              │
│  build_sauce()       →    Adds context                  │
│  build_topping()     →    Adds examples                 │
│  build_extra()       →    Adds question + prefix        │
│  get_product()       →    render() returns a string     │
│  Director            →    None (render() coordinates)   │
│  Configuration       →    PromptTemplateStructured      │
└─────────────────────────────────────────────────────────┘
```

### **Construction Flow**

```
┌─────────────────────┐
│  Client             │
│  (Annotator)        │
└──────────┬──────────┘
           │
           │ prompt_generator.render(question)
           │
           ▼
┌─────────────────────────────────────────┐
│  QAPromptGenerator (Builder)            │
│  render() coordinates construction:     │
└──────────┬──────────────────────────────┘
           │
           ├─ 1. prompt_lines.append(description)
           │
           ├─ 2. if context: prompt_lines.append(context)
           │
           ├─ 3. if examples: prompt_lines.append(examples_heading)
           │
           ├─ 4. for ex in examples:
           │      prompt_lines.append(format_example_as_text(ex))
           │
           ├─ 5. prompt_lines.append(question_prefix + question)
           │
           └─ 6. prompt_lines.append(answer_prefix)
           │
           ▼
    ┌──────────────────┐
    │  "\n".join(...)  │  ◄─── Final Product (String)
    └──────────────────┘
```

---

## 💡 **5. MAIN DIFFERENCES**

### **A) Classic Builder: Multiple Methods**

```python
from dataclasses import dataclass


@dataclass
class Pizza:
    """Product assembled by the builder."""
    dough: str = ""
    sauce: str = ""
    topping: str = ""


class PizzaBuilder:
    def __init__(self):
        self.pizza = Pizza()

    def build_dough(self):
        self.pizza.dough = "..."  # ← Method 1
        return self

    def build_sauce(self):
        self.pizza.sauce = "..."  # ← Method 2
        return self

    def build_topping(self):
        self.pizza.topping = "..."  # ← Method 3
        return self

    def get_pizza(self) -> Pizza:
        return self.pizza  # ← Returns the product

# Usage: explicit construction
pizza = (PizzaBuilder()
         .build_dough()
         .build_sauce()
         .build_topping()
         .get_pizza())
```

### **B) QAPromptGenerator: A Single Construction Method**

```python
class QAPromptGenerator:
    def render(self, question: str, additional_context: str | None = None) -> str:
        """Single method that coordinates the ENTIRE construction."""
        prompt_lines = []

        # STEP 1
        prompt_lines.append(self.template.description)

        # STEP 2 (conditional)
        if additional_context:
            prompt_lines.append(additional_context)

        # STEP 3
        if self.template.examples:
            prompt_lines.append(self.examples_heading)

        # STEP 4 (loop; a no-op when there are no examples)
        for ex in self.template.examples:
            prompt_lines.append(self.format_example_as_text(ex))

        # STEP 5
        prompt_lines.append(f"{self.question_prefix}{question}")

        # STEP 6
        prompt_lines.append(self.answer_prefix)

        # FINAL PRODUCT
        return "\n".join(prompt_lines)

# Usage: a single call
prompt = generator.render(question="Extract entities from...")
```

**Why the difference?**
- Classic builder: maximum flexibility (the client controls which steps run and in what order)
- QAPromptGenerator: a fixed order makes sense (the description always comes first)
- Prompts have a predictable structure, so a fluent interface is unnecessary
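
To make the trade-off concrete, here is a minimal sketch of what a fluent variant could look like. `FluentPromptBuilder` and its method names are hypothetical and not part of LangExtract; the sketch only illustrates that once the client controls the order, nothing prevents it from assembling a prompt whose sections are out of place.

```python
class FluentPromptBuilder:
    """Hypothetical fluent builder; NOT part of LangExtract."""

    def __init__(self) -> None:
        self._lines: list[str] = []

    def description(self, text: str) -> "FluentPromptBuilder":
        self._lines.append(text)
        return self

    def question(self, text: str) -> "FluentPromptBuilder":
        self._lines.append(f"Q: {text}")
        return self

    def answer_prefix(self) -> "FluentPromptBuilder":
        self._lines.append("A: ")
        return self

    def build(self) -> str:
        return "\n".join(self._lines)


# The client picks the order, so both call chains are accepted.
well_formed = (FluentPromptBuilder()
               .description("Extract entities.")
               .question("Romeo sees Juliet")
               .answer_prefix()
               .build())

malformed = (FluentPromptBuilder()
             .answer_prefix()
             .question("Romeo sees Juliet")
             .description("Extract entities.")
             .build())
```

A single `render()` method rules out the second call chain by construction, which is the design choice the list above describes.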

---

## 🔧 **6. COMPLETE PRACTICAL EXAMPLE**

### **A) Classic Builder (Building an Email)**

```python
# ============================================
# PRODUCT
# ============================================

class Email:
    def __init__(self):
        self.to = None
        self.subject = None
        self.body = None
        self.attachments = []
    
    def __str__(self):
        return f"To: {self.to}\nSubject: {self.subject}\n\n{self.body}"

# ============================================
# BUILDER
# ============================================

class EmailBuilder:
    def __init__(self):
        self.email = Email()
    
    def to(self, recipient: str):
        self.email.to = recipient
        return self
    
    def subject(self, subject: str):
        self.email.subject = subject
        return self
    
    def body(self, body: str):
        self.email.body = body
        return self
    
    def attach(self, filename: str):
        self.email.attachments.append(filename)
        return self
    
    def build(self) -> Email:
        return self.email

# ============================================
# USAGE - Fluent Interface
# ============================================

email = (EmailBuilder()
         .to("user@example.com")
         .subject("Hello")
         .body("How are you?")
         .attach("report.pdf")
         .build())

print(email)
```

### **B) QAPromptGenerator (Simplified from LangExtract)**

```python
# ============================================
# CONFIGURATION
# ============================================

from dataclasses import dataclass

@dataclass
class ExampleData:
    text: str
    extractions: list

@dataclass
class PromptTemplateStructured:
    description: str
    examples: list[ExampleData]

# ============================================
# BUILDER
# ============================================

class QAPromptGenerator:
    def __init__(
        self,
        template: PromptTemplateStructured,
        examples_heading: str = "Examples",
        question_prefix: str = "Q: ",
        answer_prefix: str = "A: "
    ):
        self.template = template
        self.examples_heading = examples_heading
        self.question_prefix = question_prefix
        self.answer_prefix = answer_prefix
    
    def format_example_as_text(self, example: ExampleData) -> str:
        """Formats one example as a Q:/A: pair."""
        return (
            f"{self.question_prefix}{example.text}\n"
            f"{self.answer_prefix}{example.extractions}\n"
        )
    
    def render(self, question: str, additional_context: str | None = None) -> str:
        """Builds the complete prompt."""
        lines = []

        # 1. Description
        lines.append(f"{self.template.description}\n")

        # 2. Context (optional)
        if additional_context:
            lines.append(f"{additional_context}\n")

        # 3. Examples
        if self.template.examples:
            lines.append(self.examples_heading)
            for ex in self.template.examples:
                lines.append(self.format_example_as_text(ex))

        # 4. Question
        lines.append(f"{self.question_prefix}{question}")

        # 5. Answer prefix
        lines.append(self.answer_prefix)
        
        return "\n".join(lines)

# ============================================
# USAGE
# ============================================

template = PromptTemplateStructured(
    description="Extract entities from text:",
    examples=[
        ExampleData(
            text="Alice loves Bob",
            extractions=["Alice", "Bob", "loves"]
        )
    ]
)

generator = QAPromptGenerator(template)
prompt = generator.render(
    question="Romeo sees Juliet",
    additional_context="Focus on names"
)

print(prompt)
# Output:
# Extract entities from text:
#
# Focus on names
#
# Examples
# Q: Alice loves Bob
# A: ['Alice', 'Bob', 'loves']
#
# Q: Romeo sees Juliet
# A:
```

---

## ✅ **7. VERIFICATION CHECKLIST**

| Builder Pattern Characteristic | QAPromptGenerator |
|------------------------------------|-------------------|
| ✅ Builds a complex object | ✅ YES (multi-section prompt) |
| ✅ Incremental construction | ✅ YES (appends line by line) |
| ✅ Separates construction from representation | ✅ YES (`render()` vs. template data) |
| ✅ Well-defined steps | ✅ YES (6 clear steps) |
| ✅ Controlled construction order | ✅ YES (description → examples → question) |
| ✅ Complex final product | ✅ YES (formatted string) |
| ❌ Fluent interface (`return self`) | ❌ NO (uses an internal list) |
| ❌ Separate Director class | ❌ NO (`render()` acts as the director) |

---

## 📈 **8. ADVANTAGES OF THE IMPLEMENTATION**

| Advantage | Description |
|----------|-----------|
| **Guaranteed order** | Description always first, answer prefix always last |
| **Configurable** | Customizable prefixes and headings |
| **Reusable** | The same generator serves multiple questions |
| **Separation of concerns** | Template (data) vs. Generator (construction) |
| **Testable** | `render()` depends only on the template, the prefixes, and its arguments, so the resulting string can be asserted on section by section |
| **Readability** | Clear, sequential code |
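
The reusability, ordering, and configurability rows can be checked with a few assertions. The sketch below uses a trimmed-down stand-in (`Template` and `MiniPromptGenerator` are hypothetical names, not LangExtract classes) so that it runs without the library installed.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """Hypothetical stand-in for PromptTemplateStructured."""
    description: str
    examples: list[str] = field(default_factory=list)


class MiniPromptGenerator:
    """Hypothetical, trimmed-down generator with a fixed step order."""

    def __init__(self, template: Template, question_prefix: str = "Q: ",
                 answer_prefix: str = "A: ") -> None:
        self.template = template
        self.question_prefix = question_prefix
        self.answer_prefix = answer_prefix

    def render(self, question: str) -> str:
        lines = [self.template.description, *self.template.examples]
        lines.append(f"{self.question_prefix}{question}")
        lines.append(self.answer_prefix)
        return "\n".join(lines)


generator = MiniPromptGenerator(Template("Extract entities."))

# Reusable: one generator, many questions.
prompts = [generator.render(q) for q in ("Alice loves Bob", "Romeo sees Juliet")]

# Guaranteed order: description first, answer prefix last, for every prompt.
for p in prompts:
    assert p.startswith("Extract entities.")
    assert p.endswith("A: ")

# Configurable: prefixes change without touching render().
custom = MiniPromptGenerator(Template("Extract entities."),
                             question_prefix="Input: ", answer_prefix="Output: ")
custom_prompt = custom.render("Alice loves Bob")
```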

---

## 🎯 **9. PATTERN VARIATIONS**

### **A) Classic Builder (GoF)**
```python
# Multiple build_*() methods
# Fluent interface (return self)
# Explicit get_product()

builder = EmailBuilder()
email = (builder
         .to("...")
         .subject("...")
         .body("...")
         .build())  # ← get_product()
```

### **B) Telescoping Constructor (Anti-pattern)**
```python
# ❌ ANTI-PATTERN - Too many parameters
prompt = create_prompt(
    description="...",
    context="...",
    examples=[...],
    examples_heading="...",
    question_prefix="...",
    answer_prefix="...",
    question="...",
)
# Hard to read and maintain!
```

### **C) QAPromptGenerator (Simplified Builder)**
```python
# ✅ Internal builder
# Single render() method
# Fixed order of steps

generator = QAPromptGenerator(template)
prompt = generator.render(question="...")  # ← Everything in one call
```

**Why is QAPromptGenerator a Builder?**
- It builds a complex product (the prompt) step by step
- It separates configuration (template) from construction (`render()`)
- It encapsulates the formatting logic
- Its steps follow a well-defined order





# 🎯 Strategy Pattern - LangExtract Analysis
## Comparison: base_model.py + Implementations (gemini.py, openai.py, ollama.py) vs. the GoF Pattern

---

## 📚 **1. STRATEGY PATTERN (GoF) - SUMMARY**

### **Definition**
> Define a family of algorithms, encapsulate each one, and make them interchangeable. Strategy lets the algorithm vary independently from clients that use it.

### **Classic Structure**

```python
from abc import ABC, abstractmethod

# ============================================
# STRATEGY - Abstract interface
# ============================================

class CompressionStrategy(ABC):
    """Common interface for all compression strategies."""

    @abstractmethod
    def compress(self, data: str) -> bytes:
        """Compresses data using a specific algorithm."""
        pass

# ============================================
# CONCRETE STRATEGIES - Implementations
# ============================================

class ZipCompression(CompressionStrategy):
    """Concrete strategy: ZIP compression (placeholder implementation)."""

    def compress(self, data: str) -> bytes:
        # ZIP algorithm
        return f"ZIP({data})".encode()

class RarCompression(CompressionStrategy):
    """Concrete strategy: RAR compression (placeholder implementation)."""

    def compress(self, data: str) -> bytes:
        # RAR algorithm
        return f"RAR({data})".encode()

class GzipCompression(CompressionStrategy):
    """Concrete strategy: GZIP compression (placeholder implementation)."""

    def compress(self, data: str) -> bytes:
        # GZIP algorithm
        return f"GZIP({data})".encode()

# ============================================
# CONTEXT - Class that uses the Strategy
# ============================================

class FileCompressor:
    """Context: uses a compression strategy."""

    def __init__(self, strategy: CompressionStrategy):
        self._strategy = strategy

    def set_strategy(self, strategy: CompressionStrategy):
        """Swaps the strategy at runtime."""
        self._strategy = strategy

    def compress_file(self, data: str) -> bytes:
        """Delegates compression to the current strategy."""
        print(f"Using strategy: {self._strategy.__class__.__name__}")
        return self._strategy.compress(data)

# ============================================
# USAGE - The client chooses the strategy
# ============================================

# Client chooses ZIP
compressor = FileCompressor(ZipCompression())
result1 = compressor.compress_file("Hello World")
print(result1)  # b'ZIP(Hello World)'

# Switch to RAR at runtime
compressor.set_strategy(RarCompression())
result2 = compressor.compress_file("Hello World")
print(result2)  # b'RAR(Hello World)'

# Switch to GZIP
compressor.set_strategy(GzipCompression())
result3 = compressor.compress_file("Hello World")
print(result3)  # b'GZIP(Hello World)'
```

---

## 🔍 **2. STRATEGY IN LANGEXTRACT**

### **A) BaseLanguageModel - The Strategy Interface**

```python
# ============================================
# FILE: langextract/core/base_model.py
# ============================================

class BaseLanguageModel(abc.ABC):
    """
    STRATEGY INTERFACE: Abstract interface for LLMs.

    Defines the common contract for all strategies (providers):
    - infer() → Main method (interchangeable algorithm)
    - parse_output() → Response parsing
    - merge_kwargs() → Configuration
    """

    def __init__(self, constraint: types.Constraint | None = None, **kwargs: Any):
        """Initializes with the base configuration."""
        self._constraint = constraint or types.Constraint()
        self._schema: schema.BaseSchema | None = None
        self._extra_kwargs: dict[str, Any] = kwargs.copy()

    @abc.abstractmethod
    def infer(
        self,
        batch_prompts: Sequence[str],
        **kwargs
    ) -> Iterator[Sequence[types.ScoredOutput]]:
        """
        INTERCHANGEABLE ALGORITHM: LLM inference.

        Each provider implements ITS OWN algorithm:
        - Gemini: Uses the google-genai SDK
        - OpenAI: Uses the openai SDK
        - Ollama: Uses HTTP requests

        Args:
            batch_prompts: List of prompts
            **kwargs: Provider-specific parameters (temperature, etc.)

        Returns:
            Iterator of scored outputs
        """

    def parse_output(self, output: str) -> Any:
        """Shared (non-abstract) method; subclasses may override it."""
        format_type = getattr(self, 'format_type', types.FormatType.JSON)

        try:
            if format_type == types.FormatType.JSON:
                return json.loads(output)
            else:
                return yaml.safe_load(output)
        except Exception as e:
            raise ValueError(f'Failed to parse output: {str(e)}') from e

    def merge_kwargs(self, runtime_kwargs: Mapping[str, Any] | None = None) -> dict:
        """Shared method: merges constructor kwargs with runtime kwargs."""
        base = getattr(self, '_extra_kwargs', {}) or {}
        incoming = dict(runtime_kwargs or {})
        return {**base, **incoming}
```
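
The precedence rule in `merge_kwargs()` follows from how dictionary unpacking works: in `{**base, **incoming}` the later mapping wins on key collisions, so runtime values override constructor values. The standalone function below mirrors the excerpt's logic for illustration only; the parameter values are made up.

```python
from typing import Any, Mapping, Optional


def merge_kwargs(base: Mapping[str, Any],
                 runtime_kwargs: Optional[Mapping[str, Any]] = None) -> dict:
    """Standalone copy of the merge shown above: runtime values win."""
    incoming = dict(runtime_kwargs or {})
    return {**base, **incoming}


# Illustrative values, standing in for kwargs passed to a provider's constructor.
constructor_kwargs = {"temperature": 0.0, "max_workers": 10}

# A runtime override replaces only the colliding key.
merged = merge_kwargs(constructor_kwargs, {"temperature": 0.7})

# With no runtime kwargs, the result is a fresh copy of the base mapping.
unchanged = merge_kwargs(constructor_kwargs)
```

Because a new dictionary is built on every call, the stored constructor kwargs are never mutated by a per-call override.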

### **B) Concrete Strategy 1: GeminiLanguageModel**

```python
# ============================================
# FILE: langextract/providers/gemini.py
# ============================================

class GeminiLanguageModel(base_model.BaseLanguageModel):
    """
    CONCRETE STRATEGY 1: Google Gemini implementation.

    Specific algorithm:
    - Uses the google-genai SDK
    - Supports structured output (JSON schema)
    - Parallel processing with ThreadPoolExecutor
    """
    
    model_id: str = 'gemini-2.5-flash'
    api_key: str | None = None
    temperature: float = 0.0
    max_workers: int = 10
    
    def __init__(
        self,
        model_id: str = 'gemini-2.5-flash',
        api_key: str | None = None,
        temperature: float = 0.0,
        **kwargs
    ):
        """Initializes the Gemini client."""
        from google import genai
        
        self.model_id = model_id
        self.api_key = api_key
        self.temperature = temperature
        
        # Gemini SDK client
        self._client = genai.Client(
            api_key=self.api_key,
            vertexai=False,
        )
        
        super().__init__(constraint=schema.Constraint())
    
    def infer(
        self,
        batch_prompts: Sequence[str],
        **kwargs
    ) -> Iterator[Sequence[core_types.ScoredOutput]]:
        """
        GEMINI ALGORITHM: Inference via the Google Genai SDK.

        Gemini-specific:
        - Parallel processing (ThreadPoolExecutor)
        - response_schema for structured output
        - response_mime_type: application/json
        """
        config = {
            'temperature': kwargs.get('temperature', self.temperature),
        }
        
        # Gemini-specific: JSON schema support
        # (how gemini_schema gets set is not shown in this excerpt)
        if self.gemini_schema:
            config['response_mime_type'] = 'application/json'
            config['response_schema'] = self.gemini_schema.schema_dict
        
        # Parallel processing (Gemini-specific optimization)
        if len(batch_prompts) > 1 and self.max_workers > 1:
            with concurrent.futures.ThreadPoolExecutor(
                max_workers=min(self.max_workers, len(batch_prompts))
            ) as executor:
                futures = {
                    executor.submit(self._process_single_prompt, p, config): i
                    for i, p in enumerate(batch_prompts)
                }
                
                results = [None] * len(batch_prompts)
                for future in concurrent.futures.as_completed(futures):
                    idx = futures[future]
                    results[idx] = future.result()
                
                for result in results:
                    yield [result]
        else:
            # Sequential processing
            for prompt in batch_prompts:
                result = self._process_single_prompt(prompt, config)
                yield [result]
    
    def _process_single_prompt(self, prompt: str, config: dict):
        """Processes a single prompt via the Gemini API."""
        response = self._client.models.generate_content(
            model=self.model_id,
            contents=prompt,
            config=config
        )
        return core_types.ScoredOutput(score=1.0, output=response.text)
```
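
The parallel branch of `infer()` relies on a common idiom: `as_completed()` yields futures in completion order, so each future is mapped back to its input index and the result is written into a pre-sized list. The sketch below isolates that idiom; `slow_upper` is a made-up stand-in for a model call and has nothing to do with the Gemini SDK.

```python
import concurrent.futures
import time


def slow_upper(text: str) -> str:
    """Stand-in for a model call; shorter inputs deliberately finish later."""
    time.sleep(0.05 / max(len(text), 1))
    return text.upper()


def parallel_in_order(items: list[str], max_workers: int = 4) -> list[str]:
    """Runs slow_upper concurrently but returns results in input order."""
    if not items:
        return []  # ThreadPoolExecutor rejects max_workers=0
    results: list = [None] * len(items)
    with concurrent.futures.ThreadPoolExecutor(
        max_workers=min(max_workers, len(items))
    ) as executor:
        # Map each future back to the index of the input that produced it.
        futures = {executor.submit(slow_upper, item): i
                   for i, item in enumerate(items)}
        for future in concurrent.futures.as_completed(futures):
            results[futures[future]] = future.result()
    return results


outputs = parallel_in_order(["a", "bbbb", "cc"])
```

One consequence of this design, visible in the excerpt as well, is that nothing is yielded until every future has finished, since results are emitted only after the `as_completed()` loop ends.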

### **C) Concrete Strategy 2: OpenAILanguageModel**

```python
# ============================================
# FILE: langextract/providers/openai.py
# ============================================

class OpenAILanguageModel(base_model.BaseLanguageModel):
    """
    CONCRETE STRATEGY 2: OpenAI implementation.

    Specific algorithm:
    - Uses the openai SDK
    - JSON mode (response_format)
    - Reasoning effort for o1/o3 models
    """
    
    model_id: str = 'gpt-4o-mini'
    api_key: str | None = None
    temperature: float | None = None
    
    def __init__(
        self,
        model_id: str = 'gpt-4o-mini',
        api_key: str | None = None,
        temperature: float | None = None,
        **kwargs
    ):
        """Initializes the OpenAI client."""
        import openai
        
        self.model_id = model_id
        self.api_key = api_key
        self.temperature = temperature
        
        # OpenAI SDK client
        self._client = openai.OpenAI(api_key=self.api_key)
        
        super().__init__(constraint=schema.Constraint())
    
    def infer(
        self,
        batch_prompts: Sequence[str],
        **kwargs
    ) -> Iterator[Sequence[core_types.ScoredOutput]]:
        """
        OPENAI ALGORITHM: Inference via the OpenAI SDK.

        OpenAI-specific:
        - Chat completions API
        - response_format: json_object
        - reasoning_effort for o1 models
        """
        config = {}
        
        if (temp := kwargs.get('temperature', self.temperature)) is not None:
            config['temperature'] = temp
        
        # OpenAI-specific: JSON mode
        if self.format_type == data.FormatType.JSON:
            config.setdefault('response_format', {'type': 'json_object'})
        
        # OpenAI-specific: Reasoning effort (o1 models)
        if 'reasoning_effort' in kwargs:
            config['reasoning'] = {'effort': kwargs['reasoning_effort']}
        
        # Prompts are processed sequentially in this excerpt
        for prompt in batch_prompts:
            result = self._process_single_prompt(prompt, config)
            yield [result]
    
    def _process_single_prompt(self, prompt: str, config: dict):
        """Processes a single prompt via the OpenAI API."""
        messages = [
            {'role': 'system', 'content': 'You respond in JSON format.'},
            {'role': 'user', 'content': prompt}
        ]
        
        response = self._client.chat.completions.create(
            model=self.model_id,
            messages=messages,
            **config
        )
        
        output = response.choices[0].message.content
        return core_types.ScoredOutput(score=1.0, output=output)
    
    @property
    def requires_fence_output(self) -> bool:
        """OpenAI JSON mode returns raw JSON (no fences)."""
        if self.format_type == data.FormatType.JSON:
            return False  # ← OpenAI-specific behavior
        return super().requires_fence_output
```

### **D) Concrete Strategy 3: OllamaLanguageModel**

```python
# ============================================
# FILE: langextract/providers/ollama.py
# ============================================

class OllamaLanguageModel(base_model.BaseLanguageModel):
    """
    CONCRETE STRATEGY 3: Ollama implementation (local).

    Specific algorithm:
    - Uses plain HTTP requests instead of an SDK
    - Local API (localhost:11434)
    - JSON format mode
    """
    
    _model: str
    _model_url: str = 'http://localhost:11434'
    format_type: core_types.FormatType = core_types.FormatType.JSON
    
    def __init__(
        self,
        model_id: str,
        model_url: str = 'http://localhost:11434',
        format_type: core_types.FormatType = core_types.FormatType.JSON,
        **kwargs
    ):
        """Initializes the Ollama client (HTTP)."""
        self._model = model_id
        self._model_url = model_url
        self.format_type = format_type
        self._requests = requests  # HTTP client
        
        super().__init__(constraint=schema.Constraint())
    
    def infer(
        self,
        batch_prompts: Sequence[str],
        **kwargs
    ) -> Iterator[Sequence[core_types.ScoredOutput]]:
        """
        OLLAMA ALGORITHM: Inference via HTTP requests.

        Ollama-specific:
        - POST to /api/generate
        - format: 'json' in the payload
        - Configurable timeout
        """
        for prompt in batch_prompts:
            try:
                response = self._ollama_query(
                    prompt=prompt,
                    model=self._model,
                    structured_output_format='json',
                    model_url=self._model_url,
                    **kwargs
                )
                yield [core_types.ScoredOutput(
                    score=1.0,
                    output=response['response']
                )]
            except Exception as e:
                raise exceptions.InferenceRuntimeError(
                    f'Ollama API error: {str(e)}'
                ) from e
    
    def _ollama_query(
        self,
        prompt: str,
        model: str,
        structured_output_format: str,
        model_url: str,
        **kwargs
    ) -> dict:
        """
        Query Ollama via HTTP POST.
        
        Ollama-specific implementation:
        - Endpoint: /api/generate
        - Payload: {model, prompt, format, options}
        """
        api_url = urljoin(model_url, 'api/generate')
        
        payload = {
            'model': model,
            'prompt': prompt,
            'stream': False,  # ← Ollama-specific
            'format': structured_output_format,  # ← Ollama JSON mode
            'options': {
                'temperature': kwargs.get('temperature', 0.1),
                'num_ctx': kwargs.get('num_ctx', 2048),
            }
        }
        
        # HTTP POST (no SDK involved)
        response = self._requests.post(
            api_url,
            json=payload,
            timeout=kwargs.get('timeout', 120)
        )
        
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 404:
            raise exceptions.InferenceConfigError(
                f"Model {model} not found. Try: ollama run {model}"
            )
        else:
            raise exceptions.InferenceRuntimeError(
                f'Ollama error: {response.status_code}'
            )
```
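
One detail of `_ollama_query()` worth spelling out is how the endpoint URL is assembled. Assuming the `urljoin` in the excerpt is `urllib.parse.urljoin` (the import is not shown), the default base URL behaves as expected, but a base URL that carries a path prefix only keeps that prefix when it ends with a slash. The `example.test` hosts below are placeholders.

```python
from urllib.parse import urljoin

# Base URL without a path: the endpoint is appended as expected.
plain = urljoin("http://localhost:11434", "api/generate")

# Base URL with a path but no trailing slash: the last segment is replaced.
no_slash = urljoin("http://example.test/ollama", "api/generate")

# Base URL with a trailing slash: the path prefix is preserved.
with_slash = urljoin("http://example.test/ollama/", "api/generate")

print(plain)       # http://localhost:11434/api/generate
print(no_slash)    # http://example.test/api/generate
print(with_slash)  # http://example.test/ollama/api/generate
```

This matters mainly when Ollama sits behind a reverse proxy under a sub-path; for the default `http://localhost:11434` the join is straightforward.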

---

## 📊 **3. DIRECT COMPARISON**

| Aspect | Classic Strategy | LangExtract |
|---------|-------------------|-------------|
| **Abstract interface** | `CompressionStrategy` | `BaseLanguageModel` |
| **Abstract method** | `compress(data)` | `infer(batch_prompts, **kwargs)` |
| **Concrete Strategy 1** | `ZipCompression` | `GeminiLanguageModel` |
| **Concrete Strategy 2** | `RarCompression` | `OpenAILanguageModel` |
| **Concrete Strategy 3** | `GzipCompression` | `OllamaLanguageModel` |
| **Context** | `FileCompressor` | `Annotator` (calls `model.infer()`) |
| **Strategy switching** | `set_strategy()` | The Factory creates the appropriate strategy |
| **Algorithm varies** | ✅ YES (ZIP vs RAR vs GZIP) | ✅ YES (Gemini vs OpenAI vs Ollama) |
| **Common interface** | ✅ YES | ✅ YES (`infer` + `parse_output`) |
| **Polymorphism** | ✅ YES | ✅ YES (`BaseLanguageModel`) |
---

## 🎯 **4. CONCEPTUAL MAPPING**

```
┌────────────────────────────────────────────────────────────┐
│  GoF PATTERN               →    LANGEXTRACT                │
├────────────────────────────────────────────────────────────┤
│  Strategy (interface)      →    BaseLanguageModel          │
│  abstract compress()       →    abstract infer()           │
│  ConcreteStrategyA         →    GeminiLanguageModel        │
│  ConcreteStrategyB         →    OpenAILanguageModel        │
│  ConcreteStrategyC         →    OllamaLanguageModel        │
│  Context                   →    Annotator                  │
│  set_strategy()            →    create_model(config)       │
│  Client chooses            →    ModelConfig specifies      │
└────────────────────────────────────────────────────────────┘
```

### **Execution Flow**

```
┌──────────────────────┐
│  Client (extract)    │
└──────────┬───────────┘
           │
           │ ModelConfig(model_id="gemini-2.5-flash")
           │
           ▼
┌──────────────────────┐
│  Factory             │
│  create_model()      │
└──────────┬───────────┘
           │
           │ Creates the appropriate strategy
           │
           ▼
┌─────────────────────────────────────────────┐
│  BaseLanguageModel (Strategy Interface)     │
└──────────┬──────────────────────────────────┘
           │
           ├─── GeminiLanguageModel
           │    └─ infer() → google.genai SDK
           │
           ├─── OpenAILanguageModel
           │    └─ infer() → openai SDK
           │
           └─── OllamaLanguageModel
                └─ infer() → HTTP requests
           │
           ▼
┌──────────────────────┐
│  Annotator (Context) │
│  model.infer(prompt) │  ◄─── Uses the strategy polymorphically
└──────────────────────┘
```
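
A practical payoff of this flow is that the context can be exercised with a test double that never touches the network. The sketch below uses a deliberately reduced interface: `LanguageModel`, `EchoModel`, `ReverseModel`, and `MiniAnnotator` are hypothetical names, and the outputs are plain strings rather than LangExtract's scored-output objects.

```python
from abc import ABC, abstractmethod
from typing import Iterator, Sequence


class LanguageModel(ABC):
    """Hypothetical, minimal stand-in for BaseLanguageModel."""

    @abstractmethod
    def infer(self, batch_prompts: Sequence[str], **kwargs) -> Iterator[Sequence[str]]:
        ...


class EchoModel(LanguageModel):
    """Test double: returns a canned answer per prompt, no network."""

    def infer(self, batch_prompts, **kwargs):
        for prompt in batch_prompts:
            yield [f"echo:{prompt}"]


class ReverseModel(LanguageModel):
    """Second strategy with a different 'algorithm'."""

    def infer(self, batch_prompts, **kwargs):
        for prompt in batch_prompts:
            yield [prompt[::-1]]


class MiniAnnotator:
    """Hypothetical context: depends only on the abstract interface."""

    def __init__(self, model: LanguageModel) -> None:
        self._model = model

    def annotate(self, texts: Sequence[str]) -> list[str]:
        # Takes the first candidate output for each prompt.
        return [outputs[0] for outputs in self._model.infer(texts)]


echo_result = MiniAnnotator(EchoModel()).annotate(["abc", "de"])
reverse_result = MiniAnnotator(ReverseModel()).annotate(["abc", "de"])
```

`MiniAnnotator` contains no provider-specific branches; swapping the model changes the behavior without changing the context.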

---

## 💡 **5. COMPLETE PRACTICAL EXAMPLE**

### **A) Classic Strategy (Payment)**

```python
from abc import ABC, abstractmethod

# ============================================
# STRATEGY INTERFACE
# ============================================

class PaymentStrategy(ABC):
    @abstractmethod
    def pay(self, amount: float) -> str:
        pass

# ============================================
# CONCRETE STRATEGIES
# ============================================

class CreditCardPayment(PaymentStrategy):
    def __init__(self, card_number: str):
        self.card_number = card_number
    
    def pay(self, amount: float) -> str:
        return f"Paid ${amount} with Credit Card {self.card_number}"

class PayPalPayment(PaymentStrategy):
    def __init__(self, email: str):
        self.email = email
    
    def pay(self, amount: float) -> str:
        return f"Paid ${amount} with PayPal {self.email}"

class BitcoinPayment(PaymentStrategy):
    def __init__(self, wallet_address: str):
        self.wallet_address = wallet_address
    
    def pay(self, amount: float) -> str:
        return f"Paid ${amount} with Bitcoin {self.wallet_address}"

# ============================================
# CONTEXT
# ============================================

class ShoppingCart:
    def __init__(self):
        self._items = []
        self._payment_strategy: PaymentStrategy | None = None
    
    def add_item(self, item: str, price: float):
        self._items.append((item, price))
    
    def set_payment_strategy(self, strategy: PaymentStrategy):
        """Swap the payment strategy at runtime."""
        self._payment_strategy = strategy

    def checkout(self) -> str:
        """Pay the cart total using the current strategy."""
        total = sum(price for _, price in self._items)
        
        if not self._payment_strategy:
            raise ValueError("No payment strategy set")
        
        return self._payment_strategy.pay(total)

# ============================================
# USAGE
# ============================================

cart = ShoppingCart()
cart.add_item("Book", 29.99)
cart.add_item("Pen", 1.50)

# Customer 1: pays by credit card
cart.set_payment_strategy(CreditCardPayment("1234-5678-9012-3456"))
print(cart.checkout())  # Paid $31.49 with Credit Card 1234-...

# Customer 2: switches to PayPal
cart.set_payment_strategy(PayPalPayment("user@example.com"))
print(cart.checkout())  # Paid $31.49 with PayPal user@example.com

# Customer 3: switches to Bitcoin
cart.set_payment_strategy(BitcoinPayment("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"))
print(cart.checkout())  # Paid $31.49 with Bitcoin 1A1z...
```

### **B) LangExtract Strategy (Real)**

```python
# ============================================
# USAGE IN LANGEXTRACT
# ============================================

import langextract as lx

# NOTE: real calls typically also pass `examples=[...]`; omitted here for brevity.

# ============================================
# STRATEGY 1: Gemini
# ============================================

result1 = lx.extract(
    text_or_documents="Extract info from this text",
    model_id="gemini-2.5-flash",  # ← Factory selects GeminiLanguageModel
    api_key="...",
    prompt_description="Extract entities"
)
# Internally:
# - Factory creates GeminiLanguageModel
# - Annotator calls model.infer()
# - GeminiLanguageModel.infer() uses the google-genai SDK

# ============================================
# STRATEGY 2: OpenAI (swapped at runtime)
# ============================================

result2 = lx.extract(
    text_or_documents="Same text",
    model_id="gpt-4o-mini",  # ← Factory selects OpenAILanguageModel
    api_key="...",
    prompt_description="Extract entities"
)
# Internally:
# - Factory creates OpenAILanguageModel
# - Annotator calls model.infer()
# - OpenAILanguageModel.infer() uses the openai SDK

# ============================================
# STRATEGY 3: Ollama (local)
# ============================================

result3 = lx.extract(
    text_or_documents="Same text",
    model_id="llama3.2:1b",  # ← Factory selects OllamaLanguageModel
    prompt_description="Extract entities"
)
# Internally:
# - Factory creates OllamaLanguageModel
# - Annotator calls model.infer()
# - OllamaLanguageModel.infer() uses HTTP requests

# ============================================
# POLYMORPHISM IN ACTION
# ============================================

from langextract.core.base_model import BaseLanguageModel
# Provider module paths below are assumed from the langextract.providers package
from langextract.providers.gemini import GeminiLanguageModel
from langextract.providers.ollama import OllamaLanguageModel
from langextract.providers.openai import OpenAILanguageModel

def process_with_any_model(model: BaseLanguageModel, prompt: str):
    """Accepts ANY strategy (Gemini, OpenAI, Ollama)."""
    outputs = list(model.infer([prompt]))
    return outputs[0][0].output

# Works with every strategy:
gemini_model = GeminiLanguageModel(api_key="...")
openai_model = OpenAILanguageModel(api_key="...")
ollama_model = OllamaLanguageModel(model_id="llama3.2:1b")

result_a = process_with_any_model(gemini_model, "Extract...")
result_b = process_with_any_model(openai_model, "Extract...")
result_c = process_with_any_model(ollama_model, "Extract...")
```
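Because `process_with_any_model` depends only on the `infer()` contract, the same function can be exercised without network access or API keys. The sketch below is self-contained: `BaseLanguageModel` and `ScoredOutput` are simplified stand-ins written for this example (not the LangExtract classes), and `MockLanguageModel` is a hypothetical test double.

```python
from abc import ABC, abstractmethod
from collections.abc import Iterator, Sequence
from dataclasses import dataclass


@dataclass
class ScoredOutput:
    """Simplified stand-in for LangExtract's scored output type."""
    score: float
    output: str


class BaseLanguageModel(ABC):
    """Simplified stand-in for the strategy interface."""

    @abstractmethod
    def infer(self, batch_prompts: Sequence[str]) -> Iterator[Sequence[ScoredOutput]]:
        """Yield one list of scored outputs per prompt."""


class MockLanguageModel(BaseLanguageModel):
    """Hypothetical test double: returns a canned answer and records prompts."""

    def __init__(self, canned_output: str):
        self.canned_output = canned_output
        self.seen_prompts: list[str] = []

    def infer(self, batch_prompts):
        for prompt in batch_prompts:
            self.seen_prompts.append(prompt)
            yield [ScoredOutput(score=1.0, output=self.canned_output)]


def process_with_any_model(model: BaseLanguageModel, prompt: str) -> str:
    """Same shape as the function above: relies only on infer()."""
    outputs = list(model.infer([prompt]))
    return outputs[0][0].output


mock = MockLanguageModel('{"entities": []}')
print(process_with_any_model(mock, "Extract..."))  # {"entities": []}
print(mock.seen_prompts)                           # ['Extract...']
```

The client function never inspects the concrete class, so a mock and a real provider are interchangeable from its point of view.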

---

## ✅ **6. VERIFICATION CHECKLIST**

| Strategy Pattern characteristic | LangExtract |
|---------------------------------|-------------|
| ✅ Abstract interface with an abstract method | ✅ YES (`BaseLanguageModel.infer()`) |
| ✅ Multiple concrete implementations | ✅ YES (Gemini, OpenAI, Ollama) |
| ✅ Algorithm varies between strategies | ✅ YES (SDK vs HTTP, parallel vs sequential) |
| ✅ Common interface for all | ✅ YES (identical `infer()` signature) |
| ✅ Runtime polymorphism | ✅ YES (Annotator uses BaseLanguageModel) |
| ✅ Interchangeable strategies | ✅ YES (Factory creates the right strategy) |
| ✅ Client does not know the implementation | ✅ YES (Annotator does not know whether it is Gemini or OpenAI) |
| ✅ Easy to add new strategies | ✅ YES (just subclass BaseLanguageModel) |

---

## 📈 **7. ALGORITHM COMPARISON**

### **Differences Between Strategies**

| Aspect | GeminiLanguageModel | OpenAILanguageModel | OllamaLanguageModel |
|--------|---------------------|---------------------|---------------------|
| **SDK** | google-genai | openai | requests (HTTP) |
| **Endpoint** | genai.Client | chat.completions | POST /api/generate |
| **JSON mode** | response_schema | response_format | format: 'json' |
| **Parallel** | ✅ ThreadPoolExecutor | ✅ ThreadPoolExecutor | ❌ Sequential |
| **Auth** | api_key or Vertex AI | api_key | Optional (local) |
| **Structured output** | ✅ Native (schema_dict) | ⚠️ JSON mode only | ⚠️ JSON mode only |
| **Timeout** | Configurable | Configurable | Configurable |
| **Streaming** | ❌ No (batch) | ❌ No (batch) | ❌ No (stream=False) |

### **Example of an Algorithm Difference**

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# ============================================
# GEMINI: Parallel processing
# ============================================

def infer(self, batch_prompts):
    results = [None] * len(batch_prompts)
    with ThreadPoolExecutor(max_workers=10) as executor:
        futures = {
            executor.submit(self._process_single_prompt, p): i
            for i, p in enumerate(batch_prompts)
        }
        # All prompts run in parallel; store each result at its prompt's index
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    # Yield in input order so outputs line up with batch_prompts
    for result in results:
        yield [result]

# ============================================
# OPENAI: Similar parallel processing
# ============================================

def infer(self, batch_prompts):
    # Same structure as Gemini
    with ThreadPoolExecutor(max_workers=10) as executor:
        ...  # parallel processing, elided

# ============================================
# OLLAMA: Sequential processing
# ============================================

def infer(self, batch_prompts):
    # No parallel processing: one request at a time
    for prompt in batch_prompts:
        response = self._ollama_query(prompt, ...)
        yield [ScoredOutput(output=response['response'])]
```
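To see the difference between the sequential and thread-pool approaches in isolation, the sketch below runs the same batch both ways. `fake_llm_call`, `infer_sequential`, and `infer_parallel` are hypothetical names for this example, and the `time.sleep` call stands in for network latency; none of this is LangExtract code. It uses `executor.map`, which returns results in input order.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_llm_call(prompt: str) -> str:
    """Hypothetical stand-in for a provider request (sleep simulates latency)."""
    time.sleep(0.05)
    return prompt.upper()


def infer_sequential(batch_prompts):
    """Ollama-style loop: one request at a time."""
    for prompt in batch_prompts:
        yield [fake_llm_call(prompt)]


def infer_parallel(batch_prompts, max_workers=10):
    """Thread-pool variant: requests overlap, results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the order of batch_prompts,
        # regardless of which request finishes first.
        for result in executor.map(fake_llm_call, batch_prompts):
            yield [result]


prompts = ["a", "b", "c", "d"]

start = time.perf_counter()
sequential = list(infer_sequential(prompts))
sequential_time = time.perf_counter() - start

start = time.perf_counter()
parallel = list(infer_parallel(prompts))
parallel_time = time.perf_counter() - start

print(sequential == parallel)  # True: same outputs, same order
print(f"sequential: {sequential_time:.2f}s, parallel: {parallel_time:.2f}s")
```

Both generators expose the same calling convention, so the caller cannot tell which one it is using except by elapsed time.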

---

## 🔧 **8. ADVANTAGES OF THE IMPLEMENTATION**

| Advantage | Description |
|-----------|-------------|
| **Interchangeable** | Switch between Gemini/OpenAI/Ollama without changing client code |
| **Extensible** | New strategy = new class that inherits from BaseLanguageModel |
| **Testable** | A MockLanguageModel can be created for tests |
| **Encapsulated** | API details (SDK, HTTP) are hidden from the client |
| **Polymorphic** | Annotator uses BaseLanguageModel (regardless of which one) |
| **Configurable** | Each strategy has its own specific parameters |

---

## 🆕 **9. ADDING A NEW STRATEGY**

### **How to add a new provider (e.g., Anthropic Claude)**

```python
# ============================================
# NEW STRATEGY: AnthropicLanguageModel
# ============================================

from collections.abc import Iterator, Sequence

# `schema` and `types` import paths are assumed from the langextract.core package
from langextract.core import base_model, schema, types as core_types
from langextract.providers import router

@router.register("claude-*", priority=5)  # Registers the model-ID pattern
class AnthropicLanguageModel(base_model.BaseLanguageModel):
    """
    NEW CONCRETE STRATEGY: Anthropic Claude.

    Provider-specific algorithm:
    - Uses the anthropic SDK
    - Messages API
    """
    
    model_id: str = "claude-3-5-sonnet-20241022"
    api_key: str | None = None
    
    def __init__(
        self,
        model_id: str = "claude-3-5-sonnet-20241022",
        api_key: str | None = None,
        **kwargs
    ):
        """Initialize the Anthropic client."""
        import anthropic

        self.model_id = model_id
        self.api_key = api_key

        # Anthropic SDK client
        self._client = anthropic.Anthropic(api_key=api_key)
        
        super().__init__(constraint=schema.Constraint())
    
    def infer(
        self,
        batch_prompts: Sequence[str],
        **kwargs
    ) -> Iterator[Sequence[core_types.ScoredOutput]]:
        """
        ANTHROPIC ALGORITHM: inference through the Anthropic SDK.

        Claude-specific details:
        - Messages API
        - System prompts
        - Temperature, max_tokens
        """
        for prompt in batch_prompts:
            response = self._client.messages.create(
                model=self.model_id,
                max_tokens=kwargs.get('max_tokens', 1024),
                temperature=kwargs.get('temperature', 0.0),
                messages=[{
                    "role": "user",
                    "content": prompt
                }]
            )
            
            output = response.content[0].text
            yield [core_types.ScoredOutput(score=1.0, output=output)]

# ============================================
# IMMEDIATE USE
# ============================================

import langextract as lx

result = lx.extract(
    text_or_documents="Extract entities",
    model_id="claude-3-5-sonnet-20241022",  # ← Factory detects it automatically
    api_key="sk-ant-...",
    prompt_description="Extract names"
)
# Works: the Factory creates AnthropicLanguageModel via @router.register
```
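The registration step is what lets the Factory find the new class from `model_id` alone. Below is a minimal sketch of how such a pattern-based registry could work. `register`, `resolve`, and `_REGISTRY` are hypothetical names for illustration and do not reproduce LangExtract's actual `router` module; the sketch assumes that patterns are regular expressions and that the highest priority wins.

```python
import re

# (compiled pattern, priority, class) entries; hypothetical structure
_REGISTRY: list[tuple[re.Pattern, int, type]] = []


def register(pattern: str, priority: int = 0):
    """Class decorator that records a model-ID pattern for a provider class."""
    def decorator(cls):
        _REGISTRY.append((re.compile(pattern), priority, cls))
        return cls
    return decorator


def resolve(model_id: str) -> type:
    """Return the highest-priority class whose pattern matches model_id."""
    matches = [(prio, cls) for pat, prio, cls in _REGISTRY if pat.match(model_id)]
    if not matches:
        raise ValueError(f"No provider registered for {model_id!r}")
    return max(matches, key=lambda m: m[0])[1]


@register(r"^gemini", priority=10)
class GeminiModel: ...


@register(r"^claude", priority=5)
class AnthropicModel: ...


print(resolve("claude-3-5-sonnet-20241022").__name__)  # AnthropicModel
print(resolve("gemini-2.5-flash").__name__)            # GeminiModel
```

With this shape, adding a provider touches only the new class: the decorator runs at import time and the lookup code stays unchanged.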

---

## 📝 **10. CONCLUSION**

### **✅ Is it the Strategy Pattern? YES! 100% GoF Compliant**

**Evidence:**

1. ✅ **Abstract interface** - `BaseLanguageModel(ABC)` with `@abstractmethod infer()`
2. ✅ **Multiple strategies** - Gemini, OpenAI, Ollama (and extensible)
3. ✅ **Interchangeable algorithm** - google-genai SDK vs openai SDK vs HTTP
4. ✅ **Polymorphism** - Annotator uses `BaseLanguageModel` (does not know the implementation)
5. ✅ **Runtime swapping** - Factory creates the strategy based on `model_id`
6. ✅ **Common interface** - All implement `infer()` with the same signature
7. ✅ **Behavior varies** - Parallel (Gemini) vs Sequential (Ollama)

### **Classification:**
🏆 **Strategy Pattern (GoF) - Pure and Complete Implementation**

### **Why is it a perfect Strategy?**

```
┌────────────────────────────────────────────┐
│  PATTERN INTENT                            │
├────────────────────────────────────────────┤
│  "Let the algorithm vary                   │
│   independently of its clients"            │
└────────────────────────────────────────────┘
                    │
                    │
                    ▼
┌────────────────────────────────────────────┐
│  LANGEXTRACT IMPLEMENTATION                │
├────────────────────────────────────────────┤
│  ✅ Annotator (client) is LLM-agnostic     │
│  ✅ GeminiLanguageModel.infer() ≠          │
│      OpenAILanguageModel.infer()           │
│  ✅ Same interface, different algorithms   │
│  ✅ Easy to add a new strategy             │
└────────────────────────────────────────────┘
```

### **Visual Evidence:**

```python
# ============================================
# WITHOUT STRATEGY (❌ Coupled code)
# ============================================

def extract(text, provider):
    if provider == "gemini":
        from google import genai
        client = genai.Client(api_key="...")
        response = client.models.generate_content(...)
        return response.text
    elif provider == "openai":
        import openai
        client = openai.OpenAI(api_key="...")
        response = client.chat.completions.create(...)
        return response.choices[0].message.content
    elif provider == "ollama":
        import requests
        response = requests.post("http://localhost:11434/api/generate", ...)
        return response.json()['response']
    # ❌ Hard to maintain and to extend with new providers

# ============================================
# WITH STRATEGY (✅ Decoupled)
# ============================================

def extract(text, model: BaseLanguageModel):
    """Accepts ANY strategy"""
    outputs = model.infer([text])
    return list(outputs)[0][0].output

# Usage with any provider:
extract(text, GeminiLanguageModel(api_key="..."))
extract(text, OpenAILanguageModel(api_key="..."))
extract(text, OllamaLanguageModel(model_id="llama3.2"))
# ✅ Easy to add AnthropicLanguageModel
```

