# Lesson 3: Prompting for structure and setting up a retry method 

In this notebook, you'll learn how to combine Pydantic with retry strategies to reliably extract structured output from an LLM.

By the end, you'll be able to:
- Define structured data models for LLM responses
- Build robust retry mechanisms for validation errors
- Create reusable functions for LLM interactions

---

### Import packages and initialize the OpenAI client

In [None]:
# Import necessary packages
import json
from datetime import date
from typing import Literal, Type

import openai
from dotenv import load_dotenv
from pydantic import BaseModel, EmailStr, Field, ValidationError

In [None]:
# Load environment variables for API access
load_dotenv()
# Initialize OpenAI client for API calls
client = openai.OpenAI()

### Define some sample input data

In [None]:
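# Define a JSON string representing user input
# The keys must match the UserInput field names (order_id, not order_number);
# Pydantic ignores unknown keys by default, so a mismatch would go unnoticed
user_input_json = """
{
    "name": "Joe User",
    "email": "joe.user@example.com",
    "query": "I forgot my password.",
    "order_id": null,
    "purchase_date": null
}
"""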

### Define your UserInput data model

In [None]:
# Define UserInput model
class UserInput(BaseModel):
    """Pydantic model for validating user input with optional fields.

    Data model represents comprehensive user input containing name, email, query,
    order ID, and purchase date.

    Attributes:
        name (str): Full name of the user.
        email (EmailStr): Email address of the user.
        query (str): User's query or issue description.
        order_id (int | None): 5-digit order number (cannot start with 0).
        purchase_date (date | None): Date of purchase.

    Raises:
        ValidationError: If any validation constraint is violated:
            - Invalid email format
            - order_id outside range 10000-99999
            - Invalid date format for purchase_date

    Note:
        This class leverages Pydantic's Field for advanced validation.
        Optional fields can be omitted and will default to None.

    """

    name: str
    email: EmailStr
    query: str
    order_id: int | None = Field(
        None,
        description=("5-digit order number (cannot start with 0)"),
        ge=10000,
        le=99999,
    )
    purchase_date: date | None = None

In [None]:
# Create UserInput instance from JSON data
user_input = UserInput.model_validate_json(user_input_json)

### Create a new data model called CustomerQuery

In [None]:
# Define the CustomerQuery model that inherits from UserInput
class CustomerQuery(UserInput):
    """Represents a structured customer query with classification metadata.

    This model extends `UserInput` by adding fields necessary for automated
    processing, routing, and prioritization within a customer support system.
    It enriches the raw user input with structured data like priority,
    category, and tags.

    Attributes:
        priority: The urgency of the query. Must be one of 'low', 'medium',
            or 'high'.
        category: The general topic of the query. Must be one of
            'refund_request', 'information_request', or 'other'.
        is_complaint: A boolean flag indicating if the query should be
            treated as a formal complaint.
        tags: A list of keywords that provide additional context for filtering
            and analysis.

    Example:
        A typical use case involves parsing a dictionary of data, perhaps from
        an API request or a form submission, into a validated model instance.

        >>> query_data = {
        ...     "name": "John Doe",
        ...     "email": "john.doe@example.com",
        ...     "query": "My package hasn't arrived yet.",
        ...     "order_id": 54321,
        ...     "priority": "high",
        ...     "category": "other",
        ...     "is_complaint": True,
        ...     "tags": ["shipping", "delay", "package_tracking"]
        ... }
        >>> customer_query = CustomerQuery(**query_data)
        >>> print(customer_query.priority)
        high
        >>> print(customer_query.is_complaint)
        True
        >>> print(customer_query.name)  # Inherited from UserInput
        John Doe

    """

    priority: Literal["low", "medium", "high"] = Field(
        ..., description="Priority level: low, medium, high"
    )
    category: Literal["refund_request", "information_request", "other"] = Field(
        ..., description="Query category"
    )
    is_complaint: bool = Field(..., description="Whether this is a complaint")
    tags: list[str] = Field(..., description="Relevant keyword tags")

### Construct a prompt with example output

In [None]:
# Create an example response as a JSON string to guide the LLM.
# The query stays on one line: a raw line break inside a JSON string is invalid JSON.
example_response_structure = """{
    "name": "Example User",
    "email": "user@example.com",
    "query": "I ordered a new computer monitor and it arrived with the screen cracked. I need to exchange it for a new one.",
    "order_id": 12345,
    "purchase_date": "2025-12-31",
    "priority": "medium",
    "category": "refund_request",
    "is_complaint": true,
    "tags": ["monitor", "support", "exchange"]
}"""

In [None]:
# Example response structure to guide the LLM.
# Using a dictionary instead of a formatted string makes it cleaner and easier to maintain.
example_response_structure = {
    "name": "Example User",
    "email": "user@example.com",
    "query": (
        "I ordered a new computer monitor and it arrived with the screen cracked. "
        "I need to exchange it for a new one."
    ),
    "order_id": 12345,
    "purchase_date": "2025-12-31",
    "priority": "medium",
    "category": "refund_request",
    "is_complaint": True,
    "tags": ["monitor", "support", "exchange"],
}

# (Optional) Pretty print the structure for quick inspection

print(json.dumps(example_response_structure, indent=4))

{
    "name": "Example User",
    "email": "user@example.com",
    "query": "I ordered a new computer monitor and it arrived with the screen cracked. I need to exchange it for a new one.",
    "order_id": 12345,
    "purchase_date": "2025-12-31",
    "priority": "medium",
    "category": "refund_request",
    "is_complaint": true,
    "tags": [
        "monitor",
        "support",
        "exchange"
    ]
}


In [None]:
# Create prompt with user data and expected JSON structure
prompt = f"""
Please analyze this user query\n {user_input.model_dump_json(indent=2)}:

Return your analysis as a JSON object matching this exact structure 
and data types:
{example_response_structure}

Respond ONLY with valid JSON. Do not include any explanations or 
other text or formatting before or after the JSON object.
"""

print(prompt)


Please analyze this user query
 {
  "name": "Joe User",
  "email": "joe.user@example.com",
  "query": "I forgot my password.",
  "order_id": null,
  "purchase_date": null
}:

Return your analysis as a JSON object matching this exact structure 
and data types:
{'name': 'Example User', 'email': 'user@example.com', 'query': 'I ordered a new computer monitor and it arrived with the screen cracked. I need to exchange it for a new one.', 'order_id': 12345, 'purchase_date': '2025-12-31', 'priority': 'medium', 'category': 'refund_request', 'is_complaint': True, 'tags': ['monitor', 'support', 'exchange']}

Respond ONLY with valid JSON. Do not include any explanations or 
other text or formatting before or after the JSON object.
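
The printed prompt above shows the example as a Python dict repr (single quotes, `True`), because a dict was interpolated directly into the f-string. The prompt asks for valid JSON, so the example itself is not a valid instance of what it requests. A minimal sketch of the difference, using only the standard library; the `example` dict and the `is_valid_json` helper are illustrative names, not part of the lesson code:

```python
import json

# A small stand-in for example_response_structure
example = {"is_complaint": True, "order_id": None, "tags": ["monitor"]}

# Interpolating the dict directly yields Python syntax, not JSON
as_repr = f"{example}"
# Serializing first yields text that a JSON parser accepts
as_json = json.dumps(example)


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(as_repr, is_valid_json(as_repr))
print(as_json, is_valid_json(as_json))
```

If you want the prompt to show real JSON, one option is to interpolate `json.dumps(example_response_structure, indent=4)` instead of the dict itself.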



### Define a function to call an LLM and try it with your prompt

In [None]:
# Define a function to call the LLM
def call_llm(prompt: str, model: str = "gpt-4o"):  # noqa: ANN201
    """Call a Large Language Model (LLM) with a given prompt and return the response.

    This function sends the provided prompt to the configured LLM client using the
    chat completion API. It extracts the first message content from the response.
    If the response object does not match the expected structure, a ``ValueError``
    is raised.

    Args:
        prompt (str): The input text or instruction to send to the LLM.
        model (str, optional): The model identifier to use for the request.
            Defaults to ``"gpt-4o"``.

    Returns:
        str: The generated text content from the model's first response choice.

    Raises:
        ValueError: If the response object does not contain the expected structure.

    Example:
        >>> call_llm("Write a haiku about autumn leaves.")
        'Golden leaves drifting...'

    """
    response = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    if not response or not response.choices or not response.choices[0].message.content:
        error_msg: str = "Response object does not contain expected structure."
        raise ValueError(error_msg)
    return response.choices[0].message.content

In [None]:
# Get response from LLM
response_content = call_llm(prompt)
print(response_content)

```json
{
  "name": "Joe User",
  "email": "joe.user@example.com",
  "query": "I forgot my password.",
  "order_id": null,
  "purchase_date": null,
  "priority": "high",
  "category": "account_help",
  "is_complaint": false,
  "tags": ["password", "support", "login"]
}
```


### Validate the LLM output using your CustomerQuery model

### Note: the following cell will produce a validation error. This is expected. You can simply proceed with the rest of the notebook as cells below do not depend on this cell. 

In [None]:
# Attempt to parse the response into CustomerQuery model
valid_data = CustomerQuery.model_validate_json(response_content)

ValidationError: 1 validation error for CustomerQuery
  Invalid JSON: expected value at line 1 column 1 [type=json_invalid, input_value='```json\n{\n  "name": "J...port", "login"]\n}\n```', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/json_invalid

### Define a function for error handling

## Cleaning JSON responses

Problem: LLMs often return JSON wrapped in markdown code blocks. To handle this, we add a `clean_json_response()` function that removes the extra formatting.

In [None]:
import ast


# Function to clean a JSON response by removing markdown formatting
def clean_json_response(response: str) -> str:
    """Clean JSON response by removing markdown code blocks and extra formatting.

    Args:
        response (str): The raw response string from the LLM.

    Returns:
        str: The response with surrounding markdown fences removed. If the result
            is not valid JSON but is a Python-style literal (single quotes,
            True/False/None), it is re-serialized as JSON.

    """
    cleaned = response.strip()
    # Remove markdown code blocks
    if cleaned.startswith("```"):
        lines = cleaned.split("\n")
        # Remove the opening fence line (``` or ```json)
        lines = lines[1:]
        # Remove last line if it starts with ```
        if lines and lines[-1].startswith("```"):
            lines = lines[:-1]
        cleaned = "\n".join(lines).strip()

    # Valid JSON is returned untouched, so apostrophes inside values survive
    try:
        json.loads(cleaned)
    except json.JSONDecodeError:
        pass
    else:
        return cleaned

    # Fallback for Python-style dicts: parse the literal and re-serialize it.
    # A blanket replacement of ' with " would corrupt values such as "hasn't".
    try:
        return json.dumps(ast.literal_eval(cleaned))
    except (ValueError, SyntaxError, TypeError):
        # Leave the text as-is so validation reports the real problem
        return cleaned


# Try to parse the response with cleaning
try:
    cleaned_response = clean_json_response(response_content)
    valid_data = CustomerQuery.model_validate_json(cleaned_response)
    print("Validation successful!")
    print(valid_data.model_dump_json(indent=2))
except ValidationError as e:
    print(f"Validation failed: {e}")
    print(f"Cleaned response: {cleaned_response}")

Validation failed: 1 validation error for CustomerQuery
category
  Input should be 'refund_request', 'information_request' or 'other' [type=literal_error, input_value='account_help', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/literal_error
Cleaned response: {
  "name": "Joe User",
  "email": "joe.user@example.com",
  "query": "I forgot my password.",
  "order_id": null,
  "purchase_date": null,
  "priority": "high",
  "category": "account_help",
  "is_complaint": false,
  "tags": ["password", "support", "login"]
}
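
Stripping the first and last fence lines covers the response seen here, but a model may also put a sentence before or after the fenced block. The following sketch shows one way to handle that case with the standard library: it scans for the first `{` that starts a parseable JSON object and ignores everything around it. The helper name `extract_first_json_object` and the sample `reply` are assumptions made for illustration.

```python
import json


def extract_first_json_object(text: str) -> dict:
    """Return the first JSON object found anywhere in `text`."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start` and
            # ignores whatever text follows it
            parsed, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # This brace did not open a valid object; try the next one
            start = text.find("{", start + 1)
            continue
        return parsed
    raise ValueError("No JSON object found in the response")


# A made-up reply with prose and a markdown fence around the JSON
reply = (
    "Sure! Here is the analysis:\n"
    "```json\n"
    '{"category": "other", "tags": ["password"]}\n'
    "```\n"
    "Let me know if you need more."
)
print(extract_first_json_object(reply))
```

The returned dict could then be passed to `CustomerQuery.model_validate(...)` rather than `model_validate_json(...)`, since it is already parsed.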


In [None]:
# Define a function to validate an LLM response
def validate_with_model(
    data_model: type[BaseModel], llm_response: str
) -> tuple[BaseModel | None, str | None]:
    """Validate an LLM response against a Pydantic model, cleaning it first.

    This function takes the string response from an LLM, tries to strip common
    artifacts (such as ```json Markdown code blocks), and then validates the
    resulting JSON against the given Pydantic model. Parsing and validation
    errors are caught and returned in a controlled way.

    Args:
        data_model (type[BaseModel]): The Pydantic model class to validate
            against.
        llm_response (str): The string response from the LLM, which may be
            plain JSON or JSON wrapped in Markdown code blocks.

    Returns:
        tuple[BaseModel | None, str | None]: A tuple containing:
            - The validated Pydantic object on success, or `None` on failure.
            - `None` on success, or a string with the detailed error message
              if parsing or validation fails.

    Example:
        >>> from pydantic import BaseModel
        >>> class User(BaseModel):
        ...     name: str
        ...     age: int
        ...
        >>> # Example with "dirty" JSON (inside a Markdown block)
        >>> dirty_json = '```json\\n{"name": "Bob", "age": 25}\\n```'
        >>> validated, error = validate_with_model(User, dirty_json)
        >>> print(validated.name)
        Bob
        >>> print(error)
        None
    """
    try:
        # Step 1: Clean the LLM response string to remove the ```json fence
        if "```json" in llm_response:
            clean_response = llm_response.split("```json\n")[1].split("\n```")[0]
        else:
            # The LLM returned the JSON without a ```json block
            clean_response = llm_response.strip().replace("```", "")

        # Step 2: Validate the cleaned JSON with the Pydantic model
        validated_data = data_model.model_validate_json(clean_response)

        # Validation succeeded
        print("✅ Validation successful!")
        print(validated_data.model_dump_json(indent=2))
        return validated_data, None

    # Step 3: Catch the expected errors (invalid JSON or incorrect data)
    except (ValidationError, json.JSONDecodeError, IndexError) as err:
        error_message = f"The response produced a validation or parsing error: {err}"
        print(f"❌ {error_message}")
        return None, error_message

In [None]:
# Test your validation function with the LLM response
validated_data, validation_error = validate_with_model(CustomerQuery, response_content)

❌ The response produced a validation or parsing error: 1 validation error for CustomerQuery
category
  Input should be 'refund_request', 'information_request' or 'other' [type=literal_error, input_value='account_help', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/literal_error


### Define a function to create a retry prompt including error details

In [None]:
# Define a function to create a retry prompt with error feedback
def create_retry_prompt(
    original_prompt: str, original_response: str, error_message: str
) -> str:
    r"""Create a retry prompt that incorporates error feedback for the LLM.

    This function generates a new prompt intended to help the LLM correct errors
    in its previous response. It embeds the original prompt, the original LLM
    response, and the associated error message into a structured format. The
    instruction at the end ensures that the model outputs only valid JSON without
    additional text or explanations.

    Args:
        original_prompt (str): The original user request sent to the LLM.
        original_response (str): The raw LLM response that contained an error.
        error_message (str): A description of the validation error or issue with
            the original response.

    Returns:
        str: A formatted retry prompt string that guides the LLM to produce a
        corrected response in valid JSON format.

    Example:
        >>> original_prompt = "Generate user details in JSON format."
        >>> original_response = "{'name': 'Alice', 'age': 'twenty'}"
        >>> error_message = "Field 'age' must be an integer."
        >>> retry = create_retry_prompt(
        ...     original_prompt, original_response, error_message
        ... )
        >>> print(retry.splitlines()[0])  # preview the first line of the retry prompt
        This is a request to fix an error in the structure of an llm_response.

    """
    # Adjacent string literals are joined by Python; a backslash continuation inside
    # a literal would embed the next line's indentation in the prompt text
    retry_prompt: str = (
        "This is a request to fix an error in the structure of an llm_response.\n"
        "Here is the original request:\n"
        "<original_prompt>\n"
        f"{original_prompt}\n"
        "</original_prompt>\n"
        "\n"
        "Here is the original llm_response:\n"
        "<llm_response>\n"
        f"{original_response}\n"
        "</llm_response>\n"
        "\n"
        "This response generated an error:\n"
        "<error_message>\n"
        f"{error_message}\n"
        "</error_message>\n"
        "\n"
        "Compare the error message and the llm_response and identify what needs to be "
        "fixed or removed in the llm_response to resolve this error.\n"
        "Respond ONLY with valid JSON. Do not include any explanations or other text or "
        "formatting before or after the JSON string."
    )
    return retry_prompt

In [None]:
# Create a retry prompt for validation errors
validation_retry_prompt = create_retry_prompt(
    original_prompt=prompt,
    original_response=response_content,
    error_message=validation_error,
)

print(validation_retry_prompt)

This is a request to fix an error in the structure of an llm_response.
Here is the original request:
<original_prompt>

Please analyze this user query
 {
  "name": "Joe User",
  "email": "joe.user@example.com",
  "query": "I forgot my password.",
  "order_id": null,
  "purchase_date": null
}:

Return your analysis as a JSON object matching this exact structure 
and data types:
{'name': 'Example User', 'email': 'user@example.com', 'query': 'I ordered a new computer monitor and it arrived with the screen cracked. I need to exchange it for a new one.', 'order_id': 12345, 'purchase_date': '2025-12-31', 'priority': 'medium', 'category': 'refund_request', 'is_complaint': True, 'tags': ['monitor', 'support', 'exchange']}

Respond ONLY with valid JSON. Do not include any explanations or 
other text or formatting before or after the JSON object.

</original_prompt>

Here is the original llm_response:
<llm_response>
```json
{
  "name": "Joe User",
  "email": "joe.user@example.com",
  "query": "I

### Call the LLM with your retry prompt

In [None]:
# Call the LLM with the validation retry prompt
validation_retry_response = call_llm(validation_retry_prompt)
print(validation_retry_response)

```json
{
  "name": "Joe User",
  "email": "joe.user@example.com",
  "query": "I forgot my password.",
  "order_id": null,
  "purchase_date": null,
  "priority": "high",
  "category": "other",
  "is_complaint": false,
  "tags": ["password", "support", "login"]
}
```


In [None]:
# Attempt to validate retry response from LLM
validated_data, validation_error = validate_with_model(
    CustomerQuery, validation_retry_response
)

✅ Validation successful!
{
  "name": "Joe User",
  "email": "joe.user@example.com",
  "query": "I forgot my password.",
  "order_id": null,
  "purchase_date": null,
  "priority": "high",
  "category": "other",
  "is_complaint": false,
  "tags": [
    "password",
    "support",
    "login"
  ]
}


### Create a second retry prompt

In [None]:
# Create a second retry prompt for validation errors
# (shown for illustration: the first retry already validated, so validation_error is None here)
second_validation_retry_prompt = create_retry_prompt(
    original_prompt=validation_retry_prompt,
    original_response=validation_retry_response,
    error_message=validation_error,
)

print(second_validation_retry_prompt)

This is a request to fix an error in the structure of an llm_response.
Here is the original request:
<original_prompt>
This is a request to fix an error in the structure of an llm_response.
Here is the original request:
<original_prompt>

Please analyze this user query
 {
  "name": "Joe User",
  "email": "joe.user@example.com",
  "query": "I forgot my password.",
  "order_id": null,
  "purchase_date": null
}:

Return your analysis as a JSON object matching this exact structure 
and data types:
{'name': 'Example User', 'email': 'user@example.com', 'query': 'I ordered a new computer monitor and it arrived with the screen cracked. I need to exchange it for a new one.', 'order_id': 12345, 'purchase_date': '2025-12-31', 'priority': 'medium', 'category': 'refund_request', 'is_complaint': True, 'tags': ['monitor', 'support', 'exchange']}

Respond ONLY with valid JSON. Do not include any explanations or 
other text or formatting before or after the JSON object.

</original_prompt>

Here is the

In [None]:
# Call the LLM with the second validation retry prompt
second_validation_retry_response = call_llm(second_validation_retry_prompt)
print(second_validation_retry_response)

```json
{
  "name": "Joe User",
  "email": "joe.user@example.com",
  "query": "I forgot my password.",
  "order_id": null,
  "purchase_date": null,
  "priority": "high",
  "category": "information_request",
  "is_complaint": false,
  "tags": ["password", "support", "login"]
}
```
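
The second retry prompt above wraps the first retry prompt, which in turn wraps the original prompt, so the text grows with every retry. One alternative, sketched below, keeps the original prompt once and appends each failed attempt as conversation turns in the `messages` format that `call_llm` already passes to the chat completions API. The function name `build_retry_messages` and the feedback wording are assumptions for illustration; `call_llm` as defined in this notebook accepts a single prompt string, so it would need a variant that accepts a message list.

```python
def build_retry_messages(
    original_prompt: str, failed_attempts: list
) -> list:
    """Build a chat history from the original prompt and failed attempts.

    `failed_attempts` is a list of (response, error_message) pairs,
    oldest first.
    """
    messages = [{"role": "user", "content": original_prompt}]
    for response, error_message in failed_attempts:
        # Replay the model's own answer as an assistant turn
        messages.append({"role": "assistant", "content": response})
        # Follow it with the validation feedback as a new user turn
        feedback = (
            "Your previous response generated this error:\n"
            "<error_message>\n"
            f"{error_message}\n"
            "</error_message>\n"
            "Respond ONLY with valid JSON that fixes the error."
        )
        messages.append({"role": "user", "content": feedback})
    return messages


history = build_retry_messages(
    "Classify this query as JSON.",
    [('{"category": "account_help"}', "category must be 'other'")],
)
for message in history:
    print(message["role"], "->", message["content"][:40])
```

With this layout the original request appears exactly once, however many retries are needed.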


### Define a function to handle multiple retries in a feedback loop

In [None]:
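# Define a function to automatically retry an LLM call multiple times
def validate_llm_response(
    prompt: str,
    data_model: type[BaseModel],
    n_retry: int = 5,
    model: str = "gpt-4o",
) -> tuple[BaseModel | None, str | None]:
    """Call the LLM and retry with error feedback until the response validates.

    Args:
        prompt (str): The initial prompt to send to the LLM.
        data_model (type[BaseModel]): The Pydantic model the response must satisfy.
        n_retry (int, optional): Maximum number of retries after the initial
            attempt. Defaults to 5.
        model (str, optional): The model identifier to use. Defaults to ``"gpt-4o"``.

    Returns:
        tuple[BaseModel | None, str | None]: The validated model instance and
        ``None`` on success, or ``None`` and an error message once all retries
        are exhausted.

    """
    # Initial LLM call
    response_content = call_llm(prompt, model=model)
    current_prompt = prompt

    # Try to validate with the model
    # attempt: 0=initial, 1=first retry, ...
    for attempt in range(n_retry + 1):
        validated_data, validation_error = validate_with_model(
            data_model, response_content
        )

        if validation_error:
            if attempt < n_retry:
                print(f"retry {attempt} of {n_retry} failed, trying again...")
            else:
                print(f"Max retries reached. Last error: {validation_error}")
                return None, (f"Max retries reached. Last error: {validation_error}")

            # Each retry prompt wraps the previous prompt, so prompts grow per retry
            validation_retry_prompt = create_retry_prompt(
                original_prompt=current_prompt,
                original_response=response_content,
                error_message=validation_error,
            )
            response_content = call_llm(validation_retry_prompt, model=model)
            current_prompt = validation_retry_prompt
            continue

        # If you get here, both parsing and validation succeeded
        return validated_data, None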

In [None]:
# Test your complete solution with the original prompt
validated_data, error = validate_llm_response(prompt, CustomerQuery)

❌ The response produced a validation or parsing error: 1 validation error for CustomerQuery
category
  Input should be 'refund_request', 'information_request' or 'other' [type=literal_error, input_value='account_issue', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/literal_error
retry 0 of 5 failed, trying again...
✅ Validation successful!
{
  "name": "Joe User",
  "email": "joe.user@example.com",
  "query": "I forgot my password.",
  "order_id": null,
  "purchase_date": null,
  "priority": "high",
  "category": "information_request",
  "is_complaint": false,
  "tags": [
    "password",
    "account",
    "support"
  ]
}
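
Because the run above depends on a live model, its behavior changes from call to call. To see the control flow of a retry loop deterministically, it can help to swap the LLM for a stub. The sketch below is a simplified stand-in, not the notebook's `validate_llm_response`: `retry_until_valid`, `check_category`, and `fake_llm` are hypothetical names, the check covers only the `category` field, and the retry prompt is much shorter than the one built by `create_retry_prompt`.

```python
import json
from typing import Callable, Optional

ALLOWED_CATEGORIES = ("refund_request", "information_request", "other")


def check_category(raw: str) -> Optional[str]:
    """Return an error message, or None when the response is acceptable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as err:
        return f"Invalid JSON: {err}"
    if not isinstance(data, dict) or data.get("category") not in ALLOWED_CATEGORIES:
        return f"category must be one of {ALLOWED_CATEGORIES}"
    return None


def retry_until_valid(
    prompt: str,
    call: Callable[[str], str],
    check: Callable[[str], Optional[str]],
    n_retry: int = 2,
) -> tuple:
    """Return (response, retries_used) on success or (None, n_retry) on failure."""
    response = call(prompt)
    for attempt in range(n_retry + 1):
        error = check(response)
        if error is None:
            return response, attempt
        if attempt < n_retry:
            # Feed the failed response and its error back to the model
            response = call(
                f"{prompt}\n\nPrevious response:\n{response}\n\nError:\n{error}"
            )
    return None, n_retry


# Stand-in for call_llm: the first reply is invalid, the second is valid
canned_replies = iter(['{"category": "account_help"}', '{"category": "other"}'])


def fake_llm(prompt: str) -> str:
    return next(canned_replies)


result, retries_used = retry_until_valid(
    "Classify the query.", fake_llm, check_category
)
print(result, retries_used)
```

The stub makes it easy to test edge cases, such as a model that never produces a valid answer, without spending API calls.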


### Have a look at the JSON schema of your CustomerQuery data model

In [None]:
# Investigate the model_json_schema for CustomerQuery
data_model_schema = json.dumps(CustomerQuery.model_json_schema(), indent=2)
print(data_model_schema)

{
  "description": "Represents a structured customer query with classification metadata.\n\nThis model extends `UserInput` by adding fields necessary for automated\nprocessing, routing, and prioritization within a customer support system.\nIt enriches the raw user input with structured data like priority,\ncategory, and tags.\n\nAttributes:\n    priority: The urgency of the query. Must be one of 'low', 'medium',\n        or 'high'.\n    category: The general topic of the query. Must be one of\n        'refund_request', 'information_request', or 'other'.\n    is_complaint: A boolean flag indicating if the query should be\n        treated as a formal complaint.\n    tags: A list of keywords that provide additional context for filtering\n        and analysis.\n\nExample:\n    A typical use case involves parsing a dictionary of data, perhaps from\n    an API request or a form submission, into a validated model instance.\n\n    >>> query_data = {\n    ...     \"name\": \"John Doe\",\n    ..

In [None]:
# Print the original prompt from above
print(prompt)


Please analyze this user query
 {
  "name": "Joe User",
  "email": "joe.user@example.com",
  "query": "I forgot my password.",
  "order_id": null,
  "purchase_date": null
}:

Return your analysis as a JSON object matching this exact structure 
and data types:
{'name': 'Example User', 'email': 'user@example.com', 'query': 'I ordered a new computer monitor and it arrived with the screen cracked. I need to exchange it for a new one.', 'order_id': 12345, 'purchase_date': '2025-12-31', 'priority': 'medium', 'category': 'refund_request', 'is_complaint': True, 'tags': ['monitor', 'support', 'exchange']}

Respond ONLY with valid JSON. Do not include any explanations or 
other text or formatting before or after the JSON object.



### Construct a new prompt using the JSON schema of your data model

In [None]:
# Create new prompt with user input and model_json_schema
prompt = f"""
Please analyze this user query\n {user_input.model_dump_json(indent=2)}:

Return your analysis as a JSON object matching the following schema:
{data_model_schema}

Respond ONLY with valid JSON. Do not include any explanations or 
other text or formatting before or after the JSON object.
"""

In [None]:
# Run your validate_llm_response function with the new prompt
final_analysis, error = validate_llm_response(prompt, CustomerQuery)

✅ Validation successful!
{
  "name": "Joe User",
  "email": "joe.user@example.com",
  "query": "I forgot my password.",
  "order_id": null,
  "purchase_date": null,
  "priority": "medium",
  "category": "information_request",
  "is_complaint": false,
  "tags": [
    "password",
    "account_access",
    "support"
  ]
}
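
The schema printed earlier carries the full `CustomerQuery` class docstring, doctest example included, in its top-level `description`, and all of that text is sent to the model inside the schema-based prompt. If prompt size matters, one option is to drop that single key and keep the per-field descriptions. The sketch below works on a plain dict; `sample_schema` is a hand-written, abridged stand-in rather than real Pydantic output, and `drop_model_docstring` is a hypothetical helper name.

```python
import json


def drop_model_docstring(schema: dict) -> dict:
    """Return a copy of `schema` without its top-level description."""
    trimmed = dict(schema)  # shallow copy, so the caller's dict is unchanged
    trimmed.pop("description", None)
    return trimmed


# Abridged, hand-written stand-in for a model's JSON schema
sample_schema = {
    "title": "CustomerQuery",
    "description": "Represents a structured customer query...\n\nExample:\n    >>> ...",
    "type": "object",
    "properties": {
        "category": {
            "description": "Query category",
            "enum": ["refund_request", "information_request", "other"],
            "type": "string",
        }
    },
    "required": ["category"],
}

trimmed_schema = drop_model_docstring(sample_schema)
print(json.dumps(trimmed_schema, indent=2))
```

In the notebook, the same helper could be applied to `CustomerQuery.model_json_schema()` before calling `json.dumps`.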


---

## Conclusion

In this lesson, you explored how to combine Pydantic models with retry logic to reliably extract structured data from LLM outputs. You practiced building reusable validation functions and prompts, and saw how robust error handling can help you get consistent, usable results from language models. These techniques will help you confidently scale up your LLM-powered workflows.

### Overall Focus of the Lesson: Reliability Engineering for LLMs

The central focus of this lesson is not just "using Pydantic with an LLM". It is **solving the fundamental problem of language model unpredictability**.

LLMs are probabilistic models; they do not guarantee that the output will always be correct or in the exact format you need. This lesson teaches a crucial design pattern that we can call a **"Self-Correction Loop for Data Extraction"**.

The lesson demonstrates two strategies for achieving this reliability:

1.  **Programmatic Correction (Retry Loop)**: Accept that the LLM will make mistakes and build a system that:
    * **Validates** the output rigorously (with Pydantic).
    * **Diagnoses** the error (by catching the `ValidationError`).
    * **Provides feedback** to the LLM, asking it to correct its own error.
    * **Automates** this cycle until the result validates or the retry limit is reached.

2.  **Prevention through Prompt Engineering (JSON Schema)**: The second part of the lesson shows that the best way to fix an error is to keep it from happening in the first place. By replacing a vague example with a formal **JSON Schema** in the prompt, you give the LLM precise, unambiguous instructions, which makes an initial error much less likely.

In short, the lesson teaches you to treat an LLM's output not as a final result, but as a first attempt that must be validated and, if necessary, refined through a feedback loop.

### Application in LangChain

LangChain was created precisely to abstract and simplify the patterns you built by hand in this lesson.

* **Output Parsers**: The idea of validating and formatting an LLM's output is so central that LangChain has an entire component dedicated to it: `OutputParsers`.
    * `PydanticOutputParser` does what the second part of the lesson teaches: it takes your Pydantic model, generates formatting instructions for the prompt (often using the JSON Schema), and then validates the LLM's response, converting it into an instance of your model.
    * `RetryWithErrorOutputParser` implements the retry logic you built in the `validate_llm_response` function. It can be combined with another parser (such as the Pydantic parser) and, if the initial validation fails, it automatically creates a new prompt with the error message and calls the LLM again.

Understanding this lesson means you now understand *what* these LangChain components are doing under the hood, which lets you use them far more effectively.

### Application in LangGraph

In LangGraph, where you build multi-step applications as a graph, the reliability of each step is even more critical.

* **Robust Nodes**: The `validate_llm_response` function you studied is a good recipe for a **robust node** in a graph. You can create a "Data Extraction Node" that receives text from the graph's `State` and is responsible for returning the structured data. The whole self-correction loop (call, validate, retry) would live inside this node.
* **Conditional Edges**: The strength of LangGraph lies in explicit control of the flow. After your "Extraction Node" runs, a conditional edge can check: "Did the extraction succeed?"
    * **If yes**: The graph proceeds to the next step (e.g., "Make a Decision Based on the Data").
    * **If no** (even after the retries): The graph can be routed to a failure path, such as an "Escalate to a Human" node or one that tries a completely different approach.

This lesson provides the pattern for building the reliable building blocks (nodes) that a working LangGraph graph depends on.
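
The success-or-failure branch described above can be sketched in plain Python. The block below deliberately does not import LangGraph: it only shows the shape of a routing function that inspects the state an extraction node left behind and names the next node. The state keys (`raw_text`, `extracted`, `error`) and the node names are assumptions for illustration; in a real graph, a function like this would be registered as the condition of a conditional edge (for example through `add_conditional_edges`), and that wiring is omitted here.

```python
from typing import Optional, TypedDict


class ExtractionState(TypedDict):
    """Hypothetical graph state shared between nodes."""

    raw_text: str
    extracted: Optional[dict]
    error: Optional[str]


def route_after_extraction(state: ExtractionState) -> str:
    """Pick the next node based on whether extraction succeeded."""
    if state["error"] is None and state["extracted"] is not None:
        return "make_decision"
    # Extraction failed even after the node's internal retries
    return "escalate_to_human"


success_state: ExtractionState = {
    "raw_text": "I forgot my password.",
    "extracted": {"category": "information_request"},
    "error": None,
}
failure_state: ExtractionState = {
    "raw_text": "I forgot my password.",
    "extracted": None,
    "error": "Max retries reached.",
}
print(route_after_extraction(success_state))
print(route_after_extraction(failure_state))
```

Keeping the retry loop inside the node and the branching in a separate function means each piece can be tested on its own.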

### How to Use This Idea to Build Better Agents

Agents are the most advanced expression of these ideas. An agent constantly has to interpret the world, the output of its tools, and its own "thinking" in order to make decisions.

1.  **Reliable Tool Calling**: When an agent decides to use a tool, the choice of which tool to call and with which arguments is itself LLM output. Applying the pattern from this lesson ensures that the arguments generated for the tool are valid *before* the tool runs. If the agent "hallucinates" an invalid argument, a retry loop can push it to correct its own reasoning.

2.  **Parsing Tool Output**: When a tool returns information (e.g., the result of an API lookup), that information can be complex. The agent needs to "understand" this output to plan its next step. Applying structured extraction and validation to the output ensures that the agent bases its next decision on clean, correct data rather than on a misreading of a complex string.

3.  **Self-Correcting Agents**: The concept behind `create_retry_prompt` is the seed of an agent that can reflect on its own mistakes. An advanced agent that hits an error (from a tool or from its own plan) can use a "meta-prompt" similar to the retry prompt to analyze the error, the context, and the goal, and then formulate a new plan of action.
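
The first point, validating tool arguments before the tool runs, can be illustrated with a small standard-library sketch. Everything here is hypothetical: `get_order_status` stands in for a real tool, `check_order_args` mirrors the 10000-99999 range that `UserInput.order_id` enforces, and `run_tool_call` returns a `(result, error)` pair in the same spirit as `validate_with_model`, so the error text could be fed back to the model in a retry prompt.

```python
import json
from typing import Optional


def get_order_status(order_id: int) -> str:
    """Hypothetical tool that looks up an order."""
    return f"Order {order_id}: shipped"


def check_order_args(args: dict) -> Optional[str]:
    """Return an error message, or None when the arguments are valid."""
    order_id = args.get("order_id")
    # bool is a subclass of int, so reject it explicitly
    if isinstance(order_id, bool) or not isinstance(order_id, int):
        return "order_id must be an integer"
    if not 10000 <= order_id <= 99999:
        return "order_id must be between 10000 and 99999"
    return None


def run_tool_call(raw_arguments: str) -> tuple:
    """Validate model-generated arguments, then call the tool.

    Returns (result, None) on success or (None, error_message) on failure.
    """
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as err:
        return None, f"Arguments are not valid JSON: {err}"
    if not isinstance(args, dict):
        return None, "Arguments must be a JSON object"
    error = check_order_args(args)
    if error is not None:
        # The tool is never executed with invalid input
        return None, error
    return get_order_status(args["order_id"]), None


print(run_tool_call('{"order_id": 12345}'))
print(run_tool_call('{"order_id": 42}'))
```

In a Pydantic-based agent, the hand-written check would typically be replaced by a model for the tool's arguments, validated the same way `CustomerQuery` is validated in this notebook.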

In conclusion, the lesson teaches the fundamental principle that turns an AI prototype into a production system: **never blindly trust the output of an LLM; instead, validate it rigorously and build mechanisms that let the system recover from failures on its own.**