# Structured output from documents

This example uses the Pydantic AI framework together with the **Docling MCP** tools, which it reaches over the `streamable-http` transport.

The task is to find and extract the hardware details from the datasheet of a single-board computer.

### Tools

- `docling/convert_document_into_docling_document` for converting PDF documents
- `docling/export_docling_document_to_markdown` for exporting to markdown

### Prerequisites

Before starting this notebook, ensure that you have:

1. Followed the instructions in the [Llama Stack README](../llama-stack/README.md) to set up the following resources:
   - An inference model served with [LM Studio](https://lmstudio.ai/)
   - A Llama Stack server running the starter template [distribution-starter](https://hub.docker.com/r/llamastack/distribution-starter)
2. Started the Docling MCP server with the `conversion` and `generation` groups. See the details in the [README](./README.md).

You may want to create a virtual environment to run this notebook, for instance, with [uv](https://docs.astral.sh/uv/).

```bash
uv venv
source .venv/bin/activate
uv pip install ipykernel rich pydantic "pydantic-ai-slim[openai,mcp]"
```

#### Utilities

```python
# This is needed to run asyncio functions in notebooks
import nest_asyncio

nest_asyncio.apply()
```
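
The reason for the patch can be reproduced with the standard library alone. The sketch below is meant to be run as a plain script, outside the notebook and without `nest_asyncio`: it tries to drive the event loop from code that is already running inside it, which is the situation a blocking call ends up in within a notebook cell. The function names are illustrative and not part of the original notebook.

```python
import asyncio


async def answer() -> int:
    return 42


async def called_from_running_loop() -> str:
    # Inside a coroutine the event loop is already running, as in a notebook cell.
    loop = asyncio.get_running_loop()
    coro = answer()
    try:
        # A blocking "sync" wrapper would try to drive the loop like this.
        loop.run_until_complete(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"RuntimeError: {exc}"
    return "no error"


outcome = asyncio.run(called_from_running_loop())
print(outcome)
```

Without the patch, the nested call is expected to fail with a `RuntimeError`. Once `nest_asyncio.apply()` has run, event loops become re-entrant, so the same nested call should complete instead.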

```python
# Utilities for printing results

from rich.pretty import pprint
from rich.console import Console
from pydantic_ai.messages import ModelMessage, ModelRequest

console = Console()

VERBOSE_STEPS = True


def print_user(text: str):
    console.print(f"👤 [cyan]{text}[/cyan]")


def print_assistant(text: str):
    console.print(f"🤖 [green]{text}[/green]")


def print_steps(messages: list[ModelMessage]):
    if not VERBOSE_STEPS:
        return

    step_nr = 0
    for message in messages:
        # Each new request sent to the model starts a new reasoning step
        if isinstance(message, ModelRequest):
            step_nr += 1
            console.print(f"[orange]----- 📍 Reasoning step {step_nr} -----[/orange]")
        console.print(message)
```

## Pydantic AI agents

```python
from pydantic_ai.models.openai import OpenAIResponsesModel
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.mcp import MCPServerStreamableHTTP

docling_mcp = MCPServerStreamableHTTP(url="http://localhost:8000/mcp")

model = OpenAIResponsesModel(
    model_name="lms/openai/gpt-oss-20b",  # name of the model configured in Llama Stack
    provider=OpenAIProvider(
        base_url="http://localhost:8321/v1/openai/v1/"
    ),  # Pointing to the local Llama Stack OpenAI-compatible API
)
```
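
Both URLs above assume services that are already running locally, so it can help to check them before invoking the agent. The following is a minimal sketch using only the standard library. The helper `endpoint_is_reachable` and the `ENDPOINTS` mapping are illustrative names, not part of Pydantic AI or Docling, and the check only confirms that the TCP port accepts connections, not that the service behind it is healthy.

```python
import socket
from urllib.parse import urlparse


def endpoint_is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


# URLs copied from the configuration used in this notebook.
ENDPOINTS = {
    "Docling MCP server": "http://localhost:8000/mcp",
    "Llama Stack server": "http://localhost:8321/v1/openai/v1/",
}

for name, url in ENDPOINTS.items():
    status = "reachable" if endpoint_is_reachable(url) else "NOT reachable"
    print(f"{name} ({url}): {status}")
```

If either endpoint is reported as not reachable, revisit the prerequisites before running the agent.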

```python
from typing import Annotated
from pydantic import BaseModel, Field

from pydantic_ai import Agent

# Define the agent
agent = Agent(model=model, toolsets=[docling_mcp])


# Define the schema
class HardwareSpecs(BaseModel):
    cpu_name: Annotated[str, Field(description="Name of the CPU")]
    cpu_num_cores: Annotated[int, Field(description="Number of CPU cores")]

    memory_type: Annotated[str, Field(description="The type of RAM memory")]
    memory_size_gb: Annotated[
        int, Field(description="The size of the RAM memory in GB")
    ]
    memory_channels: Annotated[int, Field(description="Number of RAM memory channels")]

    flash_type: Annotated[str, Field(description="The type of flash memory")]
    flash_size_gb: Annotated[
        int, Field(description="The size of the flash memory in GB")
    ]


# Run the agent
prompt = (
    "Convert the document on https://www.xes-inc.com/assets/products/datasheets/XPedite7871-DS.pdf "
    "to DoclingDocument. And use its markdown representation for extracting "
    "the hardware specs of the card."
)

print_user(prompt)
result = agent.run_sync(prompt, output_type=HardwareSpecs)
print_steps(result.new_messages())

# Final result
console.print("----- 🎯 Final answer -----")
console.print("[yellow]The agent 🤖 extracted the following content:[/yellow]")
pprint(result.output)
```
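
To see what passing a Pydantic model as `output_type` buys, it can help to look at the Pydantic side in isolation. The sketch below uses a hypothetical, trimmed-down copy of the schema named `MiniSpecs` and made-up placeholder values (not taken from the datasheet). It shows the JSON Schema generated from the annotated fields and how a payload with a wrong type is rejected. It does not call the agent or any MCP tool.

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError


# Hypothetical, trimmed-down copy of HardwareSpecs, for illustration only.
class MiniSpecs(BaseModel):
    cpu_name: Annotated[str, Field(description="Name of the CPU")]
    cpu_num_cores: Annotated[int, Field(description="Number of CPU cores")]
    memory_size_gb: Annotated[
        int, Field(description="The size of the RAM memory in GB")
    ]


# JSON Schema describing the expected structure, including field descriptions.
schema = MiniSpecs.model_json_schema()
print(schema["required"])
print(schema["properties"]["cpu_num_cores"])

# Placeholder payload with made-up values (NOT taken from the datasheet).
good = MiniSpecs.model_validate(
    {"cpu_name": "Example CPU", "cpu_num_cores": 4, "memory_size_gb": 8}
)
print(good)

# A payload with a wrong type is rejected instead of being silently accepted.
try:
    MiniSpecs.model_validate(
        {"cpu_name": "Example CPU", "cpu_num_cores": "four", "memory_size_gb": 8}
    )
except ValidationError as exc:
    rejected_fields = [err["loc"][0] for err in exc.errors()]
    print("Rejected fields:", rejected_fields)
```

The field descriptions end up in the schema, which is why descriptive `Field(description=...)` texts are worth writing: they are the main hint the model gets about what each field should contain.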