# A Simple Guide to Structured Outputs

<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/structured_outputs/structured_outputs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This is a simple guide to structured outputs with LLMs. At a high level, we can attach a Pydantic class to any LLM and have it return natively structured output, even when the LLM is used inside higher-level modules.

We start with the basic LLM syntax, then show how to plug a structured LLM into query pipelines and into higher-level modules such as a query engine or agent.

A lot of the underlying behavior around structured outputs is powered by our Pydantic Program modules. Check out our [in-depth structured outputs guide](https://docs.llamaindex.ai/en/stable/module_guides/querying/structured_outputs/) for more details.

In [1]:
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
import nest_asyncio

nest_asyncio.apply()

In [3]:
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings

llm = OpenAI(model="gpt-4o")
embed_model = OpenAIEmbedding(model="text-embedding-3-small")
Settings.llm = llm
Settings.embed_model = embed_model

## 1. Simple Structured Extraction

You can convert any LLM to a "structured LLM" by attaching an output class to it through `as_structured_llm`.

Here we pass a simple `Album` class which contains a list of songs. We can then use the normal LLM endpoints like chat/complete.

**NOTE**: Both async and streaming are supported; the sync, async, streaming, and async streaming variants are each shown in this section.

In [4]:
from typing import List
from pydantic.v1 import BaseModel, Field


class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]
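To make the idea concrete, here is a rough sketch of the step a structured LLM performs on the model's reply: it takes JSON text and validates it into typed objects. This is not how LlamaIndex or Pydantic implement it. The sketch swaps in standard-library dataclasses for the Pydantic models so that it runs without dependencies, and `SongRecord`, `AlbumRecord`, and `parse_album` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class SongRecord:
    title: str
    length_seconds: int


@dataclass
class AlbumRecord:
    name: str
    artist: str
    songs: List[SongRecord]


def parse_album(raw_json: str) -> AlbumRecord:
    """Parse a JSON string and build typed records, rejecting bad field types."""
    data = json.loads(raw_json)
    songs = []
    for item in data["songs"]:
        if not isinstance(item["length_seconds"], int):
            raise TypeError("length_seconds must be an integer")
        songs.append(
            SongRecord(title=str(item["title"]), length_seconds=item["length_seconds"])
        )
    return AlbumRecord(name=str(data["name"]), artist=str(data["artist"]), songs=songs)


# Placeholder data for illustration, not real model output.
raw = (
    '{"name": "Demo Album", "artist": "Demo Artist", '
    '"songs": [{"title": "Track One", "length_seconds": 180}]}'
)
album = parse_album(raw)
print(album.songs[0].title)
```

Pydantic does this validation (and much more) automatically from the class definition, which is why only the model class needs to be passed to `as_structured_llm`.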

In [5]:
from llama_index.core.llms import ChatMessage

sllm = llm.as_structured_llm(output_cls=Album)
input_msg = ChatMessage.from_str("Generate an example album from The Shining")

#### Sync

In [6]:
output = sllm.chat([input_msg])
# get actual object
output_obj = output.raw

In [7]:
print(str(output))
print(output_obj)

assistant: {"name": "The Shining: Music from the Motion Picture", "artist": "Various Artists", "songs": [{"title": "Main Title (The Shining)", "length_seconds": 180}, {"title": "Rocky Mountains", "length_seconds": 210}, {"title": "Lontano", "length_seconds": 720}, {"title": "Music for Strings, Percussion and Celesta", "length_seconds": 540}, {"title": "Utrenja (Ewangelia)", "length_seconds": 300}, {"title": "The Awakening of Jacob", "length_seconds": 480}, {"title": "De Natura Sonoris No. 2", "length_seconds": 600}, {"title": "Home", "length_seconds": 240}, {"title": "Midnight, the Stars and You", "length_seconds": 180}]}
name='The Shining: Music from the Motion Picture' artist='Various Artists' songs=[Song(title='Main Title (The Shining)', length_seconds=180), Song(title='Rocky Mountains', length_seconds=210), Song(title='Lontano', length_seconds=720), Song(title='Music for Strings, Percussion and Celesta', length_seconds=540), Song(title='Utrenja (Ewangelia)', length_seconds=300), So

#### Async

In [8]:
output = await sllm.achat([input_msg])
# get actual object
output_obj = output.raw
print(str(output))

assistant: {"name": "The Shining: Music from the Motion Picture", "artist": "Various Artists", "songs": [{"title": "Main Title (The Shining)", "length_seconds": 180}, {"title": "Rocky Mountains", "length_seconds": 210}, {"title": "Lontano", "length_seconds": 720}, {"title": "Music for Strings, Percussion and Celesta", "length_seconds": 540}, {"title": "Utrenja (Excerpt)", "length_seconds": 300}, {"title": "The Awakening of Jacob", "length_seconds": 480}, {"title": "De Natura Sonoris No. 2", "length_seconds": 600}, {"title": "Home", "length_seconds": 240}]}
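The top-level `await` above works because a notebook cell already runs inside an event loop. In a plain Python script, the same call would typically be wrapped in a coroutine and started with `asyncio.run`. A minimal sketch of that pattern, where `fake_achat` is a hypothetical stand-in for `sllm.achat` and no model is called:

```python
import asyncio


async def fake_achat(message: str) -> str:
    """Hypothetical stand-in for `sllm.achat`; returns a canned reply."""
    await asyncio.sleep(0)  # yield control once, as a real network call would
    return f"reply to: {message}"


async def main() -> str:
    # In real code this line would be `await sllm.achat([input_msg])`.
    return await fake_achat("Generate an example album")


# asyncio.run creates the event loop, runs main(), and closes the loop.
result = asyncio.run(main())
print(result)
```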


#### Streaming

In [9]:
from IPython.display import clear_output
from pprint import pprint

stream_output = sllm.stream_chat([input_msg])
for partial_output in stream_output:
    clear_output(wait=True)
    pprint(partial_output.raw.dict())

# the last partial output holds the complete object
output_obj = partial_output.raw
print(str(partial_output))

{'artist': 'Various Artists',
 'name': 'The Shining: Music from the Motion Picture',
 'songs': [{'length_seconds': 180, 'title': 'Main Title (The Shining)'},
           {'length_seconds': 210, 'title': 'Rocky Mountains'},
           {'length_seconds': 720, 'title': 'Lontano'},
           {'length_seconds': 540,
            'title': 'Music for Strings, Percussion and Celesta'},
           {'length_seconds': 600, 'title': 'Utrenja'},
           {'length_seconds': 480, 'title': 'The Awakening of Jacob'},
           {'length_seconds': 540, 'title': 'De Natura Sonoris No. 2'},
           {'length_seconds': 300, 'title': 'Home'},
           {'length_seconds': 240, 'title': 'Heartbeats and Worry'},
           {'length_seconds': 360, 'title': 'The Overlook'}]}
assistant: {"name": "The Shining: Music from the Motion Picture", "artist": "Various Artists", "songs": [{"title": "Main Title (The Shining)", "length_seconds": 180}, {"title": "Rocky Mountains", "length_seconds": 210}, {"title": "Lontan
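The streaming loop relies on each streamed item being a progressively more complete version of the object, so whatever is left in the loop variable afterwards is the final result. A dependency-free sketch of that pattern, assuming a hypothetical `fake_stream` generator in place of `sllm.stream_chat` and plain dicts in place of `Album` objects:

```python
from typing import Dict, Iterator, List


def fake_stream(titles: List[str]) -> Iterator[Dict]:
    """Yield a growing snapshot of the album after each new song (simulated)."""
    songs: List[Dict] = []
    for title in titles:
        songs.append({"title": title})
        # Yield a copy of the list so earlier snapshots are not changed later.
        yield {"name": "Demo Album", "songs": list(songs)}


snapshots = []
latest = None
for partial in fake_stream(["Track One", "Track Two", "Track Three"]):
    snapshots.append(len(partial["songs"]))
    latest = partial  # after the loop, this is the complete object

print(snapshots)
print(latest["songs"][-1]["title"])
```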

#### Async Streaming

In [10]:
from IPython.display import clear_output
from pprint import pprint

stream_output = await sllm.astream_chat([input_msg])
async for partial_output in stream_output:
    clear_output(wait=True)
    pprint(partial_output.raw.dict())

{'artist': 'Various Artists',
 'name': 'The Shining: Original Soundtrack',
 'songs': [{'length_seconds': 180, 'title': 'Main Title'},
           {'length_seconds': 210, 'title': 'Rocky Mountains'},
           {'length_seconds': 720, 'title': 'Lontano'},
           {'length_seconds': 540,
            'title': 'Music for Strings, Percussion and Celesta'},
           {'length_seconds': 300, 'title': 'Utrenja (Excerpt)'},
           {'length_seconds': 480, 'title': 'The Awakening of Jacob'},
           {'length_seconds': 540, 'title': 'De Natura Sonoris No. 2'},
           {'length_seconds': 180, 'title': 'Home'},
           {'length_seconds': 180, 'title': 'Midnight, the Stars and You'}]}


### 1.b Example using Query Pipelines

You can plug structured LLMs into query pipelines; the pipeline then returns the structured object directly.

In [11]:
# use query pipelines
from llama_index.core.prompts import ChatPromptTemplate
from llama_index.core.query_pipeline import QueryPipeline as QP
from llama_index.core.llms import ChatMessage

chat_prompt_tmpl = ChatPromptTemplate(
    message_templates=[
        ChatMessage.from_str(
            "Generate an example album from {movie_name}", role="user"
        )
    ]
)

qp = QP(chain=[chat_prompt_tmpl, sllm])
response = qp.run(movie_name="Inside Out")
response

Album(name='Inside Out Soundtrack', artist='Michael Giacchino', songs=[Song(title='Bundle of Joy', length_seconds=150), Song(title='Team Building', length_seconds=120), Song(title='Nomanisone Island/National Movers', length_seconds=180), Song(title='Overcoming Sadness', length_seconds=140), Song(title='Free Skating', length_seconds=130), Song(title='First Day of School', length_seconds=160), Song(title='Riled Up', length_seconds=110), Song(title='Goofball No Longer', length_seconds=170), Song(title='Memory Lanes', length_seconds=200), Song(title='The Forgetters', length_seconds=145), Song(title='Chasing the Pink Elephant', length_seconds=155), Song(title='Abstract Thought', length_seconds=135), Song(title='Imagination Land', length_seconds=175), Song(title='Down in the Dumps', length_seconds=165), Song(title='Dream Productions', length_seconds=190), Song(title='Dream a Little Nightmare', length_seconds=125), Song(title='The Subconscious Basement', length_seconds=185), Song(title='Escap

### 1.c Use the `structured_predict` Function

Instead of explicitly calling `llm.as_structured_llm(...)`, you can use the `structured_predict` function available on every LLM class. It takes a prompt template plus template variables and returns a structured output in one line of code.

In [12]:
# use structured_predict
from llama_index.core.prompts import ChatPromptTemplate
from llama_index.core.llms import ChatMessage
from llama_index.llms.openai import OpenAI

chat_prompt_tmpl = ChatPromptTemplate(
    message_templates=[
        ChatMessage.from_str(
            "Generate an example album from {movie_name}", role="user"
        )
    ]
)

llm = OpenAI(model="gpt-4o")
album = llm.structured_predict(
    Album, chat_prompt_tmpl, movie_name="Lord of the Rings"
)
album

Album(name='Songs of Middle-earth', artist='Various Artists', songs=[Song(title='The Shire', length_seconds=180), Song(title="The Fellowship's Journey", length_seconds=240), Song(title="Gollum's Theme", length_seconds=200), Song(title="The Battle of Helm's Deep", length_seconds=300), Song(title='Lament for Gandalf', length_seconds=150), Song(title="Rohan's Call", length_seconds=210), Song(title="The Ring's Temptation", length_seconds=190), Song(title="Mordor's Shadow", length_seconds=220), Song(title='The Return of the King', length_seconds=260), Song(title='Into the West', length_seconds=230)])
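Conceptually, `structured_predict` folds three steps into one call: fill the prompt template, call the model, and parse the reply into the output class. The sketch below illustrates that flow only and is not the library's implementation. `fake_llm` is a hypothetical stand-in that returns canned JSON, and `structured_predict_sketch` and `parse_album_header` are illustrative names.

```python
import json
from typing import Callable, Dict


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns canned JSON."""
    return json.dumps(
        {"name": "Demo Album", "artist": "Demo Artist", "prompt_seen": prompt}
    )


def structured_predict_sketch(
    parse: Callable[[Dict], Dict],
    template: str,
    llm: Callable[[str], str] = fake_llm,
    **template_vars: str,
) -> Dict:
    prompt = template.format(**template_vars)  # 1. fill the template
    reply = llm(prompt)  # 2. call the model
    return parse(json.loads(reply))  # 3. parse into the target shape


def parse_album_header(data: Dict) -> Dict:
    """Keep only the expected fields; a missing field raises KeyError."""
    return {key: data[key] for key in ("name", "artist")}


result = structured_predict_sketch(
    parse_album_header,
    "Generate an example album from {movie_name}",
    movie_name="Lord of the Rings",
)
print(result)
```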

## 2. Plug into RAG Pipeline

You can also plug a structured LLM into a RAG pipeline. Below we show structured extraction from an Apple 10-K report.

In [13]:
!mkdir -p data
!wget "https://s2.q4cdn.com/470004039/files/doc_financials/2021/q4/_10-K-2021-(As-Filed).pdf" -O data/apple_2021_10k.pdf

--2024-08-08 14:07:37--  https://s2.q4cdn.com/470004039/files/doc_financials/2021/q4/_10-K-2021-(As-Filed).pdf
Resolving s2.q4cdn.com (s2.q4cdn.com)... 197.189.228.114
Connecting to s2.q4cdn.com (s2.q4cdn.com)|197.189.228.114|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 789896 (771K) [application/pdf]
Saving to: 'data/apple_2021_10k.pdf'


2024-08-08 14:07:44 (134 KB/s) - 'data/apple_2021_10k.pdf' saved [789896/789896]



#### Option 1: Use LlamaParse

You will need an account at https://cloud.llamaindex.ai/ and an API key to use LlamaParse, our document parser, which we use here to parse the 10-K filing.

In [14]:
from llama_parse import LlamaParse

# import os
# os.environ["LLAMA_CLOUD_API_KEY"] = "llx-..."
orig_docs = LlamaParse(result_type="text").load_data(
    "./data/apple_2021_10k.pdf"
)

Started parsing the file under job_id df000dbf-ffe3-4cf6-a14c-0024cc313167


In [15]:
from copy import deepcopy
from llama_index.core.schema import TextNode


def get_page_nodes(docs, separator="\n---\n"):
    """Split each document into page nodes, by separator."""
    nodes = []
    for doc in docs:
        doc_chunks = doc.text.split(separator)
        for doc_chunk in doc_chunks:
            node = TextNode(
                text=doc_chunk,
                metadata=deepcopy(doc.metadata),
            )
            nodes.append(node)

    return nodes


docs = get_page_nodes(orig_docs)
print(docs[0].get_content())

                                                              UNITED STATES
                                    SECURITIES AND EXCHANGE COMMISSION
                                                          Washington, D.C. 20549

                                                               FORM 10-K
(Mark One)
        ☒ ANNUAL REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934
                                                  For the fiscal year ended September 25, 2021
                                                                            or
     ☐ TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934
                                                For the transition period from               to          .
                                                       Commission File Number: 001-36743

                                                               Apple Inc.
                                              (Exac
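One possible refinement, which is not part of the original notebook, is to record a page number on each chunk so that answers can cite pages later. The sketch below shows only the splitting logic, using plain dicts rather than `TextNode`. The `split_pages` helper and the `page_number` metadata key are assumptions for illustration.

```python
from typing import Dict, List


def split_pages(text: str, separator: str = "\n---\n") -> List[Dict]:
    """Split text on the page separator and number the pages from 1."""
    return [
        {"text": chunk, "metadata": {"page_number": i}}
        for i, chunk in enumerate(text.split(separator), start=1)
    ]


# Placeholder text standing in for a parsed document.
pages = split_pages("cover page\n---\nrisk factors\n---\nnet sales table")
for page in pages:
    print(page["metadata"]["page_number"], page["text"])
```

With `TextNode`, the same idea would amount to adding the page number to the `metadata` dict passed to the constructor.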

#### Option 2: Use SimpleDirectoryReader

You can also choose to use the free PDF parser bundled into our `SimpleDirectoryReader`.

In [16]:
# OPTION 2: Use SimpleDirectoryReader
# from llama_index.core import SimpleDirectoryReader

# reader = SimpleDirectoryReader(input_files=["data/apple_2021_10k.pdf"])
# docs = reader.load_data()

#### Build RAG Pipeline, Define Structured Output Schema

We build a RAG pipeline with our trusty VectorStoreIndex and reranker module. We then define the output as a Pydantic model. This allows us to create a structured LLM with the output class attached.

In [17]:
from llama_index.core import VectorStoreIndex

# skip chunking since we're doing page-level chunking
index = VectorStoreIndex(docs)

In [18]:
# Requires the reranker integration and the FlagEmbedding package, e.g.:
#   pip install llama-index-postprocessor-flag-embedding-reranker
#   pip install git+https://github.com/FlagOpen/FlagEmbedding.git
from llama_index.postprocessor.flag_embedding_reranker import (
    FlagEmbeddingReranker,
)

reranker = FlagEmbeddingReranker(
    top_n=5,
    model="BAAI/bge-reranker-large",
)
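The retrieve-then-rerank pattern works in two stages: first fetch a candidate set by embedding similarity, then re-score each candidate against the query and keep the best `top_n`. The toy sketch below shows only the second stage. A simple word-overlap count stands in for a real reranking model such as `BAAI/bge-reranker-large`, and `overlap_score` and `rerank` are hypothetical helpers.

```python
from typing import List


def overlap_score(query: str, passage: str) -> int:
    """Toy relevance score: number of distinct query words found in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    return len(query_words & passage_words)


def rerank(query: str, passages: List[str], top_n: int) -> List[str]:
    """Sort candidates by score (highest first) and keep the top_n."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:top_n]


# Placeholder passages standing in for retrieved pages.
candidates = [
    "legal proceedings and risk factors",
    "net sales by product category",
    "net sales by reportable segment",
]
top = rerank("net sales for each product category", candidates, top_n=2)
print(top)
```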

In [19]:
from pydantic.v1 import BaseModel, Field
from typing import List


class Output(BaseModel):
    """Output containing the response, page numbers, and confidence."""

    response: str = Field(..., description="The answer to the question.")
    page_numbers: List[int] = Field(
        ...,
        description="The page numbers of the sources used to answer this question. Do not include a page number if the context is irrelevant.",
    )
    confidence: float = Field(
        ...,
        description="Confidence value between 0-1 of the correctness of the result.",
    )
    confidence_explanation: str = Field(
        ..., description="Explanation for the confidence score"
    )


sllm = llm.as_structured_llm(output_cls=Output)
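Note that the field descriptions guide the model but do not enforce anything: as declared, `confidence` accepts any float. A dependency-free sketch of a post-hoc sanity check on the result, where `check_output` is a hypothetical helper and the sample values are placeholders rather than real model output:

```python
from typing import Dict, List


def check_output(result: Dict) -> List[str]:
    """Return a list of problems found in a result dict (empty if none)."""
    problems = []
    if not 0.0 <= result["confidence"] <= 1.0:
        problems.append("confidence must be between 0 and 1")
    if any(page < 1 for page in result["page_numbers"]):
        problems.append("page numbers must be positive")
    if not result["response"].strip():
        problems.append("response must not be empty")
    return problems


# Placeholder values for illustration only, not real model output.
good = {"response": "placeholder answer", "page_numbers": [3, 7], "confidence": 0.8}
bad = {"response": " ", "page_numbers": [0], "confidence": 1.5}

print(check_output(good))
print(check_output(bad))
```

In Pydantic itself, the range constraint can instead be declared on the field with `ge=0, le=1`, so that out-of-range values fail validation.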

#### Run Queries

In [20]:
query_engine = index.as_query_engine(
    similarity_top_k=5,
    node_postprocessors=[reranker],
    llm=sllm,
    response_mode="tree_summarize",  # you can also select other modes like `compact`, `refine`
)


In [None]:
response = query_engine.query("Net sales for each product category in 2021")
print(str(response))

In [None]:
response.response.dict()