<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/output_parsing/openai_pydantic_program.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LLM Pydantic Program

This guide shows you how to generate structured data with our `LLMTextCompletionProgram`. Given an LLM and an output Pydantic class, the program generates a structured Pydantic object.

To define the target object, you can either specify `output_cls` directly, or pass a `PydanticOutputParser` or any other `BaseOutputParser` that generates a Pydantic object.

We demonstrate two settings:
- Extraction into an `Album` object (which can contain a list of `Song` objects)
- Extraction into a `DirectoryTree` object (which can contain recursive `Node` objects)

## Extraction into `Album`

This is a simple example of parsing an output into an `Album` schema, which can contain multiple songs.

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [None]:
!pip install llama-index

In [1]:
from pydantic import BaseModel
from typing import List

from llama_index.program import LLMTextCompletionProgram

Define output schema

In [2]:
class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]
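
To make concrete what "parsing an output into an `Album` schema" involves, here is a minimal sketch that uses only Pydantic and the standard library. It does not call an LLM or LlamaIndex: `raw_json` and `bad_data` are hypothetical stand-ins for text a model might return, and the classes are repeated so the snippet runs on its own.

```python
import json
from typing import List

from pydantic import BaseModel, ValidationError


class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]


# Hypothetical raw JSON text, standing in for what an LLM might return.
raw_json = """
{
  "name": "Sample Album",
  "artist": "Sample Artist",
  "songs": [
    {"title": "First Track", "length_seconds": 200},
    {"title": "Second Track", "length_seconds": 185}
  ]
}
"""

# Constructing the model validates every field and builds nested Song objects.
album = Album(**json.loads(raw_json))

# A song without "length_seconds" does not match the schema, so it is rejected.
bad_data = {
    "name": "Broken Album",
    "artist": "Nobody",
    "songs": [{"title": "Untitled"}],
}
try:
    Album(**bad_data)
    rejected = False
except ValidationError:
    rejected = True
```

Each entry in `songs` becomes a `Song` instance, and data that does not fit the schema raises a `ValidationError` instead of producing a partially filled object.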

Define the LLM Pydantic program

In [4]:
prompt_template_str = """\
Generate an example album, with an artist and a list of songs. \
Using the movie {movie_name} as inspiration.\
"""
program = LLMTextCompletionProgram.from_defaults(
    output_cls=Album,
    prompt_template_str=prompt_template_str,
    verbose=True,
)

Run the program to get structured output.

In [5]:
output = program(movie_name="The Shining")

The output is a valid Pydantic object that we can then use to call functions/APIs. 

In [6]:
output

Album(name='The Overlook', artist='Jack Torrance', songs=[Song(title='Redrum', length_seconds=240), Song(title="Here's Johnny", length_seconds=180), Song(title='Room 237', length_seconds=300), Song(title='All Work and No Play', length_seconds=210), Song(title='The Maze', length_seconds=270)])
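
As a sketch of what using the object to call functions can look like, the snippet below passes an `Album` to two helpers. `total_runtime_seconds` and `longest_song` are hypothetical functions written for this illustration, not part of LlamaIndex, and the `Album` is rebuilt by hand from the output shown above so that the snippet runs without an LLM call.

```python
from typing import List

from pydantic import BaseModel


class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]


def total_runtime_seconds(album: Album) -> int:
    """Hypothetical downstream helper: sum the lengths of all songs."""
    return sum(song.length_seconds for song in album.songs)


def longest_song(album: Album) -> Song:
    """Hypothetical downstream helper: return the song with the greatest length."""
    return max(album.songs, key=lambda song: song.length_seconds)


# Rebuilt by hand from the program output shown above.
output = Album(
    name="The Overlook",
    artist="Jack Torrance",
    songs=[
        Song(title="Redrum", length_seconds=240),
        Song(title="Here's Johnny", length_seconds=180),
        Song(title="Room 237", length_seconds=300),
        Song(title="All Work and No Play", length_seconds=210),
        Song(title="The Maze", length_seconds=270),
    ],
)

total = total_runtime_seconds(output)  # 1200 seconds for this album
longest = longest_song(output)  # the "Room 237" entry
```

Because the fields are typed, downstream code can rely on attribute access such as `song.length_seconds` rather than re-parsing strings.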

## Define a Custom Output Parser

Sometimes you may want to parse the LLM output into a JSON object in your own way.