<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/output_parsing/lmformatenforcer_pydantic_program.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LM Format Enforcer Pydantic Program

Generate structured data with [**lm-format-enforcer**](https://github.com/noamgat/lm-format-enforcer) via LlamaIndex.  


With lm-format-enforcer, you can guarantee that the output structure is correct by *forcing* the LLM to output only tokens that fit the desired format.  
This is especially helpful when you are using a lower-capacity model (e.g. the current open-source models), which would otherwise struggle to generate valid output that fits the desired output schema.
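
To make the idea of forcing tokens concrete, here is a conceptual toy sketch of constrained greedy decoding. It is not lm-format-enforcer's actual implementation: the vocabulary, the fixed scores in `TOKEN_SCORES`, and the two accepted strings in `VALID_OUTPUTS` are all hypothetical, and a real LLM would produce fresh scores at every step.

```python
from typing import Dict, List

# Hypothetical toy setup: a tiny vocabulary with fixed "model" scores.
# A real LLM would compute new scores for every decoding step.
TOKEN_SCORES: Dict[str, float] = {
    "Sure, ": 0.9,
    "maybe": 0.8,
    '{"answer": ': 0.3,
    '"yes"': 0.6,
    '"no"': 0.5,
    "}": 0.2,
}

# The only strings the toy "schema" accepts.
VALID_OUTPUTS: List[str] = ['{"answer": "yes"}', '{"answer": "no"}']


def allowed_tokens(prefix: str) -> List[str]:
    """Return tokens that keep prefix + token a prefix of some valid output."""
    return [
        tok
        for tok in TOKEN_SCORES
        if any(valid.startswith(prefix + tok) for valid in VALID_OUTPUTS)
    ]


def constrained_greedy_decode() -> str:
    """Greedy decoding that only ever picks from the allowed tokens."""
    text = ""
    while text not in VALID_OUTPUTS:
        candidates = allowed_tokens(text)
        text += max(candidates, key=TOKEN_SCORES.__getitem__)
    return text


# Without constraints, the highest-scoring first token is free-form text.
unconstrained_first = max(TOKEN_SCORES, key=TOKEN_SCORES.__getitem__)
constrained = constrained_greedy_decode()

print(repr(unconstrained_first))
print(constrained)
```

The model's preferred tokens are still used whenever they are allowed; the mask only removes choices that would make the output invalid.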

[lm-format-enforcer](https://github.com/noamgat/lm-format-enforcer) supports regular expressions and JSON Schema; this demo focuses on JSON Schema. For regular expressions, see the [sample regular expressions notebook](https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/output_parsing/lmformatenforcer_regular_expressions.ipynb).

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [None]:
%pip install llama-index-program-lmformatenforcer
%pip install llama-index-llms-llama-cpp

In [None]:
!pip install llama-index lm-format-enforcer llama-cpp-python

In [None]:
from typing import List

from pydantic import BaseModel, Field

from llama_index.program.lmformatenforcer import (
    LMFormatEnforcerPydanticProgram,
)

Define the output schema.

In [None]:
class Song(BaseModel):
    title: str
    length_seconds: int


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song] = Field(min_items=3, max_items=10)

Create the program. We use `LlamaCPP` as the LLM in this demo, but `HuggingFaceLLM` is also supported.

Note that the prompt template has two parameters:
- `movie_name`, which is supplied as a keyword argument when the program is called
- `json_schema`, which is automatically filled with the JSON Schema of the output class.
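
To see what the filled-in prompt looks like, the sketch below applies plain `str.format` to the same template. This is only an approximation of what the program does internally, and `album_schema` is an abbreviated, hand-written stand-in (an assumption) for the schema Pydantic would generate for `Album`.

```python
import json

# Template copied from the program definition in this notebook.
prompt_template_str = (
    "Your response should be according to the following json schema: \n"
    "{json_schema}\n"
    "Generate an example album, with an artist and a list of songs. Using"
    " the movie {movie_name} as inspiration. "
)

# Assumption: an abbreviated, hand-written stand-in for the schema that
# Pydantic would generate for Album; the real schema carries more detail.
album_schema = {
    "title": "Album",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "artist": {"type": "string"},
        "songs": {"type": "array", "minItems": 3, "maxItems": 10},
    },
    "required": ["name", "artist", "songs"],
}

# Fill both placeholders; the braces inside the JSON value are not re-parsed.
prompt = prompt_template_str.format(
    json_schema=json.dumps(album_schema),
    movie_name="The Shining",
)
print(prompt)
```

In the notebook itself you only pass `movie_name`; the `json_schema` value is supplied for you.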

In [None]:
from llama_index.llms.llama_cpp import LlamaCPP

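# No model path or URL is passed here; the log below shows a
# Llama 2 13B chat GGUF model being loaded.
llm = LlamaCPP()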

program = LMFormatEnforcerPydanticProgram(
    output_cls=Album,
    prompt_template_str=(
        "Your response should be according to the following json schema: \n"
        "{json_schema}\n"
        "Generate an example album, with an artist and a list of songs. Using"
        " the movie {movie_name} as inspiration. "
    ),
    llm=llm,
    verbose=True,
)

llama_model_loader: loaded meta data with 19 key-value pairs and 363 tensors from /mnt/wsl/PHYSICALDRIVE1p3/llama_index/models/llama-2-13b-chat.Q4_0.gguf (version GGUF V2 (latest))
llama_model_loader: - tensor    0:                token_embd.weight q4_0     [  5120, 32000,     1,     1 ]
llama_model_loader: - tensor    1:           blk.0.attn_norm.weight f32      [  5120,     1,     1,     1 ]
llama_model_loader: - tensor    2:            blk.0.ffn_down.weight q4_0     [ 13824,  5120,     1,     1 ]
llama_model_loader: - tensor    3:            blk.0.ffn_gate.weight q4_0     [  5120, 13824,     1,     1 ]
llama_model_loader: - tensor    4:              blk.0.ffn_up.weight q4_0     [  5120, 13824,     1,     1 ]
llama_model_loader: - tensor    5:            blk.0.ffn_norm.weight f32      [  5120,     1,     1,     1 ]
llama_model_loader: - tensor    6:              blk.0.attn_k.weight q4_0     [  5120,  5120,     1,     1 ]
llama_model_loader: - tensor    7:         blk.0.attn_output.we

Run the program to get structured output.  

In [None]:
output = program(movie_name="The Shining")


llama_print_timings:        load time = 21703.16 ms
llama_print_timings:      sample time =    45.01 ms /   134 runs   (    0.34 ms per token,  2976.92 tokens per second)
llama_print_timings: prompt eval time = 21703.02 ms /   223 tokens (   97.32 ms per token,    10.28 tokens per second)
llama_print_timings:        eval time = 20702.37 ms /   133 runs   (  155.66 ms per token,     6.42 tokens per second)
llama_print_timings:       total time = 43127.74 ms


The output is a valid Pydantic object that we can then use to call functions/APIs. 

In [None]:
output

Album(name='The Shining: A Musical Journey Through the Haunted Halls of the Overlook Hotel', artist='The Shining Choir', songs=[Song(title='Redrum', length_seconds=300), Song(title='All Work and No Play Makes Jack a Dull Boy', length_seconds=240), Song(title="Heeeeere's Johnny!", length_seconds=180)])
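
As a small illustration of passing the result on to other code, the sketch below sums the song lengths. `total_length_seconds` and `format_duration` are hypothetical helpers, and `album` is a stand-in built with `SimpleNamespace` (values copied from the printed result) so that the snippet runs without loading a model. In the notebook you would pass `output` directly.

```python
from types import SimpleNamespace

# Stand-in for the `output` object above; values copied from the printed Album.
album = SimpleNamespace(
    name=(
        "The Shining: A Musical Journey Through the Haunted Halls "
        "of the Overlook Hotel"
    ),
    artist="The Shining Choir",
    songs=[
        SimpleNamespace(title="Redrum", length_seconds=300),
        SimpleNamespace(
            title="All Work and No Play Makes Jack a Dull Boy",
            length_seconds=240,
        ),
        SimpleNamespace(title="Heeeeere's Johnny!", length_seconds=180),
    ],
)


def total_length_seconds(album) -> int:
    """Hypothetical helper: sum the length of every song on the album."""
    return sum(song.length_seconds for song in album.songs)


def format_duration(seconds: int) -> str:
    """Hypothetical helper: render a number of seconds as M:SS."""
    minutes, remainder = divmod(seconds, 60)
    return f"{minutes}:{remainder:02d}"


total = total_length_seconds(album)
print(f"{album.artist}: {len(album.songs)} songs, {format_duration(total)} total")
```

Because the fields are typed (`length_seconds` is an `int`), downstream code like this does not need to parse or validate strings first.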