# Function Calling Program for Structured Extraction

<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/output_parsing/function_program.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

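This guide shows you how to do structured data extraction with our `FunctionCallingProgram`. Given a function-calling LLM and an output Pydantic class, it generates a structured Pydantic object. We use three different function-calling LLMs: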
- OpenAI
- Anthropic Claude
- Mistral

For the target object, you can either specify `output_cls` directly, or specify a `PydanticOutputParser` or any other `BaseOutputParser` that generates a Pydantic object.

In the examples below, we show different ways of extracting into an `Album` object (which can contain a list of `Song` objects).

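**NOTE**: `FunctionCallingProgram` only works with LLMs that natively support function calling. It works by inserting the schema of the Pydantic object as the parameters of a tool. For all other LLMs, use our `LLMTextCompletionProgram`, which prompts the model directly through text to return a structured output.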
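
To get a feel for what "the schema of the Pydantic object" looks like, the sketch below prints the JSON schema that Pydantic derives from an `Album` class like the one used in this guide. It assumes Pydantic v2 (`model_json_schema`); how exactly `FunctionCallingProgram` wraps this schema into a tool definition is not shown here and may differ per LLM integration.

```python
import json
from typing import List

from pydantic import BaseModel


class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]


# Pydantic v2 API (assumption); on Pydantic v1 the equivalent is Album.schema().
schema = Album.model_json_schema()
print(json.dumps(schema, indent=2))
```

The field names, their types, and the nested `Song` definition all appear in the schema, which is the information a function-calling LLM needs to fill in the arguments.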


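## Define `Album` Class

This is a simple example of parsing an output into an `Album` schema, which can contain multiple songs.

Just pass `Album` to the `output_cls` argument when initializing the `FunctionCallingProgram`.


If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.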


In [None]:
!pip install llama-index llama-index-llms-openai llama-index-llms-anthropic llama-index-llms-mistralai

In [None]:
from pydantic import BaseModel
from typing import List

from llama_index.core.program import FunctionCallingProgram

Define the output schema


In [None]:
class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]

## Define Function Calling Program

We define a function calling program with each of three function-calling LLMs:
- OpenAI
- Anthropic
- Mistral


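### Function Calling Program with OpenAI

Here we use gpt-3.5-turbo.

We demonstrate structured data extraction with a "single" function call, and also parallel function calling, which lets us extract multiple objects.


#### Function Calling (Single Object)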


In [None]:
from llama_index.core.program import FunctionCallingProgram
from llama_index.llms.openai import OpenAI

In [None]:
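prompt_template_str = """\
Generate an example album, with an artist and a list of songs. \
Use the movie {movie_name} as inspiration.\
"""
llm = OpenAI(model="gpt-3.5-turbo")

program = FunctionCallingProgram.from_defaults(
    output_cls=Album,
    prompt_template_str=prompt_template_str,
    llm=llm,  # pass the LLM explicitly instead of relying on the global default
    verbose=True,
)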

Run the program to get structured output.


In [None]:
output = program(movie_name="The Shining")

=== Calling Function ===
Calling function: Album with args: {"name": "The Shining Soundtrack", "artist": "Various Artists", "songs": [{"title": "Main Title", "length_seconds": 180}, {"title": "Rocky Mountains", "length_seconds": 240}, {"title": "Lullaby", "length_seconds": 200}, {"title": "The Overlook Hotel", "length_seconds": 220}, {"title": "Grady's Story", "length_seconds": 180}, {"title": "The Maze", "length_seconds": 210}]}
=== Function Output ===
name='The Shining Soundtrack' artist='Various Artists' songs=[Song(title='Main Title', length_seconds=180), Song(title='Rocky Mountains', length_seconds=240), Song(title='Lullaby', length_seconds=200), Song(title='The Overlook Hotel', length_seconds=220), Song(title="Grady's Story", length_seconds=180), Song(title='The Maze', length_seconds=210)]


The output is a valid Pydantic object that we can then use to call functions/APIs.


In [None]:
output

Album(name='The Shining Soundtrack', artist='Various Artists', songs=[Song(title='Main Title', length_seconds=180), Song(title='Rocky Mountains', length_seconds=240), Song(title='Lullaby', length_seconds=200), Song(title='The Overlook Hotel', length_seconds=220), Song(title="Grady's Story", length_seconds=180), Song(title='The Maze', length_seconds=210)])
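
As a small illustration of using the result downstream, the sketch below builds an `Album` by hand (standing in for the program's output, with only two of the songs) and passes it to a hypothetical `total_runtime_seconds` function. The function name is made up for this example, and `model_dump` assumes Pydantic v2.

```python
from typing import List

from pydantic import BaseModel


class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]


def total_runtime_seconds(album: Album) -> int:
    """Hypothetical downstream function that consumes the typed object."""
    return sum(song.length_seconds for song in album.songs)


# Stand-in for `output`; in the notebook this object comes from the program.
album = Album(
    name="The Shining Soundtrack",
    artist="Various Artists",
    songs=[
        Song(title="Main Title", length_seconds=180),
        Song(title="Rocky Mountains", length_seconds=240),
    ],
)

runtime = total_runtime_seconds(album)

# Plain dict form, e.g. for a JSON request body (Pydantic v2; v1 uses .dict()).
payload = album.model_dump()
print(runtime, payload["name"])
```

Because the fields are typed, attribute access such as `song.length_seconds` is safe without any further parsing of the LLM's response.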

#### Function Calling (Parallel Function Calling, Multiple Objects)


In [None]:
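prompt_template_str = """\
Generate example albums, each with an artist and a list of songs, \
using each of the movies below as inspiration.

Here are the movies:
{movie_names}
"""
llm = OpenAI(model="gpt-3.5-turbo")

program = FunctionCallingProgram.from_defaults(
    output_cls=Album,
    prompt_template_str=prompt_template_str,
    llm=llm,  # pass the LLM explicitly instead of relying on the global default
    verbose=True,
    allow_parallel_tool_calls=True,  # one tool call, and one Album, per movie
)
output = program(movie_names="The Shining, The Blair Witch Project, Saw")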

=== Calling Function ===
Calling function: Album with args: {"name": "The Shining", "artist": "Various Artists", "songs": [{"title": "Main Theme", "length_seconds": 180}, {"title": "The Overlook Hotel", "length_seconds": 240}, {"title": "Redrum", "length_seconds": 200}]}
=== Function Output ===
name='The Shining' artist='Various Artists' songs=[Song(title='Main Theme', length_seconds=180), Song(title='The Overlook Hotel', length_seconds=240), Song(title='Redrum', length_seconds=200)]
=== Calling Function ===
Calling function: Album with args: {"name": "The Blair Witch Project", "artist": "Soundtrack Ensemble", "songs": [{"title": "Into the Woods", "length_seconds": 210}, {"title": "The Rustling Leaves", "length_seconds": 180}, {"title": "The Witch's Curse", "length_seconds": 240}]}
=== Function Output ===
name='The Blair Witch Project' artist='Soundtrack Ensemble' songs=[Song(title='Into the Woods', length_seconds=210), Song(title='The Rustling Leaves', length_seconds=180), Song(title=

In [None]:
output

[Album(name='The Shining', artist='Various Artists', songs=[Song(title='Main Theme', length_seconds=180), Song(title='The Overlook Hotel', length_seconds=240), Song(title='Redrum', length_seconds=200)]),
 Album(name='The Blair Witch Project', artist='Soundtrack Ensemble', songs=[Song(title='Into the Woods', length_seconds=210), Song(title='The Rustling Leaves', length_seconds=180), Song(title="The Witch's Curse", length_seconds=240)]),
 Album(name='Saw', artist='Horror Soundscapes', songs=[Song(title='The Reverse Bear Trap', length_seconds=220), Song(title="Jigsaw's Game", length_seconds=260), Song(title='Bathroom Escape', length_seconds=180)])]
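
The verbose log shows each tool call's JSON arguments followed by the parsed object. As a rough illustration of that parsing step (not the library's internal code), the sketch below validates one set of arguments from the log with Pydantic, and shows that incomplete arguments are rejected. It assumes Pydantic v2 (`model_validate_json`).

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]


# Arguments copied from the first tool call in the log above.
raw_args = (
    '{"name": "The Shining", "artist": "Various Artists", "songs": ['
    '{"title": "Main Theme", "length_seconds": 180}, '
    '{"title": "The Overlook Hotel", "length_seconds": 240}, '
    '{"title": "Redrum", "length_seconds": 200}]}'
)
album = Album.model_validate_json(raw_args)

# Made-up malformed arguments: the song is missing `length_seconds`.
bad_args = '{"name": "Saw", "artist": "Horror Soundscapes", "songs": [{"title": "Bathroom Escape"}]}'
error_count = 0
try:
    Album.model_validate_json(bad_args)
except ValidationError as exc:
    error_count = exc.error_count()

print(album.songs[2].title, error_count)
```

With parallel tool calls enabled, `output` is a list, so downstream code should iterate over it rather than treat it as a single `Album`.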

### Function Calling Program with Anthropic

Here we use Claude Sonnet (all three Claude 3 models support function calling).


In [None]:
from llama_index.core.program import FunctionCallingProgram
from llama_index.llms.anthropic import Anthropic

In [None]:
prompt_template_str = "Generate a song about {topic}."
llm = Anthropic(model="claude-3-sonnet-20240229")

program = FunctionCallingProgram.from_defaults(
    output_cls=Song,
    prompt_template_str=prompt_template_str,
    llm=llm,
    verbose=True,
)

In [None]:
output = program(topic="harry potter")

=== Calling Function ===
Calling function: Song with args: {"title": "The Boy Who Lived", "length_seconds": 180}
=== Function Output ===
title='The Boy Who Lived' length_seconds=180


In [None]:
output

Song(title='The Boy Who Lived', length_seconds=180)

### Function Calling Program with Mistral

Here we use mistral-large.


In [None]:
from llama_index.core.program import FunctionCallingProgram
from llama_index.llms.mistralai import MistralAI

In [None]:
# prompt_template_str = """\
# Generate an example album, with an artist and a list of songs. \
# Use the broadway show {broadway_show} as inspiration. \
# Make sure to use the tool.
# """
prompt_template_str = "Generate a song about {topic}."
llm = MistralAI(model="mistral-large-latest")

program = FunctionCallingProgram.from_defaults(
    output_cls=Song,
    prompt_template_str=prompt_template_str,
    llm=llm,
    verbose=True,
)

In [None]:
output = program(topic="the broadway show Wicked")

=== Calling Function ===
Calling function: Song with args: {"title": "Defying Gravity", "length_seconds": 240}
=== Function Output ===
title='Defying Gravity' length_seconds=240


In [None]:
output

Song(title='Defying Gravity', length_seconds=240)

In [None]:
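from llama_index.core.output_parsers import PydanticOutputParser
from llama_index.core.program import LLMTextCompletionProgram

# Redefine the prompt: the template left in scope uses {topic}, not {movie_name}.
prompt_template_str = """\
Generate an example album, with an artist and a list of songs. \
Use the movie {movie_name} as inspiration.\
"""

program = LLMTextCompletionProgram.from_defaults(
    output_parser=PydanticOutputParser(output_cls=Album),
    prompt_template_str=prompt_template_str,
    verbose=True,
)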

In [None]:
output = program(movie_name="Lord of the Rings")
output

Album(name='The Fellowship of the Ring', artist='Middle-earth Ensemble', songs=[Song(title='The Shire', length_seconds=240), Song(title='Concerning Hobbits', length_seconds=180), Song(title='The Ring Goes South', length_seconds=300), Song(title='A Knife in the Dark', length_seconds=270), Song(title='Flight to the Ford', length_seconds=210), Song(title='Many Meetings', length_seconds=240), Song(title='The Council of Elrond', length_seconds=330), Song(title='The Great Eye', length_seconds=180), Song(title='The Breaking of the Fellowship', length_seconds=360)])
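
With a text completion program, the model returns plain text and an output parser turns it into the Pydantic object. The sketch below shows the general idea with a hypothetical `parse_album_from_text` helper that pulls the outermost JSON object out of free-form text and validates it. It is a simplified stand-in, not how `PydanticOutputParser` is actually implemented, and it assumes Pydantic v2.

```python
from typing import List

from pydantic import BaseModel


class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]


def parse_album_from_text(text: str) -> Album:
    """Hypothetical helper: validate the outermost {...} span found in `text`."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    return Album.model_validate_json(text[start : end + 1])


# Made-up model response with prose around the JSON payload.
llm_text = """Here is the album you asked for:
{"name": "The Fellowship of the Ring", "artist": "Middle-earth Ensemble",
 "songs": [{"title": "The Shire", "length_seconds": 240}]}
Enjoy!"""

album = parse_album_from_text(llm_text)
print(album.name, album.songs[0].length_seconds)
```

This text-based route depends on the model following formatting instructions, which is why native function calling is usually preferable when the LLM supports it.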