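# How to use prompting alone (no tool calling) to do extraction
[Tool calling](/docs/concepts/tool_calling/) features are not required for generating structured output from LLMs. LLMs that follow prompt instructions well can be asked to output information in a given format.
This approach relies on designing good prompts and then parsing the output of the LLM to extract the information.
To extract data without tool-calling features:
1. Instruct the LLM to generate text following an expected format (e.g., JSON with a certain schema);
2. Use [output parsers](/docs/concepts/output_parsers) to structure the model response into a desired Python object.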
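The two steps above can be sketched without any framework. This is a minimal illustration using only the standard library: `build_prompt`, `parse_people`, and the hard-coded `fake_reply` are hypothetical names introduced here, and no model is actually called.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Person:
    name: str
    height_in_meters: float


def build_prompt(query: str) -> str:
    """Step 1: tell the model exactly which JSON shape to produce."""
    return (
        "Answer the user query. Respond with JSON only, shaped like "
        '{"people": [{"name": "<string>", "height_in_meters": <number>}]}\n'
        f"Query: {query}"
    )


def parse_people(raw_reply: str) -> List[Person]:
    """Step 2: turn the model's text reply into Python objects."""
    data = json.loads(raw_reply)
    return [Person(**item) for item in data["people"]]


prompt_text = build_prompt("Anna is 23 years old and she is 6 feet tall")

# Hypothetical reply, hard-coded so the sketch runs without a model.
fake_reply = '{"people": [{"name": "Anna", "height_in_meters": 1.83}]}'
people = parse_people(fake_reply)
print(people)
```

The rest of this guide does the same thing with LangChain components, which generate the format instructions and handle the parsing for you.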
First we select an LLM:
import ChatModelTabs from "@theme/ChatModelTabs";
<ChatModelTabs customVarName="model" />


In [1]:
# | output: false
# | echo: false

from langchain_anthropic.chat_models import ChatAnthropic

model = ChatAnthropic(model_name="claude-3-sonnet-20240229", temperature=0)

:::tip
This tutorial is meant to be simple, but in practice it should generally include reference examples to squeeze out performance!
:::

## Using PydanticOutputParser
The following example uses the built-in `PydanticOutputParser` to parse the output of a chat model.

In [2]:
from typing import List

from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field


class Person(BaseModel):
    """Information about a person."""

    name: str = Field(..., description="The name of the person")
    height_in_meters: float = Field(
        ..., description="The height of the person expressed in meters."
    )


class People(BaseModel):
    """Identifying information about all people in a text."""

    people: List[Person]


# Set up a parser
parser = PydanticOutputParser(pydantic_object=People)

# Prompt
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user query. Wrap the output in `json` tags\n{format_instructions}",
        ),
        ("human", "{query}"),
    ]
).partial(format_instructions=parser.get_format_instructions())

Let's take a look at what information is sent to the model:

In [3]:
query = "Anna is 23 years old and she is 6 feet tall"

In [4]:
print(prompt.format_prompt(query=query).to_string())

System: Answer the user query. Wrap the output in `json` tags
The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"$defs": {"Person": {"description": "Information about a person.", "properties": {"name": {"description": "The name of the person", "title": "Name", "type": "string"}, "height_in_meters": {"description": "The height of the person expressed in meters.", "title": "Height In Meters", "type": "number"}}, "required": ["name", "height_in_meters"], "title": "Person", "type": "object"}}, "description": "Identifying information about all people in a text.", "properties": {"people": {"items"

Having defined our prompt, we simply chain together the prompt, model, and output parser:

In [5]:
chain = prompt | model | parser
chain.invoke({"query": query})

People(people=[Person(name='Anna', height_in_meters=1.83)])

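Check out the associated [LangSmith trace](https://smith.langchain.com/public/92ed52a3-92b9-45af-a663-0a9c00e5e396/r).
Note that the schema shows up in two places:
1. In the prompt, via `parser.get_format_instructions()`;
2. In the chain, to receive the formatted output and structure it into a Python object (in this case, the Pydantic object `People`).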
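To make the two roles of the schema concrete, here is a rough standard-library sketch. `TinySchemaParser` is a hypothetical class written for illustration; it is not how `PydanticOutputParser` is implemented, and it only checks for required top-level keys.

```python
import json
from typing import Any, Dict, List


class TinySchemaParser:
    """Hypothetical stand-in showing where a schema is used; not LangChain code."""

    def __init__(self, schema: Dict[str, Any]) -> None:
        self.schema = schema

    def get_format_instructions(self) -> str:
        # Place 1: the schema is rendered into the prompt text.
        return "Return JSON that conforms to this schema:\n" + json.dumps(self.schema)

    def parse(self, text: str) -> Dict[str, Any]:
        # Place 2: the same schema drives validation of the model's reply.
        data = json.loads(text)
        missing: List[str] = [k for k in self.schema["required"] if k not in data]
        if missing:
            raise ValueError(f"Missing required keys: {missing}")
        return data


parser_sketch = TinySchemaParser(
    {"type": "object", "properties": {"people": {"type": "array"}}, "required": ["people"]}
)
instructions = parser_sketch.get_format_instructions()
parsed = parser_sketch.parse('{"people": [{"name": "Anna", "height_in_meters": 1.83}]}')
```

Keeping one schema definition behind both methods means the instructions sent to the model and the validation applied to its reply cannot drift apart.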

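## Custom Parsing
If desired, it's easy to create a custom prompt and parser with `LangChain` and `LCEL`.
To create a custom parser, define a function that parses the output from the model (typically an [AIMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html)) into an object of your choice.
Below is a simple implementation of a JSON parser: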

In [6]:
import json
import re
from typing import List

from langchain_anthropic.chat_models import ChatAnthropic
from langchain_core.messages import AIMessage
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field


class Person(BaseModel):
    """Information about a person."""

    name: str = Field(..., description="The name of the person")
    height_in_meters: float = Field(
        ..., description="The height of the person expressed in meters."
    )


class People(BaseModel):
    """Identifying information about all people in a text."""

    people: List[Person]


# Prompt
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user query. Output your answer as JSON that  "
            "matches the given schema: ```json\n{schema}\n```. "
            "Make sure to wrap the answer in ```json and ``` tags",
        ),
        ("human", "{query}"),
    ]
).partial(schema=People.model_json_schema())


# Custom parser
def extract_json(message: AIMessage) -> List[dict]:
    """Extracts JSON content from a string where JSON is embedded between ```json and ``` tags.

    Parameters:
        text (str): The text containing the JSON content.

    Returns:
        list: A list of extracted JSON strings.
    """
    text = message.content
    # Define the regular expression pattern to match JSON blocks
    pattern = r"```json(.*?)```"

    # Find all non-overlapping matches of the pattern in the string
    matches = re.findall(pattern, text, re.DOTALL)

    # Parse each matched block, stripping any leading or trailing whitespace
    try:
        return [json.loads(match.strip()) for match in matches]
    except Exception as exc:
        raise ValueError(f"Failed to parse: {message}") from exc

In [7]:
query = "Anna is 23 years old and she is 6 feet tall"
print(prompt.format_prompt(query=query).to_string())

System: Answer the user query. Output your answer as JSON that  matches the given schema: ```json
{'$defs': {'Person': {'description': 'Information about a person.', 'properties': {'name': {'description': 'The name of the person', 'title': 'Name', 'type': 'string'}, 'height_in_meters': {'description': 'The height of the person expressed in meters.', 'title': 'Height In Meters', 'type': 'number'}}, 'required': ['name', 'height_in_meters'], 'title': 'Person', 'type': 'object'}}, 'description': 'Identifying information about all people in a text.', 'properties': {'people': {'items': {'$ref': '#/$defs/Person'}, 'title': 'People', 'type': 'array'}}, 'required': ['people'], 'title': 'People', 'type': 'object'}
```. Make sure to wrap the answer in ```json and ``` tags
Human: Anna is 23 years old and she is 6 feet tall


In [8]:
chain = prompt | model | extract_json
chain.invoke({"query": query})



[{'people': [{'name': 'Anna', 'height_in_meters': 1.83}]}]
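The result is a list because the regular expression collects every fenced block in the reply. The sketch below exercises the same pattern on a plain string so it runs without a model or LangChain. `extract_json_blocks` and the `reply` text are hypothetical, and the fence marker is built with `"`" * 3` only to keep the example easy to embed in Markdown.

```python
import json
import re
from typing import List

FENCE = "`" * 3  # three backticks
PATTERN = re.compile(FENCE + r"json(.*?)" + FENCE, re.DOTALL)


def extract_json_blocks(text: str) -> List[dict]:
    """Parse every json-fenced block found in text."""
    return [json.loads(block.strip()) for block in PATTERN.findall(text)]


# Hypothetical model reply containing two fenced blocks plus surrounding chatter.
reply = (
    "Here is the result:\n"
    f'{FENCE}json\n{{"people": [{{"name": "Anna", "height_in_meters": 1.83}}]}}\n{FENCE}\n'
    "And an empty one:\n"
    f'{FENCE}json\n{{"people": []}}\n{FENCE}'
)

blocks = extract_json_blocks(reply)
no_blocks = extract_json_blocks("no fenced JSON here")
```

Note that a reply with no fenced block yields an empty list rather than an error, so callers may want to check for that case explicitly.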

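## Other Libraries
If you're looking at extracting using a parsing approach, check out the [Kor](https://eyurtsev.github.io/kor/) library. It's written by one of the `LangChain` maintainers, and it helps to craft a prompt that takes examples into account, allows controlling the output format (e.g., JSON or CSV), and expresses the schema in TypeScript. It seems to work pretty well!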