# How to do extraction without using function calling

:::info Prerequisites

This guide assumes familiarity with the following:

- [Extraction](/docs/tutorials/extraction)

:::

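Large language models (LLMs) that follow prompt instructions well can be asked to output information in a specific format without using function calling.

This approach relies on designing good prompts and then parsing the output of the LLM to get good extraction results, but it lacks some of the guarantees provided by function calling or JSON mode.

Here, we'll use Claude, which is very good at following instructions! See [here](/docs/integrations/chat/anthropic) for more about Anthropic models.

First, we'll install the integration package: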

```{=mdx}
import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";
import Npm2Yarn from "@theme/Npm2Yarn";

<IntegrationInstallTooltip></IntegrationInstallTooltip>

<Npm2Yarn>
  @langchain/anthropic @langchain/core zod zod-to-json-schema
</Npm2Yarn>
```

In [1]:
import { ChatAnthropic } from "@langchain/anthropic";

const model = new ChatAnthropic({
  model: "claude-3-sonnet-20240229",
  temperature: 0,
})

:::{.callout-tip}
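All the same considerations for extraction quality apply to the parsing approach.

This guide is meant to be simple, but it should generally include reference examples to improve performance!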
:::

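## Using StructuredOutputParser

The following example uses the built-in [`StructuredOutputParser`](/docs/how_to/output_parser_structured/) to parse the output of a chat model. We use the built-in prompt formatting instructions contained in the parser.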

In [2]:
import { z } from "zod";
import { StructuredOutputParser } from "@langchain/core/output_parsers";
import { ChatPromptTemplate } from "@langchain/core/prompts";

let personSchema = z.object({
  name: z.optional(z.string()).describe("The name of the person"),
  hair_color: z.optional(z.string()).describe("The color of the person's hair, if known"),
  height_in_meters: z.optional(z.string()).describe("Height measured in meters")
}).describe("Information about a person.");

const parser = StructuredOutputParser.fromZodSchema(personSchema);

const prompt = ChatPromptTemplate.fromMessages([
  ["system", "Answer the user query. Wrap the output in `json` tags\n{format_instructions}"],
  ["human", "{query}"],
]);

const partialedPrompt = await prompt.partial({
  format_instructions: parser.getFormatInstructions(),
});

Let's take a look at what information is sent to the model:

In [3]:
const query = "Anna is 23 years old and she is 6 feet tall";

In [4]:
const promptValue = await partialedPrompt.invoke({ query });

console.log(promptValue.toChatMessages());

[
  SystemMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "Answer the user query. Wrap the output in `json` tags\n" +
        "You must format your output as a JSON value th"... 1444 more characters,
      additional_kwargs: {}
    },
    lc_namespace: [ "langchain_core", "messages" ],
    content: "Answer the user query. Wrap the output in `json` tags\n" +
      "You must format your output as a JSON value th"... 1444 more characters,
    name: undefined,
    additional_kwargs: {}
  },
  HumanMessage {
    lc_serializable: true,
    lc_kwargs: {
      content: "Anna is 23 years old and she is 6 feet tall",
      additional_kwargs: {}
    },
    lc_namespace: [ "langchain_core", "messages" ],
    content: "Anna is 23 years old and she is 6 feet tall",
    name: undefined,
    additional_kwargs: {}
  }
]


In [5]:
const chain = partialedPrompt.pipe(model).pipe(parser);

await chain.invoke({ query });

{ name: "Anna", hair_color: "", height_in_meters: "1.83" }
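
Because the schema declares every field as a string, `hair_color` comes back as an empty string and `height_in_meters` as `"1.83"`. A small post-processing step can make the result easier to consume. The following is a minimal sketch and not part of LangChain: `RawPerson`, `Person`, `blankToUndefined`, and `normalizePerson` are hypothetical names introduced for illustration, and treating an empty string as "unknown" is an assumption about how the model signals missing data.

```typescript
// Shape produced by the parser above: every field is an optional string.
interface RawPerson {
  name?: string;
  hair_color?: string;
  height_in_meters?: string;
}

// Hypothetical cleaned-up shape: missing values are undefined, height is numeric.
interface Person {
  name: string | undefined;
  hairColor: string | undefined;
  heightInMeters: number | undefined;
}

// Treat empty or whitespace-only strings as "not provided" (an assumption).
const blankToUndefined = (value?: string): string | undefined => {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
};

const normalizePerson = (raw: RawPerson): Person => {
  const heightText = blankToUndefined(raw.height_in_meters);
  const height = heightText === undefined ? NaN : Number(heightText);
  return {
    name: blankToUndefined(raw.name),
    hairColor: blankToUndefined(raw.hair_color),
    // Drop heights that are not finite numbers instead of propagating NaN.
    heightInMeters: Number.isFinite(height) ? height : undefined,
  };
};

const person = normalizePerson({
  name: "Anna",
  hair_color: "",
  height_in_meters: "1.83",
});
console.log(person);
```

Keeping this step outside the parser leaves the prompt and schema unchanged while giving downstream code a numeric height to work with.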

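## Custom parsing

You can also create a custom prompt and parser with `LangChain` and `LCEL`.

You can use a raw function to parse the output from the model.

In the example below, we'll pass the schema into the prompt as JSON Schema. For convenience, we'll declare our schema with Zod, then use the [`zod-to-json-schema`](https://github.com/StefanTerdell/zod-to-json-schema) utility to convert it to JSON Schema.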

In [6]:
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

personSchema = z.object({
  name: z.optional(z.string()).describe("The name of the person"),
  hair_color: z.optional(z.string()).describe("The color of the person's hair, if known"),
  height_in_meters: z.optional(z.string()).describe("Height measured in meters")
}).describe("Information about a person.");

const peopleSchema = z.object({
  people: z.array(personSchema),
});

const SYSTEM_PROMPT_TEMPLATE = [
  "Answer the user's query. You must return your answer as JSON that matches the given schema:",
  "```json\n{schema}\n```.",
  "Make sure to wrap the answer in ```json and ``` tags. Conform to the given schema exactly.",
].join("\n");

const customParsingPrompt = ChatPromptTemplate.fromMessages([
  ["system", SYSTEM_PROMPT_TEMPLATE],
  ["human", "{query}"],
]);

const extractJsonFromOutput = (message) => {
  const text = message.content;

  // Define the regular expression pattern to match JSON blocks
  const pattern = /```json\s*((.|\n)*?)\s*```/gs;

  // Find the first match of the pattern in the string
  const matches = pattern.exec(text);

  if (matches && matches[1]) {
    try {
      return JSON.parse(matches[1].trim());
    } catch (error) {
      throw new Error(`Failed to parse: ${matches[1]}`);
    }
  } else {
    throw new Error(`No JSON found in: ${text}`);
  }
}
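
`extractJsonFromOutput` returns only the first fenced block and assumes the message content is a plain string. The variation below is a hedged sketch of how both assumptions could be relaxed. `ContentPart`, `Content`, `contentToText`, and `extractAllJsonBlocks` are hypothetical names for illustration, and `Content` is a simplified stand-in, not LangChain's actual message content type.

```typescript
// Simplified, hypothetical stand-in for message content: a plain string or a
// list of parts, where text parts carry a `text` field.
type ContentPart = { type: string; text?: string };
type Content = string | ContentPart[];

// Join all text parts into one string so the regex can run over it.
const contentToText = (content: Content): string =>
  typeof content === "string"
    ? content
    : content.map((part) => part.text ?? "").join("");

// Return every ```json fenced block in the reply, parsed, in order.
const extractAllJsonBlocks = (content: Content): unknown[] => {
  const replyText = contentToText(content);
  const pattern = /```json\s*([\s\S]*?)\s*```/g;
  const parsed: unknown[] = [];
  let match: RegExpExecArray | null;
  // With the `g` flag, each exec call resumes after the previous match.
  while ((match = pattern.exec(replyText)) !== null) {
    const body = match[1] ?? "";
    try {
      parsed.push(JSON.parse(body));
    } catch {
      throw new Error(`Failed to parse JSON block: ${body}`);
    }
  }
  if (parsed.length === 0) {
    throw new Error(`No JSON block found in: ${replyText}`);
  }
  return parsed;
};

const reply =
  'First:\n```json\n{"people": [{"name": "Anna"}]}\n```\nSecond:\n```json\n{"people": []}\n```';
console.log(extractAllJsonBlocks(reply));
```

Returning an array leaves it to the caller to decide whether to keep the first block, the last one, or to merge them.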

In [7]:
const customParsingQuery = "Anna is 23 years old and she is 6 feet tall";

const customParsingPromptValue = await customParsingPrompt.invoke({
  schema: zodToJsonSchema(peopleSchema),
  query: customParsingQuery,
});

customParsingPromptValue.toString();

"System: Answer the user's query. You must return your answer as JSON that matches the given schema:\n"... 170 more characters
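
The preview above reports only 170 characters after the first line. By a rough count, the rest of the template and the user query account for most of that, which leaves very little room for a full JSON Schema. One possible explanation, stated here as an assumption and not as confirmed library behavior, is that the schema object is coerced to a string when the template is formatted. The sketch below shows only the plain JavaScript behavior involved. `exampleSchema` is a hypothetical, hand-written fragment, not the output of `zodToJsonSchema`.

```typescript
// Hypothetical, hand-written schema fragment used only for this illustration.
const exampleSchema = {
  type: "object",
  properties: { people: { type: "array" } },
  required: ["people"],
};

// Default string coercion discards the object's contents.
const coerced = String(exampleSchema);

// JSON.stringify keeps the structure; the indent argument makes it readable.
const serialized = JSON.stringify(exampleSchema, null, 2);

console.log(coerced); // [object Object]
console.log(serialized.includes('"people"')); // true
```

If that assumption holds, passing `JSON.stringify(zodToJsonSchema(peopleSchema), null, 2)` as the `schema` value may be worth trying, so that the model sees the actual schema text.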

In [8]:
const customParsingChain = customParsingPrompt.pipe(model).pipe(extractJsonFromOutput);

await customParsingChain.invoke({
  schema: zodToJsonSchema(peopleSchema),
  query: customParsingQuery,
});

{ name: "Anna", age: 23, height: { feet: 6, inches: 0 } }
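
Note that the result above has no top-level `people` array, so it does not match `peopleSchema`. This is the kind of missing guarantee mentioned at the start of the guide, and it is a reason to validate parsed output before using it. The sketch below uses a hand-written type guard so that it has no dependencies. `ExtractedPerson`, `ExtractedPeople`, and the `is...` helpers are hypothetical names for illustration. In the setup used in this guide, Zod's `peopleSchema.safeParse(...)` could play the same role.

```typescript
// Mirrors the Zod schemas above by hand so the sketch has no dependencies.
interface ExtractedPerson {
  name?: string;
  hair_color?: string;
  height_in_meters?: string;
}
interface ExtractedPeople {
  people: ExtractedPerson[];
}

const isOptionalString = (value: unknown): boolean =>
  value === undefined || typeof value === "string";

const isRecord = (value: unknown): value is Record<string, unknown> =>
  typeof value === "object" && value !== null && !Array.isArray(value);

// True when every known field is either absent or a string.
const isExtractedPerson = (value: unknown): value is ExtractedPerson =>
  isRecord(value) &&
  isOptionalString(value["name"]) &&
  isOptionalString(value["hair_color"]) &&
  isOptionalString(value["height_in_meters"]);

// Type guard: true only when the value has the `{ people: [...] }` shape.
const isExtractedPeople = (value: unknown): value is ExtractedPeople => {
  if (!isRecord(value)) return false;
  const people = value["people"];
  if (!Array.isArray(people)) return false;
  return people.every((entry: unknown) => isExtractedPerson(entry));
};

// The object returned above fails the check; a schema-conforming one passes.
const observed: unknown = { name: "Anna", age: 23, height: { feet: 6, inches: 0 } };
const conforming: unknown = { people: [{ name: "Anna", height_in_meters: "1.83" }] };

console.log(isExtractedPeople(observed)); // false
console.log(isExtractedPeople(conforming)); // true
```

A failed check is a natural point to raise an error or re-prompt the model, depending on how strict the application needs to be.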

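## Next steps

You've now learned how to perform extraction without using tool calling.

Next, check out some of the other guides in this section, such as [some tips on how to improve extraction quality with examples](/docs/how_to/extraction_examples).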