# How to return citations

:::info Prerequisites

This guide assumes familiarity with the following:

- [Retrieval-augmented generation](/docs/tutorials/rag/)
- [Returning structured data from a model](/docs/how_to/structured_output/)

:::

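How can we get a model to cite which parts of the source documents it referenced in its response?

To explore some techniques for extracting citations, let's first create a simple RAG chain. To start, we'll retrieve only from the web, using the [`TavilySearchAPIRetriever`](https://api.js.langchain.com/classes/langchain_community_retrievers_tavily_search_api.TavilySearchAPIRetriever.html).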

## Setup
### Dependencies

In this walkthrough we'll use an OpenAI chat model and embeddings and a Memory vector store, but everything shown here works with any [ChatModel](/docs/concepts/chat_models) or [LLM](/docs/concepts/text_llms), [Embeddings](/docs/concepts/embedding_models/), and [VectorStore](/docs/concepts/vectorstores/) or [Retriever](/docs/concepts/retrievers/).

We'll use the following packages:

```bash
npm install --save langchain @langchain/community @langchain/openai
```

We need to set environment variables for Tavily Search and OpenAI:

```bash
export OPENAI_API_KEY=YOUR_KEY
export TAVILY_API_KEY=YOUR_KEY
```

### LangSmith

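Many of the applications you build with LangChain contain multiple steps with multiple LLM calls. As these applications grow more complex, it becomes crucial to be able to inspect exactly what is going on inside your chain or agent. The best way to do this is with [LangSmith](https://smith.langchain.com/).

Note that LangSmith is not required, but it is helpful. If you do want to use LangSmith, sign up at the link above, then set your environment variables to start logging traces: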


```bash
export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY=YOUR_KEY

# Reduce tracing latency if you are not in a serverless environment
# export LANGCHAIN_CALLBACKS_BACKGROUND=true
```

### Initial setup

```typescript
import { TavilySearchAPIRetriever } from "@langchain/community/retrievers/tavily_search_api";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({
  model: "gpt-3.5-turbo",
  temperature: 0,
});

const retriever = new TavilySearchAPIRetriever({
  k: 6,
});

const prompt = ChatPromptTemplate.fromMessages([
  ["system", "You're a helpful AI assistant. Given a user question and some web article snippets, answer the user question. If none of the articles answer the question, just say you don't know.\n\nHere are the web articles:{context}"],
  ["human", "{question}"],
]);
```

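Now that we have a model, a retriever, and a prompt, let's chain them all together. We need to add some logic that formats the retrieved `Document`s into a string that can be passed to the prompt. We'll make the chain return both the answer and the retrieved documents.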

```typescript
import { Document } from "@langchain/core/documents";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { RunnableMap, RunnablePassthrough } from "@langchain/core/runnables";

/**
 * Format the documents into a readable string.
 */
const formatDocs = (input: Record<string, any>): string => {
  const { docs } = input;
  return "\n\n" + docs.map((doc: Document) => `Article title: ${doc.metadata.title}\nArticle Snippet: ${doc.pageContent}`).join("\n\n");
}
// subchain for generating an answer once we've done retrieval
const answerChain = prompt.pipe(llm).pipe(new StringOutputParser());
const map = RunnableMap.from({
  question: new RunnablePassthrough(),
  docs: retriever,
})
// complete chain that calls the retriever -> formats docs to string -> runs answer subchain -> returns just the answer and retrieved docs.
const chain = map.assign({ context: formatDocs }).assign({ answer: answerChain }).pick(["answer", "docs"])

await chain.invoke("How fast are cheetahs?")
```

```text
{
  answer: "Cheetahs are the fastest land animals on Earth. They can reach speeds as high as 75 mph or 120 km/h."... 124 more characters,
  docs: [
    Document {
      pageContent: "Contact Us − +\n" +
        "Address\n" +
        "Smithsonian's National Zoo & Conservation Biology Institute  3001 Connecticut"... 1343 more characters,
      metadata: {
        title: "Cheetah | Smithsonian's National Zoo and Conservation Biology Institute",
        source: "https://nationalzoo.si.edu/animals/cheetah",
        score: 0.96283,
        images: null
      }
    },
    Document {
      pageContent: "Now, their only hope lies in the hands of human conservationists, working tirelessly to save the che"... 880 more characters,
      metadata: {
        title: "How Fast Are Cheetahs, and Other Fascinating Facts About the World's ...",
        source: "https://www.discovermagazine.com/planet-
```

See a LangSmith trace [here](https://smith.langchain.com/public/bb0ed37e-b2be-4ae9-8b0d-ce2aff0b4b5e/r) that shows off the internals.

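## Tool calling

### Cite documents

Let's try using [tool calling](/docs/how_to/tool_calling) to make the model specify which of the provided documents it is actually referencing when it answers. LangChain has some utilities for converting plain objects or [Zod](https://zod.dev) objects to the JSONSchema format expected by providers like OpenAI. We'll use the [`.withStructuredOutput()`](/docs/how_to/structured_output/) method to get the model to output data matching our desired schema: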

```typescript
import { z } from "zod";

const llmWithTool1 = llm.withStructuredOutput(
  z.object({
    answer: z.string().describe("The answer to the user question, which is based only on the given sources."),
    citations: z.array(z.number()).describe("The integer IDs of the SPECIFIC sources which justify the answer.")
  }).describe("A cited source from the given text"),
  {
    name: "cited_answers"
  }
);

const exampleQ = `What is Brian's height?

Source: 1
Information: Suzy is 6'2"

Source: 2
Information: Jeremiah is blonde

Source: 3
Information: Brian is 3 inches shorter than Suzy`;

await llmWithTool1.invoke(exampleQ);
```

```text
{
  answer: `Brian is 6'2" - 3 inches = 5'11" tall.`,
  citations: [ 1, 3 ]
}
```

See a LangSmith trace [here](https://smith.langchain.com/public/28736c75-122e-4deb-9916-55c73eea3167/r) that shows off the internals.
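For orientation, the Zod schema passed to `.withStructuredOutput()` above corresponds roughly to a JSON Schema like the one sketched below. The exact schema LangChain sends to the provider may differ in detail, so `citedAnswersSchema` and the hand-written `isCitedAnswers` guard are illustrative assumptions, not the library's actual output.

```typescript
// Approximate JSON Schema for the `cited_answers` structure.
// This is an assumption for illustration, not the schema LangChain generates.
const citedAnswersSchema = {
  type: "object",
  properties: {
    answer: { type: "string" },
    citations: { type: "array", items: { type: "number" } },
  },
  required: ["answer", "citations"],
} as const;

interface CitedAnswers {
  answer: string;
  citations: number[];
}

// Hand-written runtime check that mirrors the schema above.
function isCitedAnswers(value: unknown): value is CitedAnswers {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { answer?: unknown; citations?: unknown };
  return (
    typeof candidate.answer === "string" &&
    Array.isArray(candidate.citations) &&
    candidate.citations.every((id) => typeof id === "number")
  );
}

console.log(isCitedAnswers({ answer: "Brian is 5'11\" tall.", citations: [1, 3] })); // true
console.log(isCitedAnswers({ answer: "Brian is 5'11\" tall.", citations: ["1"] })); // false
```

A guard like this can be useful when the structured output is passed on to code that indexes into the retrieved documents.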

Now we're ready to put together our chain:

```typescript
import { Document } from "@langchain/core/documents";

const formatDocsWithId = (docs: Array<Document>): string => {
  return "\n\n" + docs.map((doc: Document, idx: number) => `Source ID: ${idx}\nArticle title: ${doc.metadata.title}\nArticle Snippet: ${doc.pageContent}`).join("\n\n");
}
// subchain for generating an answer once we've done retrieval
const answerChain1 = prompt.pipe(llmWithTool1);
const map1 = RunnableMap.from({
  question: new RunnablePassthrough(),
  docs: retriever,
})
// complete chain that calls the retriever -> formats docs to string -> runs answer subchain -> returns just the answer and retrieved docs.
const chain1 = map1
  .assign({ context: (input: { docs: Array<Document> }) => formatDocsWithId(input.docs) })
  .assign({ cited_answer: answerChain1 })
  .pick(["cited_answer", "docs"])

await chain1.invoke("How fast are cheetahs?")
```

```text
{
  cited_answer: {
    answer: "Cheetahs can reach speeds as high as 75 mph or 120 km/h.",
    citations: [ 1, 2, 5 ]
  },
  docs: [
    Document {
      pageContent: "One of two videos from National Geographic's award-winning multimedia coverage of cheetahs in the ma"... 60 more characters,
      metadata: {
        title: "The Science of a Cheetah's Speed | National Geographic",
        source: "https://www.youtube.com/watch?v=icFMTB0Pi0g",
        score: 0.97858,
        images: null
      }
    },
    Document {
      pageContent: "The maximum speed cheetahs have been measured at is 114 km (71 miles) per hour, and they routinely r"... 1048 more characters,
      metadata: {
        title: "Cheetah | Description, Speed, Habitat, Diet, Cubs, & Facts",
        source: "https://www.britannica.com/animal/cheetah-mammal",
        score: 0.97213,
        images
```

See a LangSmith trace [here](https://smith.langchain.com/public/86814255-b9b0-4c4f-9463-e795c9961451/r) that shows off the internals.
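Once the chain returns integer citations, a common follow-up is to map those IDs back to the retrieved documents. The sketch below assumes the IDs are the zero-based indices assigned by `formatDocsWithId`; the `SourceDoc` type, the `resolveCitations` helper, and the sample data are hypothetical stand-ins, not part of LangChain.

```typescript
// Minimal stand-in for a LangChain Document; only the fields used here.
interface SourceDoc {
  pageContent: string;
  metadata: { title: string; source: string };
}

interface CitedAnswer {
  answer: string;
  citations: number[];
}

/**
 * Map zero-based source IDs back to documents.
 * IDs that do not point at a document, and repeated IDs, are dropped
 * rather than trusted, since the model may return invalid values.
 */
function resolveCitations(cited: CitedAnswer, docs: SourceDoc[]): SourceDoc[] {
  const seen = new Set<number>();
  const resolved: SourceDoc[] = [];
  for (const id of cited.citations) {
    const doc = docs[id];
    if (doc !== undefined && !seen.has(id)) {
      seen.add(id);
      resolved.push(doc);
    }
  }
  return resolved;
}

// Hypothetical data shaped like the chain output above.
const sampleDocs: SourceDoc[] = [
  { pageContent: "...", metadata: { title: "A", source: "https://example.com/a" } },
  { pageContent: "...", metadata: { title: "B", source: "https://example.com/b" } },
];
const sampleAnswer: CitedAnswer = { answer: "...", citations: [1, 1, 7] };

const citedSources = resolveCitations(sampleAnswer, sampleDocs).map((d) => d.metadata.source);
console.log(citedSources); // [ "https://example.com/b" ]
```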

### Cite snippets

What if we want to cite actual text spans? We can try to get our model to return these, too.

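**Note**: If we break up our documents so that we have many documents with only a sentence or two, instead of a few long documents, citing documents becomes roughly equivalent to citing snippets. This may also be easier for the model, because it only needs to return an identifier for each snippet instead of the actual text. We recommend trying both approaches and evaluating.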
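As a rough illustration of the splitting idea in the note, the sketch below breaks page text into sentence-level snippets and gives each one its own ID. The `Snippet` type and `toSentenceSnippets` helper are hypothetical, and the regular expression is a deliberately naive sentence splitter (it will mis-split decimals such as `0.5`); a real text splitter would be more robust.

```typescript
interface Snippet {
  id: number;          // identifier the model would cite
  parentIndex: number; // index of the document the sentence came from
  text: string;
}

// Split each page into sentences and number the sentences globally.
function toSentenceSnippets(pages: string[]): Snippet[] {
  const snippets: Snippet[] = [];
  pages.forEach((page, parentIndex) => {
    // Naive split: a run of non-terminator characters plus trailing ., ! or ?
    const sentences = page.match(/[^.!?]+[.!?]*/g) || [];
    for (const sentence of sentences) {
      const text = sentence.trim();
      if (text.length > 0) {
        snippets.push({ id: snippets.length, parentIndex, text });
      }
    }
  });
  return snippets;
}

const snippets = toSentenceSnippets([
  "Cheetahs are fast. They hunt by day.",
  "Lions live in prides.",
]);
console.log(snippets.map((s) => `${s.id} (doc ${s.parentIndex}): ${s.text}`));
```

With snippets like these, the integer-ID citation approach from the previous section effectively cites individual sentences.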

```typescript
import { Document } from "@langchain/core/documents";

const citationSchema = z.object({
  sourceId: z.number().describe("The integer ID of a SPECIFIC source which justifies the answer."),
  quote: z.string().describe("The VERBATIM quote from the specified source that justifies the answer.")
});

const llmWithTool2 = llm.withStructuredOutput(
  z.object({
    answer: z.string().describe("The answer to the user question, which is based only on the given sources."),
    citations: z.array(citationSchema).describe("Citations from the given sources that justify the answer.")
  }), {
    name: "quoted_answer",
  })

const answerChain2 = prompt.pipe(llmWithTool2);
const map2 = RunnableMap.from({
  question: new RunnablePassthrough(),
  docs: retriever,
})
// complete chain that calls the retriever -> formats docs to string -> runs answer subchain -> returns just the answer and retrieved docs.
const chain2 = map2
  .assign({ context: (input: { docs: Array<Document> }) => formatDocsWithId(input.docs) })
  .assign({ quoted_answer: answerChain2 })
  .pick(["quoted_answer", "docs"]);

await chain2.invoke("How fast are cheetahs?")
```

```text
{
  quoted_answer: {
    answer: "Cheetahs can reach speeds of up to 120kph or 75mph, making them the world’s fastest land animals.",
    citations: [
      {
        sourceId: 5,
        quote: "Cheetahs can reach speeds of up to 120kph or 75mph, making them the world’s fastest land animals."
      },
      {
        sourceId: 1,
        quote: "The cheetah (Acinonyx jubatus) is the fastest land animal on Earth, capable of reaching speeds as hi"... 25 more characters
      },
      {
        sourceId: 3,
        quote: "The maximum speed cheetahs have been measured at is 114 km (71 miles) per hour, and they routinely r"... 72 more characters
      }
    ]
  },
  docs: [
    Document {
      pageContent: "Contact Us − +\n" +
        "Address\n" +
        "Smithsonian's National Zoo & Conservation Biology Institute  3001 Connecticut"... 1343 more characters,
      metadata: {
        titl
```

See a LangSmith trace [here](https://smith.langchain.com/public/f0588adc-1914-45e8-a2ed-4fa028cea0e1/r) that shows off the internals.
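Because the schema asks for VERBATIM quotes but nothing enforces that, it can be worth checking each returned quote against the source it claims to come from. The sketch below is one possible check, not something the chain does for you; `checkQuotes`, `normalizeWhitespace`, and the sample data are hypothetical, and source IDs are assumed to be zero-based indices into the retrieved documents.

```typescript
interface QuoteCitation {
  sourceId: number;
  quote: string;
}

interface CheckedCitation extends QuoteCitation {
  verbatim: boolean;
}

// Collapse whitespace so line breaks in a source do not cause false negatives.
const normalizeWhitespace = (text: string): string => text.replace(/\s+/g, " ").trim();

// Flag each citation whose quote actually appears in the cited source text.
function checkQuotes(citations: QuoteCitation[], sourceTexts: string[]): CheckedCitation[] {
  return citations.map((citation) => {
    const source = sourceTexts[citation.sourceId];
    const quote = normalizeWhitespace(citation.quote);
    const verbatim =
      source !== undefined &&
      quote.length > 0 &&
      normalizeWhitespace(source).indexOf(quote) !== -1;
    return { sourceId: citation.sourceId, quote: citation.quote, verbatim };
  });
}

// Hypothetical source texts and model citations.
const sourceTexts = [
  "Cheetahs can reach\nspeeds of up to 120kph or 75mph.",
  "Lions live in prides.",
];
const checked = checkQuotes(
  [
    { sourceId: 0, quote: "reach speeds of up to 120kph" }, // present in source 0
    { sourceId: 1, quote: "Lions can fly" },                // not in source 1
    { sourceId: 5, quote: "Lions live in prides." },        // no such source
  ],
  sourceTexts
);
console.log(checked.map((c) => c.verbatim)); // [ true, false, false ]
```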

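## Direct prompting

Not all models support tool calling. We can achieve similar results with direct prompting. Let's see what this looks like with an older Anthropic chat model that is particularly proficient at working with XML:

### Setup

Install the LangChain Anthropic integration package: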

```bash
npm install @langchain/anthropic
```

Add your Anthropic API key to your environment:

```bash
export ANTHROPIC_API_KEY=YOUR_KEY
```

```typescript
import { ChatAnthropic } from "@langchain/anthropic";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { XMLOutputParser } from "@langchain/core/output_parsers";
import { Document } from "@langchain/core/documents";
import { RunnableLambda, RunnablePassthrough, RunnableMap } from "@langchain/core/runnables";

const anthropic = new ChatAnthropic({
  model: "claude-instant-1.2",
  temperature: 0,
});
const system = `You're a helpful AI assistant. Given a user question and some web article snippets,
answer the user question and provide citations. If none of the articles answer the question, just say you don't know.

Remember, you must return both an answer and citations. A citation consists of a VERBATIM quote that
justifies the answer and the ID of the quote article. Return a citation for every quote across all articles
that justify the answer. Use the following format for your final output:

<cited_answer>
    <answer></answer>
    <citations>
        <citation><source_id></source_id><quote></quote></citation>
        <citation><source_id></source_id><quote></quote></citation>
        ...
    </citations>
</cited_answer>

Here are the web articles:{context}`;

const anthropicPrompt = ChatPromptTemplate.fromMessages([
  ["system", system],
  ["human", "{question}"]
]);

const formatDocsToXML = (docs: Array<Document>): string => {
  const formatted: Array<string> = [];
  docs.forEach((doc, idx) => {
    const docStr = `<source id="${idx}">
  <title>${doc.metadata.title}</title>
  <article_snippet>${doc.pageContent}</article_snippet>
</source>`
    formatted.push(docStr);
  });
  return `\n\n<sources>${formatted.join("\n")}</sources>`;
}

const format3 = new RunnableLambda({
  func: (input: { docs: Array<Document> }) => formatDocsToXML(input.docs)
})
const answerChain = anthropicPrompt
  .pipe(anthropic)
  .pipe(new XMLOutputParser())
  .pipe(
    new RunnableLambda({ func: (input: { cited_answer: any }) => input.cited_answer })
  );
const map3 = RunnableMap.from({
  question: new RunnablePassthrough(),
  docs: retriever,
});
const chain3 = map3.assign({ context: format3 }).assign({ cited_answer: answerChain }).pick(["cited_answer", "docs"])

const res = await chain3.invoke("How fast are cheetahs?");

console.log(JSON.stringify(res, null, 2));
```

```text
{
  "cited_answer": [
    {
      "answer": "Cheetahs can reach top speeds of around 75 mph, but can only maintain bursts of speed for short distances before tiring."
    },
    {
      "citations": [
        {
          "citation": [
            {
              "source_id": "1"
            },
            {
              "quote": "Scientists calculate a cheetah's top speed is 75 mph, but the fastest recorded speed is somewhat slower."
            }
          ]
        },
        {
          "citation": [
            {
              "source_id": "3"
            },
            {
              "quote": "The maximum speed cheetahs have been measured at is 114 km (71 miles) per hour, and they routinely reach velocities of 80–100 km (50–62 miles) per hour while pursuing prey."
            }
          ]
        }
      ]
    }
  ],
  "docs": [
    {
      "pageContent": "One of two videos from National Geographic's award-winning multimedia coverage of cheetahs in the magazine's November 2012 
```

See a LangSmith trace [here](https://smith.langchain.com/public/e2e938e8-f847-4ea8-bc84-43d4eaf8e524/r) for more on the internals.
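The parsed XML comes back as nested arrays of single-key objects, which is awkward to consume. The sketch below flattens a value shaped like the `cited_answer` printed above into a plain `{ answer, citations }` object. The `XmlNode` type and the `childValue` and `flattenCitedAnswer` helpers are hypothetical, and the code assumes the parser output keeps the shape shown in this guide.

```typescript
// A parsed XML element: one tag mapped to either text or child elements.
type XmlNode = { [tag: string]: string | XmlNode[] };

interface FlatCitation {
  sourceId: number;
  quote: string;
}

interface FlatCitedAnswer {
  answer: string;
  citations: FlatCitation[];
}

// Return the value stored under `tag` in the first node that has it.
function childValue(nodes: XmlNode[], tag: string): string | XmlNode[] | undefined {
  for (const node of nodes) {
    if (tag in node) return node[tag];
  }
  return undefined;
}

function flattenCitedAnswer(nodes: XmlNode[]): FlatCitedAnswer {
  const answer = childValue(nodes, "answer");
  const citations = childValue(nodes, "citations");
  const flat: FlatCitation[] = [];
  if (Array.isArray(citations)) {
    for (const entry of citations) {
      const fields = entry["citation"];
      if (!Array.isArray(fields)) continue;
      const sourceId = childValue(fields, "source_id");
      const quote = childValue(fields, "quote");
      if (typeof sourceId === "string" && typeof quote === "string") {
        // Number(...) yields NaN if the model returned a non-numeric ID.
        flat.push({ sourceId: Number(sourceId), quote: quote.trim() });
      }
    }
  }
  return { answer: typeof answer === "string" ? answer.trim() : "", citations: flat };
}

// Sample input abbreviated from the output shown above.
const parsed: XmlNode[] = [
  { answer: "Cheetahs can reach top speeds of around 75 mph." },
  {
    citations: [
      { citation: [{ source_id: "1" }, { quote: "Scientists calculate a cheetah's top speed is 75 mph." }] },
      { citation: [{ source_id: "3" }, { quote: "The maximum speed cheetahs have been measured at is 114 km per hour." }] },
    ],
  },
];

const flatAnswer = flattenCitedAnswer(parsed);
console.log(JSON.stringify(flatAnswer, null, 2));
```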

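## Retrieval post-processing

Another approach is to post-process the retrieved documents to compress their content, so that the source content is already minimal enough that the model does not need to cite specific sources or spans. For example, we could break each document into chunks of a sentence or two, embed those chunks, and keep only the most relevant ones. LangChain has some built-in components for this. Here we'll use a [`RecursiveCharacterTextSplitter`](/docs/how_to/recursive_text_splitter), which creates chunks of a specified size by splitting on separator substrings, and an [`EmbeddingsFilter`](/docs/how_to/contextual_compression), which keeps only the texts with the most relevant embeddings.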

```typescript
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { EmbeddingsFilter } from "langchain/retrievers/document_compressors/embeddings_filter";
import { OpenAIEmbeddings } from "@langchain/openai";
import { DocumentInterface } from "@langchain/core/documents";
import { RunnableMap, RunnablePassthrough } from "@langchain/core/runnables";

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 400,
  chunkOverlap: 0,
  separators: ["\n\n", "\n", ".", " "],
  keepSeparator: false,
});

const compressor = new EmbeddingsFilter({
  embeddings: new OpenAIEmbeddings(),
  k: 10,
});

// Split the retrieved docs into small chunks, then keep only the most relevant ones.
const splitAndFilter = async (input: {
  docs: Array<DocumentInterface>;
  question: string;
}): Promise<Array<DocumentInterface>> => {
  const { docs, question } = input;
  const splitDocs = await splitter.splitDocuments(docs);
  const statefulDocs = await compressor.compressDocuments(splitDocs, question);
  return statefulDocs;
};

const retrieveMap = RunnableMap.from({
  question: new RunnablePassthrough(),
  docs: retriever,
});

// Named separately so it does not shadow the base `retriever` used inside `retrieveMap`.
const compressionRetriever = retrieveMap.pipe(splitAndFilter);
const docs = await compressionRetriever.invoke("How fast are cheetahs?");
for (const doc of docs) {
  console.log(doc.pageContent, "\n\n");
}
```

```text
The maximum speed cheetahs have been measured at is 114 km (71 miles) per hour, and they routinely reach velocities of 80–100 km (50–62 miles) per hour while pursuing prey.
cheetah,
(Acinonyx jubatus),


The science of cheetah speed
The cheetah (Acinonyx jubatus) is the fastest land animal on Earth, capable of reaching speeds as high as 75 mph or 120 km/h. Cheetahs are predators that sneak up on their prey and sprint a short distance to chase and attack.
 Key Takeaways: How Fast Can a Cheetah Run?
Fastest Cheetah on Earth


Built for speed, the cheetah can accelerate from zero to 45 in just 2.5 seconds and reach top speeds of 60 to 70 mph, making it the fastest land mammal! Fun Facts
Conservation Status
Cheetah News
Taxonomic Information
Animal News
NZCBI staff in Front Royal, Virginia, are mourning the loss of Walnut, a white-naped crane who became an internet sensation for choosing one of her keepers as her mate.


The speeds attained by the cheetah may be only slightly greater th
```

See a LangSmith trace [here](https://smith.langchain.com/public/ae6b1f52-c1fe-49ec-843c-92edf2104652/r) to see the internals.
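Conceptually, an embeddings-based filter scores each chunk against the question and keeps the top `k`. The sketch below shows that idea with cosine similarity over toy two-dimensional vectors. It is a simplified illustration under stated assumptions, not `EmbeddingsFilter`'s actual implementation; `cosineSimilarity`, `topKBySimilarity`, and the hand-made vectors are all hypothetical.

```typescript
interface EmbeddedChunk {
  text: string;
  embedding: number[];
}

// Cosine similarity of two vectors; returns 0 if either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    dot += x * (b[i] ?? 0);
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Keep the k chunks whose embeddings are most similar to the query embedding.
function topKBySimilarity(query: number[], chunks: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Toy vectors standing in for real embeddings.
const queryEmbedding = [1, 0];
const chunks: EmbeddedChunk[] = [
  { text: "zoo news", embedding: [0, 1] },
  { text: "cheetah speed", embedding: [0.9, 0.1] },
  { text: "cheetah habitat", embedding: [0.5, 0.5] },
];

const kept = topKBySimilarity(queryEmbedding, chunks, 2).map((c) => c.text);
console.log(kept); // [ "cheetah speed", "cheetah habitat" ]
```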

```typescript
const chain4 = retrieveMap
  .assign({ context: formatDocs })
  .assign({ answer: answerChain })
  .pick(["answer", "docs"]);

// Note the documents have an article "summary" in the metadata that is now much longer than the
// actual document page content. This summary isn't actually passed to the model.
const res = await chain4.invoke("How fast are cheetahs?");

console.log(JSON.stringify(res, null, 2))
```

```text
{
  "answer": [
    {
      "answer": "\nCheetahs are the fastest land animals. They can reach top speeds between 75-81 mph (120-130 km/h). \n"
    },
    {
      "citations": [
        {
          "citation": [
            {
              "source_id": "Article title: How Fast Can a Cheetah Run? - ThoughtCo"
            },
            {
              "quote": "The science of cheetah speed\nThe cheetah (Acinonyx jubatus) is the fastest land animal on Earth, capable of reaching speeds as high as 75 mph or 120 km/h."
            }
          ]
        },
        {
          "citation": [
            {
              "source_id": "Article title: Cheetah - Wikipedia"
            },
            {
              "quote": "Scientists calculate a cheetah's top speed is 75 mph, but the fastest recorded speed is somewhat slower."
            }
          ]
        }
      ]
    }
  ],
  "docs": [
    {
      "pageContent": "The science of cheetah speed\nThe cheetah (Acinonyx jubatus) is the fastest l
```

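See a LangSmith trace [here](https://smith.langchain.com/public/b767cca0-6061-4208-99f2-7f522b94a587/r) to see the internals.

## Next steps

You've now learned a few ways to return citations from your QA chains.

Next, check out some of the other guides in this section, such as [how to add chat history](/docs/how_to/qa_chat_history_how_to).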