# Summarize Text

```{=mdx}
:::info

This tutorial demonstrates text summarization using built-in chains and [LangGraph](https://langchain-ai.github.io/langgraphjs/).

See [here](https://js.langchain.com/v0.2/docs/tutorials/summarization/) for a previous version of this page, which showcased the legacy chain [RefineDocumentsChain](https://api.js.langchain.com/classes/langchain.chains.RefineDocumentsChain.html).

:::
```

Suppose you have a set of documents (PDFs, Notion pages, customer questions, etc.) and you want to summarize the content. 

LLMs are a great tool for this given their proficiency in understanding and synthesizing text.

In the context of [retrieval-augmented generation](/docs/tutorials/rag), summarizing text can help distill the information in a large number of retrieved documents to provide context for an LLM.

In this walkthrough we'll go over how to summarize content from multiple documents using LLMs.

## Concepts

Concepts we will cover are:

- Using [language models](/docs/concepts/chat_models).

- Using [document loaders](/docs/concepts/document_loaders), specifically the [CheerioWebBaseLoader](https://api.js.langchain.com/classes/langchain.document_loaders_web_cheerio.CheerioWebBaseLoader.html) to load content from an HTML webpage.

- Two ways to summarize or otherwise combine documents:
  1. [Stuff](/docs/tutorials/summarization#stuff), which simply concatenates documents into a prompt.
  2. [Map-reduce](/docs/tutorials/summarization#map-reduce), for larger sets of documents. This splits documents into batches, summarizes those, and then summarizes the summaries.

## Setup

### Jupyter Notebook

This and other tutorials are perhaps most conveniently run in a [Jupyter notebook](https://jupyter.org/). Going through guides in an interactive environment is a great way to better understand them. See [here](https://jupyter.org/install) for installation instructions.

### Installation

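To install LangChain and the other packages imported in this tutorial, run:

```bash npm2yarn
npm i langchain @langchain/core @langchain/community @langchain/langgraph @langchain/textsplitters cheerio
```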

For more details, see our [Installation guide](/docs/how_to/installation).

### LangSmith

Many of the applications you build with LangChain will contain multiple steps with multiple LLM calls.
As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent.
The best way to do this is with [LangSmith](https://smith.langchain.com).

After you sign up at the link above, make sure to set your environment variables to start logging traces:

```shell
export LANGSMITH_TRACING="true"
export LANGSMITH_API_KEY="..."

# Reduce tracing latency if you are not in a serverless environment
# export LANGCHAIN_CALLBACKS_BACKGROUND=true
```

## Overview

A central question for building a summarizer is how to pass your documents into the LLM's context window. Two common approaches for this are:

1. `Stuff`: Simply "stuff" all your documents into a single prompt. This is the simplest approach.

2. `Map-reduce`: Summarize each document on its own in a "map" step and then "reduce" the summaries into a final summary.

Note that map-reduce is especially effective when understanding a sub-document does not rely on preceding context, for example when summarizing a corpus of many shorter documents. In other cases, such as summarizing a novel or another body of text with an inherent sequence, [iterative refinement](https://js.langchain.com/v0.2/docs/tutorials/summarization/) may be more effective.
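One way to make the choice between the two approaches concrete is to compare a rough size estimate of the documents against the model's context window. The sketch below is only illustrative: `estimateTokens`, `chooseStrategy`, the 4-characters-per-token heuristic, and the `reservedTokens` default are all assumptions introduced here and are not part of LangChain.

```typescript
// Hypothetical helper, not part of LangChain: a rough token estimate.
// Assumes about 4 characters per token, a common rule of thumb for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

type Strategy = "stuff" | "map-reduce";

// Pick "stuff" when all documents fit in the context window together,
// leaving `reservedTokens` for the prompt wording and the model's answer.
function chooseStrategy(
  texts: string[],
  contextWindow: number,
  reservedTokens = 1000
): Strategy {
  const total = texts.reduce((sum, t) => sum + estimateTokens(t), 0);
  return total <= contextWindow - reservedTokens ? "stuff" : "map-reduce";
}

const shortDocs = ["a".repeat(400), "b".repeat(400)]; // about 200 tokens
const longDocs = Array.from({ length: 50 }, () => "c".repeat(4000)); // about 50,000 tokens

console.log(chooseStrategy(shortDocs, 8000)); // "stuff"
console.log(chooseStrategy(longDocs, 8000)); // "map-reduce"
```

In practice a model-specific tokenizer gives a more reliable count than a character heuristic.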

First we load in our documents. We will use [CheerioWebBaseLoader](https://api.js.langchain.com/classes/langchain.document_loaders_web_cheerio.CheerioWebBaseLoader.html) to load a blog post:

```typescript
import "cheerio";
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";

// Only keep the text inside <p> tags
const pTagSelector = "p";
const cheerioLoader = new CheerioWebBaseLoader(
  "https://lilianweng.github.io/posts/2023-06-23-agent/",
  {
    selector: pTagSelector,
  }
);

const docs = await cheerioLoader.load();
```

Let's next select a [chat model](/docs/integrations/chat/):

```{=mdx}
import ChatModelTabs from "@theme/ChatModelTabs";

<ChatModelTabs customVarName="llm" />
```

```typescript
// @lc-docs-hide-cell
import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({
  model: "gpt-4o-mini",
  temperature: 0,
});
```

## Stuff: summarize in a single LLM call {#stuff}

We can use [createStuffDocumentsChain](https://api.js.langchain.com/functions/langchain.chains_combine_documents.createStuffDocumentsChain.html), especially if using larger context window models such as:

* 128k token OpenAI `gpt-4o` 
* 200k token Anthropic `claude-3-5-sonnet-20240620`

The chain will take a list of documents, insert them all into a prompt, and pass that prompt to an LLM:

```typescript
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { PromptTemplate } from "@langchain/core/prompts";

// Define prompt
const prompt = PromptTemplate.fromTemplate(
  "Summarize the main themes in these retrieved docs: {context}"
);

// Instantiate
const chain = await createStuffDocumentsChain({
  llm: llm,
  outputParser: new StringOutputParser(),
  prompt,
});

// Invoke
const result = await chain.invoke({ context: docs });
console.log(result);
```

```text
The retrieved documents discuss the development and capabilities of autonomous agents powered by large language models (LLMs). Here are the main themes:

1. **LLM as a Core Controller**: LLMs are positioned as the central intelligence in autonomous agent systems, capable of performing complex tasks beyond simple text generation. They can be framed as general problem solvers, with various implementations like AutoGPT, GPT-Engineer, and BabyAGI serving as proof-of-concept demonstrations.

2. **Task Decomposition and Planning**: Effective task management is crucial for LLMs. Techniques like Chain of Thought (CoT) and Tree of Thoughts (ToT) are highlighted for breaking down complex tasks into manageable steps. CoT encourages step-by-step reasoning, while ToT explores multiple reasoning paths, enhancing the agent's problem-solving capabilities.

3. **Integration of External Tools**: The use of external tools significantly enhances LLM capabilities. Frameworks like MRKL and Toolformer allow 
```
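Conceptually, "stuffing" is string assembly: the contents of the documents are joined together and substituted for the `{context}` placeholder before a single LLM call. The sketch below mirrors that idea in plain TypeScript. The `Doc` interface, the `stuffPrompt` helper, and the `"\n\n"` separator are assumptions made for illustration; this is not the internal implementation of `createStuffDocumentsChain`.

```typescript
// Minimal stand-in for a LangChain document (illustrative only)
interface Doc {
  pageContent: string;
}

// Hypothetical helper: join all document contents and place them
// where the template has its {context} placeholder.
function stuffPrompt(template: string, docs: Doc[], separator = "\n\n"): string {
  const context = docs.map((d) => d.pageContent).join(separator);
  // Use a replacer function so "$" characters in the documents are kept literally
  return template.replace("{context}", () => context);
}

const stuffedPrompt = stuffPrompt(
  "Summarize the main themes in these retrieved docs: {context}",
  [{ pageContent: "First paragraph." }, { pageContent: "Second paragraph." }]
);
console.log(stuffedPrompt);
```

Because everything ends up in one prompt, the combined text must fit within the model's context window.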

### Streaming

Note that we can also stream the result token-by-token:

```typescript
const stream = await chain.stream({ context: docs });

for await (const token of stream) {
  process.stdout.write(token + "|");
}
```

```text
|The| retrieved| documents| discuss| the| development| and| capabilities| of| autonomous| agents| powered| by| large| language| models| (|LL|Ms|).| Here| are| the| main| themes|:

|1|.| **|LL|M| as| a| Core| Controller|**|:| L|LM|s| are| positioned| as| the| central| intelligence| in| autonomous| agent| systems|,| capable| of| performing| complex| tasks| beyond| simple| text| generation|.| They| can| be| framed| as| general| problem| sol|vers|,| with| various| implementations| like| Auto|GPT|,| GPT|-|Engineer|,| and| Baby|AG|I| serving| as| proof|-of|-con|cept| demonstrations|.

|2|.| **|Task| De|composition| and| Planning|**|:| Effective| task| management| is| crucial| for| L|LM|s| to| handle| complicated| tasks|.| Techniques| like| Chain| of| Thought| (|Co|T|)| and| Tree| of| Thoughts| (|To|T|)| are| highlighted| for| breaking| down| tasks| into| manageable| steps| and| exploring| multiple| reasoning| paths|.| Additionally|,| L|LM|+|P| integrates| classical| planning| methods| to| en
```
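If you want both the incremental tokens and the complete text, you can accumulate chunks while iterating. The sketch below uses a hypothetical async generator, `fakeTokenStream`, as a stand-in for the stream returned by the chain, and a hypothetical `collectStream` helper; it assumes the stream yields plain string chunks, as in the example above.

```typescript
// Stand-in for a model stream: an async generator that yields string chunks
async function* fakeTokenStream(text: string): AsyncGenerator<string> {
  for (const word of text.split(" ")) {
    yield word + " ";
  }
}

// Consume any async iterable of string chunks, forwarding each chunk to
// `onToken` as it arrives and returning the concatenated text at the end.
async function collectStream(
  stream: AsyncIterable<string>,
  onToken: (token: string) => void = () => {}
): Promise<string> {
  let full = "";
  for await (const token of stream) {
    onToken(token);
    full += token;
  }
  return full;
}

const pieces: string[] = [];
collectStream(fakeTokenStream("The retrieved documents"), (t) => pieces.push(t)).then(
  (fullText) => {
    console.log(pieces.length, "chunks received");
    console.log(fullText);
  }
);
```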

### Go deeper

* You can easily customize the prompt.
* You can easily try different LLMs (e.g., [Claude](/docs/integrations/chat/anthropic)) via the `llm` parameter.

## Map-Reduce: summarize long texts via parallelization {#map-reduce}

Let's unpack the map-reduce approach. For this, we'll first map each document to an individual summary using an LLM. Then we'll reduce or consolidate those summaries into a single global summary.

Note that the map step is typically parallelized over the input documents.
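Stripped of any framework, the pattern is a parallel map followed by a single combining call. The sketch below shows this in plain TypeScript with `Promise.all`. The `Summarizer` type, `mapReduceSummarize`, and the toy `firstWords` function (which stands in for an LLM call) are assumptions made for illustration.

```typescript
// Stand-in type for an LLM-backed summarization call
type Summarizer = (text: string) => Promise<string>;

async function mapReduceSummarize(
  docs: string[],
  summarize: Summarizer
): Promise<string> {
  // Map: start one summarization per document; the calls run concurrently
  // and Promise.all returns the results in input order.
  const summaries = await Promise.all(docs.map((doc) => summarize(doc)));
  // Reduce: consolidate the individual summaries with one more call
  return summarize(summaries.join("\n"));
}

// Toy "summarizer": keeps the first word of every line
const firstWords: Summarizer = async (text) =>
  text
    .split("\n")
    .map((line) => line.split(" ")[0])
    .join(" ");

mapReduceSummarize(["one two three", "five six"], firstWords).then((summary) => {
  console.log(summary); // "one five"
});
```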

[LangGraph](https://langchain-ai.github.io/langgraphjs/), built on top of [@langchain/core](/docs/concepts/architecture#langchaincore), supports [map-reduce](https://langchain-ai.github.io/langgraphjs/how-tos/map-reduce/) workflows and is well-suited to this problem:

- LangGraph allows individual steps (such as successive summarizations) to be streamed, giving greater control over execution.
- LangGraph's [checkpointing](https://langchain-ai.github.io/langgraphjs/how-tos/persistence/) supports error recovery, human-in-the-loop workflows, and easier incorporation into conversational applications.
- The LangGraph implementation is straightforward to modify and extend, as we will see below.

### Map

Let's first define the prompt associated with the map step. We can use a summarization prompt similar to the one in the `stuff` approach above:

```typescript
import { ChatPromptTemplate } from "@langchain/core/prompts";

const mapPrompt = ChatPromptTemplate.fromMessages([
  ["user", "Write a concise summary of the following: \n\n{context}"],
]);
```

We can also use the Prompt Hub to store and fetch prompts.

This will work with your [LangSmith API key](https://docs.smith.langchain.com/).

For example, see the map prompt [here](https://smith.langchain.com/hub/rlm/map-prompt).

```javascript
import { pull } from "langchain/hub";
import { ChatPromptTemplate } from "@langchain/core/prompts";

const mapPrompt = await pull<ChatPromptTemplate>("rlm/map-prompt");
```

### Reduce

We also define a prompt that takes the document mapping results and reduces them into a single output.

```typescript
// Also available via the hub at `rlm/reduce-prompt`
const reduceTemplate = `
The following is a set of summaries:
{docs}
Take these and distill it into a final, consolidated summary
of the main themes.
`;

const reducePrompt = ChatPromptTemplate.fromMessages([
  ["user", reduceTemplate],
]);
```

### Orchestration via LangGraph

Below we implement a simple application that maps the summarization step on a list of documents, then reduces them using the above prompts.

Map-reduce flows are particularly useful when texts are long compared to the context window of an LLM. For long texts, we need a mechanism that ensures that the context to be summarized in the reduce step does not exceed a model's context window size. Here we implement a recursive "collapsing" of the summaries: the inputs are partitioned based on a token limit, and summaries are generated of the partitions. This step is repeated until the total length of the summaries is within a desired limit, allowing for the summarization of arbitrary-length text.
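The collapsing loop described above can be sketched without any framework. In the example below, `partitionByTokens`, `collapseUntilFits`, the word-count "tokenizer", and the toy `combine` function are all hypothetical stand-ins: a real implementation would count tokens with the model's tokenizer and combine each partition with an asynchronous LLM call.

```typescript
type CountTokens = (text: string) => number;
type Combine = (texts: string[]) => string;

// Greedily group texts so that each group stays within `tokenMax`
function partitionByTokens(
  texts: string[],
  count: CountTokens,
  tokenMax: number
): string[][] {
  const groups: string[][] = [];
  let current: string[] = [];
  let currentTokens = 0;
  for (const text of texts) {
    const n = count(text);
    if (current.length > 0 && currentTokens + n > tokenMax) {
      groups.push(current);
      current = [];
      currentTokens = 0;
    }
    current.push(text);
    currentTokens += n;
  }
  if (current.length > 0) groups.push(current);
  return groups;
}

// Repeat "partition, then combine each partition" until the total fits
function collapseUntilFits(
  summaries: string[],
  count: CountTokens,
  combine: Combine,
  tokenMax: number,
  maxRounds = 10
): string[] {
  let current = summaries;
  for (let round = 0; round < maxRounds; round++) {
    const total = current.reduce((sum, t) => sum + count(t), 0);
    if (total <= tokenMax) return current;
    current = partitionByTokens(current, count, tokenMax).map(combine);
  }
  throw new Error("Summaries did not fit within tokenMax after maxRounds");
}

// Toy stand-ins: one token per word; "combining" keeps each text's first word
const countWords: CountTokens = (text) => text.split(" ").length;
const keepFirstWords: Combine = (texts) => texts.map((t) => t.split(" ")[0]).join(" ");

const collapsed = collapseUntilFits(
  ["a1 a2 a3 a4", "b1 b2 b3 b4", "c1 c2 c3 c4", "d1 d2 d3 d4", "e1 e2 e3 e4", "f1 f2 f3 f4"],
  countWords,
  keepFirstWords,
  10
);
console.log(collapsed); // [ 'a1 b1', 'c1 d1', 'e1 f1' ]
```

The `maxRounds` guard plays a similar role to a recursion limit: it stops the loop if combining never shrinks the text enough.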

First we chunk the blog post into smaller "sub documents" to be mapped:

```typescript
import { TokenTextSplitter } from "@langchain/textsplitters";

const textSplitter = new TokenTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 0,
});

const splitDocs = await textSplitter.splitDocuments(docs);
console.log(`Generated ${splitDocs.length} documents.`);
```

```text
Generated 6 documents.
```
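To build intuition for what `chunkSize` and `chunkOverlap` control, here is a simplified splitter that treats whitespace-separated words as "tokens". The `splitByWords` function is a hypothetical stand-in written for this illustration; `TokenTextSplitter` itself counts tokens with a model tokenizer rather than words.

```typescript
// Hypothetical word-based splitter: each chunk holds at most `chunkSize` words,
// and consecutive chunks share `chunkOverlap` words.
function splitByWords(text: string, chunkSize: number, chunkOverlap = 0): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be >= 0 and smaller than chunkSize");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap;
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    // Stop once the current chunk has reached the end of the text
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}

console.log(splitByWords("a b c d e", 2)); // [ 'a b', 'c d', 'e' ]
console.log(splitByWords("a b c d e", 2, 1)); // [ 'a b', 'b c', 'c d', 'd e' ]
```

With an overlap of zero, as in the tutorial, each piece of text appears in exactly one chunk.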


Next, we define our graph. Note that we define an artificially low maximum token length of 1,000 tokens to illustrate the "collapsing" step.

```typescript
import {
  collapseDocs,
  splitListOfDocs,
} from "langchain/chains/combine_documents/reduce";
import { Document } from "@langchain/core/documents";
import { StateGraph, Annotation, Send } from "@langchain/langgraph";

const tokenMax = 1000;

// Count the total number of tokens across a list of documents
async function lengthFunction(documents: Document[]) {
  const tokenCounts = await Promise.all(
    documents.map(async (doc) => {
      return llm.getNumTokens(doc.pageContent);
    })
  );
  return tokenCounts.reduce((sum, count) => sum + count, 0);
}

const OverallState = Annotation.Root({
  contents: Annotation<string[]>,
  // Notice here we pass a reducer function.
  // This is because we want to combine all the summaries we generate
  // from individual nodes back into one list. This is essentially
  // the "reduce" part.
  summaries: Annotation<string[]>({
    reducer: (state, update) => state.concat(update),
  }),
  collapsedSummaries: Annotation<Document[]>,
  finalSummary: Annotation<string>,
});

// This will be the state of the node that we will "map" all
// documents to in order to generate summaries
interface SummaryState {
  content: string;
}

// Here we generate a summary, given a document
const generateSummary = async (
  state: SummaryState
): Promise<{ summaries: string[] }> => {
  const prompt = await mapPrompt.invoke({ context: state.content });
  const response = await llm.invoke(prompt);
  return { summaries: [String(response.content)] };
};

// Here we define the logic to map out over the documents.
// We will use this as an edge in the graph.
const mapSummaries = (state: typeof OverallState.State) => {
  // We will return a list of `Send` objects.
  // Each `Send` object consists of the name of a node in the graph
  // as well as the state to send to that node.
  return state.contents.map(
    (content) => new Send("generateSummary", { content })
  );
};

// Wrap the generated summaries in documents so they can be collapsed
const collectSummaries = async (state: typeof OverallState.State) => {
  return {
    collapsedSummaries: state.summaries.map(
      (summary) => new Document({ pageContent: summary })
    ),
  };
};

// Combine a list of summary documents into a single summary string
async function _reduce(input: Document[]) {
  const prompt = await reducePrompt.invoke({ docs: input });
  const response = await llm.invoke(prompt);
  return String(response.content);
}

// Add node to collapse summaries
const collapseSummaries = async (state: typeof OverallState.State) => {
  const docLists = splitListOfDocs(
    state.collapsedSummaries,
    lengthFunction,
    tokenMax
  );
  const results = [];
  for (const docList of docLists) {
    results.push(await collapseDocs(docList, _reduce));
  }

  return { collapsedSummaries: results };
};

// This represents a conditional edge in the graph that determines
// if we should collapse the summaries or not
async function shouldCollapse(state: typeof OverallState.State) {
  const numTokens = await lengthFunction(state.collapsedSummaries);
  if (numTokens > tokenMax) {
    return "collapseSummaries";
  } else {
    return "generateFinalSummary";
  }
}

// Here we will generate the final summary
const generateFinalSummary = async (state: typeof OverallState.State) => {
  const response = await _reduce(state.collapsedSummaries);
  return { finalSummary: response };
};

// Construct the graph
const graph = new StateGraph(OverallState)
  .addNode("generateSummary", generateSummary)
  .addNode("collectSummaries", collectSummaries)
  .addNode("collapseSummaries", collapseSummaries)
  .addNode("generateFinalSummary", generateFinalSummary)
  .addConditionalEdges("__start__", mapSummaries, ["generateSummary"])
  .addEdge("generateSummary", "collectSummaries")
  .addConditionalEdges("collectSummaries", shouldCollapse, [
    "collapseSummaries",
    "generateFinalSummary",
  ])
  .addConditionalEdges("collapseSummaries", shouldCollapse, [
    "collapseSummaries",
    "generateFinalSummary",
  ])
  .addEdge("generateFinalSummary", "__end__");

const app = graph.compile();
```
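The `reducer` on the `summaries` channel is what lets several parallel `generateSummary` runs contribute to one list. The sketch below is a simplified mental model of that behavior in plain TypeScript, not LangGraph's actual implementation: `GraphState` and `applyUpdate` are hypothetical names, and only two channels are modeled.

```typescript
interface GraphState {
  summaries: string[];
  finalSummary: string;
}

// Hypothetical model of how a node's partial update is merged into the state
function applyUpdate(state: GraphState, update: Partial<GraphState>): GraphState {
  return {
    // Channel with a reducer: updates are concatenated onto the existing list
    summaries: update.summaries
      ? state.summaries.concat(update.summaries)
      : state.summaries,
    // Channel without a reducer: the latest value replaces the old one
    finalSummary: update.finalSummary ?? state.finalSummary,
  };
}

let graphState: GraphState = { summaries: [], finalSummary: "" };
// Two "map" nodes each return one summary; both end up in the list
graphState = applyUpdate(graphState, { summaries: ["summary of chunk 1"] });
graphState = applyUpdate(graphState, { summaries: ["summary of chunk 2"] });
// The final node overwrites the plain channel
graphState = applyUpdate(graphState, { finalSummary: "overall summary" });

console.log(graphState.summaries.length); // 2
console.log(graphState.finalSummary); // "overall summary"
```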

LangGraph allows the graph structure to be plotted to help visualize its function:

```javascript
// Note: tslab only works inside a jupyter notebook. Don't worry about running this code yourself!
import * as tslab from "tslab";

const image = await app.getGraph().drawMermaidPng();
const arrayBuffer = await image.arrayBuffer();

await tslab.display.png(new Uint8Array(arrayBuffer));
```

![graph_img_summarization](../../static/img/graph_img_summarization.png)

When running the application, we can stream the graph to observe its sequence of steps. Below, we will simply print out the name of the step.

Note that because we have a loop in the graph, it can be helpful to specify a `recursionLimit` on its execution. This will raise a [GraphRecursionError](https://langchain-ai.github.io/langgraphjs/reference/classes/langgraph.GraphRecursionError.html) when the specified limit is exceeded.
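The idea behind such a limit can be illustrated with a plain bounded loop. The sketch below is only an analogy: `StepLimitError` and `runWithLimit` are hypothetical names standing in for the error and the limit check, and the loop does not reflect how LangGraph counts steps internally.

```typescript
// Hypothetical error type, standing in for a recursion-limit error
class StepLimitError extends Error {
  constructor(limit: number) {
    super(`Step limit of ${limit} reached before the loop finished`);
    this.name = "StepLimitError";
    // Keep `instanceof` working when compiled to older JavaScript targets
    Object.setPrototypeOf(this, StepLimitError.prototype);
  }
}

// Call `step` at most `limit` times; `step` returns null when it is done
function runWithLimit<S>(initial: S, step: (state: S) => S | null, limit: number): S {
  let state = initial;
  for (let i = 0; i < limit; i++) {
    const next = step(state);
    if (next === null) return state;
    state = next;
  }
  throw new StepLimitError(limit);
}

// Toy loop: keep halving until the value is 1
const halve = (n: number): number | null => (n <= 1 ? null : Math.floor(n / 2));

console.log(runWithLimit(8, halve, 10)); // 1

try {
  runWithLimit(1024, halve, 3);
} catch (error) {
  if (error instanceof StepLimitError) {
    console.log(error.message);
  }
}
```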

```typescript
let finalSummary = null;

for await (
  const step of await app.stream(
    { contents: splitDocs.map((doc) => doc.pageContent) },
    { recursionLimit: 10 }
  )
) {
  console.log(Object.keys(step));
  if (step.hasOwnProperty("generateFinalSummary")) {
    finalSummary = step.generateFinalSummary;
  }
}
```

```text
[ 'generateSummary' ]
[ 'generateSummary' ]
[ 'generateSummary' ]
[ 'generateSummary' ]
[ 'generateSummary' ]
[ 'generateSummary' ]
[ 'collectSummaries' ]
[ 'generateFinalSummary' ]
```


```typescript
finalSummary
```

```text
{
  finalSummary: 'The summaries highlight the evolving landscape of large language models (LLMs) and their integration into autonomous agents and various applications. Key themes include:\n' +
    '\n' +
    '1. **Autonomous Agents and LLMs**: Projects like AutoGPT and GPT-Engineer demonstrate the potential of LLMs as core controllers in autonomous systems, utilizing techniques such as Chain of Thought (CoT) and Tree of Thoughts (ToT) for task management and reasoning. These agents can learn from past actions through self-reflection mechanisms, enhancing their problem-solving capabilities.\n' +
    '\n' +
    '2. **Supervised Fine-Tuning and Human Feedback**: The importance of human feedback in fine-tuning models is emphasized, with methods like Algorithm Distillation (AD) showing promise in improving model performance while preventing overfitting. The integration of various memory types and external memory systems is suggested to enhance c
```

In the corresponding [LangSmith trace](https://smith.langchain.com/public/467d535b-1732-46ee-8d3b-f44d9cea7efa/r) we can see the individual LLM calls, grouped under their respective nodes.

### Go deeper
 
**Customization** 

* As shown above, you can customize the LLMs and prompts for map and reduce stages.

**Real-world use-case**

* See [this blog post](https://blog.langchain.dev/llms-to-improve-documentation/) for a case study on analyzing user interactions (questions about LangChain documentation).
* The blog post and associated [repo](https://github.com/mendableai/QA_clustering) also introduce clustering as a means of summarization.
* This opens up another path beyond the `stuff` or `map-reduce` approaches that is worth considering.

## Next steps

We encourage you to check out the [how-to guides](/docs/how_to) for more detail on: 

- Built-in [document loaders](/docs/how_to/#document-loaders) and [text-splitters](/docs/how_to/#text-splitters)
- Integrating various combine-document chains into a [RAG application](/docs/tutorials/rag/)
- Incorporating retrieval into a [chatbot](/docs/how_to/chatbots_retrieval/)

and other concepts.