# Text Summarization of Large Documents using LangChain 🦜🔗


### Objective

In this tutorial, you learn how to use LangChain to summarize large documents by working through the following examples:

- Stuffing method
- MapReduce method
- Refine method

### Import libraries

In [1]:
from langchain import PromptTemplate
from langchain.chains.summarize import load_summarize_chain
from langchain.docstore.document import Document
from langchain.chat_models import ChatOpenAI
from langchain.text_splitter import SentenceTransformersTokenTextSplitter, RecursiveCharacterTextSplitter
from youtube_transcript_api.formatters import TextFormatter
from youtube_transcript_api import YouTubeTranscriptApi
llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo")
word_splitter = SentenceTransformersTokenTextSplitter(chunk_overlap=0)

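def get_youtube_transcript(youtube_url):
    # Extract the video ID from the "v" query parameter, ignoring any parameters after it (e.g. "&t=10s").
    id_input = youtube_url.split("v=")[1].split("&")[0]
    transcript = YouTubeTranscriptApi.get_transcript(id_input)
    formatter = TextFormatter()
    formatted_transcript = formatter.format_transcript(transcript)
    # Replace non-breaking spaces with regular spaces.
    formatted_transcript = formatted_transcript.replace("\xa0", " ")
    return formatted_transcript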

def get_token_count(text):
    text_token_count = word_splitter.count_tokens(text=text.replace("\n"," "))
    return text_token_count

def create_single_doc(text):
    return Document(page_content=text)
    
def create_multiple_docs(text):
    text_splitter = RecursiveCharacterTextSplitter(separators=["\n\n", "\n"], chunk_size=3900, chunk_overlap=100, length_function=get_token_count)
    docs = text_splitter.create_documents([text])
    return docs

def create_docs(text):
    if get_token_count(text) > 3900:
        return create_multiple_docs(text)
    else:
        return [create_single_doc(text)]




In [2]:
text = get_youtube_transcript("https://www.youtube.com/watch?v=UI5tSde-VUU")
single_doc = create_single_doc(text)
print("token count of single doc: ", get_token_count(single_doc.page_content))
docs = create_multiple_docs(text)
for doc in docs:
    print(get_token_count(doc.page_content))

token count of single doc:  6133
2616
2645
1008


## Summarization with Large Documents

## Method 1: Stuffing

Stuffing is the simplest way to pass data to a language model. It "stuffs" the full text into the prompt as context, so the model can process all of the relevant information in one pass.

In LangChain, you can use `StuffDocumentsChain` through the `load_summarize_chain` method by setting `chain_type` to `stuff`.

### Prompt design with `Stuffing` chain

In [3]:
prompt_template = """Write a concise summary of the following text delimited by triple backquotes.
              Return your response in bullet points which covers the key points of the text.
              ```{text}```
              BULLET POINT SUMMARY:
  """

prompt = PromptTemplate(template=prompt_template, input_variables=["text"])
stuff_chain = load_summarize_chain(llm, chain_type="stuff", prompt=prompt)

In [4]:
try:
    stuff_chain.run([single_doc])
except Exception as e:
    print(e)

This model's maximum context length is 4097 tokens. However, your messages resulted in 6736 tokens. Please reduce the length of the messages.


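With the `stuff` method, the entire document is sent to the model in a single API call, with all data passed at once.

This only works when the resulting prompt fits within the context length of the LLM. Here it does not: the prompt comes to 6736 tokens against a limit of 4097, so the call raises the exception shown above instead of returning a summary.

The 6736 tokens reported by the API is higher than the 6133 printed earlier, likely because `get_token_count` uses the SentenceTransformers tokenizer rather than the model's own tokenizer, and because the prompt template adds tokens of its own.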
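
One way to avoid this error is to check the prompt size before calling the model. The following is a minimal sketch and not part of LangChain: `fits_in_context` and `reserved_for_output` are hypothetical names, and the 256-token reserve for the reply is an assumed value. The 4097 limit and the 6736 prompt size come from the error message above.

```python
def fits_in_context(prompt_tokens, max_context=4097, reserved_for_output=256):
    """Return True if the prompt leaves room for the reply within the context window.

    reserved_for_output is an assumed budget for the model's answer.
    """
    return prompt_tokens + reserved_for_output <= max_context


print(fits_in_context(6736))  # False: the full transcript prompt is too large
print(fits_in_context(2645))  # True: a single chunk fits comfortably
```

A check like this could decide whether to use `stuff` or fall back to one of the chunk-based methods described next.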

### Considerations

The `stuffing` method summarizes text by feeding the entire document to a large language model (LLM) in a single call. This method has both pros and cons.

The stuffing method requires only a single call to the LLM, which can be faster than methods that require multiple calls. The LLM also has access to all the data at once, which can result in a better summary.

However, LLMs have a context length, which is the maximum number of tokens that can be processed in a single call. If the document is longer than the context length, the stuffing method will not work. Even when a large document fits, a very long prompt can be slow to process and may not produce a good summary.

Let's explore other approaches for handling text that is longer than the context length limit of the LLM.

## Method 2: MapReduce

The `MapReduce` method implements a multi-stage summarization. It is a technique for summarizing large pieces of text by first summarizing smaller chunks of text and then combining those summaries into a single summary.

In LangChain, you can use `MapReduceDocumentsChain` through the `load_summarize_chain` method by setting `chain_type` to `map_reduce`.
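
The two stages can be sketched in plain Python. This is only an illustration of the idea, not how LangChain implements it: `map_reduce_summarize` is a hypothetical helper, and `first_sentence` is a stand-in for an LLM call so that the example runs without an API key.

```python
def map_reduce_summarize(chunks, summarize):
    """Summarize each chunk independently (map), then summarize the joined summaries (reduce)."""
    # Map step: one call per chunk; the calls do not depend on each other.
    partial_summaries = [summarize(chunk) for chunk in chunks]
    # Reduce step: one final call over the concatenated partial summaries.
    combined = "\n".join(partial_summaries)
    return summarize(combined), partial_summaries


def first_sentence(text):
    # Stand-in for an LLM call: keep only the first sentence of the text.
    return text.split(".")[0].strip() + "."


chunks = [
    "Chatbots need a knowledge base. It can be built from websites.",
    "Documents are another source. Plain text also works.",
]
final, partials = map_reduce_summarize(chunks, first_sentence)
print(partials)
print(final)
```

Because the map calls are independent of each other, they are the part that can run in parallel.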

### Prompt design with `MapReduce` chain

In our example, the transcript is 6133 tokens long (as counted by `get_token_count`), which is too long to summarize in a single call.

The `map_reduce` chain works on a list of smaller documents, here the three chunks produced by `create_multiple_docs` (at most 3900 tokens each). It first runs the map prompt on each chunk to generate a summary of that chunk, then runs the combine prompt on the collected summaries to produce the final summary. The cell below defines both prompts.

In [5]:
map_prompt_template = """
                      Write a summary of this chunk of text that includes the main points and any important details.
                      {text}
                      """

map_prompt = PromptTemplate(template=map_prompt_template, input_variables=["text"])

combine_prompt_template = """
                      Write a concise summary of the following text delimited by triple backquotes.
                      Return your response in bullet points which covers the key points of the text.
                      ```{text}```
                      BULLET POINT SUMMARY:
                      """

combine_prompt = PromptTemplate(
    template=combine_prompt_template, input_variables=["text"]
)

### Generate summaries using MapReduce method

After defining prompts, you initialize the associated `map_reduce_chain`.

In [6]:
map_reduce_chain = load_summarize_chain(
    llm=llm,
    chain_type="map_reduce",
    map_prompt=map_prompt,
    combine_prompt=combine_prompt,
    return_intermediate_steps=True,
    verbose=True,
)

In [22]:
map_reduce_outputs = await map_reduce_chain.acall({"input_documents": docs})



> Entering new  chain...


> Entering new  chain...
Prompt after formatting:
                      Write a summary of this chunk of text that includes the main points and any important details.
                      you already know if the glasses are
coming on the video is about to get
nerdy and you are about to learn
something that is going to change the
game for you and your agency everyone is
out here with their AI automation agency
building chat Bots and trying to serve
clients with these chat Bots but there
is one key fundamental thing that is
missing and no one is talking about
right now and that is how do you build a
solid knowledge base for these chatbots
now these chat Bots predominantly run
using AI or they have some form of
knowledge base attached to them which
train them and help them actually
respond and answer questions on a set
structure that we give it so this video
is going to be breaking down how you can
build a killer knowledge base a



In [23]:
print(map_reduce_outputs["output_text"])

- Building a solid knowledge base is important for chatbots, as they rely on AI or a knowledge base to respond and answer questions.
- Use cases for chatbots include answering customer questions, serving as an internal resource for staff training, and using third-party data to create persona bots.
- A strict knowledge base is necessary to ensure accurate and controlled responses.
- Websites, documents, and plain text can be used to build the knowledge base.
- Free tools like Chat GPT, YouTube, and copy and paste can be used to enhance the knowledge base.
- The author provides a step-by-step guide on creating a knowledge base using Bot Press as an example.
- Continuous updating and expanding of the knowledge base is important.
- The author discusses the process of creating a knowledge base using YouTube videos.
- They emphasize the importance of having a comprehensive knowledge base to make the chatbot more effective.


### Considerations

With the `MapReduce` method, the model can summarize a large document, overcoming the context limit of the `Stuffing` method by processing chunks in parallel.

However, `MapReduce` requires multiple calls to the model and can lose context between chunks, because each chunk is summarized in isolation.

To deal with this challenge, you can try another method that carries the running summary forward from one chunk to the next.

## Method 3: Refine

The Refine method is an alternative method to deal with large document summarization. It works by first running an initial prompt on a small chunk of data, generating some output. Then, for each subsequent document, the output from the previous document is passed in along with the new document, and the LLM is asked to refine the output based on the new document.

In LangChain, you can use `RefineDocumentsChain` through the `load_summarize_chain` method by setting `chain_type` to `refine`.
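
The sequential loop described above can be sketched in plain Python. This is an illustration under stated assumptions rather than LangChain's implementation: `refine_summarize` is a hypothetical helper, and `first_word` and `append_first_word` are toy stand-ins for the two LLM calls.

```python
def refine_summarize(chunks, initial_summary, refine):
    """Build a summary sequentially: each step sees the previous summary and the next chunk."""
    # The first chunk produces the initial summary.
    summary = initial_summary(chunks[0])
    intermediate = [summary]
    # Every later chunk refines the summary produced so far.
    for chunk in chunks[1:]:
        summary = refine(summary, chunk)
        intermediate.append(summary)
    return summary, intermediate


def first_word(text):
    # Stand-in for the initial LLM call: keep the first word of the chunk.
    return text.split()[0]


def append_first_word(existing, text):
    # Stand-in for the refine LLM call: extend the summary with the chunk's first word.
    return existing + ", " + text.split()[0]


chunks = ["chatbots need knowledge", "websites provide data", "documents help too"]
final, steps = refine_summarize(chunks, first_word, append_first_word)
print(steps)
print(final)
```

Each step needs the result of the previous one, which is why these calls cannot run in parallel.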

### Prompt design with `Refine` chain

With LangChain, the `refine` chain uses two prompts.

The question prompt generates the initial output from the first chunk. The refine prompt then updates that output based on each subsequent chunk.

In this example, the question prompt is:

```
Please provide a summary of the following text.
TEXT: {text}
SUMMARY:
```

and the refine prompt is:

```
Write a concise summary of the following text delimited by triple backquotes.
Return your response in bullet points which covers the key points of the text.
```{text}```
BULLET POINT SUMMARY:
```
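
A refine step typically needs two inputs: the summary produced so far and the new chunk of text. The sketch below shows such a template using plain Python string formatting. The wording is an assumption for illustration, and so are the placeholder names `existing_answer` and `text`. In LangChain, templates like these are typically passed to `load_summarize_chain` through the `question_prompt` and `refine_prompt` arguments; check the documentation for your installed version.

```python
# Hypothetical refine template: it receives the running summary and the next chunk.
refine_template = (
    "Here is the summary so far:\n"
    "{existing_answer}\n\n"
    "Refine it using the additional text below. "
    "If the text adds nothing useful, return the summary unchanged.\n"
    "TEXT: {text}\n"
    "REFINED SUMMARY:"
)

formatted_refine_prompt = refine_template.format(
    existing_answer="- Chatbots need a solid knowledge base.",
    text="Websites, documents, and plain text can feed the knowledge base.",
)
print(formatted_refine_prompt)
```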


### Generate summaries using Refine method

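Next, you initialize a summarization chain using the `refine` chain type. The call below does not pass custom prompts, so the chain falls back to LangChain's default prompts, as the verbose output shows.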

In [14]:
refine_chain = load_summarize_chain(
    llm,
    chain_type="refine",
    return_intermediate_steps=True,
    verbose=True
)

Then, you use the summarization chain to summarize the documents with the Refine method.

In [15]:
refine_outputs = refine_chain({"input_documents": docs})



> Entering new  chain...


> Entering new  chain...
Prompt after formatting:
Write a concise summary of the following:


"you already know if the glasses are
coming on the video is about to get
nerdy and you are about to learn
something that is going to change the
game for you and your agency everyone is
out here with their AI automation agency
building chat Bots and trying to serve
clients with these chat Bots but there
is one key fundamental thing that is
missing and no one is talking about
right now and that is how do you build a
solid knowledge base for these chatbots
now these chat Bots predominantly run
using AI or they have some form of
knowledge base attached to them which
train them and help them actually
respond and answer questions on a set
structure that we give it so this video
is going to be breaking down how you can
build a killer knowledge base and how
you can use a ton of different tools and
I'm also going to give you a couple
examples an

In [17]:
print(refine_outputs["output_text"])

The video discusses the importance of building a solid knowledge base for chatbots in AI automation agencies. It highlights three use cases for chatbots: answering questions, serving as an internal resource, and utilizing third-party data. The speaker demonstrates how to use free tools like Chat GPT, YouTube, and copy and paste to supercharge the knowledge base. The video also explains how to scrape data from YouTube videos using a website called YouTube transcript.com and incorporate it into the knowledge base. The speaker concludes by demonstrating how to create a knowledge-based bot using Bot Press.


### Considerations

In short, the Refine method can pull in more relevant context and may be less lossy than MapReduce. However, it requires many more calls to the LLM than Stuffing, and these calls are not independent, so they cannot be parallelized. The result also depends on the order of the documents: because the method suffers from recency bias, later documents may carry more weight in the final summary.
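
To make the trade-off concrete, the sketch below counts LLM calls for each method. It is a rough model under stated assumptions: `call_profile` is a hypothetical helper, MapReduce is assumed to use a single combine call with all map calls running concurrently, and retries are ignored.

```python
def call_profile(num_chunks):
    """Estimate total LLM calls and sequential steps per method (assumed simple model)."""
    return {
        # One call over the whole document.
        "stuff": {"calls": 1, "sequential_steps": 1},
        # One map call per chunk (assumed concurrent) plus one combine call.
        "map_reduce": {"calls": num_chunks + 1, "sequential_steps": 2},
        # One call per chunk, each waiting for the previous result.
        "refine": {"calls": num_chunks, "sequential_steps": num_chunks},
    }


profile = call_profile(3)  # the transcript in this notebook was split into 3 chunks
for method, numbers in profile.items():
    print(method, numbers)
```

Under these assumptions, Refine's latency grows with the number of chunks, while MapReduce stays at two sequential steps.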

## Conclusion


In this notebook, you learned about different techniques to summarize long documents with LangChain. These are only some of the possibilities. For example, the Map-Rerank method runs an initial prompt on each chunk of data that not only tries to complete the task but also gives a score for how certain it is in its answer. The responses are then ranked by this score, and the highest-scoring response is returned.

That said, depending on your needs, you may consider using a foundational model directly with a custom framework to build your generative AI application.

Here are some of the benefits of using a foundational model with a custom framework:
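
The Map-Rerank idea can be sketched as follows. This is a toy illustration rather than LangChain's implementation: `map_rerank` is a hypothetical helper, and `toy_scorer` stands in for an LLM call that returns an answer together with a confidence score.

```python
def map_rerank(chunks, answer_with_score):
    """Run the scorer on every chunk and return the (answer, score) pair with the highest score."""
    scored = [answer_with_score(chunk) for chunk in chunks]
    return max(scored, key=lambda pair: pair[1])


def toy_scorer(chunk):
    # Stand-in for an LLM call: the "answer" is the chunk itself and the
    # score is the fraction of its words equal to "knowledge".
    words = chunk.lower().split()
    return chunk, words.count("knowledge") / len(words)


chunks = [
    "the video starts with an intro",
    "a knowledge base powers the bot",
    "knowledge knowledge everywhere",
]
best_answer, best_score = map_rerank(chunks, toy_scorer)
print(best_answer, best_score)
```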

 - More flexibility to implement your application with different LLMs, prompting templates, document handling strategies and more.

 - More control to customize your generative applications based on your scenario.

 - Better performance to improve latency and scalability of your application.
