# Raw RAG 04: Summarization Techniques

LLMs support several approaches to summarization, each suited to different use cases and text lengths. The process can be straightforward, but the right method depends on the length of the text, the level of detail you want in the summary, and the computational resources available.

Key summarization approaches include:

* Direct summarization: For shorter texts within the LLM's token limit, a single-pass summarization can be effective.

* Chunking and hierarchical summarization: For longer texts, breaking the content into smaller chunks, summarizing each, and then summarizing the summaries can overcome token limitations.

* Extractive summarization: Identifying and extracting key sentences or passages from the original text.

* Rolling summarization: Progressively summarizing the text by maintaining a running summary and updating it as new information is processed.

* Query-focused summarization: Tailoring the summary to answer specific questions or focus on particular aspects of the text.

* Multi-document summarization: Synthesizing information from multiple sources into a cohesive summary.

As LLM context windows expand, the ability to summarize longer texts without chunking is improving. However, human oversight and intervention remain valuable for ensuring accuracy and relevance in summaries, especially for complex or nuanced content.

The choice of summarization method should be based on your specific requirements, such as summary length, level of detail, computational resources, and the nature of the source material. Experimenting with different approaches can help determine the most effective method for your particular use case.
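
Extractive summarization, unlike the other approaches listed above, can be illustrated without calling an LLM. The sketch below is one simple, assumed variant: it scores each sentence by the average frequency of its words and keeps the top-scoring sentences in their original order. The function name `extractive_summary`, the tiny `STOP_WORDS` set, and the regex-based sentence split are illustrative choices, not part of this notebook's code.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it",
              "that", "was", "for", "on", "with", "as"}


def _content_words(text: str) -> list[str]:
    """Lowercase words with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def extractive_summary(text: str, num_sentences: int = 2) -> str:
    """Return the highest-scoring sentences, kept in their original order."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(_content_words(text))

    def score(sentence: str) -> float:
        tokens = _content_words(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)
```

A frequency score like this is crude. It is meant only to show the shape of the technique: select existing sentences rather than generate new text.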

## Method 1: Summarizing Long Text in One Pass

This method relies on long-context models such as Google Gemini 1.5 Pro, whose context window extends to 2 million tokens. A model like this can summarize a lengthy text in a single call, with no chunking.

Key Points:
- Simplicity: Feed the entire text into the model and receive a comprehensive summary in a single operation.
- Context Preservation: By processing the full text at once, the model can maintain a holistic understanding, potentially leading to more coherent and contextually accurate summaries.
- Resource Considerations: While effective, this method may not be the most cost-efficient or fastest for extremely long texts.

The Chunking Debate:
There's ongoing discussion in the NLP community about the merits of chunking versus full-text processing. The optimal approach often depends on specific use cases:

- Full-text processing excels when:
  1. Maintaining overall context is crucial
  2. The text length falls within the model's token limit
  3. Processing time and cost are not primary concerns

- Chunking may be preferred when:
  1. Dealing with extremely long documents that exceed token limits
  2. Faster processing or lower costs are priorities
  3. The focus is on specific sections or themes within a larger text
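
The token-limit criterion above can be turned into a quick pre-check. The sketch below is a rough heuristic, not an exact tokenizer: it assumes about 4 characters per token for English text, and the names `estimate_tokens`, `choose_strategy`, and `reserved_for_output` are hypothetical helpers introduced here for illustration.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; ~4 characters per token is only a rule of thumb."""
    return int(len(text) / chars_per_token)


def choose_strategy(text: str, context_window: int, reserved_for_output: int = 2_000) -> str:
    """Return "full_text" if the text plausibly fits in the context window, else "chunking".

    `reserved_for_output` leaves room for the prompt instructions and the summary itself.
    """
    budget = context_window - reserved_for_output
    return "full_text" if estimate_tokens(text) <= budget else "chunking"
```

For a real decision, the model's own tokenizer gives an exact count and is preferable to this estimate. Cost and latency, the other two criteria, still have to be weighed separately.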

Best Practices:
- Evaluate your specific needs: Consider factors like required summary detail, processing speed, and budget constraints.
- Experiment with both approaches: Compare the results to determine which method yields the most satisfactory summaries for your use case.
- Use judiciously: While powerful, feeding entire textbooks or very long documents may not always be necessary or efficient.

Note: To utilize this method, ensure you have access to the long-context "gemini-1.5-pro" model, capable of handling up to 2 million tokens.

In [1]:
# Install Google Generative AI package
%pip install google-generativeai python-dotenv

Note: you may need to restart the kernel to use updated packages.


In [2]:
# Load the environment variables from the .env file

from dotenv import load_dotenv
import os

dotenv_path = ".env"
load_dotenv(dotenv_path=dotenv_path)

True

In [3]:
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

model = genai.GenerativeModel("gemini-1.5-pro")



In [4]:
# Load the text of "The Dead" by James Joyce

file_path = "docs/the_dead.txt"
with open(file_path, "r", encoding="utf-8") as file:
    text = file.read()
    
print(text[:690])

The Dead
James Joyce

LILY, the caretaker's daughter, was literally run off her feet. Hardly had she brought one gentleman into the little pantry behind the office on the ground floor and helped him off with his overcoat than the wheezy hall-door bell clanged again and she had to scamper along the bare hallway to let in another guest. It was well for her she had not to attend to the ladies also. But Miss Kate and Miss Julia had thought of that and had converted the bathroom upstairs into a ladies' dressing-room. Miss Kate and Miss Julia were there, gossiping and laughing and fussing, walking after each other to the head of the stairs, peering down over the banisters and calling dow


In [5]:
# Full prompt for summarizing the short novel (the prompt itself was generated by Claude 3.5 Sonnet)

full_prompt = f"""You are tasked with summarizing a short novel into a concise, easy-to-read format of one to two pages. Follow these instructions carefully to create an effective summary.

First, here is the full text of the novel:

<novel>
{text}
</novel>

Now, follow these steps to create your summary:

1. Read the novel carefully, paying attention to key elements such as plot, characters, themes, and significant events.

2. Identify the main plot points and character arcs. Focus on the most important events and character developments that drive the story forward.

3. Note the central themes and messages of the novel. What are the core ideas or lessons the author is trying to convey?

4. Create a brief outline of the story's structure, including the introduction, rising action, climax, falling action, and resolution.

5. Write your summary, keeping the following guidelines in mind:
   a. Begin with a brief introduction that includes the title, author, and a one-sentence overview of the plot.
   b. Present the main events in chronological order, focusing on cause and effect relationships.
   c. Introduce main characters as they appear in the story, providing only essential details about them.
   d. Include key dialogues or quotes that are crucial to understanding the story or characters, but use them sparingly.
   e. Explain major conflicts and how they are resolved.
   f. Discuss the main themes and how they are developed throughout the story.
   g. Conclude with the resolution of the story and any final thoughts or messages the author conveys.

6. Keep your summary concise and easy to read:
   a. Aim for a length of one to two pages (approximately 500-1000 words).
   b. Use clear, simple language that can be understood by a general audience.
   c. Break your summary into paragraphs for better readability, with each paragraph focusing on a specific part of the story or theme.
   d. Use transition words and phrases to ensure smooth flow between ideas and events.

7. After writing your summary, review it to ensure you have captured the essence of the novel without including unnecessary details.

8. Proofread your summary for any grammatical errors, typos, or unclear sentences.

Present your final summary within <summary> tags. Remember to keep it between one to two pages in length, focusing on the most crucial elements of the story while maintaining an easy-to-read format."""

In [6]:
response = model.generate_content(full_prompt)

print(response.text) 

<summary>
## The Dead: A Summary 

In James Joyce's "The Dead," we are transported to a vibrant Dublin during the Christmas season to witness the annual dance and dinner hosted by the aging Morkan sisters, Kate and Julia, and their niece, Mary Jane. The story centers around Gabriel Conroy, the sisters' nephew, a self-conscious intellectual, and his wife, Gretta.  

The festive evening unfolds with a flurry of music, dancing, and lively conversation.  Gabriel, tasked with delivering the night's speech, grapples with feelings of inadequacy and insecurity, heightened by a tense encounter with Miss Ivors, a fervent Irish nationalist who mocks his intellectualism and West Briton leanings. This encounter casts a shadow over Gabriel’s anticipation of a romantic end to the evening with his wife. 

As the night progresses, we are introduced to a cast of characters, each embodying different facets of Irish society. Through their interactions, Joyce paints a poignant portrait of a nation grapplin

## Method 2: Summarizing Long Text through Chunking

While not universally favored, the chunking method remains a practical approach for summarizing extensive texts, particularly when dealing with context window limitations or resource constraints. This technique involves breaking down the text into smaller, manageable segments and summarizing each chunk individually before combining them into a cohesive whole.

Advantages:
- Overcomes token limits of models with smaller context windows
- Can be more computationally efficient for extremely long texts
- Allows for parallel processing of chunks, potentially reducing overall processing time

Challenges:
- Potential loss of broader context: Summarizing isolated chunks may miss overarching themes or connections present in the full text
- Inconsistency: Different chunks might be summarized with varying levels of detail or focus
- Redundancy: Important information may be repeated across chunk summaries
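
The parallel-processing advantage listed above can be sketched with the standard library. This is a minimal illustration, assuming that chunk summaries are independent of each other and that `summarize` is any function mapping a chunk to its summary (for example the `summarize_chunk` function defined later in this notebook). The name `summarize_chunks_in_parallel` and the `max_workers` default are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def summarize_chunks_in_parallel(
    chunks: list[str],
    summarize: Callable[[str], str],
    max_workers: int = 4,
) -> list[str]:
    """Summarize chunks concurrently, returning summaries in the original chunk order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, even if calls finish out of order.
        return list(pool.map(summarize, chunks))
```

Threads suit this workload because each call mostly waits on the network. In practice the achievable speed-up is bounded by the provider's rate limits, so `max_workers` may need to stay small.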

Best Practices for Chunking:
1. Intelligent segmentation: Break the text at logical points (e.g., chapter breaks, topic shifts) rather than arbitrary word counts
2. Overlap strategy: Include some overlap between chunks to maintain context
3. Hierarchical summarization: Summarize chunks, then create a meta-summary of the chunk summaries
4. Post-processing: Refine the final summary to eliminate redundancies and ensure coherence

While chunking has its limitations, it remains a valuable tool in the summarization toolkit, especially when dealing with very long documents or when using models with more restricted context windows. The key is to apply this method judiciously, being aware of its strengths and weaknesses, and to refine the results as needed to produce a high-quality final summary.
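
The segmentation and overlap practices above can be sketched without any external library. The `TextProcessor.text_splitter` helper used in the cells below is not shown in this notebook, so the function here is a hypothetical stand-in rather than its implementation. It assumes paragraphs are separated by blank lines, packs whole paragraphs into chunks of roughly `char_limit` characters, and repeats the last paragraph of each chunk at the start of the next.

```python
def split_with_overlap(text: str, char_limit: int = 3000, overlap_paragraphs: int = 1) -> list[str]:
    """Greedily pack paragraphs into chunks of about `char_limit` characters.

    The last `overlap_paragraphs` paragraphs of each chunk are repeated at the
    start of the next one. A chunk can exceed `char_limit` when a single
    paragraph (plus the carried-over overlap) is longer than the limit.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for para in paragraphs:
        added = len(para) + (2 if current else 0)  # 2 accounts for the "\n\n" separator
        if current and current_len + added > char_limit:
            chunks.append("\n\n".join(current))
            # Start the next chunk with the overlap carried over from this one.
            current = current[-overlap_paragraphs:] if overlap_paragraphs > 0 else []
            current_len = len("\n\n".join(current))
            added = len(para) + (2 if current else 0)
        current.append(para)
        current_len += added
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Splitting on paragraph boundaries is only one reading of "logical points". Chapter headings or topic shifts would need their own detection logic.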

In [7]:
# Set up the text splitter; the chunk size (3000 characters) is passed later, in summarize_novel

from utils import TextProcessor

text_processor = TextProcessor()

In [8]:
from openai import OpenAI

client = OpenAI()

In [9]:
def summarize_chunk(chunk, max_output_length=150):
    """Summarize a single chunk of text using the OpenAI API."""
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant that summarizes text.",
            },
            {
                "role": "user",
                "content": f"Summarize the following text in a concise paragraph:\n\n{chunk}",
            },
        ],
        max_tokens=max_output_length,  # hard cap on generated tokens; longer output is cut off
    )
    return response.choices[0].message.content.strip()

In [10]:
# Summarize the entire novel by breaking it into chunks and summarizing each chunk

def summarize_novel(long_text, max_output_length=150):
    # Split the novel into chunks
    chunks = text_processor.text_splitter(long_text, char_limit=3000)

    # Summarize each chunk
    chunk_summaries = []
    for i, chunk in enumerate(chunks):
        print(f"Summarizing chunk {i+1}/{len(chunks)}...")
        summary = summarize_chunk(chunk, max_output_length)
        chunk_summaries.append(summary)

    # Combine chunk summaries
    combined_summary = "\n\n".join(chunk_summaries)

    # Create a final summary of the entire novel
    final_summary = summarize_chunk(combined_summary, max_output_length)

    return final_summary

In [11]:
summary = summarize_novel(text)

print("\nFinal Summary:")
print(summary)

Summarizing chunk 1/30...
Summarizing chunk 2/30...
Summarizing chunk 3/30...
Summarizing chunk 4/30...
Summarizing chunk 5/30...
Summarizing chunk 6/30...
Summarizing chunk 7/30...
Summarizing chunk 8/30...
Summarizing chunk 9/30...
Summarizing chunk 10/30...
Summarizing chunk 11/30...
Summarizing chunk 12/30...
Summarizing chunk 13/30...
Summarizing chunk 14/30...
Summarizing chunk 15/30...
Summarizing chunk 16/30...
Summarizing chunk 17/30...
Summarizing chunk 18/30...
Summarizing chunk 19/30...
Summarizing chunk 20/30...
Summarizing chunk 21/30...
Summarizing chunk 22/30...
Summarizing chunk 23/30...
Summarizing chunk 24/30...
Summarizing chunk 25/30...
Summarizing chunk 26/30...
Summarizing chunk 27/30...
Summarizing chunk 28/30...
Summarizing chunk 29/30...
Summarizing chunk 30/30...

Final Summary:
"In James Joyce's 'The Dead,' the narrative unfolds around an annual dance hosted by the Misses Morkan—sisters Kate and Julia, and their niece Mary Jane. As guests arrive, there is a 

A summary assembled from chunk summaries often lacks the depth and coherence of one generated from the entire text. This is the key limitation of chunking: the overall context and the connections between different parts of the document can be lost.

One way to improve chunk-based summaries is to raise the requested output length for each chunk, so that each chunk summary keeps more detail and nuance. This helps preserve information that an overly concise summary would drop, but it has to be balanced against the need for a concise final summary. Experimenting with different output lengths helps find the right balance between detail and brevity.

In [12]:
# Let's try to summarize the novel with a longer output length of 300 tokens

summary = summarize_novel(text, max_output_length=300)

print("Final Summary with 300 tokens output:")
print(summary)

Summarizing chunk 1/30...
Summarizing chunk 2/30...
Summarizing chunk 3/30...
Summarizing chunk 4/30...
Summarizing chunk 5/30...
Summarizing chunk 6/30...
Summarizing chunk 7/30...
Summarizing chunk 8/30...
Summarizing chunk 9/30...
Summarizing chunk 10/30...
Summarizing chunk 11/30...
Summarizing chunk 12/30...
Summarizing chunk 13/30...
Summarizing chunk 14/30...
Summarizing chunk 15/30...
Summarizing chunk 16/30...
Summarizing chunk 17/30...
Summarizing chunk 18/30...
Summarizing chunk 19/30...
Summarizing chunk 20/30...
Summarizing chunk 21/30...
Summarizing chunk 22/30...
Summarizing chunk 23/30...
Summarizing chunk 24/30...
Summarizing chunk 25/30...
Summarizing chunk 26/30...
Summarizing chunk 27/30...
Summarizing chunk 28/30...
Summarizing chunk 29/30...
Summarizing chunk 30/30...
Final Summary with 300 tokens output:
In James Joyce's story "The Dead," readers are immersed in the intricate social dynamics of an annual dance hosted by the Morkan sisters and their niece on Usher

In this run, the summary is noticeably better than the previous attempt. Richer, more detailed chunk summaries gave the final pass more to work with and yielded a better overall representation of the text than shorter, more concise ones.

To further enhance our summarization approach, let's explore the rolling summarization method:

1. Begin by summarizing the first chunk of text.
2. For each subsequent chunk:
   a. Combine it with the existing summary.
   b. Summarize this combined text.
3. Continue this iterative process through the entire document.
4. The final output is a comprehensive summary of the complete text, developed incrementally.

This rolling approach offers several advantages:

- Maintains context throughout the summarization process
- Allows for the integration of new information with previously summarized content
- Potentially captures overarching themes and narratives more effectively
- Reduces redundancy often found in chunk-based methods

By employing this technique, we aim to create a more cohesive and contextually rich summary that better reflects the flow and interconnectedness of the original text. This method is particularly useful for long-form content like novels, where plot development and character arcs span the entire work.

In [13]:
# Rolling summary implementation

def rolling_summarize_novel(novel_text, chunk_size=3000, summary_size=300):
    chunks = text_processor.text_splitter(novel_text, chunk_size)
    print(f"Processing chunk 1/{len(chunks)}...")
    current_summary = summarize_chunk(chunks[0], summary_size)

    for i, chunk in enumerate(chunks[1:], 1):
        print(f"Processing chunk {i+1}/{len(chunks)}...")

        # Combine the current summary with the new chunk
        combined_text = f"{current_summary}\n\nNew content:\n{chunk}"

        # Summarize the combination
        current_summary = summarize_chunk(combined_text, summary_size)

    return current_summary

In [14]:
final_summary = rolling_summarize_novel(text)
print("Final Summary:")
print(final_summary)

Processing chunk 1/30...
Processing chunk 2/30...
Processing chunk 3/30...
Processing chunk 4/30...
Processing chunk 5/30...
Processing chunk 6/30...
Processing chunk 7/30...
Processing chunk 8/30...
Processing chunk 9/30...
Processing chunk 10/30...
Processing chunk 11/30...
Processing chunk 12/30...
Processing chunk 13/30...
Processing chunk 14/30...
Processing chunk 15/30...
Processing chunk 16/30...
Processing chunk 17/30...
Processing chunk 18/30...
Processing chunk 19/30...
Processing chunk 20/30...
Processing chunk 21/30...
Processing chunk 22/30...
Processing chunk 23/30...
Processing chunk 24/30...
Processing chunk 25/30...
Processing chunk 26/30...
Processing chunk 27/30...
Processing chunk 28/30...
Processing chunk 29/30...
Processing chunk 30/30...
Final Summary:
During a hotel stay, Gabriel Conroy delves into his feelings about his wife, Gretta, following the revelation of her past love for Michael Furey, a young man who died tragically. This disclosure leads Gabriel to fe
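
The visible part of the rolling summary above opens with the closing hotel scene rather than the party, which hints at a risk of this method: each pass re-compresses the running summary, so later chunks can crowd out earlier material. One possible mitigation, sketched here as an assumption rather than a tested fix, is a prompt that keeps the running summary and the new text clearly separated and asks the model to preserve earlier key points. The function name `build_rolling_prompt`, the tag names, and the `max_words` parameter are all hypothetical.

```python
def build_rolling_prompt(current_summary: str, new_chunk: str, max_words: int = 200) -> str:
    """Build a prompt that keeps the running summary and the new text clearly separated."""
    return (
        "You are maintaining a running summary of a long text.\n\n"
        f"<summary_so_far>\n{current_summary}\n</summary_so_far>\n\n"
        f"<new_text>\n{new_chunk}\n</new_text>\n\n"
        "Update the summary so that it covers both the earlier material and the new text. "
        "Keep the key events and characters already in the summary; do not let the new text "
        f"replace them. Write at most {max_words} words."
    )
```

The returned string could be sent as the user message in place of the generic "Summarize the following text" instruction. Whether it actually reduces drift would need to be checked on real runs.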

## Conclusion

The exploration of various summarization techniques in this notebook underscores the importance of tailoring your approach to your specific needs and system constraints. There is no one-size-fits-all solution; the best method depends on a careful consideration of your unique circumstances.

Key factors to consider when choosing a summarization approach:

1. System Limitations:
   - Available computational resources
   - Token limits of your chosen language model
   - Processing time constraints

2. Objectives:
   - Desired summary length and level of detail
   - Importance of preserving context and overarching themes
   - Specific focus areas or types of information to prioritize

3. Document Characteristics:
   - Length and structure of typical documents
   - Complexity and interconnectedness of content
   - Presence of domain-specific terminology or concepts

4. Control and Customization:
   - Ability to fine-tune the summarization process
   - Flexibility to adjust parameters based on different document types
   - Capacity to incorporate domain knowledge or specific rules

Recommendations for choosing and implementing a summarization solution:

1. Assess Your Needs: Clearly define what you're trying to achieve with your summaries. Are you looking for brief overviews or detailed abstracts? Do you need to capture specific types of information?

2. Evaluate Your Constraints: Understand your system's limitations in terms of processing power, memory, and time. This will help you determine whether methods like full-text summarization are feasible or if you need to consider chunking approaches.

3. Experiment and Compare: Test different summarization methods on a representative sample of your documents. Compare the results against your objectives and constraints.

4. Prioritize Controllability: Opt for solutions that allow you to adjust parameters and fine-tune the process. This ensures you can adapt the summarization to different document types or changing needs.

5. Consider Hybrid Approaches: Don't be afraid to combine methods. For example, you might use chunking for initial processing but then apply a rolling summarization for final refinement.

6. Implement Feedback Loops: Set up a system to regularly evaluate the quality of your summaries and make adjustments as needed.

7. Balance Automation and Human Oversight: While automation is powerful, maintaining some level of human review can help ensure the summaries meet your quality standards and capture critical information.
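
The comparison and feedback-loop recommendations above need some measurable signal. The sketch below shows two deliberately simple checks, offered as assumptions rather than established metrics: how strongly a summary compresses its source, and whether it mentions a list of required single-word keywords (such as character names). The helper names `compression_ratio` and `keyword_coverage` are hypothetical.

```python
import re


def _words(text: str) -> list[str]:
    """Lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def compression_ratio(source: str, summary: str) -> float:
    """Summary length divided by source length, in words (lower means more compressed)."""
    source_words = _words(source)
    return len(_words(summary)) / len(source_words) if source_words else 0.0


def keyword_coverage(summary: str, keywords: list[str]) -> float:
    """Fraction of required single-word keywords that appear in the summary."""
    if not keywords:
        return 1.0
    summary_words = set(_words(summary))
    hits = sum(1 for k in keywords if k.lower() in summary_words)
    return hits / len(keywords)
```

Checks like these can flag summaries that are too long or that miss expected names, but they say nothing about factual accuracy, which is where human review remains necessary.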

By carefully considering these factors and maintaining control over your summarization process, you can develop a solution that not only meets your current needs but can also be adapted as your requirements evolve. Remember, the goal is to create a system that serves your specific purposes, rather than trying to force-fit a generic solution. With the right approach, you can achieve efficient, accurate, and useful summaries tailored to your unique context.