# Document Summarization


This notebook demonstrates an application of long document summarization techniques to a work of literature using Granite.

### Python version

Ensure you are running Python 3.10, 3.11, or 3.12 in a freshly created virtual environment.

In [None]:
import sys
assert sys.version_info >= (3, 10) and sys.version_info < (3, 13), "Use Python 3.10, 3.11, or 3.12 to run this notebook."

## Install Dependencies


Granite utils provides some helpful functions for recipes.

In [None]:
! pip install "git+https://github.com/ibm-granite-community/granite-kitchen" "transformers>=4.45.2" torch tiktoken

## Serving the Granite AI model


This notebook requires IBM Granite models to be served by an AI model runtime so that the models can be invoked or called. This notebook can use a locally accessible [Ollama](https://github.com/ollama/ollama) server to serve the models, or the [Replicate](https://replicate.com) cloud service.

During the pre-work, you may have either started a local Ollama server on your computer, or set up Replicate access and obtained an [API token](https://replicate.com/account/api-tokens).

## Select your model

Select a Granite model to use. Here we use a LangChain client to connect to the model. If there is a locally accessible Ollama server, we use an Ollama client to access the model. Otherwise, we use a Replicate client to access the model.

When using Replicate, if the `REPLICATE_API_TOKEN` environment variable is not set, or a `REPLICATE_API_TOKEN` Colab secret is not set, then the notebook will ask for your [Replicate API token](https://replicate.com/account/api-tokens) in a dialog box.

In [None]:
import os
import requests
from langchain_ollama.llms import OllamaLLM
from langchain_community.llms import Replicate
from ibm_granite_community.notebook_utils import get_env_var

try: # Look for a locally accessible Ollama server for the model
    response = requests.get(os.getenv("OLLAMA_HOST", "http://127.0.0.1:11434"))
    model = OllamaLLM(model="granite3-dense:8b")
except Exception: # Use Replicate for the model
    model = Replicate(
        model="ibm-granite/granite-3.0-8b-instruct",
        replicate_api_token=get_env_var("REPLICATE_API_TOKEN"), # Uses the imported helper to obtain the token
    )




## Download a book

Here we fetch H.D. Thoreau's "Walden" from [Project Gutenberg](https://www.gutenberg.org/) for summarization.

We have to chunk the book text so that chunks will fit in the context window size of the AI model.

### Count the tokens

Before sending our book chunks to the AI model, it's crucial to understand how much of the model's capacity we're using. Language models typically have a limit on the number of tokens they can process in a single request.

Key points:
- We're using the [`granite-3.2`](https://huggingface.co/ibm-granite/granite-3.2-2b-instruct) model, which has a context window of 128K tokens.
- Tokenization can vary between models, so we use the specific tokenizer for our chosen model.

Understanding token count helps us optimize our prompts and ensure we're using the model efficiently.

In [12]:
from transformers import AutoTokenizer

model_path = "ibm-granite/granite-3.2-2b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)

Your model uses the tokenizer GPT2TokenizerFast
Your text for summarization has 2863 tokens. 
Your book has 184361 tokens and 644843 chars.
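
The context-window limit described in this section can be checked before a prompt is sent. The following is a minimal sketch, not part of the original notebook: `token_budget` and its `reserved_for_output` parameter are hypothetical names, and `str.split` stands in for a real tokenizer so the example runs without downloading a model. In the notebook, `tokenizer.tokenize` could be passed instead.

```python
from typing import Callable


def token_budget(
    text: str,
    tokenize: Callable[[str], list[str]],
    context_window: int = 128_000,
    reserved_for_output: int = 2_000,
) -> tuple[int, bool]:
    """Return (token_count, fits) for a prompt measured against a context window.

    `reserved_for_output` is an assumed safety margin that leaves room for
    the tokens the model generates in its response.
    """
    count = len(tokenize(text))
    return count, count <= context_window - reserved_for_output


# Illustration only: whitespace splitting is NOT how Granite tokenizes text.
count, fits = token_budget(
    "I went to the woods because I wished to live deliberately",
    str.split,
)
print(f"{count} tokens, fits in context: {fits}")
```

Because tokenization differs between models, the count is only meaningful when the tokenizer matches the model being served.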


### Summary of Summaries

Here we use a hierarchical abstractive summarization technique to adapt to the context length of the model. Our approach uses [Docling](https://ds4sd.github.io/docling/) to understand the document's structure, chunk the document into text passages, and group those passages by chapter so that each chapter can be summarized on its own.
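
The overall shape of the technique can be sketched in a few lines. This is an illustrative outline under stated assumptions, not the notebook's implementation: `summarize_hierarchically` is a hypothetical name, and `fake_summarize` is a stand-in for a model call that simply keeps the first sentence of each document so the example runs offline.

```python
from typing import Callable

Document = dict[str, str]


def summarize_hierarchically(
    sections: list[Document],
    summarize: Callable[[str, list[Document]], str],
) -> str:
    """Summarize each section on its own, then summarize the summaries."""
    # Map step: one summary per section, so each call stays within the context window.
    partials = [
        {"title": section["title"], "text": summarize("Summarize this chapter.", [section])}
        for section in sections
    ]
    # Reduce step: a single call over the much shorter section summaries.
    return summarize("Combine these chapter summaries.", partials)


def fake_summarize(instruction: str, docs: list[Document]) -> str:
    """Hypothetical stand-in for a model: keep the first sentence of each document."""
    return " ".join(doc["text"].split(".")[0].strip() + "." for doc in docs)


sections = [
    {"title": "Economy", "text": "Thoreau builds a cabin. He lists his costs."},
    {"title": "Solitude", "text": "He enjoys being alone. Visitors are rare."},
]
result = summarize_hierarchically(sections, fake_summarize)
print(result)
```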

In [6]:
import itertools
from typing import Iterator, Callable
from docling.document_converter import DocumentConverter
from docling_core.transforms.chunker.hierarchical_chunker import HierarchicalChunker
from docling_core.transforms.chunker.base import BaseChunk

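def chunk_document(source: str, *, dropwhile: Callable[[BaseChunk], bool] = lambda c: False, takewhile: Callable[[BaseChunk], bool] = lambda c: True) -> Iterator[BaseChunk]:
    """Convert the document with Docling and chunk it hierarchically"""
    # Reconstructed from the way chunk_document is called later in this notebook
    converter = DocumentConverter()
    chunks = HierarchicalChunker().chunk(converter.convert(source=source).document)
    return itertools.takewhile(takewhile, itertools.dropwhile(dropwhile, chunks))

def chunk_headings(chunk: BaseChunk) -> list[str]:
    """Headings used to group chunks; assumed here to be the book title and chapter levels"""
    return chunk.meta.headings[:2]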

def merge_chunks(chunks: Iterator[BaseChunk], *, headings: Callable[[BaseChunk], list[str]] = lambda c: c.meta.headings) -> Iterator[dict[str, str]]:
    """Merge chunks having the same headings"""
    prior_headings: list[str] | None = None
    document: dict[str, str] = {}
    for chunk in chunks:
        text = chunk.text.replace('\r\n', '\n')
        current_headings = headings(chunk)
        if prior_headings != current_headings:
            if document:
                yield document
            prior_headings = current_headings
            document = {'title': " - ".join(current_headings), 'text': text}
        else:
            document['text'] += f"\n\n{text}"
    if document:
        yield document

def chunk_dropwhile(chunk: BaseChunk) -> bool:
    """Ignore front matter prior to the book start"""
    return "WALDEN" not in chunk.meta.headings

def chunk_takewhile(chunk: BaseChunk) -> bool:
    """Ignore remaining chunks once we see this heading"""
    return "ON THE DUTY OF CIVIL DISOBEDIENCE" not in chunk.meta.headings
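
To see what the trimming and merging steps do without downloading the book, here is a small self-contained sketch. It is an assumption-laden illustration: `make_chunk` builds hypothetical stand-ins that model only the `.text` and `.meta.headings` attributes of a Docling chunk, and `itertools.groupby` is used as an alternative formulation of the merge step rather than the `merge_chunks` function itself.

```python
import itertools
from types import SimpleNamespace


def make_chunk(headings: list[str], text: str) -> SimpleNamespace:
    """Hypothetical stand-in for a Docling chunk (only .text and .meta.headings)."""
    return SimpleNamespace(text=text, meta=SimpleNamespace(headings=headings))


chunks = [
    make_chunk(["License"], "Front matter."),
    make_chunk(["WALDEN", "Economy"], "First passage."),
    make_chunk(["WALDEN", "Economy"], "Second passage."),
    make_chunk(["WALDEN", "Solitude"], "Third passage."),
    make_chunk(["ON THE DUTY OF CIVIL DISOBEDIENCE"], "A different work."),
]

# Skip everything before the book starts, then stop when the next work begins.
body = itertools.takewhile(
    lambda c: "ON THE DUTY OF CIVIL DISOBEDIENCE" not in c.meta.headings,
    itertools.dropwhile(lambda c: "WALDEN" not in c.meta.headings, chunks),
)

# Merge consecutive chunks that share the same headings into one document.
documents = [
    {"title": " - ".join(headings), "text": "\n\n".join(c.text for c in group)}
    for headings, group in itertools.groupby(body, key=lambda c: tuple(c.meta.headings))
]

for document in documents:
    print(document["title"], "->", repr(document["text"]))
```

Only consecutive chunks are merged, which matches the reading order of a book where a chapter's passages are contiguous.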

Divide the full text into smaller passages for separate processing.


In [8]:
documents: list[dict[str, str]] = list(merge_chunks(
    chunk_document(
        "https://www.gutenberg.org/cache/epub/205/pg205-images.html",
        dropwhile=chunk_dropwhile,
        takewhile=chunk_takewhile,
    ),
    headings=chunk_headings,
))

print(f"{len(documents)} documents created")
print(f"Max document size: {max(len(tokenizer.tokenize(document['text'])) for document in documents)} tokens")

The text is 184361 tokens.
Chunk count: 57
Max chunk tokens: 3351


## Summarize the chunks

Here we define a function that generates a response from a list of documents and a user prompt about those documents.

We create the prompt according to the [Granite Prompting Guide](https://www.ibm.com/granite/docs/models/granite/#chat-template) and provide the documents using the `documents` parameter.

In [9]:
def generate(user_prompt: str, documents: list[dict[str, str]]):
    """Use the chat template to format the prompt"""
    prompt = tokenizer.apply_chat_template(
        conversation=[{
            "role": "user",
            "content": user_prompt,
        }],
        documents=documents, # This uses the documents support in the Granite chat template
        add_generation_prompt=True,
        tokenize=False,
    )

    print(f"Input size: {len(tokenizer.tokenize(prompt))} tokens")
    output = model.invoke(prompt)
    print(f"Output size: {len(tokenizer.tokenize(output))} tokens")

    return output


1. Prompt size: 3240 tokens
1. Output size: 273 tokens
Summary 1:
It seems like the text is discussing the concept of servitude and how people are often so focused on their daily tasks and responsibilities that they lose sight of what truly matters in life. The author suggests that many people are trapped in a "fool's life" due to their blind obedience to societal norms and expectations.

The author uses the example of Deucalion and Pyrrha, who created men by throwing stones over their heads behind them, to illustrate how people often follow outdated or harmful beliefs without questioning their validity. The author also mentions that many people are so occupied with their daily labors that they do not have time to reflect on their lives or pursue more meaningful endeavors.

The author then goes on to discuss the idea of preserving one's finest qualities, such as honesty and integrity, by treating oneself and others with kindness and understanding. The author suggests that it is never t

For each chapter, we create a separate summary. This can take a few minutes.

In [10]:
if get_env_var('GRANITE_TESTING', 'false').lower() == 'true':
    documents = documents[:5] # shorten testing work

user_prompt = """\
Using only the book chapter document, compose a summary of the book chapter.
Your response should only include the summary. Do not provide any further explanation."""

summaries: list[dict[str, str]] = []

for document in documents:
    print(f"============================= {document['title']} =============================")
    output = generate(user_prompt, [document])
    summaries.append({'title': document['title'], 'text': output})

print("Summary count: " + str(len(summaries)))

Prompt size: 1638 tokens
The text discusses various themes related to human life, including the importance of simplicity, self-sufficiency, and engaging in meaningful work. It criticizes societal norms and expectations that often lead people to focus on superficial matters and material possessions, rather than genuine culture and personal growth. The author suggests that humans should strive for a more authentic way of living, connected to nature and their surroundings, and that true wisdom can be acquired through practical experiences and living life fully.

The text also emphasizes the value of labor and the importance of engaging in creative and meaningful work. It argues that people should not rely on professionals for tasks like building a home, but rather take pleasure in constructing their own houses. The author criticizes the current system of education, stating that it often focuses on trivial matters rather than important things.

Additionally, the text discusses the idea of 

## Create the Final Summary

Now we need to summarize the chapter summaries. We prompt the model to create a unified summary of the chapter summaries we previously generated.
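
If the chapter summaries together were ever too large for a single prompt, they could be grouped into batches and reduced in more than one pass. The sketch below shows one way to do that grouping. It is an assumption, not the notebook's method: `batch_by_budget` and `count_words` are hypothetical names, and word counts stand in for real token counts so the example runs offline.

```python
from typing import Callable

Document = dict[str, str]


def batch_by_budget(
    summaries: list[Document],
    count_tokens: Callable[[str], int],
    budget: int,
) -> list[list[Document]]:
    """Greedily pack summaries, in order, into batches of at most `budget` tokens.

    A single summary larger than the budget still gets a batch of its own.
    """
    batches: list[list[Document]] = []
    current: list[Document] = []
    used = 0
    for summary in summaries:
        size = count_tokens(summary["text"])
        if current and used + size > budget:
            batches.append(current)
            current, used = [], 0
        current.append(summary)
        used += size
    if current:
        batches.append(current)
    return batches


def count_words(text: str) -> int:
    """Illustrative stand-in for a tokenizer-based count."""
    return len(text.split())


demo = [
    {"title": f"Chapter {i}", "text": "word " * n}
    for i, n in enumerate([3, 4, 2, 5], start=1)
]
batches = batch_by_budget(demo, count_words, budget=7)
print([[s["title"] for s in batch] for batch in batches])
```

Each batch could then be passed to the summarization step, and the resulting batch summaries combined in a final pass.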

In [14]:
user_prompt = """\
Using only the book chapter summary documents, compose a single, unified summary of the book.
Your response should only include the unified summary. Do not provide any further explanation."""

output = generate(user_prompt, summaries)
print(output)

1. Prompt size: 3240 tokens
1. Output size: 248 tokens
Summary 1:
The text discusses the idea of how many people live their lives in a way that is not truly fulfilling or meaningful, often due to societal pressures and expectations. It uses the metaphor of being enslaved by various "masters" to describe this situation. The author suggests that people should not be content with simply going through the motions of life and should strive for a more authentic existence.

The text also mentions how many people are too focused on material possessions and superficial concerns, which prevents them from truly living and experiencing life's finer fruits. It argues that true integrity and growth require time and space to reflect and learn, but these are often lacking in modern society due to the demands of work and other responsibilities.

The author also discusses the idea of prejudice and how it can prevent people from seeing things clearly. They suggest that it is never too late to challenge o

In [19]:
summaries_of_summaries = []
prompt_summary_template = prompt_guide_template.format(prompt="""\
Summarize the following text using only the information found in the text:
{text}
""")

for i in range(0, len(summaries), 10):
    text = "\n\n".join(s['text'] for s in summaries[i:i+10]) # summaries holds dicts, so join their text fields
    prompt = prompt_summary_template.format(text=text)
    print(f"{i + 1}. Prompt size: {len(tokenizer.tokenize(prompt))} tokens")
    output = model.invoke(
        prompt,
        model_kwargs={
            "max_tokens": 2000, # Set the maximum number of tokens to generate as output.
            "min_tokens": 200, # Set the minimum number of tokens to generate as output.
            "temperature": 0.75,
            "system_prompt": "You are a helpful assistant.",
            "presence_penalty": 0,
            "frequency_penalty": 0
        }
    )
    print(f"{i + 1}. Output size: {len(tokenizer.tokenize(output))} tokens")
    summary = f"Summary {i+1}:\n{output}\n\n"
    summaries_of_summaries.append(summary)
    print(summary)

print(f"Summary of summaries count: {len(summaries_of_summaries)}")
summary_of_summaries_contents = "\n\n".join(summaries_of_summaries)
print(f"Total: {len(tokenizer.tokenize(summary_of_summaries_contents))} tokens")

1. Prompt size: 2544 tokens
1. Output size: 234 tokens
Summary 1:
Summary 1: The text discusses the concept of ownership and the cost of living in a civilized society versus a savage one. The author argues that many people in civilized societies are poor because they have become trapped in a system where they must work long hours to pay for the upkeep of their labor, which can be a burdensome encumbrance as belonging to the cost of his dwelling, and all this while the farmer’s house is usually the value of the land itself. When we consider these things, the author is discussing the concept of ownership and the cost of living in different societies. The author argues that many people in civilized societies are often on material possessions can lead to astray from our true needs and values.

The author also mentions that the cost of owning a house is not just financial, but also includes the time and effort required to maintain it. The author suggests that we should focus on improving ou

In [20]:
prompt = prompt_guide_template.format(prompt=f"""
A text was summarized in separate passages; those passage summaries are provided below.

{summary_of_summaries_contents}

From these summaries alone, compose a single, unified summary of the text.
""")
print(f"Prompt size: {len(tokenizer.tokenize(prompt))} tokens")
output = model.invoke(
    prompt,
    model_kwargs={
        "max_tokens": 2000, # Set the maximum number of tokens to generate as output.
        "min_tokens": 500, # Set the minimum number of tokens to generate as output.
        "temperature": 0.75,
        "system_prompt": "You are a helpful assistant.",
        "presence_penalty": 0,
        "frequency_penalty": 0
    }
    )

print(output)

Prompt size: 1058 tokens
The text explores various themes related to ownership, nature, and societal expectations. The author discusses the concept of ownership in civilized societies, arguing that many people are trapped in a system where they must work long hours to pay for their labor and material possessions, which can lead them astray from their true needs and values. They suggest focusing on personal growth before pursuing material success.

In contrast, the author expresses fondness for nighttime sounds in nature, such as the hooting owl, bullfrogs, and dogs, finding them expressive of undeveloped nature and ancient spirits. They also mention observing various creatures around a pond or lake, such as frogs, bugs, fish, and birds, and reflect on the relationship between humans and nature.

The author's personal experiences living off the grid, using fire for warmth, and their perspective on life, nature, and human relationships are also discussed. They emphasize the importance of