# Text Summarization

This notebook demonstrates two methods for summarizing text with LLMs:
- Stuff: "stuff" the whole document into a single prompt
- Iterative Refinement: split the document into a sequence of chunks and build the summary iteratively, adding one chunk at a time

In [1]:
import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
  os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

from openai import OpenAI

In [3]:
client = OpenAI()

## Stuff Method

This is the simplest summarization method: the entire text of the document is included in the prompt. It only works if the document is short enough to fit in the selected model's context window.
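Whether a document fits can be estimated before any request is sent. The sketch below is an assumption-laden approximation, not a tokenizer: `estimate_tokens` and `fits_in_context` are hypothetical helpers, the four-characters-per-token ratio is only a rough rule of thumb for English text, and the default `context_window` of 128,000 tokens should be checked against the documentation for the model actually in use.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; chars_per_token is a heuristic, not a real tokenizer."""
    return int(len(text) / chars_per_token) + 1


def fits_in_context(text: str, context_window: int = 128_000,
                    reserved_for_output: int = 4_000) -> bool:
    """Return True if the estimated prompt size leaves room for the model's reply."""
    # context_window and reserved_for_output are assumed values; adjust per model
    return estimate_tokens(text) <= context_window - reserved_for_output
```

Because the estimate is coarse, a document close to the limit may still be rejected; for exact counts a real tokenizer would be needed.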

In [8]:
def stuff_summary_prompt(info_file) :
    '''Load info_file and add to summarization prompt.'''

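    # read input file and strip surrounding whitespace from each line
    with open(info_file, 'r') as f :
        lines = f.readlines()
    # join into a single string; interpolating the list would embed its Python repr
    info = "\n".join(line.strip() for line in lines)

    prompt = f"Write a detailed summary of the following:\n\n{info}"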

    messages = [
        {
            "role": "user",
            "content": prompt
        }
    ]
    
    return messages

The US Constitution is short enough for this approach, and the model returns a detailed, article-by-article summary.

In [10]:
# get messages
usconst_file = '../data/us_constitution.txt'
messages = stuff_summary_prompt(usconst_file)

# send prompt to LLM
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)
completion_message = completion.choices[0].message

print(completion_message.content)

The Constitution of the United States is the supreme law of the land, established on September 17, 1787, to form a more perfect union among the states. Its preamble outlines its purposes, namely to establish justice, ensure domestic tranquility, provide for national defense, promote general welfare, and secure liberty for current and future generations.

**Article I** grants all legislative powers to a bicameral Congress, composed of the Senate and the House of Representatives. Key provisions include the direct election of Representatives every two years, eligibility criteria for Members of the House and Senate, the apportionment of Representatives, and the powers and responsibilities of both chambers, including the initiation of revenue bills and the conduct of impeachment.

**Article II** establishes the executive branch, headed by the President, who is elected for a four-year term alongside the Vice President. It delineates their powers, including being the commander in chief of the

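Asking for a summary of a full novel fails because the request is too large. The prompt comes to about 253,000 tokens, which exceeds the organization's limit of 200,000 tokens per minute, so the API rejects it with a 429 error.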

In [13]:
# get messages
fotr_file = '../data/LotR/FotR.txt'
messages = stuff_summary_prompt(fotr_file)

# send prompt to LLM
try :
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages
    )
    completion_message = completion.choices[0].message

    print(completion_message.content)
except Exception as e :
    print(e)

Error code: 429 - {'error': {'message': 'Request too large for gpt-4o-mini in organization org-O3XtokNjayTiOvXFMHAw1nS1 on tokens per min (TPM): Limit 200000, Requested 252930. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.', 'type': 'tokens', 'param': None, 'code': 'rate_limit_exceeded'}}


## Iterative Refinement

### TO DO
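A minimal sketch of the approach described in the introduction, under stated assumptions: `split_into_chunks`, `refine_summary`, and the `ask_llm` callable are hypothetical names, the refinement prompt wording is illustrative, and chunks are cut as fixed-size character windows for simplicity (a more careful splitter would likely break on paragraph or sentence boundaries).

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int = 8_000) -> List[str]:
    """Cut text into consecutive windows of at most max_chars characters."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def refine_summary(chunks: List[str], ask_llm: Callable[[str], str]) -> str:
    """Summarize the first chunk, then revise that summary with each later chunk."""
    summary = ""
    for i, chunk in enumerate(chunks):
        if i == 0:
            # the first chunk is summarized on its own
            prompt = f"Write a detailed summary of the following:\n\n{chunk}"
        else:
            # later chunks are folded into the running summary
            prompt = (
                "Here is a summary of a document up to this point:\n\n"
                f"{summary}\n\n"
                "Refine the summary using the additional text below. "
                "If the new text adds nothing, return the summary unchanged.\n\n"
                f"{chunk}"
            )
        summary = ask_llm(prompt)
    return summary


# Hypothetical wiring to the client created earlier in this notebook:
# def ask_llm(prompt):
#     completion = client.chat.completions.create(
#         model="gpt-4o-mini",
#         messages=[{"role": "user", "content": prompt}],
#     )
#     return completion.choices[0].message.content
```

Passing the model call in as `ask_llm` keeps the loop testable without network access. Each request carries one chunk plus the running summary, so `max_chars` would need to be chosen so that every individual request stays under both the context window and the tokens-per-minute limit.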