# Summarization 

When summarizing with an LLM, the most effective lever is prompt engineering: the wording of the prompt largely determines the quality of the summary.

In [12]:
import os 
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv(), override=True)

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.schema import (AIMessage, HumanMessage, SystemMessage)

## Summarizing with a Basic Prompt

In [11]:
text = """Embark on a voyage of a lifetime with One Piece. The epic anime series created by renowned mangaka Eiichiro Oda is a global phenomenon, captivating the hearts of fans across generations throughout its 25-year span. This thrilling high seas adventure is filled with unwavering friendship, epic battles for freedom, and the relentless pursuit of dreams. Join Monkey D. Luffy and his lovable pirate crew as they discover the true meaning of power and justice in this great pirate era.

Monkey D. Luffy refuses to let anyone or anything stand in the way of his quest to become King of the Pirates. With his rubber-like stretching powers granted by the supernatural Devil Fruit, the spirited young pirate seeks the legendary treasure known as the One Piece. He’ll chart a course for the treacherous waters of the Grand Line and recruit a motley crew to build his Straw Hat Pirates one bond at a time. This is one captain who’ll never drop anchor until he and his friends all reach their dreams!

One Piece boasts more than 1100 episodes. Currently in the Egghead Arc, the Straw Hats finally meet the long awaited Dr. Vegapunk on Egghead Island. Crunchyroll is home to every subbed episode and also every English dubbed episode, with over 1000 and counting. In addition, One Piece has 13 television specials and 15 movies, with the latest, One Piece Film Red, being the highest-grossing film in the franchise. One Piece is produced by Toei Animation."""

messages = [
    SystemMessage(content="You are an expert copywriter with expertise in summarizing documents."),
    HumanMessage(content=f"Please provide a short and concise summary of the following text: \n {text}")
]

llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=1)

print("number of tokens: ", llm.get_num_tokens(text))

output = llm.invoke(messages)
print("Basic Summarization: ", output.content)

number of tokens:  309
Basic Summarization:  One Piece is a global phenomenon anime series by Eiichiro Oda, spanning 25 years. It follows Monkey D. Luffy, who gains rubber-like powers from a Devil Fruit, on his high-seas adventure to become King of the Pirates and find the legendary One Piece treasure. Alongside his loyal Straw Hat Pirates, he navigates the Grand Line, fighting for friendship, freedom, and dreams. The extensive franchise, produced by Toei Animation, includes over 1100 episodes (available on Crunchyroll, currently in the Egghead Arc), 15 movies, and 13 TV specials.


## Summarizing with Prompt Template

In [14]:
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain.prompts import PromptTemplate

In [18]:
template = ''' 
write a concise and short summary of the following text: 
TEXT: `{text}`
Translate the summary into {language}
'''

Prompt = PromptTemplate(
    input_variables=["text", "language"],  # keyword is plural: input_variables
    template=template
)

llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=1)

print("Number of tokens : ", llm.get_num_tokens(Prompt.format(text=text, language="English")))

chain = LLMChain(llm=llm, prompt=Prompt)

summary = chain.invoke({"text": text, "language": "Nepali"})

print("summary: ", summary)

print(summary["text"])


Number of tokens :  334
summary:  {'text': '**Concise Summary:**\n\n"One Piece" is a 25-year-old epic anime series by Eiichiro Oda, a global phenomenon. It follows Monkey D. Luffy, a rubber-powered pirate, and his Straw Hat crew on a high-seas adventure filled with friendship and battles. Their quest is to find the legendary One Piece treasure and fulfill Luffy\'s dream of becoming the King of the Pirates. The series boasts over 1100 episodes, along with multiple movies and specials.\n\n---\n\n**Nepali Translation:**\n\n"वन पीस" (One Piece) इइचिरो ओडाद्वारा सिर्जना गरिएको २५ वर्ष पुरानो एक विश्वव्यापी प्रसिद्ध एनिमे शृङ्खला हो। यसले रबर-शक्ति भएको समुद्री डाकू मन्की डी. लफी (Monkey D. Luffy) र उनको स्ट्र ह्याट (Straw Hat) टोलीको समुद्री यात्रालाई पछ्याउँछ, जुन मित्रता र लडाइँले भरिएको छ। उनीहरूको खोज पौराणिक वन पीस खजाना पत्ता लगाउने र लफीलाई समुद्री डाकुहरूको राजा बन्ने सपना पूरा गर्ने हो। यो शृङ्खलामा ११०० भन्दा बढी एपिसोडहरू, धेरै चलचित्रहरू र विशेष कार्यक्रमहरू छन्।', 'language': '
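
To make the template mechanics concrete, here is a minimal sketch that uses only Python's standard library rather than LangChain itself. It lists the placeholders found in the template with `string.Formatter` and fills them with `str.format`, which is the same substitution idea that `PromptTemplate.format` applies to f-string-style templates. The helper name `placeholder_names` and the sample sentence are illustrative assumptions, not part of the notebook.

```python
from string import Formatter

# Same template text as in the cell above.
template = '''
write a concise and short summary of the following text:
TEXT: `{text}`
Translate the summary into {language}
'''


def placeholder_names(template: str) -> list[str]:
    """Return the named placeholders of an f-string-style template, in order."""
    names = []
    for _, field_name, _, _ in Formatter().parse(template):
        # field_name is None for trailing literal text with no placeholder.
        if field_name and field_name not in names:
            names.append(field_name)
    return names


variables = placeholder_names(template)
# Illustrative input only; any text and language could be substituted.
filled = template.format(text="One Piece is a pirate adventure.", language="Nepali")

print(variables)
print(filled)
```

Every placeholder found this way must be supplied when the prompt is formatted or the chain is invoked; a missing one raises a `KeyError` in plain `str.format`.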

## Summarizing Using StuffDocumentsChain

In [59]:
from langchain.chains.summarize import load_summarize_chain
from langchain.docstore.document import Document

with open("./files/sj.txt", "r", encoding="utf-8") as f:
    text = f.read()
    # print(text)

docs = [Document(page_content=text)]
prompt_template = """Write a concise and very short summary of the following text:
TEXT: `{text}`
"""

prompt = PromptTemplate(
    input_variables=["text"],
    template=prompt_template
)

chain = load_summarize_chain(
    llm,
    chain_type="stuff",
    prompt=prompt,
    verbose=False
)

output = chain.invoke({"text": text})

print(output)


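ValueError: Missing some input keys: {'input_documents'}

The call fails because the stuff chain reads its documents from the `input_documents` key, not from `text`. Pass the `Document` list instead, either as `chain.invoke(docs)` or as `chain.invoke({"input_documents": docs})`; the corrected cell further down does this.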

In [57]:
output["input_documents"]

[Document(metadata={}, page_content='I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.\n\nThe first story is about connecting the dots.\n\nI dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?\n\nIt started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Except that when I popped out they decided at the last minute that they really wanted a girl. So my parents, who were on a waiting list, got 

In [None]:
with open("./files/sj.txt", 'r', encoding="utf-8") as f:
    text = f.read()

# text 
docs = [Document(page_content=text)]
prompt_template = """ Write a concise and short summary of the following text 
TEXT: {text}"""

prompt = PromptTemplate(
    input_variables=['text'],
    template=prompt_template
)

chain = load_summarize_chain(
    llm, 
    chain_type="stuff",
    prompt=prompt,
    verbose=False
)

output_summary = chain.invoke(docs)

In [24]:
print(output_summary)

{'input_documents': [Document(metadata={}, page_content='I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.\n\nThe first story is about connecting the dots.\n\nI dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?\n\nIt started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Except that when I popped out they decided at the last minute that they really wanted a girl. So my parents, who were on 
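
The "stuff" strategy can be pictured as plain string concatenation: the text of every document is joined and placed into a single prompt, so the model sees everything in one call. The sketch below mimics that idea with the standard library only. `Doc`, `stuff_prompt`, the separator default, and the sample strings are hypothetical stand-ins, not LangChain classes, and no model is called.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a document with a page_content field."""
    page_content: str


PROMPT = "Write a concise and short summary of the following text\nTEXT: {text}"


def stuff_prompt(docs: list[Doc], separator: str = "\n\n") -> str:
    """Join all documents and place them into ONE prompt (the 'stuff' idea)."""
    joined = separator.join(doc.page_content for doc in docs)
    return PROMPT.format(text=joined)


# Placeholder documents; the notebook uses a single Document holding the speech.
docs = [Doc("First part of the speech."), Doc("Second part of the speech.")]
single_prompt = stuff_prompt(docs)
print(single_prompt)
```

Because everything ends up in one prompt, this approach only works while the combined text fits in the model's context window. Longer inputs need a strategy that splits the text first.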

## Summarizing Large Documents Using map_reduce



In [27]:
from langchain.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.chains.summarize import load_summarize_chain
from langchain.text_splitter import RecursiveCharacterTextSplitter


In [28]:
with open("./files/sj.txt", encoding="utf-8") as f:
    text = f.read()

llm = ChatGoogleGenerativeAI(model='gemini-2.5-flash', temperature=1)


In [29]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
chunks = text_splitter.create_documents([text])

In [30]:
len(chunks)

17
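
As a rough picture of what `chunk_size` and `chunk_overlap` mean, the sketch below cuts a string into fixed-size windows in which each window repeats the last few characters of the previous one. This is a deliberate simplification: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, so its chunks will not match these. The function `split_with_overlap` and the synthetic sample string are hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of at most chunk_size characters that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Synthetic 2500-character sample so the arithmetic is easy to follow.
sample = "".join(str(i % 10) for i in range(2500))
pieces = split_with_overlap(sample, chunk_size=1000, chunk_overlap=50)
print(len(pieces), [len(p) for p in pieces])
```

The overlap means a sentence cut at a chunk boundary still appears, at least partly, in the neighbouring chunk, at the cost of sending some text to the model twice.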

In [31]:
chain = load_summarize_chain(
    llm,
    chain_type="map_reduce",
    verbose=False
)
output_summary = chain.invoke(chunks)


Retrying langchain_google_genai.chat_models._chat_with_retry.<locals>._chat_with_retry in 2.0 seconds as it raised ResourceExhausted: 429 You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.
* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 10
Please retry in 11.545160538s. [violations {
  quota_metric: "generativelanguage.googleapis.com/generate_content_free_tier_requests"
  quota_id: "GenerateRequestsPerMinutePerProjectPerModel-FreeTier"
  quota_dimensions {
    key: "location"
    value: "global"
  }
  quota_dimensions {
    key: "model"
    value: "gemini-2.5-flash"
  }
  quota_value: 10
}
, links {
  description: "Learn more about Gemini API quotas"
  url: "https://ai.google.dev/gemini-api/docs/rate-limits"
}
, retry_delay {
  seconds: 11
}
].
Retrying langchain_google_genai.chat_models._chat_with_retry.<lo
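
The log above reports a free-tier quota of 10 requests per minute for this model, while the map step alone sends one request per chunk (17 here). The client retried automatically, but requests can also be spaced out up front. The following is a minimal standard-library sketch of that idea, not LangChain's or Google's API: `Throttle` is a hypothetical class that enforces a minimum interval between calls, with the clock and sleep functions injectable so the behaviour can be checked without waiting.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum interval between calls (hypothetical helper)."""

    def __init__(
        self,
        max_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_per_minute <= 0:
            raise ValueError("max_per_minute must be positive")
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.min_interval - (now - self._last_call))
        if delay > 0:
            self._sleep(delay)
        self._last_call = now + delay
        return delay


# Demonstration with a fake clock so nothing actually sleeps.
fake_now = [0.0]


def fake_sleep(seconds: float) -> None:
    fake_now[0] += seconds


throttle = Throttle(max_per_minute=10, clock=lambda: fake_now[0], sleep=fake_sleep)
delays = [throttle.wait() for _ in range(3)]
print(delays)
```

This only helps when you drive the requests yourself, for example by summarizing chunks one at a time in a loop and calling `throttle.wait()` before each model request.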

In [32]:
print(output_summary)

{'input_documents': [Document(metadata={}, page_content='I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.\n\nThe first story is about connecting the dots.\n\nI dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?'), Document(metadata={}, page_content='It started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Except that when I popped out they decided at the last minute that they really wanted

In [34]:
output_summary["input_documents"]

[Document(metadata={}, page_content='I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.\n\nThe first story is about connecting the dots.\n\nI dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?'),
 Document(metadata={}, page_content='It started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Except that when I popped out they decided at the last minute that they really wanted a girl. So my pare

In [None]:
llm.get_num_tokens(text)  # get_num_tokens needs the text to count

In [35]:
for doc in output_summary["input_documents"]:
    print(doc.page_content)

I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.

The first story is about connecting the dots.

I dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?
It started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Except that when I popped out they decided at the last minute that they really wanted a girl. So my parents, who were on a waiting list, got a call in the middle of the night asking: “W

In [36]:
chain.llm_chain.prompt.template

'Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'

In [38]:
chain.combine_document_chain.llm_chain.prompt.template

'Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'
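
The two templates printed above correspond to the two phases of map_reduce: the first is applied to every chunk independently, the second to the concatenated chunk summaries. A minimal sketch of that control flow follows, with a hypothetical `fake_llm` function standing in for the model so that it runs offline. The real chain may additionally collapse intermediate summaries in several rounds when they are too long, which this sketch omits.

```python
from typing import Callable

MAP_PROMPT = 'Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'
COMBINE_PROMPT = MAP_PROMPT  # the two default templates printed above are identical


def map_reduce_summarize(chunks: list[str], llm: Callable[[str], str]) -> str:
    """Summarize each chunk, then summarize the joined partial summaries."""
    # Map phase: one model call per chunk.
    partial = [llm(MAP_PROMPT.format(text=chunk)) for chunk in chunks]
    # Reduce phase: one call over the joined partial summaries.
    return llm(COMBINE_PROMPT.format(text="\n\n".join(partial)))


calls: list[str] = []


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model: records the prompt, returns a stub."""
    calls.append(prompt)
    return f"summary #{len(calls)}"


chunks = ["chunk one", "chunk two", "chunk three"]
final = map_reduce_summarize(chunks, fake_llm)
print(len(calls), final)
```

Counting the recorded prompts makes the cost visible: N chunks lead to at least N + 1 model calls, which is why a 17-chunk document can run into a per-minute request quota.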

## Map-Reduce with Custom Prompts

In [41]:
map_prompt = '''
Write a short and concise summary of the following:
TEXT: {text}
CONCISE SUMMARY:
'''

# Prompt applied to each chunk in the map step
map_prompt_template = PromptTemplate(
    input_variables=["text"],
    template=map_prompt
)

In [40]:
combine_prompt = '''
Write a concise summary of the following text that covers the key points.

Add a title to the summary.
Start the summary with an INTRODUCTION PARAGRAPH that gives an overview of
the topic, FOLLOWED by BULLET POINTS if possible, AND end the summary with a CONCLUSION PHRASE.

TEXT: {text}
'''

# Prompt applied to the combined chunk summaries in the reduce step
combine_prompt_template = PromptTemplate(
    template=combine_prompt, input_variables=['text']
)

In [43]:
summary_chain = load_summarize_chain(
    llm=llm,
    chain_type="map_reduce",
    map_prompt=map_prompt_template,
    combine_prompt=combine_prompt_template,
    verbose=False
)

output = summary_chain.invoke(chunks)


Retrying langchain_google_genai.chat_models._chat_with_retry.<locals>._chat_with_retry in 2.0 seconds as it raised ResourceExhausted: 429 You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.
* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 10
Please retry in 7.124056418s. [violations {
  quota_metric: "generativelanguage.googleapis.com/generate_content_free_tier_requests"
  quota_id: "GenerateRequestsPerMinutePerProjectPerModel-FreeTier"
  quota_dimensions {
    key: "location"
    value: "global"
  }
  quota_dimensions {
    key: "model"
    value: "gemini-2.5-flash"
  }
  quota_value: 10
}
, links {
  description: "Learn more about Gemini API quotas"
  url: "https://ai.google.dev/gemini-api/docs/rate-limits"
}
, retry_delay {
  seconds: 7
}
].
Retrying langchain_google_genai.chat_models._chat_with_retry.<loca

In [44]:
print(output)

{'input_documents': [Document(metadata={}, page_content='I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.\n\nThe first story is about connecting the dots.\n\nI dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?'), Document(metadata={}, page_content='It started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Except that when I popped out they decided at the last minute that they really wanted

In [48]:
for i in output['input_documents']:
    print(i.page_content)

I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.

The first story is about connecting the dots.

I dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?
It started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Except that when I popped out they decided at the last minute that they really wanted a girl. So my parents, who were on a waiting list, got a call in the middle of the night asking: “W
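
Printing the whole result buries the summary under the echoed `input_documents`. Assuming the chain uses its default output key, `output_text`, the summary alone can be read from the returned dict. The dict below is a hand-written stand-in with placeholder strings, not real model output, and `extract_summary` is a hypothetical helper.

```python
# Hand-written stand-in shaped like the dict the summarize chain returns.
# The key name "output_text" is assumed to be the chain's default output key.
result = {
    "input_documents": ["<chunk 1>", "<chunk 2>"],
    "output_text": "<final summary>",
}


def extract_summary(result: dict, key: str = "output_text") -> str:
    """Return only the summary text from a chain result dict."""
    if key not in result:
        raise KeyError(f"expected {key!r} in result, got {sorted(result)}")
    return result[key]


summary_only = extract_summary(result)
print(summary_only)
```

In the notebook, the equivalent would be `print(output["output_text"])` in place of `print(output)`.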