# Map-Reduce to Summarize Large Documents
A two-step summarization method inspired by the MapReduce paradigm:

- Map: Generate individual summaries of smaller chunks or documents.

- Reduce: Aggregate those individual summaries into a final overall summary.

## How it works:

- Step 1 (Map): Split the text into chunks small enough for the model's context window and summarize each chunk independently.

- Step 2 (Reduce): Combine the chunk summaries and summarize them again into a final global summary.
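
The two steps above can be sketched in plain Python without any LLM. This is only an illustration of the control flow: `split_into_chunks`, `map_reduce_summarize`, and the toy `first_sentence` summarizer are hypothetical names introduced here, and `first_sentence` merely stands in for a real model call.

```python
from typing import Callable, List


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Cut text into consecutive pieces of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def map_reduce_summarize(
    text: str, summarize: Callable[[str], str], chunk_size: int = 1000
) -> str:
    chunks = split_into_chunks(text, chunk_size)
    # Map: summarize every chunk independently of the others.
    partial_summaries = [summarize(chunk) for chunk in chunks]
    # Reduce: summarize the concatenation of the partial summaries.
    return summarize("\n".join(partial_summaries))


def first_sentence(text: str) -> str:
    """Toy stand-in for an LLM call: keep only the first sentence."""
    return text.split(".")[0].strip() + "."


demo_text = "Dream big. Work hard. " * 3
demo_summary = map_reduce_summarize(demo_text, first_sentence, chunk_size=22)
print(demo_summary)
```

Swapping `first_sentence` for a function that calls a model gives the same overall shape as the chain built later in this notebook.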

## Pros:

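1. Scales to larger corpora, because each chunk is summarized independently.

2. Handles more content than stuffing (sending the whole text in a single prompt), which is limited by the model's context window.

3. More structured and reliable for long documents, since every chunk contributes to the final summary.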

## Cons:

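- Slower due to multiple model calls: one per chunk, plus at least one for the reduce step.

- Some loss of global context, because each chunk is summarized without seeing the rest of the document.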
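
To make the cost of the extra model calls concrete, here is a rough back-of-the-envelope estimate. It assumes fixed-size character windows and a single reduce call; the helper names and the 5000-character length are hypothetical. A real splitter breaks on separators, and a library may add intermediate reduce calls when the chunk summaries are themselves too long, so treat the numbers as approximate.

```python
import math


def estimate_chunks(text_length: int, chunk_size: int, chunk_overlap: int) -> int:
    """Approximate number of fixed-size windows needed to cover the text."""
    if text_length <= chunk_size:
        return 1
    step = chunk_size - chunk_overlap  # how far each new window advances
    return 1 + math.ceil((text_length - chunk_size) / step)


def estimate_llm_calls(n_chunks: int) -> int:
    """One map call per chunk plus a single reduce call (assumed)."""
    return n_chunks + 1


# Hypothetical 5000-character document with the settings used in this notebook.
n_chunks = estimate_chunks(text_length=5000, chunk_size=1000, chunk_overlap=50)
print(n_chunks, estimate_llm_calls(n_chunks))
```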

In [2]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader

# Load the PDF and split it into Document objects using the loader's default splitter
loader = PyPDFLoader("APJ_Abdul_Kalam_Long_Speech.pdf")
docs = loader.load_and_split()
docs

[Document(metadata={'producer': 'PyFPDF 1.7.2 http://pyfpdf.googlecode.com/', 'creator': 'PyPDF', 'creationdate': 'D:20250613071130', 'source': 'APJ_Abdul_Kalam_Long_Speech.pdf', 'total_pages': 3, 'page': 0, 'page_label': '1'}, page_content='Speech by Dr. A.P.J. Abdul Kalam\nMy dear young friends,\nI am delighted to address you today. It gives me immense pleasure to stand before the bright minds\nof the nation  the youth of India, the true strength of our society.\nDream, dream, dream. Dreams transform into thoughts and thoughts result in action. You have to\ndream before your dreams can come true. Great dreams of great dreamers are always transcended.\nLet me share with you a vision. The vision of a developed India. An India that is free from poverty,\nfree from illiteracy, and free from corruption. An India where every citizen has access to clean water,\nquality education, affordable healthcare, and opportunities for growth. This vision can only be\nrealized when each one of us decid

In [3]:
from langchain.prompts import PromptTemplate

# Generic single-prompt template (the map-reduce chain below uses its own map and combine prompts)
template = """
Write a concise summary of the following speech.
Speech: {text}

"""
prompt = PromptTemplate(input_variables=['text'],
                        template=template)


In [4]:
import os

from dotenv import load_dotenv
from langchain_groq import ChatGroq

# Read GROQ_API_KEY from a local .env file
load_dotenv()
api_key = os.getenv("GROQ_API_KEY")
llm = ChatGroq(api_key=api_key, model="Llama3-8b-8192")

In [6]:
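# Re-split the documents into chunks of at most 1000 characters with a 50-character overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
final_docs = text_splitter.split_documents(docs)
final_docs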

[Document(metadata={'producer': 'PyFPDF 1.7.2 http://pyfpdf.googlecode.com/', 'creator': 'PyPDF', 'creationdate': 'D:20250613071130', 'source': 'APJ_Abdul_Kalam_Long_Speech.pdf', 'total_pages': 3, 'page': 0, 'page_label': '1'}, page_content='Speech by Dr. A.P.J. Abdul Kalam\nMy dear young friends,\nI am delighted to address you today. It gives me immense pleasure to stand before the bright minds\nof the nation  the youth of India, the true strength of our society.\nDream, dream, dream. Dreams transform into thoughts and thoughts result in action. You have to\ndream before your dreams can come true. Great dreams of great dreamers are always transcended.\nLet me share with you a vision. The vision of a developed India. An India that is free from poverty,\nfree from illiteracy, and free from corruption. An India where every citizen has access to clean water,\nquality education, affordable healthcare, and opportunities for growth. This vision can only be\nrealized when each one of us decid
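
The roles of `chunk_size` and `chunk_overlap` can be illustrated with a simplified sliding-window splitter. This is not how `RecursiveCharacterTextSplitter` works internally (it prefers to break on separators such as paragraph and line breaks); `sliding_window_split` is a hypothetical helper that only shows how consecutive chunks share their boundary characters.

```python
from typing import List


def sliding_window_split(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
windows = sliding_window_split(sample, chunk_size=10, chunk_overlap=3)
for window in windows:
    print(window)
```

The last 3 characters of each window reappear at the start of the next one, so a sentence cut at a chunk boundary is still partly visible to the following map call.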

In [None]:
chunks_prompt = """
Please summarize the below speech:
Speech: {text}
Summary:
"""
# Map prompt: applied to each chunk separately
map_prompt_template = PromptTemplate(input_variables=['text'], template=chunks_prompt)

In [10]:
final_prompt = '''
Provide the final summary of the entire speech with these important points.
Add a motivational title, start the summary with an introduction, and provide the summary as numbered
points for the speech.
Speech: {text}
'''
# Combine prompt: applied once to the collected chunk summaries
final_prompt_template = PromptTemplate(input_variables=['text'], template=final_prompt)
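
The sketch below shows how the `{text}` placeholder is filled at each stage, using plain `str.format` rather than LangChain. The templates are abbreviated stand-ins for the two prompts in this notebook, the chunks and chunk summaries are made up, and joining the summaries with a blank line is an assumption about how they are concatenated.

```python
# Abbreviated stand-ins for the map and combine templates in this notebook.
map_template = "Please summarize the below speech:\nSpeech: {text}\nSummary:\n"
combine_template = (
    "Provide the final summary of the entire speech with these important points.\n"
    "Speech: {text}\n"
)

# Hypothetical chunks produced by the text splitter.
chunks = ["First part of the speech.", "Second part of the speech."]

# Map step: each chunk fills {text} in its own copy of the map prompt.
map_prompts = [map_template.format(text=chunk) for chunk in chunks]

# Made-up chunk summaries standing in for the model's responses.
chunk_summaries = ["Summary of part one.", "Summary of part two."]

# Reduce step: the joined chunk summaries fill {text} in the combine prompt.
combine_prompt = combine_template.format(text="\n\n".join(chunk_summaries))
print(combine_prompt)
```

Note that in the reduce step `{text}` holds the chunk summaries, not the original speech.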

In [11]:
from langchain.chains.summarize import load_summarize_chain

summary_chain = load_summarize_chain(
    llm=llm,
    chain_type='map_reduce',
    map_prompt=map_prompt_template,        # summarizes each chunk
    combine_prompt=final_prompt_template,  # merges the chunk summaries into the final summary
    verbose=True
)
# invoke() replaces the deprecated run(); the summary is returned under 'output_text'
output = summary_chain.invoke({"input_documents": final_docs})["output_text"]
output

> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:

Please summarize the below speech:
Speech: Speech by Dr. A.P.J. Abdul Kalam
My dear young friends,
I am delighted to address you today. It gives me immense pleasure to stand before the bright minds
of the nation  the youth of India, the true strength of our society.
Dream, dream, dream. Dreams transform into thoughts and thoughts result in action. You have to
dream before your dreams can come true. Great dreams of great dreamers are always transcended.
Let me share with you a vision. The vision of a developed India. An India that is free from poverty,
free from illiteracy, and free from corruption. An India where every citizen has access to clean water,
quality education, affordable healthcare, and opportunities for growth. This vision can only be
realized when each one of us decides to be a part of the change.
My message, especially to young peopl

'I apologize for the mistake. Since you haven\'t provided the full speech, I will provide a motivational title and a summary based on the points you provided.\n\n**Motivational Title:** "Empowering the Future: Unlocking the Power of Youth"\n\n**Summary:**\n\nHere are the key points from the speech:\n\n1. **Dream Big**: The speaker emphasizes the importance of developing a clear aim in life and setting goals that challenge you.\n2. **Acquire Knowledge**: He encourages young people to acquire knowledge through reading, exploring science and technology, and learning about history and culture.\n3. **Work Hard**: Nothing worthwhile comes easily, and perseverance is key to achieving success.\n4. **Maintain Integrity**: The speaker stresses the importance of maintaining integrity and being righteous in thought and action.\n5. **Never Give Up**: He encourages young people to learn from their mistakes and never give up in the face of failure.\n\n**Additional Points:**\n\n* **Take Responsibility