# Quickstart with Q&A Chatbot

This project aims to build a vanilla Q&A chatbot with Streamlit using OpenAI LLMs.


# Setup

In [1]:
import os

def create_folders_if_not_exist():
    folders = ['data', 'image', 'utils', 'paper']
    for folder in folders:
        if not os.path.exists(folder):
            os.makedirs(folder)
            print(f"Folder '{folder}' has been created.")
        else:
            print(f"Folder '{folder}' already exists.")

create_folders_if_not_exist()

Folder 'data' already exists.
Folder 'image' already exists.
Folder 'utils' already exists.
Folder 'paper' already exists.


In [2]:
from dotenv import load_dotenv

load_dotenv()

True

In [3]:
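## load_dotenv() has already copied the .env values into os.environ.
## Fail early with a clear message if a key is missing
## (assigning None to os.environ would raise a TypeError).
required_keys = [
    "OPENAI_API_KEY",     ## openai
    "LANGCHAIN_API_KEY",  ## langsmith tracking
    "LANGCHAIN_PROJECT",
    "HF_TOKEN",           ## huggingface
]
missing = [key for key in required_keys if not os.getenv(key)]
if missing:
    raise EnvironmentError(f"Missing environment variables: {missing}")

os.environ["LANGCHAIN_TRACING_V2"] = "true"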

In [4]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# First way to summarize the content 

Use an LLM as an expert in summarizing speeches.

In [5]:
from langchain_core.messages import (
    AIMessage,
    HumanMessage,
    SystemMessage,
)



In [6]:
speech = """
Turning once again, and this time more generally, to the question of invasion, I would observe that there has never been a period in all these long centuries of which we boast when an absolute guarantee against invasion, still less against serious raids, could have been given to our people. In the days of Napoleon, of which I was speaking just now, the same wind which would have carried his transports across the Channel might have driven away the blockading fleet. There was always the chance, and it is that chance which has excited and befooled the imaginations of many Continental tyrants. Many are the tales that are told. We are assured that novel methods will be adopted, and when we see the originality of malice, the ingenuity of aggression, which our enemy displays, we may certainly prepare ourselves for every kind of novel stratagem and every kind of brutal and treacherous manœuvre. I think that no idea is so outlandish that it should not be considered and viewed with a searching, but at the same time, I hope, with a steady eye. We must never forget the solid assurances of sea power and those which belong to air power if it can be locally exercised.

I have, myself, full confidence that if all do their duty, if nothing is neglected, and if the best arrangements are made, as they are being made, we shall prove ourselves once more able to defend our island home, to ride out the storm of war, and to outlive the menace of tyranny, if necessary for years, if necessary alone. At any rate, that is what we are going to try to do. That is the resolve of His Majesty's Government – every man of them. That is the will of Parliament and the nation. The British Empire and the French Republic, linked together in their cause and in their need, will defend to the death their native soil, aiding each other like good comrades to the utmost of their strength.

Even though large tracts of Europe and many old and famous States have fallen or may fall into the grip of the Gestapo and all the odious apparatus of Nazi rule, we shall not flag or fail. We shall go on to the end. We shall fight in France, we shall fight on the seas and oceans, we shall fight with growing confidence and growing strength in the air, we shall defend our island, whatever the cost may be. We shall fight on the beaches, we shall fight on the landing grounds, we shall fight in the fields and in the streets, we shall fight in the hills; we shall never surrender. And even if, which I do not for a moment believe, this island or a large part of it were subjugated and starving, then our Empire beyond the seas, armed and guarded by the British Fleet, would carry on the struggle, until, in God's good time, the New World, with all its power and might, steps forth to the rescue and the liberation of the Old.
"""


In [7]:
chat_message = [
    SystemMessage(content="You are an expert in summarizing speeches."),
    HumanMessage(content=f"Please provide a summary of the following speech: \n Text: {speech}"),
]



In [8]:
llm.get_num_tokens(speech)

601

In [9]:
# Calling the model object directly is deprecated; use invoke() instead.
summary = llm.invoke(chat_message).content

print(summary)


In this speech, the speaker addresses the persistent threat of invasion throughout history, emphasizing that no period has ever guaranteed complete safety from such dangers. Reflecting on past conflicts, particularly during the Napoleonic era, the speaker acknowledges the unpredictability of warfare and the creativity of adversaries in their aggressive tactics. However, he expresses unwavering confidence in the ability of the British people to defend their homeland, provided that everyone fulfills their duties and proper preparations are made.

The speaker reaffirms the commitment of His Majesty's Government and Parliament to protect the nation alongside allies, particularly the French Republic. Despite the fall of many European states to Nazi control, he insists that Britain will not falter. He outlines a determined resolve to fight on multiple fronts—on land, at sea, and in the air—vowing to defend the island at all costs. The speech concludes with a powerful declaration of resilienc

In [10]:
llm.get_num_tokens(summary)

208
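
The two token counts reported above (601 for the speech, 208 for the summary) give a rough measure of how much the model compressed the text. A minimal sketch of that arithmetic, using only the numbers already printed in this notebook:

```python
# Token counts copied from the get_num_tokens outputs above.
speech_tokens = 601
summary_tokens = 208

# Fraction of the original length that the summary occupies.
ratio = summary_tokens / speech_tokens
print(f"Summary is {ratio:.1%} of the original token count")
```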

# Second Way 

Prompt template for text summarization 

In [11]:
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate

generic_prompt = """
Write a summary of the following speech:
Speech: {speech}
translate the precise summary to {language}
"""

prompt_template = PromptTemplate(
    input_variables=["speech", "language"],
    template=generic_prompt
)


In [12]:
complete_prompt = prompt_template.format(speech=speech, language="Portuguese")

print(complete_prompt)


Write a summary of the following speech:
Speech: 
Turning once again, and this time more generally, to the question of invasion, I would observe that there has never been a period in all these long centuries of which we boast when an absolute guarantee against invasion, still less against serious raids, could have been given to our people. In the days of Napoleon, of which I was speaking just now, the same wind which would have carried his transports across the Channel might have driven away the blockading fleet. There was always the chance, and it is that chance which has excited and befooled the imaginations of many Continental tyrants. Many are the tales that are told. We are assured that novel methods will be adopted, and when we see the originality of malice, the ingenuity of aggression, which our enemy displays, we may certainly prepare ourselves for every kind of novel stratagem and every kind of brutal and treacherous manœuvre. I think that no idea is so outlandish that it sho

In [13]:
llm.get_num_tokens(complete_prompt)

619

In [14]:
llm_chain = LLMChain(llm=llm, prompt=prompt_template)


In [15]:
# run() is deprecated; invoke() returns a dict whose "text" key holds the output.
summary = llm_chain.invoke({"speech": speech, "language": "Portuguese"})["text"]

llm.get_num_tokens(summary)


353

# Third way 

Summarize PDF documents with `StuffDocumentsChain` text summarization.


The **Stuff Document Chain** is one of the most basic summarization techniques in LangChain.  
It operates by taking external data sources—such as a PDF file—and loading their contents into a **prompt template**. The entire content is then passed to a **Large Language Model (LLM)** to generate a comprehensive summary in a single step.

In this approach, if the source contains multiple documents (e.g., 10 separate PDFs), all their contents are concatenated and inserted directly into the prompt template. The LLM processes this combined text to produce a single summarized output.
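
The stuffing step described above can be sketched without any library. This is an illustrative approximation, not LangChain's implementation: `stuff_summarize`, `STUFF_TEMPLATE`, and `fake_llm` are hypothetical names, and `fake_llm` only stands in for a real model call.

```python
from typing import Callable, List

# Hypothetical template; the real notebook prompt is defined further below.
STUFF_TEMPLATE = "Write a concise summary of the following text.\n\nText to summarize: {text}\n"

def stuff_summarize(documents: List[str], llm: Callable[[str], str],
                    template: str = STUFF_TEMPLATE) -> str:
    """Concatenate every document into one prompt and call the model once."""
    combined = "\n\n".join(documents)
    prompt = template.format(text=combined)
    return llm(prompt)

# Stand-in for a real model: records each prompt it receives.
calls: List[str] = []

def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"summary #{len(calls)}"

pages = ["First page text.", "Second page text."]
result = stuff_summarize(pages, fake_llm)
print(result)      # a single model call, regardless of the number of pages
print(len(calls))
```

However many documents are passed in, the model is called exactly once, which is what makes this method simple and also what ties it to the context limit.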

**Challenges:**  
The main limitation of the Stuff method is **context size**. Every LLM has a maximum context window, so the method becomes impractical once the combined documents exceed it: the request may be rejected, or the input has to be truncated and important information is lost.
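
One way to guard against this limit is to estimate the combined size before stuffing. The sketch below is an assumption-laden approximation: the four-characters-per-token ratio is a rough rule of thumb for English text, and `estimate_tokens`, `fits_in_context`, and `reserved_for_output` are hypothetical names. In this notebook, `llm.get_num_tokens` would give an actual count instead.

```python
from typing import List

CHARS_PER_TOKEN = 4  # assumption: rough average for English text

def estimate_tokens(text: str) -> int:
    """Crude token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1

def fits_in_context(documents: List[str], context_window: int,
                    reserved_for_output: int = 500) -> bool:
    """Return True if all documents plus an output budget fit in the window."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserved_for_output <= context_window

# Two synthetic 5,000-character documents.
sample_docs = ["word " * 1000, "word " * 1000]
print(fits_in_context(sample_docs, context_window=8000))
print(fits_in_context(sample_docs, context_window=2000))
```

When the check fails, the map-reduce or refine approaches are the usual alternatives.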




```mermaid
flowchart LR
    A[External Data Source] --> B[Load Documents]
    B --> C[Combine All Documents]
    C --> D[Insert into Prompt Template]
    D --> E[Pass to LLM]
    E --> F[Generate Single Summary]
    
    style A fill:#f8f9fa,stroke:#333,stroke-width:1px
    style B fill:#fce5cd,stroke:#333,stroke-width:1px
    style C fill:#f4cccc,stroke:#333,stroke-width:1px
    style D fill:#d9ead3,stroke:#333,stroke-width:1px
    style E fill:#cfe2f3,stroke:#333,stroke-width:1px
    style F fill:#d0e0e3,stroke:#333,stroke-width:1px
```


## Stuff Document Chain

In [17]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("paper/apjspeech.pdf")

docs = loader.load_and_split()

print(docs)

[Document(metadata={'producer': 'GPL Ghostscript 8.15', 'creator': 'PScript5.dll Version 5.2', 'creationdate': 'D:20070730160943', 'moddate': 'D:20070730160943', 'title': 'Microsoft Word - Document1', 'author': 'Shri', 'source': 'paper/apjspeech.pdf', 'total_pages': 7, 'page': 0, 'page_label': '1'}, page_content='A P J Abdul Kalam Departing speech \n \n \nFriends, I am delighted to address you all, in the country and those livi ng abroad, after \nworking with you and completing five beautiful and eventful years in Rashtrapati \nBhavan. Today, it is indeed a thanks giving occasion. I would like to narr ate, how I \nenjoyed every minute of my tenure enriched by the wonderful assoc iation from each one \nof you, hailing from different walks of life, be it politics, sci ence and technology, \nacademics, arts, literature, business, judiciary, administration, local bodies, farming, \nhome makers, special children, media and above all from the youth and st udent \ncommunity who are the future

In [18]:
from langchain.prompts import PromptTemplate

template = """
You are a helpful assistant that can summarize text.
Write a concise summary of the following text.

Text to summarize: {text}

Summarize the text in 120 words in Brazilian Portuguese.
"""

prompt = PromptTemplate(template=template, input_variables=["text"])


In [19]:
from langchain.chains.summarize import load_summarize_chain

In [21]:
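chain = load_summarize_chain(llm, chain_type="stuff", prompt=prompt, verbose=True)
# run() is deprecated; invoke() returns a dict with the summary under "output_text".
output_summary = chain.invoke({"input_documents": docs})["output_text"]
print(output_summary)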



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:

You are a helpful assistant that can summarize text.
Write a concise summary of the following text.

Text to summarize: A P J Abdul Kalam Departing speech 
 
 
Friends, I am delighted to address you all, in the country and those livi ng abroad, after 
working with you and completing five beautiful and eventful years in Rashtrapati 
Bhavan. Today, it is indeed a thanks giving occasion. I would like to narr ate, how I 
enjoyed every minute of my tenure enriched by the wonderful assoc iation from each one 
of you, hailing from different walks of life, be it politics, sci ence and technology, 
academics, arts, literature, business, judiciary, administration, local bodies, farming, 
home makers, special children, media and above all from the youth and st udent 
community who are the future wealth of our country. During my intera ction at 
Rashtrapati Bhavan

## Map Reduce to summarize large documents
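
The map-reduce approach used in this section summarizes each chunk on its own (map) and then summarizes the collection of partial summaries (reduce). A minimal library-free sketch of that flow follows; `map_reduce_summarize` and `counting_llm` are hypothetical names, the templates are placeholders, and the real chain may include extra steps not shown here.

```python
from typing import Callable, List

def map_reduce_summarize(chunks: List[str], llm: Callable[[str], str],
                         map_template: str, combine_template: str) -> str:
    # Map step: one model call per chunk.
    partial_summaries = [llm(map_template.format(text=chunk)) for chunk in chunks]
    # Reduce step: one final call over the joined partial summaries.
    combined = "\n\n".join(partial_summaries)
    return llm(combine_template.format(text=combined))

# Stand-in for a real model: records prompts and returns a numbered label.
prompts_seen: List[str] = []

def counting_llm(prompt: str) -> str:
    prompts_seen.append(prompt)
    return f"S{len(prompts_seen)}"

final = map_reduce_summarize(
    ["chunk one", "chunk two", "chunk three"],
    counting_llm,
    map_template="Summarize: {text}",
    combine_template="Combine: {text}",
)
print(len(prompts_seen))  # three map calls plus one reduce call
print(final)
```

Because each map call sees only one chunk, no single prompt has to hold the whole document.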

In [22]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [24]:
docs[0].page_content

'A P J Abdul Kalam Departing speech \n \n \nFriends, I am delighted to address you all, in the country and those livi ng abroad, after \nworking with you and completing five beautiful and eventful years in Rashtrapati \nBhavan. Today, it is indeed a thanks giving occasion. I would like to narr ate, how I \nenjoyed every minute of my tenure enriched by the wonderful assoc iation from each one \nof you, hailing from different walks of life, be it politics, sci ence and technology, \nacademics, arts, literature, business, judiciary, administration, local bodies, farming, \nhome makers, special children, media and above all from the youth and st udent \ncommunity who are the future wealth of our country. During my intera ction at \nRashtrapati Bhavan in Delhi and at every state and union territor y as well as through my \nonline interactions, I have many unique experiences to share with you, which signify the \nfollowing important messages: \n \n1. Accelerate development : Aspiration of th

In [31]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=100)

docs_split = text_splitter.split_documents(docs)
print(docs_split[0].page_content)


A P J Abdul Kalam Departing speech 
 
 
Friends, I am delighted to address you all, in the country and those livi ng abroad, after 
working with you and completing five beautiful and eventful years in Rashtrapati 
Bhavan. Today, it is indeed a thanks giving occasion. I would like to narr ate, how I 
enjoyed every minute of my tenure enriched by the wonderful assoc iation from each one 
of you, hailing from different walks of life, be it politics, sci ence and technology, 
academics, arts, literature, business, judiciary, administration, local bodies, farming, 
home makers, special children, media and above all from the youth and st udent 
community who are the future wealth of our country. During my intera ction at 
Rashtrapati Bhavan in Delhi and at every state and union territor y as well as through my 
online interactions, I have many unique experiences to share with you, which signify the 
following important messages: 
 
1. Accelerate development : Aspiration of the youth, 
 
2. E

In [32]:
len(docs_split)

13
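
To illustrate what `chunk_size` and `chunk_overlap` mean, here is a simplified fixed-width splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on natural separators such as paragraph and line breaks); `split_with_overlap` is a hypothetical helper that shows only the size and overlap arithmetic.

```python
from typing import List

def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks

demo_chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(demo_chunks)  # ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, so text that straddles a boundary appears in both chunks.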

In [34]:
from langchain.prompts import PromptTemplate

chunk_prompt = """
You are a helpful assistant that can summarize text.
Write a concise summary of the following text.

Text to summarize: {text}

Summarize the text in 120 words in Brazilian Portuguese.
"""

prompt = PromptTemplate(template=chunk_prompt, input_variables=["text"])

In [35]:
final_prompt_template = """
Provide the final summary of the entire speech with these important points.
Add a motivational title at the beginning and a quote at the end of the summary.

Write a concise summary of the following text.

Text: {text}

Summarize the text in 120 words in Brazilian Portuguese.
"""


final_prompt = PromptTemplate(template=final_prompt_template, input_variables=["text"])

In [37]:
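chain = load_summarize_chain(
    llm=llm,
    chain_type="map_reduce",
    map_prompt=prompt,
    combine_prompt=final_prompt,
    verbose=True,
)
# run() is deprecated; invoke() returns a dict with the summary under "output_text".
output_summary = chain.invoke({"input_documents": docs_split})["output_text"]
print(output_summary)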



> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:

You are a helpful assistant that can summarize text.
Write a concise summary of the following text.

Text to summarize: A P J Abdul Kalam Departing speech 
 
 
Friends, I am delighted to address you all, in the country and those livi ng abroad, after 
working with you and completing five beautiful and eventful years in Rashtrapati 
Bhavan. Today, it is indeed a thanks giving occasion. I would like to narr ate, how I 
enjoyed every minute of my tenure enriched by the wonderful assoc iation from each one 
of you, hailing from different walks of life, be it politics, sci ence and technology, 
academics, arts, literature, business, judiciary, administration, local bodies, farming, 
home makers, special children, media and above all from the youth and st udent 
community who are the future wealth of our country. During my intera ction at 
Rashtrapati Bh

## Refine Chain For Summarization
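
The refine approach used in this section processes chunks in order: the first chunk produces an initial summary, and each later chunk is used to update the summary built so far. The sketch below shows that loop without any library. `refine_summarize` and `labelled_llm` are hypothetical names and the templates are placeholders, not the prompts LangChain ships.

```python
from typing import Callable, List

def refine_summarize(chunks: List[str], llm: Callable[[str], str],
                     initial_template: str, refine_template: str) -> str:
    summary = ""
    for index, chunk in enumerate(chunks):
        if index == 0:
            # First chunk: produce the initial summary.
            prompt = initial_template.format(text=chunk)
        else:
            # Later chunks: revise the running summary with the new text.
            prompt = refine_template.format(existing_summary=summary, text=chunk)
        summary = llm(prompt)
    return summary

# Stand-in for a real model: records prompts and returns a numbered label.
refine_prompts: List[str] = []

def labelled_llm(prompt: str) -> str:
    refine_prompts.append(prompt)
    return f"R{len(refine_prompts)}"

refined = refine_summarize(
    ["a", "b", "c"],
    labelled_llm,
    initial_template="Summarize: {text}",
    refine_template="Existing: {existing_summary} | New: {text}",
)
print(refine_prompts)
print(refined)
```

Unlike map-reduce, the calls are strictly sequential, because each step depends on the summary from the previous one.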

In [38]:
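chain = load_summarize_chain(llm=llm, chain_type="refine", verbose=True)
# run() is deprecated; invoke() returns a dict with the summary under "output_text".
output_summary = chain.invoke({"input_documents": docs_split})["output_text"]
print(output_summary)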



> Entering new RefineDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"A P J Abdul Kalam Departing speech 
 
 
Friends, I am delighted to address you all, in the country and those livi ng abroad, after 
working with you and completing five beautiful and eventful years in Rashtrapati 
Bhavan. Today, it is indeed a thanks giving occasion. I would like to narr ate, how I 
enjoyed every minute of my tenure enriched by the wonderful assoc iation from each one 
of you, hailing from different walks of life, be it politics, sci ence and technology, 
academics, arts, literature, business, judiciary, administration, local bodies, farming, 
home makers, special children, media and above all from the youth and st udent 
community who are the future wealth of our country. During my intera ction at 
Rashtrapati Bhavan in Delhi and at every state and union territor y as well as through my 
on