# Question & Answering With AI

### 1. Basic Question & Answering

Let's first install the required libraries and load our packages.

In [None]:
pip install --upgrade langchain

In [None]:
pip install openai

In [None]:
pip install python-dotenv

In [None]:
pip install beautifulsoup4

In [None]:
pip install tiktoken

In [1]:
import os
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

api_key=os.environ['OPENAI_API_KEY']
base_url=os.environ['OPENAI_BASE_URL']

print(base_url)

https://devsquad-eastus-2.openai.azure.com/


## Summaries of Short Text

Summarizing a short text is straightforward: a simple prompt with instructions is all you need.

In [2]:
from langchain.chat_models import AzureChatOpenAI
from langchain.schema import HumanMessage
from langchain.prompts import PromptTemplate

llm = AzureChatOpenAI(
    api_key=api_key,
    azure_endpoint=base_url, 
    api_version="2023-07-01-preview",
    model="gpt-4",
    temperature=0
)

template = """
Please provide a summary of the following text:

{text}
"""

prompt = PromptTemplate(input_variables=["text"], template=template)

Let's test it with a short story:

In [3]:
story = """
Once upon a time, in a small village nestled between the mountains and the sea, lived a mighty dragon and a wise princess. The dragon, known for its fiery breath, was feared by all. The princess, on the other hand, was loved for her kindness and wisdom.
The princess was deeply concerned about the changing climate. She noticed the winters becoming harsher, the summers hotter, and the crops failing. She realized that the dragon’s fire, used to keep the villagers warm during the cold winters, was contributing to the rising temperatures.
She decided to have a conversation with the dragon. “Dear friend,” she began, “Our village is suffering because of the changing climate. The heat from your fire is making the summers unbearable and affecting our crops. We need to find a solution.”
The dragon, who cared for the village as much as the princess, agreed. They decided to limit the use of the dragon’s fire to only the coldest days of winter. The dragon also helped the villagers build energy-efficient homes to stay warm.
The princess didn’t stop there. She educated the villagers about the importance of sustainable living. They started planting more trees, recycling, and using renewable energy sources.
Over time, the village became a model of sustainability. The dragon and the princess showed everyone that with understanding, cooperation, and sustainable practices, it’s possible to combat climate change. And so, they continued to live in harmony with nature, proving that even in a story of a princess and a dragon, there’s room for real-world issues like climate change.
"""

In [4]:
final_prompt = prompt.format(text=story)
print(final_prompt)


Please provide a summary of the following text:


Once upon a time, in a small village nestled between the mountains and the sea, lived a mighty dragon and a wise princess. The dragon, known for its fiery breath, was feared by all. The princess, on the other hand, was loved for her kindness and wisdom.
The princess was deeply concerned about the changing climate. She noticed the winters becoming harsher, the summers hotter, and the crops failing. She realized that the dragon’s fire, used to keep the villagers warm during the cold winters, was contributing to the rising temperatures.
She decided to have a conversation with the dragon. “Dear friend,” she began, “Our village is suffering because of the changing climate. The heat from your fire is making the summers unbearable and affecting our crops. We need to find a solution.”
The dragon, who cared for the village as much as the princess, agreed. They decided to limit the use of the dragon’s fire to only the coldest days of winter. The

Finally, let's send the prompt to our LLM:

In [5]:
message = HumanMessage(content=final_prompt)
output = llm([message])
print(output.content)

In a small village, a mighty dragon and a wise princess lived among the residents. The princess was concerned about the worsening climate, noting that the dragon's fire, while warming the villagers in winter, was exacerbating the heat in summer and harming crops. She approached the dragon, and together they agreed to limit the use of the dragon's fire and to help villagers build energy-efficient homes. The princess also led the village in adopting sustainable practices, such as planting trees, recycling, and using renewable energy. Over time, the village became a model of sustainability, demonstrating that cooperation and sustainable living can address climate change, with the dragon and princess living in harmony with nature and highlighting the relevance of environmental issues in their story.


This method works fine for short text, but longer text becomes hard to manage and will eventually exceed the model's token limit. Luckily, LangChain supports several summarization methods out of the box through `load_summarize_chain`.

## Summaries of Longer Text

Note: This method will also work for short text.

#### Load your data

We define a function that extracts the article part of a page using `BeautifulSoup4`.

In [6]:
from bs4 import BeautifulSoup


def extract_article(content: BeautifulSoup) -> str:
    # Find all 'article' elements in the BeautifulSoup object
    article_elements = content.find_all("article")

    # find_all returns a list of tags, so collect the text of each one
    return "\n".join(element.get_text() for element in article_elements)

Then we'll load the web page text into a variable. The `WebBaseLoader` fetches the page and converts its HTML to plain text for us.

In [7]:
loader = WebBaseLoader("https://www.newyorker.com/news/letter-from-the-uk/the-collateral-damage-of-queen-elizabeths-glorious-reign")
data = loader.load()

Let's check how many tokens this would use:

In [8]:
num_tokens = llm.get_num_tokens(data[0].page_content)
print(f"The number of tokens in the message is {num_tokens}.")

The number of tokens in the message is 4770.
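As a rough check of whether a count like this fits in a single prompt, here is a minimal sketch. The helper name `fits_in_context`, the 8,192-token context window, and the 1,024 tokens reserved for the reply are all assumptions for illustration; substitute the limits of your own deployment.

```python
def fits_in_context(
    prompt_tokens: int,
    context_window: int = 8192,      # assumed model limit, adjust as needed
    reserved_for_output: int = 1024,  # assumed room left for the reply
) -> bool:
    """Return True if the prompt leaves enough room for the model's reply."""
    return prompt_tokens + reserved_for_output <= context_window


print(fits_in_context(4770))  # the token count measured above
print(fits_in_context(7500))  # a hypothetical longer article
```

Under these assumed limits the article fits in one prompt, while the hypothetical 7,500-token text would not.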


Let's see how many documents and characters we have in our data

In [9]:
print(f'You have {len(data)} documents in your data')
print(f'There are {len(data[0].page_content)} characters in your first document')

You have 1 documents in your data
There are 21773 characters in your first document


#### Chunk your data up into smaller documents
Let's assume the article is too big for a single prompt and split it into smaller chunks.

In [10]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=10000, chunk_overlap=500)
docs = text_splitter.split_documents(data)

In [11]:
print(f'You have {len(docs)} documents in your data')

You have 3 documents in your data
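To build intuition for `chunk_size` and `chunk_overlap`, here is a simplified sketch of overlapping chunks. It is not how `RecursiveCharacterTextSplitter` works internally (the real splitter prefers to break on separators such as paragraphs and newlines); the helper `split_with_overlap` is hypothetical and cuts at fixed character positions.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


chunks = split_with_overlap("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each chunk starts with the last three characters of the previous one, so a sentence that straddles a boundary is less likely to lose its context. With the notebook's settings (21,773 characters, `chunk_size=10000`, `chunk_overlap=500`) this fixed-window arithmetic also yields three chunks, although the real splitter's boundaries will differ.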


In [12]:
print(docs[0].page_content[:500])

The Collateral Damage of Queen Elizabeth’s Glorious Reign | The New YorkerSkip to main contentNewsletterStory SavedTo revisit this article, select My Account, then View saved storiesClose AlertSign InSearchSearchThe Latest2023 in ReviewNewsBooks & CultureFiction & PoetryHumor & CartoonsMagazinePuzzles & GamesVideoPodcastsGoings OnShopOpen Navigation MenuMenuStory SavedFind anything you save across the site in your account Close AlertLetter from the U.K.The Collateral Damage of Queen Elizabeth’s 


Now let's use the `load_summarize_chain` with `map_reduce` to summarize our text

In [13]:
from langchain.chains.summarize import load_summarize_chain

chain = load_summarize_chain(llm=llm, chain_type="map_reduce")

In [14]:
output = chain.run(docs)

In [15]:
print(output)

Sam Knight's article "The Collateral Damage of Queen Elizabeth’s Glorious Reign," based on Tina Brown's "The Palace Papers," examines the psychological impact and limited roles of the British Royal Family during Queen Elizabeth II's reign. The monarchy is depicted as a system where the Queen holds real authority, while other royals serve ceremonial purposes, leading to personal struggles for figures like Prince Charles and Princess Margaret. The article touches on the varied fates of the Queen's children, including Prince Edward's privacy breach, Prince Charles's unmet expectations, Prince Andrew's fall from grace, and the contrasting lives of Prince William and Prince Harry. As the Queen prepares for the transition of power, endorsing Camilla as Queen Consort, the monarchy faces contemporary challenges and its future may rest on the image of William and Catherine's family. The article also briefly mentions a range of other topics and encourages readers to subscribe to The New Yorker's
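Conceptually, `map_reduce` summarizes every chunk on its own and then summarizes those summaries. The sketch below shows that control flow only; `map_reduce_summary` and `fake_summarize` are hypothetical names, and the fake function stands in for a real LLM call so the example runs offline.

```python
from typing import Callable

calls: list[str] = []  # records every text sent to the (fake) model


def fake_summarize(text: str) -> str:
    """Hypothetical stand-in for an LLM call: keep the first five words."""
    calls.append(text)
    return " ".join(text.split()[:5])


def map_reduce_summary(chunks: list[str], summarize: Callable[[str], str]) -> str:
    # Map step: summarize each chunk independently of the others.
    partial_summaries = [summarize(chunk) for chunk in chunks]
    # Reduce step: summarize the concatenated partial summaries.
    return summarize("\n\n".join(partial_summaries))


chunks = [
    "The dragon kept the village warm during the coldest winters.",
    "The princess worried about the climate and spoke to the dragon.",
]
summary = map_reduce_summary(chunks, fake_summarize)
print(summary)
print(f"Model calls: {len(calls)}")
```

In this sketch, two chunks cost three model calls: one per chunk plus one to combine. Because the map calls do not depend on each other, they can run in parallel, which is the main appeal of this strategy for long documents.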

### Custom prompts

Let's pass our own prompts to control how the summaries are generated.

In [16]:
map_prompt_template = """
                      Write a summary of this chunk of text that includes the main points and any important details.
                      {text}
                      """

map_prompt = PromptTemplate(template=map_prompt_template, input_variables=["text"])

combine_prompt_template = """
                      Write a concise summary of the following text delimited by triple backquotes.
                      Return your response in bullet points which covers the key points of the text.
                      ```{text}```
                      BULLET POINT SUMMARY:
                      """

combine_prompt = PromptTemplate(
    template=combine_prompt_template, input_variables=["text"]
)

In [19]:
chain = load_summarize_chain(
    llm=llm,
    chain_type="map_reduce",
    map_prompt=map_prompt,
    combine_prompt=combine_prompt,
)

In [20]:
output = chain.run(docs)

In [21]:
print(output)

- The New Yorker article discusses the impact of Queen Elizabeth II's reign on the Royal Family, referencing Tina Brown's "The Palace Papers."
- The Queen is depicted as a central figure with authority, while other royals have limited agency and serve as symbols.
- Royals like Prince Charles and Princess Margaret have faced personal struggles due to their roles in the monarchy.
- Prince Andrew was stripped of royal duties and titles due to his association with Jeffrey Epstein.
- Prince William and Prince Harry have taken different life paths, with William leading a quiet life and Harry becoming a global influencer.
- Harry and Meghan left royal life and now live in California with deals with Netflix, Spotify, and Penguin Random House.
- Camilla and Diana are portrayed positively, with Diana's understanding of royal power and social cause ambitions highlighted.
- The monarchy's future may hinge on Prince William, Catherine, and their children.
- Prince Edward faced protests over Britain

### Refine summarization

Let's try a different way of summarizing. With `refine`, the chain summarizes the first chunk, then updates that summary with each new chunk in turn.

In [22]:
chain = load_summarize_chain(llm, chain_type="refine")
output = chain.run(docs)

'The original summary provided is comprehensive and captures the essence of Sam Knight\'s article, "The Collateral Damage of Queen Elizabeth’s Glorious Reign," as well as the key themes and issues surrounding the British Royal Family as portrayed in Tina Brown\'s "The Palace Papers." The additional context regarding Prince Edward\'s encounter with protesters and the mention of reparations for Britain’s role in the transatlantic slave trade adds a contemporary challenge faced by the monarchy, but it does not fundamentally alter the narrative or the summary\'s conclusions. The rest of the context provided is largely unrelated to the main article and consists of other New Yorker content and administrative details.\n\nTherefore, the original summary remains effective and no refinement is necessary. Here is the original summary for reference:\n\nIn "The Collateral Damage of Queen Elizabeth’s Glorious Reign," Sam Knight delves into the complexities and personal struggles of the British Royal

In [None]:
print(output)
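The refine strategy can be pictured as a simple loop that carries a running summary from chunk to chunk. This is a sketch of the idea, not LangChain's implementation; `refine_summary`, `fake_first`, and `fake_refine` are hypothetical names, and the fake functions replace real LLM calls so the example runs offline.

```python
from typing import Callable


def refine_summary(
    chunks: list[str],
    summarize_first: Callable[[str], str],
    refine: Callable[[str, str], str],
) -> str:
    """Summarize the first chunk, then fold each later chunk into the summary."""
    if not chunks:
        raise ValueError("need at least one chunk")

    summary = summarize_first(chunks[0])
    for chunk in chunks[1:]:
        # Each step sees the summary so far plus one new chunk.
        summary = refine(summary, chunk)
    return summary


def fake_first(text: str) -> str:
    """Hypothetical stand-in for the initial prompt: keep the first word."""
    return text.split()[0]


def fake_refine(existing: str, text: str) -> str:
    """Hypothetical stand-in for the refine prompt: append the first word."""
    return f"{existing} + {text.split()[0]}"


result = refine_summary(["alpha one", "beta two", "gamma three"], fake_first, fake_refine)
print(result)
```

Unlike `map_reduce`, each step depends on the previous one, so the calls run sequentially and later chunks can only adjust what earlier chunks established.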

Let's adapt it to our needs by customizing the prompts.

In [25]:
prompt_template = """Write a concise summary of the following:
{text}
CONCISE SUMMARY:"""
prompt = PromptTemplate.from_template(prompt_template)

refine_template = (
    "Your job is to produce a final summary.\n"
    "We have provided an existing summary up to a certain point: {existing_answer}\n"
    # Trailing spaces keep adjacent string literals from running together
    "We have the opportunity to refine the existing summary "
    "(only if needed) with some more context below.\n"
    "------------\n"
    "{text}\n"
    "------------\n"
    "Given the new context, refine the original summary to about 120 words. "
    "If the context isn't useful, return the original summary."
)
refine_prompt = PromptTemplate.from_template(refine_template)
chain = load_summarize_chain(
    llm=llm,
    chain_type="refine",
    question_prompt=prompt,
    refine_prompt=refine_prompt,
    return_intermediate_steps=True,
    input_key="input_documents",
    output_key="output_text",
)
result = chain({"input_documents": docs}, return_only_outputs=True)

In [26]:
print(result["output_text"])

In "The Palace Papers" by Tina Brown, the British Royal Family is scrutinized during Queen Elizabeth II's tenure, emphasizing the tribulations of non-sovereign royals. The Queen's unwavering leadership contrasts with the personal struggles of figures like Princess Margaret and Prince Charles. The narrative delves into the newer generation's efforts to balance tradition with modernity, notably Prince William's quest for normalcy and Prince Harry's controversial exit with Meghan Markle. Prince Andrew's reputation is tarnished by scandal, underscoring the psychological toll of royal duties. The monarchy's endurance seems to hinge on William and Catherine's ability to maintain a relatable yet regal image amidst a changing societal landscape and growing public scrutiny.
