# Data Augmented Question Answering

This notebook uses generic prompts and language models to evaluate a question answering system that draws on data sources beyond what is in the model. For example, it can be used to evaluate a question answering system over your proprietary data.

## Setup
Let's set up an example using our favorite document: the State of the Union address.

In [1]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain import OpenAI, VectorDBQA

In [2]:
from langchain.document_loaders import TextLoader
loader = TextLoader('../../modules/state_of_the_union.txt')
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()
docsearch = Chroma.from_documents(texts, embeddings)
qa = VectorDBQA.from_llm(llm=OpenAI(), vectorstore=docsearch)

Running Chroma using direct local API.
Using DuckDB in-memory for database. Data will be transient.


In [3]:
from langchain.chat.vector_db_qa import VectorDBQA as ChatVectorDBQA
from langchain.chat_models import ChatOpenAI

In [4]:
chat_qa = ChatVectorDBQA.from_model(model=ChatOpenAI(temperature=0), vectorstore=docsearch)

## Examples
Now we need some examples to evaluate. We can create them in two ways:

1. Hard code some examples ourselves
2. Generate examples automatically, using a language model

In [5]:
# Hard-coded examples
examples = [
    {
        "query": "What did the president say about Ketanji Brown Jackson",
        "answer": "He praised her legal ability and said he nominated her for the supreme court."
    },
    {
        "query": "What did the president say about Michael Jackson",
        "answer": "Nothing"
    }
]

In [6]:
# Generated examples
from langchain.evaluation.qa import QAGenerateChain
example_gen_chain = QAGenerateChain.from_llm(OpenAI())

In [7]:
new_examples = example_gen_chain.apply_and_parse([{"doc": t} for t in texts[:5]])

In [8]:
new_examples

[{'query': 'Who did Vladimir Putin think he would meet when he entered Ukraine?',
  'answer': 'He thought he would meet the world and that it would roll over.'},
 {'query': 'Who is the Ukrainian Ambassador to the United States?',
  'answer': 'The Ukrainian Ambassador to the United States is mentioned in the document.'},
 {'query': 'How many countries have joined the coalition to confront Putin?',
  'answer': '27 members of the European Union, France, Germany, Italy, the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland.'},
 {'query': 'What is the U.S. Department of Justice doing to go after the crimes of Russian oligarchs?',
  'answer': 'The U.S. Department of Justice is assembling a dedicated task force to go after the crimes of Russian oligarchs and join with European allies to find and seize yachts, luxury apartments, and private jets.'},
 {'query': 'What percentage of value has the Ruble lost due to the actions of the US and its allies?

In [9]:
# Combine examples
examples += new_examples
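
Because the combined list mixes hand-written and generated entries, it can be worth checking its shape before evaluation. The sketch below is one possible way to do that. `validate_examples` is a hypothetical helper, not part of LangChain, and it assumes every example is a dict with `query` and `answer` keys, as in the cells above.

```python
# Hypothetical helper (not a LangChain API): sanity-check examples before evaluating.
def validate_examples(examples, required_keys=("query", "answer")):
    """Return the examples with repeated queries dropped; raise on malformed entries."""
    seen = set()
    cleaned = []
    for i, ex in enumerate(examples):
        missing = [k for k in required_keys if k not in ex]
        if missing:
            raise ValueError(f"Example {i} is missing keys: {missing}")
        if ex["query"] in seen:
            continue  # keep only the first occurrence of each question
        seen.add(ex["query"])
        cleaned.append(ex)
    return cleaned


# Illustrative input: the same hard-coded example repeated twice.
sample = [
    {"query": "What did the president say about Michael Jackson", "answer": "Nothing"},
    {"query": "What did the president say about Michael Jackson", "answer": "Nothing"},
]
cleaned = validate_examples(sample)
print(len(cleaned))
```

In the notebook this would be called as `examples = validate_examples(examples)` right after combining the two lists.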

## Evaluate
Now that we have examples, we can use the question answering evaluator to evaluate our question answering chain.

In [10]:
predictions = qa.apply(examples)

In [11]:
chat_predictions = chat_qa.apply(examples)

In [12]:
from langchain.evaluation.qa.chat_eval_chain import QAEvalChatChain
from langchain.chat_models import ChatOpenAI

model = ChatOpenAI(temperature=0)

eval_chain = QAEvalChatChain.from_model(model)

In [13]:
graded_outputs = eval_chain.evaluate(examples, predictions)
graded_chat_outputs = eval_chain.evaluate(examples, chat_predictions)

In [14]:
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + predictions[i]['query'])
    print("Real Answer: " + predictions[i]['answer'])
    print("Predicted Answer: " + predictions[i]['result'])
    print("Predicted Grade: " + graded_outputs[i]['text'])
    print()

Example 0:
Question: What did the president say about Ketanji Brown Jackson
Real Answer: He praised her legal ability and said he nominated her for the supreme court.
Predicted Answer:  The president said that Ketanji Brown Jackson is "one of our nation's top legal minds" and that she will "continue Justice Breyer's legacy of excellence."
Predicted Grade: GRADE: INCORRECT

Example 1:
Question: What did the president say about Michael Jackson
Real Answer: Nothing
Predicted Answer:  The president did not mention Michael Jackson in the given context.
Predicted Grade: GRADE: CORRECT

Example 2:
Question: Who did Vladimir Putin think he would meet when he entered Ukraine?
Real Answer: He thought he would meet the world and that it would roll over.
Predicted Answer:  Putin thought he would meet a divided West and NATO that would not respond to his aggression.
Predicted Grade: GRADE: INCORRECT

Example 3:
Question: Who is the Ukrainian Ambassador to the United States?
Real Answer: The Ukraini

In [15]:
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + chat_predictions[i]['query'])
    print("Real Answer: " + chat_predictions[i]['answer'])
    print("Predicted Answer: " + chat_predictions[i]['result'])
    print("Predicted Grade: " + graded_chat_outputs[i]['text'])
    print()

Example 0:
Question: What did the president say about Ketanji Brown Jackson
Real Answer: He praised her legal ability and said he nominated her for the supreme court.
Predicted Answer: The President said that he nominated Circuit Court of Appeals Judge Ketanji Brown Jackson for the United States Supreme Court, and that she is one of the nation's top legal minds who will continue Justice Breyer's legacy of excellence. He also mentioned that she has received broad support from various groups, including the Fraternal Order of Police and former judges appointed by Democrats and Republicans.
Predicted Grade: GRADE: INCORRECT

Example 1:
Question: What did the president say about Michael Jackson
Real Answer: Nothing
Predicted Answer: I'm sorry, I cannot provide an answer to that question as there is no mention of Michael Jackson in the given speech.
Predicted Grade: GRADE: CORRECT

Example 2:
Question: Who did Vladimir Putin think he would meet when he entered Ukraine?
Real Answer: He though

In [16]:
from langchain.evaluation.qa import QAEvalChain
predictions = qa.apply(examples)
llm = OpenAI(temperature=0)
eval_chain = QAEvalChain.from_llm(llm)
normal_graded_outputs = eval_chain.evaluate(examples, predictions)

normal_graded_chat_outputs = eval_chain.evaluate(examples, chat_predictions)
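
The two evaluators in this notebook format their grades differently (`GRADE: INCORRECT` from the chat evaluator, ` CORRECT` from the completion evaluator), and they do not always agree. The sketch below shows one way to normalize the grade strings, tally them, and list the examples where two graders differ. `parse_grade` and `disagreements` are hypothetical helpers, and the sample data only mirrors the grade strings printed for the first two examples rather than calling either chain.

```python
from collections import Counter


def parse_grade(text):
    """Normalize grader output such as 'GRADE: INCORRECT' or ' CORRECT'."""
    label = text.strip().upper()
    if label.startswith("GRADE:"):
        label = label[len("GRADE:"):].strip()
    # Use startswith rather than `in`, since "INCORRECT" contains "CORRECT".
    if label.startswith("INCORRECT"):
        return "INCORRECT"
    if label.startswith("CORRECT"):
        return "CORRECT"
    return "UNKNOWN"


def disagreements(graded_a, graded_b):
    """Indices where two graders assign different labels to the same example."""
    return [
        i
        for i, (a, b) in enumerate(zip(graded_a, graded_b))
        if parse_grade(a["text"]) != parse_grade(b["text"])
    ]


# Illustrative data in the same shape as the evaluator outputs (a list of {'text': ...}).
chat_graded = [{"text": "GRADE: INCORRECT"}, {"text": "GRADE: CORRECT"}]
normal_graded = [{"text": " CORRECT"}, {"text": " CORRECT"}]

print(Counter(parse_grade(g["text"]) for g in chat_graded))
print(disagreements(chat_graded, normal_graded))
```

With the notebook's variables, `disagreements(graded_outputs, normal_graded_outputs)` would point at the examples worth reading by hand.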

In [17]:
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + predictions[i]['query'])
    print("Real Answer: " + predictions[i]['answer'])
    print("Predicted Answer: " + predictions[i]['result'])
    print("Predicted Grade: " + normal_graded_outputs[i]['text'])
    print()

Example 0:
Question: What did the president say about Ketanji Brown Jackson
Real Answer: He praised her legal ability and said he nominated her for the supreme court.
Predicted Answer:  The president said that she is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also mentioned that she has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.
Predicted Grade:  CORRECT

Example 1:
Question: What did the president say about Michael Jackson
Real Answer: Nothing
Predicted Answer:  The president did not mention Michael Jackson.
Predicted Grade:  CORRECT

Example 2:
Question: Who did Vladimir Putin think he would meet when he entered Ukraine?
Real Answer: He thought he would meet the world and that it would roll over.
Predicted Answer:  Putin thought he would meet people who would roll

In [18]:
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + chat_predictions[i]['query'])
    print("Real Answer: " + chat_predictions[i]['answer'])
    print("Predicted Answer: " + chat_predictions[i]['result'])
    print("Predicted Grade: " + normal_graded_chat_outputs[i]['text'])
    print()

Example 0:
Question: What did the president say about Ketanji Brown Jackson
Real Answer: He praised her legal ability and said he nominated her for the supreme court.
Predicted Answer: The President said that he nominated Circuit Court of Appeals Judge Ketanji Brown Jackson for the United States Supreme Court, and that she is one of the nation's top legal minds who will continue Justice Breyer's legacy of excellence. He also mentioned that she has received broad support from various groups, including the Fraternal Order of Police and former judges appointed by Democrats and Republicans.
Predicted Grade:  CORRECT

Example 1:
Question: What did the president say about Michael Jackson
Real Answer: Nothing
Predicted Answer: I'm sorry, I cannot provide an answer to that question as there is no mention of Michael Jackson in the given speech.
Predicted Grade:  CORRECT

Example 2:
Question: Who did Vladimir Putin think he would meet when he entered Ukraine?
Real Answer: He thought he would mee

In [19]:
from langchain.evaluation.qa.chat_comp_chain import QACompChatChain

In [20]:
comp_chain = QACompChatChain.from_model(ChatOpenAI(temperature=0))

In [21]:
comps = comp_chain.evaluate(examples, predictions, chat_predictions)

In [22]:
for i, eg in enumerate(examples):
    print(f"## Example {i}:")
    print()
    print("Question: " + chat_predictions[i]['query'])
    print()
    print("Real Answer: " + chat_predictions[i]['answer'])
    print()
    print("Normal Answer: " + predictions[i]['result'])
    print()
    print("Chat Answer: " + chat_predictions[i]['result'])
    print()
    print("Comparison: " + comps[i]['text'])
    print()

## Example 0:

Question: What did the president say about Ketanji Brown Jackson

Real Answer: He praised her legal ability and said he nominated her for the supreme court.

Normal Answer:  The president said that she is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also mentioned that she has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.

Chat Answer: The President said that he nominated Circuit Court of Appeals Judge Ketanji Brown Jackson for the United States Supreme Court, and that she is one of the nation's top legal minds who will continue Justice Breyer's legacy of excellence. He also mentioned that she has received broad support from various groups, including the Fraternal Order of Police and former judges appointed by Democrats and Republicans.

Comparison: Studen