# Question Answering with Sources

This notebook walks through how to use LangChain for question answering with sources over a list of documents. It covers three different chain types: `stuff`, `map_reduce`, and `refine`. For a more in-depth explanation of these chain types, see [here](../combine_docs.md).

## Prepare Data
First we prepare the data. This example runs a similarity search over a vector database, but the documents could be fetched in any manner (the point of this notebook is to highlight what to do AFTER you fetch the documents).

In [1]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.embeddings.cohere import CohereEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch
from langchain.vectorstores.faiss import FAISS
from langchain.docstore.document import Document

In [4]:
with open('../../state_of_the_union.txt') as f:
    state_of_the_union = f.read()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_text(state_of_the_union)

embeddings = OpenAIEmbeddings()

In [5]:
docsearch = FAISS.from_texts(texts, embeddings, metadatas=[{"source": i} for i in range(len(texts))])

In [6]:
query = "What did the president say about Justice Breyer"
docs = docsearch.similarity_search(query)
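
Because the documents can be fetched in any manner, the vector store is optional. As a rough illustration, here is a minimal sketch of a keyword-overlap retriever that returns chunks with a `source` entry in their metadata. `SimpleDoc` and `keyword_search` are hypothetical names invented for this sketch and are not part of LangChain; the scoring is a toy, not what FAISS does.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleDoc:
    """Hypothetical stand-in for a document: text plus a metadata dict."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def keyword_search(query, texts, k=4):
    """Rank chunks by how many distinct query words they contain (toy scoring)."""
    query_words = set(query.lower().split())
    scored = []
    for i, text in enumerate(texts):
        overlap = len(query_words & set(text.lower().split()))
        scored.append((overlap, i))
    # Highest overlap first; ties fall back to the original chunk order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [SimpleDoc(texts[i], {"source": i}) for _, i in scored[:k]]


sample_texts = [
    "Tonight we discuss our economy",
    "I want to honor Justice Breyer for his service",
    "We will support the people of Ukraine",
]
top_docs = keyword_search("What did the president say about Justice Breyer", sample_texts, k=2)
print([(d.metadata["source"], d.page_content) for d in top_docs])
```

Whatever retrieval method is used, what matters for the rest of the notebook is that each document carries its text and a `source` identifier.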

In [7]:
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.llms import OpenAI

## The `stuff` Chain

This section shows the results of using the `stuff` chain for question answering with sources.
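
Conceptually, a `stuff`-style chain places every document into a single prompt and makes one model call. The sketch below only illustrates that idea in plain Python. `build_stuff_prompt` and the template wording are assumptions made for this example, not the prompt LangChain actually uses.

```python
def build_stuff_prompt(question, docs):
    """Concatenate every document (with its source id) into one prompt string."""
    # `docs` is a list of (text, source) pairs; the template wording is illustrative only.
    context = "\n\n".join(f"Content: {text}\nSource: {source}" for text, source in docs)
    return (
        "Answer the question using only the extracts below and list the sources used.\n\n"
        f"QUESTION: {question}\n"
        "=========\n"
        f"{context}\n"
        "=========\n"
        "FINAL ANSWER:"
    )


example_docs = [("First chunk of the speech.", 0), ("Second chunk of the speech.", 1)]
stuff_prompt = build_stuff_prompt("What did the president say about Justice Breyer", example_docs)
print(stuff_prompt)
```

Since everything goes into one prompt, this approach only works while the combined documents fit in the model's context window.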

In [8]:
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff")

In [9]:
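# Note: this rebinds `docs` to the first three chunks, replacing the similarity-search results from above.
docs = [Document(page_content=t, metadata={"source": i}) for i, t in enumerate(texts[:3])]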

In [10]:
query = "What did the president say about Justice Breyer"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)

{'output_text': ' The president did not mention Justice Breyer.\nSOURCES: 0, 1, 2'}
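
The answer and its sources come back together in one `output_text` string. If they are needed separately, one possible approach is to split on the trailing marker. The helper below is a hypothetical sketch, not a LangChain function; it assumes the marker is spelled `SOURCES:` or `Source:` and that source ids are comma-separated, as in the outputs shown in this notebook.

```python
import re


def split_answer_and_sources(output_text):
    """Split a chain answer into (answer, [source ids]) on a SOURCES:/Source: marker."""
    match = re.search(r"\bsources?:\s*(.*)$", output_text, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        # No marker found: return the whole text as the answer with no sources.
        return output_text.strip(), []
    answer = output_text[: match.start()].strip()
    sources = [s.strip() for s in match.group(1).split(",") if s.strip()]
    return answer, sources


result = {"output_text": " The president did not mention Justice Breyer.\nSOURCES: 0, 1, 2"}
answer, sources = split_answer_and_sources(result["output_text"])
print(answer)
print(sources)
```

Source ids are returned as strings, because the model emits them as text.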

## The `map_reduce` Chain

This section shows the results of using the `map_reduce` chain for question answering with sources.
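
As a rough mental model, a `map_reduce`-style chain calls the model once per document and then once more to combine the per-document results. The sketch below illustrates that control flow with a fake model. `map_reduce_answer`, `fake_llm`, and the prompt wording are assumptions made for this example and do not reflect LangChain's internals.

```python
def map_reduce_answer(question, docs, llm):
    """Run `llm` once per document (map), then once over the collected results (reduce)."""
    # Map step: each document is processed independently of the others.
    map_steps = [
        llm(f"Extract text relevant to: {question}\n\nContent: {text}\nSource: {source}")
        for text, source in docs
    ]
    # Reduce step: the per-document results are combined in a single final call.
    combined = "\n\n".join(map_steps)
    output_text = llm(f"Answer with sources: {question}\n\n{combined}")
    return {"map_steps": map_steps, "output_text": output_text}


calls = []  # records every prompt so the number of model calls is visible


def fake_llm(prompt):
    """Hypothetical stand-in for a model call; it just numbers its responses."""
    calls.append(prompt)
    return f"response {len(calls)}"


example_docs = [("First chunk.", 0), ("Second chunk.", 1)]
result = map_reduce_answer("What did the president say about Justice Breyer", example_docs, fake_llm)
print(result)
print(len(calls), "model calls")
```

With N documents this pattern makes N + 1 model calls, and the map calls do not depend on one another.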

In [11]:
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="map_reduce")

In [14]:
query = "What did the president say about Justice Breyer"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)

{'output_text': ' The president did not mention Justice Breyer.\nSOURCES: 0, 1, 2'}

**Intermediate Steps**

We can also return the intermediate steps for `map_reduce` chains, should we want to inspect them. This is done by passing the `return_map_steps` argument.

In [15]:
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="map_reduce", return_map_steps=True)

In [16]:
chain({"input_documents": docs, "question": query}, return_only_outputs=True)

{'map_steps': [' None', ' None', ' None'],
 'output_text': ' The president did not mention Justice Breyer.\nSOURCES: 0, 1, 2'}
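
To inspect the intermediate results, it can help to pair each map step with the document it came from. The sketch below does this for the output shown above. It assumes that `map_steps` preserves the order of the input documents, which this notebook does not confirm.

```python
result = {
    "map_steps": [" None", " None", " None"],
    "output_text": " The president did not mention Justice Breyer.\nSOURCES: 0, 1, 2",
}
doc_sources = [0, 1, 2]  # the "source" metadata of the three input documents

# Pair each intermediate result with its document (assumes input order is preserved).
steps_by_source = {
    source: step.strip() for source, step in zip(doc_sources, result["map_steps"])
}

# Documents whose map step produced nothing relevant.
empty_sources = [source for source, step in steps_by_source.items() if step == "None"]
print(steps_by_source)
print(empty_sources)
```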

## The `refine` Chain

This section shows the results of using the `refine` chain for question answering with sources.
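
A `refine`-style chain can be pictured as a loop: answer from the first document, then revise that answer once for each remaining document. The sketch below illustrates the loop with a fake model. `refine_answer`, `fake_llm`, and the prompt wording are assumptions made for this example and do not reflect LangChain's internals.

```python
def refine_answer(question, docs, llm):
    """Answer from the first document, then revise the answer once per remaining document."""
    first_text, first_source = docs[0]
    answer = llm(f"Question: {question}\nContent: {first_text}\nSource: {first_source}")
    refine_steps = [answer]
    for text, source in docs[1:]:
        # Each call sees the previous answer plus exactly one new document.
        answer = llm(
            f"Question: {question}\nExisting answer: {answer}\n"
            f"New content: {text}\nSource: {source}\nRefine the existing answer if needed."
        )
        refine_steps.append(answer)
    return {"refine_steps": refine_steps, "output_text": answer}


prompts = []  # records every prompt sent to the fake model


def fake_llm(prompt):
    """Hypothetical stand-in for a model call; it just numbers its drafts."""
    prompts.append(prompt)
    return f"draft {len(prompts)}"


example_docs = [("First chunk.", 0), ("Second chunk.", 1), ("Third chunk.", 2)]
result = refine_answer("What did the president say about Justice Breyer", example_docs, fake_llm)
print(result)
```

Unlike the map step of `map_reduce`, these calls are sequential, because each one depends on the answer produced by the previous call.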

In [17]:
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="refine")

In [18]:
query = "What did the president say about Justice Breyer"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)

{'output_text': "\n\nThe president did not mention Justice Breyer in his speech to the European Parliament. He spoke about the struggle of the Ukrainian people, the importance of the NATO Alliance, and the need for American diplomacy and resolve. He discussed Putin's premeditated and unprovoked attack on Ukraine, and the efforts to build a coalition of freedom-loving nations to confront Putin. He also discussed how the free world is holding Putin accountable, and the countries that are part of the European Union, including France, Germany, Italy, the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and Switzerland. Source: 1, 2"}

**Intermediate Steps**

We can also return the intermediate steps for `refine` chains, should we want to inspect them. This is done by passing the `return_refine_steps` argument.

In [19]:
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="refine", return_refine_steps=True)

In [20]:
chain({"input_documents": docs, "question": query}, return_only_outputs=True)

{'refine_steps': ['\nThe president did not mention Justice Breyer in the given context.',
  '\n\nThe president did not mention Justice Breyer in his speech to the European Parliament. He spoke about the struggle of the Ukrainian people, the importance of the NATO Alliance, and the need for American diplomacy and resolve. Source: 1',
  "\n\nThe president did not mention Justice Breyer in his speech to the European Parliament. He spoke about the struggle of the Ukrainian people, the importance of the NATO Alliance, and the need for American diplomacy and resolve. He discussed Putin's premeditated and unprovoked attack on Ukraine, and the efforts to build a coalition of freedom-loving nations to confront Putin. He also discussed how the free world is holding Putin accountable, and the countries that are part of the coalition, including France, Germany, Italy, the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and Switzerland. Source: 1, 2"],
 'output_text': "\n\nThe preside