# Document Question Answering

An example of using Chroma DB and LangChain to do question answering over documents.

In [None]:
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.llms import OpenAI
from langchain.chains import VectorDBQA
from langchain.document_loaders import TextLoader


## Load documents

Load the documents to run question answering over. To use your own documents, replace this section.

In [4]:
loader = TextLoader('state_of_the_union.txt', encoding="utf-8")
documents = loader.load()
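If your own material is spread over several plain-text files, the loading step could be sketched with the standard library alone. This is a minimal illustration, not LangChain's loader: `load_text_files` is a hypothetical helper, and the `page_content` / `metadata` keys are chosen here only to resemble the shape of a loaded document.

```python
from pathlib import Path


def load_text_files(folder, pattern="*.txt", encoding="utf-8"):
    """Read every file in `folder` matching `pattern` into a simple record.

    Hypothetical helper for illustration; it returns plain dicts,
    not LangChain Document objects.
    """
    records = []
    # Sort the paths so the order is stable across runs.
    for path in sorted(Path(folder).glob(pattern)):
        records.append(
            {
                "page_content": path.read_text(encoding=encoding),
                "metadata": {"source": str(path)},
            }
        )
    return records
```

Calling `load_text_files("my_docs")` (a placeholder folder name) would return one record per `.txt` file found directly in that folder.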

## Split documents

Split the documents into small chunks. This lets us find the chunks most relevant to a query and pass only those to the LLM, rather than the full text.

In [5]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)
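To see what `chunk_size` and `chunk_overlap` control, here is a deliberately simplified fixed-width splitter. It is an assumption-laden sketch: `split_text` is a hypothetical function that cuts purely by character count, whereas `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences.

```python
def split_text(text, chunk_size=1000, chunk_overlap=0):
    """Cut `text` into windows of at most `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters.
    Simplified illustration only; not the library's algorithm.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = "abcdefghij" * 250  # 2,500 characters of filler text
chunks = split_text(sample, chunk_size=1000, chunk_overlap=0)
print([len(c) for c in chunks])  # [1000, 1000, 500]
```

With `chunk_overlap=0`, as in the cell above, joining the chunks reproduces the input exactly. A nonzero overlap repeats the tail of each chunk at the head of the next, which can help when an answer straddles a chunk boundary.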

## Initialize ChromaDB

Create an embedding for each chunk and insert them into the Chroma vector database.

In [6]:
embeddings = OpenAIEmbeddings()
vectordb = Chroma.from_documents(texts, embeddings)

Note: this step requires the `chromadb` package. Without it, the cell fails with:

ImportError: Could not import chromadb python package. Please install it with `pip install chromadb`.
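Conceptually, the embedding and vector store step turns each chunk into a vector and later returns the chunks whose vectors are closest to the query's. The sketch below illustrates that idea with stand-ins: a toy word-count "embedding" replaces `OpenAIEmbeddings`, a plain list replaces Chroma, and the sample chunks are invented placeholder strings. The names `embed`, `cosine`, and `top_k` are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Invented placeholder chunks, not taken from the actual document.
chunks = [
    "the court nominee was praised in the speech",
    "the economy added jobs last year",
    "energy prices were also discussed",
]
best = top_k("what was said about the court nominee", chunks, k=1)
print(best)  # ['the court nominee was praised in the speech']
```

A real vector database does the same ranking over dense model embeddings and typically uses an index to avoid comparing the query against every chunk.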

## Create the chain

Initialize the chain we will use for question answering.

In [9]:
qa = VectorDBQA.from_chain_type(llm=OpenAI(), chain_type="stuff", vectorstore=vectordb)
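The `"stuff"` chain type places all retrieved chunks into a single prompt alongside the question. A rough sketch of that idea follows; `build_stuff_prompt` is a hypothetical helper and the template wording is an assumption, not the prompt LangChain actually sends.

```python
def build_stuff_prompt(question, chunks):
    """Concatenate retrieved chunks and the question into one prompt string.

    Illustrative template only; the real chain uses its own wording.
    """
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_stuff_prompt(
    "What is in the context?",
    ["First retrieved chunk.", "Second retrieved chunk."],
)
print(prompt)
```

Because everything goes into one prompt, this approach only works while the retrieved chunks together fit in the model's context window, which is one reason the documents are split into small chunks first.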

## Ask questions!

Now we can use the chain to ask questions!

In [10]:
query = "What did the president say about Ketanji Brown Jackson"
qa.run(query)

" The president said that Ketanji Brown Jackson is one of the nation's top legal minds and that she has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. He also said that she is a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He described her as a consensus builder."