# Document Question Answering

In [1]:
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQAWithSourcesChain

In [2]:
import os
import textwrap

import openai
from dotenv import load_dotenv, find_dotenv

# Load OPENAI_API_KEY (and any other settings) from a local .env file.
load_dotenv(find_dotenv())

openai.api_key = os.environ['OPENAI_API_KEY']
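
Reading the key with `os.environ[...]` raises a bare `KeyError` when the variable is missing. A minimal sketch of a friendlier check follows; `require_env` is a hypothetical helper name, not part of `dotenv` or `openai`.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Hypothetical helper: fail early instead of at the first API call.
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment."
        )
    return value
```

With this in place, the key could be read as `require_env("OPENAI_API_KEY")` after `load_dotenv(...)` has run.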

## Initialize ChromaDB

In [3]:
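embeddings = OpenAIEmbeddings()
persist_directory = 'chroma/'
# Open the collection stored on disk; the embedding function should match the one used to build it.
vectordb = Chroma(persist_directory=persist_directory, embedding_function=embeddings)
# Number of stored chunks (read through a private attribute, so this may break across versions).
print(vectordb._collection.count())
vectordb.persist()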

199


## Create the chain and a helper function

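Initialize the chain we will use for question answering. It retrieves relevant chunks from the vector store, passes them to the LLM, and returns an answer together with its sources.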

In [4]:
#llm_model_name = "gpt-3.5-turbo"
llm_model_name = "gpt-4"
llm = ChatOpenAI(model_name=llm_model_name, temperature=0)

qa = RetrievalQAWithSourcesChain.from_chain_type(
    llm,
    retriever=vectordb.as_retriever(),
    return_source_documents=True
)

def pretty_query(query, qa=qa):
    """Run the chain on `query` and print the wrapped answer and its sources."""
    result = qa(query)
    print(f"{textwrap.fill(result['answer'])}\n\nSource: {result['sources']}")
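
Because the chain is built with `return_source_documents=True`, the result should also carry the retrieved chunks under `source_documents`. The sketch below formats them alongside the answer; `format_result` is a hypothetical helper, and it assumes each document exposes `page_content` and a `metadata` dict with a `source` key.

```python
import textwrap


def format_result(result, width=70):
    """Return the answer, the cited sources, and the retrieved chunks as text."""
    lines = [textwrap.fill(result["answer"], width=width), ""]
    lines.append(f"Sources: {result['sources']}")
    # "source_documents" is only expected when return_source_documents=True.
    for i, doc in enumerate(result.get("source_documents", []), start=1):
        source = doc.metadata.get("source", "unknown")
        # Collapse whitespace and truncate each chunk to a single line.
        snippet = textwrap.shorten(doc.page_content, width=width, placeholder="...")
        lines.append(f"[{i}] {source}: {snippet}")
    return "\n".join(lines)
```

Calling `print(format_result(qa(query)))` would then show which chunks the answer was drawn from, which helps when an answer cites an unexpected source.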
    

In [10]:
print(qa.combine_documents_chain.llm_chain.prompt.template)

Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES"). 
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.

QUESTION: Which state/country's law governs the interpretation of the contract?
Content: This Agreement is governed by English law and the parties submit to the exclusive jurisdiction of the English courts in  relation to any dispute (contractual or non-contractual) concerning this Agreement save that either party may apply to any court for an  injunction or other relief to protect its Intellectual Property Rights.
Source: 28-pl
Content: No Waiver. Failure or delay in exercising any right or remedy under this Agreement shall not constitute a waiver of such (or any other)  right or remedy.

11.7 Severability. The invalidity, illegality or unenforceability of any term (or part of a term) of this Agreement shall not affect the con
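
The printed template shows each retrieved chunk rendered as a `Content:` line followed by a `Source:` line. As a rough illustration of that layout only (not LangChain's actual implementation), the block could be assembled like this; `Chunk`, `render_summaries`, and the source name `example-source` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved document chunk."""
    page_content: str
    source: str


def render_summaries(chunks):
    """Join chunks in the 'Content: ... / Source: ...' layout seen in the prompt."""
    return "\n".join(
        f"Content: {c.page_content}\nSource: {c.source}" for c in chunks
    )


# Made-up example chunks, used only to show the layout.
chunks = [
    Chunk("This Agreement is governed by English law.", "28-pl"),
    Chunk("Failure or delay in exercising any right is not a waiver.", "example-source"),
]
print(render_summaries(chunks))
```

The model sees only this rendered text, so the `Source:` label attached to each chunk is what it can cite in the "SOURCES" part of its answer.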

## Ask questions!

Now we can use the chain to ask questions!

In [None]:
pretty_query ("What did the president say about large corporations and the wealthy?")

In [None]:
pretty_query ("What does Sarah Silverman allege against OpenAI?")

### Note: the chain seems unable to distinguish the Mar-a-Lago indictment from the later indictment.

In [None]:
pretty_query("What does the District of Columbia indictment against Donald Trump allege?  I am not interested in the Florida indictment.")
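
One possible way to keep the two documents apart is to restrict retrieval to a single source rather than relying on the wording of the question. The sketch below assumes every chunk was stored with a `source` metadata field and that the Chroma retriever accepts a `filter` entry in `search_kwargs`; `make_filtered_retriever` and the file name in the usage comment are hypothetical.

```python
def make_filtered_retriever(vectordb, source, k=4):
    """Build a retriever that only returns chunks whose 'source' metadata matches."""
    # Assumption: chunks carry a "source" metadata field set at ingestion time.
    return vectordb.as_retriever(
        search_kwargs={"k": k, "filter": {"source": source}}
    )


# Hypothetical usage; "dc_indictment.pdf" is a made-up source name.
# dc_qa = RetrievalQAWithSourcesChain.from_chain_type(
#     llm,
#     retriever=make_filtered_retriever(vectordb, "dc_indictment.pdf"),
#     return_source_documents=True,
# )
# pretty_query("What does the indictment allege?", qa=dc_qa)
```

Filtering happens before the LLM sees anything, so chunks from the other indictment never reach the prompt. Whether this fully resolves the confusion noted above is untested here.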

In [None]:
pretty_query ("Did Richard Robbins and his colleagues reach a conclusion about whether semantic metrics are better than lexical metrics for text generation tasks?")

In [None]:
pretty_query ("Can Epiq employees use AI tools at work?")

In [None]:
pretty_query ("Who should Epiq employees contact if they want to use an AI tool at work?")

In [None]:
pretty_query ("I am an Epiq employee. Can I use Chat GPT at work?")

In [None]:
pretty_query ("Everyone uses Chat GPT.  Why doesn't Epiq let me access it?")

In [None]:
pretty_query("Please summarize the key parts of Epiq's AI usage policy.")

In [None]:
pretty_query ("What are the rules at Epiq for the use of private AI tools?")

In [None]:
pretty_query ("What risks is Epiq concerned about regarding Generative AI?")

In [None]:
pretty_query ("What should I do if I think that Epiq proprietary information has leaked into a public AI?")

In [None]:
pretty_query ("I need to summarize Epiq's AI policy for my team of software developers.  Please provide a summary.")

In [None]:
pretty_query ("What are the risks associated with AI use?")