Suppose you have some text documents (PDF, blog, Notion pages, etc.) and want to ask questions related to the contents of those documents.

LLMs, given their proficiency in understanding text, are a great tool for this.

In this walkthrough, we'll build an application that answers questions over documents using LLMs.

Two closely related use cases, covered elsewhere, are:

- QA over structured data (e.g., SQL)
- QA over code (e.g., Python)

<img src="https://python.langchain.com/assets/images/qa_flow-9fbd91de9282eb806bda1c6db501ecec.jpeg" width="1000" >


In [1]:
from langchain.document_loaders import WebBaseLoader, CSVLoader  # more integrations: https://integrations.langchain.com/
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.chat_models import ChatOpenAI
from langchain import hub
from langchain.schema.runnable import RunnablePassthrough

# Load documents
loader = WebBaseLoader(['https://python.langchain.com/docs/get_started/introduction', 'https://python.langchain.com/docs/integrations/document_loaders/pandas_dataframe'])

In [2]:
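# Split documents
# RecursiveCharacterTextSplitter tries paragraph, line, and word boundaries in turn
# so that each chunk stays within chunk_size characters.
# Context-aware splitters (not used here) also keep the location ("context") of each split in the original Document.

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)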
splits = text_splitter.split_documents(loader.load())
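
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size character chunking with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries to break on separators such as paragraphs and newlines). `chunk_text` is a hypothetical helper written only for illustration.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 0) -> List[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    Hypothetical helper, not part of LangChain.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


sample = "abcdefghij"
no_overlap = chunk_text(sample, chunk_size=4, chunk_overlap=0)
with_overlap = chunk_text(sample, chunk_size=4, chunk_overlap=2)
print(no_overlap)    # ['abcd', 'efgh', 'ij']
print(with_overlap)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

With `chunk_overlap=0`, as in the cell above, no text is repeated between chunks; a non-zero overlap trades some duplication for a lower chance of cutting a relevant sentence in half.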

In [3]:
splits[0]

Document(page_content='Introduction | 🦜️🔗 Langchain', metadata={'source': 'https://python.langchain.com/docs/get_started/introduction', 'title': 'Introduction | 🦜️🔗 Langchain', 'description': 'LangChain is a framework for developing applications powered by language models. It enables applications that:', 'language': 'en'})

In [4]:
# Embed and store splits
# HuggingFaceEmbeddings reference: https://api.python.langchain.com/en/latest/embeddings/langchain.embeddings.huggingface.HuggingFaceEmbeddings.html
# Available sentence-transformers models: https://huggingface.co/models?library=sentence-transformers&sort=downloads

model_name = "sentence-transformers/all-mpnet-base-v2"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': False}

# If the import fails with "cannot import name DEFAULT_CIPHERS", pin urllib3: pip install "urllib3<2"
# See https://stackoverflow.com/questions/76414514/cannot-import-name-default-ciphers-from-urllib3-util-ssl-on-aws-lambda-us
emb = HuggingFaceEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)

# Alternative: OpenAI embeddings (requires an OpenAI API key)
# emb = OpenAIEmbeddings()

vectorstore = Chroma.from_documents(documents=splits, embedding=emb)  # a local FAISS index also works
retriever = vectorstore.as_retriever()
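
The `normalize_embeddings` flag controls whether each vector is scaled to unit length. As a rough illustration of why that matters for similarity search, the sketch below uses plain Python to show that the dot product of two unit-length vectors equals their cosine similarity. The two-dimensional vectors and the helper names are made up for the example; real sentence embeddings have hundreds of dimensions.

```python
import math
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(vec, vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]

raw_dot = dot(a, b)                                # depends on vector length
unit_dot = dot(l2_normalize(a), l2_normalize(b))   # length-independent
cos = cosine_similarity(a, b)

print(raw_dot, round(unit_dot, 2), round(cos, 2))
```

Whether normalization changes the ranking depends on the distance metric the vector store is configured with, so treat this as background rather than a recommendation for a specific setting.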

In [5]:
# Prompt 
# https://smith.langchain.com/hub/rlm/rag-prompt

# You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
# Question: {question} 
# Context: {context} 
# Answer:


rag_prompt = hub.pull("rlm/rag-prompt")
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
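
The prompt text quoted in the comments has two placeholders, `{question}` and `{context}`. A rough plain-Python picture of filling it, using `str.format` on that text rather than the LangChain prompt classes, is sketched below. `build_prompt` and the sample strings are invented for illustration, and joining chunks with blank lines is an assumption about formatting, not something the hub prompt prescribes.

```python
from typing import List

# Template text copied from the comments in the cell above.
RAG_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, just say that you don't know. "
    "Use three sentences maximum and keep the answer concise.\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Hypothetical helper: join retrieved chunks and fill both placeholders."""
    context = "\n\n".join(context_chunks)
    return RAG_TEMPLATE.format(question=question, context=context)


prompt_text = build_prompt(
    "what is langchain?",
    ["LangChain is a framework for developing applications powered by language models."],
)
print(prompt_text)
```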

In [6]:
# Baseline without retrieval: the model gets no context and makes up an answer
llm.predict('what is langchain?')

'Langchain is a decentralized blockchain platform that aims to provide language services and solutions. It utilizes blockchain technology to create a secure and transparent ecosystem for language-related transactions, such as translation, interpretation, and language learning. Langchain aims to connect language service providers and users directly, eliminating intermediaries and reducing costs. It also offers features like smart contracts, reputation systems, and payment solutions to ensure efficient and reliable language services.'

In [7]:
# RAG chain 

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()} 
    | rag_prompt 
    | llm 
)

In [8]:
rag_chain.invoke('what is langchain?')

AIMessage(content='LangChain is a framework for developing applications powered by language models. It enables applications that are context-aware and can reason based on provided context. LangChain provides components that are modular and easy-to-use for working with language models.')
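
The chain above passes a single question through three stages: a map that builds `context` and `question`, the prompt, and the model. The data flow can be sketched in plain Python with stand-in functions. Everything here (`fake_retriever`, `fill_prompt`, `fake_llm`, `run_chain`) is hypothetical and only mimics the shape of the pipeline; it does not call LangChain or any model.

```python
from typing import Dict, List


def fake_retriever(question: str) -> List[str]:
    # Stand-in for the vector store retriever: always returns one fixed chunk.
    return ["LangChain is a framework for developing applications powered by language models."]


def fill_prompt(inputs: Dict[str, object]) -> str:
    # Stand-in for the prompt step: turns the dict into one string.
    context = "\n\n".join(inputs["context"])
    return f"Question: {inputs['question']}\nContext: {context}\nAnswer:"


def fake_llm(prompt: str) -> str:
    # Stand-in for the chat model: echoes the context line instead of generating text.
    for line in prompt.splitlines():
        if line.startswith("Context: "):
            return line[len("Context: "):]
    return "I don't know."


def run_chain(question: str) -> str:
    # Step 1: fan the single input out into a dict. The question is passed through
    # unchanged and is also used to fetch the context.
    inputs = {"context": fake_retriever(question), "question": question}
    # Step 2: fill the prompt. Step 3: call the model.
    return fake_llm(fill_prompt(inputs))


answer = run_chain("what is langchain?")
print(answer)
```

The point of the sketch is the ordering: retrieval happens before the prompt is built, so the model only ever sees the question together with the retrieved text.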

In [9]:
question = "What are the approaches to Task Decomposition?"
docs = vectorstore.similarity_search(question)
len(docs)

4
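
The search returns four documents, which matches the default of `k=4` results. The ranking idea behind a similarity search can be sketched with a toy bag-of-words embedding and cosine similarity. This is an illustration under simplifying assumptions: `embed`, `cosine`, and `top_k` are hypothetical helpers, the sample chunks are invented, and a real vector store uses dense embeddings and an index rather than a full sort.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    # Toy embedding: lowercase word counts. Real embedding models produce dense vectors.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, chunks: List[str], k: int = 4) -> List[str]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "langchain is a framework for language model applications",
    "pandas dataframe loader reads rows into documents",
    "chroma is a vector store",
    "text splitters break documents into chunks",
    "embeddings map text to vectors",
]
hits = top_k("what is langchain", chunks, k=2)
print(hits)
```

Note that a top-k search always returns `k` results when enough chunks exist, even if none of them is actually relevant to the question.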

# Testing another LLM

In [10]:
#llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
from langchain.llms import GPT4All
model = GPT4All(model="./models/gpt4all-model.bin", n_threads=8)

# Simplest invocation
model('what is langchain?')

# Another approach: the RetrievalQA chain

In [None]:
from langchain.chains import RetrievalQA

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# RetrievalQA
qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": rag_prompt}
)
question = 'what is langchain?'
result = qa_chain({"query": question})
result["result"]