# Document Question Answering with AI Applications Search and LangChain

In [1]:
! pip install -q --user google-cloud-aiplatform google-cloud-discoveryengine langchain-google-vertexai langchain-google-community

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
kfp 2.5.0 requires requests-toolbelt<1,>=0.8.0, but you have requests-toolbelt 1.0.0 which is incompatible.

In [2]:
# Restart kernel after packages are installed so that your environment can access the new packages
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

In [1]:
PROJECT_ID = "qwiklabs-gcp-02-826dd3c76c0f"
LOCATION = "us-east1"

import vertexai

vertexai.init(project=PROJECT_ID, location=LOCATION)

In [2]:
DATA_STORE_ID = "qna-datastore-id"  # @param {type:"string"}
DATA_STORE_LOCATION = "global"  # @param {type:"string"}

MODEL = "gemini-2.0-flash"  # @param {type:"string"}

if PROJECT_ID == "YOUR_PROJECT_ID" or DATA_STORE_ID == "YOUR_DATA_STORE_ID":
    raise ValueError(
        "Please set the PROJECT_ID, DATA_STORE_ID constants to reflect your environment."
    )

In [3]:
from langchain.chains import RetrievalQA
from langchain.chains import RetrievalQAWithSourcesChain
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate

from langchain_google_vertexai import ChatVertexAI
from langchain_google_community import VertexAISearchRetriever
from langchain_google_community import VertexAIMultiTurnSearchRetriever

In [4]:
llm = ChatVertexAI(model_name=MODEL)

retriever = VertexAISearchRetriever(
    project_id=PROJECT_ID,
    location_id=DATA_STORE_LOCATION,
    data_store_id=DATA_STORE_ID,
    get_extractive_answers=True,
    max_documents=10,
    max_extractive_segment_count=1,
    max_extractive_answer_count=5,
)



`RetrievalQA` is the simplest document Q&A chain offered by LangChain.

 - Here, we use the `stuff` type, which simply inserts all of the document chunks into the prompt.
 - This has the advantage of making only a single LLM call, which is faster and more cost-efficient.
 - However, if we have a large number of search results we run the risk of exceeding the token limit in our prompt, or truncating useful information.
 - Other chain types such as `map_reduce` and `refine` use an iterative process that makes multiple LLM calls, taking individual document chunks at a time and refining the answer iteratively.
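The difference in the number of LLM calls can be illustrated without any cloud dependencies. The following is a minimal sketch, not LangChain's actual implementation: `CountingLLM`, `stuff_answer`, and `map_reduce_answer` are hypothetical names, the prompts are simplified, and the "model" is a stand-in that only records what it was asked.

```python
from typing import Callable, List


def stuff_answer(llm: Callable[[str], str], question: str, chunks: List[str]) -> str:
    """'stuff' style: put every chunk into one prompt and make a single call."""
    context = "\n\n".join(chunks)
    return llm(f"Context:\n{context}\n\nQuestion: {question}")


def map_reduce_answer(llm: Callable[[str], str], question: str, chunks: List[str]) -> str:
    """'map_reduce' style: one call per chunk, then one call to combine the results."""
    partials = [llm(f"Context:\n{chunk}\n\nQuestion: {question}") for chunk in chunks]
    combined = "\n".join(partials)
    return llm(f"Combine these partial answers:\n{combined}\n\nQuestion: {question}")


class CountingLLM:
    """Hypothetical stand-in for a real model: records prompts, returns a placeholder."""

    def __init__(self) -> None:
        self.prompts: List[str] = []

    def __call__(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return f"answer-{len(self.prompts)}"


chunks = ["chunk A", "chunk B", "chunk C"]  # placeholder document chunks

stuff_llm = CountingLLM()
stuff_answer(stuff_llm, "What was revenue?", chunks)

map_reduce_llm = CountingLLM()
map_reduce_answer(map_reduce_llm, "What was revenue?", chunks)

# One call for 'stuff'; one per chunk plus a combine call for 'map_reduce'
print(len(stuff_llm.prompts), len(map_reduce_llm.prompts))
```

With three chunks, the single `stuff` prompt contains all of them, which is why its size grows with the number of search results, while the `map_reduce` sketch keeps each prompt small at the cost of extra calls.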

In [5]:
search_query = "What was Alphabet's Revenue in Q2 2021?"  # @param {type:"string"}

retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever
)
retrieval_qa.invoke(search_query)

{'query': "What was Alphabet's Revenue in Q2 2021?",
 'result': "Alphabet's revenue in Q2 2021 was $61.9 billion.\n"}

Next, we inspect the source documents. If we add `return_source_documents=True`, the chain also returns the document chunks that the retriever found.

This is helpful for debugging, as these chunks may not always be relevant to the answer, or their relevance might not be obvious.

In [6]:
retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True
)

results = retrieval_qa.invoke(search_query)

print("*" * 79)
print(results["result"])
print("*" * 79)
for doc in results["source_documents"]:
    print("-" * 79)
    print(doc.page_content)

*******************************************************************************
Alphabet's revenue in Q2 2021 was $61.9 billion.

*******************************************************************************
-------------------------------------------------------------------------------
Our long-term investments in AI and Google Cloud are helping us drive significant improvements in everyone&#39;s digital experience.” “Our strong second quarter revenues of <b>$61.9 billion</b> reflect elevated consumer online activity and broad-based strength in advertiser spend.
-------------------------------------------------------------------------------
Alphabet Inc. CONSOLIDATED STATEMENTS OF INCOME (In millions, except share amounts which are reflected in thousands and per share amounts) Quarter Ended June 30, Year To Date June 30, 2020 2021 2020 2021 (unaudited) (unaudited) Revenues $ 38297 $ 61880 $ 79456 $ 117194 Costs and expenses: Cost of revenues 18553 26227 37535 50330 Research and deve
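
The chunks printed above contain HTML entities (`&#39;`) and `<b>` highlight tags. When reading them during debugging, it can help to normalize the text first. The helper below is a small sketch using only the standard library; `clean_snippet` is a hypothetical name, and the regular expression only handles simple inline tags, not arbitrary HTML.

```python
import html
import re


def clean_snippet(text: str) -> str:
    """Drop simple inline tags such as <b>...</b>, then decode HTML entities."""
    # Strip tags before unescaping so that encoded text like &lt;b&gt; is kept as text
    without_tags = re.sub(r"</?[a-zA-Z][^>]*>", "", text)
    return html.unescape(without_tags)


sample = "everyone&#39;s digital experience. revenues of <b>$61.9 billion</b> reflect"
print(clean_snippet(sample))
# prints: everyone's digital experience. revenues of $61.9 billion reflect
```

Inside the loop above, this could be applied as `print(clean_snippet(doc.page_content))`.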

The `RetrievalQAWithSourcesChain` variant returns an answer to the question alongside the source documents that were used to generate the answer.

In [7]:
retrieval_qa_with_sources = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever
)

retrieval_qa_with_sources.invoke(search_query, return_only_outputs=True)

{'answer': "Alphabet's revenue in Q2 2021 was $61.9 billion.\n",
 'sources': 'gs://cloud-samples-data/gen-app-builder/search/alphabet-investor-pdfs/2021Q2_alphabet_earnings_release.pdf1, gs://cloud-samples-data/gen-app-builder/search/alphabet-investor-pdfs/2021Q2_alphabet_earnings_release.pdf5, gs://cloud-samples-data/gen-app-builder/search/alphabet-investor-pdfs/2021Q2_alphabet_earnings_release.pdf2'}
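
Note that `sources` comes back as a single comma-separated string rather than a list. A minimal sketch for splitting it is shown below; `split_sources` is a hypothetical helper, the example URIs are placeholders, and the approach assumes no source URI itself contains a comma.

```python
from typing import List


def split_sources(sources: str) -> List[str]:
    """Split a comma-separated sources string into a list of trimmed entries."""
    return [part.strip() for part in sources.split(",") if part.strip()]


# Placeholder value shaped like the 'sources' field above (not real output)
example_sources = "gs://example-bucket/report_a.pdf, gs://example-bucket/report_b.pdf"

for source in split_sources(example_sources):
    print(source)
```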

`ConversationalRetrievalChain` remembers and uses previous questions so you can have a chat-like discovery process.

To use this chain we must provide a memory class to store and pass the previous messages to the LLM as context. Here we use the `ConversationBufferMemory` class that comes with LangChain.

`VertexAIMultiTurnSearchRetriever` uses multi-turn search (also called conversational search or search with followups) to preserve context between requests.

`ConversationalRetrievalChain` works with both retrievers, and the multi-turn retriever can be substituted into any of the previous examples.
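
To make the role of the memory concrete, here is a simplified sketch of the idea: earlier turns are kept in a buffer and combined with the follow-up so that a model can rewrite it as a standalone question. This is an illustration only, not LangChain's code; `BufferMemorySketch`, `build_condense_prompt`, and the prompt wording are all hypothetical.

```python
from typing import List, Tuple


class BufferMemorySketch:
    """Hypothetical stand-in for a buffer memory: keeps every turn, in order."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def save(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))

    def as_text(self) -> str:
        return "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in self.turns)


def build_condense_prompt(memory: BufferMemorySketch, follow_up: str) -> str:
    """Build a prompt asking a model to rewrite the follow-up as a standalone question."""
    return (
        "Given the conversation below, rephrase the follow-up as a standalone question.\n\n"
        f"{memory.as_text()}\n\n"
        f"Follow-up: {follow_up}\n"
        "Standalone question:"
    )


memory = BufferMemorySketch()
# The answer text is a placeholder, not real model output
memory.save("What were alphabet revenues in 2022?", "<answer from the chain>")

prompt = build_condense_prompt(memory, "What about costs and expenses?")
print(prompt)
```

Without the earlier turn in the prompt, a follow-up such as "What about costs and expenses?" carries no indication of which company or year is meant.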

In [12]:
multi_turn_retriever = VertexAIMultiTurnSearchRetriever(
    project_id=PROJECT_ID, location_id=DATA_STORE_LOCATION, data_store_id=DATA_STORE_ID
)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
conversational_retrieval = ConversationalRetrievalChain.from_llm(
    llm=llm, retriever=multi_turn_retriever, memory=memory
)

search_query = "What were alphabet revenues in 2022?"

result = conversational_retrieval.invoke(search_query)
print(result["answer"])

Alphabet revenues in 2022 were $282,836 million.


In [13]:
new_query = "What about costs and expenses?"
result = conversational_retrieval.invoke(new_query)
print(result["answer"])

Alphabet's costs and expenses in 2022 were $207,994 million.


In [14]:
new_query = "Is this more than in 2021?"

result = conversational_retrieval.invoke(new_query)
print(result["answer"])

Yes, Alphabet's total costs and expenses in 2022 were $207,994 million, while in 2021 they were $178,923 million.



In [None]:
qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True
)

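# Print the template of the first message in the prompt that the "stuff" chain sends to the chat model
print(qa.combine_documents_chain.llm_chain.prompt.messages[0].prompt.template)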