# Building a Q&A application using Knowledge Bases for Amazon Bedrock - Retrieve API and LangChain

### Context

With a knowledge base, you can securely connect foundation models (FMs) in Amazon Bedrock to your company
data for Retrieval Augmented Generation (RAG). Access to additional data helps the model generate more relevant,
context-specific, and accurate responses without continuously retraining the FM. All information retrieved from
knowledge bases comes with source attribution to improve transparency and minimize hallucinations. For more information on creating a knowledge base using the console, please refer to the [documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html).

In this notebook, we will dive deep into building a Q&A application using the Retrieve API provided by Knowledge Bases for Amazon Bedrock and LangChain. We will query the knowledge base to get the desired number of document chunks based on similarity search, integrate it with a LangChain retriever, and use the Anthropic Claude Instant model to answer questions.


### Pattern

We can implement the solution using the Retrieval Augmented Generation (RAG) pattern. RAG retrieves data from outside the language model (non-parametric) and augments the prompt by adding the relevant retrieved data as context. Here, we perform RAG on the knowledge base created in the previous notebook or using the console.

### Pre-requisite

Before questions can be answered, the documents must be processed and stored in the knowledge base.

1. Load the documents into the knowledge base by connecting your S3 bucket (data source). 
2. Ingestion - The knowledge base splits the documents into smaller chunks (based on the strategy selected), generates embeddings, and stores them in the associated vector store.

![data_ingestion.png](./images/data_ingestion.png)
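
To make the chunking step more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is only an illustration: the knowledge base performs chunking for you during ingestion, and its strategies are not necessarily character-based. The function name `chunk_text` and the parameters `chunk_size` and `overlap` are hypothetical and not part of any AWS API.

```python
def chunk_text(text: str, chunk_size: int = 300, overlap: int = 60) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Toy illustration only: the managed ingestion step does its own chunking.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Small demonstration on a short string.
for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
    print(piece)
```

The overlap means that a sentence cut at a chunk boundary still appears intact in at least one neighbouring chunk, which tends to help similarity search.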


#### Notebook Walkthrough

For our notebook we will use the `Retrieve API` provided by Knowledge Bases for Amazon Bedrock, which converts user queries into
embeddings, searches the knowledge base, and returns the relevant results, giving you more control to build custom
workflows on top of the semantic search results. The output of the `Retrieve API` includes the `retrieved text chunks`, the `location type` and `URI` of the source data, as well as the relevance `scores` of the retrievals. 
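
As a rough sketch of what a direct call looks like, the snippet below wraps the `retrieve` operation of the `bedrock-agent-runtime` client and flattens the response into the fields described above. The helper names `retrieve_chunks` and `summarize_results` are hypothetical, and the sample response is a hand-written stand-in with placeholder values, not real output. The flattening assumes an S3 data source.

```python
def retrieve_chunks(kb_id: str, query: str, top_k: int = 4, client=None) -> dict:
    """Call the Retrieve API and return the raw response dictionary."""
    if client is None:
        # Imported lazily so the parsing helper below can be tried offline.
        import boto3
        client = boto3.client("bedrock-agent-runtime")
    return client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )


def summarize_results(response: dict) -> list[dict]:
    """Flatten each retrieval result into text, location type, URI and score."""
    rows = []
    for result in response.get("retrievalResults", []):
        location = result.get("location", {})
        rows.append({
            "text": result.get("content", {}).get("text", ""),
            "location_type": location.get("type"),
            # Assumes an S3 data source; other sources use other location keys.
            "uri": location.get("s3Location", {}).get("uri"),
            "score": result.get("score"),
        })
    return rows


# Hand-written stand-in for a response (placeholder values, not real output).
sample_response = {
    "retrievalResults": [
        {
            "content": {"text": "example chunk text"},
            "location": {"type": "S3",
                         "s3Location": {"uri": "s3://example-bucket/doc.pdf"}},
            "score": 0.5,
        }
    ]
}
for row in summarize_results(sample_response):
    print(row["score"], row["location_type"], row["uri"])
```

To run it against a real knowledge base, you would call `retrieve_chunks(kb_id, "your question")` with valid AWS credentials and pass the result to `summarize_results`.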


We will then use the `RetrievalQA` chain provided by LangChain and add the Retrieve API-backed retriever (`AmazonKnowledgeBasesRetriever`) as the `retriever` in the chain. The chain automatically augments the original prompt with the retrieved text chunks and passes it to the `anthropic.claude-instant-v1` model.


### Ask question
We will use the following workflow for this notebook. 

![retrieve.png](./images/retrieveAPI.png)


### USE CASE:

#### Dataset

In this example, you will use several years of Amazon's Letters to Shareholders as a text corpus to perform Q&A on. This data is already ingested into Knowledge Bases for Amazon Bedrock. You will need the `knowledge base id` to run this example.

### Python 3.10

⚠  For this lab we need to run the notebook based on a Python 3.10 runtime. ⚠

### Setup

To run this notebook, you need to install the dependencies, including LangChain, and update boto3 and botocore to access the newly released Query API provided by Knowledge Bases for Amazon Bedrock.


In [1]:
# restart kernel
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

### Follow the steps below to set up necessary packages

1. Import the libraries needed to create the `bedrock-runtime` client for invoking foundation models and the `bedrock-agent-runtime` client for using the Retrieve API provided by Knowledge Bases for Amazon Bedrock. 
2. Import LangChain to: 
   1. Initialize the Bedrock model `anthropic.claude-instant-v1` as our large language model to perform query completions using the RAG pattern. 
   2. Initialize the LangChain retriever integrated with knowledge bases. 
   3. Later in the notebook, wrap the LLM and retriever with the `RetrievalQA` chain to build our Q&A application.

In [2]:
import boto3
import pprint
from botocore.client import Config
from langchain.llms.bedrock import Bedrock
from langchain.retrievers.bedrock import AmazonKnowledgeBasesRetriever

pp = pprint.PrettyPrinter(indent=2)

kb_id = "YWNES8HIIH" # replace it with your Knowledge base id.

bedrock_config = Config(connect_timeout=120, read_timeout=120, retries={'max_attempts': 0})
bedrock_client = boto3.client('bedrock-runtime')
bedrock_agent_client = boto3.client("bedrock-agent-runtime",
                              config=bedrock_config
                              )

model_kwargs_claude = {
    "temperature": 0,
    "top_k": 10,
    "max_tokens_to_sample": 3000
}

llm = Bedrock(model_id="anthropic.claude-instant-v1",
              model_kwargs=model_kwargs_claude,
              client = bedrock_client,)

### Retrieve API: Process flow 

Create an `AmazonKnowledgeBasesRetriever` object from LangChain, which will call the `Retrieve API` provided by Knowledge Bases for Amazon Bedrock. The API converts user queries into
embeddings, searches the knowledge base, and returns the relevant results, giving you more control to build custom
workflows on top of the semantic search results. The output of the `Retrieve API` includes the `retrieved text chunks`, the `location type` and `URI` of the source data, as well as the relevance `scores` of the retrievals. 

In [3]:

retriever = AmazonKnowledgeBasesRetriever(
        knowledge_base_id=kb_id,
        retrieval_config={"vectorSearchConfiguration": {"numberOfResults": 4}},
        # endpoint_url=endpoint_url,
        # region_name="us-east-1",
        # credentials_profile_name="<profile_name>",
    )
docs = retriever.get_relevant_documents(
        query="By what percentage did AWS revenue grow year-over-year in 2022?"
    )
pp.pprint(docs)

[ Document(page_content='Whether companies saw extraordinary demand spikes, or demand diminish quickly with reduced external consumption, the cloud’s elasticity to scale capacity up and down quickly, as well as AWS’s unusually broad functionality helped millions of companies adjust to these difficult circumstances.   Our AWS and Consumer businesses have had different demand trajectories during the pandemic. In the first year of the pandemic, AWS revenue continued to grow at a rapid clip—30% year over year (“YoY”) in 2020 on a $35 billion annual revenue base in 2019—but slower than the 37% YoY growth in 2019. This was due in part to the uncertainty and slowing demand that so many businesses encountered, but also in part to our helping companies optimize their AWS footprint to save money. Concurrently, companies were stepping back and determining what they wanted to change coming out of the pandemic. Many concluded that they didn’t want to continue managing their technology infrastructur

`score`: Each returned text chunk has an associated score that indicates how closely the chunk matches the query.
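
One way to use the score is to drop weak matches before building the prompt. The sketch below assumes that the retriever stores the score under `metadata["score"]` on each returned document; inspect `docs[0].metadata` in your environment to confirm. The helper name `filter_by_score`, the threshold, and the sample objects are hypothetical stand-ins so that the snippet runs without AWS access.

```python
from types import SimpleNamespace


def _score(doc) -> float:
    """Return the relevance score of a document, or 0.0 if it is missing."""
    return doc.metadata.get("score") or 0.0


def filter_by_score(docs, min_score: float = 0.5):
    """Keep documents scoring at least min_score, highest score first."""
    kept = [doc for doc in docs if _score(doc) >= min_score]
    return sorted(kept, key=_score, reverse=True)


# Stand-in objects that mimic the attributes used above (made-up scores).
sample_docs = [
    SimpleNamespace(page_content="chunk A", metadata={"score": 0.42}),
    SimpleNamespace(page_content="chunk B", metadata={"score": 0.71}),
    SimpleNamespace(page_content="chunk C", metadata={}),
]

for doc in filter_by_score(sample_docs, min_score=0.4):
    print(f"{_score(doc):.2f}  {doc.page_content}")
```

With the real retriever you would call `filter_by_score(docs, min_score=...)` on the list returned by `get_relevant_documents`. A suitable threshold depends on the embedding model and the data, so treat any fixed value as a starting point.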

### Prompt specific to the model to personalize responses 

Here, we will use the specific prompt below for the model to act as a financial advisor AI system that will provide answers to questions by using fact based and statistical information when possible. We will provide the `Retrieve API` responses from above as a part of the `{context}` in the prompt for the model to refer to, along with the user `query`.  

In [4]:
from langchain.prompts import PromptTemplate

PROMPT_TEMPLATE = """
    Human: You are a financial advisor AI system that provides answers to questions by using fact-based and statistical information when possible. 
    Use the following pieces of information to provide a concise answer to the question enclosed in <question> tags. 
    If you don't know the answer, just say that you don't know, don't try to make up an answer.
    <context>
    {context}
    </context>

    <question>
    {question}
    </question>

    The response should be specific and use statistics or numbers when possible.

    Assistant:"""
claude_prompt = PromptTemplate(template=PROMPT_TEMPLATE, 
                                input_variables=["context","question"])

In [5]:
# Extract the text of each retrieved document chunk
def get_contexts(docs):
    return [doc.page_content for doc in docs]

In [6]:
contexts = get_contexts(docs)
pp.pprint(contexts)

[ 'Whether companies saw extraordinary demand spikes, or demand diminish '
  'quickly with reduced external consumption, the cloud’s elasticity to scale '
  'capacity up and down quickly, as well as AWS’s unusually broad '
  'functionality helped millions of companies adjust to these difficult '
  'circumstances.   Our AWS and Consumer businesses have had different demand '
  'trajectories during the pandemic. In the first year of the pandemic, AWS '
  'revenue continued to grow at a rapid clip—30% year over year (“YoY”) in '
  '2020 on a $35 billion annual revenue base in 2019—but slower than the 37% '
  'YoY growth in 2019. This was due in part to the uncertainty and slowing '
  'demand that so many businesses encountered, but also in part to our helping '
  'companies optimize their AWS footprint to save money. Concurrently, '
  'companies were stepping back and determining what they wanted to change '
  'coming out of the pandemic. Many concluded that they didn’t want to '
  'conti

### Initiate the user prompt and response via the LLM

Here, we format our prompt using the context returned by the Retrieve API together with the user query to get the final response.

In [7]:
query = "By what percentage did AWS revenue grow year-over-year in 2022?"
prompt = claude_prompt.format(context=contexts, 
                                 question=query)
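
Note that `contexts` is a Python list, so formatting the template with it inserts the list's string representation (brackets and quotes included) into the prompt. That works, but you may prefer to join the chunks into plain text first. A minimal sketch, where `join_contexts` and the shortened `TEMPLATE` are hypothetical stand-ins for illustration:

```python
def join_contexts(contexts: list[str], separator: str = "\n\n") -> str:
    """Join non-empty chunks into one block of text for the prompt."""
    cleaned = [chunk.strip() for chunk in contexts if chunk and chunk.strip()]
    return separator.join(cleaned)


# Shortened stand-in for the prompt template used in this notebook.
TEMPLATE = "<context>\n{context}\n</context>\n\n<question>\n{question}\n</question>"

example_chunks = ["first retrieved chunk  ", "", "  second retrieved chunk"]
example_prompt = TEMPLATE.format(
    context=join_contexts(example_chunks),
    question="example question",
)
print(example_prompt)
```

In the notebook this would correspond to passing `context=join_contexts(contexts)` to `claude_prompt.format`.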

In [8]:
response = llm(prompt)
pp.pprint(response)

(' Based on the context provided, AWS revenue grew 29% year-over-year in 2022 '
 'on a $62B revenue base, according to the passage.')


## Integrating the retriever and the LLM defined above with the `RetrievalQA` chain to build the Q&A application

In [9]:
from langchain.chains import RetrievalQA
qa = RetrievalQA.from_chain_type(
                                    llm=llm,
                                    chain_type="stuff",
                                    retriever=retriever,
                                    return_source_documents=True,
                                    chain_type_kwargs={"prompt": claude_prompt}
                                )

In [10]:
answer = qa(query)
print(answer['result'])

 AWS revenue grew 29% year-over-year in 2022 on a $62B revenue base, according to the context provided.
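
Because the chain was created with `return_source_documents=True`, the result also carries the retrieved documents under `answer['source_documents']`, which can be used for source attribution. The sketch below assumes that each document's metadata holds a `location` entry shaped like the Retrieve API output for an S3 data source. The helper name `list_sources` and the sample URIs are hypothetical.

```python
from types import SimpleNamespace


def list_sources(answer: dict) -> list[str]:
    """Collect the distinct source URIs of the documents behind an answer."""
    uris = []
    for doc in answer.get("source_documents", []):
        location = doc.metadata.get("location", {})
        # Assumes an S3 data source; adjust the keys for other source types.
        uri = location.get("s3Location", {}).get("uri")
        if uri and uri not in uris:
            uris.append(uri)
    return uris


def _fake_doc(uri: str):
    """Build a stand-in document with only the metadata this sketch reads."""
    return SimpleNamespace(
        page_content="...",
        metadata={"location": {"type": "S3", "s3Location": {"uri": uri}}},
    )


# Stand-in for the chain output (placeholder values, not real output).
sample_answer = {
    "result": "example answer",
    "source_documents": [
        _fake_doc("s3://example-bucket/letter-a.pdf"),
        _fake_doc("s3://example-bucket/letter-b.pdf"),
        _fake_doc("s3://example-bucket/letter-a.pdf"),
    ],
}
for uri in list_sources(sample_answer):
    print(uri)
```

With the real chain you would call `list_sources(answer)` on the dictionary returned by `qa(query)`.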
