## Building a Q&A application using Knowledge Bases for Amazon Bedrock - Retrieve API

### Context

In this notebook, we dive deep into building a Q&A application using the Retrieve API of Knowledge Bases for Amazon Bedrock. We query the knowledge base to get the desired number of document chunks based on similarity search. We then augment the prompt with the relevant documents and the query, and pass it as input to Anthropic Claude V2 to generate a response.

With a knowledge base, you can securely connect foundation models (FMs) in Amazon Bedrock to your company
data for Retrieval Augmented Generation (RAG). Access to additional data helps the model generate more relevant,
context-specific, and accurate responses without continuously retraining the FM. All information retrieved from
knowledge bases comes with source attribution to improve transparency and minimize hallucinations. For more information on creating a knowledge base using the console, please refer to the [documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html).

### Pattern

We can implement the solution using the Retrieval Augmented Generation (RAG) pattern. RAG retrieves data from outside the language model (non-parametric) and augments the prompt by adding the relevant retrieved data as context. Here, we perform RAG on the knowledge base created using the console or SDK.
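To make the retrieve-then-augment idea concrete, here is a minimal, self-contained sketch. It is not how Knowledge Bases for Amazon Bedrock works internally: it swaps learned embeddings for simple bag-of-words vectors with cosine similarity, and the helper names (`bow_vector`, `top_k`, `augment`) and sample documents are hypothetical.

```python
import math
from collections import Counter


def bow_vector(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, documents: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k documents most similar to the query, best first."""
    q = bow_vector(query)
    scored = [(cosine_similarity(q, bow_vector(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def augment(query: str, documents: list[str], k: int = 2) -> str:
    """Build a prompt that places the retrieved documents before the question."""
    context = "\n".join(doc for _, doc in top_k(query, documents, k))
    return f"<context>\n{context}\n</context>\n\n<question>\n{query}\n</question>"


# Hypothetical sample corpus, for illustration only.
docs = [
    "revenue grew in 2023",
    "the cat sat on the mat",
    "operating income and revenue",
]
print(augment("revenue in 2023", docs))
```

In the notebook itself, the retrieval step is handled by the knowledge base and the generation step by the model; only the augmentation step is done in our own code.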

### Pre-requisite

Before the application can answer questions, the documents must be processed and ingested into a vector database.

1. Load the documents into the knowledge base by connecting your S3 bucket (data source).
2. Ingestion - Knowledge Bases splits the documents into smaller chunks (based on the strategy selected), generates embeddings, and stores them in the associated vector store.
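The chunking step can be illustrated with a small sketch. This is a simplification under stated assumptions: it splits by characters with a fixed overlap, whereas the service applies its own chunking strategy, and the function name `chunk_text` and its default sizes are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 300, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


for chunk in chunk_text("abcdefghij", chunk_size=4, overlap=1):
    print(repr(chunk))
```

Overlap keeps a little shared text between neighbouring chunks, so a sentence that straddles a boundary is less likely to be lost to retrieval.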

![data_ingestion](./images/data_ingestion.png)


#### Notebook Walkthrough



For our notebook, we will use the `Retrieve API` provided by Knowledge Bases for Amazon Bedrock, which converts user queries into
embeddings, searches the knowledge base, and returns the relevant results, giving you more control to build custom
workflows on top of the semantic search results. The output of the `Retrieve API` includes the `retrieved text chunks`, the `location type` and `URI` of the source data, as well as the relevance `scores` of the retrievals.
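As a rough illustration of that output shape, the sketch below flattens a hand-written sample response into one record per chunk. The sample values (texts, bucket name, scores) are made up, the helper `summarize_results` is hypothetical, and the sketch assumes an S3 data source so that the URI sits under `location.s3Location.uri`.

```python
# Hand-written sample that mimics the shape of a Retrieve API response.
sample_response = {
    "retrievalResults": [
        {
            "content": {"text": "First example chunk."},
            "location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/doc-a.pdf"}},
            "score": 0.71,
        },
        {
            "content": {"text": "Second example chunk."},
            "location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/doc-b.pdf"}},
            "score": 0.55,
        },
    ]
}


def summarize_results(response: dict) -> list[dict]:
    """Flatten each retrieval result into text, location type, URI and score."""
    summary = []
    for result in response.get("retrievalResults", []):
        location = result.get("location", {})
        summary.append(
            {
                "text": result["content"]["text"],
                "location_type": location.get("type"),
                # Assumption: S3 data source; other source types use other keys.
                "uri": location.get("s3Location", {}).get("uri"),
                "score": result.get("score"),
            }
        )
    return summary


for row in summarize_results(sample_response):
    print(row["score"], row["location_type"], row["uri"])
```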


We will then take the retrieved text chunks, combine them with the original prompt, and pass the result to the `anthropic.claude-v2` model, using prompt engineering patterns suited to your use case.
    

### USE CASE:

#### Dataset

In this example, you will use several years of Amazon's Letter to Shareholders as a text corpus to perform Q&A on. This data is already ingested into the Knowledge Bases for Amazon Bedrock. You will need the `knowledge base id` to run this example.
In your specific use case, you can sync different files for different domain topics and query them from this notebook in the same manner to evaluate model responses using the Retrieve API from knowledge bases.


### Python 3.10

⚠  For this lab we need to run the notebook based on a Python 3.10 runtime. ⚠

If you carry out the workshop from your local environment outside of Amazon SageMaker Studio, please make sure you are running Python 3.10 or later.
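A quick way to confirm the runtime before continuing is to compare `sys.version_info` against the minimum version. The helper name `python_version_ok` is hypothetical, and the check assumes the requirement means Python 3.10 or later.

```python
import sys


def python_version_ok(version_info=sys.version_info, minimum=(3, 10)) -> bool:
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


print(sys.version)
if not python_version_ok():
    print("Warning: this notebook expects Python 3.10 or later.")
```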

### Setup

To run this notebook, you need to install the packages imported in the cells below: `boto3` (which brings in `botocore` as a dependency) and `langchain`.


### Follow the steps below to initiate the bedrock client:

1. Import the necessary libraries, along with LangChain for Bedrock model selection and LlamaIndex to store the service context containing the LLM and embedding model instances. We will use this service context later in the notebook to evaluate the responses from our Q&A application.

2. Initialize `anthropic.claude-v2` as our large language model to perform query completions using the RAG pattern with the given knowledge base, once we have retrieved the text chunks through the `retrieve` API.

In [2]:
import pprint

import boto3
from botocore.client import Config
from langchain.llms.bedrock import Bedrock

pp = pprint.PrettyPrinter(indent=2)

# Generous timeouts for the retrieval client; automatic retries are turned off.
bedrock_config = Config(connect_timeout=120, read_timeout=120, retries={'max_attempts': 0})

# Runtime client used to invoke the model.
bedrock_client = boto3.client('bedrock-runtime')
# Agent runtime client that exposes the Knowledge Bases Retrieve API.
bedrock_agent_client = boto3.client("bedrock-agent-runtime",
                                    config=bedrock_config)

# Inference parameters for Claude: temperature 0 for more repeatable answers.
model_kwargs_claude = {
    "temperature": 0,
    "top_k": 10,
    "max_tokens_to_sample": 3000,
}

llm = Bedrock(model_id="anthropic.claude-v2",
              model_kwargs=model_kwargs_claude,
              client=bedrock_client)

### Retrieve API: Process flow 

Define a retrieve function that calls the `Retrieve API` provided by Knowledge Bases for Amazon Bedrock. The API converts user queries into
embeddings, searches the knowledge base, and returns the relevant results, giving you more control to build custom
workflows on top of the semantic search results. The output of the `Retrieve API` includes the `retrieved text chunks`, the `location type` and `URI` of the source data, as well as the relevance `scores` of the retrievals.

![retrieveAPI](./images/retrieveAPI.png)



In [3]:
def retrieve(query, kbId, numberOfResults=5):
    """Return the Retrieve API response for `query` against knowledge base `kbId`."""
    response = bedrock_agent_client.retrieve(
        retrievalQuery={'text': query},
        knowledgeBaseId=kbId,
        retrievalConfiguration={
            'vectorSearchConfiguration': {
                'numberOfResults': numberOfResults
            }
        },
    )
    return response

#### Set your knowledge base ID before querying the LLM

In [4]:
kb_id = "U4DWFE7LSJ" # replace it with your Knowledge base id.

Next, we will call the `Retrieve API` and pass the `knowledge base id`, `number of results`, and `query` as parameters.

`score`: Each returned text chunk comes with a score that indicates how closely the chunk matches the query.
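One way to use these scores is to drop weak matches before building the prompt. The sketch below works on hand-written sample results; the helper `filter_by_score` and the `0.5` threshold are hypothetical, and a sensible cutoff depends on your vector store and data, so treat the value as a placeholder to tune.

```python
def filter_by_score(results: list[dict], min_score: float = 0.5) -> list[dict]:
    """Keep results whose score is at least `min_score`, highest score first."""
    kept = [r for r in results if r.get("score", 0.0) >= min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)


# Hand-written sample in the shape of `response['retrievalResults']`.
sample_results = [
    {"content": {"text": "Loosely related chunk."}, "score": 0.31},
    {"content": {"text": "Closely related chunk."}, "score": 0.82},
    {"content": {"text": "Somewhat related chunk."}, "score": 0.57},
]

for result in filter_by_score(sample_results):
    print(result["score"], result["content"]["text"])
```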

In [5]:
query = "固德威2023营收情况?"  # English: "What was GoodWe's revenue in 2023?"
response = retrieve(query, kb_id, 3)
retrievalResults = response['retrievalResults']
pp.pprint(retrievalResults)

[ { 'content': { 'text': '固德威(688390)_公司公告_固德威：2023年半年度报告新浪财经_新浪网                            '
                         '财经首页\xa0|\xa0新浪首页\xa0|\xa0新浪导航              热点推荐 '
                         '·自选股-轻松管理您的千只股票 ·金融e路通-理财投资更轻松 ·行情中心-通往财富之门          '
                         '财经首页 股票 基金 滚动 公告 大盘 个股 新股 权证 报告 环球市场 博客财经博客股票博客 股票吧 '
                         '港股 美股 行情中心 自选股        上证指数: 0000.00\u30000.00\u3000'
                         '00.00亿元\u3000|\u3000深圳成指: 0000.00\u30000.00\u3000'
                         '00.00亿元\u3000|\u3000沪深300: 0000.00\u30000.00\u3000'
                         '00.00亿元                   读取中,请稍候00-00 00:00:00     '
                         '--.--0.00 '
                         '(0.000%)昨收盘:0.000今开盘:0.000最高价:0.000最低价:0.000 '
                         '成交额:0成交量:0买入价:0.000卖出价:0.000 '
                         '市盈率:0.000收益率:0.00052周最高:0.00052周最低:0.000              '
                         '资讯与公告： 个股资讯 公司公告 年度报告 中期报告 一季度报告 三季度报告       '
                         '固德威：

### Prompt specific to the model to personalize responses 

Here, we will use the prompt below to make the model act as a financial advisor AI system that answers questions using fact-based and statistical information when possible. We will provide the `Retrieve API` responses from above as the `{context_str}` in the prompt for the model to refer to, along with the user `query`.

In [6]:
from langchain.prompts import PromptTemplate

PROMPT_TEMPLATE = """

H: You are a financial advisor AI system, and you provide answers to questions by using fact-based and statistical information when possible.
Use the following pieces of information to provide a concise answer to the question enclosed in <question> tags.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
<context>
{context_str}
</context>

<question>
{query_str}
</question>

The response should be specific and use statistics or numbers when possible.

A:"""

claude_prompt = PromptTemplate(template=PROMPT_TEMPLATE, 
                               input_variables=["context_str","query_str"])
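To see what the model actually receives, it can help to render a template with sample values. The sketch below uses plain `str.format` rather than LangChain, and a shortened, hypothetical template that only mirrors the `{context_str}` and `{query_str}` placeholders used in this notebook.

```python
# Hypothetical, shortened template with the same two placeholders as the notebook's prompt.
TEMPLATE = (
    "\n\nHuman: Use the following information to answer the question in <question> tags.\n"
    "<context>\n{context_str}\n</context>\n\n"
    "<question>\n{query_str}\n</question>\n\n"
    "Assistant:"
)


def render_prompt(context_str: str, query_str: str) -> str:
    """Fill the two placeholders and return the final prompt text."""
    return TEMPLATE.format(context_str=context_str, query_str=query_str)


rendered = render_prompt(
    context_str="Revenue figures appear in the annual report.",
    query_str="What was the revenue?",
)
print(rendered)
```

Printing the rendered prompt before calling the model is a cheap way to catch stray whitespace or a missing placeholder.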

### Extract the text chunks from the retrieveAPI response

In the cell below, we will fetch the context from the retrieval results.

In [7]:
# fetch context from the response
def get_contexts(retrievalResults):
    
    contexts = []
    for retrievedResult in retrievalResults: 
        contexts.append(retrievedResult['content']['text'])
        
    return contexts

In [8]:
contexts = get_contexts(retrievalResults)
pp.pprint(contexts)

[ '固德威(688390)_公司公告_固德威：2023年半年度报告新浪财经_新浪网                            财经首页\xa0'
  '|\xa0新浪首页\xa0|\xa0新浪导航              热点推荐 ·自选股-轻松管理您的千只股票 ·金融e路通-理财投资更轻松 '
  '·行情中心-通往财富之门          财经首页 股票 基金 滚动 公告 大盘 个股 新股 权证 报告 环球市场 博客财经博客股票博客 股票吧 '
  '港股 美股 行情中心 自选股        上证指数: 0000.00\u30000.00\u300000.00亿元\u3000|\u3000'
  '深圳成指: 0000.00\u30000.00\u300000.00亿元\u3000|\u3000沪深300: 0000.00\u3000'
  '0.00\u300000.00亿元                   读取中,请稍候00-00 00:00:00     --.--0.00 '
  '(0.000%)昨收盘:0.000今开盘:0.000最高价:0.000最低价:0.000 成交额:0成交量:0买入价:0.000卖出价:0.000 '
  '市盈率:0.000收益率:0.00052周最高:0.00052周最低:0.000              资讯与公告： 个股资讯 公司公告 年度报告 '
  '中期报告 一季度报告 三季度报告       固德威：2023年半年度报告\t\t\t\t\r'
  ' \t\t\t\t\t（下载公告）      公告日期:2023-08-30      '
  '公司代码：688390                                          '
  '公司简称：固德威固德威技术股份有限公司2023年半年度报告重要提示一、 '
  '本公司董事会、监事会及董事、监事、高级管理人员保证半年度报告内容的真实性、准确性、完整性，不存在虚假记载、误导性陈述或重大遗漏，并承担个别和连带的法律责任。二、 '
  '重大风险提示公司已在本报告中详细描述可能存在的相关风险，具体内容详见本报告第三节“管理层讨论与分析”之“五、风险因素”相关内容。三、 '
  '公司全体董事出席董事会会议。四、

### Initiate the user prompt and response via the LLM

Here, we format our prompt using the context returned by the Retrieve API for our knowledge base, together with the user query, to get the final response.

In [9]:
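# Join the retrieved chunks into one text block so the prompt contains plain text
# rather than the printed form of a Python list.
prompt = claude_prompt.format(context_str="\n\n".join(contexts),
                              query_str=query)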
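When many or long chunks are retrieved, the combined context can grow large. A minimal sketch of one way to cap it is shown below; the helper `build_context` and the `max_chars` budget are hypothetical, and the budget counts characters as a rough stand-in for model tokens.

```python
def build_context(chunks: list[str], max_chars: int = 4000, separator: str = "\n\n") -> str:
    """Join chunks in order, stopping before the character budget would be exceeded."""
    selected = []
    used = 0
    for chunk in chunks:
        # The separator only costs characters once at least one chunk is selected.
        extra = len(chunk) + (len(separator) if selected else 0)
        if used + extra > max_chars:
            break
        selected.append(chunk)
        used += extra
    return separator.join(selected)


print(build_context(["aaaa", "bbbb", "cccc"], max_chars=10, separator="--"))
```

Because retrieval results arrive ordered by relevance, cutting from the end keeps the chunks most likely to help.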

In [10]:
response = llm(prompt)
print(response)

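 根据提供的信息,我无法直接看到固德威2023年的营收情况,因为提供的信息只包含了2022年年度报告和2023年半年度报告,没有2023年全年的相关数据。固德威的2023年全年营收情况还需要等待其2023年年度报告公布。所以关于固德威2023营收情况,我无法直接给出答案。

(English translation: Based on the information provided, I cannot directly see GoodWe's 2023 revenue, because the information provided only includes the 2022 annual report and the 2023 semi-annual report, with no data for the full year 2023. GoodWe's full-year 2023 revenue will only be available once its 2023 annual report is published. So I cannot directly give an answer about GoodWe's 2023 revenue.)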
