# RAG with Bedrock and Kendra

## Pre-requisites

* You must configure your Amazon Kendra index before running this notebook. For instructions, see the documentation: https://docs.aws.amazon.com/kendra/latest/dg/create-index.html
* During the preview, your account must have access to Amazon Bedrock, or you must have a shared role or profile with access to it.
* During the preview, you must download the Bedrock SDK wheels (`boto3` and `botocore`) that the next cell installs.

For more information on Bedrock topics, check the user guide.

In [3]:
%pip install ../dependencies/boto3-1.26.165-py3-none-any.whl --quiet
%pip install ../dependencies/botocore-1.29.165-py3-none-any.whl --quiet
%pip install --upgrade langchain

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
Collecting langchain
  Obtaining dependency information for langchain from https://files.pythonhosted.org/packages/59/d0/074f7fbd7323623cca4175e0323c2cff565d5cf8c6b58f5dc81f046aa29f/langchain-0.0.240-py3-none-any.whl.metadata
  Using cached langchain-0.0.240-py3-none-any.whl.metadata (14 kB)
Using cached langchain-0.0.240-py3-none-any.whl (1.4 MB)
Installing collected packages: langchain
  Attempting uninstall: langchain
    Found existing installation: langchain 0.0.200
    Uninstalling langchain-0.0.200:
      Successfully uninstalled langchain-0.0.200
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
aws-langchain 0.0.1 requires langchain==0.0.137, but you have langchain 0.0.240 which is incompatible.
Suc

In [5]:
#### Uncomment the following lines to run from your local environment outside of the AWS account with Bedrock access

#import os
#os.environ['BEDROCK_ASSUME_ROLE'] = 'arn:aws:iam::191767470724:role/PowerUserRole'
#os.environ['AWS_PROFILE'] = 'bedrock-internal'

In [23]:
import boto3
import json
import os
import sys

module_path = ".."
sys.path.append(os.path.abspath(module_path))
from utils import bedrock, print_ww

os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'
#boto3_bedrock = bedrock.get_bedrock_client(os.environ.get('BEDROCK_ASSUME_ROLE', None))

# Note: this rebinds the name `bedrock`, shadowing the `bedrock` module imported from utils above
bedrock = boto3.client(
    service_name='bedrock',
    region_name='us-east-1',
    endpoint_url='https://bedrock.us-east-1.amazonaws.com'
)

In [26]:
# We will be using the Titan Embeddings Model to generate our Embeddings.
from langchain.embeddings import BedrockEmbeddings
from langchain.llms.bedrock import Bedrock

# - create the Anthropic Model
llm = Bedrock(model_id="anthropic.claude-v2", client=bedrock, model_kwargs={'max_tokens_to_sample':8000})
bedrock_embeddings = BedrockEmbeddings(client=bedrock)

In [58]:
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain.retrievers import AmazonKendraRetriever

kendra_index_id = '2c1575af-b7aa-44cb-ae01-454598936576'
region = 'us-east-1'

# boto3 is already imported above; create the Kendra client in the same region as the index
kendra_client = boto3.client('kendra', region_name=region)


prompt_template = """Human: Use the following pieces of context to provide a concise answer to the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.
{context}

Question: {question}
Assistant:"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=AmazonKendraRetriever(
        index_id=kendra_index_id,
        region_name=region,
        client=kendra_client,
        # Only return documents indexed with this language code; change 'es' to match your index
        attribute_filter={
            'EqualsTo': {
                'Key': '_language_code',
                'Value': {'StringValue': 'es'}
            }
        }
    ),
    return_source_documents=True,
    chain_type_kwargs={"prompt": PROMPT},
)
query = "How does Amazon Kendra learn?"
result = qa({"query": query})
print_ww(f"{result['result']}\n\nSource(s):\n{result['source_documents'][0].metadata['source']}")

 Based on the context, Amazon Kendra learns query suggestions based on queries that users search
for. Specifically:

- It continuously analyzes the queries in a customer's query log
- It identifies frequently searched queries and popular queries
- It suggests these popular queries to users as they start typing
- It requires a minimum number of unique users (default 3) to have searched a query before
suggesting it
- It evaluates how recently the queries were searched based on a "query log time window" set by the
customer
- It can either learn from all queries or only queries that include user information, depending on a
customer configuration

So in summary, Amazon Kendra learns by analyzing a customer's query logs and identifying frequently
searched queries to then suggest to users.

Source(s):
https://docs.aws.amazon.com/kendra/latest/APIReference/API_DescribeQuerySuggestionsConfig.html
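
The cell above prints only the first retrieved document's source, although the retriever usually returns several passages. A minimal sketch of listing every distinct source follows, assuming each returned document exposes a `metadata` dictionary with a `source` key, as the cell above does. The helper name `unique_sources` is hypothetical, and the `SimpleNamespace` objects only stand in for the retrieved documents.

```python
from types import SimpleNamespace


def unique_sources(source_documents):
    """Return the distinct 'source' values of the documents, in first-seen order."""
    seen = []
    for doc in source_documents:
        source = doc.metadata.get("source")
        if source and source not in seen:
            seen.append(source)
    return seen


if __name__ == "__main__":
    # Stand-ins for the documents in result['source_documents'].
    docs = [
        SimpleNamespace(metadata={"source": "https://example.com/a"}),
        SimpleNamespace(metadata={"source": "https://example.com/b"}),
        SimpleNamespace(metadata={"source": "https://example.com/a"}),
        SimpleNamespace(metadata={}),
    ]
    print("\n".join(unique_sources(docs)))
```

In the notebook this could be called as `unique_sources(result['source_documents'])`, which also avoids an `IndexError` when no documents come back.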


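With `chain_type="stuff"`, the chain places all retrieved passages into the single `{context}` slot of the prompt before calling the model. The sketch below imitates that step with plain string formatting so the final prompt can be inspected. It is an approximation, not LangChain's implementation: the function name `stuff_prompt`, the blank-line separator, and the sample passages are assumptions made for illustration.

```python
PROMPT_TEMPLATE = """Human: Use the following pieces of context to provide a concise answer to the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.
{context}

Question: {question}
Assistant:"""


def stuff_prompt(template, passages, question, separator="\n\n"):
    """Join all passages into one context string and fill the template."""
    context = separator.join(passages)
    return template.format(context=context, question=question)


if __name__ == "__main__":
    # Illustrative passages; in the notebook these come from the Kendra retriever.
    passages = [
        "Amazon Kendra suggests popular queries to users.",
        "Suggestions are learned from the query log.",
    ]
    print(stuff_prompt(PROMPT_TEMPLATE, passages, "How does Amazon Kendra learn?"))
```

Because every passage is sent in one request, the combined context has to fit within the model's input limit, which is the main constraint of this chain type.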
-------

## Creating a Streamlit UI for this RAG application

In [4]:
%%writefile app.py

import streamlit as st
from streamlit_chat import message
from langchain.llms.bedrock import Bedrock
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from typing import Dict
import json
from io import StringIO
from random import randint
import boto3


st.set_page_config(page_title="Retrieval Augmented Generation", page_icon=":robot:", layout="wide")
st.header("Document Insights Chatbot with Amazon Bedrock")

bedrock = boto3.client(
    service_name='bedrock',
    region_name='us-east-1',
    endpoint_url='https://bedrock.us-east-1.amazonaws.com'
)

# KENDRA ------------

from langchain.retrievers import AmazonKendraRetriever

kendra_index_id = '2c1575af-b7aa-44cb-ae01-454598936576'
region = 'us-east-1'

# create the Kendra client in the same region as the index
kendra_client = boto3.client('kendra', region_name=region)

# - create the Anthropic Model
llm = Bedrock(model_id="anthropic.claude-v2", client=bedrock, model_kwargs={'max_tokens_to_sample':8000})


# LANGCHAIN ------------

prompt_template = """
Human: Consider the context in the <context></context> XML tags and your own knowledge, to answer the question at the end.

<context>
{context}
</context>

Question: {question}
Assistant:
"""

prompt = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)


qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=AmazonKendraRetriever(
        index_id=kendra_index_id,
        region_name=region,
        client=kendra_client,
        #attribute_filter={
        #    'EqualsTo': {
        #        'Key': '_language_code',
        #        'Value': {'StringValue': 'es'}
        #    }
        #}
    ),
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt},
)



#@st.cache_resource
#def load_chain(_prompt):
chatchain = RetrievalQA.from_chain_type(
        llm = Bedrock(
            #model_id ='anthropic.claude-instant-v1'
            model_id='anthropic.claude-v1',
            client=bedrock,
            model_kwargs={
                'max_tokens_to_sample':8000,
                'temperature':0,
                'top_p':0.9,
                'stop_sequences': ["Human"]
            }
        ),
        chain_type="stuff",
        retriever=AmazonKendraRetriever(
            index_id=kendra_index_id,
            region_name=region,
            client=kendra_client,
            #attribute_filter={
            #    'EqualsTo': {
            #        'Key': '_language_code',
            #        'Value': {'StringValue': 'es'}
            #    }
            #}
        ),
        return_source_documents=True,
        chain_type_kwargs={"prompt": prompt}
    )
memory = ConversationBufferMemory()
chain = ConversationChain(llm=llm, memory=memory)
#    return chain

#chatchain = load_chain(prompt)

# initialise session variables
if 'generated' not in st.session_state:
    st.session_state['generated'] = []
if 'past' not in st.session_state:
    st.session_state['past'] = []
if 'widget_key' not in st.session_state:
    st.session_state['widget_key'] = str(randint(1000, 100000000))

# Sidebar - shows the AWS logo
#st.sidebar.title("Sidebar")
st.sidebar.image('./images/AWS_logo_RGB.png', width=150)

st.markdown(
    f'''
        <style>
            .sidebar .sidebar-content {{
                width: 150px;
            }}
        </style>
    ''',
    unsafe_allow_html=True
)

# this is the container that displays the past conversation
response_container = st.container()
# this is the container with the input text box
container = st.container()

with container:
    # define the input text box
    with st.form(key='my_form', clear_on_submit=True):
        user_input = st.text_area("You:", key='input', height=50)
        submit_button = st.form_submit_button(label='Send')

    # when the submit button is pressed we send the user query to the chatchain object and save the chat history
    if submit_button and user_input:
        output = chatchain({"query": user_input})
        # collect the distinct sources of the retrieved documents, in order
        sources = []
        for doc in output["source_documents"]:
            source = doc.metadata.get("source")
            if source and source not in sources:
                sources.append(source)
        st.session_state['past'].append(user_input)
        if sources:
            st.session_state['generated'].append(
                output["result"] + "\n\nSource document(s):\n" + "\n".join(sources)
            )
        else:
            st.session_state['generated'].append(output["result"])

# this loop is responsible for displaying the chat history
if st.session_state['generated']:
    with response_container:
        for i in range(len(st.session_state['generated'])):
            message(st.session_state["past"][i], is_user=True, key=str(i) + '_user', avatar_style="adventurer", seed=120)
            message(st.session_state["generated"][i], key=str(i), avatar_style="bottts", seed=123)


Overwriting app.py
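
The commented-out `attribute_filter` blocks in `app.py` use the same dictionary shape as the filter in the notebook cell earlier. A minimal sketch of building such filters follows, assuming Kendra's `AttributeFilter` structure with `EqualsTo` and `AndAllFilters` keys. The helper names and the custom `department` attribute are hypothetical and only illustrate how two conditions could be combined.

```python
def equals_filter(key, value):
    """Build an EqualsTo filter on a string-valued document attribute."""
    return {"EqualsTo": {"Key": key, "Value": {"StringValue": value}}}


def language_filter(language_code):
    """Restrict results to documents indexed with the given language code."""
    return equals_filter("_language_code", language_code)


def all_of(*filters):
    """Combine filters so that a document must satisfy every one of them."""
    return {"AndAllFilters": list(filters)}


if __name__ == "__main__":
    # 'department' is a hypothetical custom attribute, not one defined in this notebook.
    combined = all_of(language_filter("es"), equals_filter("department", "legal"))
    print(combined)
```

The resulting dictionary could be passed as `attribute_filter=` to `AmazonKendraRetriever`, in the same place as the commented-out block.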
