# Build a Q&A application with Bedrock, Langchain and FAISS index

This notebook explains the steps required to build a Question & Answer application using the Retrieval Augmented Generation (RAG) architecture.
RAG combines the power of pre-trained LLMs with information retrieval, enabling more accurate and context-aware responses.
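
To make the retrieve-then-generate idea concrete, here is a minimal, dependency-free sketch. It assumes a toy word-count "embedding" and made-up passages; the function names (`bag_of_words`, `retrieve`, `build_prompt`) are hypothetical and are not part of LangChain or Bedrock. The rest of this notebook replaces each piece with a real embeddings model, a vector store, and an LLM.

```python
import re
from collections import Counter
from math import sqrt


def bag_of_words(text):
    """Toy stand-in for an embeddings model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(passages, key=lambda p: cosine(q, bag_of_words(p)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_passages):
    """Place the retrieved passages in front of the question."""
    context = "\n\n".join(context_passages)
    return f"Use the context below to answer the question.\n\n{context}\n\nQuestion: {question}"


# Illustrative passages, not taken from any real document
passages = [
    "Parental controls let you block purchases with a PIN.",
    "Connect the device to a wireless network from the settings menu.",
    "The remote pairs over Bluetooth.",
]
question = "How do I set up parental controls?"
top = retrieve(question, passages, k=1)
prompt = build_prompt(question, top)
print(prompt)  # this prompt would then be sent to the LLM
```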

(This notebook was tested on a SageMaker Studio ml.m5.2xlarge instance with the Data Science 3.0 kernel.)

## Solution architecture

The diagram below provides an overview of the solution architecture. Key parts:

* Leverage Amazon Kendra or OpenSearch as the knowledge store, or use a combination of an embeddings model and an index store
* A range of options for embeddings, including Bedrock Titan, Cohere, Hugging Face, etc.
* FAISS, Pinecone, Chroma, etc. as vector stores
* An LLM hosted on a SageMaker JumpStart endpoint or managed by Bedrock (Anthropic Claude, AI21 Jumbo Instruct, or Titan)

<br>
<img src ="images/rag-architecture-options.png" width="600"/>

## Pre-requisites
NOTE: Installing all dependencies takes around 40-50 minutes. The time varies with the instance type used for this notebook.

In [None]:
!pip install faiss-cpu
!pip install langchain --upgrade
!pip install pypdf

In [None]:
!pip install sentence_transformers

In [None]:
!pip install sagemaker --upgrade

In [None]:
!pip install boto3 --upgrade

## Restart Kernel

In [None]:
# Restart the kernel after the installs so the upgraded packages are picked up
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)  

## Set up dependencies

In [2]:
# Check that the Python version is 3.8 or later, which LangChain requires
import sys
sys.version

'3.10.6 (main, Oct  7 2022, 20:19:58) [GCC 11.2.0]'

In [3]:
assert sys.version_info >= (3, 8)

In [4]:
import langchain

In [5]:
langchain.__version__

'0.0.314'

In [6]:
import os, json
from tqdm import tqdm
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter,CharacterTextSplitter,NLTKTextSplitter
import pathlib 

## Perform document pre-processing
Load the documents, split them into chunks, and clean up the text before generating embeddings.

In [7]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=100,
    #separators=["\n\n", "\n", ".", "!", "?", " ", ",", ""],
    length_function=len,
    keep_separator=False,
    add_start_index=False
)
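
To illustrate what `chunk_size` and `chunk_overlap` control, here is a simplified sliding-window splitter. This is only a sketch: `RecursiveCharacterTextSplitter` splits on separators and merges the pieces, so its chunk boundaries differ from this fixed-stride version. The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
for c in chunks:
    print(c)
```

Each chunk starts with the last `chunk_overlap` characters of the previous one, which helps keep a sentence that straddles a boundary retrievable from either chunk.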


In [8]:
# Put your directory containing PDFs here
index_name = 'firetv'
directory = f'pdfs/{index_name}'

In [9]:
pdf_documents = [os.path.join(directory, filename) for filename in os.listdir(directory)]
pdf_documents

['pdfs/firetv/Amazon_Fire_TV_User_Guide.pdf']

In [10]:
langchain_documents = []
for document in pdf_documents:
    loader = PyPDFLoader(document)
    data = loader.load()
    langchain_documents.extend(data)


In [11]:
print("loaded document pages: ", len(langchain_documents))
print("Splitting all documents")
split_docs = text_splitter.split_documents(langchain_documents)
print("Num split pages: ", len(split_docs))

loaded document pages:  90
Splitting all documents
Num split pages:  189


In [12]:
split_docs[0].page_content

'Amazon Fire TV User Guide'

In [13]:
import re

for d in split_docs:
    text = d.page_content
    text = re.sub(r"(\w+)-\n(\w+)", r"\1\2", text)  # Rejoin words hyphenated across a line break
    text = re.sub(r"(?<!\n\s)\n(?!\s\n)", " ", text.strip())  # Replace single newlines with spaces
    text = re.sub(r"\n\s*\n", "\n\n", text)  # Normalize blank lines to one paragraph break
    text = re.sub(r"[/X]", "", text)  # Strips every '/' and uppercase 'X'; narrow this pattern if your text needs them
    text = re.sub(r"(\\u[0-9A-Fa-f]+)", " ", text)  # Replace literal \uXXXX escape sequences with a space
    d.page_content = text
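
To see what the first three substitutions do, the sketch below applies them to a small made-up string using the standard-library `re` module. The helper name `clean_page_text` and the sample text are hypothetical; the patterns are the ones used in the cell above.

```python
import re


def clean_page_text(text):
    """Apply the de-hyphenation and newline clean-up steps to one chunk of text."""
    text = re.sub(r"(\w+)-\n(\w+)", r"\1\2", text)            # "Paren-\ntal" -> "Parental"
    text = re.sub(r"(?<!\n\s)\n(?!\s\n)", " ", text.strip())  # lone newline -> space
    text = re.sub(r"\n\s*\n", "\n\n", text)                   # "\n \n" -> "\n\n"
    return text


# Made-up sample with a hyphenated word, a hard line wrap, and a paragraph break
sample = "Select Paren-\ntal Controls\nfrom Settings.\n \nEnter your PIN."
cleaned = clean_page_text(sample)
print(repr(cleaned))
```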

## Generate Embeddings
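Use an embeddings model to generate embeddings for the cleaned-up document chunks. Run only one of the two options below: each one assigns `emb`, so the cell that runs last takes effect.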

### Option 1 - Bedrock Titan Embeddings

In [14]:
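import boto3
import sagemaker
from langchain.embeddings import BedrockEmbeddings

session = boto3.Session()
sagemaker_session = sagemaker.Session()
studio_region = sagemaker_session.boto_region_name
bedrock = session.client("bedrock-runtime", region_name=studio_region)

# Reuse the Bedrock runtime client created above instead of hard-coding a region.
# "amazon.titan-e1t-medium" appears to be the Titan embeddings model ID from the Bedrock preview;
# if it is not available in your account, try "amazon.titan-embed-text-v1".
emb = BedrockEmbeddings(client=bedrock, model_id="amazon.titan-e1t-medium")
emb.model_kwargs = {}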

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml


### Option 2 - Hugging Face Embeddings (requires sentence_transformers)

In [15]:
from langchain.embeddings import HuggingFaceEmbeddings
emb = HuggingFaceEmbeddings()

## Set up a local vector store - FAISS

In [16]:
from langchain.vectorstores import FAISS

In [17]:
print("Embed and create vector index")
db = FAISS.from_documents(split_docs, embedding=emb)

Embed and create vector index
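
Conceptually, the index stores one vector per chunk and answers a query by returning the nearest vectors. The sketch below shows an exhaustive squared-L2 search in plain Python. It assumes a flat (brute-force) L2 index, which is, to the best of our knowledge, what LangChain's FAISS wrapper builds by default. The 2-dimensional vectors and the function name `search` are made up for illustration; real embeddings have hundreds of dimensions.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(index_vectors, query_vector, k):
    """Exhaustive nearest-neighbour search: score every stored vector, keep the k closest."""
    scored = sorted(
        (squared_l2(vec, query_vector), i) for i, vec in enumerate(index_vectors)
    )
    return scored[:k]  # list of (distance, position in the index)


# Made-up 2-D "embeddings", one per document chunk
index_vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
results = search(index_vectors, query_vector=(0.9, 0.1), k=2)
nearest_ids = [i for _, i in results]
print(nearest_ids)
```

The returned positions map back to the stored chunks, which is how a similarity search turns a query embedding into a list of documents.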


### Save the index locally as a file

In [18]:
index_path = 'faiss_indices'

In [19]:
print('Save the index created locally')
pathlib.Path(index_path).mkdir(parents=True, exist_ok=True)
db.save_local(folder_path=index_path, index_name= index_name)

Save the index created locally


### Load from local file cache

In [20]:
%%time
# Check that loading the saved index from disk works
db_local = FAISS.load_local(folder_path=index_path, embeddings=emb, index_name=index_name)

CPU times: user 2.96 ms, sys: 0 ns, total: 2.96 ms
Wall time: 18.8 ms


### Perform a similarity search and get top 3 matching docs

In [21]:
query = "How to setup Parental controls?"
docs = db_local.similarity_search(query, k=3)
docs

[Document(page_content='Set Up Parental Controls With Parental Controls, you can block purchases and restrict access to Amazon movies, TV shows, games, apps, photos, and more. Note: Parental controls do not restrict content in third-party applications. Parental controls for third-party applications are determined by the app provider. When entering the PIN, you will need to use an Amazon remote or the Fire TV Remote App. You cannot use a third-party remote. 1. From the Home  screen, select Settings , and then select Parental Controls . 2. Using your remote, press the Select  button to turn parental controls On or Off. 3. Enter your Parental Controls PIN, and then select Next . Your PIN is the same PIN you use for other Amazon services such as Amazon Instant Video.', metadata={'source': 'pdfs/firetv/Amazon_Fire_TV_User_Guide.pdf', 'page': 15}),
 Document(page_content="Set time limits1. Select a child's profile. 2. In the pop-out menu, select Time Limits . 3. To set time limits, you need 

## Access the LLM with context from the vector store

In [22]:
from langchain.llms.bedrock import Bedrock

# Create the Anthropic Claude LLM; max_tokens_to_sample=200 caps the response length, so long answers are cut off
model_args= {'max_tokens_to_sample':200,'temperature':0}
llm = Bedrock(model_id="anthropic.claude-v1", client=bedrock, model_kwargs=model_args)

### Method 1 - Query with a chain

In [24]:
from langchain.chains.question_answering import load_qa_chain
chain = load_qa_chain(llm, chain_type="stuff")
documents = db_local.similarity_search(query=query, k=5)
print(chain.run(input_documents=documents, question=query))

 Here are the steps to set up Parental Controls on an Amazon Fire TV device:

1. From the Home screen, select Settings, and then select Parental Controls. 

2. Using your remote, press the Select button to turn parental controls On.

3. Enter your Parental Controls PIN, and then select Next. Your PIN is the same PIN you use for other Amazon services such as Amazon Instant Video.

4. Select PIN Protect Purchases to change the button to ON. With Parental Controls enabled, all purchases on your Amazon Fire TV device will require a PIN.

5. To set time limits, select a child's profile. In the pop-out menu, select Time Limits. 

6. To set time limits, select Turn On Time Limits. 

7. Set the Screen Time, which is how long your child can watch videos and play with apps each day.

8
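
The "stuff" chain type places all retrieved chunks into a single prompt. The sketch below illustrates that idea only: the prompt wording, the function name `stuff_prompt`, and the `max_chars` budget are hypothetical and are not LangChain's implementation. The budget check is there to show why this approach stops working once the retrieved text no longer fits in the model's context window.

```python
def stuff_prompt(chunks, question, max_chars=4000):
    """Join all chunks into one context block and append the question."""
    context = "\n\n".join(chunks)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    if len(prompt) > max_chars:
        # A real model has a token limit; a character budget stands in for it here
        raise ValueError("retrieved context does not fit in a single prompt")
    return prompt


# Made-up chunks standing in for similarity-search results
chunks = [
    "Select Settings, then Parental Controls.",
    "Enter your Parental Controls PIN.",
]
prompt = stuff_prompt(chunks, "How to setup Parental controls?")
print(prompt)
```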


### Method 2 - Query with a prompt template (allows prompt customization)

In [25]:
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

prompt_template = """Human: Use the following pieces of context to provide a concise answer to the question at the end. 

{context}

Question: {question}
Assistant:"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)
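
The template is an ordinary format string with two placeholders. Before wiring a custom template into a chain, it can help to confirm that the placeholders match the declared `input_variables`. The sketch below does this with the standard library only; the helper name `placeholders` and the filler values are hypothetical.

```python
from string import Formatter

prompt_template = """Human: Use the following pieces of context to provide a concise answer to the question at the end.

{context}

Question: {question}
Assistant:"""


def placeholders(template):
    """Return the sorted names of all {placeholders} found in a format string."""
    return sorted({name for _, name, _, _ in Formatter().parse(template) if name})


fields = placeholders(prompt_template)
print(fields)

# Filling the template by hand shows the text the LLM would receive
filled = prompt_template.format(
    context="Select Settings, then Parental Controls.",
    question="How to setup Parental controls?",
)
print(filled)
```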

In [26]:
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db_local.as_retriever(
        search_type="similarity", search_kwargs={"k": 3}
    ),
    return_source_documents=True,
    chain_type_kwargs={"prompt": PROMPT}
)

response = qa({'query':query})
print(response['result'])

 Here are the steps to set up Parental Controls on an Amazon Fire TV device:

1. From the Home screen, select Settings, and then select Parental Controls. 

2. Using your remote, press the Select button to turn parental controls On.

3. Enter your Parental Controls PIN, and then select Next. Your PIN is the same PIN you use for other Amazon services.

4. Select a child's profile. 

5. In the pop-out menu, select Time Limits.

6. To set time limits, select Turn On Time Limits.

7. Set the Screen Time, which is how long your child can watch videos and play with apps each day.

8. Set the Bedtime, which is the time after which your child cannot use their profile.

9. Select Done to save the changes.

10. To add content to your child's library, select


In [27]:
response['source_documents']

[Document(page_content='Set Up Parental Controls With Parental Controls, you can block purchases and restrict access to Amazon movies, TV shows, games, apps, photos, and more. Note: Parental controls do not restrict content in third-party applications. Parental controls for third-party applications are determined by the app provider. When entering the PIN, you will need to use an Amazon remote or the Fire TV Remote App. You cannot use a third-party remote. 1. From the Home  screen, select Settings , and then select Parental Controls . 2. Using your remote, press the Select  button to turn parental controls On or Off. 3. Enter your Parental Controls PIN, and then select Next . Your PIN is the same PIN you use for other Amazon services such as Amazon Instant Video.', metadata={'source': 'pdfs/firetv/Amazon_Fire_TV_User_Guide.pdf', 'page': 15}),
 Document(page_content="Set time limits1. Select a child's profile. 2. In the pop-out menu, select Time Limits . 3. To set time limits, you need 

## Implement the RAG architecture with a Kendra index

In [None]:
kendra_index = ""  # Provide your Kendra index ID here

In [None]:
from langchain.schema.document import Document

kendra = boto3.client('kendra')
response = kendra.retrieve(IndexId=kendra_index,QueryText=query)
docs = [Document(page_content = r['Content']) for r in response['ResultItems']]
docs

In [None]:
chain = load_qa_chain(llm, chain_type="stuff")
print(chain.run(input_documents=docs, question=query))
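
The cell above keeps only the passage text from each Kendra result. If you also want to show where an answer came from, the result items can be mapped to documents that carry a title and source URI. The sketch below works on a mocked response dictionary so it runs without AWS access. The `Doc` dataclass is a stand-in for LangChain's `Document`, the mock values are invented, and the `DocumentTitle` and `DocumentURI` fields are assumed to be present in the `Retrieve` response.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Minimal stand-in for langchain.schema.document.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def to_docs(response):
    """Convert a Kendra Retrieve-style response into documents, skipping empty passages."""
    docs = []
    for item in response.get("ResultItems", []):
        content = item.get("Content", "")
        if not content.strip():
            continue  # nothing useful to pass to the LLM
        docs.append(
            Doc(
                page_content=content,
                metadata={
                    "title": item.get("DocumentTitle"),
                    "source": item.get("DocumentURI"),
                },
            )
        )
    return docs


# Mocked response with invented values, shaped like the fields used above
mock_response = {
    "ResultItems": [
        {
            "Content": "From the Home screen, select Settings, and then select Parental Controls.",
            "DocumentTitle": "Example User Guide",
            "DocumentURI": "s3://example-bucket/guide.pdf",
        },
        {"Content": "   ", "DocumentTitle": "Empty passage"},
    ]
}
kendra_docs = to_docs(mock_response)
print(kendra_docs)
```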