## RAG-Based Q&A for FNMA Selling Guide 
 
### Advanced RAG Approaches: Retrieval Enhancement -- Parent-Child Chunks Retriever

> *This notebook should work well with the **`Amazon Bedrock and LangChain framework`** kernel in SageMaker Studio*

### Retrieval Enhancement -- Parent-Child Chunks Retriever
The idea is to search over small chunks, which match a query more precisely, and then hand the LLM the larger parent chunk that surrounds each match so it has enough context to reason over.
<img src="./images/RAG_parent_Child.jpg" width="800" height="600">
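
The idea can be sketched in plain Python without any retrieval library. This is a minimal illustration, not how LangChain implements it: paragraphs stand in for parent chunks, sentences for child chunks, and a simple word-overlap count stands in for embedding similarity. The sample text and all function names are made up for the example.

```python
import re

def words(text: str) -> set[str]:
    """Lowercased set of word tokens (stand-in for an embedding)."""
    return set(re.findall(r"\w+", text.lower()))

def build_index(document: str):
    """Split into parents (paragraphs) and children (sentences tagged with a parent id)."""
    parents = [p.strip() for p in document.split("\n\n") if p.strip()]
    children = []
    for parent_id, parent in enumerate(parents):
        for sentence in re.split(r"(?<=[.!?])\s+", parent):
            if sentence.strip():
                children.append((sentence.strip(), parent_id))
    return parents, children

def retrieve_parent(query: str, parents, children) -> str:
    """Score the small child chunks, then return the parent of the best match."""
    q = words(query)
    _best_child, parent_id = max(children, key=lambda pair: len(q & words(pair[0])))
    return parents[parent_id]

# Made-up sample text, not taken from the Selling Guide.
document = (
    "The library opens at nine. Members may borrow five books.\n\n"
    "The cafe sells coffee and tea. Payment is by card only."
)
parents, children = build_index(document)
result = retrieve_parent("How many books can members borrow?", parents, children)
print(result)
```

The match is found on the single sentence about borrowing, but the whole paragraph is returned, which is the behavior the parent-child retriever aims for.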
### Similar Approaches -- Sentence Window Retriever
Sentence window retrieval expands the context by adding the sentences that surround the small retrieved chunk.
<img src="./images/sentence_window_retrieval.jpg" width="800" height="600">
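
A rough sketch of the sentence window idea, again in plain Python: find the best-matching sentence, then return it together with `window` sentences on each side. Word overlap is used here as an assumed stand-in for vector similarity, and the sample text and function names are invented for illustration.

```python
import re

def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def sentence_window(query: str, text: str, window: int = 1) -> str:
    """Return the best-matching sentence plus `window` neighbours on each side."""
    sentences = split_sentences(text)
    q = set(re.findall(r"\w+", query.lower()))
    scores = [len(q & set(re.findall(r"\w+", s.lower()))) for s in sentences]
    best = scores.index(max(scores))
    start = max(0, best - window)
    end = min(len(sentences), best + window + 1)
    return " ".join(sentences[start:end])

# Made-up sample text for illustration only.
text = (
    "The shop opens at nine. It closes at six. Parking is free on weekends. "
    "Dogs are welcome inside. Cash is not accepted."
)
expanded = sentence_window("Is parking free?", text, window=1)
print(expanded)
```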

## Use Case
#### Purpose
To answer questions about the Selling Guide using an LLM in a RAG architecture.

The model answers from the retrieved documents in plain language.

#### Dataset
Fannie Mae Selling Guide (PDF document)



## Implementation
To follow the RAG approach, this notebook uses the LangChain framework, which integrates with many services and tools and makes it straightforward to build patterns such as RAG. We will use the following tools:

- **LLM (Large Language Model)**: Anthropic Claude v2 available through Amazon Bedrock

  This model will be used to understand the document chunks and provide an answer in a human-friendly manner.
- **Embeddings Model**: Amazon Titan Embeddings available through Amazon Bedrock

  This model will be used to generate a numerical representation of the textual documents
- **Document Loader**: PDF Loader available through LangChain

  This loader reads documents from a source. For this notebook we load the sample files from a local path; this could easily be replaced with a loader that reads documents from enterprise internal systems.

- **Vector Store**: Chroma available through LangChain

  In this notebook we use an in-memory Chroma collection to index the embeddings of the child chunks, and a LangChain `InMemoryStore` to hold the parent documents. In an enterprise context these could be replaced with a persistent store such as Amazon OpenSearch Service, RDS Postgres with pgvector, Pinecone or Weaviate.
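
To make "numerical representation" concrete, here is a toy sketch of how a vector store compares a query with stored chunks. It is an illustration under simplifying assumptions: `toy_embed` just counts words from a tiny fixed vocabulary, whereas a real embeddings model produces dense vectors. The comparison step, cosine similarity, is the standard measure. All names are hypothetical.

```python
import math
import re
from collections import Counter

def toy_embed(text: str, vocabulary: list[str]) -> list[float]:
    """Count how often each vocabulary word occurs (stand-in for a real embedding)."""
    counts = Counter(re.findall(r"\w+", text.lower()))
    return [float(counts[word]) for word in vocabulary]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

vocabulary = ["flood", "insurance", "rental", "income"]
query_vec = toy_embed("flood insurance", vocabulary)
chunk_vecs = {
    "chunk_a": toy_embed("flood insurance policy", vocabulary),
    "chunk_b": toy_embed("rental income history", vocabulary),
}

# Rank stored chunks by similarity to the query, most similar first.
ranked = sorted(
    chunk_vecs,
    key=lambda name: cosine_similarity(query_vec, chunk_vecs[name]),
    reverse=True,
)
print(ranked)
```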


In [2]:
# !pip install chromadb

In [2]:
import warnings
warnings.filterwarnings('ignore')

In [3]:
import json
import os
import sys

import boto3

module_path = ".."
sys.path.append(os.path.abspath(module_path))
from utils import bedrock, print_ww


# ---- ⚠️ Un-comment and edit the below lines as needed for your AWS setup ⚠️ ----

# os.environ["AWS_DEFAULT_REGION"] = "<REGION_NAME>"  # E.g. "us-east-1"
# os.environ["AWS_PROFILE"] = "<YOUR_PROFILE>"
# os.environ["BEDROCK_ASSUME_ROLE"] = "<YOUR_ROLE_ARN>"  # E.g. "arn:aws:..."

boto3_bedrock = bedrock.get_bedrock_client(
    assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
    region=os.environ.get("AWS_DEFAULT_REGION", None)
)

Create new client
  Using region: us-east-1
boto3 Bedrock client successfully created!
bedrock-runtime(https://bedrock-runtime.us-east-1.amazonaws.com)


## Configure LangChain

We begin by instantiating the LLM and the embeddings model. Here we use Anthropic Claude for text generation and Amazon Titan for text embedding.

Note: You can choose other models available in Bedrock by changing the `model_id`, for example:

`llm = Bedrock(model_id="amazon.titan-text-express-v1")`

Check the Amazon Bedrock documentation for the IDs of the available text generation and embedding models.
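
When swapping `model_id`, the inference parameters usually have to change too, because each model family uses its own parameter names. The helper below is a hypothetical convenience, not part of LangChain or Bedrock. The Claude and Llama 2 names are the ones used in this notebook's code. The Titan names are an assumption based on the Titan text request format and should be checked against the current Bedrock documentation.

```python
def model_kwargs_for(model_id: str, max_tokens: int = 1024, temperature: float = 0.1) -> dict:
    """Return inference parameters keyed by the names each provider expects.

    Hypothetical helper: the provider is read from the prefix of the model id.
    """
    provider = model_id.split(".")[0]
    if provider == "anthropic":
        return {"max_tokens_to_sample": max_tokens, "temperature": temperature, "top_p": 0.5}
    if provider == "meta":
        return {"max_gen_len": max_tokens, "temperature": temperature}
    if provider == "amazon":
        # Assumed Titan text parameter names; verify before relying on them.
        return {"maxTokenCount": max_tokens, "temperature": temperature}
    raise ValueError(f"No parameter mapping defined for provider '{provider}'")

print(model_kwargs_for("anthropic.claude-v2"))
print(model_kwargs_for("meta.llama2-13b-chat-v1"))
```

The returned dictionary could then be passed as `model_kwargs` when constructing the `Bedrock` LLM.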





In [4]:
# We will be using the Titan Embeddings Model to generate our Embeddings.
from langchain.embeddings import BedrockEmbeddings
from langchain.llms.bedrock import Bedrock

# Create the Anthropic Claude model
llm = Bedrock(
    model_id="anthropic.claude-v2",
    client=boto3_bedrock,
    model_kwargs={"max_tokens_to_sample": 1024, "temperature": 0.1, "top_p": 0.5},
)
# llm = Bedrock(model_id="meta.llama2-13b-chat-v1", client=boto3_bedrock, model_kwargs={'temperature':0.1, 'max_gen_len':1024})

bedrock_embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1", client=boto3_bedrock)

## Data Preparation
Let's first transform the files to build the document store and vector index. For this example we use the public FNMA Selling Guide documents.

We load the PDFs with the [DirectoryLoader from PyPDF available under LangChain](https://python.langchain.com/en/latest/reference/modules/document_loaders.html) and split them into smaller chunks. In the cell below the loader calls are commented out and the documents are instead read from a previously pickled copy (`./data/loaded_document.pkl`).

Note: The retrieved text should be large enough to contain the information needed to answer a question, but small enough to fit into the LLM prompt. The embeddings model also limits its input to 8192 tokens, which roughly translates to ~32,000 characters. For this use case we create parent chunks of roughly 1,250 characters with an overlap of 125 characters, and child chunks of roughly 600 characters with an overlap of 60 characters, using [RecursiveCharacterTextSplitter](https://python.langchain.com/en/latest/modules/indexes/text_splitters/examples/recursive_text_splitter.html).
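
To show what chunk size and overlap mean in practice, here is a simplified sliding-window splitter. It is only a sketch: unlike `RecursiveCharacterTextSplitter`, it cuts at fixed character offsets and makes no attempt to break on paragraph or sentence boundaries. The function name is hypothetical.

```python
def chunk_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

# Small example that is easy to check by eye.
print(chunk_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))

# Same size and overlap as the parent chunks, on 3,000 characters of filler text.
filler = "".join(str(i % 10) for i in range(3000))
parent_chunks = chunk_with_overlap(filler, chunk_size=1250, chunk_overlap=125)
print([len(c) for c in parent_chunks])
```

The overlap means a sentence that straddles a chunk boundary still appears complete in at least one chunk, at the cost of some duplicated text in the index.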

In [5]:
import numpy as np
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
from langchain.document_loaders import PyPDFLoader, PyPDFDirectoryLoader

# loader = PyPDFDirectoryLoader("./data/")

# documents = loader.load()

import pickle
with open('./data/loaded_document.pkl', 'rb') as file:
    documents = pickle.load(file)


In [6]:
# This text splitter is used to create the parent documents
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=1250,chunk_overlap=125)
# This text splitter is used to create the child documents
# It should create documents smaller than the parent
child_splitter = RecursiveCharacterTextSplitter(chunk_size=600,chunk_overlap=60)

In [7]:
from langchain.chains.question_answering import load_qa_chain
from langchain_community.vectorstores import Chroma
from langchain.vectorstores import FAISS
from langchain.indexes import VectorstoreIndexCreator
from langchain.indexes.vectorstore import VectorStoreIndexWrapper
from langchain.storage import InMemoryStore

from langchain.retrievers import ParentDocumentRetriever


if 'vectorstore' in globals():  # If you've already built the vector store, delete it so you start fresh
    vectorstore.delete_collection()

# The vectorstore to use to index the child chunks
vectorstore = Chroma(
    collection_name="split_parents", embedding_function=bedrock_embeddings
)
# The storage layer for the parent documents
store = InMemoryStore()

retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=store,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter, 
    search_kwargs={"k": 5}
)

retriever.add_documents(documents)

## Question Answering

Now that we have our vector store in place, we can start asking questions.

In [8]:
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA
prompt_template = """

Human: Use the following pieces of context to provide a concise answer to the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.
<context>
{context}
</context>

Question: {question}

Answer:"""

PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

In [9]:
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": PROMPT}
)
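
With `chain_type="stuff"`, the retrieved chunks are concatenated and placed into the prompt's `{context}` slot in a single LLM call. The sketch below imitates that step with plain string formatting so the final prompt can be inspected. It is an approximation for illustration: the function name, the shortened template, the sample chunks, and the blank-line separator are assumptions here.

```python
EXAMPLE_TEMPLATE = """Use the following pieces of context to answer the question at the end.
<context>
{context}
</context>

Question: {question}

Answer:"""

def stuff_prompt(template: str, chunks: list[str], question: str, separator: str = "\n\n") -> str:
    """Join all retrieved chunks and substitute them into the template."""
    context = separator.join(chunks)
    return template.format(context=context, question=question)

# Placeholder chunks standing in for retrieved parent documents.
example_chunks = ["First retrieved parent chunk.", "Second retrieved parent chunk."]
final_prompt = stuff_prompt(EXAMPLE_TEMPLATE, example_chunks, "What does the context say?")
print(final_prompt)
```

Because every retrieved parent chunk ends up in one prompt, the number of results (`k`) multiplied by the parent chunk size has to stay within the model's context window.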


In [10]:
query = "What are acceptable flood insurance policies for the lender?"

In [11]:
answer = qa({"query": query})
print_ww(answer['query'],'\n',answer['result'])

What are acceptable flood insurance policies for the lender?
  Based on the context provided, acceptable flood insurance policies for the lender are:

- A standard policy issued under the National Flood Insurance Program (NFIP).

- A policy issued by a private insurer, provided the terms and amount of coverage are at least equal
to that provided under an NFIP policy based on a review of the full policy issued by a private
insurer, and the insurer meets Fannie Mae's rating requirements as specified in Property Insurer
Rating Requirements in B7-3-01, General Property Insurance Requirements for All Property Types.
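
Because the chain was created with `return_source_documents=True`, the result also carries the retrieved parent documents under the `source_documents` key. The helper below is a hypothetical way to list them. It assumes each document exposes `page_content` and `metadata`, as LangChain documents do, and that the metadata has `source` and `page` keys, as PDF loaders typically set. `FakeDocument` is only a stand-in so the sketch runs on its own.

```python
from dataclasses import dataclass, field

@dataclass
class FakeDocument:
    """Stand-in exposing the two attributes a LangChain document provides."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def summarize_sources(answer: dict, preview_chars: int = 80) -> list[str]:
    """Build one line per source document: index, file, page and a short preview."""
    lines = []
    for i, doc in enumerate(answer.get("source_documents", []), start=1):
        source = doc.metadata.get("source", "unknown")
        page = doc.metadata.get("page", "?")
        preview = " ".join(doc.page_content.split())[:preview_chars]
        lines.append(f"[{i}] {source} (page {page}): {preview}")
    return lines

# Made-up result dictionary shaped like the chain output.
fake_answer = {
    "query": "example question",
    "result": "example answer",
    "source_documents": [
        FakeDocument("Some   retrieved\ntext", {"source": "guide.pdf", "page": 3}),
        FakeDocument("Another chunk"),
    ],
}
for line in summarize_sources(fake_answer):
    print(line)
```

In the notebook, `summarize_sources(answer)` could be called on the dictionary returned by `qa(...)` to check which passages an answer was grounded on.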


Let's ask a different question:

In [12]:
query_2 = 'When can rental income be used to qualify?'

In [13]:
answer_2 = qa({"query": query_2})
print_ww(answer_2['query'],'\n',answer_2['result'])

When can rental income be used to qualify?
  Based on the context provided, rental income can be used to qualify in the following instances:

- If the borrower currently owns a principal residence (or has a current housing expense) and has at
least a one-year history of receiving rental income or at least one year of documented property
management experience, there are no restrictions on using rental income from the subject or non-
subject property.

- If the borrower does not currently have a housing expense and has at least one year of receiving
rental income or documented property management experience, rental income from non-subject
properties can be used.

- Rental income from the subject property cannot be used if the borrower does not own a principal
residence and does not have a current housing expense.

- Rental income from a non-subject property that is newly acquired or newly placed in service less
than a year ago cannot be used.

So in summary, rental income can be used wit

In [14]:
query_3 = "Can part-time income be used to qualify?"

In [15]:
answer_3 = qa({"query": query_3})
print_ww(answer_3['query'],'\n',answer_3['result'])

Can part-time income be used to qualify?
  Based on the context provided, part-time income can be used to qualify as long as it meets the
requirements outlined. The key points are:

- Income must be verified in accordance with Section B3-3.1, Employment and Other Sources of Income.
This includes obtaining documentation on the amount and duration of the income.

- If the borrower will return to work full-time by the first payment date, their regular employment
income can be used to qualify.

- If they will not return to work full-time by the first payment date, the lender must use the
lesser of the temporary leave income or the regular employment income to qualify.

So in summary, yes part-time income can be used as long as it is properly documented and meets the
requirements around using the lesser of the temporary or regular income if the borrower is not
returning to full-time work by the first payment date. The income amounts must also be entered
accurately into DU.
