# Wealth Management AI Advisor Capstone

In this capstone project, you will design and develop a sophisticated Wealth Management AI Assistant using the GenAI technology stack on AWS. This AI assistant will leverage state-of-the-art retrieval-augmented generation techniques (RAG) utilizing Amazon OpenSearch Serverless to analyze investment funds comprehensively and provide real-time insights to users. The core objective of the project is to bridge the gap between complex financial data and investment decision-making, offering a user-friendly platform for investors and wealth managers.

## Section: Data Ingestion Workflow


![title](images/data-ingestion.png)


In this section, we focus on ingesting fund information into a vector database. We will extract text from fund PDF documents, compute high-dimensional vector representations (embeddings) of that text using different models, and ingest the results into an OpenSearch Serverless index.

### Step 1: Setup

Here are some of the packages you will need for this capstone.

In [95]:
!pip install langchain
!pip install sentence-transformers
!pip install pypdf
!pip install -U opensearch-py
!pip install datasets
!pip install ragas

#### Imports

In [36]:
import json
import os
import sys
import boto3

#langchain related imports
from langchain.embeddings import BedrockEmbeddings
from langchain.llms.bedrock import Bedrock
from langchain.load.dump import dumps


#### Set up Bedrock Client

In [37]:
module_path = "../"
sys.path.append(os.path.abspath(module_path))
from utils import bedrock, print_ww

boto3_bedrock = bedrock.get_bedrock_client(
    assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
    region=os.environ.get("AWS_DEFAULT_REGION", None)
)

#### Set up Bedrock Models

In [38]:
# - create the Anthropic Model
llm_claude_v2 = Bedrock(
    model_id="anthropic.claude-v2", client=boto3_bedrock, model_kwargs={"max_tokens_to_sample": 200}
)

#### TO-DO: 1A - Set up additional inference parameters

Set the following in the `model_kwargs` parameter:

1. temperature
2. top_p
3. top_k

Hint: Refer to the [Claude Documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-claude.html) for details about each parameter.
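To build intuition for these three parameters before choosing values, here is a toy sketch in plain Python. It is not how Bedrock or Claude implements sampling; the function name `filter_next_token_probs` and the made-up logits are assumptions for illustration only. It shows how temperature reshapes a next-token distribution and how `top_k` and `top_p` then trim it.

```python
import math


def filter_next_token_probs(logits, temperature=1.0, top_k=None, top_p=1.0):
    """Return the renormalized token distribution left after the three filters."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    # Temperature rescales logits: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((tok, weight / total) for tok, weight in weights.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # top_k keeps only the k most likely tokens.
    if top_k is not None:
        ranked = ranked[:top_k]
    # top_p keeps the smallest prefix whose cumulative probability reaches p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}


# Made-up logits for four candidate tokens (illustrative values only).
toy_logits = {"fund": 2.0, "bond": 1.0, "stock": 0.5, "cat": -1.0}
print(filter_next_token_probs(toy_logits, temperature=0.5, top_k=3, top_p=0.9))
```

Lower temperature and tighter `top_k`/`top_p` leave fewer candidate tokens, which generally suits factual question answering over fund documents.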

In [39]:
llm_claude_v2 = "<YOUR_CODE_HERE>"

#### TO-DO: 1B - Set up Meta Llama Model

Hints:
1. To get the model ID, call the `list_foundation_models` function on the Bedrock client.
2. Refer to the [Meta Llama Documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html) for details about each parameter.
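As a sketch of the first hint, the helper below picks model IDs out of a `list_foundation_models`-style response. The helper name `find_model_ids` is hypothetical, and `sample_response` is a hand-written stand-in shaped like the service response (a `modelSummaries` list with `modelId`, `modelName`, and `providerName` keys), so the block runs without AWS access.

```python
def find_model_ids(response, provider, name_contains=""):
    """Return model IDs for one provider, optionally filtered by model name."""
    return [
        summary["modelId"]
        for summary in response.get("modelSummaries", [])
        if summary.get("providerName") == provider
        and name_contains.lower() in summary.get("modelName", "").lower()
    ]


# Hand-written sample, not real service output (illustrative values only).
sample_response = {
    "modelSummaries": [
        {"modelId": "anthropic.claude-v2", "modelName": "Claude", "providerName": "Anthropic"},
        {"modelId": "meta.llama2-13b-chat-v1", "modelName": "Llama 2 Chat 13B", "providerName": "Meta"},
    ]
}

print(find_model_ids(sample_response, provider="Meta", name_contains="13b"))
```

In the notebook, `response` would presumably come from a Bedrock control-plane client, for example `boto3.client("bedrock").list_foundation_models()`. The models actually listed depend on your account and region.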

In [40]:
llm_llama13b = "<YOUR_CODE_HERE>"

#### Set up a Bedrock Embedding Model

In [41]:
bedrock_embeddings = BedrockEmbeddings(client=boto3_bedrock)

#### Set Up HuggingFaceEmbeddings Model

In [42]:
from langchain.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

#### TO-DO: 1C - Experiment with different embedding models (Optional)

Hint: Use the [MTEB](https://huggingface.co/spaces/mteb/leaderboard) leaderboard to choose from a range of the best-performing embedding models. Choose a relatively small model (~1 GB) for this exercise. If the kernel crashes, try changing the Studio notebook instance type to a larger instance.

In [None]:
# embeddings = "<YOUR CODE HERE>"

### Step 2: Data Preparation
Let's first download some of the files to build our document store. For this example we will be using the files in the data folder.

#### TO-DO: 2A - Visually Inspect Fund documents

Hint: The PDF files are in the `data` sub-folder of the `07_capstone` folder.

#### TO-DO: 2B - Load Fund PDFs using LangChain Loader

Hint: LangChain offers various [PDF Loaders](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf). Examples include `PyPDFLoader`, `UnstructuredPDFLoader`, `PyPDFDirectoryLoader`, and `OnlinePDFLoader`. After reviewing the options, pick the loader that best fits this use case.

In [None]:
loader = "<YOUR_CODE_HERE>"

#### TO-DO: 2C - Text Splitters

Hint: LangChain offers many text splitters. Based on the fund document type, determine the best text splitter for this use case. Refer to [Text Splitter](https://python.langchain.com/docs/modules/data_connection/document_transformers/) for details.

Considerations:
1. Depending on the embedding dimension, you may want to experiment with different chunk sizes and overlaps.
2. You can create multiple OpenSearch Serverless indexes with different configurations and evaluate the LLM responses for each.
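To see how chunk size and overlap interact, here is a minimal character-based sketch. It is not LangChain's splitting algorithm, which also respects separators such as paragraphs and sentences; the function name `split_with_overlap` is an assumption for illustration.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character chunks that share an overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
```

A larger overlap repeats more text across neighboring chunks, which can help keep a sentence or table row intact in one of them at the cost of a bigger index.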


In [None]:
text_splitter = "<YOUR_CODE_HERE>"

#### TO-DO: 2D - Create a Function to load the documents and Text Splitter

In [None]:
def process_documents():
    # Create the loader from 2B for the PDFs in the data folder
    loader = "<YOUR_CODE_HERE>"
    # Load the documents with the loader
    documents = "<YOUR_CODE_HERE>"
    # Create the text splitter from 2C
    text_splitter = "<YOUR_CODE_HERE>"
    # Split the loaded documents into chunks
    split_docs = "<YOUR_CODE_HERE>"

    return documents, split_docs

# use this variable in the Vector Store creation
documents, split_docs = process_documents()

#### TO-DO: 2E - Compute stats

1. Number of documents loaded
2. Average length of each document before splitting
3. Total number of chunks after splitting
4. Average length of each chunk after splitting

Hint: You may have to iterate on steps 2A-2C if these stats don't fit this use case.
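One way to compute such stats is sketched below. `FakeDocument` is a hypothetical stand-in that only mimics the `page_content` attribute of a LangChain document, so the block runs on its own; in the notebook you would pass `documents` and `split_docs` instead.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class FakeDocument:
    """Hypothetical stand-in exposing only the page_content attribute."""
    page_content: str


def length_stats(docs):
    """Return the number of items and their average length in characters."""
    lengths = [len(doc.page_content) for doc in docs]
    return {"count": len(lengths), "avg_chars": mean(lengths) if lengths else 0}


sample_docs = [FakeDocument("abcd"), FakeDocument("ab")]
print(length_stats(sample_docs))
```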

In [None]:
total_docs = "<YOUR_CODE_HERE>"
average_characters_in_documents = "<YOUR_CODE_HERE>"
total_chunks = "<YOUR_CODE_HERE>"
average_characters_in_chunks = "<YOUR_CODE_HERE>"

#### TO-DO: 2F - Test Embedding Models

In this section, you evaluate the embedding models on the documents after loading and splitting them.
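A simple way to sanity-check an embedding model is to compare the vectors it produces for related and unrelated texts. The sketch below computes cosine similarity in plain Python; the two-dimensional vectors are made up for illustration, whereas real embeddings from `embed_query` have hundreds of dimensions or more.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up toy vectors (illustrative values only).
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Texts about the same fund should score noticeably higher against each other than against unrelated text. If they do not, that is a signal to try a different model.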

In [None]:
import numpy as np

text = """
 Anthropic announced Claude 3, a new family of state-of-the-art AI models that allows customers to choose the exact combination of intelligence, speed, and cost that suits their business needs. 
 The three models in the family are Claude 3 Haiku, the fastest and most compact model for near-instant responsiveness, 
 Claude 3 Sonnet, the ideal balanced model between skills and speed, and 
 Claude 3 Opus, the most intelligent offering for the top-level performance on highly complex tasks.
"""

# use embed_query and np.array()
text_embedding = "<YOUR_CODE_HERE>"
      
text_embedding_dim = "<YOUR_CODE_HERE>"

print(f"text embedding dimension: {text_embedding_dim}")

print(f"text embedding: {text_embedding}")

### Step 3: Set Up OpenSearch Serverless

![title](images/retrieval.png)

In this section, we set up the OpenSearch-related variables and create a new vector store and index. This is a one-time activity and doesn't have to be repeated. There is no TO-DO in this section; execute the cell below to create the collection.

In [53]:
import boto3
import time

suffix="wealth-ai"
vector_store_name = f"capstone-rag-{suffix}"
index_name = f"capstone-rag-index-{suffix}"
encryption_policy_name = f"capstone-rag-sp-{suffix}"
network_policy_name = f"capstone-rag-np-{suffix}"
access_policy_name = f"capstone-rag-ap-{suffix}"
user_identity = boto3.client('sts').get_caller_identity()['Arn']
user_account = boto3.client('sts').get_caller_identity()['Account']
sagemaker_notebook_role = 'arn:aws:iam::' + user_account + ':role/aws-service-role/sagemaker.amazonaws.com/AWSServiceRoleForAmazonSageMakerNotebooks'

aoss_client = boto3.client('opensearchserverless')

print("Creating a security policy for AOSS collection..")
security_policy = aoss_client.create_security_policy(
    name = encryption_policy_name,
    policy = json.dumps(
        {
            'Rules': [{'Resource': ['collection/' + vector_store_name],
            'ResourceType': 'collection'}],
            'AWSOwnedKey': True
        }),
    type = 'encryption'
)

print("Creating a network policy for AOSS collection..")
network_policy = aoss_client.create_security_policy(
    name = network_policy_name,
    policy = json.dumps(
        [
            {'Rules': [{'Resource': ['collection/' + vector_store_name],
            'ResourceType': 'collection'}],
            'AllowFromPublic': True}
        ]),
    type = 'network'
)

print("Creating an AOSS collection..")
collection = aoss_client.create_collection(name=vector_store_name,type='VECTORSEARCH')

print("Waiting for an AOSS collection to be created..")
while True:
    status = aoss_client.list_collections(collectionFilters={'name':vector_store_name})['collectionSummaries'][0]['status']
    if status in ('ACTIVE', 'FAILED'): break
    print(".")
    time.sleep(10)

print("Creating an access policy for the AOSS collection..")
access_policy = aoss_client.create_access_policy(
    name = access_policy_name,
    policy = json.dumps(
        [
            {
                'Rules': [
                    {
                        'Resource': ['collection/' + vector_store_name],
                        'Permission': [
                            'aoss:CreateCollectionItems',
                            'aoss:DeleteCollectionItems',
                            'aoss:UpdateCollectionItems',
                            'aoss:DescribeCollectionItems'],
                        'ResourceType': 'collection'
                    },
                    {
                        'Resource': ['index/' + vector_store_name + '/*'],
                        'Permission': [
                            'aoss:CreateIndex',
                            'aoss:DeleteIndex',
                            'aoss:UpdateIndex',
                            'aoss:DescribeIndex',
                            'aoss:ReadDocument',
                            'aoss:WriteDocument'],
                        'ResourceType': 'index'
                    }],
                'Principal': [user_identity, sagemaker_notebook_role],
                'Description': 'Easy data policy'}
        ]),
    type = 'data'
)

host = collection['createCollectionDetail']['id'] + '.' + os.environ.get("AWS_DEFAULT_REGION", None) + '.aoss.amazonaws.com:443'

print("AOSS host: " + host)

In [None]:
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth
from langchain.vectorstores import OpenSearchVectorSearch

service = 'aoss'
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, os.environ.get("AWS_DEFAULT_REGION", None), service)

#### TO-DO: 3A - Create an Open Search vector store

Hint: Use [OpenSearchVectorSearch](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.opensearch_vector_search.OpenSearchVectorSearch.html) to create a vector store initialized from documents and embeddings.

In [None]:
vector_store_1 = OpenSearchVectorSearch.from_documents(
    documents=<YOUR-CODE-HERE>, #Hint: split_docs from process_documents() function
    embedding=<YOUR-CODE-HERE>, #Hint: Bedrock or HuggingFace Embeddings
    index_name=<YOUR-CODE-HERE>, #Hint: unique index name for each vector store
    engine=<YOUR-CODE-HERE>, # Hint: "nmslib", "faiss", "lucene"; default: "nmslib"
    space_type=<YOUR-CODE-HERE>, # Hint: "l2", "l1", "cosinesimil", "linf", "innerproduct"; default: "l2"
    opensearch_url=host,
    http_auth=auth,
    timeout = 100,
    use_ssl = True,
    verify_certs = True,
    connection_class = RequestsHttpConnection,
    m=16,
    ef_construction=128,
    ef_search=128
)

#### TO-DO: 3B - Create additional vector stores

In this section, you can create additional vector stores so you can evaluate performance with different combinations. Here are some ideas:

1. Create a vector store for each embedding model you set up (Bedrock embeddings vs. HuggingFace embeddings)
2. Create a vector store with a different chunk size and overlap
3. Create a vector store with a different OpenSearch vector engine
4. Create a vector store with a different OpenSearch vector space type
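When you try several combinations, it helps to enumerate them up front and derive a unique index name for each. The sketch below does that for two of the dimensions listed above. The chunk sizes, the base name, and the helper `build_configs` are assumptions for illustration, and not every engine supports every space type, so check the OpenSearch documentation before mixing those.

```python
from itertools import product

# Candidate values to compare (illustrative choices only).
chunk_sizes = [500, 1000]
engines = ["nmslib", "faiss"]


def build_configs(base="capstone-rag-index"):
    """Return one configuration dict per (chunk size, engine) combination."""
    configs = []
    for chunk_size, engine in product(chunk_sizes, engines):
        configs.append(
            {
                "index_name": f"{base}-cs{chunk_size}-{engine}",
                "chunk_size": chunk_size,
                "engine": engine,
            }
        )
    return configs


for config in build_configs():
    print(config)
```

Each dict can then drive one `OpenSearchVectorSearch.from_documents` call, which makes it easier to trace an evaluation result back to the settings that produced it.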

In [None]:
vector_store_2 = <YOUR-CODE-HERE>

Hint: Optionally, you can create additional vector stores

In [None]:
# vector_store_n = <YOUR-CODE-HERE> #Optional

# Congrats: You have successfully completed the Data Ingestion Workflow

# Section: Text Generation Workflow

![title](images/generation.png)

In the generation workflow, we embed the user query with the same embedding model used in the data ingestion workflow and perform a semantic search against the vector database. We will use various techniques to retrieve similar documents and pass them to the LLM to generate text.

### Step 4: Retrieval Augmented Generation (RAG)



#### Similarity Search

To test retrieval from the vector store we created, we can use the similarity search method to run a query and return the matching chunks of text. Pay close attention to the top k=3 results returned.
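Conceptually, a similarity search ranks stored chunk vectors by their distance to the query vector and returns the closest `k`. The sketch below does this by brute force with L2 distance on made-up two-dimensional vectors. OpenSearch relies on approximate nearest-neighbor indexes rather than an exhaustive scan, so treat this only as a mental model; `top_k_chunks` is a hypothetical helper.

```python
import math


def top_k_chunks(query_vec, chunk_vecs, k=3):
    """Return the ids of the k chunks closest to the query by L2 distance."""
    scored = sorted(
        (math.dist(query_vec, vec), chunk_id) for chunk_id, vec in chunk_vecs.items()
    )
    return [chunk_id for _, chunk_id in scored[:k]]


# Made-up chunk vectors keyed by chunk id (illustrative values only).
toy_index = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0], "d": [0.1, 0.0]}
print(top_k_chunks([0.0, 0.0], toy_index, k=3))
```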


In [57]:
query = "What are the ticker symbols for Calvert Global Real Estate Fund"

results = vector_store_1.similarity_search(query, k=3)

print(dumps(results, pretty=True))

#### Similarity Search with filters

Note that the expected chunks should come from `data/Calvert Global Real Estate Fund Fact Sheet.pdf`; however, we see other sources in the top k=3 results. Let's refine this search by using a filter in the similarity search.

In [62]:
query = "What are the ticker symbols for Calvert Global Real Estate Fund"

metadata_filter = {"term": {"metadata.source.keyword": "data/Calvert Global Real Estate Fund Fact Sheet.pdf"}}

results = vector_store_1.similarity_search(query, k=3, boolean_filter=metadata_filter)

print(dumps(results, pretty=True))

#### TO-DO: 4A - Similarity Search Methods

Explore the following similarity search methods of [OpenSearchVectorSearch](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.opensearch_vector_search.OpenSearchVectorSearch.html#langchain_community.vectorstores.opensearch_vector_search.OpenSearchVectorSearch.from_documents) and understand the parameters each one accepts:

1. similarity_search
2. similarity_search_with_score
3. similarity_search_with_score_by_vector

### Step 5: RAG Generate Answers

A typical wealth management advisor looks for insights in fund documents. Here is an example of a question that would surface insights about a fund:

In [75]:
query_0 = "What is the Expense Ratio of shares in California Limited-Term Tax-Free Funds. Provide me the response in bullet point format"

For each query in the sections below, do the following:

1. Define a prompt template, applying prompt engineering best practices for that query
2. Use LangChain `RetrievalQA`. Hint: see the [as_retriever](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.opensearch_vector_search.OpenSearchVectorSearch.html#langchain_community.vectorstores.opensearch_vector_search.OpenSearchVectorSearch.as_retriever) examples
3. Print the results
4. Capture the results in `capstone_eval.csv`
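To make the first two steps concrete, the sketch below shows roughly what a "stuff"-style chain does with a prompt template: it joins the retrieved chunks into one context string and substitutes it, together with the question, into the template. This is a plain-Python approximation, not LangChain's implementation; `stuff_prompt`, the shortened template, and the placeholder chunk texts are assumptions for illustration.

```python
# Shortened template with the same two placeholders used in this notebook.
template = """Human: Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know.

{context}

Question: {question}
Assistant:"""


def stuff_prompt(template, chunks, question):
    """Join retrieved chunks into one context block and fill the template."""
    context = "\n\n".join(chunks)
    return template.format(context=context, question=question)


# Placeholder chunk texts standing in for retrieved document chunks.
retrieved_chunks = ["<text of chunk 1>", "<text of chunk 2>"]
final_prompt = stuff_prompt(template, retrieved_chunks, "What is the expense ratio?")
print(final_prompt)
```

Because every retrieved chunk lands in a single prompt, the value of `k` and the chunk size together determine how much of the model's context window the prompt consumes.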

In [45]:
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

prompt_template_0 = """Human: Use the following pieces of context to provide a concise answer to the question at the end. 
If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
I want just the answer, don't include any verbose text
Assistant:"""

PROMPT_0 = PromptTemplate(template=prompt_template_0, input_variables=["context", "question"])

context_0 = vector_store_1.similarity_search(query_0, k=3)

print(dumps(context_0, pretty=True))

qa_prompt_0 = RetrievalQA.from_chain_type(
    llm=llm_claude_v2,
    chain_type="stuff",
    retriever = vector_store_1.as_retriever(search_kwargs={"k": 3, "filter" : {}}),
    return_source_documents=True,
    chain_type_kwargs={"prompt": PROMPT_0},
)

result_0 = qa_prompt_0({"query": query_0})
print_ww(result_0["result"])

In [82]:
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

prompt_template_0 = """Human: Use the following pieces of context to provide a concise answer to the question at the end. 
If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
I want just the answer, don't include any verbose text
Assistant:"""

PROMPT_0 = PromptTemplate(template=prompt_template_0, input_variables=["context", "question"])

# Rerun the same query against the second vector store to compare results
context_0 = vector_store_2.similarity_search(query_0, k=3)

print(dumps(context_0, pretty=True))

qa_prompt_0 = RetrievalQA.from_chain_type(
    llm=llm_claude_v2,
    chain_type="stuff",
    retriever = vector_store_2.as_retriever(search_kwargs={"k": 3, "filter" : {}}),
    return_source_documents=True,
    chain_type_kwargs={"prompt": PROMPT_0},
)

result_0 = qa_prompt_0({"query": query_0})
print_ww(result_0["result"])

#### TO-DO: 5A - Query 1

In [None]:
query_1 = "What is the 30-day SEC unsubsidized yield of Institutional Shares in California Limited Term Tax Free Funds"

prompt_template_1 = <YOUR_CODE_HERE>

PROMPT_1 = <YOUR_CODE_HERE>

context_1 = <YOUR_CODE_HERE>

print(dumps(context_1, pretty=True))

qa_prompt_1 = <YOUR_CODE_HERE>

result_1 = <YOUR_CODE_HERE>

print_ww(result_1["result"])

#### TO-DO: 5B - Query 2

In [None]:
query_2 = "What is the annual return by year for Vanguard 500 Index Fund"

prompt_template_2 = <YOUR_CODE_HERE>

PROMPT_2 = <YOUR_CODE_HERE>

context_2 = <YOUR_CODE_HERE>

print(dumps(context_2, pretty=True))

qa_prompt_2 = <YOUR_CODE_HERE>

result_2 = <YOUR_CODE_HERE>

print_ww(result_2["result"])

#### TO-DO: 5C - Query 3

In [None]:
query_3 = "What is the YTD high-low NAV of Class A Shares in California Limited Term Tax Free Funds"

prompt_template_3 = <YOUR_CODE_HERE>

PROMPT_3 = <YOUR_CODE_HERE>

context_3 = <YOUR_CODE_HERE>

print(dumps(context_3, pretty=True))

qa_prompt_3 = <YOUR_CODE_HERE>

result_3 = <YOUR_CODE_HERE>

print_ww(result_3["result"])

#### TO-DO: 5D - Query 4

In [None]:
query_4 = "What is Share class information of California Limited-Term Tax-Free Funds"

prompt_template_4 = <YOUR_CODE_HERE>

PROMPT_4 = <YOUR_CODE_HERE>

context_4 = <YOUR_CODE_HERE>

print(dumps(context_4, pretty=True))

qa_prompt_4 = <YOUR_CODE_HERE>

result_4 = <YOUR_CODE_HERE>

print_ww(result_4["result"])

#### TO-DO: 5E - Query 5

In [None]:
query_5 = "What are the Top holdings (%) of California Limited-Term Tax-Free Funds"

prompt_template_5 = <YOUR_CODE_HERE>

PROMPT_5 = <YOUR_CODE_HERE>

context_5 = <YOUR_CODE_HERE>

print(dumps(context_5, pretty=True))

qa_prompt_5 = <YOUR_CODE_HERE>

result_5 = <YOUR_CODE_HERE>

print_ww(result_5["result"])

# Congrats, you have successfully completed this section

# Section: LLM Eval with Ragas

In this section, you will evaluate the results of your RAG application. Refer to the [RAGAS](https://github.com/explodinggradients/ragas) documentation for more help.

In [85]:
def get_client():
    
    from utils import bedrock, print_ww
    boto3_bedrock = bedrock.get_bedrock_client(
        assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
        region=os.environ.get("AWS_DEFAULT_REGION", None)
    )
    
    return boto3_bedrock

In [86]:
from langchain.llms.bedrock import Bedrock

eval_llm = Bedrock(
    client=get_client(),
    model_id="anthropic.claude-v2",
    model_kwargs={"max_tokens_to_sample": 500, "temperature": 0.9},
    streaming=True
    )

In [92]:
import csv
import ast  # Safe evaluation of strings containing Python literals

data_samples = {
    'question': [],
    'ground_truth': [],
    'answer': [],
    'contexts': []
}

# Path to the CSV file that holds the captured evaluation results
csv_file_path = './eval/capstone_eval.csv'

with open(csv_file_path, mode='r', encoding='utf-8') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        data_samples['question'].append(row['question'])
        data_samples['ground_truth'].append(row['ground_truths'])
        data_samples['answer'].append(row['answer'])
        
        # Assuming 'contexts' and 'ground_truths' are stored as strings that look like lists
        # Convert them from string representation to actual Python lists
        # If this assumption is incorrect, you'll need to adjust the parsing
        try:
            contexts = ast.literal_eval(row['contexts'])
            #ground_truths = ast.literal_eval(row['ground_truths'])
        except (ValueError, SyntaxError):
            # Fallback in case of parsing error, adjust as necessary
            contexts = row['contexts'].split(';')  # Example fallback, adjust based on actual format
            #ground_truths = row['ground_truths'].split(';')  # Example fallback, adjust based on actual format
        
        data_samples['contexts'].append(contexts)
        #data_samples['ground_truth'].append(ground_truths)
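The loader above expects a CSV with `question`, `ground_truths`, `answer`, and `contexts` columns, where `contexts` is a string that `ast.literal_eval` can turn back into a list. A minimal sketch of writing such rows follows. The helper name `append_eval_row` and the placeholder values are assumptions, and the demo writes to a temporary directory rather than to `./eval/capstone_eval.csv`.

```python
import ast
import csv
import os
import tempfile

FIELDS = ["question", "ground_truths", "answer", "contexts"]


def append_eval_row(path, question, ground_truth, answer, contexts):
    """Append one evaluation row, writing the header if the file is new or empty."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, mode="a", newline="", encoding="utf-8") as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(
            {
                "question": question,
                "ground_truths": ground_truth,
                "answer": answer,
                # repr() of a list of strings is a literal that ast.literal_eval can parse
                "contexts": repr(list(contexts)),
            }
        )


# Round-trip demo with placeholder values in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "capstone_eval.csv")
    append_eval_row(demo_path, "q1", "expected answer", "model answer", ["chunk a", "chunk b"])
    with open(demo_path, newline="", encoding="utf-8") as csvfile:
        rows = list(csv.DictReader(csvfile))

print(ast.literal_eval(rows[0]["contexts"]))
```

In the notebook, the contexts would typically be the `page_content` strings of the source documents returned for each query.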

In [93]:
data_samples

In [96]:
from datasets import Dataset 

eval_dataset = Dataset.from_dict(data_samples)

In [98]:
from ragas import evaluate
import nest_asyncio  # CHECK NOTES
from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
    answer_similarity,
    answer_correctness,
    context_relevancy
)

# NOTE: nest_asyncio is only needed when running inside a Jupyter notebook; otherwise comment out or remove this call.
nest_asyncio.apply()

result = evaluate(
    eval_dataset,
    metrics=[
        faithfulness,
        answer_relevancy,
        context_recall,
        context_precision,
        answer_similarity,
        #answer_correctness,
        context_relevancy
    ],
    llm=eval_llm,
    embeddings=bedrock_embeddings,
)

result

In [99]:
df = result.to_pandas()
df

In [100]:
df.to_csv("eval/ragas_results_capstone.csv")

# Congrats, you have successfully completed the capstone project!

### Clean up
You have reached the end of this workshop. The following cell deletes the OpenSearch Serverless resources created in this notebook.


In [None]:
aoss_client.delete_collection(id=collection['createCollectionDetail']['id'])
aoss_client.delete_access_policy(name=access_policy_name, type='data')
aoss_client.delete_security_policy(name=encryption_policy_name, type='encryption')
aoss_client.delete_security_policy(name=network_policy_name, type='network')

## Conclusion
Congratulations on completing this module on retrieval-augmented generation! This important technique combines the power of large language models with the precision of retrieval methods. By augmenting generation with relevant retrieved examples, the responses we received became more coherent, consistent, and grounded. You should feel proud of learning this innovative approach. I'm sure the knowledge you've gained will be very useful for building creative and engaging language generation systems. Well done!

In the above implementation of RAG-based question answering, we explored the following concepts and how to implement them using Amazon Bedrock and its LangChain integration:

- Loading documents and generating embeddings to create a vector store
- Retrieving documents relevant to the question
- Preparing a prompt that goes as input to the LLM
- Presenting an answer in a human-friendly manner
- Keeping source knowledge up to date, and improving trust in our system by providing citations with every answer

### Take-aways
- Experiment with different Vector Stores
- Leverage various models available under Amazon Bedrock to see alternate outputs
- Explore options such as persistent storage of embeddings and document chunks
- Integration with enterprise data stores

# Thank You