# Retrieval Augmented Question & Answering with Amazon Bedrock using LangChain

### Challenges

When trying to solve a Question Answering task over a larger document corpus with the help of LLMs we need to master the following challenges (amongst others):
- How to manage large document(s) that exceed the token limit
- How to find the document(s) relevant to the question being asked

### Infusing knowledge into LLM-powered systems

We have two primary [types of knowledge for LLMs](https://www.pinecone.io/learn/langchain-retrieval-augmentation/): 
- **Parametric knowledge**: refers to everything the LLM learned during training and acts as a frozen snapshot of the world for the LLM. 
- **Source knowledge**: covers any information fed into the LLM via the input prompt. 

When trying to infuse knowledge into a generative AI-powered application we need to choose which of these types to target. Fine-tuning, explored in other workshops, targets the parametric knowledge. Since fine-tuning is a resource-intensive operation, this option is well suited for infusing static domain-specific information like domain-specific language/writing styles (medical domain, science domain, ...) or optimizing performance towards a very specific task (classification, sentiment analysis, RLHF, instruction-finetuning, ...). 

In contrast, targeting the source knowledge for a domain-specific performance uplift is very well suited for all kinds of dynamic information, from knowledge bases in structured and unstructured form up to the integration of information from live systems. This lab is about retrieval-augmented generation, a common design pattern for ingesting domain-specific information through the source knowledge. It is particularly well suited for ingesting information in the form of unstructured text with semi-frequent update cycles. 

In this notebook we explain how to utilize the RAG (retrieval-augmented generation) pattern originating from [this](https://arxiv.org/pdf/2005.11401.pdf) paper published by Lewis et al. in 2020. It is particularly useful for Question Answering: it finds the most useful excerpts of documents in a larger document corpus and leverages them to answer the user's questions.

#### Prepare documents
![Embeddings](./images/Embeddings_lang.png)

Before being able to answer the questions, the documents must be processed and stored in a document store index:
- Load the documents
- Process and split them into smaller chunks
- Create a numerical vector representation of each chunk using Amazon Bedrock Titan Embeddings model
- Create an index using the chunks and the corresponding embeddings
#### Ask question
![Question](./images/Chatbot_lang.png)

When the document index is prepared, you are ready to ask questions, and relevant documents will be fetched based on the question being asked. The following steps will be executed:
- Create an embedding of the input question
- Compare the question embedding with the embeddings in the index
- Fetch the (top N) relevant document chunks
- Add those chunks as part of the context in the prompt
- Send the prompt to the model under Amazon Bedrock
- Get the contextual answer based on the documents retrieved
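
The retrieval steps above can be sketched in plain Python. This is a minimal illustration, not the notebook's implementation: the bag-of-words `embed` function is a toy stand-in for the Titan Embeddings model, the sample chunks are invented, and the final call to the model is left out.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embeddings model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_n(question, index, n=2):
    """Embed the question, score every indexed chunk, and return the n best chunks."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    return [chunk for _, chunk in ranked[:n]]

def build_prompt(question, chunks):
    """Place the retrieved chunks in the prompt as context for the model."""
    context = "\n\n".join(chunks)
    return f"Use the context to answer.\n\n{context}\n\nQuestion: {question}"

# Hypothetical chunks, invented for this illustration only.
chunks = [
    "The vector store keeps one embedding per document chunk.",
    "Bananas are rich in potassium.",
    "A question embedding is compared with the chunk embeddings.",
]
index = [(embed(c), c) for c in chunks]

question = "How is the question embedding compared?"
prompt = build_prompt(question, top_n(question, index, n=2))
print(prompt)  # this string is what would be sent to the model
```

Only the two chunks that share vocabulary with the question reach the prompt; the unrelated chunk is filtered out before the model is ever called.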

## Usecase
#### Dataset
In this example, you will use a set of sample Data Scientist and Machine Learning Engineer resumes (PDF files) as a text corpus to perform Q&A on.

## Implementation
To follow the RAG approach, this notebook uses the LangChain framework, which integrates with different services and tools and allows patterns such as RAG to be built efficiently. We will be using the following tools:

- **LLM (Large Language Model)**: Anthropic Claude available through Amazon Bedrock

  This model will be used to understand the document chunks and provide an answer in a human-friendly manner.
- **Embeddings Model**: Amazon Titan Embeddings available through Amazon Bedrock

  This model will be used to generate a numerical representation of the textual documents

- **Document Loader**: 
    - PDF Loader available through LangChain for PDFs

  These are loaders that can load the documents from a source. For the sake of this notebook we are loading the sample files from a local path. This could easily be replaced with a loader that reads documents from enterprise internal systems.

- **Vector Store**: FAISS available through LangChain
  In this notebook we are using this in-memory vector store to store both the embeddings and the documents. In an enterprise context this could be replaced with a persistent store such as Amazon OpenSearch Service, RDS for PostgreSQL with pgvector, ChromaDB, Pinecone or Weaviate.

- **Index**: VectorIndex
  The index helps to compare the input embedding with the document embeddings to find relevant documents.

- **Wrapper**: wraps index, vector store, embeddings model and the LLM to abstract away the logic from the user.

### Python 3.10

⚠️⚠️⚠️ For this lab we need to run the notebook based on a Python 3.10 runtime. ⚠️⚠️⚠️

If you carry out the workshop from your local environment outside of Amazon SageMaker Studio, please make sure you are running Python 3.10 or newer.
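
A quick way to verify the runtime before installing anything is sketched below. The helper name `runtime_is_supported` is made up for this illustration, and the minimum version is taken from the note above.

```python
import sys

# Minimum interpreter version, taken from the note above.
REQUIRED = (3, 10)

def runtime_is_supported(version_info=sys.version_info, required=REQUIRED):
    """Return True when the interpreter's (major, minor) version is at least `required`."""
    return tuple(version_info[:2]) >= required

if not runtime_is_supported():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} detected; "
        f"please switch to Python {REQUIRED[0]}.{REQUIRED[1]} or newer."
    )
```

Comparing version tuples rather than strings avoids the pitfall that `"3.9" > "3.10"` holds for string comparison.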

### Setup
To run this notebook you would need to install 2 more dependencies, [PyPDF](https://pypi.org/project/pypdf/) and [FAISS vector store](https://github.com/facebookresearch/faiss).

Then begin with instantiating the LLM and the Embeddings model. Here we are using Anthropic Claude to demonstrate the use case.

Note: It is possible to choose other models available with Bedrock. You can replace the `model_id` as follows to change the model.

`llm = Bedrock(model_id="...")`

In [1]:
%pip install --force-reinstall boto3 --quiet

Note: you may need to restart the kernel to use updated packages.


In [2]:
%pip install langchain==0.0.305 --force-reinstall --quiet
%pip install pypdf==3.8.1 faiss-cpu==1.7.4 --force-reinstall --quiet

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
botocore 1.31.85 requires urllib3<1.27,>=1.25.4; python_version < "3.10", but you have urllib3 2.1.0 which is incompatible.
llama-index 0.8.37 requires urllib3<2, but you have urllib3 2.1.0 which is incompatible.
Note: you may need to restart the kernel to use updated packages.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
llama-index 0.8.37 requires urllib3<2, but you have urllib3 2.1.0 which is incompatible.
Note: you may need to restart the kernel to use updated packages.


In [3]:
%pip install tiktoken==0.4.0 --force-reinstall --quiet

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
botocore 1.31.85 requires urllib3<1.27,>=1.25.4; python_version < "3.10", but you have urllib3 2.1.0 which is incompatible.
llama-index 0.8.37 requires urllib3<2, but you have urllib3 2.1.0 which is incompatible.
Note: you may need to restart the kernel to use updated packages.


In [4]:
%pip install sqlalchemy==2.0.21 --force-reinstall --quiet

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
llama-index 0.8.37 requires urllib3<2, but you have urllib3 2.1.0 which is incompatible.
Note: you may need to restart the kernel to use updated packages.


Uncomment the following lines to run from your local environment outside of the AWS account with Bedrock access. If you are carrying out the lab in Amazon SageMaker Studio, no changes are needed.

In [5]:
#import os
#os.environ['BEDROCK_ASSUME_ROLE'] = '<YOUR_VALUES>'
#os.environ['AWS_PROFILE'] = 'bedrock-user'

In [4]:
import boto3
import json
import os
import sys

module_path = ".."
sys.path.append(os.path.abspath(module_path))
from utils import bedrock, print_ww

bedrock_client = bedrock.get_bedrock_client(
    assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
    region=os.environ.get("AWS_DEFAULT_REGION", None),
    runtime=True # Default. Needed for invoke_model() from the data plane
)

Create new client
  Using region: None
boto3 Bedrock client successfully created!
bedrock-runtime(https://bedrock-runtime.us-east-1.amazonaws.com)


In [5]:
from utils.TokenCounterHandler import TokenCounterHandler

token_counter = TokenCounterHandler()

### Setup langchain

We create instances of the Bedrock classes for the LLM and the embeddings model. At the time of writing, Bedrock supported a single embeddings model, so the model id could be omitted; the cell below passes it explicitly. To compare token consumption across the different RAG approaches shown in the workshop labs, we use LangChain callbacks to count tokens.

In [6]:
# We will be using the Titan Embeddings Model to generate our Embeddings.
from langchain.embeddings import BedrockEmbeddings
from langchain.llms.bedrock import Bedrock

# - create the Anthropic Model
llm = Bedrock(model_id="anthropic.claude-v2", 
              client=bedrock_client, 
              model_kwargs={
                  'max_tokens_to_sample': 200
              }, 
              callbacks=[token_counter])

# - create the Titan Embeddings Model
bedrock_embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1",
                                       client=bedrock_client)

### Data Preparation
Let's first list the files used to build our document store.

In this example, you will use a set of sample Data Scientist and Machine Learning Engineer resumes (PDF files under `./data/`) as a text corpus to perform Q&A on.

In [7]:
# !mkdir -p ./data

from urllib.request import urlretrieve
# urls = [
#     'https://s2.q4cdn.com/299287126/files/doc_financials/2023/ar/2022-Shareholder-Letter.pdf',
#     'https://s2.q4cdn.com/299287126/files/doc_financials/2022/ar/2021-Shareholder-Letter.pdf',
#     'https://s2.q4cdn.com/299287126/files/doc_financials/2021/ar/Amazon-2020-Shareholder-Letter-and-1997-Shareholder-Letter.pdf',
#     'https://s2.q4cdn.com/299287126/files/doc_financials/2020/ar/2019-Shareholder-Letter.pdf'
# ]

filenames = [
    # "bedrock.pdf",
    "Data_Scientist_Resume_0.pdf",
    "Data_Scientist_Resume_1.pdf",
    "Data_Scientist_Resume_3.pdf",
    "Data_Scientist_Resume_2.pdf",
    "Data_Scientist_Resume_6.pdf",
    # ".DS_Store",
    "Data_Scientist_Resume_7.pdf",
    "Data_Scientist_Resume_5.pdf",
    "Data_Scientist_Resume_4.pdf",
    "Data_Scientist_Resume_10.pdf",
    "Data_Scientist_Resume_9.pdf",
    "Data_Scientist_Resume_8.pdf",
    "Machine_Learning_Engineer_Resume_3.pdf",
    "Machine_Learning_Engineer_Resume_2.pdf",
    "Machine_Learning_Engineer_Resume_5.pdf",
    "Machine_Learning_Engineer_Resume_4.pdf",
]

# One metadata record per file: a running id plus the source filename.
metadata = [dict(id=idx, source=name) for idx, name in enumerate(filenames)]

data_root = "./data/"


The shareholder-letter version of this lab trimmed the copy of the 1997 Letter to Shareholders (last 3 pages) from every file at this point, because the repetition takes longer to embed and may skew your results. The resume PDFs used here are not trimmed, so the next cell only imports the PDF utilities.

In [8]:
from pypdf import PdfReader, PdfWriter
import glob

With the files in place we can load the documents with the help of [PyPDFLoader available under LangChain](https://python.langchain.com/en/latest/reference/modules/document_loaders.html) and split them into smaller chunks.

Note: The retrieved document/text should be large enough to contain enough information to answer a question, but small enough to fit into the LLM prompt. Also, the embeddings model limits its input to 512 tokens, which roughly translates to ~2000 characters. For this use case we create chunks of roughly 1000 characters with an overlap of 100 characters using [RecursiveCharacterTextSplitter](https://python.langchain.com/en/latest/modules/indexes/text_splitters/examples/recursive_text_splitter.html).
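
To make the chunk size and overlap concrete, here is a simplified sketch of overlapping chunking. It is a plain fixed-window splitter written for illustration, not the algorithm of `RecursiveCharacterTextSplitter`, which prefers to break on separators such as paragraphs, lines and spaces. The function name and the sample text are assumptions.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=100):
    """Cut `text` into windows of at most `chunk_size` characters,
    each sharing `chunk_overlap` characters with the previous window."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks

# Hypothetical 2,500-character document made of repeating digits.
sample = "".join(str(i % 10) for i in range(2500))
pieces = split_with_overlap(sample)
print([len(p) for p in pieces])  # → [1000, 1000, 700]
```

The overlap means that a sentence cut at a chunk boundary still appears intact in one of the two neighbouring chunks, at the cost of embedding some text twice.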

In [9]:
import numpy as np
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import PyPDFLoader

documents = []

for idx, file in enumerate(filenames):
    loader = PyPDFLoader(data_root + file)
    document = loader.load()
    for document_fragment in document:
        document_fragment.metadata = metadata[idx]
        
    print(f'{len(document)} {document}\n')
    documents += document


# - in our testing, character-based splitting works better with this PDF data set
text_splitter = RecursiveCharacterTextSplitter(
    # Chunks of roughly 1000 characters with an overlap of 100 characters.
    chunk_size=1000,
    chunk_overlap=100,
)

docs = text_splitter.split_documents(documents)
# Before proceeding, we look at some statistics on the document preprocessing we just performed:

2 [Document(page_content='Data Scientist Resume\nContact Information\nName: Alex Johnson\nAddress: 123 Main St, Anytown, USA\nPhone: (555) 123-4567\nEmail: alex.johnson@example.com\nLinkedIn: linkedin.com/in/alexjohnson\nGitHub: github.com/alexjohnson\nProfessional Summary\nHighly skilled and analytical Data Scientist with experience in machine learning, data mining, and data visualization.\nProven ability to drive business decisions by providing actionable insights from data analytics. Proficient in statistical\nsoftware and programming languages including Python, R, and SQL.\nEducation\nB.Sc. in Data Science, State University, 2019\nRelevant Coursework: Machine Learning, Statistical Analysis, Big Data Analytics\nWork Experience\nData Scientist, Tech Solutions Inc., June 2019 - Present\n- Developed and implemented statistical models for data analysis that increased sales by 15%.\n- Created data visualizations to communicate complex analysis and insights to non-technical team members.\

In [10]:
avg_doc_length = lambda documents: sum([len(doc.page_content) for doc in documents])//len(documents)
print(f'Average length among {len(documents)} documents loaded is {avg_doc_length(documents)} characters.')
print(f'After the split we have {len(docs)} documents as opposed to the original {len(documents)}.')
print(f'Average length among {len(docs)} documents (after split) is {avg_doc_length(docs)} characters.')

Average length among 30 documents loaded is 778 characters.
After the split we have 45 documents as opposed to the original 30.
Average length among 45 documents (after split) is 519 characters.


We loaded 15 PDF files as 30 page-level documents, which have been split into 45 smaller chunks of ~500 characters on average.

Now we can see what a sample embedding looks like for one of those chunks.

In [11]:
sample_embedding = np.array(bedrock_embeddings.embed_query(docs[0].page_content))
print("Sample embedding of a document chunk: ", sample_embedding)
print("Size of the embedding: ", sample_embedding.shape)

Sample embedding of a document chunk:  [ 0.58984375  0.12207031 -0.00946045 ...  0.4296875  -0.48242188
  0.15625   ]
Size of the embedding:  (1536,)


Following the very same approach, embeddings can be generated for the entire corpus and stored in a vector store.

This can easily be done using the [FAISS](https://github.com/facebookresearch/faiss) implementation inside [LangChain](https://python.langchain.com/en/latest/modules/indexes/vectorstores/examples/faiss.html), which takes the embeddings model and the documents as input to create the entire vector store. Using the index wrapper we can abstract away most of the heavy lifting, such as creating the prompt, getting embeddings of the query, sampling the relevant documents and calling the LLM. [VectorStoreIndexWrapper](https://python.langchain.com/en/latest/modules/indexes/getting_started.html#one-line-index-creation) helps us with that.

**⚠️⚠️⚠️ NOTE: it might take a few minutes to run the following cell ⚠️⚠️⚠️**
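
Conceptually, the simplest kind of vector index is a flat one that keeps every vector and scans them all at query time. The sketch below is a brute-force stand-in written for illustration, not FAISS itself: the class name `FlatL2Index`, the 3-dimensional vectors and the payload strings are all assumptions.

```python
import math

class FlatL2Index:
    """Brute-force nearest-neighbour index: stores every vector and scans them all."""

    def __init__(self):
        self._vectors = []
        self._payloads = []

    def add(self, vector, payload):
        """Store a vector together with the chunk (or any payload) it represents."""
        self._vectors.append(list(vector))
        self._payloads.append(payload)

    def search(self, query, k=4):
        """Return the k (payload, distance) pairs closest to `query` by Euclidean distance."""
        scored = sorted(
            (math.dist(query, vec), i) for i, vec in enumerate(self._vectors)
        )
        return [(self._payloads[i], dist) for dist, i in scored[:k]]

# Hypothetical 3-dimensional embeddings; real Titan embeddings have 1536 dimensions.
index = FlatL2Index()
index.add([0.0, 0.0, 1.0], "chunk far away")
index.add([1.0, 0.0, 0.0], "chunk identical to query")
index.add([0.9, 0.1, 0.0], "chunk close to query")

hits = index.search([1.0, 0.0, 0.0], k=2)
print([payload for payload, _ in hits])
# → ['chunk identical to query', 'chunk close to query']
```

An exhaustive scan like this is exact but grows linearly with the corpus, which is why larger deployments usually move to approximate indexes or a managed vector database.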

In [12]:
from langchain.chains.question_answering import load_qa_chain
from langchain.vectorstores import FAISS
from langchain.indexes import VectorstoreIndexCreator
from langchain.indexes.vectorstore import VectorStoreIndexWrapper

vectorstore_faiss = FAISS.from_documents(
    docs,
    bedrock_embeddings,
)

wrapper_store_faiss = VectorStoreIndexWrapper(vectorstore=vectorstore_faiss)

In [13]:
# ----------------------------------------
# test on using loaded documents directly

In [14]:
from langchain.chains.question_answering import load_qa_chain
from langchain.vectorstores import FAISS
from langchain.indexes import VectorstoreIndexCreator
from langchain.indexes.vectorstore import VectorStoreIndexWrapper

vectorstore_faiss = FAISS.from_documents(
    documents, #load documents from pdfs
    bedrock_embeddings,
)

wrapper_store_faiss = VectorStoreIndexWrapper(vectorstore=vectorstore_faiss)

In [15]:
query = "find good candidate for data scientist position?"
query_embedding = vectorstore_faiss.embedding_function(query)

relevant_documents = vectorstore_faiss.similarity_search_by_vector(query_embedding)
print(f'{len(relevant_documents)} documents are fetched which are relevant to the query.')
print('----')
for i, rel_doc in enumerate(relevant_documents):
    print_ww(f'## Document {i+1}: {rel_doc.page_content}.......')
    print('---')
### Question Answering

# Now that we have our vector store in place, we can start asking questions.

4 documents are fetched which are relevant to the query.
----
## Document 1: Data Scientist Resume
Contact Information
Name: Samantha Reed
Address: 456 Oak Lane, Techville, USA
Phone: (555) 654-3210
Email: samantha.reed@example.com
LinkedIn: linkedin.com/in/samanthareed
GitHub: github.com/samanthareed
Professional Summary
Highly skilled and analytical Data Scientist with experience in machine learning, data mining, and
data visualization.
Proven ability to drive business decisions by providing actionable insights from data analytics.
Proficient in statistical
software and programming languages including Python, R, and SQL.
Education
B.Sc. in Data Science, Tech University, 2018
Relevant Coursework: Machine Learning, Statistical Analysis, Big Data Analytics
Work Experience
Data Scientist, Innovatech Ltd., July 2018 - Present
- Developed and implemented statistical models for data analysis that increased sales by 15%.
- Created data visualizations to communicate complex analysis and insig

Now we have the relevant documents, it's time to use the LLM to generate an answer based on these documents. 

We will take our initial prompt together with the relevant documents that were retrieved by our similarity search, and combine them into a prompt that we feed to the model to get our result. At this point the model should give us a well-informed answer about the candidates, grounded in the retrieved resumes.

LangChain provides an abstraction of how this can be done easily.

### Quick way
You can use the wrapper provided by LangChain, which wraps around the vector store and takes the LLM as input.
This wrapper performs the following steps behind the scenes:
- Take the question as input
- Create the question embedding
- Fetch the relevant documents
- Stuff the documents and the question into a prompt
- Invoke the model with the prompt and generate the answer in a human-readable manner.

In [20]:
query = "find top candidates for data scientist position?"

answer = wrapper_store_faiss.query(question=query, llm=llm)
print_ww(answer)

 Based on the provided resumes, here are the top 2 candidates for the data scientist position:

1. Samantha Reed
- B.Sc. in Data Science from Tech University in 2018, so has 3 years of experience
- Currently working as Data Scientist at Innovatech Ltd since 2018
- Developed statistical models that increased sales by 15%
- Strong programming skills in Python, R, SQL, Java

2. Alex Johnson
- B.Sc. in Data Science from State University in 2019, so has 2 years of experience
- Currently working as Data Scientist at Tech Solutions Inc since 2019
- Also developed statistical models that increased sales by 15%
- Strong programming skills in Python, R, SQL, Java

David Lee seems to be a less strong candidate based on having only 1 year of experience since
graduating in 2021. But Samantha Reed and Alex Johnson both have impressive resumes showing
successful data science projects and technical


Let's ask a different question:

In [24]:
query_2 = "find some good machine learning engineers?"

answer_2 = wrapper_store_faiss.query(question=query_2, llm=llm)
print_ww(answer_2)

 Based on the resumes provided, here are some potential good machine learning engineer candidates:

- Avery Walker - Has an MS in Computer Science with a specialization in machine learning. Worked as
a ML engineer at Smart Solutions Tech, where he deployed ML models and pipelines that increased
efficiency by 25%. Has experience with deep learning and computer vision.

- Riley Morgan - Also has an advanced degree in CS with ML specialization. Worked as a ML engineer
at NextGen Robotics deploying models and collaborating with data scientists. Has R&D experience in
CV and NLP.

- Bailey Jordan - Recently completed an MS in CS with an ML focus. Worked at Cutting Edge AI Labs
deploying ML models and pipelines. Has experience with deep learning and advanced ML algorithms.

All three have the educational background, work experience, and technical skills that would make
them strong candidates for ML engineering roles. Their resumes highlight hands-on experience
building, deploy


In [32]:
query_3 = "which machine learning engineer has soccer skills"

answer_3 = wrapper_store_faiss.query(question=query_3, llm=llm)
print_ww(answer_3)

 Based on the information provided, there are no mentions of any of the machine learning engineers
having soccer skills. The context focuses on their programming skills, machine learning expertise,
data engineering knowledge, projects and certifications, but does not indicate if any of them have
experience or skills related to soccer specifically. Since no soccer skills are listed, I don't have
enough information to determine which machine learning engineer has soccer skills.


### Customisable option
In the above scenario you explored the quick and easy way to get a context-aware answer to your question. Now let's have a look at a more customizable option with the help of [RetrievalQA](https://python.langchain.com/en/latest/modules/chains/index_examples/vector_db_qa.html), where you can customize how the fetched documents are added to the prompt using the `chain_type` parameter. If you want to control how many relevant documents are retrieved, change the `k` parameter in the cell below and compare the outputs. In many scenarios you might also want to know which source documents the LLM used to generate the answer. You can get them in the output using `return_source_documents`, which returns the documents that were added to the context of the LLM prompt. `RetrievalQA` also allows you to provide a custom [prompt template](https://python.langchain.com/en/latest/modules/prompts/prompt_templates/getting_started.html), which can be specific to the model.

Note: In this example we are using Anthropic Claude as the LLM under Amazon Bedrock. This particular model performs best if the inputs are provided under `Human:` and the model is asked to generate its output after `Assistant:`. In the cell below you see an example of how to control the prompt such that the LLM stays grounded and doesn't answer outside the context.

In [33]:
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

prompt_template = """

Human: Use the following pieces of context to provide a concise answer to the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}

Assistant:"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore_faiss.as_retriever(
        search_type="similarity", search_kwargs={"k": 3}
    ),
    return_source_documents=True,
    chain_type_kwargs={"prompt": PROMPT},
    callbacks=[token_counter]
)

In [34]:
query = "recommend good data scientist candidates?"
result = qa({"query": query})
print_ww(result['result'])

print(f"\n{result['source_documents']}")


Token Counts:
Total: 12571
Embedding: N/A
Prompt: 10615
Generation:1956

 Based on the resumes provided, Alex Johnson seems to be a strong candidate for a data scientist
role. The resume highlights relevant education in data science, coursework in machine learning and
statistics, and professional experience as a data scientist. The skills listed also align well with
common data scientist requirements. Without more information I cannot make a more definitive
recommendation, but Alex Johnson's resume indicates he would likely be a good fit for a data
scientist position.

[Document(page_content='Data Scientist Resume\nContact Information\nName: David Lee\nAddress: 1234 Willow Pass, Datatown, USA\nPhone: (555) 333-2121\nEmail: david.lee@example.com\nLinkedIn: linkedin.com/in/davidlee\nGitHub: github.com/davidlee\nProfessional Summary\nHighly skilled and analytical Data Scientist with experience in machine learning, data mining, and data visualization.\nProven ability to drive business dec

In [35]:
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

prompt_template = """

Human: Use the following pieces of context to provide a concise answer to the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.
Try to match candidate with following criteria
Professional Summary
Experienced Machine Learning Engineer with a strong background in designing, building, and deploying scalable
machine learning models. Adept at data engineering and deploying AI solutions into production environments. Looking
to leverage deep learning expertise to tackle new challenges in a dynamic team setting.
Education
M.S. in Computer Science, Specialization in Machine Learning
Relevant Coursework: Deep Learning, Advanced Machine Learning, Distributed Systems

{context}

Question: {question}

Assistant:"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore_faiss.as_retriever(
        search_type="similarity", search_kwargs={"k": 3}
    ),
    return_source_documents=True,
    chain_type_kwargs={"prompt": PROMPT},
    callbacks=[token_counter]
)

In [36]:
query = "recommend good data machine learning engineer?"
result = qa({"query": query})
print_ww(result['result'])

print(f"\n{result['source_documents']}")


Token Counts:
Total: 13452
Embedding: N/A
Prompt: 11345
Generation:2107

 Based on the provided information, Riley Morgan seems to match the criteria for a machine learning
engineer well. The key factors are:

- Relevant education (M.S. in Computer Science with specialization in ML)

- Work experience as a machine learning engineer at NextGen Robotics, including deploying ML models
to production and collaborating with data scientists.

- Specific ML projects mentioned, like increasing system efficiency by 25% with ML pipelines.

- Familiarity with relevant ML tools and techniques like TensorFlow, Keras, computer vision, and
NLP.

The other candidate Avery Walker also has strong ML credentials, but Riley Morgan's work experience
seems like the closest match for the role based on the criteria provided. Riley would be a good data
machine learning engineer candidate to recommend.

[Document(page_content='Data Visualization: Matplotlib, Seaborn, Tableau\nMachine Learning: Scikit-learn, Ten

In [46]:
query = "recommend good data scientist candidates?"
result = qa({"query": query})
print_ww(result['result'])

print(f"\n{result['source_documents']}")


Token Counts:
Total: 3861
Embedding: N/A
Prompt: 3327
Generation:534

 Based on the resumes provided, Alex Johnson seems to be a good candidate for a data scientist role.
Alex has a B.Sc. in Data Science, relevant coursework and work experience as a Data Scientist at
Tech Solutions Inc. since 2019. The resume highlights Alex's skills in developing and implementing
statistical models, data analysis, visualization, and managing data science projects.

[Document(page_content='Data Scientist Resume\nContact Information\nName: David Lee\nAddress: 1234 Willow Pass, Datatown, USA\nPhone: (555) 333-2121\nEmail: david.lee@example.com\nLinkedIn: linkedin.com/in/davidlee\nGitHub: github.com/davidlee\nProfessional Summary\nHighly skilled and analytical Data Scientist with experience in machine learning, data mining, and data visualization.\nProven ability to drive business decisions by providing actionable insights from data analytics. Proficient in statistical\nsoftware and programming languages

In [25]:
query = "How was Amazon impacted by COVID-19?"

result = qa({"query": query})

print_ww(result['result'])

print(f"\n{result['source_documents']}")

NameError: name 'qa' is not defined

## Conclusion
Congratulations on completing this module on retrieval-augmented generation! This is an important technique that combines the power of large language models with the precision of retrieval methods. By augmenting generation with relevant retrieved examples, the responses we received become more coherent, consistent and grounded. You should feel proud of learning this innovative approach. I'm sure the knowledge you've gained will be very useful for building creative and engaging language generation systems. Well done!

In the above implementation of RAG-based Question Answering we have explored the following concepts and how to implement them using Amazon Bedrock and its LangChain integration.

- Loading documents and generating embeddings to create a vector store
- Retrieving the documents relevant to the question
- Preparing a prompt which goes as input to the LLM
- Presenting an answer in a human-friendly manner

### Take-aways
- Experiment with different Vector Stores
- Leverage various models available under Amazon Bedrock to see alternate outputs
- Explore options such as persistent storage of embeddings and document chunks
- Integration with enterprise data stores
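
As a starting point for the persistence take-away, the sketch below writes embeddings and their chunks to disk and reads them back using only the standard library. LangChain's FAISS wrapper provides `save_local` and `load_local` for the same purpose. The JSON layout, the function names and the sample data here are assumptions for illustration.

```python
import json
import os
import tempfile

def save_index(path, vectors, chunks):
    """Write embeddings and their chunks to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"vectors": vectors, "chunks": chunks}, fh)

def load_index(path):
    """Read embeddings and chunks back from a JSON file written by save_index."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    return data["vectors"], data["chunks"]

# Hypothetical two-chunk index with 3-dimensional embeddings.
vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
chunks = ["first chunk", "second chunk"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.json")
    save_index(path, vectors, chunks)
    restored_vectors, restored_chunks = load_index(path)

print(restored_chunks)  # → ['first chunk', 'second chunk']
```

Persisting the index avoids re-embedding the whole corpus on every run, which saves both time and embedding calls.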

# Thank You