# Module 5 - In-context Q&A with Retrieval Augmented Generation (RAG)
____

<div class="alert alert-block alert-info"> 
    <b>NOTE:</b> You will need to use a Jupyter Kernel with Python 3.9 or above to use this notebook. If you are in Amazon SageMaker Studio, you can use the "Data Science 3.0" image.
</div>

In this notebook we will walk through Q&A with a document. First we extract text from the document using Amazon Textract, then we split the text into chunks and store them in a vector DB, and finally we perform Q&A with an Anthropic Claude model via Amazon Bedrock to get precise answers from the model. Later on, we will also implement a chat application with chat history to chat with documents.

In [None]:
import json
import os
import sys
import sagemaker
import boto3

role = sagemaker.get_execution_role()
data_bucket = sagemaker.Session().default_bucket()
bedrock = boto3.client('bedrock-runtime')
br = boto3.client('bedrock')
s3 = boto3.client("s3")
print(f"SageMaker bucket is {data_bucket}, and SageMaker Execution Role is {role}")

In [None]:
MODEL_ID = "anthropic.claude-instant-v1"

---
# Perform Common sense reasoning and QA on a document

In this section, we will perform common sense reasoning and Q&A on a document. This section does the following:

- Generates text from documents and stores it in S3 in plaintext format
- Generates embeddings from the text
- Uses an in-memory vector database to store the embeddings. In this case we will use [ChromaDB](https://python.langchain.com/docs/integrations/vectorstores/chroma).
- Performs similarity search on the in-memory vector db to find pieces of text that are relevant to the question asked by the user
- Generates the context for the LLM using the search results
- Gives the model the context and the original question asked
- Gets the answer back from the LLM
- Profit

> _"Wait but that's a lot of steps just for getting an answer back? Why?"_

We would love to explain and dive deeper into why, but here's a paper that does a better job of explaining the why and the how: https://arxiv.org/pdf/2005.11401.pdf. In short, LLMs know too much, _sometimes a bit too much, so that a model may get confused and wander into the proverbial forest of its own world knowledge and start gathering firewood, when it was actually asked to go pick some fruit_. To solve this problem, and to get accurate/factual answers, we use this method of Retrieval-Augmented Generation (aka RAG) to give the LLM a bit more _context_ to work with, so that it gives us the desired output (like a fruit basket in our example, so that it knows it's only supposed to pick fruits).

As a first step, we read a file (document) with Amazon Textract, using the LangChain Textract Document Loader.

In [None]:
from IPython.display import IFrame

qa_document_path=f"s3://{data_bucket}/textract-linearized-output/uploads/health_plan"

IFrame("./sample-docs/health_plan.pdf", width=600, height=800)

Let's look at the extracted text from the S3 location where we extracted all the documents.

In [None]:
from read_doc_from_s3 import read_document
from IPython.display import display_markdown

document = read_document(doc_path=qa_document_path)

full_text = ""
for index,page in enumerate(document):
    full_text += page.strip() + "\n\n" 

# display_markdown(full_text, raw=True)
print(full_text)

Now that we have extracted the document, we split it into smaller chunks. This is required because we may have a large multi-page document and our LLMs have token limits. It also ensures that we only pick up the relevant parts of the document to build the context instead of full page texts. These chunks will then be loaded into the vector DB for performing similarity search in the subsequent steps.

However, before we store the document in the vector DB, we have to generate embeddings from the text. We will use a Hugging Face Sentence Transformers embedding model for that purpose. Let's start by splitting the document into smaller chunks. We will use two levels of text splitting here:

1. Since our extracted text is already markdown formatted, we will use LangChain's [MarkdownHeaderTextSplitter](https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/markdown_header_metadata). This splitter is built specially for markdown text and does chunking based on markdown headers (titles and subtitles marked by `#` and/or `##`). This helps keep the text chunks related to the header of the paragraphs.
2. Next, we take the markdown split text and further chunk it using LangChain's [RecursiveCharacterTextSplitter](https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/recursive_text_splitter). This splitter tries to split on a list of characters in order until the chunks are small enough. The default list of characters is `["\n\n", "\n", " ", ""]`. This has the effect of trying to keep all paragraphs (and then sentences, and then words) together as long as possible, as those would generically seem to be the strongest semantically related pieces of text.
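
To make the second splitting step more concrete, here is a minimal pure-Python sketch of the "try coarse separators first, fall back to finer ones" idea. It is a simplified illustration and not LangChain's actual implementation: the function name `recursive_split` is hypothetical, and chunk overlap is left out on purpose.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text into pieces of at most chunk_size characters.

    Simplified sketch: pieces are merged greedily on the current separator,
    and only pieces that are still too long are split with the next,
    finer separator. No chunk overlap is produced.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep growing the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # This piece alone is too long, so retry with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


sample = (
    "Plan overview\n\n"
    "The deductible is 500 dollars per person.\n\n"
    "Pharmacy costs are capped."
)
for chunk in recursive_split(sample, chunk_size=30):
    print(repr(chunk))
```

Paragraphs that fit within `chunk_size` stay whole, while the long middle paragraph is broken on spaces. The sample text is made up for illustration and is not taken from the health plan document.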


In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter, MarkdownHeaderTextSplitter

headers_to_split_on = [
    ("#", "Title"),
    ("##", "Subtitile")
]

markdown_splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split_on)
md_header_splits = markdown_splitter.split_text(full_text)

text_splitter = RecursiveCharacterTextSplitter(chunk_size=400,
                                               chunk_overlap=50)
texts = text_splitter.split_documents(md_header_splits)
for index, text in enumerate(texts):
    print(f"==== Chunk {index+1} ====")
    print(text)
    print("\n")



We have split the document into smaller chunks. We will now do two things:

- Generate embeddings of these chunks
- Store these embeddings into a vector database



## Vector database

This vector database is going to store the embeddings that we generate. This notebook uses ChromaDB and will be transient and in memory. ChromaDB is a vector store that allows developers to quickly search for embeddings of multimedia documents that are similar to each other. It solves limitations of traditional query search engines that are optimized for hash-based searches, and provides more scalable similarity search functions. The VectorStore APIs that use ChromaDB within LangChain are available [here](https://python.langchain.com/docs/integrations/vectorstores/chroma). 

We will use Hugging Face Sentence Transformer embedding model to generate the embeddings and _Cosine similarity_ as a metric for similarity between two pieces of texts.

> **_Cosine similarity_** is a metric used to measure how similar two vectors are irrespective of their size. Mathematically, it measures the cosine of the angle between two vectors projected in a multi-dimensional space. The cosine similarity is particularly used in positive space, where the outcome is neatly bounded in $[0,1]$. 
>
>The formula to calculate the cosine similarity `S` between two vectors `A` and `B` is:
>
>$$S(A, B) = \frac{A \cdot B}{||A|| \, ||B||}$$
>
>Where:
>- $A \cdot B$ is the dot product of the vectors $A$ and $B$
>- $||A||$ and $||B||$ are the norms (or magnitudes) of vectors $A$ and $B$, respectively
>
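>In text analysis, cosine similarity is often used to measure the similarity between two documents. In classical approaches it is calculated using the _term frequency-inverse document frequency_ (TF-IDF) weights of the terms in the documents; in this notebook we calculate it on the embedding vectors of the text chunks instead.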
>
>When the cosine similarity is close to 1, the two vectors point in nearly the same direction and are very similar to each other. When it is close to 0, the vectors are nearly orthogonal and therefore dissimilar. If the cosine similarity is exactly 1, the vectors have identical direction (regardless of magnitude), and if it's -1, they point in exactly opposite directions.
>
>This metric is widely used in information retrieval and text mining to assess the similarity between documents or the relevance of documents to a query.
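
The formula above can be sketched in a few lines of plain Python. This is only an illustration of the metric itself; the helper name `cosine_similarity` is our own, and the vector database computes the equivalent internally on the embedding vectors.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1, 0], [2, 0]))   # same direction, different magnitude
print(cosine_similarity([1, 0], [0, 1]))   # orthogonal vectors
print(cosine_similarity([1, 0], [-1, 0]))  # opposite directions
```

Note that the first pair scores the same as two identical vectors would, because only the direction matters and not the magnitude.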

In [None]:
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

encode_kwargs = {'normalize_embeddings': True}
embeddings = HuggingFaceEmbeddings(encode_kwargs=encode_kwargs)

# we initialize the ChromaDB collection with cosine similarity
vector_db = Chroma.from_documents(documents=texts, 
                                  embedding=embeddings, 
                                  collection_metadata={"hnsw:space": "cosine"})

<div class="alert alert-block alert-info"> 
    <b>NOTE:</b> Since we are loading the Chroma Vector DB in memory, it will occupy the SageMaker Studio instance's memory, so you may want to free up memory from time to time. To do that, uncomment the line in the code cell below and execute it.
</div>

<div class="alert alert-block alert-warning"> 
    <b>CAUTION:</b> The code cell below will delete the vector db index/collection.
</div>

In [None]:
#vector_db.delete_collection()

We have loaded our vector db with the document. Now let's run a query.

In [None]:
query = "What is the annual deductible per person?"
docs = vector_db.similarity_search(query)

print(f"=========Following are the text chunks relevant to the question - '{query}'=========\n")
for doc in docs:
    print(f"Text chunk: {doc.page_content}")
    print("\n")

The query returns the chunks from the document that are most similar to the query; by default it returns the top 4 most similar chunks. Let's see how to return just the top 3, along with their similarity scores.

In [None]:
docs = vector_db.similarity_search_with_score(query, k = 3)

print(f"=========Following are the text chunks (with similarity scores) relevant to the question - '{query}'=========\n")
for doc in docs:
    print(f"Text chunk: {doc[0].page_content}")    
    print(f"Similarity Score: {1 - doc[1]}") # since cosine_distance = (1 - cosine_similarity)
    print("\n")
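
Conceptually, a top-k similarity search ranks every stored chunk by its distance to the query vector and keeps the k closest ones. The sketch below does this by brute force on toy 2-D vectors. The names `cosine_distance`, `top_k_with_scores` and `toy_index` are hypothetical, and a real vector store such as Chroma uses an approximate index rather than a full scan.

```python
import heapq
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


def top_k_with_scores(query_vec, indexed_chunks, k=4):
    """Return the k (chunk, distance) pairs nearest to query_vec, closest first."""
    scored = [(chunk, cosine_distance(query_vec, vec)) for chunk, vec in indexed_chunks]
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Toy 2-D "embeddings"; real embedding vectors have hundreds of dimensions.
toy_index = [
    ("deductible chunk", [0.9, 0.1]),
    ("pharmacy chunk", [0.1, 0.9]),
    ("administrator chunk", [0.7, 0.7]),
]

for chunk, distance in top_k_with_scores([1.0, 0.0], toy_index, k=2):
    # Convert the distance back to a similarity, as in the cell above.
    print(f"{chunk}: similarity={1 - distance:.3f}")
```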

## Vector store-backed retriever
---

According to LangChain documentation-

>A vector store retriever is a retriever that uses a vector store to retrieve documents. It is a lightweight wrapper around the vector store class to make it conform to the retriever interface. It uses the search methods implemented by a vector store, like similarity search and MMR, to query the texts in the vector store.

Wrapping our vector db in a retriever is going to be useful when we use it in the Q&A chain for our chatbot in subsequent sections. But first let's take a look at how it works. The functionality is pretty similar to before (i.e. querying), with a slightly different interface.

We first define a retriever with search type `similarity_score_threshold`; the other option is `mmr` (maximal marginal relevance). Note that the available `search_type` values depend on which vector DB you are using; some vector DBs may not support `mmr`.

>similarity_score_threshold considers the cosine similarity of the reference text vs the embeddings.

We also define how many top results to return, in this case 5. Finally we query the retriever using `get_relevant_documents` by passing in the query.
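
As a rough mental model of what a score threshold does, the sketch below converts cosine distances into relevance scores and keeps at most `k` chunks whose relevance reaches the threshold. This is our own simplified illustration, assuming relevance is computed as `1 - cosine_distance`; the function name `filter_by_relevance` and the sample distances are hypothetical.

```python
def filter_by_relevance(results, score_threshold, k):
    """Keep at most k chunks whose relevance (1 - distance) meets the threshold.

    results is a list of (chunk, cosine_distance) pairs.
    Returns (chunk, relevance) pairs, most relevant first.
    """
    scored = [(chunk, 1.0 - distance) for chunk, distance in results]
    kept = [(chunk, score) for chunk, score in scored if score >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]


# Made-up distances for illustration only.
search_results = [("chunk a", 0.10), ("chunk b", 0.60), ("chunk c", 0.45)]
print(filter_by_relevance(search_results, score_threshold=0.5, k=5))

# An unrelated query yields only large distances, so nothing passes the threshold.
print(filter_by_relevance([("chunk a", 0.9), ("chunk b", 0.8)], score_threshold=0.5, k=5))
```

This is why an off-topic question can return an empty list of documents even though `k` is 5.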


In [None]:
import warnings
warnings.filterwarnings('ignore')

query = "What is the total pharmacy out-of-pocket?"

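# At the top we initialized the Chroma DB with the cosine relevance function.
# Chroma returns a cosine distance, where cosine_distance = 1 - cosine_similarity,
# and the retriever turns it back into a relevance score (1 - cosine_distance).
# score_threshold is the minimum relevance score: 0.5 keeps chunks whose cosine distance is at most 0.5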

retriever = vector_db.as_retriever(search_type='similarity_score_threshold', search_kwargs={'score_threshold': 0.5, "k": 5})
relevant_docs = retriever.get_relevant_documents(query)   
relevant_docs

for doc in relevant_docs:
    if 'Title' in doc.metadata:
        print(f"Title: {doc.metadata['Title']}")
    if 'Subtitile' in doc.metadata:
        print(f"Subtitile: {doc.metadata['Subtitile']}")
    print(f"Text chunk: {doc.page_content}")    
    print("\n")

In [None]:
import warnings
warnings.filterwarnings('ignore')

query = "What is the color of the sky?"

retriever = vector_db.as_retriever(search_type='similarity_score_threshold', search_kwargs={'score_threshold': 0.5, "k": 5})
relevant_docs_u = retriever.get_relevant_documents(query)   
relevant_docs_u

## Build context from retrieved documents
---

We now have the relevant pieces of text that "contain" the answer to our question, but we are not quite there yet. We will use a technique that we used earlier: build a context and ask the question to the Claude model. In this case, we will use the text chunks we retrieved from the vector db to create the context by simply concatenating them.

In [None]:
full_context = str()
for doc in relevant_docs:
    full_context += doc.page_content+" "
    
print(full_context.strip(".").strip())

The similarity search query gave us a good output, but we want some more key details out of it. Let's use an LLM to ask this question, but this time using the context that we created above.

In [None]:
from langchain.llms import Bedrock
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

template = """

Answer the question as truthfully as possible strictly using only the provided text, and if the answer is not contained within the text, say "I don't know". Skip any preamble text and reasoning and give just the answer.

<text>{document}</text>
<question>{question}</question>
<answer>"""

prompt = PromptTemplate(template=template, input_variables=["document","question"])
bedrock_llm = Bedrock(client=bedrock, model_id=MODEL_ID, model_kwargs={'temperature':0})


llm_chain = LLMChain(prompt=prompt, llm=bedrock_llm)
answer = llm_chain.run(document=full_context, question="What is the per-person pharmacy out-of-pocket?")
print(answer.strip())

Now let's run it with a different question

In [None]:
answer = llm_chain.run(document=full_context, question="Who is the administrator for this plan?")
print(answer.strip())

The model doesn't know the answer because our context in `full_context` has no information about the administrator of the plan, and we asked the model to strictly answer from within the provided context. This means we will have to run a similarity search on the Vector database again using our new question, create the full context again, and then ask the question. Thankfully, LangChain makes it easy for us and we will see how.

### Performing Q&A with RAG with `RetrievalQA`
---

For this purpose, we will first define a question, and then generate embeddings from it. Once we have that, we can perform similarity search on the vector database to find relevant pieces of information from the document. These relevant pieces of information will then be passed on to the model so that it can answer the question. We will use LangChain's `RetrievalQA` chain to perform Q&A with the model. The chain takes care of prompt creation and all the context generation, with help from the vector database.

NOTE: In order to use the `RetrievalQA` from LangChain, your prompt template must have the two variables `context` and `question`. Using any other variable names will cause an error.

In [None]:
from langchain.chains import RetrievalQA
from langchain.llms import Bedrock
from langchain.prompts import PromptTemplate

retriever = vector_db.as_retriever(search_type='similarity_score_threshold', search_kwargs={'score_threshold': 0.5, "k": 3})

template = """

Answer the question as truthfully as possible strictly using only the provided text, and if the answer is not contained within the text, say "I don't know". Skip any preamble text and reasoning and give just the answer.

<text>{context}</text>
<question>{question}</question>
<answer>"""

# define the prompt template
qa_prompt = PromptTemplate(template=template, input_variables=["context","question"])

# initialize the LLM
bedrock_llm = Bedrock(client=bedrock, model_id=MODEL_ID, model_kwargs={'temperature':0})

chain_type_kwargs = { "prompt": qa_prompt, "verbose": False } # change verbose to True if you need to see what's happening
qa = RetrievalQA.from_chain_type(
    llm=bedrock_llm, 
    chain_type="stuff", 
    retriever=retriever,
    chain_type_kwargs=chain_type_kwargs,
    return_source_documents=False,  # Change this to True if you want to see the sources being used by the LLM to answer the question
    verbose=False # change verbose to True if you need to see what's happening
)

question="Who is the administrator for this plan?"

result = qa(question)
print("\n============ Answer ============\n")
print(result["result"].strip())

if "source_documents" in result:
    print("\n============ Sources ============\n")
    for doc in result["source_documents"]:
        print(f"{doc.page_content}")
        print("\n")

 <div class="alert alert-block alert-warning"> 
    <b>NOTE:</b> Change <code>return_source_documents</code> to <code>True</code> in the code cell above in the <code>RetrievalQA.from_chain_type()</code> call if you want to see the sources (text chunks) being used by the LLM to answer the question
</div>

Perfect! Our model can now precisely answer the question. But how did it work?

- First, the question text was taken and its embedding was generated using the Hugging Face embedding model. This all happened inside the `retriever` that we defined earlier with `retriever = vector_db.as_retriever(search_type='similarity_score_threshold', search_kwargs={'score_threshold': 0.5, "k": 3})`. Our `vector_db` is a Chroma object that was initialized with the Hugging Face embedding model.
- Next, the `RetrievalQA` chain runs a similarity search with the generated embeddings (from the question) to find relevant pieces of text that are similar to the question we are trying to get an answer for.
- Then the chain builds the full context using the returned chunks and generates the full prompt using the `qa_prompt` template we provided.
- Finally, the chain invokes the model to get the response.
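
The context-building and prompt-generation steps can be pictured with a small sketch. It mimics the idea behind the `"stuff"` chain type (put all retrieved chunks into one prompt) in plain Python. The names `QA_TEMPLATE` and `build_stuffed_prompt` are hypothetical, the template is a shortened version of the one used above, and the sample chunks are placeholders.

```python
QA_TEMPLATE = (
    "Answer the question strictly using only the provided text, and if the "
    "answer is not contained within the text, say \"I don't know\".\n\n"
    "<text>{context}</text>\n"
    "<question>{question}</question>\n"
    "<answer>"
)


def build_stuffed_prompt(chunks, question, template=QA_TEMPLATE, separator="\n\n"):
    """Join all retrieved chunks into one context and fill the prompt template."""
    context = separator.join(chunk.strip() for chunk in chunks)
    return template.format(context=context, question=question)


# Placeholder chunks standing in for the retriever output.
retrieved_chunks = ["First retrieved chunk. ", " Second retrieved chunk."]
print(build_stuffed_prompt(retrieved_chunks, "Who is the administrator for this plan?"))
```

The resulting string is what gets sent to the model. Because every chunk goes into a single prompt, the number and size of retrieved chunks have to fit within the model's token limit.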

## Chat with your document
---

We will now create a simple chat application to chat with our document. This application will not only perform in-context Q&A, but will also be able to answer questions based on chat history. For the chatbot we need `context management, history, vector stores, and many other things`. We will start with a `ConversationalRetrievalChain`.

This chain uses conversation memory together with a retrieval QA chain, and allows passing in chat history that can be used for follow-up questions. Source: https://python.langchain.com/en/latest/modules/chains/index_examples/chat_vector_db.html

_We will use Gradio to quickly spin up our chat interface. So we will install Gradio next. Then we will define our Conversation chain and plug that into the chat application._

In [None]:
!pip install -q --disable-pip-version-check --root-user-action=ignore gradio --upgrade

We will also initialize our Claude model with a low `temperature` for less creative responses (0 for the model that rephrases questions and 0.3 for the model that answers them), and some stop sequences so that the model knows when to stop generating tokens. We initialize two instances of the LLM: one is used to rephrase the question by looking at the chat history (`bedrock_llm_condense`), and the other is used to answer questions using the RAG generated context (`bedrock_llm`). You can potentially use the same LLM instance for both, but this demonstrates that you can use two different models for your conversational interface to perform different tasks.

In [None]:

bedrock_llm_condense = Bedrock(client=bedrock, 
                      model_id=MODEL_ID, 
                      model_kwargs={"temperature": 0,"stop_sequences": ["\n\nHuman:","</standalone_question>"]})

bedrock_llm = Bedrock(client=bedrock, 
                      model_id=MODEL_ID, 
                      model_kwargs={"temperature": 0.3,"stop_sequences": ["\n\nHuman:","</answer>"]})



To build our chat application, we will use a built-in LangChain chain called `ConversationalRetrievalChain`. This chain allows us to build a conversational interface that is capable of retaining chat history and performing RAG on our vector DB retriever, without us having to code each of those steps individually. The chat history is provided to the model as additional context, so that the model can recall the conversation and enrich its responses to the current question (with the help of the additional context retrieved by the `retriever`).
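
To show roughly which steps the chain saves us from coding, here is a minimal sketch of one chat turn: condense the follow-up question, retrieve chunks, then answer. It reflects our understanding of the flow and is not the chain's actual implementation. All function names are hypothetical, and the `fake_*` functions are stand-ins for the two LLMs and the retriever.

```python
def conversational_rag_turn(question, chat_history, condense_fn, retrieve_fn, answer_fn):
    """Run one chat turn: condense -> retrieve -> answer.

    chat_history is a list of (human, ai) tuples.
    condense_fn, retrieve_fn and answer_fn are stand-ins for the condensing
    LLM, the vector store retriever and the answering LLM.
    """
    # With no history there is nothing to resolve, so the question is used as-is.
    standalone = condense_fn(question, chat_history) if chat_history else question
    chunks = retrieve_fn(standalone)
    answer = answer_fn(standalone, chunks, chat_history)
    return {"standalone_question": standalone, "answer": answer}


# Stand-in implementations so the sketch runs without any model or database.
def fake_condense(question, history):
    return f"{question} (follow-up to: {history[-1][0]})"


def fake_retrieve(standalone_question):
    return [f"chunk matching '{standalone_question}'"]


def fake_answer(standalone_question, chunks, history):
    return f"answer built from {len(chunks)} chunk(s)"


history = []
for q in ["What is the annual deductible per person?", "And for a family?"]:
    turn = conversational_rag_turn(q, history, fake_condense, fake_retrieve, fake_answer)
    history.append((q, turn["answer"]))
    print(turn["standalone_question"], "->", turn["answer"])
```

The second question only makes sense together with the first one, which is why the rephrasing step runs before retrieval.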

In [None]:
from langchain.prompts import PromptTemplate
from langchain.chains import ConversationalRetrievalChain
import warnings
warnings.filterwarnings('ignore')

def get_chat_history(inputs) -> str:
    res = []
    for human, ai in inputs:
        res.append(f"Human:{human}\nAssistant:{ai}")
    return "\n".join(res)

def create_prompt_template():
    _template = """
    
Given the following chat history and a follow up question, in its original language, rephrase the follow up question to be a standalone question but don't change its meaning. 
Skip the preamble and just get to the question.

<chat_history>    
{chat_history}
</chat_history>    

<follow_up_question>
{question}
</follow_up_question>

<standalone_question>
"""
    conversation_prompt = PromptTemplate.from_template(_template)
    return conversation_prompt

template = """

<human>
Answer the question as truthfully as possible strictly using only the provided document and the human-ai chat history, in its original language, 
and if the answer is not contained within the document, say "I don't know". Skip any preamble text and reasoning and give just the 
answer. If the user greets you, just greet them back. Always respond in English.
</human>

<document>
{context}

{chat_history}
</document>

<question>
{question}
</question>

<answer>
"""

# define the prompt template
qa_prompt = PromptTemplate(template=template, input_variables=["context","question","chat_history"])

retriever = vector_db.as_retriever(search_type='similarity_score_threshold', search_kwargs={'score_threshold': 0.4, "k": 5})


qa = ConversationalRetrievalChain.from_llm(llm=bedrock_llm, 
                                           retriever=retriever, 
                                           condense_question_prompt=create_prompt_template(),
                                           condense_question_llm = bedrock_llm_condense,
                                           combine_docs_chain_kwargs={"prompt": qa_prompt},
                                           get_chat_history=get_chat_history,
                                           verbose=False    # set this to True to see logs
                                          )

questions = [
    "Hi, my name is John Doe, I have a family of 4 members.",
    "Who is the plan administrator for this plan?",
    "What is the annual deductible per person?",
    "What last name should I use in the form and how much total for all of my family member would it cost?"
]
chat_history = []

for question in questions:
    result = qa({"question": question, "chat_history":chat_history})
    chat_history.append((question, result["answer"]))
    print(f"-> **Question**: {question} \n")
    print(f"**Answer**: {result['answer'].strip()} \n")

We just had an automated chat session with a bunch of pre-determined questions. Notice that in the fourth question the model is able to recall the name, since it has access to the chat history. Keep in mind that as the chat session goes on, the chat history gets bigger and bigger. In such cases, it is important to limit how far back you want to remember the chat so that you don't run into token limits and slower responses.
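
One simple way to limit the history is to keep only the most recent turns, and to drop older ones until a size budget is met. The sketch below is a minimal illustration under those assumptions. The name `trim_chat_history` and the default limits are hypothetical, and it counts characters as a crude proxy for tokens.

```python
def trim_chat_history(chat_history, max_turns=5, max_chars=2000):
    """Return the most recent turns that fit within both limits.

    chat_history is a list of (human, ai) tuples; the input list is not modified.
    """
    recent = list(chat_history[-max_turns:]) if max_turns > 0 else []
    # Drop the oldest remaining turn until the character budget is respected.
    while recent and sum(len(human) + len(ai) for human, ai in recent) > max_chars:
        recent.pop(0)
    return recent


sample_history = [
    ("Hi, my name is John Doe.", "Hello John!"),
    ("Who is the plan administrator?", "I don't know"),
    ("What is the annual deductible?", "I don't know"),
]
print(trim_chat_history(sample_history, max_turns=2))
```

You could then pass `trim_chat_history(chat_history)` instead of `chat_history` when calling the chain. The trade-off is that facts mentioned early in the conversation, such as the user's name in the first turn, are forgotten once that turn is trimmed.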

## The Chat App with Gradio
---

Next we will build a simple chat app using Gradio and the same method we used above using `ConversationalRetrievalChain` and our vector database as a retriever. Note that our vector database is currently loaded with only one document. But you can imagine that you could have any number of documents loaded into the vector database.

Once you run the following code cell, here are some questions you can ask in the chat interface-

- Who is the plan Administrator?
- Who is the third party administrator?
- What is the per-person deductible?
- What is ERISA?
- Do you remember my name?       ---> Test if the bot remembers your name
- Based on your previous answers, who are the primary and the third party administrators of the plan? ---> Test chat history
- What is Co-pay?
- What is a deductible?
- What is the co-pay maximum for a family?  ---> Info not in the document
- What is the co-pay maximum for a person?  ---> Info not in the document
- What is the deductible maximum for a family?
- What is the maximum out of pocket for pharmacy for a person?
- What is the maximum out of pocket for pharmacy for a family?


In [None]:
import random
import gradio as gr
from langchain.prompts import PromptTemplate
from langchain.chains import ConversationalRetrievalChain
import warnings
warnings.filterwarnings('ignore')

def get_chat_history(inputs) -> str:
    res = []
    for human, ai in inputs:
        res.append(f"Human:{human}\nAssistant:{ai}")
    return "\n".join(res)

def create_prompt_template():
    _template = """
    
Given the following chat history and a follow up question, in its original language, rephrase the follow up question to be a standalone question but don't change its meaning. 
Skip the preamble and just get to the question.

<chat_history>    
{chat_history}
</chat_history>    

<follow_up_question>
{question}
</follow_up_question>

<standalone_question>
"""
    conversation_prompt = PromptTemplate.from_template(_template)
    return conversation_prompt

template = """

<human>
Answer the question as truthfully as possible strictly using only the provided document and the human-ai chat history, in its original language, 
and if the answer is not contained within the document, say "I don't know". Skip any preamble text and reasoning and give just the 
answer. If the user greets you, just greet them back. Always respond in English.
</human>

<document>
{context}

{chat_history}
</document>

<question>
{question}
</question>

<answer>
"""

# define the prompt template
qa_prompt = PromptTemplate(template=template, input_variables=["context","question","chat_history"])

retriever = vector_db.as_retriever(search_type='similarity_score_threshold', search_kwargs={'score_threshold': 0.4, "k": 5})


qa = ConversationalRetrievalChain.from_llm(llm=bedrock_llm, 
                                           retriever=retriever, 
                                           condense_question_prompt=create_prompt_template(),
                                           condense_question_llm = bedrock_llm_condense,
                                           combine_docs_chain_kwargs={"prompt": qa_prompt},
                                           get_chat_history=get_chat_history,
                                           verbose=False    # set this to True to see logs
                                          )
chat_history = []

def qa_fn(message, history):
    result = qa({"question": message, "chat_history":chat_history})
    chat_history.append((message, result["answer"]))
    return result['answer'].strip()

gr.ChatInterface(qa_fn).launch()

## Cleanup
---

Let's clean up the documents we uploaded to S3 earlier in Module 1 and also delete the vector db collection.

In [None]:
vector_db.delete_collection()

In [None]:
!aws s3 rm s3://{data_bucket} --recursive

Let's shut down the Kernel in this notebook. Follow the steps below to shut down the Notebook kernel.

- Click on the "_Kernel_" menu on the top
- Click on the "_Shut Down Kernel_" option

# Conclusion
---

Thanks for joining us in this workshop! In this workshop -

1. You learnt about the various API calls of Amazon Textract
2. You extracted layout-linearized text as well as plain text from documents using Amazon Textract, and compared the two
3. You bulk processed a large number of files by uploading them into an S3 bucket
4. You reviewed the AWS Step Functions workflow to see how the documents were being processed.
5. You did a number of exercises with Amazon Bedrock and generative AI (LLM), with Anthropic Claude Instant v1 model such as
   - Single and multi-page document classification
   - Single page and multi-page summarization
   - Structured data extraction using templates, and Structured output parser with LangChain
   - Performed Q&A on table data from documents, and performed table self-querying with the LLM and LangChain
   - Finally, you performed document Q&A with RAG, and we built a chat-bot to chat with our documents

For more resources on intelligent document processing and generative AI visit the links below-

1. [Document processing at scale with IDP CDK constructs, samples](https://github.com/aws-solutions-library-samples/guidance-for-low-code-intelligent-document-processing-on-aws)
2. [Guidance for Low Code Intelligent Document Processing on AWS](https://aws.amazon.com/solutions/guidance/low-code-intelligent-document-processing-on-aws/)
3. [Intelligent Document Processing with AWS AI Services workshop](https://catalog.us-east-1.prod.workshops.aws/workshops/c2af04b2-54ab-4b3d-be73-c7dd39074b20/en-US)
4. [New Tools for Building with Generative AI on AWS.](https://aws.amazon.com/blogs/machine-learning/intelligent-document-processing-with-amazon-textract-amazon-bedrock-and-langchain/?utm_content=bufferfda52&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer#:~:text=New%20Tools%20for%20Building%20with%20Generative%20AI%20on%20AWS.)
5. Read the blog - [Intelligent document processing with Amazon Textract, Amazon Bedrock, and LangChain](https://aws.amazon.com/blogs/machine-learning/intelligent-document-processing-with-amazon-textract-amazon-bedrock-and-langchain/?utm_content=bufferfda52&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer)

## Don't forget to complete the session survey in the mobile app