# RAG-based conversation using Amazon Bedrock, SageMaker and Redis
This example demonstrates how to create a RAG-based conversation system using Amazon Bedrock, SageMaker, and Redis. The steps involved are:

- Setting up the environment and installing necessary packages.
- Loading a document from the AWS Machine Learning Blog.
- Splitting the document into smaller chunks.
- Connecting to the Bedrock Embedding endpoint.
- Using Redis as a vector store to store the document embeddings.
- Performing a similarity search in the vector store.
- Creating a chat application using the Bedrock model, Redis vector store, and LangChain library.
- Using the chat application to answer questions based on the loaded document.
- Using Redis as a memory store for the conversation buffer memory to retain context and provide a conversational experience.

## Set up the environment

Install the necessary packages and set up the environment.

In [1]:
%pip install --quiet redis langchain pypdf pyyaml


Note: you may need to restart the kernel to use updated packages.


In [2]:
import logging
import warnings

# Disable warnings and verbose logging
logger = logging.getLogger()
logger.setLevel(logging.ERROR)
warnings.filterwarnings("ignore")

import boto3

region = boto3.session.Session().region_name
bedrock = boto3.client("bedrock")


Next, we retrieve the foundation models from Bedrock. Foundation models are the models available for use in Bedrock. The `list_foundation_models()` function returns the full list, and `get_foundation_model()` returns the details of a single model.


In [3]:
response = bedrock.list_foundation_models()
model = bedrock.get_foundation_model(
    modelIdentifier="amazon.titan-embed-g1-text-02"
)
print(model)


{'ResponseMetadata': {'RequestId': 'b7fa5f65-f1f6-436e-801b-d4785569194e', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Sun, 15 Oct 2023 07:10:19 GMT', 'content-type': 'application/json', 'content-length': '373', 'connection': 'keep-alive', 'x-amzn-requestid': 'b7fa5f65-f1f6-436e-801b-d4785569194e'}, 'RetryAttempts': 0}, 'modelDetails': {'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-g1-text-02', 'modelId': 'amazon.titan-embed-g1-text-02', 'modelName': 'Titan Text Embeddings v2', 'providerName': 'Amazon', 'inputModalities': ['TEXT'], 'outputModalities': ['EMBEDDING'], 'customizationsSupported': [], 'inferenceTypesSupported': ['ON_DEMAND']}}
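
The list returned by `list_foundation_models()` can also be filtered on the client side. The sketch below assumes the response carries a `modelSummaries` list whose entries include `modelId` and `outputModalities`, mirroring the fields printed above. The helper name `embedding_model_ids` and the sample data are illustrative only.

```python
def embedding_model_ids(response: dict) -> list[str]:
    """Return the IDs of models whose output modalities include EMBEDDING."""
    return [
        summary["modelId"]
        for summary in response.get("modelSummaries", [])
        if "EMBEDDING" in summary.get("outputModalities", [])
    ]


# Hand-written sample shaped like a list_foundation_models() response (assumption).
sample_response = {
    "modelSummaries": [
        {"modelId": "amazon.titan-embed-g1-text-02", "outputModalities": ["EMBEDDING"]},
        {"modelId": "anthropic.claude-v2", "outputModalities": ["TEXT"]},
    ]
}

print(embedding_model_ids(sample_response))
```

In the notebook, passing `response` from the cell above instead of `sample_response` should list the embedding models visible to the account.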


To invoke the model with our prompts, we need to create a Bedrock Runtime client. Bedrock Runtime currently supports the `InvokeModel` and `InvokeModelWithResponseStream` actions.

In [4]:
bedrock_runtime = boto3.client(
    service_name="bedrock-runtime", region_name=region
)
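
To make the `InvokeModel` action more concrete, here is a minimal sketch of an embedding call. It assumes the Titan embedding request format (`{"inputText": ...}`) and an `embedding` field in the JSON response. `embed_text` and `FakeRuntime` are hypothetical names, and the fake client stands in for `bedrock_runtime` so the example runs without AWS credentials.

```python
import io
import json


def embed_text(client, text: str, model_id: str = "amazon.titan-embed-g1-text-02") -> list[float]:
    """Call InvokeModel with a Titan-style embedding request and return the vector."""
    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps({"inputText": text}),
        accept="application/json",
        contentType="application/json",
    )
    # The response body is a stream of JSON bytes.
    payload = json.loads(response["body"].read())
    return payload["embedding"]


class FakeRuntime:
    """Offline stand-in for the bedrock-runtime client (hypothetical, for illustration)."""

    def invoke_model(self, modelId, body, accept, contentType):
        text = json.loads(body)["inputText"]
        # Return a toy 2-dimensional "embedding" derived from the input length.
        fake = {"embedding": [float(len(text)), 1.0]}
        return {"body": io.BytesIO(json.dumps(fake).encode("utf-8"))}


print(embed_text(FakeRuntime(), "hello"))
```

With real credentials, `embed_text(bedrock_runtime, "hello")` would be the equivalent call against the client created above.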


## Document Loading

For this example, we use one of the AWS Machine Learning Blog posts, [Announcing New Tools to Help Every Business Embrace Generative AI - by Swami Sivasubramanian](https://aws.amazon.com/blogs/machine-learning/announcing-new-tools-to-help-every-business-embrace-generative-ai/).

In [5]:
from langchain.document_loaders import PyPDFLoader

pdf_file = "data/announcing_new_tools_to_help_every_business_embrace_generative_ai.pdf"

loader = PyPDFLoader(pdf_file)
docs = loader.load()

print(len(docs))


9


As we can see, `PyPDFLoader` loaded the file and split it into 9 documents, one per page. Now, let's print the first document.

In [6]:
print(docs[0].page_content)


AWS Machine Learning BlogAnnouncing New Tools to Help Every Business EmbraceGenerative AIby Swami Sivasubramanian | on 28 SEP 2023 | in Announcements, Artiﬁcial Intelligence, Generative AI |Permalink |  Comments |  ShareFrom startups to enterprises, organizations of all sizes are getting started with generative AI. They want tocapitalize on generative AI and translate the momentum from betas, prototypes, and demos into real-worldproductivity gains and innovations. But what do organizations need to bring generative AI into the enterprise andmake it real? When we talk to customers, they tell us they need security and privacy, scale and price-performance, and most importantly tech that is relevant to their business. We are excited to announce newcapabilities and services today to allow organizations big and small to use generative AI in creative ways, buildingnew applications and improving how they work. At AWS, we are hyper-focused on helping our customers in a fewways:• Making it easy

## Splitting the document

In [7]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1500, chunk_overlap=150
)

splits = text_splitter.split_documents(docs)
print(len(splits))


19
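
To illustrate what `chunk_size` and `chunk_overlap` mean, the sketch below splits a string with a plain sliding window. This is a deliberate simplification and not the algorithm used by `RecursiveCharacterTextSplitter`, which also tries to break on separators such as paragraphs and sentences. The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size chunks where neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last three characters of the previous one, which is how the overlap keeps context that would otherwise be cut at a chunk boundary.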


## Connect to the Bedrock Embedding endpoint

In [8]:
from langchain.embeddings import BedrockEmbeddings

embeddings = BedrockEmbeddings(client=bedrock_runtime)


## Redis as Vector Store
In this example, we use Redis as the vector store. Redis has a [vector similarity search](https://redis.io/docs/interact/search-and-query/search/vectors/) capability which makes it an ideal choice for both development and production.

### Redis Deployment Patterns

- EC2: In this approach, we simply deploy the Redis stack in EC2 and, depending on the workload, utilize persistence and clustering features.
- ECS or EKS: In this approach, we deploy the Redis stack container to a Fargate cluster, which is cost-effective and scalable. The workshop [Solving Data Challenges in Cloud Applications with Redis](https://redislabs.awsworkshop.io/) offers a well-rounded set of examples on using Redis features with ECS and Fargate.
- [Redis Cloud](https://redis.com/redis-enterprise-cloud/overview/)
- Redis on local Docker: This is an easy approach for development and non-production environments. This can be achieved by running the following command:
    ```bash
    # Remove any existing Redis container and create a new one.
    # (Prefix each command with "!" when running it from a notebook cell.)
    docker container rm -f redis-stack
    docker run -d --name redis-stack -p 6379:6379 redis/redis-stack
    ``` 


### Store Embeddings
Next, we create embeddings for the splits and store them in the vector store.

In [9]:
from langchain.vectorstores.redis import Redis

redis_url = "redis://redis:6379"
index_name = "doc_index"

vectordb = Redis.from_documents(
    documents=splits,
    embedding=embeddings,
    redis_url=redis_url,
    index_name=index_name,
)


Now let's test our vector store by performing a similarity search. We will ask a question and request the top 3 (`k=3`) most similar documents.

In [10]:
question = "How many languages are supported for embeddings?"

result = vectordb.similarity_search(question, k=3)

for r in result:
    print(r.page_content)
    print("-" * 100)


Embeddings supports more than 25 languages and a context length of upto 8,192 tokens, making it well suited to work with single words, phrases, or entire documents based on thecustomer’s use case. The model returns output vectors of 1,536 dimensions, giving it a high degree of accuracy,while also optimizing for low-latency, cost-eﬀective results. With new models and capabilities, it’s easy to useyour organization’s data as a strategic asset to customize foundation models and build more diﬀerentiatedexperiences.Third, because the data customers want to use for customization is such valuable IP, they need it to remain secureand private. With security and privacy built in since day one, Amazon Bedrock customers can trust that their dataremains protected. None of the customer’s data is used to train the original base FMs. All data is encrypted at restand in transit. And you can expect the same AWS access controls that you have with any other AWS service.Today, we are excited to build on th
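
Conceptually, a similarity search embeds the question and ranks the stored chunks by how close their vectors are to it. The sketch below shows that ranking with cosine similarity on toy 2-dimensional vectors. It is a brute-force illustration only, not how the Redis index is implemented. The names `cosine_similarity`, `top_k`, and `toy_vectors` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the IDs of the k stored vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy vectors standing in for real chunk embeddings (assumption).
toy_vectors = {
    "chunk-a": [1.0, 0.0],
    "chunk-b": [0.7, 0.7],
    "chunk-c": [0.0, 1.0],
}
print(top_k([1.0, 0.1], toy_vectors, k=2))
# ['chunk-a', 'chunk-b']
```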

## Chat Application
In this section, we will create a chat application using the Bedrock model, Redis vector store, and LangChain library. We'll use the conversation buffer memory and the retrieval chain. The chat application will be able to answer questions based on the document we loaded earlier.

We will also use Redis as our memory store for the conversation buffer memory. This will retain the context and provide a conversational experience.

In [11]:
from langchain.memory.chat_message_histories import RedisChatMessageHistory
import uuid

# Generate a random session id
session_id = str(uuid.uuid4())

message_history = RedisChatMessageHistory(url=redis_url, session_id=session_id)
print(session_id)


260f88fb-394a-4058-8a32-77eee5c6c716


In [12]:
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(
    memory_key="chat_history",
    chat_memory=message_history,
    return_messages=True,
)
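
The idea behind a session-keyed message history can be sketched without Redis. The class below is a toy in-memory stand-in, not the `RedisChatMessageHistory` implementation. `InMemoryHistory` and its methods are hypothetical names, and a dictionary plays the role of the Redis server.

```python
from collections import defaultdict


class InMemoryHistory:
    """Toy stand-in for a Redis-backed chat history, keyed by session ID."""

    def __init__(self):
        self._store = defaultdict(list)

    def add(self, session_id: str, role: str, content: str) -> None:
        """Append one message to the given session."""
        self._store[session_id].append((role, content))

    def messages(self, session_id: str) -> list[tuple[str, str]]:
        """Return a copy of the messages recorded for the session, oldest first."""
        return list(self._store[session_id])


history = InMemoryHistory()
history.add("session-1", "human", "first question")
history.add("session-1", "ai", "first answer")
history.add("session-2", "human", "unrelated question")

print(history.messages("session-1"))
```

Because messages are looked up by session ID, separate conversations do not see each other's history, which is the role the random `session_id` plays in the cells above.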


In [13]:
from langchain.chains import ConversationalRetrievalChain
from langchain.llms.bedrock import Bedrock


llm = Bedrock(model_id="anthropic.claude-v2", client=bedrock_runtime)

conv_chain = ConversationalRetrievalChain.from_llm(
    llm, retriever=vectordb.as_retriever(), memory=memory
)


### Start the conversation
We start our conversation by asking a question. The chain first combines the question with the chat history held in the conversation buffer memory to form a standalone question. The retriever then fetches the most relevant document chunks from the vector store, and the model generates an answer from those chunks. The answer is returned to the user and the exchange is saved to memory.
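
As a rough illustration of how a conversational retrieval chain handles one turn, the sketch below wires three pluggable steps together: condensing the question with the history, retrieving context, and generating an answer. It is a simplified model under stated assumptions, not LangChain's implementation. `answer_turn` and the toy `condense`, `retrieve`, and `generate` functions are hypothetical stand-ins for the LLM and the retriever.

```python
def answer_turn(question, chat_history, condense, retrieve, generate):
    """Run one conversational turn with three pluggable steps."""
    # 1. Rewrite a follow-up into a standalone question using the history.
    standalone = condense(question, chat_history) if chat_history else question
    # 2. Fetch the chunks most relevant to the standalone question.
    context = retrieve(standalone)
    # 3. Ask the model to answer from the retrieved context.
    answer = generate(standalone, context)
    # Record the turn so later questions can refer back to it.
    chat_history.append((question, answer))
    return answer


# Toy stand-ins for the LLM and retriever (hypothetical, for illustration only).
def condense(question, history):
    return f"{question} (follow-up to: {history[-1][0]})"


def retrieve(query):
    return [f"chunk about '{query}'"]


def generate(query, context):
    return f"answer to '{query}' using {len(context)} chunk(s)"


history = []
print(answer_turn("first question", history, condense, retrieve, generate))
print(answer_turn("tell me more", history, condense, retrieve, generate))
```

The second call shows why memory matters: "tell me more" is only answerable once it has been rewritten with the earlier question in view.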

In [14]:
question = "what are the three key features of amazon bedrock?"

result = conv_chain({"question": question})
print(result["answer"])


 Based on the context provided, the three key features of Amazon Bedrock seem to be:

1. Model choice and flexibility - Amazon Bedrock offers a variety of foundation models (FMs) from different providers, allowing customers to choose the models that work best for their use cases.

2. Customization capabilities - Customers can customize the foundation models using their own private data through techniques like fine-tuning and RAIL (Rapid Adaptation with Integrated Learning). This allows them to build more tailored generative AI applications.

3. Security and governance - Amazon Bedrock has security and privacy built-in and offers new governance capabilities like integration with CloudWatch and CloudTrail. This allows customers to use Bedrock securely and monitor usage and activity.


In [15]:
memory.chat_memory.messages


[HumanMessage(content='what are the three key features of amazon bedrock?'),
 AIMessage(content=' Based on the context provided, the three key features of Amazon Bedrock seem to be:\n\n1. Model choice and flexibility - Amazon Bedrock offers a variety of foundation models (FMs) from different providers, allowing customers to choose the models that work best for their use cases.\n\n2. Customization capabilities - Customers can customize the foundation models using their own private data through techniques like fine-tuning and RAIL (Rapid Adaptation with Integrated Learning). This allows them to build more tailored generative AI applications.\n\n3. Security and governance - Amazon Bedrock has security and privacy built-in and offers new governance capabilities like integration with CloudWatch and CloudTrail. This allows customers to use Bedrock securely and monitor usage and activity.')]

Now we ask another question relying on the memory of the previous question(s).

In [16]:
question = "Elaborate the third feature more."

result = conv_chain({"question": question})

print(result["answer"])


 Based on the context provided, some of the key security and governance capabilities of Amazon Bedrock include:

- Amazon Bedrock is now a HIPAA eligible service and can be used in compliance with GDPR, allowing it to be used in regulated industries like healthcare and finance.

- It has new governance capabilities including integration with Amazon CloudWatch to track usage metrics and build customized dashboards. 

- It also has integration with AWS CloudTrail to monitor API activity and troubleshoot issues. 

- Security and privacy are built into Bedrock since day one. Customers' data remains encrypted at rest and in transit. 

- It has the same AWS access controls as other AWS services. 

- None of the customer's data is used to train the original foundation models. 

- Data remains secure and private when customizing foundation models.

So in summary, Amazon Bedrock has robust security and governance capabilities to ensure customer data privacy, monitor usage, comply with regulatio

In [17]:
memory.chat_memory.messages


[HumanMessage(content='what are the three key features of amazon bedrock?'),
 AIMessage(content=' Based on the context provided, the three key features of Amazon Bedrock seem to be:\n\n1. Model choice and flexibility - Amazon Bedrock offers a variety of foundation models (FMs) from different providers, allowing customers to choose the models that work best for their use cases.\n\n2. Customization capabilities - Customers can customize the foundation models using their own private data through techniques like fine-tuning and RAIL (Rapid Adaptation with Integrated Learning). This allows them to build more tailored generative AI applications.\n\n3. Security and governance - Amazon Bedrock has security and privacy built-in and offers new governance capabilities like integration with CloudWatch and CloudTrail. This allows customers to use Bedrock securely and monitor usage and activity.'),
 HumanMessage(content='Elaborate the third feature more.'),
 AIMessage(content=" Based on the context 