# Bedrock Implementation of Medium Analyzer

In this notebook we use Claude (served through Amazon Bedrock) as the LLM and the Titan Embeddings model to build a simple RAG workflow orchestrated by LangChain.

## Setup
You can use any Python environment that has Boto3 access to the Bedrock models.

### Credits
Bedrock OSS RAG Reference: https://github.com/aws-samples/amazon-bedrock-workshop/blob/main/06_OpenSource_examples/01_Langchain_KnowledgeBases_and_RAG_examples/01_qa_w_rag_claude.ipynb

In [None]:
#%pip install "langchain>=0.1.11"
#%pip install pypdf==4.1.0
#%pip install langchain-community faiss-cpu==1.8.0 tiktoken==0.6.0 sqlalchemy==2.0.28

In [None]:
import boto3
import botocore
import langchain
from langchain.embeddings.cache import CacheBackedEmbeddings
from langchain.vectorstores import FAISS
from langchain.storage import LocalFileStore
from langchain.document_loaders import PyPDFDirectoryLoader
from langchain.chains import RetrievalQA

boto3_bedrock = boto3.client('bedrock-runtime')

## Sample Boto3 Inference With Claude V2

In [None]:
import json

model_id = 'anthropic.claude-v2'
accept = "application/json"
contentType = "application/json"

# Claude v2 text completions expect the prompt to start with "\n\nHuman:" and end with "\n\nAssistant:"
prompt_data = """\n\nHuman: Write me a small paragraph saying nice things about me.

A:"""
print(prompt_data)

body = json.dumps({"prompt": prompt_data, "max_tokens_to_sample": 500})
response = boto3_bedrock.invoke_model(
    body=body, modelId=model_id, accept=accept, contentType=contentType
)
response_body = json.loads(response.get("body").read())
print(response_body.get("completion"))
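
The request and response handling above can be factored into two small helpers. The sketch below is a hedged illustration, not part of the original notebook: `build_claude_v2_body` and `read_completion` are hypothetical names, and the response is a local stand-in object so the code runs without AWS credentials. It assumes the same `prompt` / `max_tokens_to_sample` request fields and `completion` response field used in the cell above.

```python
import io
import json


def build_claude_v2_body(user_text, max_tokens=500):
    """Wrap user text in the Human/Assistant turn format and serialize it to JSON."""
    prompt = f"\n\nHuman: {user_text}\n\nAssistant:"
    return json.dumps({"prompt": prompt, "max_tokens_to_sample": max_tokens})


def read_completion(response):
    """Extract the generated text from an invoke_model-style response dict."""
    payload = json.loads(response["body"].read())
    return payload.get("completion", "")


# Stand-in for a real Bedrock response (assumption: the body is a readable byte stream).
fake_response = {"body": io.BytesIO(json.dumps({"completion": " Hello!"}).encode())}

body = build_claude_v2_body("Say hello.", max_tokens=50)
completion = read_completion(fake_response)
print(body)
print(completion)
```

With a real client, `body` would be passed to `boto3_bedrock.invoke_model(...)` and the returned response to `read_completion`.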

## Embeddings & Vector Store Setup

In [None]:
# where our embeddings will be stored
store = LocalFileStore("./cache/")

In [None]:
# instantiate a loader for our data; here it reads every PDF in the directory
loader = PyPDFDirectoryLoader("sagemaker-articles/")

In [None]:
# load_and_split() loads the PDFs and splits them into chunks using the default text splitter
pages = loader.load_and_split()
print(len(pages))
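
To give a feel for what splitting into chunks means, here is a minimal sketch of fixed-size chunking with overlap. It is a deliberate simplification and not LangChain's splitter, which also tries to break on separators such as paragraphs and sentences; `split_text` and its parameters are hypothetical names chosen for this illustration.

```python
def split_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)
```

The overlap means the last character of one chunk is repeated at the start of the next, so text near a chunk boundary still appears with some surrounding context.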

## Chain Creation

We instantiate the LLM and the embeddings model, wrap the embeddings model in a file-backed cache, and build a FAISS vector store from the document chunks. The retrieval chain then queries that vector store.

In [None]:
from langchain.embeddings import BedrockEmbeddings
from langchain.llms.bedrock import Bedrock

# - create the LLM and Embeddings Models
llm = Bedrock(model_id="anthropic.claude-v2", client=boto3_bedrock, model_kwargs={'max_tokens_to_sample':200})
bedrock_embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1", client=boto3_bedrock)

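# wrap the embeddings model with the local file cache (a byte store, not the vector store);
# the namespace keeps cached vectors from different embedding models apart
embedder = CacheBackedEmbeddings.from_bytes_store(
    bedrock_embeddings,
    store,
    namespace=bedrock_embeddings.model_id
)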
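
The idea behind a cache-backed embedder can be sketched in a few lines: hash each text, look the hash up in a store, and call the underlying model only on a miss. The code below is an illustrative stand-in, not LangChain's implementation; `CachedEmbedder` and `toy_embed` are hypothetical, and an in-memory dict replaces the on-disk store.

```python
import hashlib


class CachedEmbedder:
    """Wrap an embedding function so each distinct text is embedded only once."""

    def __init__(self, embed_fn, namespace=""):
        self.embed_fn = embed_fn
        self.namespace = namespace
        self.cache = {}  # stand-in for an on-disk byte store

    def _key(self, text):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return f"{self.namespace}:{digest}"

    def embed_documents(self, texts):
        vectors = []
        for text in texts:
            key = self._key(text)
            if key not in self.cache:
                # Cache miss: call the underlying (possibly paid) model.
                self.cache[key] = self.embed_fn(text)
            vectors.append(self.cache[key])
        return vectors


calls = []


def toy_embed(text):
    """Toy embedding used only for this sketch: [length, number of spaces]."""
    calls.append(text)
    return [float(len(text)), float(text.count(" "))]


embedder_sketch = CachedEmbedder(toy_embed, namespace="toy-model")
first = embedder_sketch.embed_documents(["hello world", "hi"])
second = embedder_sketch.embed_documents(["hello world", "hi", "new text"])
print(len(calls))
```

On the second call only `"new text"` reaches `toy_embed`; the other two vectors come from the cache, which is what saves repeated embedding calls when the notebook is re-run.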

In [None]:
# create vector store, we use FAISS in this case
vector_store = FAISS.from_documents(pages, embedder)
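
Conceptually, a vector store answers "which stored chunks are closest to this query vector?". The sketch below does this by brute force with Euclidean distance in pure Python. It is an illustration under stated assumptions, not how FAISS is implemented: `l2_distance`, `top_k`, and the two-dimensional vectors are made up for the example.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, indexed, k=4):
    """Return the texts of the k entries whose vectors are nearest to query_vec."""
    scored = sorted(indexed, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in scored[:k]]


# Hypothetical (text, vector) pairs standing in for embedded document chunks.
index = [
    ("doc about endpoints", [1.0, 0.0]),
    ("doc about training", [0.0, 1.0]),
    ("doc about serverless", [0.9, 0.1]),
]

nearest = top_k([1.0, 0.0], index, k=2)
print(nearest)
```

A real index avoids comparing the query against every vector one by one in Python, but the result it returns has the same meaning: the stored chunks ranked by distance to the query embedding.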

In [None]:
# this is the entire retrieval system
medium_qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vector_store.as_retriever(),
    return_source_documents=True,
    verbose=True
)
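
At a high level, a retrieval QA chain of this kind fetches the most relevant chunks and places them in the prompt ahead of the question. The following sketch shows that step in isolation. The helper name `build_stuffed_prompt` and the exact template wording are assumptions for illustration and may differ from the prompt LangChain actually sends.

```python
def build_stuffed_prompt(question, documents):
    """Concatenate retrieved chunks into one context block and append the question."""
    context = "\n\n".join(documents)
    return (
        "Use the following pieces of context to answer the question at the end.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Helpful Answer:"
    )


# Hypothetical retrieved chunks.
retrieved = [
    "SageMaker offers real-time endpoints.",
    "SageMaker also offers serverless inference.",
]
stuffed_prompt = build_stuffed_prompt("What hosting options exist?", retrieved)
print(stuffed_prompt)
```

Because every retrieved chunk is pasted into a single prompt, the number and size of chunks must fit within the model's context window.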

In [None]:
# helper to fill the prompt template; a LangChain PromptTemplate would also work
def fill_prompt(template, human_text):
    # Insert human_text after the first 'Human:' placeholder
    filled_prompt = template.replace("Human:", f"Human: {human_text}", 1)
    return filled_prompt

# Template in the Human/Assistant turn format that Claude v2 expects
prompt_data = """\n\nHuman:

A:"""

# sample input
human_input = "You are an incredible friend, always supportive and kind."
result = fill_prompt(prompt_data, human_input)
print(result)

## Sample Inference with RAG and Vanilla Bedrock Model

In [None]:
sample_prompts = ["What does Ram Vegiraju write about?",
                 "What is Amazon SageMaker?",
                 "What is Amazon SageMaker Inference?",
                 "What are the different hosting options for Amazon SageMaker?",
                 "What is Serverless Inference with Amazon SageMaker?",
                 "What's the difference between Multi-Model Endpoints and Multi-Container Endpoints?",
                 "What SDKs can I use to work with Amazon SageMaker?"]

In [None]:
for prompt in sample_prompts:
    print(prompt)
    print()
    print("------------------------------------")
    print("Vanilla Bedrock Response")
    print("------------------------------------")
    print()
    prompt_template = fill_prompt(prompt_data, prompt)
    body = json.dumps({"prompt": prompt_template, "max_tokens_to_sample": 500})
    response = boto3_bedrock.invoke_model(
        body=body, modelId=model_id, accept=accept, contentType=contentType
    )
    response_body = json.loads(response.get("body").read())
    print(response_body.get("completion"))
    print()
    print("------------------------------------")
    print("RAG Enabled Response")
    print("------------------------------------")
    response_rag = medium_qa_chain.invoke({"query": prompt})
    print(response_rag['result'])
    print()
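
Since the chain was created with `return_source_documents=True`, each RAG response also carries the chunks it was grounded on. The sketch below lists them in a compact form. It is a hedged example: `summarize_sources` is a hypothetical helper, the response is a local stand-in, and the `source` and `page` metadata keys are assumed to be what the PDF loader attaches to each chunk.

```python
from types import SimpleNamespace


def summarize_sources(rag_response):
    """Return a de-duplicated list of 'file (page N)' labels for the retrieved chunks."""
    seen = []
    for doc in rag_response.get("source_documents", []):
        source = doc.metadata.get("source", "unknown")
        page = doc.metadata.get("page")
        label = source if page is None else f"{source} (page {page})"
        if label not in seen:
            seen.append(label)
    return seen


# Stand-in for a chain response so the sketch runs without Bedrock access.
fake_response = {
    "result": "...",
    "source_documents": [
        SimpleNamespace(metadata={"source": "a.pdf", "page": 0}),
        SimpleNamespace(metadata={"source": "a.pdf", "page": 0}),
        SimpleNamespace(metadata={"source": "b.pdf"}),
    ],
}

sources = summarize_sources(fake_response)
print(sources)
```

Inside the loop above, calling `summarize_sources(response_rag)` after printing the result would show which articles each answer drew on.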