# Text Generation with Anthropic Claude 3, Amazon Bedrock and Llama-index

## Introduction

In this notebook we show how to use Elasticsearch, Amazon Bedrock, Anthropic Claude 3, and llama-index to build a Retrieval Augmented Generation (RAG) solution.


#### Use case

To demonstrate the RAG capability, let's take the use case of an AI Assistant that can help answer questions from a personal document. 


#### Persona
You are Bob, an Application Developer at Anycompany. Anycompany is experiencing an overwhelming number of customer queries and has built a secure and performant conversational AI Assistant to answer frequently asked questions. Now Anycompany wants this assistant to also answer questions that are specific to the company.

In this workshop, you will build a context-aware conversational AI Assistant for Anycompany.

#### Implementation
To fulfill this use case, this notebook shows how to create a RAG application that answers questions from business data. We will use Elasticsearch, the Anthropic Claude 3 Sonnet foundation model, Amazon Bedrock, and llama-index.
We're using an Elastic Cloud deployment of Elasticsearch for this notebook. If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration) for a free trial.
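
Conceptually, a RAG application retrieves passages relevant to a question and places them in the prompt sent to the model. The sketch below illustrates only that prompt-assembly step. The `build_rag_prompt` helper and the template wording are hypothetical; they are not the prompt that llama-index uses internally.

```python
def build_rag_prompt(question, passages):
    """Combine retrieved passages and a question into one prompt string.

    Hypothetical template for illustration; llama-index ships its own.
    """
    # Number each passage so the model (and the reader) can tell them apart.
    context = "\n\n".join(
        f"[{i}] {passage}" for i, passage in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_rag_prompt(
    "What water related issues were reported?",
    ["Customer reported a leaking pipe.", "Customer asked about a billing error."],
)
print(prompt)
```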

#### Python 3.10

⚠ This lab requires a Python 3.10 runtime. ⚠
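
If you are unsure which runtime the kernel is using, a quick check like the following can help. This is a minimal sketch; the `is_supported_runtime` helper is hypothetical and only compares the major and minor version numbers.

```python
import sys

REQUIRED_VERSION = (3, 10)  # runtime named in this notebook


def is_supported_runtime(version_info=None, required=REQUIRED_VERSION):
    """Return True when the major.minor version matches `required`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(required)


if not is_supported_runtime():
    print(
        f"Warning: expected Python {REQUIRED_VERSION[0]}.{REQUIRED_VERSION[1]}, "
        f"found {sys.version_info.major}.{sys.version_info.minor}"
    )
```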


## Installation

To run this notebook, install the following dependencies: llama-index, llama-index-llms-bedrock, llama-index-embeddings-bedrock, and llama-index-vector-stores-elasticsearch.

In [None]:
%pip install llama-index --force-reinstall --quiet
%pip install llama-index-llms-bedrock --force-reinstall --quiet
%pip install llama-index-embeddings-bedrock --force-reinstall --quiet
%pip install llama-index-vector-stores-elasticsearch --force-reinstall --quiet

## Kernel Restart

Restart the kernel so that the packages installed above are picked up.

In [None]:
# restart kernel
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

## Setup 

Import the necessary libraries

In [None]:
from llama_index.llms.bedrock import Bedrock
from llama_index.core.llms import ChatMessage

## Initialization

Initialize the Bedrock LLM client through llama_index.

In [None]:
model_id = 'anthropic.claude-3-sonnet-20240229-v1:0' # change this to use a different version from the model provider

llm = Bedrock(
   model=model_id
)

## Connect to Elasticsearch

We'll use the Cloud ID to identify our deployment, because we are using an Elastic Cloud deployment. To find the Cloud ID for your deployment, go to [Cloud ID](https://cloud.elastic.co/deployments) and select your deployment.

We will use ElasticsearchStore to connect to our Elastic Cloud deployment. It makes it easy to create the index and load data into it.

In [None]:
from getpass import getpass
from llama_index.vector_stores.elasticsearch import ElasticsearchStore

cloud_id = getpass("Elastic deployment Cloud ID: ")
cloud_username = "elastic"
cloud_password = getpass("Elastic deployment Password: ")
index_name = "new-index-1"

es = ElasticsearchStore(
    index_name=index_name,
    es_cloud_id=cloud_id, # found within the deployment page
    es_user=cloud_username,
    es_password=cloud_password # provided when creating the deployment; it can also be reset
)
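
When the notebook runs unattended, interactive prompts are inconvenient. One option is to read the credentials from environment variables and fall back to `getpass` only when they are missing. The sketch below assumes hypothetical variable names (`ELASTIC_CLOUD_ID`, `ELASTIC_PASSWORD`) and a hypothetical `read_secret` helper; neither comes from Elastic or llama-index.

```python
import os
from getpass import getpass


def read_secret(env_name, prompt, env=None, ask=getpass):
    """Return the value of `env_name` if set, otherwise prompt for it."""
    env = os.environ if env is None else env
    value = env.get(env_name)
    if value:
        return value
    # Fall back to an interactive, non-echoing prompt.
    return ask(prompt)


# Example usage (variable names are assumptions, not an official convention):
# cloud_id = read_secret("ELASTIC_CLOUD_ID", "Elastic deployment Cloud ID: ")
# cloud_password = read_secret("ELASTIC_PASSWORD", "Elastic deployment Password: ")
```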

## Create Ingestion Pipeline
Create an ingestion pipeline that loads the files, creates embeddings, and stores them in Elasticsearch.

In [None]:
from llama_index.core.ingestion import IngestionPipeline
from llama_index.embeddings.bedrock import BedrockEmbedding
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core import Document, SimpleDirectoryReader
import json, os

embeddingmodelId = 'amazon.titan-embed-text-v1' # change this to use a different embedding model

model = BedrockEmbedding(model=embeddingmodelId)

pipeline = IngestionPipeline(transformations=[SentenceSplitter(chunk_size=350, chunk_overlap=50),model,],
        vector_store=es
    )

def get_documents_from_file(file):
   """Reads a json file and returns list of Documents"""

   with open(file=file, mode='rt') as f:
       conversations_dict = json.loads(f.read())
      
   # Build Document objects using fields of interest.
   documents = [Document(text=item['conversation'],
                         metadata={"conversation_id": item['conversation_id']})
                for
                item in conversations_dict]
   return documents

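# Notebooks do not define __file__, so the quoted '__file__' resolves against the
# current working directory; the documents are read from ./media
TMP_DIR = os.path.join(os.path.dirname(os.path.realpath('__file__')), 'media')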

reader = SimpleDirectoryReader(
    input_dir=TMP_DIR, required_exts=[".json"]
)

documents = reader.load_data()

pipeline.run(documents=documents)
print(".....Done running pipeline.....\n")
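
To build intuition for `chunk_size` and `chunk_overlap`, the following is a simplified sketch of overlapping chunking. It counts whitespace-separated words, whereas `SentenceSplitter` counts tokens and tries to respect sentence boundaries, so the real chunks will differ. The `chunk_words` helper is hypothetical.

```python
def chunk_words(text, chunk_size=350, chunk_overlap=50):
    """Split text into word-based chunks that share `chunk_overlap` words."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, chunk_size=4, chunk_overlap=1):
    print(chunk)
```

Each chunk starts with the last word of the previous one, which is the role the overlap plays: context that straddles a chunk boundary is not lost entirely.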

## Model Invocation and Response Generation

Now that the passages are stored in Elasticsearch and the LLM is initialized, we can ask a question. The query engine retrieves the most relevant passages and passes them to the LLM to generate an answer.

In [None]:
from llama_index.core import VectorStoreIndex, QueryBundle, Settings

# Use the same Bedrock embedding model for queries as for ingestion
Settings.embed_model = model

index = VectorStoreIndex.from_vector_store(es)
# Retrieve the 10 most similar passages and let Claude 3 generate the answer
query_engine = index.as_query_engine(llm=llm, similarity_top_k=10)

query = "Give me summary of water related issues"
bundle = QueryBundle(query, embedding=Settings.embed_model.get_query_embedding(query))
result = query_engine.query(bundle)
print(result)
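
The `similarity_top_k` setting controls how many passages are retrieved for each query. As a rough illustration of the idea, the sketch below ranks a few toy vectors by cosine similarity in plain Python. In the notebook the search runs inside Elasticsearch, and the actual metric and algorithm depend on how the index is configured; the `top_k` helper and the two-dimensional vectors are made up for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query_vec, passages, k=10):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


toy_passages = [
    ("water leak in the kitchen", [1.0, 0.0]),
    ("billing question", [0.0, 1.0]),
    ("flooded basement", [0.9, 0.1]),
]
for score, text in top_k([1.0, 0.0], toy_passages, k=2):
    print(f"{score:.3f}  {text}")
```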

## Delete Elasticsearch Index

Delete the Elasticsearch index

In [None]:
from elasticsearch import Elasticsearch

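# Use a separate name so the ElasticsearchStore created earlier is not shadowed
es_client = Elasticsearch(cloud_id=cloud_id, basic_auth=(cloud_username, cloud_password))
# ignore_status suppresses errors for 400/404 responses, e.g. when the index does not exist
es_client.options(ignore_status=[400,404]).indices.delete(index=index_name)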

## Conclusion
You have now experimented with the `llama-index` SDK to get exposure to Anthropic Claude 3 and Amazon Bedrock. Using llama-index, you have built a RAG application that answers questions from documents indexed in Elasticsearch.

### Takeaways
- Adapt this notebook to experiment with different Claude 3 models available through Amazon Bedrock. 
- Change the prompts to your specific use case and evaluate the output of different models.
- Play with the token length to understand the latency and responsiveness of the service.
- Apply different prompt engineering principles to get better outputs.
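
To compare latency across models or settings, one simple approach is to time each call. The sketch below is a minimal wall-clock timer; the `time_call` helper is hypothetical, and a single measurement is noisy, so repeat the call a few times before drawing conclusions.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in function so the example runs without any service credentials.
result, elapsed = time_call(sum, range(1_000_000))
print(f"result={result}, elapsed={elapsed:.4f}s")
```

In this notebook, the same helper could wrap the query, for example `time_call(query_engine.query, query)`.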

## Thank You