# ReAct Agent over QueryEngine (RAG) tools using LlamaIndex, Amazon Bedrock and Elastic

#### Implementation
We're using an Elastic Cloud deployment of Elasticsearch for this notebook. If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration) for a free trial.

#### Python 3.10

⚠  This lab requires the notebook to run on a Python 3.10 runtime. ⚠
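
A quick check of the running interpreter can catch a mismatched kernel early. The sketch below assumes the requirement means major version 3 and minor version 10; the helper name `is_required_python` is illustrative and not part of the notebook.

```python
import sys

REQUIRED = (3, 10)  # assumption: the lab targets exactly Python 3.10


def is_required_python(version_info=sys.version_info, required=REQUIRED):
    """Return True when the interpreter's major.minor matches `required`."""
    return tuple(version_info[:2]) == tuple(required)


if not is_required_python():
    # Only warn; the notebook may still run on other versions.
    print(
        f"Warning: running Python {sys.version_info.major}.{sys.version_info.minor}, "
        f"expected {REQUIRED[0]}.{REQUIRED[1]}"
    )
```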


## Installation

To run this notebook, install the following dependencies: `llama-index`, `llama-index-llms-bedrock`, `llama-index-embeddings-bedrock`, `llama-index-vector-stores-elasticsearch`, and `wget`.

In [None]:
%pip install llama-index --force-reinstall --quiet
%pip install llama-index-llms-bedrock --force-reinstall --quiet
%pip install llama-index-embeddings-bedrock --force-reinstall --quiet
%pip install llama-index-vector-stores-elasticsearch --force-reinstall --quiet
%pip install wget --force-reinstall --quiet

## Kernel Restart

Restart the kernel so that it picks up the packages installed above.

In [None]:
# restart kernel
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")
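
After the restart, it can help to confirm that the packages from the install cell are visible to the new kernel. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is an assumption made for illustration.

```python
from importlib import metadata

# Distribution names taken from the %pip install cell.
PACKAGES = [
    "llama-index",
    "llama-index-llms-bedrock",
    "llama-index-embeddings-bedrock",
    "llama-index-vector-stores-elasticsearch",
    "wget",
]


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as missing suggests the install cell should be re-run before continuing.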

## Setup 

Import the necessary libraries

In [None]:
from llama_index.llms.bedrock import Bedrock
from llama_index.core.llms import ChatMessage
from llama_index.embeddings.bedrock import BedrockEmbedding, Models

## Initialization

Initialize the Bedrock LLM and the Bedrock embedding model through llama_index

In [None]:
model_id = 'us.anthropic.claude-3-7-sonnet-20250219-v1:0' # change this to use a different version from the model provider

llm = Bedrock(
   model=model_id
)

embed_model = BedrockEmbedding(model_name="amazon.titan-embed-text-v2:0")

## Connect to Elasticsearch

Because we are using an Elastic Cloud deployment, we'll identify it by its Cloud ID. To find the Cloud ID for your deployment, go to [Cloud ID](https://cloud.elastic.co/deployments) and select your deployment.

We will use `ElasticsearchStore` to connect to our Elastic Cloud deployment. It makes it easy to create an index and load data into it.

In [None]:
from getpass import getpass
from llama_index.vector_stores.elasticsearch import ElasticsearchStore

cloud_id = getpass("Elastic deployment Cloud ID: ")
cloud_username = "elastic"
cloud_password = getpass("Elastic deployment Password: ")
lyft_index_name = "lyft-index-1"
uber_index_name = "uber-index-1"
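
A mistyped Cloud ID is a common cause of connection failures, so a sanity check before connecting can save time. The sketch below assumes the usual Cloud ID layout of `<label>:<base64 data>`, where the base64 part decodes to `<domain>$<elasticsearch id>$<kibana id>`. The helper `parse_cloud_id` and the sample values are hypothetical and only for illustration.

```python
import base64


def parse_cloud_id(cloud_id):
    """Split a Cloud ID into its label and decoded parts (assumed layout)."""
    label, sep, encoded = cloud_id.partition(":")
    if not sep or not encoded:
        raise ValueError("Cloud ID should look like '<label>:<base64 data>'")
    # b64decode raises binascii.Error (a ValueError subclass) on malformed input.
    decoded = base64.b64decode(encoded).decode("utf-8")
    parts = decoded.split("$")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("Decoded Cloud ID should contain '<domain>$<es id>'")
    return {"label": label, "domain": parts[0], "es_id": parts[1]}


# Synthetic example value, not a real deployment.
demo_id = "my-deployment:" + base64.b64encode(b"example.test$abc123$def456").decode("ascii")
print(parse_cloud_id(demo_id))
```

In the notebook, calling `parse_cloud_id(cloud_id)` right after the `getpass` prompt would surface a malformed value before any `ElasticsearchStore` is created.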

In [None]:
es_lyft = ElasticsearchStore(
    index_name=lyft_index_name,
    es_cloud_id=cloud_id, # found within the deployment page
    es_user=cloud_username,
    es_password=cloud_password # provided when creating deployment. Alternatively can reset password.
)

In [None]:
es_uber = ElasticsearchStore(
    index_name=uber_index_name,
    es_cloud_id=cloud_id, # found within the deployment page
    es_user=cloud_username,
    es_password=cloud_password # provided when creating deployment. Alternatively can reset password.
)

## Download Data

In [None]:
import wget

lyft_downloaded_file = wget.download('https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf')
uber_downloaded_file = wget.download('https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf')

## Load Data

In [None]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

lyft_docs = SimpleDirectoryReader(input_files=["lyft_2021.pdf"]).load_data()
uber_docs = SimpleDirectoryReader(input_files=["uber_2021.pdf"]).load_data()

## Settings

In [None]:
from llama_index.core import Settings
Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512
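
`Settings.chunk_size` controls how large each indexed chunk is, and therefore how many chunks each filing produces. The sketch below is a simplified stand-in, not LlamaIndex's splitter: it counts whitespace-separated words rather than tokens, and the function `chunk_words` is hypothetical. It only illustrates how a smaller chunk size yields more, shorter chunks.

```python
def chunk_words(text, chunk_size=512, overlap=0):
    """Split text into chunks of at most `chunk_size` whitespace-separated words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the current window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(1200))  # synthetic 1200-word text
print(len(chunk_words(sample, chunk_size=512)))
print(len(chunk_words(sample, chunk_size=1024)))
```

Smaller chunks tend to make retrieval more precise but give the LLM less surrounding context per retrieved chunk.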

## Create Index

In [None]:
from llama_index.core import StorageContext
from llama_index.core import VectorStoreIndex

lyft_storage_context = StorageContext.from_defaults(vector_store=es_lyft)
uber_storage_context = StorageContext.from_defaults(vector_store=es_uber)

lyft_index = VectorStoreIndex.from_documents(lyft_docs, storage_context=lyft_storage_context)
uber_index = VectorStoreIndex.from_documents(uber_docs, storage_context=uber_storage_context)

## Create Query Engines

In [None]:
lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)
uber_engine = uber_index.as_query_engine(similarity_top_k=3)
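
With `similarity_top_k=3`, each query engine retrieves the three chunks whose embeddings are most similar to the query embedding. The sketch below illustrates that idea in plain Python, assuming cosine similarity as the metric; the actual scoring happens inside Elasticsearch. The functions and the toy two-dimensional vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=3):
    """Return (chunk_id, score) pairs for the k chunks most similar to the query."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-D "embeddings"; real embedding vectors have many more dimensions.
chunks = {
    "revenue": [1.0, 0.1],
    "drivers": [0.7, 0.7],
    "risk": [0.1, 1.0],
    "legal": [-1.0, 0.0],
}
for chunk_id, score in top_k([1.0, 0.0], chunks, k=3):
    print(chunk_id, round(score, 3))
```

Raising `similarity_top_k` gives the LLM more context per question at the cost of longer prompts.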

## Create QueryEngine Tools

In [None]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata

query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_engine,
        metadata=ToolMetadata(
            name="lyft_10k",
            description=(
                "Provides information about Lyft financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=uber_engine,
        metadata=ToolMetadata(
            name="uber_10k",
            description=(
                "Provides information about Uber financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
]

## Create ReAct Agent

In [None]:
from llama_index.core.agent import ReActAgent

agent = ReActAgent.from_tools(
    query_engine_tools,
    llm=llm,
    verbose=True,
)
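
A ReAct agent alternates between reasoning steps and tool calls until it can answer. The sketch below shows that loop in plain Python under simplifying assumptions: `run_react`, `scripted_llm`, and the stub tools are hypothetical stand-ins, and the real agent lets the LLM choose each step through prompting rather than a fixed script.

```python
def run_react(question, llm_step, tools, max_steps=5):
    """Alternate model steps and tool calls until the model returns an answer."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = llm_step(transcript)
        transcript.append(f"Thought: {step['thought']}")
        if "answer" in step:
            return step["answer"], transcript
        # Dispatch to the tool named by the model and record what it returned.
        observation = tools[step["action"]](step["action_input"])
        transcript.append(f"Action: {step['action']}({step['action_input']!r})")
        transcript.append(f"Observation: {observation}")
    raise RuntimeError("No answer within max_steps")


# Stub tools standing in for the two query engines (canned replies).
tools = {
    "lyft_10k": lambda query: "stub Lyft answer",
    "uber_10k": lambda query: "stub Uber answer",
}


def scripted_llm(transcript):
    """Stand-in for the LLM: call each tool once, then answer."""
    observations = [line for line in transcript if line.startswith("Observation:")]
    if len(observations) == 0:
        return {"thought": "I need Lyft data.", "action": "lyft_10k",
                "action_input": "revenue growth 2021"}
    if len(observations) == 1:
        return {"thought": "I need Uber data.", "action": "uber_10k",
                "action_input": "revenue growth 2021"}
    return {"thought": "I can answer now.", "answer": " | ".join(observations)}


answer, transcript = run_react("Compare Uber and Lyft revenue growth", scripted_llm, tools)
print("\n".join(transcript))
print("Answer:", answer)
```

With `verbose=True`, the agent in this notebook prints a similar sequence of thoughts, actions, and observations while it runs.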

## Query with ReAct Agent

In [None]:
response = agent.chat("What was Lyft's revenue growth in 2021?")

## Generate Response

In [None]:
response.response
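
To sanity-check a growth figure in the agent's answer, the percentage can be recomputed from the two revenue numbers it quotes. This is a minimal sketch; `revenue_growth_pct` is a hypothetical helper, and the inputs below are placeholder values rather than figures from the filings.

```python
def revenue_growth_pct(previous, current):
    """Year-over-year growth as a percentage of the previous year's revenue."""
    if previous == 0:
        raise ValueError("previous-year revenue must be non-zero")
    return (current - previous) / previous * 100


# Placeholder values, NOT taken from the 10-K filings.
print(f"{revenue_growth_pct(100.0, 125.0):.1f}%")
```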

## Generate response by combining data from multiple sources
Here, the agent needs information from both filings, so it calls the Lyft and Uber query engine tools and combines their results into a single answer.

In [None]:
response = agent.chat(
    "Compare and contrast the revenue growth of Uber and Lyft in 2021, then"
    " give an analysis"
)

## Generate Response

In [None]:
response.response

## Conclusion
You have now used the `llama-index` SDK with Anthropic Claude 3.7 Sonnet on Amazon Bedrock and with Elasticsearch as the vector store. Using llama-index, you built a ReAct agent that answers questions about the Lyft and Uber 2021 10-K filings through query engine (RAG) tools.

### Takeaways
- Adapt this notebook to experiment with different Claude models available through Amazon Bedrock.
- Change the prompts to fit your specific use case and evaluate the output of different models.
- Play with the token length to understand the latency and responsiveness of the service.
- Apply different prompt engineering principles to get better outputs.

## Thank You