<a href="https://colab.research.google.com/github/gardner/nais/blob/master/chat_engine_react.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Chat Engine - ReAct Agent Mode

ReAct is an agent-based chat mode built on top of a query engine over your data.

For each chat interaction, the agent enters a ReAct loop:
* first, decide whether to use the query engine tool and, if so, come up with an appropriate input
* (optional) use the query engine tool and observe its output
* decide whether to repeat the loop or give a final response

This approach is flexible, since the agent can choose whether or not to query the knowledge base.
However, the performance also depends more heavily on the quality of the LLM.
You might need to do more coercing to make sure it queries the knowledge base at the right times, instead of hallucinating an answer.
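
The loop described above can be sketched in plain Python. This is a minimal illustration, not LlamaIndex's implementation: `toy_decide`, `toy_query_tool`, and `react_loop` are hypothetical names, and the "LLM" is replaced by a hard-coded policy that queries the tool once and then answers from the observation.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for the LLM's decision step: it returns either
# ("tool", tool_input) to call the query tool or ("answer", text) to finish.
Decision = Tuple[str, str]


def toy_decide(question: str, observations: List[str]) -> Decision:
    """Toy policy: query the tool once, then answer from the observation."""
    if not observations:
        return ("tool", question)
    return ("answer", observations[-1])


def toy_query_tool(tool_input: str) -> str:
    """Hypothetical query engine tool backed by a tiny lookup table."""
    knowledge = {"capital of france": "Paris is the capital of France."}
    return knowledge.get(tool_input.lower().rstrip("?"), "No relevant data found.")


def react_loop(
    question: str,
    decide: Callable[[str, List[str]], Decision],
    tool: Callable[[str], str],
    max_steps: int = 5,
) -> str:
    """Alternate between deciding and (optionally) using the tool."""
    observations: List[str] = []
    for _ in range(max_steps):
        kind, payload = decide(question, observations)
        if kind == "answer":
            return payload  # final response, leave the loop
        observations.append(tool(payload))  # use the tool, observe its output
    return "Stopped: step limit reached."


answer = react_loop("Capital of France?", toy_decide, toy_query_tool)
print(answer)
```

The `max_steps` cap is a design choice of this sketch: it keeps a policy that never answers from looping forever.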

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [11]:
%pip -q install llama-index llama-index-llms-gemini langchain-google-genai llama-index-readers-file llama-index-llms-langchain llama-index-embeddings-langchain langchain-community

In [1]:
import os
from getpass import getpass

# Safely input your API key
# See https://ai.google.dev/gemini-api/docs/api-key

os.environ["GOOGLE_API_KEY"] = getpass("Enter your Gemini API key: ")

Enter your Gemini API key: ··········


## Download Data

In [2]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

--2025-05-12 22:16:48--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘data/paul_graham/paul_graham_essay.txt’


2025-05-12 22:16:49 (3.24 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]



### Get started by setting up the models

In [3]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from langchain_google_genai import ChatGoogleGenerativeAI
from llama_index.llms.langchain import LangChainLLM
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from llama_index.embeddings.langchain import LangchainEmbedding

llm = LangChainLLM(ChatGoogleGenerativeAI(model="gemini-2.0-flash"))
embed_model = LangchainEmbedding(GoogleGenerativeAIEmbeddings(model="models/text-embedding-004"))

### Check out some test embeddings

The `embed_model.get_text_embedding()` function uses the Gemini API to get numerical representations of text.

You can change the text and see the different values for `cosine_sim`. Sentences that have similar _meaning_ will have a higher `cosine_sim` value.

Try sentences that use different words but mean the same thing.
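
To get a feel for the metric before calling the API, the same cosine similarity formula can be checked on small hand-written vectors. This is a sketch using only the standard library; the vectors are made up for illustration and are not real embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 2-D vectors, chosen so the geometry is easy to see.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])

print(f"same direction: {same_direction:.2f}")
print(f"orthogonal:     {orthogonal:.2f}")
print(f"opposite:       {opposite:.2f}")
```

Vectors pointing the same way score close to 1, perpendicular vectors score 0, and vectors pointing in opposite directions score -1. The score depends only on direction, not on vector length.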

In [10]:
import numpy as np

embedding1 = embed_model.get_text_embedding("Gardner likes surfing.")
embedding2 = embed_model.get_text_embedding("He enjoys swimming.")

# Cosine similarity
cosine_sim = np.dot(embedding1, embedding2) / (np.linalg.norm(embedding1) * np.linalg.norm(embedding2))

print(f"Cosine similarity: {cosine_sim:.4f}\n\n")
print(f"Raw embedding1:\n{str(embedding1)[:80]}...\n---\n")
print(f"Raw embedding2:\n{str(embedding2)[:80]}...\n---\n")



Cosine similarity: 0.7744


Raw embedding1:
[0.004486949183046818, -0.0017152897780761123, -0.06783394515514374, -0.00014385...
---

Raw embedding2:
[-0.011345882900059223, 0.0017464363481849432, -0.052332378923892975, -0.0084953...
---



In [9]:
# More examples

sentence_pairs = [
    ("Daily Routines", "I drink coffee every morning", "Each day I have coffee at breakfast"),
    ("Weather", "The weather is extremely cold today", "It's freezing outside right now"),
    ("Cooking", "She's cooking dinner", "He's preparing breakfast"),
    ("Unrelated", "The cat is sleeping on the sofa", "The quarterly report is due tomorrow"),
    ("Garden", "The garden needs more water", "Python is a programming language"),
    ("Opposites", "The project was a huge success", "The project failed completely"),
]

# Calculate and display similarities
for topic, sentence1, sentence2 in sentence_pairs:
    embedding1 = embed_model.get_text_embedding(sentence1)
    embedding2 = embed_model.get_text_embedding(sentence2)

    cosine_sim = np.dot(embedding1, embedding2) / (np.linalg.norm(embedding1) * np.linalg.norm(embedding2))

    print(f"{topic}: {cosine_sim:.3f}")
    print(f"  '{sentence1}'")
    print(f"  '{sentence2}'")
    print()

Daily Routines: 0.953
  'I drink coffee every morning'
  'Each day I have coffee at breakfast'

Weather: 0.916
  'The weather is extremely cold today'
  'It's freezing outside right now'

Cooking: 0.892
  'She's cooking dinner'
  'He's preparing breakfast'

Unrelated: 0.597
  'The cat is sleeping on the sofa'
  'The quarterly report is due tomorrow'

Garden: 0.575
  'The garden needs more water'
  'Python is a programming language'

Opposites: 0.877
  'The project was a huge success'
  'The project failed completely'



### Read in the data

In [12]:
import json
docs = SimpleDirectoryReader(input_dir="./data/paul_graham/").load_data()
for doc in docs:
  print(f"id: {doc.doc_id}")
  print(f"metadata: {json.dumps(doc.metadata, indent=4)}")
  print(f"text:\n{doc.text[:150]}...") # Show the first 150 characters of the actual text in the doc

id: a171a56d-d892-4815-bd1d-8cc2d27e0a81
metadata: {
    "file_path": "/content/data/paul_graham/paul_graham_essay.txt",
    "file_name": "paul_graham_essay.txt",
    "file_type": "text/plain",
    "file_size": 75042,
    "creation_date": "2025-05-12",
    "last_modified_date": "2025-05-12"
}
text:


What I Worked On

February 2021

Before college the two main things I worked on, outside of school, were writing and programming. I didn't write ess...


### Create embeddings of the data and create a vector store


In [14]:
index = VectorStoreIndex.from_documents(docs, show_progress=True, embed_model=embed_model)

Parsing nodes:   0%|          | 0/1 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/22 [00:00<?, ?it/s]
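
Conceptually, a vector store keeps one embedding per text chunk and, at query time, returns the chunks whose embeddings are closest to the query embedding. The sketch below illustrates that idea only: `ToyVectorStore` is a hypothetical class, the three-dimensional vectors are hand-written rather than produced by an embedding model, and `VectorStoreIndex` itself does considerably more (node parsing, metadata, pluggable storage).

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store: a list of (text, embedding) pairs."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, List[float]]] = []

    def add(self, text: str, embedding: List[float]) -> None:
        self._items.append((text, embedding))

    def top_k(self, query_embedding: Sequence[float], k: int = 1) -> List[str]:
        """Return the k stored texts most similar to the query embedding."""
        scored = [
            (cosine_similarity(query_embedding, emb), text)
            for text, emb in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


store = ToyVectorStore()
# Hand-written stand-ins for chunk embeddings (not real model output).
store.add("chunk about painting", [0.9, 0.1, 0.0])
store.add("chunk about startups", [0.1, 0.9, 0.1])
store.add("chunk about Lisp", [0.0, 0.2, 0.9])

results = store.top_k([0.2, 0.8, 0.0], k=2)
print(results)
```

The query vector points mostly along the second axis, so the "startups" chunk ranks first. In the ReAct chat engine, retrieved chunks like these are what the query engine tool uses to produce its observation.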

### Configure chat engine

In [15]:
chat_engine = index.as_chat_engine(chat_mode="react", llm=llm, verbose=True)

### Chat with your data

In [16]:
response = chat_engine.chat(
    "Use the tool to answer what did Paul Graham do in the summer of 1995?"
)

> Running step 7a58e7a2-9a12-463b-bf09-fb0f8045f616. Step input: Use the tool to answer what did Paul Graham do in the summer of 1995?
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: query_engine_tool
Action Input: {'input': 'What did Paul Graham do in the summer of 1995?'}
Observation: In the summer of 1995, Paul Graham and Robert started trying to write software to build online stores.
> Running step 1e2f3e22-7a20-4be2-aef6-74c28e99306d. Step input: None
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: In the summer of 1995, Paul Graham and Robert started trying to write software to build online stores.

In [None]:
print(response)

In [None]:
response = chat_engine.chat("What did I ask you before?")

In [None]:
print(response)

### Reset chat engine

In [None]:
chat_engine.reset()

In [None]:
response = chat_engine.chat("What did I ask you before?")

In [None]:
print(response)
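
The effect of resetting can be illustrated with a toy conversation memory. This is a sketch under the assumption that resetting clears the stored conversation history; `ToyChatMemory` is a hypothetical class and does not reflect how LlamaIndex stores chat state internally.

```python
from typing import List, Tuple


class ToyChatMemory:
    """Hypothetical chat object that remembers earlier user messages."""

    def __init__(self) -> None:
        self.messages: List[Tuple[str, str]] = []  # (role, text) pairs

    def chat(self, user_message: str) -> str:
        # Look up earlier user messages before recording the new one.
        previous = [text for role, text in self.messages if role == "user"]
        self.messages.append(("user", user_message))
        if previous:
            reply = f"Your previous message was: {previous[-1]}"
        else:
            reply = "This is the first message I have seen."
        self.messages.append(("assistant", reply))
        return reply

    def reset(self) -> None:
        """Forget the whole conversation."""
        self.messages.clear()


memory = ToyChatMemory()
memory.chat("What did Paul Graham do in the summer of 1995?")
before_reset = memory.chat("What did I ask you before?")

memory.reset()
after_reset = memory.chat("What did I ask you before?")

print(before_reset)
print(after_reset)
```

Before the reset, the follow-up question can be answered from the stored history. After the reset, the history is empty, so the same question has nothing to refer back to.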