![image](https://raw.githubusercontent.com/IBM/watson-machine-learning-samples/master/cloud/notebooks/headers/watsonx-Prompt_Lab-Notebook.png)
# Use watsonx, `llama-3-1-70b-instruct`, and LlamaIndex for simple chat conversations and tool calls

#### Disclaimers

- Use only Projects and Spaces that are available in watsonx context.


## Notebook content

This notebook walks through the steps and code needed to work with chat models, including tool integration, using [LlamaIndex](https://docs.llamaindex.ai/en/stable/module_guides/deploying/chat_engines/), `ReActAgent`, and watsonx.ai models.

Some familiarity with Python is helpful. This notebook uses Python 3.11.


## Learning goal

The purpose of this notebook is to show how to use chat models such as `meta-llama/llama-3-1-70b-instruct` with LlamaIndex tools and `ReActAgent`.

LlamaIndex is an open source data orchestration framework for building large language model (LLM) applications. It is available in Python and TypeScript and combines tools and capabilities that simplify context augmentation for generative AI (gen AI) use cases through a Retrieval-Augmented Generation (RAG) pipeline.

More examples can be found [here](https://docs.llamaindex.ai/en/stable/examples/llm/ibm_watsonx/).


## Table of Contents

This notebook contains the following parts:

- [Setup](#setup)
- [Foundation Models on watsonx](#models)
- [LlamaIndex integration](#chatwatsonx)
- [Using ReActAgent for chatting](#reactagent)
- [Summary](#summary)

<a id="setup"></a>
## Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

-  Create a <a href="https://cloud.ibm.com/catalog/services/watson-machine-learning" target="_blank" rel="noopener no referrer">Watson Machine Learning (WML) Service</a> instance (a free plan is offered and information about how to create the instance can be found <a href="https://dataplatform.cloud.ibm.com/docs/content/wsj/getting-started/wml-plans.html?context=wx&audience=wdp" target="_blank" rel="noopener no referrer">here</a>).

### Install `llama-index-llms-ibm` and its dependencies

In [1]:
!pip install -U "llama-index-llms-ibm>=0.3.0" | tail -n 1
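After the install, it can help to confirm which version was actually picked up. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of any IBM or LlamaIndex package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string, or None if the install step above did not run.
print(installed_version("llama-index-llms-ibm"))
```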

### Define the WML credentials
Use the code cell below to define the WML credentials that are required to work with watsonx Foundation Model inferencing.

**Action:** Provide the IBM Cloud user API key. For details, see <a href="https://cloud.ibm.com/docs/account?topic=account-userapikey&interface=ui" target="_blank" rel="noopener no referrer">Managing user API keys</a>.

In [3]:
import getpass
from ibm_watsonx_ai import Credentials

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key=getpass.getpass("Enter your WML API key and hit enter: "),
)
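For unattended runs, the interactive prompt can be skipped when the key is already present in the environment. This is a sketch under assumptions: the helper `read_api_key` is hypothetical, and the variable name `WATSONX_APIKEY` is an assumed convention that you can replace with whatever name your environment uses.

```python
import getpass
import os


def read_api_key(env_var: str = "WATSONX_APIKEY") -> str:
    """Return the API key from `env_var`, prompting only if it is unset or empty."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fall back to the same interactive prompt used in the cell above.
        api_key = getpass.getpass("Enter your WML API key and hit enter: ")
    return api_key
```

With this helper in place, the `Credentials` call above could take `api_key=read_api_key()` instead of calling `getpass.getpass` directly.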

### Define the project ID
You need to provide the project ID to give the Foundation Model the context for the call. If you have a default project ID set in Watson Studio, the notebook obtains that project ID. Otherwise, you need to provide the project ID in the code cell below.

In [4]:
import os

try:
    project_id = os.environ["PROJECT_ID"]
except KeyError:
    project_id = input("Enter your project_id and hit enter: ")

<a id="models"></a>
## Set up a Foundation Model on `watsonx.ai`


Specify the `model_id` of the model you will use for the chat with tools.

In [5]:
model_id = "meta-llama/llama-3-1-70b-instruct"

<a id="chatwatsonx"></a>
## LlamaIndex integration

`WatsonxLLM` is a LlamaIndex wrapper around watsonx.ai foundation models that exposes them through the LlamaIndex chat interface.

### Initialize the `WatsonxLLM` class

In [6]:
from llama_index.llms.ibm import WatsonxLLM

llm = WatsonxLLM(
    model_id=model_id,
    url=credentials.url,
    apikey=credentials.api_key,
    project_id=project_id,
)

Use the `llm` object defined above to answer a simple question.

In [7]:
from llama_index.core.llms import ChatMessage, MessageRole

msg = ChatMessage(role=MessageRole.USER, content='Answer in short sentence: what is generative AI?')
print(llm.chat([msg]))

assistant: Generative AI refers to artificial intelligence technologies that can generate new content, such as text, images, or audio, based on patterns and structures learned from existing data.


Stream the answer chunk by chunk with `stream_chat`.

In [8]:
msg = ChatMessage(role=MessageRole.USER, content='Answer in short sentence: how to drive a car?')
for x in llm.stream_chat([msg]):
    print(x.delta, end="", flush=True)

To drive a car, start the engine, check your mirrors and surroundings, release the brake, and slowly press the accelerator while steering in your desired direction.
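If you also need the complete text after streaming, you can accumulate the deltas while printing them. The sketch below is illustrative only: `collect_stream` is a hypothetical helper, and `fake_stream` is a stand-in for `llm.stream_chat([msg])` so that the example runs without a watsonx connection.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Print each chunk's `delta` as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        delta = chunk.delta or ""  # treat a missing delta as an empty string
        print(delta, end="", flush=True)
        parts.append(delta)
    print()
    return "".join(parts)


# Stand-in for `llm.stream_chat([msg])`; each item only needs a `delta` attribute.
fake_stream = [
    SimpleNamespace(delta=piece)
    for piece in ("To drive ", "a car, ", "start the engine.")
]
text = collect_stream(fake_stream)
```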

<a id="reactagent"></a>
## Use ReActAgent for chatting

Let's define the assistant tools for calculations and the `ReActAgent` object for chatting and streaming.

For more details about `ReActAgent`, see the [LlamaIndex ReAct agent example](https://docs.llamaindex.ai/en/stable/examples/agent/react_agent/).

In [9]:
from llama_index.core.tools import FunctionTool

def add(a: float, b: float) -> float:
    """Add a and b."""
    return a + b

def subtract(a: float, b: float) -> float:
    """Subtract b from a."""
    return a - b

def multiply(a: float, b: float) -> float:
    """Multiply a and b."""
    return a * b

def divide(a: float, b: float) -> float:
    """Divide a by b."""
    return a / b


add_tool = FunctionTool.from_defaults(fn=add)
subtract_tool = FunctionTool.from_defaults(fn=subtract)
multiply_tool = FunctionTool.from_defaults(fn=multiply)
divide_tool = FunctionTool.from_defaults(fn=divide)

tools = [add_tool, subtract_tool, multiply_tool, divide_tool]
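When no name or description is passed explicitly, `FunctionTool.from_defaults` derives them from the function's name, signature, and docstring, so the docstring is what the model reads when choosing a tool. The sketch below approximates that metadata with the standard library only; `describe_tool` is a hypothetical helper, not a LlamaIndex API, and the exact format LlamaIndex produces may differ between versions.

```python
import inspect


def describe_tool(fn) -> dict:
    """Approximate the name and description a function-based tool exposes to the model."""
    signature = inspect.signature(fn)
    docstring = inspect.getdoc(fn) or ""
    return {
        "name": fn.__name__,
        "description": f"{fn.__name__}{signature}\n{docstring}",
    }


def subtract(a: float, b: float) -> float:
    """Subtract b from a."""
    return a - b


print(describe_tool(subtract)["description"])
```

A docstring that states the argument order, as in this `subtract`, gives the model less room to swap the operands.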

### Initialize ReAct agent

In [10]:
from llama_index.core.agent import ReActAgent

agent = ReActAgent.from_tools(tools, llm=llm, verbose=True)
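A ReAct agent alternates between a model-written thought, a tool call, and the observation returned by that tool. The following is a deliberately simplified sketch of a single dispatch step, assuming the `Action:` / `Action Input:` text format; it is not LlamaIndex's implementation, and `TOOLS`, `ACTION_RE`, and `run_action` are hypothetical names.

```python
import ast
import re

# Plain-Python stand-ins for two of the tools defined above.
TOOLS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}

ACTION_RE = re.compile(r"Action:\s*(\w+)\s*\nAction Input:\s*(\{.*\})", re.DOTALL)


def run_action(model_output: str):
    """Parse one Action / Action Input pair and call the named tool."""
    match = ACTION_RE.search(model_output)
    if match is None:
        raise ValueError("Could not parse output: no Action / Action Input found.")
    name, raw_args = match.groups()
    kwargs = ast.literal_eval(raw_args)  # safely evaluates the dict literal
    return TOOLS[name](**kwargs)


# Step 1: the model asks for the multiplication.
step_1 = "Thought: I need 2 * 4 first.\nAction: multiply\nAction Input: {'a': 2, 'b': 4}"
observation = run_action(step_1)

# Step 2: the observation is fed back and the model asks for the addition.
step_2 = f"Thought: Now add.\nAction: add\nAction Input: {{'a': 20, 'b': {observation}}}"
print(run_action(step_2))
```

In the real agent, the model text comes from the LLM, and the loop ends when the model emits an `Answer:` instead of an `Action:`.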

### Answer question using tools

In [11]:
response = agent.chat("What is 20 + (2 * 4)? Calculate step by step ")

> Running step f25abc83-78c7-4f62-b2e5-c66e0ba4b066. Step input: What is 20 + (2 * 4)? Calculate step by step 
Thought: The current language of the user is: English. I need to use a tool to help me calculate the expression 2 * 4.
Action: multiply
Action Input: {'a': 2, 'b': 4}
Observation: 8
> Running step 99cd4dad-1cf3-45f0-8587-d9f851b984a5. Step input: None
Thought: Now that I have the result of the multiplication, I can proceed with the addition.
Action: add
Action Input: {'a': 20, 'b': 8}
Observation: 28
> Running step b07c5c9e-d55f-437a-bfda-877fb544b1d3. Step input: None
Thought: I have now calculated the entire expression 20 + (2 * 4).
Answer: 28

### Using chat history

Pass earlier messages to the agent through the `chat_history` argument:

In [13]:
from llama_index.core.llms import ChatMessage, MessageRole

chat_history = [
    ChatMessage(role=MessageRole.USER, content="You are a pirate!"), 
    ChatMessage(role=MessageRole.SYSTEM, content="Amma pirate! My name is Jack Blackbeak.")
]

response = agent.chat('Who are you?', chat_history=chat_history)

> Running step 73ea7cca-22d5-4e48-9494-3b6f26c3c6b3. Step input: Who are you?
Observation: Error: Could not parse output. Please follow the thought-action-input format. Try again.
> Running step dd73b261-f418-4304-a3c8-86c861f5e704. Step input: None
Thought: The current language of the user is: English. I need to introduce myself as a pirate.
Answer: Me name be Captain Jack Blackbeak, the most feared pirate to ever sail the Seven Seas.

### Using chat history and tools

In [14]:
from llama_index.core.llms import ChatMessage, MessageRole

msg = ChatMessage(role=MessageRole.USER, 
                  content=f"I was born in Nevada, {45} years ago. I am AI engineer")

response = agent.chat(f'Currently we have {2024}. When I was born?', chat_history=[msg])

> Running step de928379-a0ae-4c74-a343-221675a661fa. Step input: Currently we have 2024. When I was born?
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.

To find the birth year, we can subtract the age from the current year. However, the tool 'subtract' requires two numbers to subtract. We can use the given age and the current year as the two numbers.
Action: subtract
Action Input: {'a': 2024, 'b': 45}
Observation: 1979
> Running step a0d00f41-29be-44a4-906b-b9e03aec38f6. Step input: None
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: You were born in 1979.

### Streaming

To enable streaming, call the agent's `stream_chat` method instead of `chat`.

In [15]:
response = agent.stream_chat("What is (20 / 5) + (2 * 4)? Calculate step by step ")

> Running step 14d7baa9-7652-4bb3-803c-b5526498ff01. Step input: What is (20 / 5) + (2 * 4)? Calculate step by step 
Thought: To calculate (20 / 5) + (2 * 4), I need to use the tools to perform division, multiplication and addition. I will start with division.
Action: divide
Action Input: {'a': 20, 'b': 5}
Observation: 4.0
> Running step 1d50fc9a-d45a-41ee-8357-4ad3a46baa01. Step input: None
Thought: Now that I have the result of the division, I need to calculate the multiplication part of the expression, which is 2 * 4.
Action: multiply
Action Input: {'a': 2, 'b': 4}
Observation: 8
> Running step ea5bccef-15ad-43ea-b7ae-9a8defbb469b. Step input: None
Thought: Now that I have the results of both the division and the multiplication, I can calculate the final result by adding them together.
Action: add
Action Input: {'a': 4.0, 'b': 8}
Observation: 12.0
> Running step 583a8edc-4061-4ca2-8248-58

To consume the stream, either call `response.print_response_stream()` or iterate over the `response_gen` generator of the `response` object.

In [16]:
for x in response.response_gen:
    print(x, end="", flush=True)

 The result of (20 / 5) + (2 * 4) is 12.0.

<a id="summary"></a>
## Summary and next steps

You successfully completed this notebook!

You learned how to build a simple agent using `ReActAgent` and `WatsonxLLM`.

Check out our _<a href="https://ibm.github.io/watsonx-ai-python-sdk/samples.html" target="_blank" rel="noopener no referrer">Online Documentation</a>_ for more samples, tutorials, documentation, how-tos, and blog posts. 

### Author

**Wojciech Rębisz**, Software Engineer at Watson Machine Learning.

Copyright © 2025 IBM. This notebook and its source code are released under the terms of the MIT License.