# watsonx

[IBM® watsonx.ai™](https://www.ibm.com/products/watsonx-ai) is an enterprise-grade studio for developing AI services and deploying them into your applications of choice, with a collection of the APIs, tools, models, and runtimes you need to turn your ideas and requirements into reality.

## Features

When using this client class, AutoGen's messages are automatically tailored to accommodate the specific requirements of Watsonx's API.

Additionally, this client class supports function/tool calling and tracks token usage and cost according to Watsonx's API pricing (as of Nov 2024).

## Getting started

First, install the `autogen-agentchat~=0.2` package with the `watsonx` extra to use AutoGen with the Watsonx API library.

``` bash
pip install "autogen-agentchat[watsonx]~=0.2"
```

Watsonx.ai provides a number of models to use. See the [list of models here](https://www.ibm.com/products/watsonx-ai/foundation-models#generative).

See the sample `OAI_CONFIG_LIST` below showing how the Watsonx client class is used by specifying the `api_type` as `watsonx`.

```python
[
    {
        "api_type": "watsonx",
        "model": "ibm/granite-3-8b-instruct",
        "api_key": "your watsonx api key",
        "space_id": "your watsonx space id",
    }
]
```

As an alternative to the `api_key` and `space_id` entries in the config, you can set the environment variables `WATSONX_API_KEY` and `WATSONX_SPACE_ID` to your Watsonx credentials.
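
As a minimal sketch of working with these environment variables, the snippet below assembles the same config entry from `WATSONX_API_KEY` and `WATSONX_SPACE_ID` and fails early if either is unset. The helper name `watsonx_config_from_env` is hypothetical and not part of AutoGen.

```python
import os


def watsonx_config_from_env(model: str = "ibm/granite-3-8b-instruct") -> list:
    """Build a config list from environment variables (hypothetical helper)."""
    required = ("WATSONX_API_KEY", "WATSONX_SPACE_ID")
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        # Fail before any agent is created, with a message naming the gap
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return [
        {
            "api_type": "watsonx",
            "model": model,
            "api_key": os.environ["WATSONX_API_KEY"],
            "space_id": os.environ["WATSONX_SPACE_ID"],
        }
    ]
```

Calling `watsonx_config_from_env()` after exporting both variables returns a one-element list that can be passed as `config_list`.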

## API parameters

Additional parameters can be added to your config for the Watsonx API. See the [watsonx.ai Python SDK model inference documentation](https://ibm.github.io/watsonx-ai-python-sdk/fm_model_inference.html#) for further information on them and their default values.


Example:
```python
[
    {
        "api_type": "watsonx",
        "model": "ibm/granite-3-8b-instruct",
        "api_key": "your watsonx api key",
        "space_id": "your watsonx space id",
        "name-of-param": value_of_param,
    }
]
```
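
For a more concrete picture, here is a sketch that merges extra parameters into a base entry. The parameter names `temperature` and `max_tokens` and their values are assumptions chosen for illustration; confirm the names the client accepts against the linked SDK documentation before relying on them.

```python
import os

# Base entry, as in the sample config above
base_entry = {
    "api_type": "watsonx",
    "model": "ibm/granite-3-8b-instruct",
    "api_key": os.environ.get("WATSONX_API_KEY"),
    "space_id": os.environ.get("WATSONX_SPACE_ID"),
}

# ASSUMPTION: illustrative parameter names and values, not confirmed by this page
extra_params = {"temperature": 0.2, "max_tokens": 512}

# Extra parameters sit alongside the required keys in the same dictionary
config_list_with_params = [{**base_entry, **extra_params}]
```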


## Two-Agent Coding Example

In this example, we run a two-agent chat in which an AssistantAgent (primarily a coding agent) generates code to count the prime numbers between 1 and 10,000, and a UserProxyAgent then executes that code.

```python
import os

config_list = [
    {
        # Watsonx also hosts a variety of models other than the granite series
        "model": "meta-llama/llama-3-1-70b-instruct",
        "api_key": os.environ.get("WATSONX_API_KEY"),
        "url": os.environ.get("WATSONX_URL"),
        "space_id": os.environ.get("WATSONX_SPACE_ID"),
        "api_type": "watsonx",
    }
]
```

Importantly, we have tweaked the system message so that the model doesn't return the termination keyword (which we've changed to `FINISH`) in the same message as the code block.

```python
from pathlib import Path

from autogen import AssistantAgent, UserProxyAgent
from autogen.coding import LocalCommandLineCodeExecutor

# Setting up the code executor
workdir = Path("coding")
workdir.mkdir(exist_ok=True)
code_executor = LocalCommandLineCodeExecutor(work_dir=workdir)

# Setting up the agents

# The UserProxyAgent will execute the code that the AssistantAgent provides
user_proxy_agent = UserProxyAgent(
    name="User",
    code_execution_config={"executor": code_executor},
    # Guard against messages whose content is missing or None
    is_termination_msg=lambda msg: "FINISH" in (msg.get("content") or ""),
)

system_message = """You are a helpful AI assistant who writes code and the user executes it.
Solve tasks using your coding and language skills.
In the following cases, suggest python code (in a python coding block) for the user to execute.
Solve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.
When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user.
Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user.
If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.
When you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible.
IMPORTANT: Wait for the user to execute your code and then you can reply with the word "FINISH". DO NOT OUTPUT "FINISH" after your code block."""

# The AssistantAgent will take the coding request and return code
assistant_agent = AssistantAgent(
    name="Watsonx Assistant",
    system_message=system_message,
    llm_config={"config_list": config_list},
)
```
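
The termination check can be exercised on its own, without calling the model. The sketch below uses a standalone function, `is_finish_message` (a hypothetical name), that performs the same substring test as the `is_termination_msg` lambda while treating missing or `None` content as an empty string.

```python
def is_finish_message(msg: dict) -> bool:
    """Return True when the message content contains the FINISH keyword."""
    content = msg.get("content") or ""  # covers a missing key and content=None
    return "FINISH" in content


print(is_finish_message({"content": "FINISH"}))             # True
print(is_finish_message({"content": "print(count)"}))       # False
print(is_finish_message({"content": None}))                 # False
```

Because the test is a plain substring match, a message that contains both a code block and the keyword would also end the chat, which is why the system message asks the model to send `FINISH` separately.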

```python
# Start the chat, with the UserProxyAgent asking the AssistantAgent the message
chat_result = user_proxy_agent.initiate_chat(
    assistant_agent,
    message="Provide code to count the number of prime numbers from 1 to 10000.",
)
```

```text
User (to Watsonx Assistant):

Provide code to count the number of prime numbers from 1 to 10000.

--------------------------------------------------------------------------------
Watsonx Assistant (to User):
```

```python
# Count the number of prime numbers from 1 to 10000

def count_primes(n):
    primes = []
    for possiblePrime in range(2, n + 1):
        isPrime = True
        for num in range(2, int(possiblePrime ** 0.5) + 1):
            if possiblePrime % num == 0:
                isPrime = False
                break
        if isPrime:
            primes.append(possiblePrime)
    return len(primes)

print(count_primes(10000))
```

```text
--------------------------------------------------------------------------------

Replying as User. Provide feedback to Watsonx Assistant. Press enter to skip and use auto-reply, or type 'exit' to end the conversation:

>>>>>>>> NO HUMAN INPUT RECEIVED.

>>>>>>>> USING AUTO REPLY...

>>>>>>>> EXECUTING CODE BLOCK (inferred language is python)...
User (to Watsonx Assistant):

exitcode: 0 (execution succeeded)
Code output: 1229

--------------------------------------------------------------------------------
Watsonx Assistant (to User):

FINISH

--------------------------------------------------------------------------------

Replying as User. Provide feedback to Watsonx Assistant. Press enter to skip and use auto-reply, or type 'exit' to end the conversation:

>>>>>>>> NO HUMAN INPUT RECEIVED.
```
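
As an independent cross-check of the reported count, the sketch below counts primes with a Sieve of Eratosthenes rather than the trial division the model generated. The function name `count_primes_sieve` is our own and does not appear in the chat.

```python
def count_primes_sieve(n: int) -> int:
    """Count primes <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return 0
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            # Every multiple of p from p*p upward is composite
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return sum(is_prime)


print(count_primes_sieve(10000))  # 1229
```

Both approaches agree on 1229, matching the code output in the chat above.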


## Tool Call Example

In this example, instead of writing code, we will show how LLMs can perform tool calling.

We'll use a simple travel agent assistant program where we have a tool for weather.

```python
import json
import os
from typing import Literal

from typing_extensions import Annotated

import autogen
```

Create our two agents.

```python
# Create the agent for tool calling
chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="""For currency exchange and weather forecasting tasks,
        only use the functions you have been provided with.
        Output 'HAVE FUN!' when an answer has been provided.""",
    llm_config={"config_list": config_list},
)

# Note that we have changed the termination string to be "HAVE FUN!"
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    code_execution_config={"executor": code_executor},
    human_input_mode="NEVER",
    is_termination_msg=lambda x: x.get("content", "") and "HAVE FUN" in x.get("content", ""),
    max_consecutive_auto_reply=1,
)
```

Create a function, annotating it so that the description can be passed through to the LLM.

We associate it with the agents using `register_for_execution` for the user_proxy so it can execute the function and `register_for_llm` for the chatbot (powered by the LLM) so it can pass the function definition to the LLM.

```python
# Example function to make available to the model
def get_current_weather(location, unit="fahrenheit"):
    """Get the weather for some location"""
    if "chicago" in location.lower():
        return json.dumps({"location": "Chicago", "temperature": "13", "unit": unit})
    elif "san francisco" in location.lower():
        return json.dumps({"location": "San Francisco", "temperature": "55", "unit": unit})
    elif "new york" in location.lower():
        return json.dumps({"location": "New York", "temperature": "11", "unit": unit})
    else:
        # Include the unit here as well so that weather["unit"] below never raises a KeyError
        return json.dumps({"location": location, "temperature": "unknown", "unit": unit})


# Register the function with both agents: the user_proxy executes it,
# and the chatbot passes its definition to the LLM
@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Weather forecast for US cities.")
def weather_forecast(
    location: Annotated[str, "City name"],
) -> str:
    weather_details = get_current_weather(location=location)
    weather = json.loads(weather_details)
    return f"{weather['location']} will be {weather['temperature']} degrees {weather['unit']}"
```
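
To see what the annotation carries, the sketch below reads the `Annotated` metadata back at runtime using only the standard library. It does not show how AutoGen builds the tool definition internally; it only demonstrates that the description string is available alongside the type. It assumes Python 3.9+ (where `Annotated` lives in `typing`), and `forecast_stub` is a hypothetical stand-in for the registered function.

```python
from typing import Annotated, get_type_hints


def forecast_stub(location: Annotated[str, "City name"]) -> str:
    """Stand-in with the same signature as the registered tool."""
    return f"Forecast requested for {location}"


# include_extras=True keeps the Annotated wrapper instead of stripping it
hints = get_type_hints(forecast_stub, include_extras=True)
location_hint = hints["location"]

print(location_hint.__origin__)    # <class 'str'>
print(location_hint.__metadata__)  # ('City name',)
```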

We pass through our customer's message and run the chat.

Finally, we ask the LLM to summarise the chat and print that out.

```python
# Start the conversation
res = user_proxy.initiate_chat(
    chatbot,
    message="What's the weather in New York",
    summary_method="reflection_with_llm",
)

print(f"LLM SUMMARY: {res.summary['content']}")
```

```text
user_proxy (to chatbot):

What's the weather in New York

--------------------------------------------------------------------------------
chatbot (to user_proxy):

***** Suggested tool call (chatcmpl-tool-baba90d4634d42b9935f8063dd80c69c): weather_forecast *****
Arguments: 
{"location": "New York"}
**************************************************************************************************

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING FUNCTION weather_forecast...
user_proxy (to chatbot):

user_proxy (to chatbot):

***** Response from calling tool (chatcmpl-tool-baba90d4634d42b9935f8063dd80c69c) *****
New York will be 11 degrees fahrenheit
***************************************************************************************

--------------------------------------------------------------------------------

chatbot (to user_proxy):

HAVE FUN!

--------------------------------------------------------------------------------

LLM SUMMARY: The user asked for the weather in New York and the AI provided a JSON for a function call to retrieve the weather forecast for New York, which resulted in a temperature of 11 degrees Fahrenheit.
```




We can see that the agent recommended calling the tool and passed the right parameters. The `user_proxy` executed the tool call, and the result was passed back to the agent to interpret and respond. Finally, the agent was asked to summarise the whole conversation.
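
The round trip described above can be pictured with a small standalone sketch: a tool name plus a JSON argument string is looked up in a registry and executed. This is a conceptual illustration only, not AutoGen's implementation; `TOOL_REGISTRY`, `execute_tool_call`, and `weather_stub` are hypothetical names.

```python
import json


def weather_stub(location: str) -> str:
    """Stand-in tool; a real tool would look up actual data."""
    return f"(stub) forecast for {location}"


# Maps the tool name the model suggests to the callable that runs it
TOOL_REGISTRY = {"weather_forecast": weather_stub}


def execute_tool_call(name: str, arguments_json: str) -> str:
    """Decode the JSON arguments and call the matching registered tool."""
    if name not in TOOL_REGISTRY:
        return f"Error: unknown tool {name!r}"
    kwargs = json.loads(arguments_json)
    return TOOL_REGISTRY[name](**kwargs)


print(execute_tool_call("weather_forecast", '{"location": "New York"}'))
# (stub) forecast for New York
```

The returned string plays the role of the "Response from calling tool" message that is sent back to the model.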