# Level 2: Simple Agentic 

This tutorial gives new users an overview of how to use llama-stack's built-in agent framework effectively. It demonstrates basic single-tool usage and shows how to switch between a local development environment and a remote Kubernetes cluster.

## Overview

This tutorial walks you through building an AI agent that can search the web, in three steps:
1. Connecting to a llama-stack server (remote or local).
2. Configuring an agent for tool use.
3. Running the agent.


## Prerequisites

Before starting, ensure you have the following:
- Access to a remote cluster or a local Podman setup.
- A Tavily API key. You can register for one at [https://tavily.com/](https://tavily.com/).


## Building a Web Search Agent
### 1. Connecting to a llama-stack server.
#### 1.1 Setting Up the Environment
- Import the necessary libraries.
- Define user variables.
- Set your environment variables: use [`.env.example`](../../../.env.example) as a template to create a new file called `.env`, and add all the relevant environment variables.
- Connection options
    - **Remote Server**: Set `remote=True` and provide the remote llama-stack endpoint.
    - **Local Server**: Set `remote=False` and provide the local llama-stack endpoint (default port: 8321).
    
    For detailed llama-stack server setup instructions, refer to:
    - [Remote Setup Guide](../../../kubernetes/README.md)
    - [Local Setup Guide](../../../local_setup_guide.md)

In [2]:
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.event_logger import EventLogger
from llama_stack_client.lib.agents.react.agent import ReActAgent
from llama_stack_client.lib.agents.react.tool_parser import ReActOutput
from llama_stack_client import LlamaStackClient
import os
import sys
sys.path.append('..')  
from src.utils import step_printer
from src.client_tools import get_location
from termcolor import cprint
from dotenv import load_dotenv
load_dotenv()

remote = True # Use the `remote` variable to switch between a local development environment and a remote Kubernetes cluster.
stream_output = False # Set to True to stream the output of the agent.
inference_model = "ibm-granite/granite-3.2-8b-instruct"
print(f"Model: {inference_model}")

tavily_search_api_key = os.getenv("TAVILY_SEARCH_API_KEY")  # necessary for websearch tool
if remote:
    base_url = os.getenv("REMOTE_BASE_URL")
else:
    base_url = "http://localhost:8321"

Model: ibm-granite/granite-3.2-8b-instruct
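
The cell above passes whatever `os.getenv("REMOTE_BASE_URL")` returns straight through, so a missing variable only surfaces later as a connection error. A minimal sketch of a fail-fast alternative follows. The helper name `resolve_base_url` and its `env` parameter are hypothetical and are not part of llama-stack.

```python
import os

# Default local llama-stack endpoint used in this tutorial.
DEFAULT_LOCAL_URL = "http://localhost:8321"


def resolve_base_url(remote, env=None):
    """Return the llama-stack endpoint to use, failing early if it is missing.

    Illustrative helper only; it is not a llama-stack API.
    """
    env = os.environ if env is None else env
    if not remote:
        return DEFAULT_LOCAL_URL
    url = env.get("REMOTE_BASE_URL", "").strip()
    if not url:
        raise ValueError("REMOTE_BASE_URL is not set; add it to your .env file")
    return url


# Local mode never needs the environment variable.
print(resolve_base_url(remote=False, env={}))
```

Passing `env` explicitly keeps the helper easy to test; in the notebook you would call `resolve_base_url(remote)` after `load_dotenv()`.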


#### 1.2 Client Initialization
- Initialize the `LlamaStackClient` with the appropriate base URL and provider data.  
- Connect the llama-stack client to the llama-stack server using the base URL.

In [3]:
client = LlamaStackClient(
        base_url=base_url,
        provider_data={"tavily_search_api_key": tavily_search_api_key}
)
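
Before configuring an agent, it can help to confirm that the server actually serves the chosen model. The sketch below assumes that `client.models.list()` returns objects with an `identifier` attribute; verify this against your installed llama-stack-client version. The helper `model_is_served` is a hypothetical name, and a stand-in client is used so the example runs without a server.

```python
from types import SimpleNamespace


def model_is_served(client, model_id):
    """Return True if `model_id` is among the models the server reports.

    Assumes `client.models.list()` yields objects with an `identifier`
    attribute; check this against your llama-stack-client version.
    """
    return any(m.identifier == model_id for m in client.models.list())


# Stand-in for a real LlamaStackClient so the sketch runs without a server.
fake_client = SimpleNamespace(
    models=SimpleNamespace(
        list=lambda: [SimpleNamespace(identifier="ibm-granite/granite-3.2-8b-instruct")]
    )
)

print(model_is_served(fake_client, "ibm-granite/granite-3.2-8b-instruct"))
print(model_is_served(fake_client, "some/other-model"))
```

With a live server, pass the `client` created above instead of `fake_client`.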

### 2. Configuring an agent for tool use.
- **Agent Initialization**: Create an `Agent` instance with the desired model, instructions, and tools.

- **Instructions**: The `instructions` parameter, also referred to as the system prompt, specifies the agent's role and behavior. In this example, the agent is configured as a helpful web search assistant. It is instructed to use a tool whenever a web search is required and to respond in a friendly and helpful tone.

- **Tools**: The `tools` parameter defines the tools available to the agent. In this case, the `builtin::websearch` tool is used, which enables the agent to perform web searches. This tool is essential for retrieving up-to-date information from the web. Internally, it leverages Tavily Search to execute the search queries efficiently.

- **How It Works**: When a user query is provided, the agent processes the input and determines whether a tool is required to fulfill the request. If the query involves retrieving information from the web, the agent invokes the `builtin::websearch` tool. The tool interacts with Tavily Search to fetch real-time data, which is then processed and returned to the user in a friendly and helpful tone. This workflow ensures that the agent can handle a wide range of queries effectively.

For more details on the `builtin::websearch` tool and its capabilities, refer to the [Llama-stack tools documentation](https://llama-stack.readthedocs.io/en/latest/building_applications/tools.html#web-search-providers). 
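
The inference, tool execution, inference pattern described above can be illustrated with a toy loop. This is a simulation under stated assumptions, not llama-stack's internal implementation: `fake_model`, `fake_web_search`, and `run_turn` are hypothetical stand-ins for the LLM, the search tool, and the agent turn.

```python
def fake_model(messages):
    """Stand-in for the LLM: request a search unless a tool result is present."""
    tool_results = [m["content"] for m in messages if m["role"] == "tool"]
    if tool_results:
        return {"content": f"Here is what I found: {tool_results[0]}", "tool_call": None}
    query = messages[-1]["content"]
    return {"content": None, "tool_call": {"name": "web_search", "arguments": {"query": query}}}


def fake_web_search(query):
    """Stand-in for the search tool; a real one would call the search provider."""
    return f"(stub results for '{query}')"


def run_turn(user_prompt, tools, max_steps=4):
    """Alternate inference and tool execution until the model answers."""
    messages = [{"role": "user", "content": user_prompt}]
    steps = []
    for _ in range(max_steps):
        reply = fake_model(messages)
        steps.append("inference")
        call = reply["tool_call"]
        if call is None:
            return reply["content"], steps
        output = tools[call["name"]](**call["arguments"])
        steps.append("tool_execution")
        messages.append({"role": "tool", "content": output})
    raise RuntimeError("model did not produce a final answer")


answer, steps = run_turn("latest updates on OpenShift", {"web_search": fake_web_search})
print(steps)
print(answer)
```

The three recorded steps mirror the step sequence that the agent prints for each query later in this tutorial.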

In [4]:
agent = Agent(
    client, 
    model=inference_model,
    instructions="""You are a helpful websearch assistant. When you are asked to search the web you must use a tool. 
            Whenever a tool is called, be sure to return the response in a friendly and helpful tone.
            """ ,
    tools=["builtin::websearch"],
    sampling_params={"max_tokens":4096}
)

### 3. Running the agent.
- Define user prompts to interact with the agent.
- Use the agent to create a session and process user queries.
- Display the agent's responses for each query.

In [5]:
user_prompts = [
    "What’s latest in OpenShift?",
    "Search for recent news articles about advancements in quantum computing.",
    "Find coffee shops near Boston Convention and Exhibition Center that are open now, have Wi-Fi.",
]
for prompt in user_prompts:
    print("\n"+"="*50)
    cprint(f"Processing user query: {prompt}", "blue")
    print("="*50)
    session_id = agent.create_session("web-session")
    response = agent.create_turn(
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
        session_id=session_id,
        stream=stream_output
    )
    if stream_output:
        for log in EventLogger().log(response):
            log.print()
    else:
        step_printer(response.steps) # print the steps of an agent's response in a formatted way. 


Processing user query: What’s latest in OpenShift?

---------- 📍 Step 1: InferenceStep ----------
🛠️ Tool call Generated:
Tool call: brave_search, Arguments: {'query': 'latest updates on OpenShift'}

---------- 📍 Step 2: ToolExecutionStep ----------
🔧 Executing tool...



---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
Here are the latest updates on OpenShift:

1. **Red Hat unveils OpenShift 4.18: Enhanced security and virtualization**
   - This version introduces improvements in network flexibility with User Defined Networks and Border Gateway Protocol for pods and virtual machines.
   - For a detailed list of updates and improvements, check out the [OpenShift 4.18 Release Notes](https://www.redhat.com/en/blog/what-you-need-to-know-red-hat-openshift-418).

2. **Experience the many enhancements in OpenShift 4.18**
   - This article provides an overview of the latest features, including updates for plug-ins and templates.
   - For a comprehensive review, refer to [What's new in Red Hat Developer Hub 1.4](https://developers.redhat.com/blog/2025/02/25/whats-new-developers-red-hat-openshift-418).

3. **Red Hat OpenShift 4.17: What you need to know**
   - OpenShift 4.17, based on Kubernetes 1.30 and CRI-O 1.30, offers expanded control p


---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
Here are some recent news articles about advancements in quantum computing:

1. **Quantum Computers News -- ScienceDaily**
   - [Read the latest news in developing quantum computers](https://www.sciencedaily.com/news/computers_math/quantum_computers/)
   - A team has developed a structure called a 3D photonic-crystal cavity that could enable transformative advancements in quantum computing.

2. **Quantum Computing News -- ScienceDaily**
   - [Read the latest about the development of quantum computers](https://www.sciencedaily.com/news/matter_energy/quantum_computing/)
   - The same team has again made headlines with their 3D photonic-crystal cavity, which could lead to significant advancements in quantum computing.

3. **Quantum computing | MIT News | Massachusetts Institute of Technology**
   - [A blueprint for making quantum computers easier to program](https://news.mit.edu/topic/quantum-computing)
   - A study fro


---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
Here are some coffee shops near the Boston Convention and Exhibition Center that are currently open and offer Wi-Fi:

1. **Cocorico Boulangerie** - Reviews on Yelp mention that they are open now, offer delivery and takeout, and have free Wi-Fi. You can find them at [this link](https://www.yelp.com/search?find_desc=Coffee&find_near=boston-convention-and-exhibition-center-boston).

2. **Blue Bottle Coffee** - Also listed on Yelp, they are open now, offer delivery and takeout, and have free Wi-Fi. Check out their details [here](https://www.yelp.com/search?find_desc=Coffee&find_near=boston-convention-and-exhibition-center-boston).

3. **Flour Bakery + Café** - This place is also mentioned on Yelp. They are open now, offer delivery and takeout, and have free Wi-Fi. You can find more information [here](https://www.yelp.com/search?find_desc=Coffee&find_near=boston-convention-and-exhibition-center-boston).

4. **Tatte Bakery

## Output Analysis
Here, we can observe that the `builtin::websearch` tool is used to perform a web search. The outputs are displayed in the notebook with color-coded text to help interpret the process:

- **Blue Text**: Represents the user's input or query.
- **Yellow Text**: Displays the LLM's inference response. 
- **Green Text**: Indicates the tool execution process, such as the tool being called and the query being sent to the web search API.
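
The colors come from ANSI escape sequences, which helpers such as termcolor's `cprint` wrap around the text. The following sketch reproduces the idea with the standard library only. `colorize` and `ANSI_CODES` are illustrative names, and plain-text exports of the notebook will not render the colors.

```python
# Standard ANSI foreground color codes.
ANSI_CODES = {"blue": 34, "yellow": 33, "green": 32}
RESET = "\033[0m"  # restores the terminal's default style


def colorize(text, color):
    """Wrap `text` in the ANSI escape sequence for `color`, then reset."""
    return f"\033[{ANSI_CODES[color]}m{text}{RESET}"


print(colorize("Processing user query: ...", "blue"))
print(colorize("Model response ...", "yellow"))
print(colorize("Executing tool ...", "green"))
```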

Great! 
We can see that the model returned some relevant and recent information about OpenShift. This was only possible due to its ability to call tools like the web search, demonstrating the agent's capacity to retrieve real-time data effectively.

For the second query, the model fetched the latest news articles, including one from April 2025. This is particularly impressive since the Granite 3.2 8B model, released in February 2025, has a knowledge cutoff date of April 2024. This demonstrates the agent's capability to extend beyond its static knowledge base by dynamically retrieving current information.  

For the third query about nearby coffee shops, the agent effectively used the `builtin::websearch` tool to provide practical, location-specific results with valid coffee shops and links, highlighting its real-world applicability.  

## Prompt chaining with websearch tool and client tool

In this section, we demonstrate a more sophisticated use case that combines location detection with web search. Instead of explicitly providing location information, we'll:

1. **Automatic Location Detection**: Use the `get_location` client tool to automatically determine the user's current location
2. **Contextual Search**: Leverage the detected location to find relevant nearby places


For example, when a user asks "Find coffee shops near me that are open now, have Wi-Fi", the agent will:
- First detect the user's current location
- Then use that location to search for nearby coffee shops
- Present a comprehensive response with relevant options

This demonstrates how the built-in websearch tool and a client tool can work together to provide intelligent, context-aware responses without requiring explicit location input from the user.
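
Prompt chaining depends on both turns sharing one session, so the second turn can see what the first one established. The toy example below illustrates that idea only. `ToySession` and `respond` are hypothetical stand-ins and do not reflect how llama-stack stores session state.

```python
class ToySession:
    """Keeps the running message history that every turn in a session can see."""

    def __init__(self):
        self.history = []

    def turn(self, prompt, respond):
        """Record the prompt, let `respond` read the full history, record its reply."""
        self.history.append({"role": "user", "content": prompt})
        reply = respond(self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply


def respond(history):
    """Stand-in for the agent: reuse a location mentioned earlier in the session."""
    latest = history[-1]["content"]
    if latest == "Where am I?":
        return "You are currently in Dublin."  # would come from a location tool
    known = [m for m in history if m["role"] == "assistant" and "Dublin" in m["content"]]
    place = "Dublin" if known else "an unknown location"
    return f"Searching for coffee shops near {place}"


shared = ToySession()
shared.turn("Where am I?", respond)
print(shared.turn("Find coffee shops near me", respond))

fresh = ToySession()  # a new session has no memory of the location
print(fresh.turn("Find coffee shops near me", respond))
```

This is why the cell below creates the session once, outside the loop over prompts.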

In [7]:
agent = Agent(
    client, 
    model=inference_model,
    instructions="""You are a helpful assistant. 
    When a user asks about their location, you MUST use the get_location tool.
    """ ,
    tools=[get_location, "builtin::websearch"],
    sampling_params={"max_tokens":4096}
)
user_prompts = [
    "Where am I?",
    "Find coffee shops near me that are open now, have Wi-Fi."
]
session_id = agent.create_session("web-session")  # for prompt chaining, queries must share the same session_id.
for prompt in user_prompts:
    print("\n"+"="*50)
    cprint(f"Processing user query: {prompt}", "blue")
    print("="*50)
    response = agent.create_turn(
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
        session_id=session_id,
        stream=stream_output
    )

    if stream_output:
        for log in EventLogger().log(response):
            log.print()
    else:
        step_printer(response.steps) # print the steps of an agent's response in a formatted way. 


Processing user query: Where am I?

---------- 📍 Step 1: InferenceStep ----------
🛠️ Tool call Generated:
Tool call: get_location, Arguments: {'query': 'current location'}

---------- 📍 Step 2: ToolExecutionStep ----------
🔧 Executing tool...



---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
You are currently in Dublin, Leinster, Ireland.


Processing user query: Find coffee shops near me that are open now, have Wi-Fi.

---------- 📍 Step 1: InferenceStep ----------
🛠️ Tool call Generated:
Tool call: brave_search, Arguments: {'query': 'coffee shops open now with Wi-Fi near Dublin'}

---------- 📍 Step 2: ToolExecutionStep ----------
🔧 Executing tool...



---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
Here are some coffee shops near Dublin that are open now and offer Wi-Fi:

1. [Best Coffee Shops Near Dublin, Dublin - Yelp](https://www.yelp.com/search?find_desc=Best+Coffee+Shops&find_loc=Dublin) - This link provides reviews on some of the best coffee shops in Dublin, including Accents Coffee & Tea Lounge, The Bakehouse, Queen of Tarts, Beanhive, and Kaph.

2. [BEST of Dublin, Republic of Ireland 24/7 Coffee - Yelp](https://www.yelp.com/search?find_desc=24/7+coffee&find_loc=Dublin) - This link lists top 10 24/7 coffee spots in Dublin, including Starbucks, Generator Dublin, Spar, Net House, Frank And Honest Gourmet Coffee Shop, Cloud Nine, Sophie's Bar, Tea Time Express, and Supermacs.

3. [Best Free Wi-Fi Cafes in Dublin - UPDATED: March, 2025](https://laptopfriendlycafe.com/free-wifi-cafes-in-dublin) - This article provides a list of cafes in Dublin where you can work or relax with free Wi-Fi.

4. [The 10 Best Pla

### ReAct Agent with websearch tool and client tool

This section demonstrates the ReAct (Reasoning and Acting) framework in action.

For example, when asked "Find coffee shops near me that are open now, have Wi-Fi", the agent:
1. **Reasons** it needs location information first
2. **Acts** by calling the `get_location` client tool
3. **Observes** the location result
4. **Reasons** about the next step
5. **Acts** by calling the `websearch` tool with the observed location
6. **Observes** and processes the search results

Unlike prompt chaining, which follows fixed steps, ReAct dynamically breaks down tasks and adapts its approach based on the results of each step. This makes it more flexible and capable of handling complex, real-world queries effectively.
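
The reason/act/observe cycle can be sketched as a small loop over JSON steps with `thought`, `action`, and `answer` fields, the same shape that appears in the agent output further down. Everything here is a toy under stated assumptions: `react_loop`, `params_to_kwargs`, `scripted_model`, and the stub tools are hypothetical, and the real `ReActAgent` may work differently internally.

```python
import json


def params_to_kwargs(tool_params):
    """Convert [{"name": ..., "value": ...}, ...] into keyword arguments."""
    return {p["name"]: p["value"] for p in tool_params}


def react_loop(model, tools, question, max_iters=5):
    """Run thought/action/observation cycles until the model returns an answer."""
    transcript = [f"Question: {question}"]
    for _ in range(max_iters):
        step = json.loads(model(transcript))
        transcript.append(f"Thought: {step['thought']}")
        if step["answer"] is not None:
            return step["answer"], transcript
        action = step["action"]
        kwargs = params_to_kwargs(action["tool_params"])
        observation = tools[action["tool_name"]](**kwargs)
        transcript.append(f"Observation: {observation}")
    raise RuntimeError("no answer within the iteration limit")


def scripted_model(transcript):
    """Stand-in for the LLM: choose the next step from the observations so far."""
    observations = [t.split(": ", 1)[1] for t in transcript if t.startswith("Observation:")]
    if not observations:
        return json.dumps({"thought": "I need the location first.",
                           "action": {"tool_name": "get_location", "tool_params": []},
                           "answer": None})
    if len(observations) == 1:
        query = f"coffee shops near {observations[0]}"
        return json.dumps({"thought": "Now I can search near that location.",
                           "action": {"tool_name": "web_search",
                                      "tool_params": [{"name": "query", "value": query}]},
                           "answer": None})
    return json.dumps({"thought": "I can answer now.", "action": None,
                       "answer": observations[1]})


# Stub tools; real ones would detect the location and query a search provider.
tools = {
    "get_location": lambda: "Dublin",
    "web_search": lambda query: f"stub results for '{query}'",
}
answer, transcript = react_loop(scripted_model, tools, "Find coffee shops near me")
print(answer)
```

Because the next action is chosen from the observations gathered so far, the number and order of tool calls is not fixed in advance, which is the main difference from prompt chaining.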

In [8]:
agent = ReActAgent(
    client=client,
    model=inference_model,
    tools=[get_location, "builtin::websearch"],
    response_format={
        "type": "json_schema",
        "json_schema": ReActOutput.model_json_schema(),
    },
    sampling_params={"max_tokens":40960},
)
user_prompts = [
    "Find coffee shops near me that are open now, have Wi-Fi."
]
session_id = agent.create_session("web-session")
for prompt in user_prompts:
    print("\n"+"="*50)
    cprint(f"Processing user query: {prompt}", "blue")
    print("="*50)
    response = agent.create_turn(
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
        session_id=session_id,
        stream=stream_output
    )
    if stream_output:
        for log in EventLogger().log(response):
            log.print()
    else:
        step_printer(response.steps) # print the steps of an agent's response in a formatted way. 


Processing user query: Find coffee shops near me that are open now, have Wi-Fi.

---------- 📍 Step 1: InferenceStep ----------
🤖 Model Response:
{
    "thought": "I need to find coffee shops near the user's current location that are open now and have Wi-Fi. I will use the `get_location` tool to get the user's current location and then use the `web_search` tool to find the coffee shops that meet the criteria.",
    "action": {
        "tool_name": "get_location",
        "tool_params": [{"name": "query", "value": "current location"}]
    },
    "answer": null
}

---------- 📍 Step 2: ToolExecutionStep ----------
🔧 Executing tool...



---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
{
    "thought": "Now that I have the user's location, I will use the `web_search` tool to find coffee shops near Dublin, Leinster, IE that are open now and have Wi-Fi.",
    "action": {
        "tool_name": "web_search",
        "tool_params": [{"name": "query", "value": "coffee shops open now with Wi-Fi near Dublin, Leinster, IE"}]
    },
    "answer": null
}

---------- 📍 Step 4: ToolExecutionStep ----------
🔧 Executing tool...



---------- 📍 Step 5: InferenceStep ----------
🤖 Model Response:
{
    "thought": "I have found several coffee shops near Dublin, Leinster, IE that are open now and have Wi-Fi. I will now summarize the information and provide the user with the top results.",
    "action": null,
    "answer": "Here are some coffee shops near Dublin, Leinster, IE that are open now and have Wi-Fi: \n1. Shoe Lane Coffee \n2. Beanhive \n3. Queen of Tarts \n4. Cocobrew \n5. 9th Degree Coffee Roasters \n6. Two Pups Coffee \n7. The Bald Barista \n8. The Bakehouse \n9. Keoghs Café \n10. Bread 41 \n\nYou can find more details and reviews on Yelp. Enjoy your coffee!"
}



## Key Takeaways
- This tutorial demonstrates how to set up and use the websearch tool with an AI agent.
- It highlights the flexibility of switching between remote and local environments.
- By following these steps, users can quickly get started with AI agents and explore their capabilities.
- It also showcases two more advanced approaches: prompt chaining and the ReAct agent.

For more advanced use cases, please refer to our other example Jupyter notebooks.