# Level 2: Simple Agentic 

This tutorial gives new users an overview of how to use llama-stack's built-in agent framework effectively. It demonstrates basic single-tool usage and shows how to switch between a local development environment and a remote Kubernetes cluster.

## Overview

This tutorial walks you through building your own AI agent that can search the web, in three steps:
1. Connecting to a llama-stack server (remote or local).
2. Configuring an agent for tool use.
3. Running the agent.


## Prerequisites

Before starting, ensure you have the following:
- Access to a remote cluster or a local Podman setup.
- A Tavily API key. You can register for one at [https://tavily.com/](https://tavily.com/).

### Setting your ENV variables:

Use [`.env.example`](../../../.env.example) as a template to create a new file called `.env`, and add the relevant environment variables listed below.

- `REMOTE_BASE_URL` (string): the URL of your llama-stack instance, if you are using a remote connection.
- `TAVILY_SEARCH_API_KEY` (string): your API key for Tavily search. You can get one at https://tavily.com/home.
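
As a quick sanity check, the variables above can be validated in plain Python before any client is created. The following is a minimal sketch rather than part of the tutorial code: the helper name `resolve_settings` is hypothetical, and the local default URL assumes the default llama-stack port (8321) mentioned in the setup section of this notebook.

```python
import os


def resolve_settings(remote, env=None):
    """Return (base_url, tavily_key), failing early if a required variable is missing.

    `resolve_settings` is a hypothetical helper used for illustration only.
    """
    env = os.environ if env is None else env

    tavily_key = env.get("TAVILY_SEARCH_API_KEY")
    if not tavily_key:
        raise ValueError("TAVILY_SEARCH_API_KEY is not set")

    if remote:
        base_url = env.get("REMOTE_BASE_URL")
        if not base_url:
            raise ValueError("REMOTE_BASE_URL is not set but remote=True")
    else:
        # Assumed default for a local llama-stack server.
        base_url = "http://localhost:8321"

    return base_url, tavily_key
```

Failing early here tends to give a clearer error than a connection failure or a rejected search request later in the notebook.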

### Installing dependencies

This code requires `llama-stack` and `llama-stack-client`, both at version `0.2.2`. Let's begin by installing them:


In [None]:
# !pip install llama-stack-client==0.2.2 llama-stack==0.2.2

## Building a Web Search Agent
### 1. Connecting to a llama-stack server.
#### 1.1 Setting Up the Environment
- Import the necessary libraries.
- Define user variables.
- Choose a connection option:
    - **Remote Server**: Set `remote=True` and provide the remote llama-stack endpoint.
    - **Local Server**: Set `remote=False` and provide the local llama-stack endpoint (default port: 8321).
    
    For detailed llama-stack server setup instructions, refer to:
    - [Remote Setup Guide](https://github.com/opendatahub-io/llama-stack-on-ocp/tree/main/kubernetes)
    - [Local Setup Guide](https://github.com/opendatahub-io/llama-stack-demos/blob/main/local_setup_guide.md)

In [5]:
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.event_logger import EventLogger
from llama_stack_client.lib.agents.react.agent import ReActAgent
from llama_stack_client.lib.agents.react.tool_parser import ReActOutput
from llama_stack_client import LlamaStackClient
import os
import sys
sys.path.append('..')  
from src.utils import step_printer
from termcolor import cprint
from dotenv import load_dotenv
load_dotenv()

remote = True # Set to False to use a local development environment instead of a remote Kubernetes cluster.
stream_output = False # Set to True to stream the output of the agent.
inference_model = "ibm-granite/granite-3.2-8b-instruct"
print(f"Model: {inference_model}")

tavily_search_api_key = os.getenv("TAVILY_SEARCH_API_KEY")  # necessary for websearch tool
if remote:
    base_url = os.getenv("REMOTE_BASE_URL")
else:
    base_url = "http://localhost:8321"

Model: ibm-granite/granite-3.2-8b-instruct


#### 1.2 Client Initialization
- Initialize the `LlamaStackClient` with the appropriate base URL and provider data.  
- Connect the llama-stack client to the llama-stack server using the base URL.

In [6]:
client = LlamaStackClient(
        base_url=base_url,
        provider_data={"tavily_search_api_key": tavily_search_api_key}
)
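
Before building the agent, it can help to confirm that the server is reachable and that the chosen model is registered. The sketch below assumes that `client.models.list()` returns objects exposing an `identifier` attribute, which may differ between client versions; `model_is_available` is a hypothetical helper name, not part of `llama-stack-client`.

```python
def model_is_available(client, model_id):
    """Return True if `model_id` is among the models the server reports.

    Assumption: `client.models.list()` yields objects with an `identifier`
    attribute. Adjust the attribute name if your client version differs.
    """
    identifiers = {model.identifier for model in client.models.list()}
    return model_id in identifiers


# Example usage with the `client` and `inference_model` defined above:
# if not model_is_available(client, inference_model):
#     raise RuntimeError(f"{inference_model} is not registered on the server")
```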

### 2. Configuring an agent for tool use.
- **Agent Initialization**: Create an `Agent` instance with the desired model, instructions, and tools.

- **Instructions**: The `instructions` parameter, also referred to as the system prompt, specifies the agent's role and behavior. In this example, the agent is configured as a helpful web search assistant. It is instructed to use a tool whenever a web search is required and to respond in a friendly and helpful tone.

- **Tools**: The `tools` parameter defines the tools available to the agent. In this case, the `builtin::websearch` tool is used, which enables the agent to perform web searches. This tool is essential for retrieving up-to-date information from the web. Internally, it leverages Tavily Search to execute the search queries efficiently.

- **How It Works**: When a user query is provided, the agent processes the input and determines whether a tool is required to fulfill the request. If the query involves retrieving information from the web, the agent invokes the `builtin::websearch` tool. The tool queries Tavily Search for real-time data, and the model then uses the results to write its reply in a friendly and helpful tone. This is what lets the agent answer questions that fall outside the model's training data.

For more details on the `builtin::websearch` tool and its capabilities, refer to the [Llama-stack tools documentation](https://llama-stack.readthedocs.io/en/latest/building_applications/tools.html#web-search-providers). 
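
The inference, tool execution, inference cycle described above can be illustrated with a small self-contained loop. This is a conceptual sketch only, not llama-stack's implementation: `fake_model`, `fake_web_search`, and `run_turn` are hypothetical stand-ins for the LLM, the search tool, and the agent loop.

```python
def fake_model(messages):
    """Stand-in for the LLM: request a search first, then answer from the result."""
    last = messages[-1]
    if last["role"] == "user":
        # First inference step: the model decides that a tool is needed.
        return {"tool_call": {"name": "web_search", "query": last["content"]}}
    # Second inference step: the model answers using the tool output.
    return {"content": f"Here is what I found: {last['content']}"}


def fake_web_search(query):
    """Stand-in for the search tool; a real agent would call a search API here."""
    return f"(search results for '{query}')"


def run_turn(user_prompt, model=fake_model, tools=None, max_steps=5):
    """Alternate between inference and tool execution until the model answers."""
    tools = tools or {"web_search": fake_web_search}
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_steps):
        output = model(messages)
        if "tool_call" not in output:
            return output["content"]  # final answer, no tool requested
        call = output["tool_call"]
        result = tools[call["name"]](call["query"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("model did not produce a final answer")


print(run_turn("latest updates on OpenShift"))
# prints: Here is what I found: (search results for 'latest updates on OpenShift')
```

The `max_steps` bound is a design choice in this sketch: it keeps a model that keeps requesting tools from looping forever.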

In [16]:
agent = Agent(
    client, 
    model=inference_model,
    instructions="""You are a helpful websearch assistant. When you are asked to search the web you must use a tool. 
            Whenever a tool is called, be sure to return the response in a friendly and helpful tone.
            """ ,
    tools=["builtin::websearch"],
    sampling_params={"max_tokens":4096}
)

### 3. Running the agent.
- Define user prompts to interact with the agent.
- Use the agent to create a session and process user queries.
- Display the agent's responses for each query.
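
The display step relies on `step_printer` from `src.utils`, which is not shown in this notebook. As a rough idea of what such a helper might do, the sketch below turns a list of steps into one summary line each. The function name `summarize_steps` is hypothetical, and the attribute names (`api_model_response`, `tool_calls`, `tool_name`, `content`) are assumptions about the step objects returned by the client; check them against your installed version.

```python
def summarize_steps(steps):
    """Return one short line per agent step (hypothetical helper).

    Assumptions: inference steps expose `api_model_response` with `content`
    and `tool_calls`; any step without it is treated as a tool execution.
    """
    lines = []
    for number, step in enumerate(steps, start=1):
        kind = type(step).__name__
        response = getattr(step, "api_model_response", None)
        if response is None:
            lines.append(f"Step {number} ({kind}): executing tool")
        elif response.tool_calls:
            call = response.tool_calls[0]
            lines.append(f"Step {number} ({kind}): tool call {call.tool_name}")
        else:
            # Truncate long model responses to keep the summary compact.
            lines.append(f"Step {number} ({kind}): model response {response.content[:60]!r}")
    return lines
```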

In [17]:
user_prompts = [
    "What’s latest in OpenShift?",
    "Search for recent news articles about advancements in quantum computing.",
    "Find coffee shops near Dublin city centre that are open now, have Wi-Fi.",
]
for prompt in user_prompts:
    print("\n"+"="*50)
    cprint(f"Processing user query: {prompt}", "blue")
    print("="*50)
    session_id = agent.create_session("web-session")
    response = agent.create_turn(
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
        session_id=session_id,
        stream=stream_output
    )
    if stream_output:
        for log in EventLogger().log(response):
            log.print()
    else:
        step_printer(response.steps) # print the steps of an agent's response in a formatted way. 


Processing user query: What’s latest in OpenShift?

---------- 📍 Step 1: InferenceStep ----------
🛠️ Tool call Generated:
Tool call: brave_search, Arguments: {'query': 'latest updates on OpenShift'}

---------- 📍 Step 2: ToolExecutionStep ----------
🔧 Executing tool...



---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
Here are the latest updates on OpenShift:

1. **Azure Red Hat OpenShift: April 2025 Update**
   - Boosts security with Managed Identity.
   - Expands global reach.
   - Enhances the platform for enterprise Kubernetes.
   - OpenShift 4.16 is now available for new ARO cluster installs with the latest features and improvements.

2. **Red Hat OpenShift 4.18: Enhanced Security and Virtualization**
   - OLM (Operator Lifecycle Manager) enables complementary solutions like Red Hat Advanced Cluster Management, Red Hat Advanced Cluster Security, Red Hat OpenShift Service Mesh, and Red Hat OpenShift Virtualization.
   - The latest release introduces cert-manager Operator for Red Hat OpenShift v1.15 with a technology preview of the istio-csr agent for seamless integration with Red Hat OpenShift Service Mesh.

3. **Red Hat OpenShift 4.17: What you need to know**
   - Based on Kubernetes 1.30 and CRI-O 1.30.
   - Features expande


---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
Here are some recent news articles about advancements in quantum computing:

1. **Quantum Computing News -- ScienceDaily**
   - Title: "Quantum Computing News March 2, 2025 Top Headlines"
   - Content: This article covers several recent breakthroughs in quantum computing, including quantum interference in molecule-surface collisions, understanding the relationship between chirality and electric flow, and the development of a topological quantum processor.
   - [Read Full Story](https://www.sciencedaily.com/news/matter_energy/quantum_computing/)

2. **Recent Breakthroughs Accelerate The Race For Quantum Computing - Forbes**
   - Title: "Recent Breakthroughs Accelerate The Race For Quantum Computing"
   - Content: This piece discusses the rapid progress in the field, with major companies like Microsoft, Google, and IBM making significant strides. It highlights Microsoft's Majorana 1 chip, Google's Willow chip, and IBM'


---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
Here are some coffee shops near Dublin city centre that are open now and offer Wi-Fi:

1. **BOI Hub, Grand Canal Square**: This spot by the bank offers free Wi-Fi and comfortable seating. You'll need to grab your coffee from a nearby outlet. It's a great place for a productive work session. [More info](https://lovin.ie/travel-1/cities/10-best-places-to-grab-a-coffee-and-work-in-dublin-with-free-wi-fi)

2. **The Fumbally Cafe**: Known for its incredible salads, 3FE coffee, and free Wi-Fi, this cafe is perfect for those who want to work or study while enjoying their coffee. They're also dog-friendly. [More info](https://lovindublin.com/dublin/best-places-work-remotely-dublin-free-wifi)

3. **Various Cafes in Dublin**: Dublin is filled with cafes that offer free Wi-Fi, good coffee, and sometimes even plugs for charging your devices. Some popular ones include Brindle Coffee & Wine, Proper Order, and many more. [More info

## Output Analysis
Here, we can observe that the `builtin::websearch` tool is used to perform a web search. The outputs are displayed in the notebook with color-coded text to help interpret the process:

- **Blue Text**: Represents the user's input or query.
- **Yellow Text**: Displays the LLM's inference response. 
- **Green Text**: Indicates the tool execution process, such as the tool being called and the query being sent to the web search API.

Great! 
We can see that the model returned some relevant and recent information about OpenShift. This was only possible due to its ability to call tools like the web search, demonstrating the agent's capacity to retrieve real-time data effectively.

For the second query, the model fetched the latest news articles, including 2 from March 2025. This is particularly impressive since the Granite 3.2 8B model, released in February 2025, has a knowledge cutoff date of April 2024. This demonstrates the agent's capability to extend beyond its static knowledge base by dynamically retrieving current information.  

For the third query about nearby coffee shops, the agent effectively used the `builtin::websearch` tool to provide practical, location-specific results with valid coffee shops and links, highlighting its real-world applicability.  

## Testing location-based queries


### Basic web search tool agent in one shot

In [18]:
agent = Agent(
    client, 
    model=inference_model,
    instructions="""You are a helpful websearch assistant. When you are asked to search the web you must use a tool. 
            Whenever a tool is called, be sure to return the response in a friendly and helpful tone.
            """ ,
    tools=["builtin::websearch"],
    sampling_params={"max_tokens":4096}
)
user_prompts = [
    "Find coffee shops near me that are open now, have Wi-Fi.",
]
for prompt in user_prompts:
    print("\n"+"="*50)
    cprint(f"Processing user query: {prompt}", "blue")
    print("="*50)
    session_id = agent.create_session("web-session")
    response = agent.create_turn(
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
        session_id=session_id,
        stream=stream_output
    )
    if stream_output:
        for log in EventLogger().log(response):
            log.print()
    else:
        step_printer(response.steps) # print the steps of an agent's response in a formatted way. 


Processing user query: Find coffee shops near me that are open now, have Wi-Fi.

---------- 📍 Step 1: InferenceStep ----------
🛠️ Tool call Generated:
Tool call: brave_search, Arguments: {'query': 'coffee shops open now with Wi-Fi near me'}

---------- 📍 Step 2: ToolExecutionStep ----------
🔧 Executing tool...



---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
Here are some coffee shops near you that are currently open and offer Wi-Fi:

1. **THE BEST 10 Coffee & Tea in DEARBORN, MI - Yelp**
   - They are open now and offer delivery, takeout, free Wi-Fi, and outdoor seating. They also have vegan breakfast options, cozy couches, and pastries.
   - [Check them out on Yelp](https://www.yelp.com/search?cflt=coffee&find_loc=Dearborn,+MI)

2. **Haraz Coffee House**
   - Their Dearborn location is open now and closes at 11PM. They also offer delivery and takeout.
   - [View their menu and order online](https://haraz-coffee.myshopify.com/pages/locations-and-hours)

3. **Qahwah - Qahwah House**
   - This coffee shop is highly rated for its cozy atmosphere, freshly brewed coffee, and careful bean sourcing.
   - [Learn more about Qahwah House](https://qahwahhouse.com/)

4. **Biggby Coffee (Halal) - CoffeeSpots.com**
   - Known for great taste, quality, and a friendly team. They have a

### Web search agent + Prompt chaining 

In [19]:
agent = Agent(
    client, 
    model=inference_model,
    instructions="""You are a helpful websearch assistant. When you are asked to search the web you must use a tool. 
            Whenever a tool is called, be sure to return the response in a friendly and helpful tone.
            """ ,
    tools=["builtin::websearch"],
    sampling_params={"max_tokens":4096}
)
user_prompts = [
    "Where am i?",
    "Im in Dublin, Ireland",
    "Find coffee shops near me that are open now, have Wi-Fi."
]
for prompt in user_prompts:
    print("\n"+"="*50)
    cprint(f"Processing user query: {prompt}", "blue")
    print("="*50)
    session_id = agent.create_session("web-session")
    response = agent.create_turn(
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
        session_id=session_id,
        stream=stream_output
    )
    if stream_output:
        for log in EventLogger().log(response):
            log.print()
    else:
        step_printer(response.steps) # print the steps of an agent's response in a formatted way. 


Processing user query: Where am i?

---------- 📍 Step 1: InferenceStep ----------
🤖 Model Response:
I'm sorry for any confusion, but as a web search assistant, I don't have the ability to determine your physical location. I can help you find information about your location if you provide me with a city, country, or landmark.


Processing user query: Im in Dublin, Ireland

---------- 📍 Step 1: InferenceStep ----------
🤖 Model Response:
Sure, I can help with that! Here are some current happenings in Dublin, Ireland:

1. **Dublin Pride**: This year's Dublin Pride Parade is scheduled for September 24th. It's a celebration of the LGBTQ+ community and a great event to attend if you're in town.

2. **Trinity College Dublin Open Days**: If you're interested in academia, Trinity College is hosting open days for prospective students. These events usually occur in September and October.

3. **Dublin Theatre Festival**: This annual event showcases a variety of the


---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
Here are some coffee shops near you that are currently open and offer Wi-Fi:

1. **THE BEST 10 Coffee & Tea in DEARBORN, MI - Yelp**
   - They are open now and offer delivery, takeout, free Wi-Fi, and outdoor seating. They also have vegan breakfast options, cozy couches, and pastries.
   - [Check them out on Yelp](https://www.yelp.com/search?cflt=coffee&find_loc=Dearborn,+MI)

2. **Haraz Coffee House**
   - Their Dearborn location is open now and closes at 11PM. They also offer delivery and takeout.
   - [View their menu and order online](https://haraz-coffee.myshopify.com/pages/locations-and-hours)

3. **Qahwah - Qahwah House**
   - This coffee shop is highly rated for its cozy atmosphere, freshly brewed coffee, and careful bean sourcing.
   - [Learn more about Qahwah House](https://qahwahhouse.com/)

4. **Biggby Coffee (Halal) - CoffeeSpots.com**
   - Known for great taste, quality, and a friendly team. They have a
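
In the run above, the final response still lists coffee shops in Dearborn, MI, even though the second prompt stated the user's location. One likely explanation is that the loop calls `agent.create_session` for every prompt, so each turn starts without the earlier messages. The sketch below assumes that turns sent to the same `session_id` share conversation history; it creates the session once and reuses it. `run_conversation` is a hypothetical helper name, and no output from such a run is shown here.

```python
def run_conversation(agent, prompts, session_name="web-session", stream=False):
    """Send every prompt to a single session so later turns can see earlier ones.

    Hypothetical helper: `agent` is expected to offer `create_session` and
    `create_turn` with the same arguments used in the cells above.
    """
    session_id = agent.create_session(session_name)  # created once, outside the loop
    responses = []
    for prompt in prompts:
        response = agent.create_turn(
            messages=[{"role": "user", "content": prompt}],
            session_id=session_id,
            stream=stream,
        )
        responses.append(response)
    return responses
```

With the non-streaming setting used in this notebook, each returned response could then be passed to `step_printer(response.steps)` as before.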

### Web search ReAct Agent

In [None]:
agent = ReActAgent(
            client=client,
            model=inference_model,
            instructions="""You are a helpful websearch assistant. 
            When you are asked to search the web you must use a tool. 
            You are a helpful assistant that uses reasoning to solve problems step by step. Break down complex problems into simpler steps.
            """,
            tools=["builtin::websearch"],
            response_format={
                "type": "json_schema",
                "json_schema": ReActOutput.model_json_schema(),
            },
            sampling_params={"max_tokens":4096},
        )
user_prompts = [
    "Find coffee shops near me that are open now, have Wi-Fi."
]
for prompt in user_prompts:
    print("\n"+"="*50)
    cprint(f"Processing user query: {prompt}", "blue")
    print("="*50)
    session_id = agent.create_session("web-session")
    response = agent.create_turn(
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
        session_id=session_id,
        stream=stream_output
    )
    if stream_output:
        for log in EventLogger().log(response):
            log.print()
    else:
        step_printer(response.steps) # print the steps of an agent's response in a formatted way. 


Processing user query: Find coffee shops near me that are open now, have Wi-Fi.

---------- 📍 Step 1: InferenceStep ----------
🤖 Model Response:
{
  "thought": "To find coffee shops near the user that are open now and have Wi-Fi, I will use the 'brave_search' tool. I will formulate a search query that includes the user's location, the requirement for the shops to be open now, and the need for Wi-Fi. I will then execute the search and present the results to the user.",
  "action": {
    "tool_name": "brave_search",
    "tool_params": [
      {
        "name": "query",
        "value": "coffee shops open now with Wi-Fi near me"
      }
    ]
  }

,
  "answer": "I'm currently searching for coffee shops that are open now and offer Wi-Fi near your location. Please wait a moment while I gather this information."
}
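
The ReAct response above is a JSON object with `thought`, `action`, and `answer` keys. A minimal sketch of reading such a response with the standard library follows; `parse_react_output` is a hypothetical helper, and the example string is a shortened, made-up response in the same shape rather than actual model output.

```python
import json


def parse_react_output(raw):
    """Split a ReAct-style JSON string into thought, tool call, and answer.

    The keys mirror the response printed above; missing keys become None.
    """
    data = json.loads(raw)
    action = data.get("action") or {}
    # Flatten the list of {"name": ..., "value": ...} pairs into a dict.
    params = {p["name"]: p["value"] for p in (action.get("tool_params") or [])}
    return {
        "thought": data.get("thought"),
        "tool_name": action.get("tool_name"),
        "tool_params": params,
        "answer": data.get("answer"),
    }


# Illustrative input only, shaped like the response above.
example = """
{
  "thought": "I should search the web.",
  "action": {
    "tool_name": "brave_search",
    "tool_params": [{"name": "query", "value": "coffee shops open now with Wi-Fi near me"}]
  },
  "answer": null
}
"""
parsed = parse_react_output(example)
print(parsed["tool_name"], parsed["tool_params"])
# prints: brave_search {'query': 'coffee shops open now with Wi-Fi near me'}
```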



## Key Takeaways
- This tutorial demonstrates how to set up and use a single tool with an AI agent.
- It highlights the flexibility of switching between remote and local environments.
- By following these steps, users can quickly get started with AI agents and explore their capabilities.

For more advanced use cases, please check our other example Jupyter notebooks.