# Level 2: Simple Agentic 

This tutorial gives new users an overview of how to use llama-stack's built-in agent framework. It demonstrates basic single-tool usage and shows how to switch between a local development environment and a remote Kubernetes cluster.

## Overview

This tutorial walks you through building an AI agent that can search the web, in three steps:
1. Connecting to a llama-stack server (remote or local).
2. Configuring an agent for tool use.
3. Running the agent.


## Prerequisites

Before starting, ensure you have the following:
- Access to a remote cluster or a local Podman setup.
- API keys and user variables configured (e.g., `inference_model`, `TAVILY_SEARCH_API_KEY`, `LLAMA_STACK_ENDPOINT`).
- A Tavily API key is required. You can register for one at [https://tavily.com/](https://tavily.com/).
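
The user variables listed above can be read from environment variables instead of being hard-coded in the notebook. The following is a minimal sketch under that assumption: `load_settings` is a hypothetical helper (not part of llama-stack), and `INFERENCE_MODEL` is an assumed variable name, while `TAVILY_SEARCH_API_KEY` and `LLAMA_STACK_ENDPOINT` are the names listed above. The defaults are the values used in this tutorial.

```python
import os


def load_settings(env=None):
    """Collect the tutorial's user variables from a mapping of environment variables.

    `load_settings` is a hypothetical helper, not part of llama-stack.
    """
    env = os.environ if env is None else env
    settings = {
        # INFERENCE_MODEL is an assumed variable name; the default is this tutorial's model.
        "inference_model": env.get("INFERENCE_MODEL", "ibm-granite/granite-3.2-8b-instruct"),
        # Falls back to the local default port (8321) when no endpoint is set.
        "llama_stack_endpoint": env.get("LLAMA_STACK_ENDPOINT", "http://localhost:8321"),
        "tavily_search_api_key": env.get("TAVILY_SEARCH_API_KEY", ""),
    }
    if not settings["tavily_search_api_key"]:
        raise ValueError("TAVILY_SEARCH_API_KEY is not set; the web search tool needs it.")
    return settings
```

Failing early on a missing key keeps the error close to its cause, rather than surfacing later when the agent first tries to search.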


## Building a Web Search Agent
### 1. Connecting to a llama-stack server.
#### 1.1 Setting Up the Environment
- Import the necessary libraries.
- Define user variables.
- Choose a connection option:
    - **Remote Server**: Set `remote=True` and provide the remote llama-stack endpoint.
    - **Local Server**: Set `remote=False` and provide the local llama-stack endpoint (default port: 8321).
    
    For detailed llama-stack server setup instructions, refer to:
    - [Remote Setup Guide](https://github.com/opendatahub-io/llama-stack-on-ocp/tree/main/kubernetes)
    - [Local Setup Guide](https://github.com/opendatahub-io/llama-stack-demos/blob/main/local_setup_guide.md)

In [1]:
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.event_logger import EventLogger
from llama_stack_client import LlamaStackClient
import sys
sys.path.append('..')  
from src.utils import step_printer
from termcolor import cprint

inference_model = "ibm-granite/granite-3.2-8b-instruct"
tavily_search_api_key = "your-tavily-api-key"  # Replace with your Tavily API key

remote = True  # Use the `remote` variable to switch between a local development environment and a remote Kubernetes cluster.
remote_llama_stack_endpoint = "your-llama-stack-endpoint"  # Replace with your llama-stack endpoint; used when remote is True.
local_llama_stack_endpoint = "http://localhost:8321"  # Used when remote is False; 8321 is the default port.

stream_output = False # Set to True to stream the output of the agent.

print(f"Model: {inference_model}")

Model: ibm-granite/granite-3.2-8b-instruct


#### 1.2 Client Initialization
- Initialize the `LlamaStackClient` with the appropriate base URL and provider data.  
- Connect the llama-stack client to the llama-stack server using the base URL.

In [2]:
base_url = remote_llama_stack_endpoint if remote else local_llama_stack_endpoint
client = LlamaStackClient(
        base_url=base_url,
        provider_data={"tavily_search_api_key": tavily_search_api_key}
)
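
Because the placeholder `your-llama-stack-endpoint` is not a valid URL, it can help to validate the selected endpoint before constructing the client. The following is a minimal sketch using only the standard library; `resolve_base_url` is a hypothetical helper and not part of `llama_stack_client`.

```python
from urllib.parse import urlparse


def resolve_base_url(remote, remote_endpoint, local_endpoint="http://localhost:8321"):
    """Pick the endpoint for the chosen mode and reject values that are not http(s) URLs."""
    base_url = remote_endpoint if remote else local_endpoint
    parsed = urlparse(base_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not a valid llama-stack endpoint: {base_url!r}")
    # Strip a trailing slash so the value has a single canonical form.
    return base_url.rstrip("/")
```

In the cell above, this would be called as `resolve_base_url(remote, remote_llama_stack_endpoint, local_llama_stack_endpoint)` and the result passed as `base_url`.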

### 2. Configuring an agent for tool use.
- **Agent Initialization**: Create an `Agent` instance with the desired model, instructions, and tools.

- **Instructions**: The `instructions` parameter, also referred to as the system prompt, specifies the agent's role and behavior. In this example, the agent is configured as a helpful web search assistant. It is instructed to use a tool whenever a web search is required and to respond in a friendly and helpful tone.

- **Tools**: The `tools` parameter defines the tools available to the agent. In this case, the `builtin::websearch` tool is used, which enables the agent to perform web searches. This tool is essential for retrieving up-to-date information from the web. Internally, it leverages Tavily Search to execute the search queries efficiently.

- **How It Works**: When a user query is provided, the agent processes the input and determines whether a tool is required to fulfill the request. If the query involves retrieving information from the web, the agent invokes the `builtin::websearch` tool. The tool interacts with Tavily Search to fetch real-time data, which is then processed and returned to the user in a friendly and helpful tone. This workflow ensures that the agent can handle a wide range of queries effectively.

For more details on the `builtin::websearch` tool and its capabilities, refer to the [Llama-stack tools documentation](https://llama-stack.readthedocs.io/en/latest/building_applications/tools.html#web-search-providers). 
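
The workflow described above can be pictured as a short loop: infer, optionally execute a tool, then infer again with the tool result. The sketch below is a toy illustration in plain Python, not llama-stack's implementation; `fake_model`, `fake_websearch`, and `run_turn` are hypothetical stand-ins for the LLM, the search tool, and the agent loop.

```python
def fake_model(messages):
    """Stand-in for the LLM: request a search first, then answer from the tool result."""
    last = messages[-1]
    if last["role"] == "user":
        return {"tool_call": {"tool_name": "websearch", "arguments": {"query": last["content"]}}}
    return {"content": f"Here is what I found: {last['content']}"}


def fake_websearch(query):
    """Stand-in for the search tool; a real tool would call a search API."""
    return f"(search results for '{query}')"


def run_turn(prompt, model=fake_model, tools=None, max_steps=5):
    """Alternate inference and tool execution until the model returns a final answer."""
    tools = tools or {"websearch": fake_websearch}
    messages = [{"role": "user", "content": prompt}]
    steps = []
    for _ in range(max_steps):
        output = model(messages)
        steps.append("inference")
        call = output.get("tool_call")
        if call is None:
            # No tool requested: the model's content is the final answer.
            return output["content"], steps
        result = tools[call["tool_name"]](**call["arguments"])
        steps.append("tool_execution")
        # Feed the tool result back so the next inference can use it.
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("No final answer within max_steps")


answer, steps = run_turn("latest updates on OpenShift")
print(steps)   # ['inference', 'tool_execution', 'inference']
print(answer)
```

The three recorded steps correspond to the `InferenceStep`, `ToolExecutionStep`, `InferenceStep` sequence that the agent prints when it runs.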

In [3]:
agent = Agent(
    client, 
    model=inference_model,
    instructions="""You are a helpful websearch assistant. When you are asked to search the web you must use a tool. 
            Whenever a tool is called, be sure to return the response in a friendly and helpful tone.
            """ ,
    tools=["builtin::websearch"],
    sampling_params={"max_tokens":4096}
)

### 3. Running the agent.
- Define user prompts to interact with the agent.
- Use the agent to create a session and process user queries.
- Display the agent's responses for each query.

In [4]:
user_prompts = [
    "What’s latest in OpenShift?",
    "Search for recent news articles about advancements in quantum computing.",
    "Find coffee shops near Dublin city center that are open now, have Wi-Fi.",
]
for prompt in user_prompts:
    print("\n"+"="*50)
    cprint(f"Processing user query: {prompt}", "blue")
    print("="*50)
    session_id = agent.create_session("web-session")
    response = agent.create_turn(
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
        session_id=session_id,
        stream=stream_output
    )
    if stream_output:
        for log in EventLogger().log(response):
            log.print()
    else:
        step_printer(response.steps)  # Print the steps of the agent's response in a formatted way.


Processing user query: What’s latest in OpenShift?

---------- 📍 Step 1: InferenceStep ----------
🛠️ Tool call Generated:
Tool call: brave_search, Arguments: {'query': 'latest updates on OpenShift'}

---------- 📍 Step 2: ToolExecutionStep ----------
🔧 Executing tool...



---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
Here are the latest updates on OpenShift:

1. **Azure Red Hat OpenShift: April 2025 Update**
   - Boosts security with Managed Identity.
   - Expands global reach.
   - Enhances the platform for enterprise Kubernetes.
   - OpenShift 4.16 is now available for new ARO cluster installs with the latest features and improvements.

2. **OpenShift 4.18 Enhancements**
   - Red Hat Developer Hub, based on the Backstage project, provides software templates and plugins for OpenShift deployments, access to pipeline runs, viewing clusters from OCM, and more.
   - OpenShift Pipelines provide an integrated user experience with the developer perspective of the Red Hat OpenShift Container Platform web console.

3. **Red Hat OpenShift 4.17: What you need to know**
   - Based on Kubernetes 1.30 and CRI-O 1.30.
   - Features expanded control plane options, increased flexibility for virtualization and networking, and new capabilities to 


---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
Here are some recent news articles about advancements in quantum computing:

1. **Quantum Computing News -- ScienceDaily**
   - Title: "Quantum Computing News March 2, 2025 Top Headlines"
   - Content: This article covers several recent breakthroughs in quantum computing, including quantum interference in molecule-surface collisions, understanding the relationship between chirality and electric flow, and the development of a topological quantum processor.
   - [Read Full Story](https://www.sciencedaily.com/news/matter_energy/quantum_computing/)

2. **Recent Breakthroughs Accelerate The Race For Quantum Computing - Forbes**
   - Title: "Recent Breakthroughs Accelerate The Race For Quantum Computing"
   - Content: This piece discusses the rapid progress in the field, with major companies like Microsoft, Google, and IBM making significant strides. Microsoft's Majorana 1 chip, Google's Willow chip, and IBM's long-term qu


---------- 📍 Step 3: InferenceStep ----------
🤖 Model Response:
Here are some coffee shops near Dublin city center that are currently open and offer Wi-Fi:

1. **Keoghs Café**
   - Address: South Inner City
   - Yelp Review: "Such a handy little coffee shop to get to in the city centre!"
   - Note: It's recommended to check their exact opening hours as they might vary.

2. **Cafes with Free Wi-Fi in Dublin**
   - Dublin is known for its numerous cafes offering free Wi-Fi, perfect for students, freelancers, and remote workers. They serve a variety of food and drinks to keep you energized.

3. **Industrial Design Coffee Shop**
   - This spot is popular among tech enthusiasts, serving award-winning coffee. It's a great place for a productive work session.

4. **Brindle Coffee & Wine**
   - Located in a terraced street, this café is perfect for people-watching while enjoying your coffee.

5. **Proper Order**
   - A favorite among Dublin's coffee lovers, it serves baked goods from No 

## Output Analysis
Here, we can observe that the `builtin::websearch` tool is used to perform a web search (the step log prints the call under the tool name `brave_search`). In the notebook, the outputs are color-coded to help interpret the process:

- **Blue Text**: Represents the user's input or query.
- **Yellow Text**: Displays the LLM's inference response. 
- **Green Text**: Indicates the tool execution process, such as the tool being called and the query being sent to the web search API.

For the first query, the model returned relevant and recent information about OpenShift. This was possible only because the agent can call tools such as web search, which lets it retrieve real-time data.

For the second query, the model fetched recent news articles, including content dated March 2025. This is notable because the Granite 3.2 8B model, released in February 2025, has a knowledge cutoff date of April 2024. It shows that the agent can go beyond the model's static knowledge by retrieving current information at query time.

For the third query about nearby coffee shops, the agent effectively used the `builtin::websearch` tool to provide practical, location-specific results with valid coffee shops and links, highlighting its real-world applicability.  
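
Rather than reading the printed steps, the tools used in a turn can also be listed programmatically. The sketch below is a hedged illustration: it assumes each step exposes a `step_type` attribute and that tool execution steps carry a `tool_calls` list whose items have a `tool_name`, which may differ across `llama_stack_client` versions. `tools_used` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for a real `response.steps`.

```python
from types import SimpleNamespace


def tools_used(steps):
    """Return the names of tools invoked in a turn, in call order."""
    names = []
    for step in steps:
        # Assumed attribute names; check them against your client version.
        if getattr(step, "step_type", None) != "tool_execution":
            continue
        for call in getattr(step, "tool_calls", None) or []:
            names.append(call.tool_name)
    return names


# Stand-in objects shaped like the steps printed above (not real API responses).
example_steps = [
    SimpleNamespace(step_type="inference"),
    SimpleNamespace(
        step_type="tool_execution",
        tool_calls=[SimpleNamespace(tool_name="brave_search")],
    ),
    SimpleNamespace(step_type="inference"),
]
print(tools_used(example_steps))  # ['brave_search']
```

With a non-streaming response, the corresponding call would be `tools_used(response.steps)`.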

## Key Takeaways
- This tutorial demonstrates how to set up and use a single tool with an AI agent.
- It highlights the flexibility of switching between remote and local environments.
- By following these steps, users can quickly get started with AI agents and explore their capabilities.

For more advanced use cases, please check our other example Jupyter notebooks.