# **When to use an agentic framework**

An **agentic framework** is not always necessary when building an application with an LLM. In some cases, simple predefined workflows or a chain of prompts are sufficient. A framework becomes useful when the workflow is complex, such as when:

- The LLM needs to **call functions** or **interact with multiple agents**.
- You want more **flexibility** and **control in the process**.

### **Key elements of an agentic system**

An agentic system relies on several essential components to ensure flexibility and control in the LLM workflow. Here are the main elements:

1. **LLM Engine**
- The heart of the system, responsible for generating text and decisions.
- This can be a model such as GPT, LLaMA, or a custom LLM.

2. **Tools accessible to the agent**
- The agent can call APIs, databases, or execute custom functions.
- This extends the capabilities of the model beyond simple text generation.

3. **Tool call parser**
- It analyzes the LLM output to detect and interpret requests to execute tools.
- It facilitates automation and integration with external systems.

4. **System prompt synchronized with the parser**
- A well-structured prompt guides the behavior of the agent.
- It must be aligned with the parser to ensure consistent responses.

5. **Memory**
- Allows the agent to maintain context between interactions.
- Can be based on temporary sessions or persistent storage.

6. **Error Handling and Retry Mechanism**
- Implements checks to reduce LLM errors.
- Retry mechanisms improve system reliability.

These components make an agent more powerful and capable of handling complex tasks autonomously and efficiently.

<p><em><strong>Parsing in an agentic</strong></em> context is used to help agents understand intent, actions, and structured data, so they know what response to generate or what tool to use.</p>
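To make the parser and its link to the system prompt more concrete, here is a minimal sketch. It assumes a hypothetical convention in which the system prompt asks the model to emit tool calls as `Action: {...}` followed by a JSON object; the names `ACTION_PATTERN`, `parse_tool_call`, `dispatch`, and `add` are illustrative and do not come from any particular framework.

```python
import json
import re
from typing import Any, Callable, Dict, Optional

# Hypothetical convention: the system prompt tells the model to emit tool calls as
#   Action: {"name": "<tool>", "arguments": {...}}
ACTION_PATTERN = re.compile(r"Action:\s*(\{.*\})", re.DOTALL)


def parse_tool_call(llm_output: str) -> Optional[Dict[str, Any]]:
    """Return the tool call found in the model output, or None if there is none."""
    match = ACTION_PATTERN.search(llm_output)
    if match is None:
        return None
    try:
        call = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None  # malformed JSON: the caller can ask the model to retry
    if not isinstance(call, dict) or "name" not in call:
        return None
    call.setdefault("arguments", {})
    return call


def dispatch(call: Dict[str, Any], tools: Dict[str, Callable[..., Any]]) -> Any:
    """Look up the requested tool by name and run it with the parsed arguments."""
    if call["name"] not in tools:
        raise KeyError(f"Unknown tool: {call['name']}")
    return tools[call["name"]](**call["arguments"])


def add(a: float, b: float) -> float:
    return a + b


output = 'Thought: I need a sum.\nAction: {"name": "add", "arguments": {"a": 2, "b": 3}}'
call = parse_tool_call(output)
print(dispatch(call, {"add": add}))  # prints 5
```

If the prompt and the parser disagree on the format (for example, the prompt asks for `Action:` but the parser looks for another marker), `parse_tool_call` returns `None`, which is where a retry mechanism would typically step in.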



### **Main Agentic Architectures**

1️⃣ **Prompt Chaining** → Splits a task into multiple sequential steps, improving precision and control.
**Example**: Writing a document starting from an outline, then reviewing and completing it.

2️⃣ **Routing** → Routes different inputs to specialized processes, optimizing responses.
**Example**: Splitting support requests between technical support, refunds, and general information.

3️⃣ **Parallelization** → Splits a task into multiple instances running concurrently, speeding up the process.
**Example**: Analyzing code with multiple agents to find vulnerabilities.

4️⃣ **Orchestrator-Workers** → A primary LLM dynamically splits the work among secondary agents.
**Example**: A programming agent that modifies multiple files at once based on specific needs.

5️⃣ **Evaluator-Optimizer** → One LLM generates a response, another evaluates it and improves it iteratively.
**Example**: Literary translations with multiple review cycles for higher quality.
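
Two of the patterns above, prompt chaining and routing, can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: `fake_llm`, `prompt_chain`, and `route` are hypothetical names, the model call is stubbed out, and routing uses a simple keyword match where a real system would more likely ask an LLM to classify the request.

```python
from typing import Callable, Dict, List


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption: a real system would call an LLM here)."""
    return f"[model output for: {prompt}]"


def prompt_chain(task: str, steps: List[str], llm: Callable[[str], str]) -> str:
    """Prompt chaining: feed each step's output into the next step's prompt."""
    result = task
    for instruction in steps:
        result = llm(f"{instruction}\n\n{result}")
    return result


def route(request: str, handlers: Dict[str, Callable[[str], str]], default: str) -> str:
    """Routing: pick a specialized handler based on keywords found in the request."""
    lowered = request.lower()
    for keyword, handler in handlers.items():
        if keyword in lowered:
            return handler(request)
    return handlers[default](request)


handlers = {
    "refund": lambda request: "refunds team",
    "error": lambda request: "technical support",
    "info": lambda request: "general information",
}
print(route("I want a refund for my order", handlers, default="info"))  # prints refunds team
print(prompt_chain("agentic frameworks", ["Write an outline about:", "Expand this outline:"], fake_llm))
```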

## **What is smolagents?**
smolagents is an **open-source library by Hugging Face** designed to **create AI agents** in a lightweight and efficient way. This module provides an overview of its main features, comparing it to other frameworks such as LlamaIndex and LangGraph.

The goal is to learn how to build AI agents that can:
- Search data
- Execute code
- Interact with web pages
- Combine multiple agents for advanced solutions

1️⃣ **Why use smolagents?**
**smolagents is just one of the many agentic frameworks available**. This section analyzes the advantages and disadvantages of smolagents compared to alternatives such as LlamaIndex and LangGraph, helping you choose the right framework based on your project.

2️⃣ **CodeAgents**
CodeAgents are **agents that generate Python code to perform actions**.
Unlike other agents that produce simple text responses, these directly perform programmatic operations.
They are ideal for software development and automation tasks.

3️⃣ **ToolCallingAgents**
Unlike CodeAgents, which produce Python code, ToolCallingAgents **generate JSON or text that is interpreted by the system to perform actions**.
They are useful when the text output needs to be processed by other parts of the system.

4️⃣ **Tools**
Tools are **functions** that an LLM can call within an agentic system.
This section shows how to create tools using the Tool class or the @tool decorator, as well as how to share and use community tools.

5️⃣ **Retrieval Agents**
These agents allow models to access knowledge databases, **retrieving and synthesizing information**.
They use vector stores and exploit the Retrieval-Augmented Generation (RAG) pattern to improve AI responses.
They are **particularly useful for combining web search with personalized knowledge**.

6️⃣ **Multi-agent systems**
An **advanced system can combine multiple agents with different roles** (e.g. a web search agent with a code execution agent).
This section shows how to design and manage multi-agent systems to improve their efficiency and reliability.

7️⃣ **Vision and Browser Agents**
Vision Agents use **Vision-Language Models (VLM) to analyze images and visual data**.
This section explores how to integrate agents with vision capabilities for image understanding, visual data analysis and multimodal interactions.
The creation of a Browser Agent, capable of navigating the web and extracting information autonomously, is also discussed.

### **Creating Code-Using Agents with smolagents**
**What are Code Agents?**

**Code Agents** are the **default agent type in smolagents**. They **generate Python code to perform actions**, ensuring efficiency, expressiveness, and precision. Compared to JSON-based approaches, code:

- Is more composable, reusable, and flexible.
- Allows you to directly handle complex objects (e.g. images, data structures).
- Is more natural for LLMs, as models are trained on huge amounts of code.

### **How Do Code Agents Work?**
A CodeAgent follows a multi-step execution cycle:

- **Records the system prompt** and the user task.
- **Converts the agent’s memory** into **messages** that the model can read.
- The **model generates a Python code snippet** as a response.
- The **code is executed**.
- The **results** are **saved in memory**.
- If the **code includes function calls, they are automatically executed** before moving on to the next step.
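
The cycle above can be sketched as a toy loop. This is an illustration under stated assumptions, not the smolagents implementation: `run_code_agent` and `scripted_model` are hypothetical names, the "model" is a scripted stand-in for an LLM, and the generated code is run with plain `exec`, which is not sandboxed and should not be used on untrusted output.

```python
import contextlib
import io
from typing import Callable, Dict, List


def run_code_agent(
    task: str,
    model: Callable[[List[Dict[str, str]]], str],
    tools: Dict[str, Callable],
    max_steps: int = 5,
):
    """Toy multi-step loop: memory -> messages -> code -> execution -> memory."""
    memory = [
        {"role": "system", "content": "Reply with Python code. Call final_answer(x) when done."},
        {"role": "user", "content": task},
    ]
    answer = {}

    def final_answer(value):
        answer["value"] = value

    # Variables defined by one step stay available to later steps.
    namespace = {**tools, "final_answer": final_answer}

    for _ in range(max_steps):
        code = model(memory)  # the model reads the whole memory as messages
        memory.append({"role": "assistant", "content": code})
        stdout = io.StringIO()
        try:
            with contextlib.redirect_stdout(stdout):
                exec(code, namespace)  # NOT sandboxed: illustration only
            observation = stdout.getvalue()
        except Exception as exc:  # feed errors back so the model can retry
            observation = f"Error: {exc!r}"
        memory.append({"role": "user", "content": f"Observation: {observation}"})
        if "value" in answer:
            return answer["value"]
    return None


def scripted_model(messages: List[Dict[str, str]]) -> str:
    """Hypothetical stand-in for an LLM: compute first, then give the final answer."""
    steps_taken = sum(message["role"] == "assistant" for message in messages)
    script = ["total = add(30, 60) + add(45, 45)\nprint(total)", "final_answer(total)"]
    return script[steps_taken]


print(run_code_agent("How many minutes of preparation?", scripted_model, {"add": lambda a, b: a + b}))  # prints 180
```

Keeping a single `namespace` across steps is what lets the second snippet reuse `total` from the first one.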

### **Practical Examples of Code Agents**

**Installation and Authentication** on Hugging Face

```python
%pip install smolagents -U

from huggingface_hub import login
login()
```



### **Selecting a Party Playlist with DuckDuckGo**
To find the best party playlists, Alfred uses a CodeAgent with the DuckDuckGo search tool.

```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())

agent.run("Search for the best music recommendations for a party at the Wayne's mansion.")
```

Output:

```text
['September - Earth, Wind & Fire (1978)',
 'Uptown Funk - Mark Ronson ft. Bruno Mars (2014)',
 'TQG - KAROL G & Shakira (2023)',
 'TRUSTFALL - P!nk (2023)',
 'Flowers - Miley Cyrus (2023)',
 'Old School R&B Mix 2024 | BEST 90s & 2000s R&B Party Songs',
 'Norton Smull Band - Passaic County Parks & Recreation (Local Band Recommendation)']
```

### **Calculate Preparation Times with Python**

```python
from smolagents import CodeAgent, HfApiModel
import datetime

agent = CodeAgent(tools=[], model=HfApiModel(), additional_authorized_imports=['datetime'])

agent.run(
    """
    Alfred needs to prepare for the party. Here are the tasks:
    1. Prepare the drinks - 30 minutes
    2. Decorate the mansion - 60 minutes
    3. Set up the menu - 45 minutes
    4. Prepare the music and playlist - 45 minutes

    If we start right now, at what time will the party be ready?
    """
)
```

Output:

```text
'2025-02-27 17:35:28'
```

### **ToolCallingAgents in smolagents: Code vs. JSON Actions**
smolagents supports **two main types of AI agents**: **CodeAgents** and **ToolCallingAgents**.

**CodeAgents** 
- They **generate and execute Python code snippets** to call tools and perform actions.

```python
for query in [
    "Best catering services in Gotham City", 
    "Party theme ideas for superheroes"
]:
    print(web_search(f"Search for: {query}"))
```

**ToolCallingAgents**
- Use **JSON blobs to describe tool calls**, **without** generating code.

```json
[
    {"name": "web_search", "arguments": "Best catering services in Gotham City"},
    {"name": "web_search", "arguments": "Party theme ideas for superheroes"}
]
```

### **When to Use CodeAgents or ToolCallingAgents**
- Use **CodeAgents** when you need **complex actions**, **variable management**, or **more flexibility**.
- Use **ToolCallingAgents** to perform **simple operations**, such as **querying APIs** or **performing web searches**.

## **Tools in smolagents**
In **smolagents**, **AI agents** can perform **actions using tools**. A **tool** is essentially a **function that an LLM can call to perform an operation**. Each tool has:

- **Name**: Unique identifier.
- **Description**: Explains the function of the tool.
- **Input**: Parameters required for execution.
- **Output**: Type of expected result.
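
As a rough illustration of how these four attributes can be derived from an ordinary Python function, the sketch below reads them from the function's name, docstring, and type hints using the standard `inspect` module. `describe_tool` and `celsius_to_fahrenheit` are hypothetical names chosen for this example; this is not how any specific library builds its tool metadata.

```python
import inspect
from typing import Any, Callable, Dict


def describe_tool(func: Callable[..., Any]) -> Dict[str, Any]:
    """Collect name, description, inputs, and output type from a plain function."""
    signature = inspect.signature(func)
    inputs = {
        name: getattr(param.annotation, "__name__", str(param.annotation))
        for name, param in signature.parameters.items()
    }
    output = signature.return_annotation
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputs": inputs,
        "output_type": getattr(output, "__name__", str(output)),
    }


def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32


print(describe_tool(celsius_to_fahrenheit))
```

A description like this is what ends up in the prompt, so clear names, docstrings, and type hints directly affect how well the model chooses and calls a tool.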

### **Methods for Creating a Tool**

1. Using the **@tool** decorator
For simple tools, you can use the `@tool` decorator. It relies on the function's name, type hints, and docstring (including the `Args:` descriptions) to describe the tool to the LLM.

```python
from smolagents import CodeAgent, HfApiModel, tool

# Let's pretend we have a function that fetches the highest-rated catering services.
@tool
def catering_service_tool(query: str) -> str:
    """
    This tool returns the highest-rated catering service in Gotham City.

    Args:
        query: A search term for finding catering services.
    """
    # Example list of catering services and their ratings
    services = {
        "Gotham Catering Co.": 4.9,
        "Wayne Manor Catering": 4.8,
        "Gotham City Events": 4.7,
    }

    # Find the highest rated catering service (simulating search query filtering)
    best_service = max(services, key=services.get)

    return best_service


agent = CodeAgent(tools=[catering_service_tool], model=HfApiModel())

# Run the agent to find the best catering service
result = agent.run(
    "Can you give me the name of the highest-rated catering service in Gotham City?"
)

print(result)   # Output: Gotham Catering Co.
```



### **Creating a Tool as a Python Class**
For more advanced tools, you can use a **Tool class**, which provides more detailed metadata.

```python
from smolagents import Tool, CodeAgent, HfApiModel

class SuperheroPartyThemeTool(Tool):
    name = "superhero_party_theme_generator"
    description = """
    This tool suggests creative superhero-themed party ideas based on a category.
    It returns a unique party theme idea."""

    inputs = {
        "category": {
            "type": "string",
            "description": "The type of superhero party (e.g., 'classic heroes', 'villain masquerade', 'futuristic Gotham').",
        }
    }

    output_type = "string"

    def forward(self, category: str) -> str:
        # Keys are lowercase because the lookup below lowercases the category
        themes = {
            "classic heroes": "Justice League Gala: Guests come dressed as their favorite DC heroes with themed cocktails like 'The Kryptonite Punch'.",
            "villain masquerade": "Gotham Rogues' Ball: A mysterious masquerade where guests dress as classic Batman villains.",
            "futuristic gotham": "Neo-Gotham Night: A cyberpunk-style party inspired by Batman Beyond, with neon decorations and futuristic gadgets."
        }

        return themes.get(category.lower(), "Themed party idea not found. Try 'classic heroes', 'villain masquerade', or 'futuristic Gotham'.")

# Instantiate the tool
party_theme_tool = SuperheroPartyThemeTool()
agent = CodeAgent(tools=[party_theme_tool], model=HfApiModel())

# Run the agent to generate a party theme idea
result = agent.run(
    "What would be a good superhero party idea for a 'villain masquerade' theme?"
)

print(result)  # Output: "Gotham Rogues' Ball: A mysterious masquerade where guests dress as classic Batman villains."
```

### **smolagents Default Toolbox**

**smolagents** provides a collection of built-in tools ready for use:

- **PythonInterpreterTool** → Executes Python code.
- **FinalAnswerTool** → Ends execution with a final answer.
- **UserInputTool** → Asks for additional input from the user.
- **DuckDuckGoSearchTool** → Searches for information online.
- **GoogleSearchTool** → Performs Google searches.
- **VisitWebpageTool** → Visits and analyzes web content.


### **Building Agentic RAG Systems**
**Retrieval-Augmented Generation (RAG)** combines information retrieval with text generation to provide more accurate, contextualized answers.
Agentic RAG improves on the traditional model by introducing autonomous agents capable of controlling information retrieval, optimizing queries, and validating results.

Traditional RAG has two main limitations: it performs only one retrieval step, and it relies on direct semantic similarity to the user query. Agentic RAG addresses both by autonomously formulating queries, critiquing the results, and iterating the retrieval for more precise responses.

### **Personalization with Knowledge Base**

**Create a knowledge base** to provide **contextualized answers**.
The example below uses a BM25 retriever, which ranks documents by keyword overlap rather than by vector embeddings, to search for ideas in a predefined knowledge base.

```python
%pip install langchain_community
%pip install rank_bm25
%pip install "smolagents[litellm]" matplotlib geopandas shapely kaleido -q

from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
from smolagents import Tool
from langchain_community.retrievers import BM25Retriever
from smolagents import CodeAgent, HfApiModel

class PartyPlanningRetrieverTool(Tool):
    name = "party_planning_retriever"
    description = "Uses BM25 keyword search to retrieve relevant party planning ideas for Alfred’s superhero-themed party at Wayne Manor."
    inputs = {
        "query": {
            "type": "string",
            "description": "The query to perform. This should be a query related to party planning or superhero themes.",
        }
    }
    output_type = "string"

    def __init__(self, docs, **kwargs):
        super().__init__(**kwargs)
        self.retriever = BM25Retriever.from_documents(
            docs, k=5  # Retrieve the top 5 documents
        )

    def forward(self, query: str) -> str:
        assert isinstance(query, str), "Your search query must be a string"

        docs = self.retriever.invoke(
            query,
        )
        return "\nRetrieved ideas:\n" + "".join(
            [
                f"\n\n===== Idea {str(i)} =====\n" + doc.page_content
                for i, doc in enumerate(docs)
            ]
        )

# Simulate a knowledge base about party planning
party_ideas = [
    {"text": "A superhero-themed masquerade ball with luxury decor, including gold accents and velvet curtains.", "source": "Party Ideas 1"},
    {"text": "Hire a professional DJ who can play themed music for superheroes like Batman and Wonder Woman.", "source": "Entertainment Ideas"},
    {"text": "For catering, serve dishes named after superheroes, like 'The Hulk's Green Smoothie' and 'Iron Man's Power Steak.'", "source": "Catering Ideas"},
    {"text": "Decorate with iconic superhero logos and projections of Gotham and other superhero cities around the venue.", "source": "Decoration Ideas"},
    {"text": "Interactive experiences with VR where guests can engage in superhero simulations or compete in themed games.", "source": "Entertainment Ideas"}
]

source_docs = [
    Document(page_content=doc["text"], metadata={"source": doc["source"]})
    for doc in party_ideas
]

# Split the documents into smaller chunks for more efficient search
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
    add_start_index=True,
    strip_whitespace=True,
    separators=["\n\n", "\n", ".", " ", ""],
)
docs_processed = text_splitter.split_documents(source_docs)

# Create the retriever tool
party_planning_retriever = PartyPlanningRetrieverTool(docs_processed)

# Initialize the agent
agent = CodeAgent(tools=[party_planning_retriever], model=HfApiModel())

# Example usage
response = agent.run(
    "Find ideas for a luxury superhero-themed party, including entertainment, catering, and decoration options."
)

print(response)
```




**Advanced Strategies of Agentic RAG**
- **Query Reformulation** → The agent can rewrite the query for more precise results.
- **Multi-Step Retrieval** → Iterates over searches, progressively improving the response.
- **Source Integration** → Combines multiple sources (web search, knowledge base, internal documentation).
- **Result Validation** → Checks the relevance and accuracy of information before using it.
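
Three of these strategies (query reformulation, multi-step retrieval, and result validation) can be combined in a small loop. The sketch below is only an illustration: `keyword_retrieve` is a toy word-overlap retriever, and the `reformulate` and `is_relevant` callbacks are hypothetical stand-ins for steps that an LLM-driven agent would perform.

```python
from typing import Callable, List

KNOWLEDGE_BASE = [
    "Hire a professional DJ who can play themed music.",
    "Serve dishes named after superheroes.",
    "Decorate with iconic superhero logos.",
]


def keyword_retrieve(query: str, docs: List[str]) -> List[str]:
    """Toy retriever: return documents sharing at least one word with the query."""
    terms = set(query.lower().split())
    return [doc for doc in docs if terms & set(doc.lower().rstrip(".").split())]


def agentic_retrieve(
    question: str,
    docs: List[str],
    reformulate: Callable[[str, int], str],
    is_relevant: Callable[[str, str], bool],
    max_rounds: int = 3,
) -> List[str]:
    """Retry retrieval with rewritten queries and keep only validated results."""
    kept: List[str] = []
    query = question
    for round_index in range(max_rounds):
        for doc in keyword_retrieve(query, docs):
            if doc not in kept and is_relevant(question, doc):  # result validation
                kept.append(doc)
        if kept:
            break  # stop as soon as a round produced validated results
        query = reformulate(question, round_index)  # query reformulation
    return kept


# Hypothetical rewrites that an LLM might propose when the first query finds nothing.
rewrites = ["catering dishes menu", "snacks"]
results = agentic_retrieve(
    "What food should we offer?",
    KNOWLEDGE_BASE,
    reformulate=lambda question, i: rewrites[min(i, len(rewrites) - 1)],
    is_relevant=lambda question, doc: True,
)
print(results)  # prints ['Serve dishes named after superheroes.']
```

The original question shares no words with the knowledge base, so the first round returns nothing; the rewritten query is what surfaces the catering idea.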

**Conclusion:** Agentic RAG is an **advanced evolution of classic RAG**, enhancing **information retrieval and content generation** through **intelligent agents**. By **combining web search**, a **knowledge base**, and **autonomous query refinement**, the system provides more accurate, detailed, and contextualized responses.

- **Tools** → Add functions to agents (database, search, API, etc.)
- **RAG** → Retrieves data and generates more contextualized responses
- **Agentic RAG** → Agents improve the process by optimizing queries and verifying results
- **Prompt Engineering** → Agents rewrite and refine the prompt to improve the accuracy of the response

This approach allows for the creation of adaptive AI agents that respond accurately and dynamically depending on the context.

### **Multi-Agent Systems**

**Multi-agent systems** allow multiple **specialized agents** to **collaborate on complex tasks**, improving modularity, scalability, and robustness.

Instead of relying on a single agent, **tasks are distributed among agents** with specific capabilities.

🔹 **Structure of a Multi-Agent System**
An **Orchestrator Agent** coordinates **multiple agents with distinct functions**. For example:

- **Manager Agent** → Delegates tasks to other agents.
- **Code Interpreter Agent** → Executes Python code.
- **Web Search Agent** → Retrieves information online.

🔹 **Example: a Multi-Agent System for a Batman Investigation**
The goal is to find Batman filming locations, calculate the cargo plane transfer time to each one, and represent the data on a map.

**Tools Used**
- **DuckDuckGoSearchTool** → To search the web.
- **VisitWebpageTool** → To extract content from web pages.
- **calculate_cargo_travel_time** → To calculate the travel time by cargo plane.
- **pandas / geopandas / shapely** → For data manipulation and map creation.

```python
import math
from typing import Optional, Tuple

from smolagents import tool


@tool
def calculate_cargo_travel_time(
    origin_coords: Tuple[float, float],
    destination_coords: Tuple[float, float],
    cruising_speed_kmh: Optional[float] = 750.0,  # Average speed for cargo planes
) -> float:
    """
    Calculate the travel time for a cargo plane between two points on Earth using great-circle distance.

    Args:
        origin_coords: Tuple of (latitude, longitude) for the starting point
        destination_coords: Tuple of (latitude, longitude) for the destination
        cruising_speed_kmh: Optional cruising speed in km/h (defaults to 750 km/h for typical cargo planes)

    Returns:
        float: The estimated travel time in hours

    Example:
        >>> # Chicago (41.8781° N, 87.6298° W) to Sydney (33.8688° S, 151.2093° E)
        >>> result = calculate_cargo_travel_time((41.8781, -87.6298), (-33.8688, 151.2093))
    """

    def to_radians(degrees: float) -> float:
        return degrees * (math.pi / 180)

    # Extract coordinates
    lat1, lon1 = map(to_radians, origin_coords)
    lat2, lon2 = map(to_radians, destination_coords)

    # Earth's radius in kilometers
    EARTH_RADIUS_KM = 6371.0

    # Calculate great-circle distance using the haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = (
        math.sin(dlat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    )
    c = 2 * math.asin(math.sqrt(a))
    distance = EARTH_RADIUS_KM * c

    # Add 10% to account for non-direct routes and air traffic controls
    actual_distance = distance * 1.1

    # Calculate flight time
    # Add 1 hour for takeoff and landing procedures
    flight_time = (actual_distance / cruising_speed_kmh) + 1.0

    # Format the results
    return round(flight_time, 2)


print(calculate_cargo_travel_time((41.8781, -87.6298), (-33.8688, 151.2093)))
```

Output:

```text
22.82
```


```python
import os
from PIL import Image
from smolagents import CodeAgent, GoogleSearchTool, HfApiModel, VisitWebpageTool

model = HfApiModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct", provider="together")
```

```python
task = """Find all Batman filming locations in the world, calculate the time to transfer via cargo plane to here (we're in Gotham, 40.7128° N, 74.0060° W), and return them to me as a pandas dataframe.
Also give me some supercar factories with the same cargo plane transfer time."""
```

```python
agent = CodeAgent(
    model=model,
    tools=[GoogleSearchTool(), VisitWebpageTool(), calculate_cargo_travel_time],
    additional_authorized_imports=["pandas"],
    max_steps=20,
)

result = agent.run(task)
result
```


### **Splitting the task between two agents**
Multi-agent structures make it possible to **separate memories between different sub-tasks**, with two major benefits:

- **Each agent is more focused on its core task**, and therefore performs better.
- **Separating memories reduces the count of input tokens** at each step, thus reducing latency and cost.

Let’s create a team with a dedicated web search agent, managed by another agent.

The manager agent should have plotting capabilities to write its final report, so let us give it access to additional imports, including plotly, and geopandas + shapely for spatial plotting.

```python
model = HfApiModel(
    "Qwen/Qwen2.5-Coder-32B-Instruct", provider="together", max_tokens=8096
)

web_agent = CodeAgent(
    model=model,
    tools=[
        GoogleSearchTool(),
        VisitWebpageTool(),
        calculate_cargo_travel_time,
    ],
    name="web_agent",
    description="Browses the web to find information",
    verbosity_level=0,
    max_steps=10,
)
```

```python
# The manager agent gets the stronger DeepSeek-R1 model, plus a planning_interval.
# `os` and `Image` (from PIL) were imported in an earlier cell.

from smolagents.utils import encode_image_base64, make_image_url
from smolagents import OpenAIServerModel


def check_reasoning_and_plot(final_answer, agent_memory):
    multimodal_model = OpenAIServerModel("gpt-4o", max_tokens=8096)
    filepath = "saved_map.png"
    assert os.path.exists(filepath), "Make sure to save the plot under saved_map.png!"
    image = Image.open(filepath)
    # Each fragment ends with a space so the concatenated sentences do not run together
    prompt = (
        f"Here is a user-given task and the agent steps: {agent_memory.get_succinct_steps()}. Now here is the plot that was made. "
        "Please check that the reasoning process and plot are correct: do they correctly answer the given task? "
        "First list reasons why yes/no, then write your final decision: PASS in caps lock if it is satisfactory, FAIL if it is not. "
        "Don't be harsh: if the plot mostly solves the task, it should pass. "
        "To pass, a plot should be made using px.scatter_map and not any other method (scatter_map looks nicer)."
    )
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt,
                },
                {
                    "type": "image_url",
                    "image_url": {"url": make_image_url(encode_image_base64(image))},
                },
            ],
        }
    ]
    output = multimodal_model(messages).content
    print("Feedback: ", output)
    if "FAIL" in output:
        raise Exception(output)
    return True


manager_agent = CodeAgent(
    model=HfApiModel("deepseek-ai/DeepSeek-R1", provider="together", max_tokens=8096),
    tools=[calculate_cargo_travel_time],
    managed_agents=[web_agent],
    additional_authorized_imports=[
        "geopandas",
        "plotly",
        "shapely",
        "json",
        "pandas",
        "numpy",
    ],
    planning_interval=5,
    verbosity_level=2,
    final_answer_checks=[check_reasoning_and_plot],
    max_steps=15,
)

manager_agent.visualize()
```

```python
manager_agent.run("""
Find all Batman filming locations in the world, calculate the time to transfer via cargo plane to here (we're in Gotham, 40.7128° N, 74.0060° W).
Also give me some supercar factories with the same cargo plane transfer time. You need at least 6 points in total.
Represent this as spatial map of the world, with the locations represented as scatter points with a color that depends on the travel time, and save it to saved_map.png!

Here's an example of how to plot and return a map:
import plotly.express as px
df = px.data.carshare()
fig = px.scatter_map(df, lat="centroid_lat", lon="centroid_lon", text="name", color="peak_hour", size=100,
     color_continuous_scale=px.colors.sequential.Magma, size_max=15, zoom=1)
fig.show()
fig.write_image("saved_map.png")
final_answer(fig)

Never try to process strings using code: when you have a string to read, just print it and you'll see it.
""")
```

### **Vision Agents with smolagents**
**Vision Agents in smolagents** enable agents to **process and interpret images**, extending their capabilities beyond just text. This is **essential for tasks such as web navigation and visual recognition**.