In this hands-on we will use the `smolagents` library developed by Hugging Face. We chose it for its simplicity, its support for any LLM hosted on the Hugging Face Hub, and its built-in Code Agents (more on that later).

<img src="https://camo.githubusercontent.com/c6efa99360afde7cf829dff3cad81e56573658c1843464dff1fbb30a8f63b082/68747470733a2f2f68756767696e67666163652e636f2f64617461736574732f68756767696e67666163652f646f63756d656e746174696f6e2d696d616765732f7265736f6c76652f6d61696e2f736d6f6c6167656e74732f736d6f6c6167656e74732e706e67" alt="drawing" width="400"/>

[Smolagents Documentation](https://huggingface.co/docs/smolagents/en/index)

In this hands-on we will:
- Understand why it's helpful to have agentic capabilities
- Understand how to use the `smolagents` library
- Understand the difference between a Tool Calling Agent and a Code Agent
- Implement a custom Agent leveraging the RAG pipeline that we implemented before

## 1. Setting up the environment

We will use `gpt-4.1-mini` as our LLM for this hands-on, although any model available on the Hugging Face Hub would also work.

In [None]:
import sys
sys.path.append("../../")

In [None]:
!pip install -r ../../requirements.txt

In [None]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass()

In [None]:
from smolagents import LiteLLMModel
from IPython.display import display_markdown

model = LiteLLMModel(model_id="gpt-4.1-mini")

In [None]:
def compute_agent_cost(agent, model_name="gpt-4.1-mini"):
    """Estimate the cost in USD from the token counts tracked by the agent's monitor."""
    # (input, output) prices in USD per token
    prices = {
        "gpt-4o-mini": (0.15 / 1_000_000, 0.6 / 1_000_000),
        "gpt-4o": (2.5 / 1_000_000, 10 / 1_000_000),
        "gpt-4.1-mini": (0.40 / 1_000_000, 1.6 / 1_000_000),
    }
    # Fail with a clear message instead of an UnboundLocalError on unknown models
    if model_name not in prices:
        raise ValueError(f"No pricing information for model '{model_name}'")
    input_token_price, output_token_price = prices[model_name]
    token_counts = agent.monitor.get_total_token_counts()
    return (
        input_token_price * token_counts["input"]
        + output_token_price * token_counts["output"]
    )

## 2. Let's create our first Agent

### 2.1 Tool Calling Agent

In [None]:
from smolagents import ToolCallingAgent

# This is as simple as
agent = ToolCallingAgent(
    tools=[],
    model=model,
    verbosity_level=0,
    description="An agent that is capable of searching the web",
)

In [None]:
output = agent.run("What can you do?")
display_markdown(output, raw=True)
print("Cost of the agent: ", compute_agent_cost(agent))

OK, so our agent says that it can help us answer questions. Let's see how it goes.

In [None]:
output = agent.run(
    "Can you visit https://www.swissinfo.ch/eng/ and tell me what are recent news?"
)
display_markdown(output, raw=True)
print("Cost of the agent: ", compute_agent_cost(agent))

In [None]:
agent.visualize()

In [None]:
agent.write_memory_to_messages()

It looks like we forgot to give our agent any tools. Let's add a tool for visiting web pages, using the `VisitWebpageTool` that ships with the `smolagents` library.

In [None]:
from smolagents import VisitWebpageTool

visit_webpage_tool = VisitWebpageTool()
agent = ToolCallingAgent(
    tools=[visit_webpage_tool],
    model=model,
    verbosity_level=0,
    description="An agent that is capable of searching the web",
)
agent.visualize()

In [None]:
output = agent.run(
    "Can you visit https://www.swissinfo.ch/eng/ and tell me what are recent news?"
)
display_markdown(output, raw=True)
print("Cost of the agent: ", compute_agent_cost(agent))

#### So now we have an agent that can answer questions using the LLM and also visit web pages for us.
#### Let's see what the agent does behind the scenes.

In [None]:
agent.write_memory_to_messages()

In [None]:
# Let's print the system prompt
print(agent.system_prompt)

## Tools

A tool is an atomic function that an agent can call. For an LLM to use it, the tool also needs a few attributes that make up its API and describe to the LLM how to call it:

- A name
- A description
- Input types and descriptions
- An output type
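
To make the role of these attributes concrete, here is a minimal sketch of how a tool's metadata could be rendered into text for an LLM prompt. The `ToolSpec` class and the `render_for_prompt` function are hypothetical names used for illustration only; they are not part of `smolagents`, whose actual prompt templates differ.

```python
from dataclasses import dataclass


@dataclass
class ToolSpec:
    """Hypothetical container for the four attributes listed above."""

    name: str
    description: str
    inputs: dict  # argument name -> {"type": ..., "description": ...}
    output_type: str


def render_for_prompt(spec: ToolSpec) -> str:
    """Turn a tool's metadata into a text block that an LLM could read."""
    lines = [f"- {spec.name}: {spec.description}", "  Takes inputs:"]
    for arg, meta in spec.inputs.items():
        lines.append(f"    {arg} ({meta['type']}): {meta['description']}")
    lines.append(f"  Returns an output of type: {spec.output_type}")
    return "\n".join(lines)


sum_spec = ToolSpec(
    name="sum",
    description="Adds two numbers.",
    inputs={
        "number_1": {"type": "number", "description": "The first number to add."},
        "number_2": {"type": "number", "description": "The second number to add."},
    },
    output_type="number",
)
prompt_text = render_for_prompt(sum_spec)
print(prompt_text)
```

The LLM never sees the function body, only this kind of description, which is why the name and the descriptions deserve care.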

The library provides a list of default tools: https://github.com/huggingface/smolagents/blob/28cfef22389a2830176b48be9fcc3e3d5793b87b/src/smolagents/default_tools.py#L102

- PythonInterpreterTool
- FinalAnswerTool
- UserInputTool
- DuckDuckGoSearchTool
- GoogleSearchTool
- VisitWebpageTool

With the `smolagents` library, there are two ways of declaring a tool: using the `@tool` decorator or subclassing the `Tool` class.

The `@tool` decorator is a more concise way of declaring a tool, but it is less flexible than the `Tool` class.

### Defining a Tool as a Python Class

In this class, we define:

- `name`: The tool’s name.
- `description`: A description used to populate the agent’s system prompt.
- `inputs`: A dictionary mapping each argument name to a dictionary with the keys `type` and `description`, providing information to help the Python interpreter process inputs.
- `output_type`: Specifies the expected output type.
- `forward`: The method containing the inference logic to execute.

In [None]:
from smolagents import Tool


class Sum(Tool):
    name = "sum"
    description = "This is a tool that can add two numbers. It returns the sum of the two numbers."
    inputs = {
        "number_1": {"type": "number", "description": "The first number to add."},
        "number_2": {"type": "number", "description": "The second number to add."},
    }
    output_type = "number"

    def forward(self, number_1: float, number_2: float) -> float:
        return number_1 + number_2

In [None]:
sum_tool = Sum()
agent = ToolCallingAgent(
    tools=[sum_tool],
    model=model,
)
agent.visualize()

In [None]:
output = agent.run("sum 3 4")
display_markdown(output, raw=True)
print("Cost of the agent: ", compute_agent_cost(agent))

### The @tool Decorator

Using this approach, we define a function with:

- **A clear and descriptive function name** that helps the LLM understand its purpose.
- **Type hints for both inputs and outputs** to ensure proper usage.
- **A detailed description**, including an `Args:` section where each argument is explicitly described. These descriptions provide valuable context for the LLM, so it’s important to write them carefully.
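
To see why the type hints and the `Args:` section matter, here is a simplified sketch of how a tool description could be derived from a plain function using only the standard library. The `describe_function` helper is hypothetical and is not how `smolagents` implements `@tool`; it only illustrates that each piece of metadata has to come from somewhere in the function definition.

```python
import inspect
import re
from typing import get_type_hints


def describe_function(func) -> dict:
    """Build a tool description from a function's name, type hints, and docstring."""
    hints = get_type_hints(func)
    doc = inspect.getdoc(func) or ""
    # Split the docstring into the summary and the "Args:" section.
    summary, _, args_section = doc.partition("Args:")
    arg_docs = dict(re.findall(r"^\s*(\w+):\s*(.+)$", args_section, flags=re.MULTILINE))
    inputs = {}
    for arg in inspect.signature(func).parameters:
        # Without a type hint and a description, the LLM would have to guess.
        if arg not in hints or arg not in arg_docs:
            raise ValueError(f"Argument '{arg}' needs a type hint and an Args entry")
        inputs[arg] = {"type": hints[arg].__name__, "description": arg_docs[arg]}
    return {
        "name": func.__name__,
        "description": summary.strip(),
        "inputs": inputs,
        "output_type": hints["return"].__name__,
    }


def add_numbers(number_1: float, number_2: float) -> float:
    """
    Adds two numbers and returns their sum.

    Args:
        number_1: The first number to add.
        number_2: The second number to add.
    """
    return number_1 + number_2


spec = describe_function(add_numbers)
print(spec)
```

In this sketch, removing a type hint or an `Args:` entry makes `describe_function` raise an error, which mirrors the reason the three requirements above are not optional.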

In [None]:
from smolagents import tool


@tool
def sum_tool(number_1: float, number_2: float) -> float:
    """
    This is a tool that can add two numbers. It returns the sum of the two numbers.

    Args:
        number_1: The first number to add.
        number_2: The second number to add.
    """
    return number_1 + number_2

In [None]:
agent = ToolCallingAgent(
    tools=[sum_tool],
    model=model,
)
agent.visualize()

In [None]:
output = agent.run("sum 3 4")
display_markdown(output, raw=True)
print("Cost of the agent: ", compute_agent_cost(agent))

## Code Agent vs Tool Calling Agent

In [None]:
from datetime import date

today = date.today()
current_date = today.strftime("%Y-%m-%d")

In [None]:
from smolagents import DuckDuckGoSearchTool

web_search = DuckDuckGoSearchTool()
agent = ToolCallingAgent(
    model=model,
    tools=[sum_tool, web_search],
    verbosity_level=1,
    max_steps=10,
)
agent.visualize()

In [None]:
output = agent.run(
    f"You are an agent that can study financial market. Today's date is {current_date}. "
    + "What is the gain that Nvidia stock made in the last week?",
    reset=True,
)
display_markdown(output, raw=True)
print("Cost of the agent: ", compute_agent_cost(agent, model.model_id))

In [None]:
from smolagents import CodeAgent

agent = CodeAgent(
    model=model,
    add_base_tools=False,
    tools=[sum_tool, web_search],
    verbosity_level=1,
    max_steps=10,
)
agent.visualize()

In [None]:
output = agent.run(
    f"You are an agent that can study financial market. Today's date is {current_date}. "
    + "What is the gain that Nvidia stock made in the last week?",
    reset=True,
)
display_markdown(output, raw=True)
print("Cost of the agent: ", compute_agent_cost(agent, model.model_id))
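
The two agents above receive the same task and the same tools. The practical difference lies in how the LLM writes its actions: a Tool Calling Agent emits one structured tool call (a JSON-like object) per step, whereas a Code Agent writes a Python snippet that can combine several tool calls in a single step. The sketch below imitates both styles with hard-coded strings in place of real LLM output. The `add` function and the `TOOLS` registry are hypothetical stand-ins, and the bare `exec` is for illustration only, since generated code should normally run in a restricted interpreter or sandbox.

```python
import json


# A stand-in tool; in the notebook this role is played by `sum_tool`.
def add(number_1: float, number_2: float) -> float:
    return number_1 + number_2


TOOLS = {"add": add}

# Tool-calling style: each step is one JSON action naming a tool and its arguments.
# Computing (3 + 4) + 5 takes two steps; the 7 in the second action is what the
# model would write after observing the result of the first one.
json_actions = [
    '{"name": "add", "arguments": {"number_1": 3, "number_2": 4}}',
    '{"name": "add", "arguments": {"number_1": 7, "number_2": 5}}',
]
json_result = None
for raw_action in json_actions:
    action = json.loads(raw_action)
    json_result = TOOLS[action["name"]](**action["arguments"])

# Code style: one step is a Python snippet that can compose tool calls directly.
code_action = "result = add(add(3, 4), 5)"
namespace = dict(TOOLS)  # expose only the tools to the snippet
exec(code_action, {"__builtins__": {}}, namespace)
code_result = namespace["result"]

print(json_result, code_result)
```

Both styles reach the same value, but the code action needs one step where the JSON actions need two. Fewer steps usually means fewer LLM calls, which is worth keeping in mind when comparing the costs printed above.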

## Agent Hierarchy / MultiAgent

It is also possible to build a multi-agent system, in which several agents cooperate to solve a problem. This is useful when the problem is too complex for a single agent: the agents can communicate with each other and split the work.

Another advantage is context size. A single agent stores the full history of its steps, whereas in a multi-agent system each agent stores only the history of its own steps.
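
The context-size argument can be illustrated with a toy simulation that involves no LLM at all. `ToyAgent` is a hypothetical class written for this sketch, not a `smolagents` API: it simply records steps in a list, and a manager only records the final report of the agent it delegates to.

```python
class ToyAgent:
    """Hypothetical agent that only records steps; no LLM is involved."""

    def __init__(self, name, managed_agents=()):
        self.name = name
        self.memory = []  # this agent's own step history
        self.managed = {agent.name: agent for agent in managed_agents}

    def run(self, task, steps):
        """Replay scripted steps: a string is a local step, and an
        (agent_name, subtask, substeps) tuple delegates to a managed agent."""
        self.memory.append(f"task: {task}")
        for step in steps:
            if isinstance(step, tuple):
                agent_name, subtask, substeps = step
                report = self.managed[agent_name].run(subtask, substeps)
                # Only the sub-agent's final report enters the manager's memory.
                self.memory.append(f"{agent_name} reported: {report}")
            else:
                self.memory.append(step)
        return self.memory[-1]


web_agent = ToyAgent("web_agent")
manager = ToyAgent("manager", managed_agents=[web_agent])
manager.run(
    "Summarize recent news",
    steps=[
        (
            "web_agent",
            "Find headlines",
            ["search query 1", "search query 2", "visit page", "3 headlines found"],
        ),
        "final answer written",
    ],
)
print(len(manager.memory), len(web_agent.memory))
```

The manager ends up with three memory entries while the web agent holds five, and none of the intermediate search steps appear in the manager's history.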

In [None]:
from smolagents import CodeAgent, DuckDuckGoSearchTool


web_agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],
    model=model,
    add_base_tools=False,
    name="information_retriever_agent",
    description="An agent that can be called to run web search to obtain information. Call it as a function using the **task** argument.",
    verbosity_level=1,
)

manager_agent = CodeAgent(
    tools=[sum_tool],
    model=model,
    managed_agents=[web_agent],
    verbosity_level=1,
    description="An agent that manages other agents.",
    max_steps=10,
)

manager_agent.visualize()

In [None]:
output = manager_agent.run(
    f"You are an agent that can study financial market. Today's date is {current_date}. "
    + "What is the gain that Nvidia stock made in the last week?",
    reset=True,
)
display_markdown(output, raw=True)
print("Cost of the agent: ", compute_agent_cost(manager_agent))

## Visualize Your Agent

In [None]:
from smolagents import (
    GradioUI,
)

web_agent = CodeAgent(
    tools=[DuckDuckGoSearchTool(), VisitWebpageTool()],
    model=model,
    name="information_retriever_agent",
    description="An agent that can be called to run web search to obtain information. Call it as a function using the **task** argument.",
    verbosity_level=0,
)

manager_agent = CodeAgent(
    name="ManagerAgent",
    tools=[],
    model=model,
    managed_agents=[web_agent],
    verbosity_level=0,
    description="An agent that manages other agents.",
    max_steps=10,
)

GradioUI(manager_agent).launch(
    share=True
)  # share=True is needed to make it work in Renku, but be careful: it creates a public link

# !!! Do not forget to stop the process of the previous cell before executing the next ones !!! 

## Another Multi Agent Example

We will create:

- an agent that can read a document and summarize it

- an agent that can search the web

- a manager agent that handles both previous agents

In [None]:
import requests
from helpers.data_processing import SimpleChunker, PDFExtractor


@tool
def read_pdf_tool(pdf_file_path: str) -> str:
    """
    This tool reads a PDF file and returns the text content of the PDF file.

    Args:
        pdf_file_path: The path to the PDF file.
    """
    response, text, images = PDFExtractor().extract_text_and_images(pdf_file_path)
    return text

In [None]:
tmp = read_pdf_tool("../../data/hypoxy_stat_1page.pdf")
tmp

In [None]:
agent_read_pdf = CodeAgent(
    name="read_pdf_agent",
    description="Reads and summarizes a PDF file. Call it as a function using the **task** argument.",
    tools=[read_pdf_tool],
    add_base_tools=True,
    model=model,
    verbosity_level=1,
)

agent_web_search = CodeAgent(
    name="web_search_agent",
    description="Runs web searches for you. Call it as a function using the **task** argument.",
    tools=[DuckDuckGoSearchTool(), visit_webpage_tool],
    add_base_tools=True,
    model=model,
    verbosity_level=1,
)

agent = CodeAgent(
    name="medical_agent",
    tools=[],
    model=model,
    add_base_tools=False,
    managed_agents=[agent_read_pdf, agent_web_search],
    verbosity_level=1,
)

agent.visualize()

In [None]:
output = agent.run(
    "Your task is the following: "
    + "Can you read the PDF file at '../../data/hypoxy_stat_1page.pdf' and tell me what it is about? Also, can you give me the wikipedia definition of the area of research?",
    reset=True,
)
display_markdown(output, raw=True)
print("Cost of the agent: ", compute_agent_cost(agent))

## Exercise 1: Create your own RAG Agent

![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/beating_gaia/classical_vs_agentic_rag.png)

First we will create a simple agent that can answer questions on a knowledge base, AKA a **RAG agent**.

1. Define a tool that retrieves documents from a knowledge base
2. Define an agent that uses the tool to retrieve documents and answer questions
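
At its core, the tool in step 1 is a function from a query to a ranked list of passages, and the agent in step 2 places those passages in front of the question. The following sketch shows that flow with a toy word-overlap score standing in for embeddings and a vector store. All names (`tokenize`, `retrieve`, `build_prompt`) and the tiny knowledge base are made up for illustration and are unrelated to the `helpers` package used below.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase a text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list, number_of_chunks: int = 2) -> list:
    """Return the chunks sharing the most words with the query (toy scoring)."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(chunk)), chunk) for chunk in chunks]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Drop chunks that have no word in common with the query.
    return [chunk for score, chunk in ranked[:number_of_chunks] if score > 0]


def build_prompt(question: str, chunks: list) -> str:
    """Assemble the retrieved chunks and the question into one prompt."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up knowledge base for illustration only.
knowledge_base = [
    "Chunking splits a long document into smaller passages.",
    "An embedding model maps each passage to a vector.",
    "A vector store returns the passages closest to a query vector.",
]
question = "What does an embedding model do?"
top_chunks = retrieve(question, knowledge_base, number_of_chunks=1)
print(build_prompt(question, top_chunks))
```

In an agentic setup, the difference from this fixed pipeline is that the agent decides whether to call the retriever at all, and with which query.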

In [None]:
from helpers.constants_and_data_classes import Chunk
from helpers.data_processing import SimpleChunker, PDFExtractorAPI
from helpers.embedding import (
    OpenAITextEmbeddings,
    compute_openai_large_embedding_cost,
)
from helpers.vectorstore import (
    ChromaDBVectorStore,
    VectorStoreRetriever,
)

### Retriever Pipeline

In [None]:
data_folder = "../../data"

pdf_files = [
    "Explainable_machine_learning_prediction_of_edema_a.pdf",
    "Modeling tumor size dynamics based on real‐world electronic health records.pdf",
]
example_pdf_file = "Explainable_machine_learning_prediction_of_edema_a.pdf"
example_pdf_path = os.path.join(data_folder, example_pdf_file)

vector_store_collection = "text_collection"

In [None]:
data_extractor = PDFExtractor()
_, text, _ = data_extractor.extract_text_and_images(example_pdf_path)

In [None]:
file_metadata = {"source_text": example_pdf_file}

text_chunker = SimpleChunker(max_chunk_size=1000)

chunks = text_chunker.chunk_text(text, file_metadata)

In [None]:
embedding_model = OpenAITextEmbeddings()
embeddings = embedding_model.get_embedding([chunk.content for chunk in chunks])

In [None]:
vector_store = ChromaDBVectorStore(vector_store_collection)
vector_store.insert_chunks(chunks, embeddings)

In [None]:
retriever = VectorStoreRetriever(embedding_model, vector_store)
results = retriever.retrieve("Who are the authors of the paper?", 5)
results

### Toolify

In [None]:
# Create the retriever tool
@tool
def retriever_tool(query: str, number_of_chunks: int) -> list:
    ## Fill the docstring here

    return retriever.retrieve(query, number_of_chunks)


# Create the CodeAgent with the tool
rag_agent = CodeAgent(tools=[retriever_tool], model=model)
rag_agent.visualize()

In [None]:
# Create the retriever tool
@tool
def retriever_tool(query: str, number_of_chunks: int) -> list:
    """
    This is a tool that searches a document and extracts the information related to the given query. It returns a list of strings.
    Only use this tool if it is necessary to answer the query; otherwise rely on your internal knowledge.

    Args:
        query: The user query.
        number_of_chunks: The number of chunks to return. Use 5 unless there is a reason to choose another value.
    """

    return retriever.retrieve(query, number_of_chunks)


# Create the CodeAgent with the tool
rag_agent = CodeAgent(tools=[retriever_tool], model=model)
rag_agent.visualize()

In [None]:
# Call the agent
output = rag_agent.run(
    "According to SHAP analysis, which factors were the most influential in predicting higher-grade edema (Grade 2+)?"
)

In [None]:
output = rag_agent.run("What is the highest court of the USA?")

## Exercise 2: Make it a multi-agent system

Now that we have built a RAG agent, we will improve it by giving it access to web search. This way, if the answer can't be found in the knowledge base, it can be searched for on the web.

Rather than adding the web search tool to the same agent, we will turn our setup into a multi-agent system, with one agent responsible for answering questions using the knowledge base and another agent responsible for searching the web.

1. Define a tool that searches the web
2. Define a new agent that uses the web search tool
3. Create a multi-agent system that uses both agents
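
The intended behaviour (knowledge base first, web as a fallback) can be written down as plain control flow. The sketch below is a hard-coded stand-in, assuming hypothetical `kb_lookup` and `web_lookup` functions; in the actual multi-agent system it is the manager's LLM that decides which agent to call, so no such explicit rule exists in the code.

```python
def answer_with_fallback(question, kb_lookup, web_lookup):
    """Try the knowledge base first and fall back to the web if it finds nothing."""
    answer = kb_lookup(question)
    if answer is not None:
        return "knowledge_base", answer
    return "web", web_lookup(question)


# Made-up knowledge base keyed by a single keyword, for illustration only.
toy_kb = {"edema": "Answer found in the indexed paper."}


def kb_lookup(question):
    """Return a stored answer if a known keyword appears in the question."""
    for keyword, answer in toy_kb.items():
        if keyword in question.lower():
            return answer
    return None


def web_lookup(question):
    """Placeholder for a web search; it performs no network access."""
    return f"(web result for: {question})"


first = answer_with_fallback("Which factors predict edema?", kb_lookup, web_lookup)
second = answer_with_fallback("What is the current price of Nvidia stock?", kb_lookup, web_lookup)
print(first[0], second[0])
```

The two questions mirror the ones used to test the system at the end of this exercise: one that the indexed paper can answer and one that it cannot.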


In [None]:
multi_rag_agent = ...

In [None]:


from smolagents import DuckDuckGoSearchTool

web_search_agent = CodeAgent(
    name="web_search_agent",
    description="Runs web searches for you. Call it as a function using the **task** argument.",
    tools=[DuckDuckGoSearchTool(), visit_webpage_tool],
    add_base_tools=True,
    model=model,
    verbosity_level=1,
)

rag_agent = CodeAgent(
    name="medical_literature_agent",
    description="Retrieve information from medical literature. Call it as a function using the **task** argument.",
    tools=[retriever_tool],
    model=model,
)

multi_rag_agent = CodeAgent(
    tools=[],
    model=model,
    add_base_tools=False,
    managed_agents=[rag_agent, web_search_agent],
    verbosity_level=1,
)



In [None]:
# Call the agent
output = multi_rag_agent.run(
    "According to SHAP analysis, which factors were the most influential in predicting higher-grade edema (Grade 2+)?"
)

In [None]:
output = multi_rag_agent.run("What is the current price of Nvidia stock?")