<a href="https://colab.research.google.com/github/thomaspurk/python-gen-ai-chat-examples/blob/main/Hugging_Face_Agents_Course_Dummy_Agent_Library.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Notebook: Hugging Face Agents Course Dummy Agent Library
# Author: Thomas Purk
# Date: 2025-04-18
# Reference: https://huggingface.co/learn/agents-course/unit1/introduction
# Reference: https://huggingface.co/learn/agents-course/unit1/dummy-agent-library

# Hugging Face Agents Course Dummy Agent Library

This notebook is intended to serve as an alternative to the Colab notebook
provided by the Hugging Face Agents course.

https://huggingface.co/learn/agents-course/unit1/dummy-agent-library

**Reason for the alternative**

The course notebook cannot be run with an InferenceClient object if the
suggested model "meta-llama/Llama-3.2-3B-Instruct" is not available.

When I attempted to run the original notebook, I got the error...

> HfHubHTTPError: 503 Server Error: Service Temporarily Unavailable for url: https://router.huggingface.co/hf-inference/models/meta-llama/Llama-3.2-3B-Instruct

The alternative public endpoint "https://jc26mwg228mkj8dw.us-east-1.aws.endpoints.huggingface.cloud" also gave me an error...

> BadRequestError: (Request ID: OUr7-S) Bad request: Bad Request: Invalid state

So, for me and others on the Discord channel, it was not possible to run the
original course notebook. Hopefully, others can run this notebook to experience
the intended lessons of the original notebook.

**Changes in this Notebook**

- Using the smaller "meta-llama/Llama-3.2-1B-Instruct" model
- Using a Transformers pipeline instead of InferenceClient
- Using the chat template from the model tokenizer (as discussed earlier in Unit 1)
- Wrapping get_weather using the @tool decorator and the matching dummy Tool class (as discussed earlier in Unit 1)
- The course notebook uses two text_generation calls to InferenceClient to simulate the dummy agent behavior. It uses the "stop" parameter to stop at "Observation:" on the first call, then compiles the result into a second prompt to finish the agent action.
- This notebook breaks the text generation into three phases instead of two, one for each of Thought, Action, Observation.
- This notebook gets a JSON object from the Action phase, which is parsed into a dict. The keys/values of the dict are used to execute the get_weather function, which is a tiny bit closer to an actual agent.
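
The three-phase flow described in the list above can be sketched without any model at all. The block below is only an illustration under stated assumptions: `run_cycle` and `fake_llm` are hypothetical names that do not appear elsewhere in this notebook, and `fake_llm` returns canned text in place of the real text-generation pipeline.

```python
import json
from typing import Callable, Dict


def run_cycle(llm: Callable[[str], str],
              tools: Dict[str, Callable[..., str]],
              question: str) -> str:
    """Run one Thought -> Action -> Observation cycle using three LLM calls."""
    # Phase 1: ask the model to reason about which tool to use.
    thought = "Thought: " + llm(f"THOUGHT PHASE\nQuestion: {question}")

    # Phase 2: ask for a JSON action, parse it, and call the named tool.
    action_json = llm(f"ACTION PHASE\n{thought}")
    action = json.loads(action_json)
    result = tools[action["action"]](**action["action_input"])

    # Phase 3: feed the tool result back and ask for an observation.
    return llm(f"OBSERVATION PHASE\n{thought}\nAction: {action_json}\nResult: {result}")


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the pipeline; returns canned text per phase."""
    if prompt.startswith("THOUGHT"):
        return "I should call get_weather for London."
    if prompt.startswith("ACTION"):
        return '{"action": "get_weather", "action_input": {"location": "London"}}'
    # Observation phase: echo the tool result found at the end of the prompt.
    return "Observation: " + prompt.rsplit("Result: ", 1)[1]


tools = {"get_weather": lambda location: f"the weather in {location} is sunny with low temperatures."}
final = run_cycle(fake_llm, tools, "What's the weather in London?")
print(final)
```

The real cells below follow the same shape, except that each `llm(...)` call is a chat-templated prompt sent to the pipeline.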

**Environment**

- Google Colab (I have Colab Pro)
- CPU runtime (Runs in a few minutes, GPU not required.)
- Tokens stored in Colab Secrets

## Notebook Setup
- Imports
- Custom Functions

In [124]:
# Set up Notebook

import os
import inspect
import json
import re

from google.colab import userdata

from IPython.display import Markdown

# Use Hugging Face Pipeline instead of InferenceClient
from transformers import pipeline
from transformers import AutoTokenizer

# Get the Google Colab secret and add it to the environment variables
# Update for other environments as needed.
os.environ["HF_TOKEN"] = userdata.get('HF_TOKEN_2')

In [86]:
# Dummy Tool Class, Decorator, & Function
# From "What are Tools?" course section

class Tool:
    """
    A class representing a reusable piece of code (Tool).

    Attributes:
        name (str): Name of the tool.
        description (str): A textual description of what the tool does.
        func (callable): The function this tool wraps.
        arguments (list): A list of (name, type) pairs, one per argument.
        outputs (str or list): The return type(s) of the wrapped function.
    """
    def __init__(self,
                 name: str,
                 description: str,
                 func: callable,
                 arguments: list,
                 outputs: str):
        self.name = name
        self.description = description
        self.func = func
        self.arguments = arguments
        self.outputs = outputs

    def to_string(self) -> str:
        """
        Return a string representation of the tool,
        including its name, description, arguments, and outputs.
        """
        args_str = ", ".join([
            f"{arg_name}: {arg_type}" for arg_name, arg_type in self.arguments
        ])

        return (
            f"Tool Name: {self.name},"
            f" Description: {self.description},"
            f" Arguments: {args_str},"
            f" Outputs: {self.outputs}"
        )

    def __call__(self, *args, **kwargs):
        """
        Invoke the underlying function (callable) with provided arguments.
        """
        return self.func(*args, **kwargs)

def tool(func):
    """
    A decorator that creates a Tool instance from the given function.
    """
    # Get the function signature
    signature = inspect.signature(func)

    # Extract (param_name, param_annotation) pairs for inputs
    arguments = []
    for param in signature.parameters.values():
        annotation_name = (
            param.annotation.__name__
            if hasattr(param.annotation, '__name__')
            else str(param.annotation)
        )
        arguments.append((param.name, annotation_name))

    # Determine the return annotation
    return_annotation = signature.return_annotation
    if return_annotation is inspect.Signature.empty:
        outputs = "No return annotation"
    else:
        outputs = (
            return_annotation.__name__
            if hasattr(return_annotation, '__name__')
            else str(return_annotation)
        )

    # Use the function's docstring as the description (default if None)
    description = func.__doc__ or "No description provided."

    # The function name becomes the Tool name
    name = func.__name__

    # Return a new Tool instance
    return Tool(
        name=name,
        description=description,
        func=func,
        arguments=arguments,
        outputs=outputs
    )

# NOTE: Divided the docstring for get_weather, so that the to_string
# result would not include "Parameters" & "Returns."
@tool
def get_weather(location: str) -> str:
    """Get the current weather in a given the name of a city."""
    """
        Parameters:
            location (str): The location of interest.

        Returns:
            str: A short description of the weather.

    """
    return f"the weather in {location} is sunny with low temperatures."

# Validate getting the string description
print(get_weather.to_string())

Tool Name: get_weather, Description: Get the current weather in a given the name of a city., Arguments: location: str, Outputs: str
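
As a possible alternative to splitting the docstring in two, the summary line can be pulled out of a single full docstring. This is only a sketch: `weather_stub` is a hypothetical copy of the tool function used for illustration, and taking the first line as the description is my assumption, not something the course decorator does.

```python
import inspect


def weather_stub(location: str) -> str:
    """Get the current weather for a named city.

    Parameters:
        location (str): The location of interest.

    Returns:
        str: A short description of the weather.
    """
    return f"the weather in {location} is sunny with low temperatures."


# inspect.getdoc cleans up indentation; the first line is the one-sentence summary.
summary = inspect.getdoc(weather_stub).splitlines()[0]

# The same signature introspection that the decorator relies on.
signature = inspect.signature(weather_stub)
arguments = [(p.name, p.annotation.__name__) for p in signature.parameters.values()]
outputs = signature.return_annotation.__name__

print(summary)
print(arguments, outputs)
```

With this approach the function keeps one complete docstring, and only the summary line would end up in the tool description.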


In [27]:
# Validate get_weather
print(type(get_weather))
print(get_weather('Chicago'))

<class '__main__.Tool'>
the weather in Chicago is sunny with low temperatures.


In [69]:
# Set up the model
checkpoint = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# "Instruct" models are fine-tuned to follow chat-style instructions and typically provide a chat template.
# Validate that the chat template exists
Markdown(f'Chat Template:\n\n {tokenizer.chat_template}')

Chat Template:

 {{- bos_token }}
{%- if custom_tools is defined %}
    {%- set tools = custom_tools %}
{%- endif %}
{%- if not tools_in_user_message is defined %}
    {%- set tools_in_user_message = true %}
{%- endif %}
{%- if not date_string is defined %}
    {%- if strftime_now is defined %}
        {%- set date_string = strftime_now("%d %b %Y") %}
    {%- else %}
        {%- set date_string = "26 Jul 2024" %}
    {%- endif %}
{%- endif %}
{%- if not tools is defined %}
    {%- set tools = none %}
{%- endif %}

{#- This block extracts the system message, so we can slot it into the right place. #}
{%- if messages[0]['role'] == 'system' %}
    {%- set system_message = messages[0]['content']|trim %}
    {%- set messages = messages[1:] %}
{%- else %}
    {%- set system_message = "" %}
{%- endif %}

{#- System message #}
{{- "<|start_header_id|>system<|end_header_id|>\n\n" }}
{%- if tools is not none %}
    {{- "Environment: ipython\n" }}
{%- endif %}
{{- "Cutting Knowledge Date: December 2023\n" }}
{{- "Today Date: " + date_string + "\n\n" }}
{%- if tools is not none and not tools_in_user_message %}
    {{- "You have access to the following functions. To call a function, please respond with JSON for a function call." }}
    {{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }}
    {{- "Do not use variables.\n\n" }}
    {%- for t in tools %}
        {{- t | tojson(indent=4) }}
        {{- "\n\n" }}
    {%- endfor %}
{%- endif %}
{{- system_message }}
{{- "<|eot_id|>" }}

{#- Custom tools are passed in a user message with some extra guidance #}
{%- if tools_in_user_message and not tools is none %}
    {#- Extract the first user message so we can plug it in here #}
    {%- if messages | length != 0 %}
        {%- set first_user_message = messages[0]['content']|trim %}
        {%- set messages = messages[1:] %}
    {%- else %}
        {{- raise_exception("Cannot put tools in the first user message when there's no first user message!") }}
{%- endif %}
    {{- '<|start_header_id|>user<|end_header_id|>\n\n' -}}
    {{- "Given the following functions, please respond with a JSON for a function call " }}
    {{- "with its proper arguments that best answers the given prompt.\n\n" }}
    {{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }}
    {{- "Do not use variables.\n\n" }}
    {%- for t in tools %}
        {{- t | tojson(indent=4) }}
        {{- "\n\n" }}
    {%- endfor %}
    {{- first_user_message + "<|eot_id|>"}}
{%- endif %}

{%- for message in messages %}
    {%- if not (message.role == 'ipython' or message.role == 'tool' or 'tool_calls' in message) %}
        {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' }}
    {%- elif 'tool_calls' in message %}
        {%- if not message.tool_calls|length == 1 %}
            {{- raise_exception("This model only supports single tool-calls at once!") }}
        {%- endif %}
        {%- set tool_call = message.tool_calls[0].function %}
        {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' -}}
        {{- '{"name": "' + tool_call.name + '", ' }}
        {{- '"parameters": ' }}
        {{- tool_call.arguments | tojson }}
        {{- "}" }}
        {{- "<|eot_id|>" }}
    {%- elif message.role == "tool" or message.role == "ipython" %}
        {{- "<|start_header_id|>ipython<|end_header_id|>\n\n" }}
        {%- if message.content is mapping or message.content is iterable %}
            {{- message.content | tojson }}
        {%- else %}
            {{- message.content }}
        {%- endif %}
        {{- "<|eot_id|>" }}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
{%- endif %}
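
To make the template above easier to read, here is a simplified sketch of what its no-tools path produces. `render_llama3_chat` is a hypothetical helper written for illustration only. It deliberately omits the "Cutting Knowledge Date" and "Today Date" lines, the tool handling, and the always-present system header, so its output is not identical to what `tokenizer.apply_chat_template` returns.

```python
from typing import Dict, List


def render_llama3_chat(messages: List[Dict[str, str]],
                       add_generation_prompt: bool = True) -> str:
    """Simplified re-implementation of the plain-message branch of the template."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the trimmed content, then <|eot_id|>.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


demo = render_llama3_chat([
    {"role": "system", "content": "You are terse."},
    {"role": "user", "content": "Hi"},
])
print(demo)
```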


In [75]:
# Create the Text Generation Pipeline
tg_pipeline = pipeline(
    task="text-generation",
    model=checkpoint
)

Device set to use cpu


## Manually Build and Execute the Dummy Agent

In [76]:
# Example question the user might ask the agent
user_question = "What's the weather in London?"

In [77]:
# NOTE: Pipelines do not have a "stop" parameter similar to the InferenceClient shown
# in the course tutorial.

# So, break up the prompt into Thought -> Action -> Observation cycle phases
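
Another option, assuming no stop parameter is available as the note above states, is to cut the generated text at the stop string after the fact. The sketch below uses a hypothetical helper, `truncate_at_stop`, that is not part of this notebook. Note that this only trims the text: the model still generates up to `max_new_tokens`, so it saves no compute.

```python
def truncate_at_stop(text: str, stop: str = "Observation:", keep_stop: bool = False) -> str:
    """Cut `text` at the first occurrence of `stop`; return it unchanged if absent."""
    index = text.find(stop)
    if index == -1:
        return text
    # Optionally keep the stop string itself at the end of the result.
    end = index + len(stop) if keep_stop else index
    return text[:end]


sample = "Thought: check the weather\nAction: get_weather\nObservation: made up by the model"
print(truncate_at_stop(sample))
print(truncate_at_stop(sample, keep_stop=True))
```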

In [90]:
# Thought Phase Messages

THOUGHT_PHASE = f"""You are given the user question '{user_question}.' Do not
answer the question. Instead think about the action you need to take in
order to determine the answer to this question given that you are provided
access to the following tools:

{get_weather.to_string()}

Provide a response that summarizes the conclusion resulting from you thoughts in
 a single concise sentence."""

# Create the Thought Phase Messages list
thought_messages=[
    {"role": "system", "content": "You think about how to answer questions instead of providing answers."  },
    {"role": "user", "content": THOUGHT_PHASE},
]

# Thought Phase Prompt - Apply the chat template
thought_prompt = tokenizer.apply_chat_template(
    conversation=thought_messages,
    tokenize=False,
    add_generation_prompt=True
)

# Display / Validate
print(thought_prompt)

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 19 Apr 2025

You think about how to answer questions instead of providing answers.<|eot_id|><|start_header_id|>user<|end_header_id|>

You are given the user question 'What's the weather in London?.' Do not
answer the question. Instead think about the action you need to take in
order to determine the answer to this question given that you are provided
access to the following tools:

Tool Name: get_weather, Description: Get the current weather in a given the name of a city., Arguments: location: str, Outputs: str

Provide a response that summarizes the conclusion resulting from you thoughts in
 a single concise sentence.<|eot_id|><|start_header_id|>assistant<|end_header_id|>




In [91]:
# Execute the pipeline to generate an LLM response
thought_output = tg_pipeline(
    thought_prompt,
    max_new_tokens=200,
    pad_token_id=128001
)

# Format the result to remove the original prompt text
thought_output = thought_output[0]['generated_text'].replace(thought_prompt,'')
thought_output = f"Thought: {thought_output}"

In [92]:
# Validate Thought Output
print(thought_output)

Thought: To answer this question, I need to determine the location of London and then use the `get_weather` tool to retrieve the current weather for that location.
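
One caveat about removing the prompt with `str.replace`: it removes every occurrence of the prompt text, not only the echoed prefix. The sketch below, using a hypothetical `strip_prompt` helper and made-up strings, shows a prefix-only alternative (it needs Python 3.9+ for `str.removeprefix`). The text-generation pipeline also documents a `return_full_text` argument, which may make this step unnecessary; I have not tested that here.

```python
def strip_prompt(generated_text: str, prompt: str) -> str:
    """Drop the echoed prompt only when it appears at the very start."""
    return generated_text.removeprefix(prompt)


prompt = "Q: hi\n"
echoed = "Q: hi\nYou asked Q: hi\n indeed"

prefix_only = strip_prompt(echoed, prompt)   # keeps the later repeat intact
replace_all = echoed.replace(prompt, "")     # also removes the later repeat

print(repr(prefix_only))
print(repr(replace_all))
```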


In [118]:
# Action Phase Messages

ACTION_PHASE = f"""You are provided access to the following tools:

{get_weather.to_string()}

Create a JSON object that describes action instructions. You must specify
a JSON object containing an 'action' key that is the name of the tool and an
'action_input' key that is the input parameters to the tool.

Respond only with a JSON object and no other text.

example response:

{{
  'action': 'get_weather',
  'action_input': {{'location': 'New York'}}
}}
"""

# Create the Messages list
action_messages=[
    {"role": "system", "content": "You create JSON objects containing tool instructions"  },
    {"role": "user", "content":  thought_output + "\n\n" + ACTION_PHASE}
]

# Action Phase Prompt - Apply the chat template
action_prompt = tokenizer.apply_chat_template(
    conversation=action_messages,
    tokenize=False,
    add_generation_prompt=True
)

# Display / Validate
print(action_prompt)


<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 19 Apr 2025

You create JSON objects containing tool instructions<|eot_id|><|start_header_id|>user<|end_header_id|>

Thought: To answer this question, I need to determine the location of London and then use the `get_weather` tool to retrieve the current weather for that location.

You are provided access to the following tools:

Tool Name: get_weather, Description: Get the current weather in a given the name of a city., Arguments: location: str, Outputs: str

Create a JSON object that describes action instructions. You must specify
a JSON object containing an 'action' key that is the name of the tool and an
'action_input' key that is the input parameters to the tool.

Respond only with a JSON object and no other text.

example response:

{
  'action': 'get_weather',
  'action_input': {'location': 'New York'}
}<|eot_id|><|start_header_id|>assistant<|end_header_id|>




In [133]:
# Execute the pipeline to generate an LLM response
action_output = tg_pipeline(
    action_prompt,
    max_new_tokens=200,
    pad_token_id=128001
)

# Format the result to remove the original prompt text
action_output_raw = action_output[0]['generated_text'].replace(action_prompt,'')

# Based on testing while coding, the model can sometimes add extra text.
# Keep only the JSON object text.
match = re.search(r'\{.*\}', action_output_raw, re.DOTALL)
if match is None:
    # Without a JSON object the remaining cells cannot run.
    raise ValueError(f"No JSON object found in model output: {action_output_raw!r}")
action_output_raw = match.group()

# More formatting
action_output = f"Action:\n{action_output_raw}"

In [135]:
# Validate Action Output
print(action_output_raw)

# If the output is not a valid JSON string, the remaining cells
# will raise an error.

# Fingers crossed for output similar to . . .
# {
#   "action": "get_weather",
#   "action_input": {"location": "London"}
# }

{
  "action": "get_weather",
  "action_input": {
    "location": "London"
  }
}
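
Because the example in the Action prompt uses single quotes, a small model may echo that style, which `json.loads` rejects. A more forgiving parser could look like the sketch below. `parse_action` is a hypothetical helper, and the fallback to `ast.literal_eval` is my own assumption about a reasonable recovery strategy, not something the course prescribes.

```python
import ast
import json
import re
from typing import Any, Dict


def parse_action(raw: str) -> Dict[str, Any]:
    """Extract and validate the action dict from raw model output."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    candidate = match.group()
    try:
        action = json.loads(candidate)
    except json.JSONDecodeError:
        # Fallback for Python-style dicts with single quotes. literal_eval
        # only evaluates literals; it does not execute arbitrary code.
        try:
            action = ast.literal_eval(candidate)
        except (ValueError, SyntaxError) as exc:
            raise ValueError("model output is not a parseable object") from exc
    if not isinstance(action, dict) or not {"action", "action_input"} <= action.keys():
        raise ValueError("expected 'action' and 'action_input' keys")
    return action


double_quoted = parse_action('Sure!\n{"action": "get_weather", "action_input": {"location": "London"}}')
single_quoted = parse_action("{'action': 'get_weather', 'action_input': {'location': 'London'}}")
print(double_quoted == single_quoted)
```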


In [141]:
# Execute the Action

# Attempt to parse the JSON string into a Python dict
action_dict = json.loads(action_output_raw)

# Get the action Name
func_name = action_dict["action"]

# Get the function from locals
action_result = locals()[func_name](**action_dict["action_input"])

# Validate Action Results
print(f"Action Result: {action_result}")

Action Result: the weather in London is sunny with low temperatures.
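
Looking the function up through `locals()` means the model output can name any object in the notebook namespace. A safer pattern, sketched below, is an explicit registry of allowed tools. `dispatch`, `registry`, and `get_weather_stub` are hypothetical names used only for this illustration.

```python
from typing import Any, Callable, Dict


def dispatch(action: Dict[str, Any], registry: Dict[str, Callable[..., Any]]) -> Any:
    """Call the tool named in `action`, but only if it was explicitly registered."""
    name = action["action"]
    if name not in registry:
        raise KeyError(f"unknown tool: {name!r}")
    return registry[name](**action["action_input"])


def get_weather_stub(location: str) -> str:
    """Stand-in for the notebook's get_weather tool."""
    return f"the weather in {location} is sunny with low temperatures."


registry = {"get_weather": get_weather_stub}
result = dispatch(
    {"action": "get_weather", "action_input": {"location": "London"}},
    registry,
)
print(result)
```

In this notebook the registry could be built from the Tool objects themselves, for example keyed by each tool's `name` attribute.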


In [138]:
# Observation Phase Messages

OBSERVATION_PHASE = f"""Given that the result of the action is {action_result},
state your observation in the 'Observation' field within the following
format.

Observation: <your observation>"""

# Build the final prompt
final_prompt = thought_output + "\n\n" + \
    action_output + "\n\n" + \
    OBSERVATION_PHASE

# Create the Messages list
final_messages=[
    {"role": "system", "content": "You comment on the results of an action."  },
    {"role": "user", "content": final_prompt}
]

# Apply the chat template
final_prompt = tokenizer.apply_chat_template(
    conversation=final_messages,
    tokenize=False,
    add_generation_prompt=True
)

# Display / Validate
print(final_prompt)

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 19 Apr 2025

You comment on the results of an action.<|eot_id|><|start_header_id|>user<|end_header_id|>

Thought: To answer this question, I need to determine the location of London and then use the `get_weather` tool to retrieve the current weather for that location.

Action:
{
  "action": "get_weather",
  "action_input": {
    "location": "London"
  }
}

Given that the result of the action is the weather in London is sunny with low temperatures.,
state your observation in the 'Observation' field within the following
format.

Observation: <your observation><|eot_id|><|start_header_id|>assistant<|end_header_id|>




In [139]:
# Execute the pipeline to generate an LLM response
final_output = tg_pipeline(
    final_prompt,
    max_new_tokens=200,
    pad_token_id=128001
)

final_output = final_output[0]['generated_text'].replace(final_prompt,'')

In [140]:
# Validate Final Output
print(final_output)

To determine the observation based on the given action and result, I'll follow these steps:

1. Determine the location from the action input.
2. Use the `get_weather` tool to retrieve the weather in that location.
3. Analyze the result to determine the observation.

Given the action input:
```json
{
  "action": "get_weather",
  "action_input": {
    "location": "London"
  }
}
```
I'll assume that the `get_weather` tool returns a JSON response with the following structure:
```json
{
  "location": "London",
  "weather": {
    "temperature": "low",
    "description": "sunny"
  }
}
```
From this structure, I can determine that the observation is:

Observation: The weather in London is sunny with low temperatures.
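
The 1B model wraps its answer in extra reasoning, so it can help to pull out just the Observation line. The sketch below uses a hypothetical `extract_observation` helper and a shortened copy of the output above. Taking the last matching line is an assumption on my part.

```python
import re
from typing import Optional


def extract_observation(text: str) -> Optional[str]:
    """Return the text after the last line starting with 'Observation:', if any."""
    matches = re.findall(r"^Observation:\s*(.+)$", text, flags=re.MULTILINE)
    return matches[-1].strip() if matches else None


verbose_output = (
    "To determine the observation, I'll follow these steps:\n\n"
    "1. Determine the location from the action input.\n\n"
    "Observation: The weather in London is sunny with low temperatures.\n"
)
observation = extract_observation(verbose_output)
print(observation)
```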


In [None]:
# Conclusion: Not exactly the same output as the original, but good enough to
# illustrate the learning points of the original.