<a href="https://colab.research.google.com/gist/justheuristic/79d21c8afe45c2ae85b0737ab52a5e29/hf_agents_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Let an LLM search the web for you using Transformers Agents
_Based on the [original tutorial](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/agents.ipynb) authored by: [Aymeric Roucher](https://huggingface.co/m-ric)_

This notebook demonstrates how you can use [**Transformers Agents**](https://huggingface.co/docs/transformers/en/agents) to search the web for you.

**TL;DR what are agents**? Agents are systems powered by an LLM that, with careful prompting and output parsing, let the LLM use specific *tools* to solve problems. Tools can be anything: a calculator, a web search engine, an API, or even another LLM. The model is prompted to use a tool by generating text (e.g. Python code) that is then interpreted as a call to that tool. Whatever the tool returns is sent back to the LLM.
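
To make the loop concrete, here is a minimal sketch in plain Python. It is not how Transformers Agents is implemented: `fake_llm`, the `CALL name: argument` text format, and the `calculator` tool are all hypothetical stand-ins chosen for illustration. The point is only the cycle of generating text, parsing it as a tool call, and feeding the result back.

```python
import re

def calculator(expression: str) -> str:
    """Toy tool: evaluate a basic arithmetic expression."""
    # Only digits and arithmetic symbols are allowed before evaluating.
    if not re.fullmatch(r"[0-9+\-*/(). ]+", expression):
        raise ValueError(f"unsupported expression: {expression!r}")
    return str(eval(expression))

TOOLS = {"calculator": calculator}

def fake_llm(messages):
    """Hypothetical stand-in for an LLM: requests a tool once, then answers."""
    last = messages[-1]["content"]
    if last.startswith("Observation:"):
        return "FINAL: " + last.removeprefix("Observation:").strip()
    return "CALL calculator: 6 * 7"

def run_agent(question, llm=fake_llm, max_steps=5):
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = llm(messages)
        messages.append({"role": "assistant", "content": reply})
        if reply.startswith("FINAL:"):
            return reply.removeprefix("FINAL:").strip()
        # Parse the generated text into a tool name and an argument.
        match = re.fullmatch(r"CALL (\w+): (.*)", reply)
        if match is None or match.group(1) not in TOOLS:
            observation = "could not parse a valid tool call"
        else:
            tool_name, argument = match.groups()
            observation = TOOLS[tool_name](argument)
        # Send the tool's result back to the model as a new message.
        messages.append({"role": "user", "content": f"Observation: {observation}"})
    raise RuntimeError("agent did not finish within max_steps")

print(run_agent("What is 6 times 7?"))
```

A real agent replaces `fake_llm` with an actual model and the text format with whatever the prompt asks the model to produce.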

In [1]:
!pip install "transformers[agents]" datasets langchain sentence-transformers faiss-cpu duckduckgo-search openai langchain-community wikipedia --upgrade -q
from huggingface_hub import login; login()  # optional: register a free account and create a READ token for API access, otherwise there's a harsh rate limit


First, let's fetch an LLM 'engine' to use those tools:

In [2]:
import transformers

# we're gonna use a remote API for now; we'll see local HF agents later
llm_engine = transformers.HfApiEngine("Qwen/Qwen2.5-72B-Instruct")

In [3]:
output = llm_engine([{"role": "user", "content": "Who is the last US president?"}], stop_sequences=["\n"])
print('\n\nOutput (no tools):', output)



Output (no tools): The last US president, as of my last update in October 2023, is Joe Biden. He was inaugurated on January 20, 2021, as the 46th President of the United States. If you're reading this after a new president has been inaugurated, please check the most current sources for the latest information.


### Tool use: Web Search

There are [**many**](https://huggingface.co/docs/transformers/main/en/main_classes/agent) tools available to your models, and you can create new ones as you see fit. This time, we will try one of the more useful ones, a web search, with [duckduckgo.com](https://duckduckgo.com/) as its backend. **Warning:** the DuckDuckGo tool is rate-limited. Calling it too often will result in an exception - just wait for it to cool down.

In [4]:
search_tool = transformers.agents.search.DuckDuckGoSearchTool()
# usage: >>> search_tool.forward("The size of your mom")
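
Since the search tool is rate-limited, one option is to wrap calls in a retry helper that waits longer after each failure. The sketch below is generic Python and not part of Transformers Agents; `call_with_backoff` and its parameters are hypothetical names, and it catches every `Exception` only because the exact exception class raised on rate limiting is not assumed here.

```python
import time

def call_with_backoff(fn, *args, retries=4, base_delay=2.0, sleep=time.sleep, **kwargs):
    """Call fn; after each failure wait twice as long, re-raising after the last attempt."""
    for attempt in range(retries):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == retries - 1:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)

# Possible usage with the tool defined above (performs a real web request):
# results = call_with_backoff(search_tool.forward, "latest US president")
```

The `sleep` argument is injectable so the helper can be exercised without actually waiting.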

__Define a basic agent:__

In [5]:
agent = transformers.ReactCodeAgent(
    tools=[search_tool], llm_engine=llm_engine,
    additional_authorized_imports=['math', 'time', 'datetime', 'requests', 're', 'bs4', 'wikipedia']
)

In [6]:
result = agent.run("Who is the latest US president?")
print("Output (with search):", result)

Who is the latest US president?
=== Agent thoughts:
Thought: I will use the `web_search` tool to find the latest US president.
>>> Agent is executing the code below:
results = web_search(query="latest US president")
print(results)
====
Print outputs:
[{'title': 'US president election results 2024 | Live maps, charts and the latest ...', 'href': 'https://www.reuters.com/graphics/USA-ELECTION/RESULTS/zjpqnemxwvx/president/', 'body': 'Updated results from the 2024 election for the US president majority. Reuters live coverage of the 2024 US President, Senate, House and state governors races.'}, {'title': 'Election 2024: Presidential results - CNN', 'href': 'https://www

Output (with search): Donald Trump
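
The log above suggests that the search results arrive as a list of dictionaries with `title`, `href`, and `body` keys. Assuming that shape (other library versions may return something different), a small helper can condense the results before reading them or passing them on. `summarize_results` and the sample entry are hypothetical and only illustrate the structure.

```python
def summarize_results(results, max_items=3, max_chars=80):
    """Turn a list of {'title', 'href', 'body'} dicts into short text lines."""
    lines = []
    for item in results[:max_items]:
        body = item.get("body", "")
        if len(body) > max_chars:
            body = body[:max_chars].rstrip() + "..."
        lines.append(f"- {item.get('title', '')} ({item.get('href', '')}): {body}")
    return "\n".join(lines)

# Hypothetical sample in the same shape as the printed results.
sample = [{"title": "Example page", "href": "https://example.com", "body": "Some snippet text."}]
print(summarize_results(sample))
```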


### Using Local LLMs for Agents

Below, we define an agent from a local `transformers` pipeline instead of a remote API.

In [1]:
import transformers
model_name = "unsloth/Llama-3.2-3B-Instruct"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
model = transformers.AutoModelForCausalLM.from_pretrained(model_name, torch_dtype='auto', device_map='auto')
# note: you may load larger models with quantization (e.g. load_in_4bit=True or pre-quantized)
pipe = transformers.pipeline("text-generation", model=model, tokenizer=tokenizer)
pipe.generation_config.max_new_tokens = 8192

llm_engine = transformers.TransformersEngine(pipe)
# usage: >>> result = llm_engine([{"role": "user", "content": "What is my purpose?"}], stop_sequences=["\n"])
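
To see why quantization helps with larger models, here is a rough back-of-the-envelope estimate of the memory needed for the weights alone. It assumes about 3 billion parameters (read off the "3B" in the model name, so only approximate) and ignores activations, the KV cache, and any quantization overhead.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

# Assumption: roughly 3e9 parameters for a "3B" model.
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_memory_gb(3e9, bits):.1f} GB")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts the weight memory to about a quarter, which is what makes a larger model fit on the same GPU.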



In [2]:
agent = transformers.ReactCodeAgent(
    tools=[transformers.agents.search.DuckDuckGoSearchTool()],
    llm_engine=llm_engine,
    additional_authorized_imports=['math', 'time', 'datetime', 'dateutil', 'requests', 're', 'bs4', 'wikipedia']
)

result = agent.run("How old is Donald Trump?")
# note: this may print errors from LLM-generated code - these are fed back into the LLM so it can fix them.
# note also: this particular LLM solves the problem about half of the time - and often only after several self-corrections.
#            If it runs out of memory due to too many tokens, just restart the runtime (session) and run it again.
print("Output (local LLM with DuckDuckGo)", result)
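
The self-correction mentioned in the comments can be pictured with a small sketch: run the generated code, and if it raises, hand the error message back to the model for another attempt. This is an illustration only, not the library's implementation. `fake_llm` is a hypothetical stand-in that writes buggy code first, and plain `exec` is used for brevity where a real agent would use a restricted interpreter.

```python
def fake_llm(feedback):
    """Hypothetical stand-in for an LLM: buggy code first, fixed code after an error."""
    if feedback is None:
        return "result = sum(values) / len(values)"  # bug: `values` is never defined
    return "values = [2, 4, 6]\nresult = sum(values) / len(values)"

def run_with_self_correction(llm, max_attempts=3):
    feedback = None
    for attempt in range(1, max_attempts + 1):
        code = llm(feedback)
        namespace = {}
        try:
            exec(code, namespace)  # illustration only; not a safe sandbox
            return namespace["result"], attempt
        except Exception as error:
            # The error text becomes the next prompt, so the model can repair its code.
            feedback = f"{type(error).__name__}: {error}"
    raise RuntimeError(f"no working code after {max_attempts} attempts; last error: {feedback}")

result, attempts = run_with_self_correction(fake_llm)
print(result, attempts)
```

Here the first attempt fails with a `NameError`, and the second attempt succeeds once the error has been fed back.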

In [None]:
# see detailed docs / examples at: https://huggingface.co/docs/transformers/en/agents