<a href="https://colab.research.google.com/github/RDGopal/IB9LQ0-GenAI/blob/main/8_2_simple_agent_with_tools.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Simple Agent with Tools
Here we will enhance a simple, LLM-based agent by giving it access to a simple tool: a calculator. While LLMs have become much better at calculations, in general they do not actually do arithmetic. They answer math questions by predicting probable tokens (i.e. memorisation), not by calculating the actual sums. To reduce the risk of hallucinated answers we will give our agent a simple, Python-based calculator.

First, we need to set up the LLM as before:

In [None]:
# Install dependencies
!pip install -q transformers accelerate

# Imports
import os
import torch
from google.colab import userdata
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the Hugging Face token (stored as a secret in Colab)
# If you haven't already added the secret in Colab, here's how:
# open the Secrets panel (key icon in the left sidebar) and
# set 'HF_TOKEN' to your Hugging Face API token
hf_api_token = userdata.get('HF_TOKEN')  # Make sure the secret is set in Colab

# Set the token if it's retrieved correctly, else print an error
if hf_api_token:
    os.environ["HUGGINGFACEHUB_API_TOKEN"] = hf_api_token
else:
    print("Hugging Face API Token is missing. Please set it in Colab Secrets.")

model_id = "tiiuae/falcon-7b-instruct" # faster but less accurate
#model_id = "tiiuae/Falcon3-10B-Instruct" # this will be quite slow

tokenizer = AutoTokenizer.from_pretrained(model_id, token=hf_api_token)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
)

Now that we have our model we can create our calculator:

In [None]:
# Define calculator tool
def calculator(expression):
    try:
        return str(eval(expression))
    except Exception as e:
        return f"Error: {e}"

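We've created a very simple calculator that takes the output of the LLM (which should be an arithmetic expression) and evaluates it with Python's `eval`. The result is returned as a string. Note that `eval` will run any Python expression it is given, so this is only suitable for a demo like this one.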
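Because the expression comes from the model rather than from us, it can be worth restricting what the calculator accepts. The following is a minimal sketch of one way to do that using the standard library's `ast` module, assuming only basic arithmetic is needed. The names `safe_calculator`, `_evaluate` and `_OPERATORS` are hypothetical and are not used elsewhere in this notebook.

```python
import ast
import operator

# Map the AST operator nodes we allow to the functions that implement them
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.USub: operator.neg,
    ast.UAdd: operator.pos,
}

def _evaluate(node):
    """Recursively evaluate a parsed arithmetic expression."""
    # Plain numbers (bool is excluded because it is a subclass of int)
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return node.value
    # Binary operations such as a + b or a * b
    if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    # Unary operations such as -a
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_evaluate(node.operand))
    # Anything else (names, function calls, attributes, ...) is rejected
    raise ValueError("unsupported expression")

def safe_calculator(expression):
    """Hypothetical alternative to calculator() that does not call eval()."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_evaluate(tree.body))
    except Exception as e:
        return f"Error: {e}"

print(safe_calculator("(12 + 4) * 3"))
print(safe_calculator("__import__('os').getcwd()"))
```

It returns strings in the same way as `calculator`, so it could be swapped in without changing the rest of the agent.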

Now we can define the agent. Note that an agent using a calculator needs to work in three stages:
1. First it needs to determine whether it needs to use the calculator tool;
2. If so, it then needs to pass the equation to the calculator;
3. Once it has the result from the calculator, it needs to use this result in its final response.

For this reason the code may look quite complicated, but we'll break it down.
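Before the full version, the three stages can be sketched without loading a model. This is only an illustration of the control flow: `agent_loop`, `scripted_llm`, `demo_calculator` and `TOOL_PATTERN` are hypothetical names, and `scripted_llm` is a hard-coded stand-in that returns canned text rather than a real LLM.

```python
import re

# Matches e.g. "[TOOL: calculator (12 + 4) * 3]" and captures the expression
TOOL_PATTERN = re.compile(r"\[TOOL:\s*calculator\s+([^\]]+)\]")

def agent_loop(query, generate_fn, tool_fn):
    """Run the three stages with any text-generation function.

    generate_fn takes a prompt string and returns only the newly generated
    text; tool_fn takes an expression string and returns the result as a string.
    """
    prompt = f"User: {query}\nAssistant:"

    # Stage 1: ask the model and check whether it requested the tool
    first_reply = generate_fn(prompt)
    match = TOOL_PATTERN.search(first_reply)
    if match is None:
        return first_reply  # no tool needed

    # Stage 2: pass the requested expression to the tool
    result = tool_fn(match.group(1).strip())

    # Stage 3: hand the result back to the model and get the final answer
    followup = f"{prompt}{first_reply}\n[RESULT: {result}]\nAssistant:"
    return generate_fn(followup)

def scripted_llm(prompt):
    """Canned replies standing in for the LLM (an assumption for this demo)."""
    if "[RESULT:" in prompt:
        result = prompt.split("[RESULT:", 1)[1].split("]", 1)[0].strip()
        return f" The answer is {result}."
    return " [TOOL: calculator (12 + 4) * 3]"

def demo_calculator(expression):
    # Mirrors the notebook's calculator() so this block is self-contained
    return str(eval(expression))

answer = agent_loop("What is (12 + 4) * 3?", scripted_llm, demo_calculator)
print(answer)
```

Passing the generation function in as an argument keeps the tool-handling logic separate from the model, which makes it easier to check the parsing on its own.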

In [None]:
# Simple agent that detects [TOOL: calculator ...] and responds
def run_agent(query):
    prompt = f"""You are a helpful assistant. You may use tools.

      If a calculation is needed, request it like this:
      [TOOL: calculator 5 * (6 + 1)]
      Please replace the formula in the example with the actual calculation.
      Then use the result in your final answer.

      User: {query}
      Assistant:"""

    # Run initial LLM to get the first response ...
    # this may be a call to the calculator

    # first tokenise the query
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # generate the (initial) output from the LLM
    outputs = model.generate(**inputs, max_new_tokens=300)

    # decode (convert from output tokens to natural language)
    # generate() returns the prompt tokens followed by the new tokens, so we
    # keep only the new ones; otherwise the example tool call in the prompt
    # would always be found first
    prompt_length = inputs["input_ids"].shape[1]
    response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
    print("\n Model Response:\n", response)

    # Check for tool call
    if "[TOOL: calculator" in response:
        # if the response includes a call to the calculator then
        try:
            # find the part of the generated text that includes the equation
            # see 8.1 Notebook for more discussion of this
            tool_call = response.split("[TOOL: calculator", 1)[1].split("]", 1)[0].strip()

            # pass the equation to the calculator function
            result = calculator(tool_call)
            print(f"\n Calculator called with: {tool_call} = {result}")

            # Continue with result
            # here we make another call to the LLM with the original prompt,
            # its first response and the result of the calculator
            followup = f"{prompt}{response}\n[RESULT: {result}]\nAssistant:"

            # again, tokenise input, generate and decode only the new tokens
            followup_inputs = tokenizer(followup, return_tensors="pt").to(model.device)
            followup_output = model.generate(**followup_inputs, max_new_tokens=100)
            followup_length = followup_inputs["input_ids"].shape[1]
            final = tokenizer.decode(followup_output[0][followup_length:], skip_special_tokens=True)
            print("\n Final Answer:\n", final)

        except Exception as e:
            print("Tool error:", e)

    else:
        print("\n No tool needed.")

Now we can test it!

In [None]:
# Try it out
run_agent("What is (12 + 4) * 3?")

If you are using the 7-billion-parameter model (which the code uses unless you change it), the results are ...

... underwhelming.

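In my results the calculator was called with the example, 5 * (6 + 1), rather than (12 + 4) * 3. Inexplicably, the model then added 1 to the result and returned 36. A small model may well copy the example instead of substituting the real calculation, but check the parsing before blaming the model: if the text being searched still contains the prompt, the first `[TOOL: calculator` match is the example from the instructions.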
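Whether the example or the real calculation reaches the calculator also depends on which text is searched. For decoder-only models, `model.generate` returns the prompt tokens followed by the new ones, so decoding the whole sequence gives text that starts with the prompt. The string-only sketch below shows the difference. The prompt is shortened, the completion is made up, and `first_tool_call` is a hypothetical helper that uses the same `split` logic as the agent.

```python
# The prompt as the model sees it (shortened), including the example tool call
prompt = (
    "If a calculation is needed, request it like this:\n"
    "[TOOL: calculator 5 * (6 + 1)]\n"
    "User: What is (12 + 4) * 3?\n"
    "Assistant:"
)

# Made-up completion from a model that followed the instructions
completion = " [TOOL: calculator (12 + 4) * 3]"

def first_tool_call(text):
    """Return the expression inside the first [TOOL: calculator ...] marker."""
    return text.split("[TOOL: calculator", 1)[1].split("]", 1)[0].strip()

# Decoding the full output sequence gives prompt + completion
full_text = prompt + completion
from_full_text = first_tool_call(full_text)

# Searching only the newly generated text skips the example in the prompt
from_completion = first_tool_call(full_text[len(prompt):])

print(from_full_text)   # 5 * (6 + 1)
print(from_completion)  # (12 + 4) * 3
```

With real model output the equivalent step is to slice off the prompt tokens before decoding, rather than slicing the string.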

If you want to upgrade the model, comment out the 7-billion-parameter `model_id` line and uncomment the 10-billion-parameter one, and it should work. However, downloading and running it will take much longer.

Regardless though, we have seen how such an approach would work!