# Introduction to AI Agents with Ollama and Llama3.2

## Introduction to Tools

Tools are a way to augment our LLMs with code execution. A tool is simply a function formatted so that our agent can understand how to use it, and then execute it. Let's start by creating a few simple tools.

We can use the `@tool` decorator to turn a standard Python function into an LLM-compatible tool. For best results, the function should include:

- A docstring describing what the tool does and when it should be used. The LLM/agent reads this to decide both when and how to call the tool.
- Clear parameter names that ideally tell the LLM what each parameter is. If it isn't clear, we make sure the docstring explains what the parameter is for and how to use it.
- Both parameter and return type annotations.
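
As a rough illustration of why these three items matter, the sketch below uses only the standard library to collect a function's name, docstring, and annotations into a dictionary, which is roughly the information an LLM needs in order to call it. This is not how `@tool` is implemented; `convert_temperature` and `describe_function` are hypothetical names used for illustration.

```python
import inspect
from typing import get_type_hints


def convert_temperature(celsius: float) -> float:
    """Convert a temperature from Celsius to Fahrenheit.

    Use this when the user gives a temperature in Celsius and asks for Fahrenheit.
    """
    return celsius * 9 / 5 + 32


def describe_function(func) -> dict:
    """Collect the metadata an LLM would need to decide how to call func."""
    hints = get_type_hints(func)
    # Map each parameter name to the name of its annotated type
    params = {
        name: hints.get(name, object).__name__
        for name in inspect.signature(func).parameters
    }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": params,
        "returns": hints.get("return", object).__name__,
    }


spec = describe_function(convert_temperature)
print(spec["name"], spec["parameters"], spec["returns"])
```

If the docstring or the annotations were missing, the corresponding entries would carry no useful information, and the model would have to guess from the function name alone.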


In [21]:
# First, let's install the necessary packages
!pip install -q langchain langchain_community requests

In [22]:
# Import necessary libraries
from langchain_core.tools import tool
from langchain_community.llms import Ollama
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import ConversationBufferMemory
from langchain_core.messages import HumanMessage, AIMessage
import json
import requests
from datetime import datetime
from IPython.display import display, Markdown

# Initialize the Ollama LLM
llm = Ollama(
    model="llama3.2",  # Model name
    temperature=0.0,    # Lower temperature for more deterministic outputs
    base_url="http://localhost:11434"  # Default Ollama server URL
)

# Test the connection
response = llm.invoke("Hello, are you working?")
print(response)

I'm here and ready to help. How can I assist you today?


## Creating Tools

We can still use the `@tool` decorator to create tools, but we need to handle tool calling ourselves: the `Ollama` class imported above from `langchain_community.llms` is a plain text-completion wrapper and does not implement the `bind_tools` method. Let's create our mathematical tools:


In [23]:
@tool
def add(x: float, y: float) -> float:
    """Add 'x' and 'y'."""
    return x + y

@tool
def multiply(x: float, y: float) -> float:
    """Multiply 'x' and 'y'."""
    return x * y

@tool
def exponentiate(x: float, y: float) -> float:
    """Raise 'x' to the power of 'y'."""
    return x ** y

@tool
def subtract(x: float, y: float) -> float:
    """Subtract 'x' from 'y'."""
    return y - x


With the `@tool` decorator, our function is turned into a `StructuredTool` object, which we can see below:

In [24]:
print(f"{add.name=}\n{add.description=}")
print(add.args_schema.model_json_schema())
print(exponentiate.args_schema.model_json_schema())

add.name='add'
add.description="Add 'x' and 'y'."
{'description': "Add 'x' and 'y'.", 'properties': {'x': {'title': 'X', 'type': 'number'}, 'y': {'title': 'Y', 'type': 'number'}}, 'required': ['x', 'y'], 'title': 'add', 'type': 'object'}
{'description': "Raise 'x' to the power of 'y'.", 'properties': {'x': {'title': 'X', 'type': 'number'}, 'y': {'title': 'Y', 'type': 'number'}}, 'required': ['x', 'y'], 'title': 'exponentiate', 'type': 'object'}


When the tool is invoked, the JSON string output by the LLM is parsed into a Python dictionary and then passed to the function as keyword arguments, similar to the following:

In [25]:
llm_output_string = "{\"x\": 5, \"y\": 2}"  # this is the output from the LLM
llm_output_dict = json.loads(llm_output_string)  # load as dictionary
print(llm_output_dict)

# This is then passed into the tool function as kwargs (keyword arguments) as indicated by the ** operator
result = exponentiate.func(**llm_output_dict)
print(result)


{'x': 5, 'y': 2}
25
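
Because the keys come from the model, they may not match the function's parameters. One possible safeguard, sketched here with the standard library only, is to check the parsed keys against the signature before calling. `call_with_checked_kwargs` is a hypothetical helper, and `exponentiate` is redefined as a plain function so the block runs on its own.

```python
import inspect
import json


def exponentiate(x: float, y: float) -> float:
    """Raise 'x' to the power of 'y'."""
    return x ** y


def call_with_checked_kwargs(func, raw_json: str):
    """Parse raw_json and call func only if the keys match its parameters."""
    kwargs = json.loads(raw_json)
    try:
        # bind() raises TypeError for missing or unexpected arguments
        inspect.signature(func).bind(**kwargs)
    except TypeError as exc:
        return f"ERROR: bad arguments for {func.__name__}: {exc}"
    return func(**kwargs)


print(call_with_checked_kwargs(exponentiate, '{"x": 5, "y": 2}'))
print(call_with_checked_kwargs(exponentiate, '{"x": 5}'))
```

The second call returns an error string instead of raising, which an agent loop could feed back to the model as an observation. Note that this checks argument names only, not their types.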


## Creating a Custom Agent for Ollama

Since the `Ollama` LLM wrapper used here doesn't implement the `bind_tools` method required by LangChain's `create_tool_calling_agent`, we need to create a custom agent implementation. We'll implement a simple tool-calling loop that works with it.


In [26]:
def run_simple_tool_agent(llm, tools, user_input, max_steps: int = 6):
    """
    Simple tool-calling loop for LLMs that don't implement bind_tools().
    Expects the LLM to either:
      - return a JSON object like {"tool": "add", "x": 2, "y": 3}
      - or produce a non-JSON final answer (treated as the final response)
    """
    import json  # repeated here so the function is self-contained
    tools_map = {t.name: t for t in tools}  # StructuredTool objects from the notebook
    scratchpad = []  # keep a running history of tool calls / observations
    for step in range(max_steps):
        # Build a short instruction + scratchpad to send to the LLM
        instruction = (
            "You are a calculator agent. When you need to use a tool, output a JSON object "
            "with the key 'tool' and parameters for that tool, e.g. "
            "{\"tool\": \"add\", \"x\": 2, \"y\": 3}. "
            "When you are done, return the final answer as plain text."
        )
        if scratchpad:
            instruction += "\n\nScratchpad:\n" + "\n".join(scratchpad)
        prompt_text = f"{instruction}\n\nUser: {user_input}\n\nAgent:"
        resp = llm.invoke(prompt_text).strip()
        # Try to parse JSON -> treat as tool call
        try:
            payload = json.loads(resp)
            if not isinstance(payload, dict) or "tool" not in payload:
                # Not the expected shape -> treat as final answer
                return resp
            tool_name = payload.pop("tool")
            if tool_name not in tools_map:
                scratchpad.append(f"ERROR: unknown tool '{tool_name}'")
                continue
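            tool = tools_map[tool_name]
            # Call the underlying function (StructuredTool stores the original on .func).
            # Report bad arguments back to the model instead of crashing the loop.
            try:
                observation = tool.func(**payload)
            except Exception as exc:
                scratchpad.append(f"ERROR: calling '{tool_name}' failed: {exc}")
                continue
            scratchpad.append(f"CALL {tool_name}({payload}) -> {observation}")
            # continue the loop so the LLM can use the observation
            continue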
        except json.JSONDecodeError:
            # Not JSON => assume final natural language answer
            return resp
    # If max_steps exhausted, return the last LLM response or a timeout message
    return "Agent stopped after max steps. " + (resp if 'resp' in locals() else "")
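
To exercise the loop's control flow without a running Ollama server, one option is to replay canned replies. The sketch below is a condensed, stdlib-only variant of the loop; `FakeTool`, `ScriptedLLM`, and `run_loop` are hypothetical names for illustration, and the scripted replies are what a well-behaved model might emit, not real model output.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class FakeTool:
    """Hypothetical stand-in exposing the two attributes the loop uses."""
    name: str
    func: Callable[..., float]


class ScriptedLLM:
    """Hypothetical stand-in for the LLM: replays canned replies in order."""

    def __init__(self, replies: list[str]):
        self.replies = list(replies)
        self.prompts: list[str] = []

    def invoke(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.replies.pop(0)


def run_loop(llm, tools, user_input: str, max_steps: int = 6) -> str:
    """Condensed JSON tool-calling loop: one tool call per turn."""
    tools_map = {t.name: t for t in tools}
    scratchpad: list[str] = []
    for _ in range(max_steps):
        prompt = "\n".join(scratchpad) + f"\nUser: {user_input}\nAgent:"
        resp = llm.invoke(prompt).strip()
        try:
            payload = json.loads(resp)
        except json.JSONDecodeError:
            return resp  # plain text is treated as the final answer
        if not isinstance(payload, dict) or "tool" not in payload:
            return resp
        name = payload.pop("tool")
        if name not in tools_map:
            scratchpad.append(f"ERROR: unknown tool '{name}'")
            continue
        observation = tools_map[name].func(**payload)
        scratchpad.append(f"CALL {name}({payload}) -> {observation}")
    return "Agent stopped after max steps."


fake_tools = [
    FakeTool("add", lambda x, y: x + y),
    FakeTool("multiply", lambda x, y: x * y),
    FakeTool("exponentiate", lambda x, y: x ** y),
]
scripted = ScriptedLLM([
    '{"tool": "add", "x": 2, "y": 3}',
    '{"tool": "multiply", "x": 5, "y": 4}',
    '{"tool": "exponentiate", "x": 20, "y": 2}',
    "The result is 400.",
])
answer = run_loop(scripted, fake_tools, "Compute (2+3)*4 and then square it.")
print(answer)
print(scripted.prompts[-1])  # the last prompt shows the accumulated scratchpad
```

The loop only makes progress when each reply is exactly one JSON object, which is a strong assumption for a small local model.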

Now let's test our custom agent with a mathematical query:


In [27]:
# Define our tools
tools = [add, subtract, multiply, exponentiate]

# Example usage
user_input = "Compute (2+3)*4 and then raise the result to the power 2."
result = run_simple_tool_agent(llm=llm, tools=tools, user_input=user_input)
print(result)


{"tool": "multiply", "x": "(2+3)", "y": 4}
{"tool": "add", "x": 5, "y": 0}
{"tool": "power", "x": 5, "y": 2} 

25
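
Note that no tool was actually called in this run. The model emitted several JSON objects (one with a string argument, one naming a non-existent `power` tool) plus a guessed answer in a single reply, so `json.loads` raised `JSONDecodeError` and the loop returned the raw text. The correct result is ((2+3)*4)^2 = 400, not 25. One way to make the loop more tolerant is to act only on the first JSON object in the reply. The sketch below uses `json.JSONDecoder.raw_decode` from the standard library; `first_json_object` is a hypothetical helper, not part of LangChain.

```python
import json
from typing import Optional


def first_json_object(text: str) -> Optional[dict]:
    """Return the first JSON object found in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one value starting at `start` and ignores the rest
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)
    return None


reply = (
    '{"tool": "multiply", "x": "(2+3)", "y": 4}\n'
    '{"tool": "add", "x": 5, "y": 0}\n'
    "25"
)
print(first_json_object(reply))
print(first_json_object("The answer is 400."))
```

Even with this helper, the first call above still carries the string `"(2+3)"` as an argument, so some form of argument validation would be needed before calling the tool.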


## Adding Memory to Our Agent

To make our agent remember previous interactions, we need to add memory functionality. Let's create an enhanced version of our agent that includes conversation memory:
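
Conceptually, a buffer memory is just an ordered list of past turns that gets rendered into the next prompt. The sketch below shows that idea with the standard library only; `SimpleBufferMemory` is a hypothetical class for illustration and is not LangChain's `ConversationBufferMemory`.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleBufferMemory:
    """Hypothetical, stdlib-only stand-in for a conversation buffer."""
    turns: list[tuple[str, str]] = field(default_factory=list)

    def save_turn(self, user_text: str, assistant_text: str) -> None:
        # Store the user message and the reply as two consecutive entries
        self.turns.append(("Human", user_text))
        self.turns.append(("Assistant", assistant_text))

    def as_prompt_text(self) -> str:
        # Render every stored turn as one "Role: content" line
        return "".join(f"{role}: {content}\n" for role, content in self.turns)


buffer = SimpleBufferMemory()
buffer.save_turn("My name is James", "Hello James!")
prompt = (
    "Conversation History:\n" + buffer.as_prompt_text()
    + "\nUser: What is my name\n\nAgent:"
)
print(prompt)
```

Because the whole history is replayed on every call, the prompt grows with each turn; a long conversation would eventually need truncation or summarization.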


In [28]:
def run_tool_agent_with_memory(llm, tools, user_input, memory=None, max_steps: int = 6):
    """
    Tool-calling agent with memory for LLMs that don't implement bind_tools().
    """
    import json  # repeated here so the function is self-contained
    tools_map = {t.name: t for t in tools}
    scratchpad = []
    
    # Format conversation history
    history = ""
    if memory is not None:
        for message in memory.chat_memory.messages:
            if isinstance(message, HumanMessage):
                history += f"Human: {message.content}\n"
            elif isinstance(message, AIMessage):
                history += f"Assistant: {message.content}\n"
    
    for step in range(max_steps):
        instruction = (
            "You are a helpful assistant. You have access to the following tools: " + 
            ", ".join(tools_map.keys()) + ".\n\n" +
            "When you need to use a tool, output a JSON object with the key 'tool' and parameters for that tool, e.g. " +
            "{\"tool\": \"add\", \"x\": 2, \"y\": 3}.\n\n" +
            "When you are done, return the final answer as plain text.\n\n" +
            "Conversation History:\n" + history + "\n\n" +
            "Scratchpad:\n" + "\n".join(scratchpad) + "\n\n" +
            "User: " + user_input + "\n\n" +
            "Agent:"
        )
        resp = llm.invoke(instruction).strip()
        # Try to parse JSON -> treat as tool call
        try:
            payload = json.loads(resp)
            if not isinstance(payload, dict) or "tool" not in payload:
                # Not the expected shape -> treat as final answer
                # Update memory with the final answer
                if memory is not None:
                    memory.save_context({"input": user_input}, {"output": resp})
                return resp
            tool_name = payload.pop("tool")
            if tool_name not in tools_map:
                scratchpad.append(f"ERROR: unknown tool '{tool_name}'")
                continue
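            tool = tools_map[tool_name]
            # Report bad arguments back to the model instead of crashing the loop
            try:
                observation = tool.func(**payload)
            except Exception as exc:
                scratchpad.append(f"ERROR: calling '{tool_name}' failed: {exc}")
                continue
            scratchpad.append(f"CALL {tool_name}({payload}) -> {observation}")
            # continue the loop so the LLM can use the observation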
        except json.JSONDecodeError:
            # Not JSON => assume final natural language answer
            if memory is not None:
                memory.save_context({"input": user_input}, {"output": resp})
            return resp
    # If max_steps exhausted, return the last LLM response or a timeout message
    result = "Agent stopped after max steps. " + (resp if 'resp' in locals() else "")
    if memory is not None:
        memory.save_context({"input": user_input}, {"output": result})
    return result

Now let's create a memory object and test our agent with memory:


In [29]:
# Create a memory object
memory = ConversationBufferMemory(
    memory_key="chat_history",  # key used by load_memory_variables(); our custom loop reads memory.chat_memory directly
    return_messages=True  # to return Message objects
)

# Test with a simple query
result = run_tool_agent_with_memory(
    llm=llm, 
    tools=tools, 
    user_input="what is 10.7 multiplied by 7.68?",
    memory=memory
)
print(result)

# Give the agent a fact to remember
result = run_tool_agent_with_memory(
    llm=llm, 
    tools=tools, 
    user_input="My name is James",
    memory=memory
)
print(result)

# Test with a complex calculation
result = run_tool_agent_with_memory(
    llm=llm, 
    tools=tools, 
    user_input="What is nine plus 10, minus 4 * 2, to the power of 3",
    memory=memory
)
print(result)

# Test if the agent remembers the name
result = run_tool_agent_with_memory(
    llm=llm, 
    tools=tools, 
    user_input="What is my name",
    memory=memory
)
print(result)

Agent stopped after max steps. {"tool": "multiply", "x": 10.7, "y": 7.68}
Hello James! It's nice to meet you. How can I assist you today?
{"tool": "add", "x": 9, "y": 10}
{"tool": "subtract", "x": {"tool": "multiply", "x": 4, "y": 2}, "y": 11}
{"tool": "exponentiate", "x": 11, "y": 3}
Hello James! I'm happy to help you with your questions. You haven't asked a question yet, so let's get started! What would you like to know or calculate?
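
These results are mixed. The name was remembered, but the first query ran out of steps even though each tool call was well-formed, and the third query produced several (partly nested) tool calls in one reply, which the loop returned as plain text. A plausible cause for the first failure is that the prompt never explains what the scratchpad lines mean, so the model keeps re-issuing the same call. The sketch below shows one way the instruction could be made more explicit; `build_instruction` is a hypothetical helper, and whether it fixes the behaviour for a given model is untested here.

```python
def build_instruction(tool_docs: dict[str, str], scratchpad: list[str],
                      history: str, user_input: str) -> str:
    """Assemble a prompt that documents each tool and explains the scratchpad."""
    tool_lines = "\n".join(f"- {name}: {doc}" for name, doc in tool_docs.items())
    notes = "\n".join(scratchpad) if scratchpad else "(no tool calls yet)"
    return (
        "You are a helpful assistant. You can call these tools:\n"
        f"{tool_lines}\n\n"
        "To call a tool, reply with ONE JSON object and nothing else, e.g. "
        '{"tool": "add", "x": 2, "y": 3}.\n'
        "The scratchpad lists tool calls already made and their results. "
        "Do not repeat a call that is already there. Once the scratchpad "
        "contains what you need, reply with the final answer as plain text.\n\n"
        f"Conversation History:\n{history}\n"
        f"Scratchpad:\n{notes}\n\n"
        f"User: {user_input}\n\nAgent:"
    )


# With real StructuredTool objects this dict could be built from t.name and t.description
docs = {"multiply": "Multiply 'x' and 'y'.", "add": "Add 'x' and 'y'."}
text = build_instruction(
    docs,
    scratchpad=["CALL multiply({'x': 10.7, 'y': 7.68}) -> 82.176"],
    history="",
    user_input="what is 10.7 multiplied by 7.68?",
)
print(text)
```

Listing each tool's description, rather than only its name, also gives the model the argument semantics that the docstrings were written to convey.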


## Creating a Weather Agent

Now let's create tools for getting location and weather information:

In [30]:
@tool
def get_location_from_ip() -> str:
    """Get the geographical location based on the IP address."""
    try:
        # Use a timeout so a stalled request cannot hang the agent
        response = requests.get("https://ipinfo.io/json", timeout=10)
        data = response.json()
        if 'loc' in data:
            latitude, longitude = data['loc'].split(',')
            data = (
                f"Latitude: {latitude},\n"
                f"Longitude: {longitude},\n"
                f"City: {data.get('city', 'N/A')},\n"
                f"Country: {data.get('country', 'N/A')}"
            )
            return data
        else:
            return "Location could not be determined."
    except Exception as e:
        return f"Error occurred: {e}"

@tool
def get_current_datetime() -> str:
    """Return the current date and time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

@tool
def get_weather(location: str) -> str:
    """Get the current weather for a specified location.
    
    Args:
        location: The location to get weather for, e.g., "London, UK"
        
    Returns:
        Weather information as a string
    """
    try:
        # For this example, we'll simulate weather data
        # In a real implementation, you would use a weather API
        return f"The weather in {location} is partly cloudy with a temperature of 22°C."
    except Exception as e:
        return f"Error getting weather: {e}"

Now let's test our weather agent:

In [31]:
# Create a new memory object for the weather agent
weather_memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

# Define the weather tools
weather_tools = [get_current_datetime, get_location_from_ip, get_weather]

# Test the weather agent
result = run_tool_agent_with_memory(
    llm=llm, 
    tools=weather_tools, 
    user_input="I have a few questions, what is the date and time right now? How is the weather where I am? Please give me degrees in Celsius",
    memory=weather_memory
)
display(Markdown(result))

{"tool": "get_current_datetime",}

{"tool": "get_weather",}
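
Neither tool ran here. Both objects end with a trailing comma, which is invalid JSON, and the reply contains two objects, so `json.loads` fails and the loop returns the raw text as the final answer. One possible mitigation is to parse the reply line by line and strip trailing commas first. The sketch below is a naive version of that idea; `parse_tool_call` is a hypothetical helper, and the regular expression would also alter a string value that happens to contain `,}`.

```python
import json
import re
from typing import Optional

# A comma followed only by whitespace and a closing brace or bracket
TRAILING_COMMA = re.compile(r",\s*([}\]])")


def parse_tool_call(line: str) -> Optional[dict]:
    """Parse one line as a tool call, tolerating a trailing comma."""
    cleaned = TRAILING_COMMA.sub(r"\1", line.strip())
    try:
        payload = json.loads(cleaned)
    except json.JSONDecodeError:
        return None
    if isinstance(payload, dict) and "tool" in payload:
        return payload
    return None


reply = '{"tool": "get_current_datetime",}\n\n{"tool": "get_weather",}'
calls = [c for c in map(parse_tool_call, reply.splitlines()) if c is not None]
print(calls)
```

Even after this repair, the `get_weather` call is missing its required `location` argument, so the agent would still need to call `get_location_from_ip` first and pass the result along.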