# Why Agents?

What is an agent?

An agent combines a model that can perform some kind of reasoning, such as a large language model (e.g., ChatGPT, Llama 2, Mistral), with tools that give it access to the real world,
so it can do things like browse the internet or buy stuff.

Ok, so, why is there so much hype around agents right now?

Because agents are cool! With the recent advances in LLMs, we've seen them become an amazing tool for all sorts of things, like building apps, browsing the internet, and more.



Some neat examples of these kinds of agents can be found here:

- [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT)
- [GPT-Engineer](https://github.com/gpt-engineer-org/gpt-engineer)
- [BabyAGI](https://github.com/yoheinakajima/babyagi)

Although they seem extremely powerful, agents are still at a very early stage in terms of readiness to deploy as products, something you can attest to by listening to Andrej Karpathy discuss agents in this talk:

- [Karpathy on Agents](https://www.youtube.com/watch?v=fqVLjtvWgq8)

This live-training is all about getting you excited about this amazing new technology and understanding it from the ground up, with a focus on practical applications and fun stuff you can do with agents!

# What is an Agent?

An agent is nothing more than an entity that can _think_ and _act_. That's right: in a way, you're an agent!

After all, you can think and act on those thoughts, as in the case of coming to this live-training:

- Thought: "I want to learn about agents"

- Action: "Go to the internet and research cool platforms where I can learn about agents"

- Thought: "O'Reilly has some awesome courses and live-trainings"

- Action: "Look up O'Reilly courses"

- Thought: "Live-trainings by instructor Lucas are awesome"

- Action: "Schedule live-training about agents with instructor Lucas Soares" (lol)

In a way, this is a simplified rendition of what brought you here, though not necessarily in this particular order or with these particular thought and action pairs. This way of structuring thoughts and actions is well represented in the paper [ReAct](https://arxiv.org/pdf/2210.03629.pdf).
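To make the thought/action structure concrete, here is a minimal sketch that stores the pairs listed above as plain data and prints them in the same "Thought / Action" layout. The `Step` class and `format_trace` helper are hypothetical names made up for this illustration; they are not part of the ReAct paper or of any library.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One thought/action pair (hypothetical helper for illustration)."""
    thought: str
    action: str


# Two of the pairs from the list above, stored as data
trace = [
    Step("I want to learn about agents",
         "Go to the internet and research cool platforms where I can learn about agents"),
    Step("O'Reilly has some awesome courses and live-trainings",
         "Look up O'Reilly courses"),
]


def format_trace(steps):
    """Render the steps as alternating Thought/Action lines."""
    lines = []
    for step in steps:
        lines.append(f"Thought: {step.thought}")
        lines.append(f"Action: {step.action}")
    return "\n".join(lines)


print(format_trace(trace))
```

A text layout like this is roughly what we will later ask the model to produce, so that each action can be picked out and executed.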

With regard to LLMs, how can we bring this idea to fruition, treating the LLM as the reasoning and thinking engine?

We can start simple and just call the OpenAI API:

In [None]:
%pip install openai

In [43]:
# uncomment this if running locally
#!pip install python-dotenv
# from dotenv import load_dotenv

# load_dotenv()

import os
import getpass

def _set_env(var: str):
    if not os.environ.get(var):
        # Prompt with the actual variable name (the original f-string printed the literal text "var")
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("OPENAI_API_KEY")

In [44]:
from openai import OpenAI
from IPython.display import Markdown
client = OpenAI()

def get_response(prompt_question):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful research and programming assistant"},
            {"role": "user", "content": prompt_question}]
    )
    
    return response.choices[0].message.content

output = get_response("Create a joke about a bald teacher explaining agents to wonderful students.")
Markdown(output)

Why did the bald teacher become an expert on agents? 

Because he knew that just like hair, their ability to adapt can really make a difference in the classroom!

OK cool. Now, here are three ideas for actions an agent could perform:

- Creating directories
- Creating files
- Listing files

Let's transform them into functions that we could call just like in any type of Python-based application.

In [45]:
def create_directory(dir_name):
    os.makedirs(dir_name, exist_ok=True)

def create_file(filename):
    with open(filename, 'w'):
        pass

def list_files():
    files = os.listdir()
    for file in files:
        print(file)

Now, let's imagine that we want to create an agent that performs these actions for us based on some input we give it. How can we connect models that we know and can use today, like ChatGPT, with these tools that do stuff in the real world?

To answer this question, how about we give the model a task, ask it to list the steps needed to complete that task, and then, for each step, ask it to decide whether a function should be called to execute it?
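Here is a rough sketch of that idea with the model swapped out for two hand-written stand-ins, so it runs without an API key. Everything in it is an assumption for illustration: `fake_plan` and `fake_decide` play the role of the LLM, and the tools only describe what they would do instead of touching the file system.

```python
def fake_plan(task):
    """Stand-in for asking the model to list the steps for a task."""
    return [
        "create folder demo",
        "create file demo/notes.md",
        "tell the user we are done",
    ]


def fake_decide(step):
    """Stand-in for asking the model whether a step needs a function call.

    Returns (function_name, argument), or None when no function is needed.
    """
    words = step.split()
    if words[:2] == ["create", "folder"]:
        return "create_directory", words[2]
    if words[:2] == ["create", "file"]:
        return "create_file", words[2]
    return None


# Hypothetical tools that only report what they would do
TOOLS = {
    "create_directory": lambda name: f"would create directory {name}",
    "create_file": lambda name: f"would create file {name}",
}


def run(task):
    """Plan the task, then call a tool for every step that needs one."""
    executed = []
    for step in fake_plan(task):
        decision = fake_decide(step)
        if decision is None:
            continue  # this step does not need a function call
        name, arg = decision
        executed.append((name, TOOLS[name](arg)))
    return executed


for name, result in run("Create a folder with a notes file"):
    print(name, "->", result)
```

The hard part, which the rest of this notebook explores, is replacing the two stand-ins with a real model whose output we can parse reliably.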

In the famous paper ['Toolformer'](https://arxiv.org/pdf/2302.04761.pdf), the authors demonstrated that language models can teach themselves how to properly call and use external tools!

Isn't that awesome???

So, let's see if we can hack our way into connecting the LLM response with the functions that we want that LLM to use.

In [4]:
def get_response(prompt_question):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": "You are a helpful research and programming assistant"},
                {"role": "user", "content": prompt_question}]
    )
    
    return response.choices[0].message.content


Now, how can we actually put it all together so that given a task, a model can:

- Plan the task
- Execute actions to complete the task
- Know when to call a function


This is actually an interesting problem. Let's understand why by trying to hack our way into putting all of these together:

In [46]:
def get_response(prompt_question, model="gpt-3.5-turbo-16k"):
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "system", "content": "You are a helpful research and programming assistant"},
                {"role": "user", "content": prompt_question}]
    )
    
    return response.choices[0].message.content

def create_directory(dir_name):
    os.makedirs(dir_name, exist_ok=True)

def create_file(filename):
    with open(filename, 'w'):
        pass

def list_files():
    files = os.listdir()
    for file in files:
        print(file)

    

task_description = "Create a folder called 'funny-pancakes-recipes'. Inside that folder, \
create a file called '3-funny-pancake-recipes.md'."

prompt = f"""Given this task: {task_description}, \n
        Consider you have access to the following functions:
                            
    def create_directory(dir_name):
        os.makedirs(dir_name, exist_ok=True)

    def create_file(filename):
        with open(filename, 'w'):
            pass

    def list_files():
        files = os.listdir()
        for file in files:
            print(file)
    
    Your output should be the first function to be executed to complete the task containing the necessary arguments.
    The OUTPUT SHOULD ONLY BE THE PYTHON FUNCTION CALL and NOTHING ELSE.
    """

output = get_response(prompt)

Markdown(output)

```python
create_directory('funny-pancakes-recipes')
```

In [None]:
import os
# Remove any ```python or ``` tags from the output string if present
output_clean = output.replace("```python", "").replace("```", "").strip()
Markdown(output_clean)

create_directory('funny-pancakes-recipes')
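The chained `replace` calls work for this output, but a small helper makes the intent clearer. The sketch below assumes the model wraps its answer in at most one fenced block, optionally tagged `python`; `extract_code` is a hypothetical name, and the fence marker is built with `"`" * 3` only to keep the snippet easy to embed.

```python
import re

FENCE = "`" * 3  # three backticks


def extract_code(text):
    """Return the body of the first fenced block, or the stripped text if there is none."""
    pattern = re.escape(FENCE) + r"(?:python)?\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    return (match.group(1) if match else text).strip()


raw_output = FENCE + "python\ncreate_directory('funny-pancakes-recipes')\n" + FENCE
print(extract_code(raw_output))
```

If the model adds prose around the fence, this still returns only the code inside it, which the plain `replace` approach would not.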

In [50]:
# exec runs whatever code the model returned, so only do this in a throwaway environment
exec(output_clean)

In [51]:
!ls -d ./* | grep pancakes

./funny-pancakes-recipes


At this point we can start identifying a lot of issues with this approach, despite our early success:

- Uncertainty in the model's outputs can affect our ability to reliably call the functions
- We need more structured ways to prepare the inputs of the function calls
- We need better ways to put everything together (just feeding the entire functions like this makes it a very clunky and non-scalable framework for more complex cases)
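One way to take some of the risk out of running model output is to parse it instead of handing it to `exec`. The sketch below is an assumption about how that could look, not what any framework does: it uses Python's `ast` module to accept only a single call to an allow-listed function with literal arguments. `parse_call` is a hypothetical helper name.

```python
import ast


def parse_call(source, allowed):
    """Parse "name(literal, ...)" into (name, args, kwargs) without executing it."""
    tree = ast.parse(source.strip(), mode="eval")
    call = tree.body
    # Reject anything that is not a plain call such as create_directory('x')
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("expected a single plain function call")
    name = call.func.id
    if name not in allowed:
        raise ValueError(f"function {name!r} is not allowed")
    if any(keyword.arg is None for keyword in call.keywords):
        raise ValueError("**kwargs unpacking is not supported")
    # literal_eval only accepts literals (strings, numbers, lists, ...)
    args = [ast.literal_eval(arg) for arg in call.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return name, args, kwargs


name, args, kwargs = parse_call(
    "create_directory('funny-pancakes-recipes')",
    allowed={"create_directory", "create_file", "list_files"},
)
print(name, args, kwargs)
```

With the parsed pieces in hand, the call can be dispatched through a dictionary of known functions, so something like `os.system(...)` is rejected before anything runs.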

There are many more issues, but starting with these, we can now look at frameworks, see how they fix these issues, and with that in mind understand what is behind their implementations!

I personally think this is a much better way to understand what is going on behind agents in practice than just using higher-level frameworks right off the bat!

# OpenAI Functions

OK, let's first understand how [OpenAI](https://openai.com/), the company behind ChatGPT, allows for these function call implementations in its API.

OpenAI implemented a [function calling API](https://platform.openai.com/docs/guides/function-calling) which is a standard way to connect their models to outside tools like in the very simple example we did above.

According to their [official documentation](https://platform.openai.com/docs/guides/function-calling#:~:text=The%20basic%20sequence,to%20the%20user.) the sequence of steps for function calling is as follows:
1. Call the model with the user query and a set of functions defined in the `tools` parameter.
2. The model can choose to call one or more functions; if so, the content will be a stringified JSON object adhering to your custom schema (note: the model may hallucinate parameters).
3. Parse the string into JSON in your code, and call your function with the provided arguments if they exist.
4. Call the model again by appending the function response as a new message, and let the model summarize the results back to the user.
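Steps 2 and 3 can be tried offline. The sketch below assumes the model's tool call has already been reduced to a function name and a JSON string of arguments (hard-coded here, not a real API response), and shows one defensive way to parse and dispatch it. `describe_directory` and `dispatch` are hypothetical names for this illustration.

```python
import json


def describe_directory(directory_name):
    """Hypothetical tool: reports the directory it was asked about, creates nothing."""
    return json.dumps({"directory_name": directory_name})


AVAILABLE = {"describe_directory": describe_directory}


def dispatch(name, arguments_json):
    """Parse the JSON arguments and call the matching function, returning a JSON string."""
    if name not in AVAILABLE:
        return json.dumps({"error": f"unknown function {name}"})
    try:
        arguments = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return json.dumps({"error": f"invalid JSON arguments: {exc.msg}"})
    try:
        return AVAILABLE[name](**arguments)
    except TypeError as exc:
        # Raised when the model invents, omits, or mis-shapes parameters
        return json.dumps({"error": str(exc)})


print(dispatch("describe_directory", '{"directory_name": "demo"}'))
print(dispatch("describe_directory", '{"path": "demo"}'))
```

Returning errors as JSON strings instead of raising means they can be appended as the function response in step 4, giving the model a chance to correct itself.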

Below is an example taken from their official documentation:

In [52]:
from openai import OpenAI
import json

client = OpenAI()

Let's look at how our previous model with those three simple functions: `create_directory()`, `create_file()`, and `list_files()` would be implemented using OpenAI's function calling approach:

In [53]:
import json
import os

def create_directory(directory_name):
    """Function that creates a directory given a directory name."""
    # exist_ok=True keeps this cell re-runnable when the directory already exists
    os.makedirs(directory_name, exist_ok=True)
    return json.dumps({"directory_name": directory_name})


tool_create_directory = {
    "type": "function",
    "function": {
        "name": "create_directory",
        "description": "Create a directory given a directory name.",
        "parameters": {
            "type": "object",
            "properties": {
                "directory_name": {
                    "type": "string",
                    "description": "The name of the directory to create.",
                }
            },
            "required": ["directory_name"],
        },
    },
}

tools = [tool_create_directory]
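Since the model may hallucinate parameters, the same schema can double as a quick sanity check before calling the function. The sketch below is a simplified assumption: `check_arguments` is a hypothetical helper that only looks at the `required` and `properties` keys, and it is not a full JSON Schema validator.

```python
# A copy of the "parameters" part of the tool definition above
parameters_schema = {
    "type": "object",
    "properties": {
        "directory_name": {
            "type": "string",
            "description": "The name of the directory to create.",
        }
    },
    "required": ["directory_name"],
}


def check_arguments(schema, arguments):
    """Return a list of problems found in the arguments (empty list means OK)."""
    problems = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name in arguments:
        if name not in properties:
            problems.append(f"unexpected argument: {name}")
    return problems


print(check_arguments(parameters_schema, {"directory_name": "demo"}))
print(check_arguments(parameters_schema, {"path": "demo"}))
```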

In [54]:
import json

def run_terminal_task():
    messages = [
        {"role": "user", 
         "content": "Create a folder called 'pancakes-are-better-than-waffles'."}]
    tools = [tool_create_directory]  
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        tools=tools,
        tool_choice="auto",  # auto is default, but we'll be explicit
    )
    response_message = response.choices[0].message
    tool_calls = response_message.tool_calls

    # Step 2: check if the model wanted to call a function
    if not tool_calls:
        # No tool call requested: return the first response so the caller never gets None
        return response

    # Step 3: call the function
    # Note: the JSON response may not always be valid; be sure to handle errors
    available_functions = {
        "create_directory": create_directory,
    }
    messages.append(response_message)
    # Step 4: send the info for each function call and function response to the model
    for tool_call in tool_calls:
        function_name = tool_call.function.name
        function_to_call = available_functions[function_name]
        function_args = json.loads(tool_call.function.arguments)
        function_response = function_to_call(
            directory_name=function_args.get("directory_name"),
        )
        messages.append(
            {
                "tool_call_id": tool_call.id,
                "role": "tool",
                "name": function_name,
                "content": function_response,
            }
        )
    second_response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
    )
    return second_response

output = run_terminal_task()
output

ChatCompletion(id='chatcmpl-CO2ZtXCJHhWO2vgc6V4qmtDaD8QTR', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="A folder called 'pancakes-are-better-than-waffles' has been created successfully!", refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1759845817, model='gpt-4o-mini-2024-07-18', object='chat.completion', service_tier='default', system_fingerprint='fp_560af6e559', usage=CompletionUsage(completion_tokens=21, prompt_tokens=74, total_tokens=95, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))

In [55]:
output.choices[0].message.content

"A folder called 'pancakes-are-better-than-waffles' has been created successfully!"

In [56]:
!ls -d */ | grep waffles

pancakes-are-better-than-waffles/


Great! We implemented OpenAI function calling for creating directories! We could evolve this approach, but let's stop here for now.

See more info on these examples from OpenAI's [official cookbook](https://cookbook.openai.com/examples/how_to_call_functions_with_chat_models).

Now, let's implement the agent loop using a few LangChain components, but write the loop logic ourselves!
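Before bringing in a real model, it can help to see the bare loop with the model replaced by a script. This is only a sketch under stated assumptions: `scripted_model` is a hypothetical stand-in that picks the next action from the number of observations so far, and the tools just pretend to act.

```python
def scripted_model(query, observations):
    """Hypothetical stand-in for the LLM: returns a tool request or a final answer."""
    script = [
        {"tool": "create_dir", "args": {"folder_path": "demo"}},
        {"tool": "create_file", "args": {"file_path": "demo/a.txt"}},
    ]
    if len(observations) < len(script):
        return script[len(observations)]
    return {"final": "Created demo and demo/a.txt. END"}


# Pretend tools: they describe the action instead of touching the disk
TOOLS = {
    "create_dir": lambda folder_path: f"(pretend) created folder {folder_path}",
    "create_file": lambda file_path: f"(pretend) created file {file_path}",
}


def run_agent(query, max_iterations=5):
    """Ask the model, run the requested tool, feed the result back, repeat."""
    observations = []
    for _ in range(max_iterations):
        decision = scripted_model(query, observations)
        if "final" in decision:
            return decision["final"], observations
        tool = TOOLS[decision["tool"]]
        observations.append(tool(**decision["args"]))
    return "Stopped: iteration limit reached", observations


answer, observations = run_agent("Create a folder named demo with a file inside")
print(answer)
print(observations)
```

The two exits matter: the loop ends either when the model produces a final answer or when the iteration cap is hit, which protects against a model that keeps requesting tools forever.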


In [14]:
from langchain_ollama import ChatOllama

llm_structured_output = ChatOllama(model="llama3.2", format="json", temperature=0)

llm_structured_output.invoke("Output names of cats!")

AIMessage(content='{ "Mittens":"A grey and white cat with bright green eyes", \n"Muffin": "A fluffy orange tabby with a sweet disposition",\n"Bella": "A sleek black cat with a shiny coat and playful personality",\n"Pussycat": "A mischievous calico cat with a love for adventure",\n"Salem": "A mysterious grey and white cat with piercing blue eyes" }', additional_kwargs={}, response_metadata={'model': 'llama3.2', 'created_at': '2025-10-07T11:40:14.850466Z', 'done': True, 'done_reason': 'stop', 'total_duration': 2252068375, 'load_duration': 1324406667, 'prompt_eval_count': 30, 'prompt_eval_duration': 116381791, 'eval_count': 85, 'eval_duration': 810758625, 'model_name': 'llama3.2'}, id='run-58148630-58a5-4af5-991d-497e33055730-0', usage_metadata={'input_tokens': 30, 'output_tokens': 85, 'total_tokens': 115})
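Because the model was asked for JSON, the `content` string can be turned into a Python dictionary with `json.loads`. In the sketch below the string is an abbreviated copy of the output above (two of the five cats) rather than a live model response, and `parse_json_content` is a hypothetical helper name.

```python
import json

# Abbreviated copy of the content field from the output above
content = (
    '{ "Mittens":"A grey and white cat with bright green eyes", '
    '"Salem": "A mysterious grey and white cat with piercing blue eyes" }'
)


def parse_json_content(text):
    """Return the parsed JSON, or None if the model did not produce valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


cats = parse_json_content(content)
print(list(cats))
```

The `None` fallback is there because JSON mode makes valid JSON much more likely, but it is still worth guarding the parse.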

In [15]:
import os

def create_dir(folder_path):
    """
    Creates a directory given a folder path.
    """
    if not os.path.exists(folder_path):
        os.mkdir(folder_path)
    
    return f"Folder path was created at: {folder_path}"


create_dir("lucas-test-local-agent")

'Folder path was created at: lucas-test-local-agent'

In [17]:
!ls -d */ | grep lucas-test-local-agent

lucas-test-local-agent/


In [19]:
def create_file(file_path, contents=""):
    """
    Creates a file with content.
    If no content is provided it will create an empty file.
    """
    
    with open(file_path, "w") as f:
        f.write(contents)
    
    return f"A file was created at: {file_path}"

create_file("./lucas-test-local-agent/file-text.txt", "This is a test")

'A file was created at: ./lucas-test-local-agent/file-text.txt'

In [21]:
def read_file(file_path):
    """
    Reads from file given its path.
    """
    
    with open(file_path, "r") as f:
        contents = f.read()
    
    return contents

read_file("./lucas-test-local-agent/file-text.txt")

'This is a test'

In [22]:
tools = [create_dir, create_file, read_file]

llm_tools = llm_structured_output.bind_tools(tools)

In [23]:
from langchain_core.prompts import ChatPromptTemplate

system_prompt = f"""
You are an agent that helps users with desktop tasks like reading and writing files and creating directories. You either call a tool
from the options available: create_dir, create_file or read_file, or you return a summary of the tasks performed and at the end a string: END.
"""
prompt_template = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    ("human", "{input}")
])
llm_agent = prompt_template | llm_tools

In [24]:
llm_agent.invoke("Create file named lucas-loves-pancakes.md")

AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'llama3.2', 'created_at': '2025-10-07T11:41:27.367862Z', 'done': True, 'done_reason': 'stop', 'total_duration': 611825542, 'load_duration': 61231958, 'prompt_eval_count': 337, 'prompt_eval_duration': 287781709, 'eval_count': 29, 'eval_duration': 261916125, 'model_name': 'llama3.2'}, id='run-6941bc0a-dac8-4390-be39-fb3c4a920daa-0', tool_calls=[{'name': 'create_file', 'args': {'contents': '', 'file_path': 'lucas-loves-pancakes.md'}, 'id': '7f9968ba-273e-4415-8476-addec6a65a37', 'type': 'tool_call'}], usage_metadata={'input_tokens': 337, 'output_tokens': 29, 'total_tokens': 366})

In [25]:
output_tool_call = llm_agent.invoke("Create a file named ./test-agent.txt")
output_tool_call.tool_calls

[{'name': 'create_file',
  'args': {'file_path': './test-agent.txt'},
  'id': '353b0fba-8e0c-4b2d-9ae6-aed56fead13e',
  'type': 'tool_call'}]

In [26]:
tool_mapping = {
    "create_file": create_file,
    "create_dir": create_dir,
    "read_file": read_file
}

In [27]:
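def llm_call(query, observations=[], actions_taken=[]):
    """
    Calls the LLM. It can return a tool call with arguments for calling different tools,
    or an output to the user.

    Note: the default lists are created once and shared between calls, so when they are
    not passed in explicitly, observations and actions accumulate across calls.
    """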
    template = """
    Imagine you are a simple assistant tasked with managing a file system. You have access to three tools:
    
    1. create_dir: Creates a new directory.
    2. create_file: Creates a new file and optionally writes content to it.
    3. read_file: Reads the contents of an existing file.

    Based on the user input and your observations, choose an action to execute. Your action must follow these guidelines:
    
    * Action Guidelines *
    1) Only one action is allowed per iteration.
    2) Be concise and specific about what to create, write, or read.
    3) Provide clear reasoning for your action.
    4) Always consider the current context before taking action.
    
    Respond in the following format:
    Thought: {{Your reasoning here}}
    Action: {{Action name with parameters}}
    
    Available Actions:
    - create_dir [directory_name]
    - create_file [file_name]; [content (optional)]
    - read_file [file_name]
    """

    # Prepare the input for the LLM
    prompt = f"""
    {template}

    User Query: {query}
    Observations: {observations}
    Actions Taken So Far: {actions_taken}
    """

    output = llm_agent.invoke(prompt)

    if output.tool_calls:
        for tool_call in output.tool_calls:
            function_name = tool_call["name"]
            function_args = tool_call.get("args", {})  # default to an empty dict, since it is unpacked with ** below

            # Check if function_name exists in tool_mapping
            if function_name not in tool_mapping:
                print(f"Error: Unknown function name '{function_name}'")
                continue

            try:
                # Attempt to call the function with the provided arguments
                tool_output = tool_mapping[function_name](**function_args)
                actions_taken.append(function_name)
                observations.append(tool_output)
            except TypeError as e:
                # Handle errors if the number of arguments does not match the expected number for the function
                print(f"Error: Invalid arguments for function '{function_name}': {e}")
    
    
    return output, actions_taken, observations

In [28]:
llm_call("Create a folder named: lucas-tests-tool-calling")

(AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'llama3.2', 'created_at': '2025-10-07T11:41:46.21828Z', 'done': True, 'done_reason': 'stop', 'total_duration': 519555250, 'load_duration': 62527833, 'prompt_eval_count': 562, 'prompt_eval_duration': 233172959, 'eval_count': 23, 'eval_duration': 222973500, 'model_name': 'llama3.2'}, id='run-15acb2d1-e774-4954-a5e8-4f219e1f405b-0', tool_calls=[{'name': 'create_dir', 'args': {'folder_path': 'lucas-tests-tool-calling'}, 'id': '67d21996-debd-4d4a-94ad-52be19289125', 'type': 'tool_call'}], usage_metadata={'input_tokens': 562, 'output_tokens': 23, 'total_tokens': 585}),
 ['create_dir'],
 ['Folder path was created at: lucas-tests-tool-calling'])

In [29]:
llm_call("Create a file at: ./lucas-tests-tool-calling/test1.txt with the contents: Hello Lucas! You just made a tool call!")

(AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'llama3.2', 'created_at': '2025-10-07T11:41:54.752509Z', 'done': True, 'done_reason': 'stop', 'total_duration': 588247667, 'load_duration': 53824667, 'prompt_eval_count': 595, 'prompt_eval_duration': 130760709, 'eval_count': 41, 'eval_duration': 402795166, 'model_name': 'llama3.2'}, id='run-c72da7e2-c27d-4879-add5-eb8b08f3917e-0', tool_calls=[{'name': 'create_file', 'args': {'contents': 'Hello Lucas! You just made a tool call!', 'file_path': './lucas-tests-tool-calling/test1.txt'}, 'id': '85e45cc1-c459-44b5-8b96-3f67e791bab5', 'type': 'tool_call'}], usage_metadata={'input_tokens': 595, 'output_tokens': 41, 'total_tokens': 636}),
 ['create_dir', 'create_file'],
 ['Folder path was created at: lucas-tests-tool-calling',
  'A file was created at: ./lucas-tests-tool-calling/test1.txt'])

In [30]:
llm_call("Read the file contents from  ./lucas-tests-tool-calling/test1.txt")

(AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'llama3.2', 'created_at': '2025-10-07T11:41:59.103084Z', 'done': True, 'done_reason': 'stop', 'total_duration': 459313708, 'load_duration': 62398000, 'prompt_eval_count': 605, 'prompt_eval_duration': 141910667, 'eval_count': 26, 'eval_duration': 254120000, 'model_name': 'llama3.2'}, id='run-2e8841bd-73d1-43b0-82c5-5c2810b2af6b-0', tool_calls=[{'name': 'read_file', 'args': {'file_path': './lucas-tests-tool-calling/test1.txt'}, 'id': '7aaf79a7-68ee-44dd-8b1a-d2b8398cd142', 'type': 'tool_call'}], usage_metadata={'input_tokens': 605, 'output_tokens': 26, 'total_tokens': 631}),
 ['create_dir', 'create_file', 'read_file'],
 ['Folder path was created at: lucas-tests-tool-calling',
  'A file was created at: ./lucas-tests-tool-calling/test1.txt',
  'Hello Lucas! You just made a tool call!'])
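Notice that the action and observation lists above keep growing across three separate calls. That comes from Python evaluating default argument values once, when the function is defined, so a default list is shared by every call. Here is a minimal sketch of the effect and of the usual `None` default that avoids it; `record_shared` and `record_fresh` are hypothetical names for this illustration.

```python
def record_shared(action, actions_taken=[]):
    """The default list is created once and reused by every call."""
    actions_taken.append(action)
    return actions_taken


def record_fresh(action, actions_taken=None):
    """A None default gives each call its own new list."""
    if actions_taken is None:
        actions_taken = []
    actions_taken.append(action)
    return actions_taken


first = record_shared("create_dir")
second = record_shared("create_file")
print(second)           # ['create_dir', 'create_file']
print(first is second)  # True: both names point at the same shared list

print(record_fresh("create_dir"))   # ['create_dir']
print(record_fresh("create_file"))  # ['create_file']
```

For a quick demo the shared history is handy, but inside a loop it is safer to pass the lists explicitly so each run starts from a clean state.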

In [31]:
def agent_loop(query):
    iter_count = 0
    obs = []
    acts_taken = []
    while True:
        output, obs, acts_taken = llm_call(query, obs, acts_taken)
        print(output)
        iter_count += 1
        if iter_count >= 3:
            print(f"Breaking after {iter_count} iterations")
            break
        if output.content != "":
            break
    
    return output.content


agent_loop("Create a folder in current directory named 'testing-multiple-calls' and inside that folder create a file named nested-file.txt")        

content='' additional_kwargs={} response_metadata={'model': 'llama3.2', 'created_at': '2025-10-07T11:42:03.880497Z', 'done': True, 'done_reason': 'stop', 'total_duration': 414921458, 'load_duration': 63947958, 'prompt_eval_count': 576, 'prompt_eval_duration': 126933833, 'eval_count': 22, 'eval_duration': 223179292, 'model_name': 'llama3.2'} id='run-6f3be690-dfd6-448c-82eb-992de8b92b37-0' tool_calls=[{'name': 'create_dir', 'args': {'folder_path': './testing-multiple-calls'}, 'id': '7d058ecf-a966-4a61-b3dd-87b640b40644', 'type': 'tool_call'}] usage_metadata={'input_tokens': 576, 'output_tokens': 22, 'total_tokens': 598}
content='' additional_kwargs={} response_metadata={'model': 'llama3.2', 'created_at': '2025-10-07T11:42:04.186755Z', 'done': True, 'done_reason': 'stop', 'total_duration': 304482208, 'load_duration': 42697041, 'prompt_eval_count': 592, 'prompt_eval_duration': 42391458, 'eval_count': 22, 'eval_duration': 218524959, 'model_name': 'llama3.2'} id='run-747065b3-7711-41e6-bce6-

''

# References

- [HuggingGPT](https://github.com/microsoft/JARVIS)
- [Generative Agents](https://arxiv.org/pdf/2304.03442.pdf)
- [WebGPT](https://www.semanticscholar.org/paper/WebGPT%3A-Browser-assisted-question-answering-with-Nakano-Hilton/2f3efe44083af91cef562c1a3451eee2f8601d22)
- [LangChain](https://python.langchain.com/docs/get_started/introduction)
- [OpenAI](https://openai.com/)
- [OpenAI Function Calling](https://platform.openai.com/docs/guides/function-calling)
- [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT)
- [GPT-Engineer](https://github.com/gpt-engineer-org/gpt-engineer)
- [BabyAGI](https://github.com/yoheinakajima/babyagi)
- [Karpathy on Agents](https://www.youtube.com/watch?v=fqVLjtvWgq8)
- [ReAct Paper](https://arxiv.org/abs/2210.03629)