# Why Agents?

What is an agent?

An agent combines a model that can perform some kind of reasoning, such as a large language model (e.g., ChatGPT, Llama 2, Mistral), with tools that give it access to the real world,
so that it can do things like browse the internet, buy stuff, and so on.

Ok, so, why is there so much hype around agents right now?

Because agents are cool! With the recent advances in LLMs, we've seen them become an amazing tool for all sorts of things, like building apps, browsing the internet, and more.



Some neat examples of these kinds of agents can be found here:

- [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT)
- [GPT-Engineer](https://github.com/gpt-engineer-org/gpt-engineer)
- [BabyAGI](https://github.com/yoheinakajima/babyagi)

Now, although they seem extremely powerful, agents today are still at a very early stage in terms of readiness to be deployed as products, something you can attest to by listening to Andrej Karpathy talk about agents here:

- [Karpathy on Agents](https://www.youtube.com/watch?v=fqVLjtvWgq8)

That is what this live-training is all about: getting you excited about this amazing new technology and understanding it from the ground up, with a focus on practical applications and fun stuff you can do with agents!

# What is an Agent?

An agent is nothing more than some entity that can _think_ and _act_. That's right: in a way, you're an agent!

After all, you can think and act on those thoughts, as in the case of coming to this live-training:

- Thought: "I want to learn about agents"

- Action: "Go to the internet and research cool platforms where I can learn about agents"

- Thought: "O'Reilly has some awesome courses and live-trainings"

- Action: "Look up O'Reilly courses"

- Thought: "Live-trainings by instructor Lucas are awesome"

- Action: "Schedule live-training about agents with instructor Lucas Soares" (lol)

In a way, this is a simplified rendition of what brought you here, though obviously not necessarily in this order nor with these particular thought-action pairs. This way of structuring thoughts and actions is well represented in the [ReAct](https://arxiv.org/pdf/2210.03629.pdf) paper.
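
To make the thought/action pattern concrete, here is a minimal sketch of such a loop in plain Python. Nothing here comes from the ReAct paper's code: `Step`, `scripted_policy` and `run_loop` are hypothetical names, and the "reasoning engine" is just a hard-coded lookup table standing in for a model.

```python
from dataclasses import dataclass


@dataclass
class Step:
    thought: str
    action: str


def scripted_policy(observation):
    """Hypothetical stand-in for the reasoning engine: maps an observation to a Step."""
    script = {
        "start": Step("I want to learn about agents", "search: agent courses"),
        "search: agent courses": Step("O'Reilly has live-trainings", "open: O'Reilly courses"),
        "open: O'Reilly courses": Step("This live-training looks good", "finish"),
    }
    return script[observation]


def run_loop(policy, max_steps=5):
    """Alternate between thinking and acting until the policy says 'finish'."""
    trace = []
    observation = "start"
    for _ in range(max_steps):
        step = policy(observation)
        trace.append(step)
        if step.action == "finish":
            break
        # In a real agent the observation would be the result of running the action;
        # here the action string itself is fed back in to keep the sketch tiny.
        observation = step.action
    return trace


trace = run_loop(scripted_policy)
for step in trace:
    print(f"Thought: {step.thought}")
    print(f"Action: {step.action}")
```

The important part is the shape of the loop: think, act, observe, repeat. Swapping the lookup table for an LLM call is what turns this toy into an agent.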

With regard to LLMs, how can we bring this idea to fruition, treating the LLM as the reasoning and thinking engine?

We can start simple and just call the OpenAI API:

In [1]:
# !pip install python-dotenv
# !pip install "langchain[all]"
# !pip install openai
# !pip install tiktoken
# !pip install pypdf

In [2]:
import openai
import os
from dotenv import load_dotenv

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

In [1]:
from openai import OpenAI
from IPython.display import Markdown
client = OpenAI()

def get_response(prompt_question):
    """Send a single user prompt to the chat model and return the text of its reply."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-16k",
        messages=[{"role": "system", "content": "You are a helpful research and programming assistant"},
                  {"role": "user", "content": prompt_question}]
    )

    return response.choices[0].message.content

output = get_response("Create a simple task list of 3 desktop things I can do on the terminal.")
Markdown(output)

1. View the file contents: You can use the command "cat" followed by the file name to display the contents of a file on the terminal. For example, if you have a file called "notes.txt", you can type "cat notes.txt" to see its contents.

2. Copy files: You can use the command "cp" followed by the source file path and the destination file path to copy files. For example, if you want to copy a file called "document.txt" from the current directory to the "/Desktop" directory, you can type "cp document.txt /Desktop".

3. List files and directories: You can use the command "ls" to list the files and directories in your current directory. By default, it will display a simple list of names, but you can add various options (such as "-l" for a detailed view or "-a" to show hidden files) to customize the output. For example, you can type "ls -l" to see a detailed list of files and directories in the current directory.

Ok, cool. Inspired by this, let's pick three simple actions to perform:

- Creating directories
- Creating files
- Listing files

Let's transform them into functions that we could call just like in any type of Python-based application.

In [2]:
import subprocess

def create_directory():
    subprocess.run(["mkdir", "test"])

def create_file():
    subprocess.run(["touch", "test.txt"])

def list_files():
    subprocess.run(["ls"])

Notice that we are not giving these functions any arguments, to avoid complicating things. The only thing we should focus on is having the ability to perform actions outside the narrow scope of the Python script, out there in the real world.

Let's test these functions:

In [5]:
!ls

1.0-intro-agents.ipynb
1.1-intro-agents-openai-functions.ipynb
1.2-intro-langchain.ipynb
2.0-building-llm-agents-with-langchain.ipynb
2.1-building-agents-with-langchain-and-LCEL-interface.ipynb
3.0-langchain-github-agent-prototype.ipynb
4.0-building-a-simple-research-agent.ipynb
5.0-langchain-deploy-chat-with-website.ipynb
5.1-langchain-deploy-agent.ipynb
agent-deploy
assets-resources
chat-with-pdf
function_calls_chatgpt.py
langchain_agents
requirements.txt
research-assistant.py
sample_app_to_test_github_agent.py


Ok, we start with this folder containing a few notebooks, scripts and sub-directories.

Let's test each function and inspect its output behavior:

In [6]:
create_directory()

!ls

1.0-intro-agents.ipynb
1.1-intro-agents-openai-functions.ipynb
1.2-intro-langchain.ipynb
2.0-building-llm-agents-with-langchain.ipynb
2.1-building-agents-with-langchain-and-LCEL-interface.ipynb
3.0-langchain-github-agent-prototype.ipynb
4.0-building-a-simple-research-agent.ipynb
5.0-langchain-deploy-chat-with-website.ipynb
5.1-langchain-deploy-agent.ipynb
agent-deploy
assets-resources
chat-with-pdf
function_calls_chatgpt.py
langchain_agents
requirements.txt
research-assistant.py
sample_app_to_test_github_agent.py
test


In [7]:
create_file()

!ls

1.0-intro-agents.ipynb
1.1-intro-agents-openai-functions.ipynb
1.2-intro-langchain.ipynb
2.0-building-llm-agents-with-langchain.ipynb
2.1-building-agents-with-langchain-and-LCEL-interface.ipynb
3.0-langchain-github-agent-prototype.ipynb
4.0-building-a-simple-research-agent.ipynb
5.0-langchain-deploy-chat-with-website.ipynb
5.1-langchain-deploy-agent.ipynb
agent-deploy
assets-resources
chat-with-pdf
function_calls_chatgpt.py
langchain_agents
requirements.txt
research-assistant.py
sample_app_to_test_github_agent.py
test
test.txt


In [8]:
list_files()

1.0-intro-agents.ipynb
1.1-intro-agents-openai-functions.ipynb
1.2-intro-langchain.ipynb
2.0-building-llm-agents-with-langchain.ipynb
2.1-building-agents-with-langchain-and-LCEL-interface.ipynb
3.0-langchain-github-agent-prototype.ipynb
4.0-building-a-simple-research-agent.ipynb
5.0-langchain-deploy-chat-with-website.ipynb
5.1-langchain-deploy-agent.ipynb
agent-deploy
assets-resources
chat-with-pdf
function_calls_chatgpt.py
langchain_agents
requirements.txt
research-assistant.py
sample_app_to_test_github_agent.py
test
test.txt


Ok, great, all the functions are working!

Now, let's imagine that we wanted to create an agent that would perform these actions for us based on some input that we give it. How can we connect models that we know and can use today, like ChatGPT, with these tools that do stuff in the real world?

To answer this question, how about we give the model a task, ask it to list the steps it needs to perform to complete that task, and then, for each of those steps, ask it to decide whether or not a function should be called to execute it?

The now-famous [Toolformer](https://arxiv.org/pdf/2302.04761.pdf) paper explored a closely related idea!

It showed that a language model can teach itself to decide which external tools to call, when to call them, and with what arguments!

Isn't that awesome???
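
The plan-then-decide idea described above can be sketched without any API call. This is only an illustration under stated assumptions: `fake_llm` is a hypothetical stand-in that returns canned answers, and `plan` and `choose_tool` are made-up helper names, not part of any library or of the Toolformer method.

```python
def fake_llm(prompt):
    """Hypothetical stand-in for a chat model; returns canned answers."""
    if prompt.startswith("PLAN:"):
        return "1. create a folder named demo\n2. say hello to the user"
    if "create a folder" in prompt:
        return "create_directory"
    return "none"


def plan(task, llm):
    """Ask the model for a numbered list of steps and strip the numbering."""
    raw = llm(f"PLAN: list the steps to complete this task: {task}")
    return [line.split(". ", 1)[1] for line in raw.splitlines() if ". " in line]


def choose_tool(step, llm, tool_names):
    """Ask the model which tool (if any) a step needs; ignore unknown answers."""
    answer = llm(f"DECIDE: for the step '{step}', answer with one of {tool_names} or 'none'")
    answer = answer.strip()
    return answer if answer in tool_names else None


TOOL_NAMES = ["create_directory", "create_file", "list_files"]

decisions = [
    (step, choose_tool(step, fake_llm, TOOL_NAMES))
    for step in plan("create a folder named demo and greet the user", fake_llm)
]
for step, tool in decisions:
    print(f"{step!r} -> {tool}")
```

With a real model in place of `fake_llm`, the two prompts would be the only things to tune; the control flow stays the same.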

So, let's see if we can hack our way into connecting the LLM response with the functions that we want that LLM to use.

In [3]:
def get_response(prompt_question):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=[{"role": "system", "content": "You are a helpful research and programming assistant"},
                {"role": "user", "content": prompt_question}]
    )
    
    return response.choices[0].message.content

def create_directory(directory_name):
    subprocess.run(["mkdir", directory_name])

def create_file(file_name):
    subprocess.run(["touch", file_name])

def list_files():
    subprocess.run(["ls"])

OK, cool! Notice that here we added a single parameter to each of the functions `create_directory()` and `create_file()`. We did this so
that we can actually do real things instead of just creating the same folders over and over.
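
As a side note, the same parameterized actions can be written without shelling out. The sketch below is an alternative, not what the notebook uses: it relies on the standard `pathlib` module, and the names `make_directory`, `make_file` and `list_directory` are hypothetical, chosen so they do not shadow the functions above. It runs inside a temporary directory so nothing is left behind.

```python
import tempfile
from pathlib import Path


def make_directory(directory_name):
    """Create a directory (and any parents); do nothing if it already exists."""
    path = Path(directory_name)
    path.mkdir(parents=True, exist_ok=True)
    return path


def make_file(file_name):
    """Create an empty file, like `touch`; the parent directory must exist."""
    path = Path(file_name)
    path.touch(exist_ok=True)
    return path


def list_directory(directory="."):
    """Return the sorted names of the entries in a directory instead of printing them."""
    return sorted(entry.name for entry in Path(directory).iterdir())


with tempfile.TemporaryDirectory() as workspace:
    folder = make_directory(Path(workspace) / "test")
    make_directory(folder)  # calling it again is a no-op rather than an error
    make_file(folder / "test.txt")
    listing = list_directory(folder)

print(listing)  # ['test.txt']
```

Returning values instead of printing makes it easier to hand the result of an action back to a model later on.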

Now, how can we actually put it all together so that given a task, a model can:

- Plan the task
- Execute actions to complete the task
- Know when to call a function

????

This is actually an interesting problem. Let's understand why by trying to hack our way into putting all of these pieces together:

In [4]:
def get_response(prompt_question, model="gpt-3.5-turbo-16k"):
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "system", "content": "You are a helpful research and programming assistant"},
                {"role": "user", "content": prompt_question}]
    )
    
    return response.choices[0].message.content

def create_directory(directory_name):
    subprocess.run(["mkdir", directory_name])

def create_file(file_name):
    subprocess.run(["touch", file_name])

def list_files():
    subprocess.run(["ls"])

    

task_description = "Create a folder called 'lucas-the-agent-master'. Inside that folder, create a file called 'the-10-master-rules.md'."
output = get_response(f"""Given this task: {task_description}, \n
                            Consider you have access to the following functions:
                            
    def create_directory(directory_name):
        '''Function that creates a directory given a directory name.'''
        subprocess.run(["mkdir", directory_name])
    
    def create_file(file_name):
        '''Function that creates a file given a file name.'''
        subprocess.run(["touch", file_name])
    
    def list_files():
        '''Function that lists all files in the current directory.'''
        subprocess.run(["ls"])
    
    Your output should be the first function call to be executed to complete the task, including the necessary arguments.
    The OUTPUT SHOULD ONLY BE THE PYTHON FUNCTION CALL and NOTHING ELSE.
    """)

Markdown(output)

create_directory('lucas-the-agent-master')

Hey, look at that: the output is exactly the function call we need! Now all we need is a way to execute it. We can use Python's built-in `exec` function for that:

In [6]:
exec(output)

In [7]:
!ls -d */ | grep lucas

lucas-the-agent-master/


Yessss! We did it! All we had to do was pass the function call we got from the model's response to Python's built-in `exec` function!
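
One caveat worth keeping in mind: `exec` runs whatever string it receives, so a surprising model response gets executed too. A possible alternative, sketched below under assumptions, is to parse the response with the standard `ast` module and only dispatch to functions we explicitly allow. `parse_call`, `ALLOWED` and `record_directory` are hypothetical names for this illustration; the stand-in tool records its argument instead of touching the disk.

```python
import ast


def parse_call(source, allowed):
    """Parse a string like "f('x')" into (name, args) without executing it."""
    tree = ast.parse(source.strip(), mode="eval")
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("expected a single plain function call")
    if call.func.id not in allowed:
        raise ValueError(f"function not allowed: {call.func.id}")
    if call.keywords:
        raise ValueError("keyword arguments are not supported in this sketch")
    # literal_eval only accepts literals (strings, numbers, ...), never arbitrary code
    args = [ast.literal_eval(arg) for arg in call.args]
    return call.func.id, args


created = []


def record_directory(directory_name):
    """Stand-in for create_directory that records the request instead of running mkdir."""
    created.append(directory_name)


ALLOWED = {"create_directory": record_directory}

name, args = parse_call("create_directory('lucas-the-agent-master')", ALLOWED)
ALLOWED[name](*args)
print(created)  # ['lucas-the-agent-master']
```

Anything that is not a plain call to an allowed function, such as `__import__('os').system('ls')`, is rejected with a `ValueError` before it can run.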

This is great, but what if we wanted to perform multiple actions?

How about changing our prompt so that the output is a Python list of function calls, which we can then execute programmatically?

Let's try that:

In [13]:
task_description = "Create a folder called 'lucas-the-agent-master'. Inside that folder create a file called 'the-10-master-rules.md'."
output = get_response(f"""Given a task that will be fed as input, consider that you have access to the following functions:
                            
    def create_directory(directory_name):
        '''Function that creates a directory given a directory name.'''
        subprocess.run(["mkdir", directory_name])
    
    def create_file(file_name):
        '''Function that creates a file given a file name.'''
        subprocess.run(["touch", file_name])
    
    def list_files():
        '''Function that lists all files in the current directory.'''
        subprocess.run(["ls"])
    
    Your output should be a list of function calls to be executed to complete the task, including the necessary arguments.
    For example:
    
    task: 'create a folder named test-dir'
    output_list: [create_directory('test-dir')]
    
    task: 'create a file named file.txt'
    output_list: [create_file('file.txt')]
    
    task: 'Create a folder named lucas-dir and inside that folder create a file named lucas-file.txt'
    output_list: [create_directory('lucas-dir'), create_file('lucas-dir/lucas-file.txt')]
    
    The OUTPUT SHOULD ONLY BE A PYTHON LIST WITH THE FUNCTION CALLS INSIDE and NOTHING ELSE.
    task: {task_description}
    output_list:\n
    """)

Markdown(output)

[
  create_directory('lucas-the-agent-master'),
  create_file('lucas-the-agent-master/the-10-master-rules.md')
]

In [14]:
exec(output)

mkdir: lucas-the-agent-master: File exists


In [15]:
!ls lucas-the-agent-master/

the-10-master-rules.md


The `mkdir: ... File exists` message shows up because the folder had already been created by the earlier `exec` call; the file inside it is the only new thing.

At this point we can start identifying a lot of issues with this approach, despite our early success:

- Uncertainty in the model's outputs can affect our ability to reliably call the functions
- We need more structured ways to prepare the inputs of the function calls
- We need better ways to put everything together (just feeding entire function definitions into the prompt like this makes for a very clunky and non-scalable framework for more complex cases)
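
To give a feel for what "more structured" could look like, here is a minimal sketch, assuming we ask the model to answer with JSON of the form `{"name": ..., "arguments": {...}}` and validate it before calling anything. That JSON shape, the helper `validate_call` and the `TOOLS` registry are assumptions made for this illustration, not the format of any particular framework. The stand-in `create_file` records its argument instead of touching the disk.

```python
import inspect
import json


def validate_call(raw, tools):
    """Check a JSON tool request against the registry before anything is executed."""
    try:
        request = json.loads(raw)
    except json.JSONDecodeError as err:
        raise ValueError(f"not valid JSON: {err}") from err
    if not isinstance(request, dict):
        raise ValueError("expected a JSON object")
    name = request.get("name")
    arguments = request.get("arguments", {})
    if name not in tools:
        raise ValueError(f"unknown tool: {name!r}")
    if not isinstance(arguments, dict):
        raise ValueError("'arguments' must be a JSON object")
    try:
        # bind() raises TypeError if arguments are missing or unexpected
        inspect.signature(tools[name]).bind(**arguments)
    except TypeError as err:
        raise ValueError(f"bad arguments for {name}: {err}") from err
    return tools[name], arguments


written = []


def create_file(file_name):
    """Stand-in tool that records the request instead of touching the disk."""
    written.append(file_name)


TOOLS = {"create_file": create_file}

tool, arguments = validate_call(
    '{"name": "create_file", "arguments": {"file_name": "notes.md"}}', TOOLS
)
tool(**arguments)
print(written)  # ['notes.md']
```

A malformed or unexpected response now fails loudly with a `ValueError` instead of being executed, which addresses the first two issues in the list above at least in spirit.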

There are many more issues, but starting with these, we can now look at frameworks, see how they fix them, and with that in mind understand what is behind their implementations!

I personally think this is a much better way to understand what is going on behind agents in practice than just using the higher-level frameworks right off the bat!

# References

- [HuggingGPT](https://github.com/microsoft/JARVIS)
- [Generative Agents](https://arxiv.org/pdf/2304.03442.pdf)
- [WebGPT](https://www.semanticscholar.org/paper/WebGPT%3A-Browser-assisted-question-answering-with-Nakano-Hilton/2f3efe44083af91cef562c1a3451eee2f8601d22)
- [LangChain](https://python.langchain.com/docs/get_started/introduction)
- [OpenAI](https://openai.com/)
- [OpenAI Function Calling](https://platform.openai.com/docs/guides/function-calling)
- [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT)
- [GPT-Engineer](https://github.com/gpt-engineer-org/gpt-engineer)
- [BabyAGI](https://github.com/yoheinakajima/babyagi)
- [Karpathy on Agents](https://www.youtube.com/watch?v=fqVLjtvWgq8)
- [ReAct Paper](https://arxiv.org/abs/2210.03629)