Nascent AGI architectures like BabyAGI and AutoGPT have captured a great deal of public interest by demonstrating LLMs' agentic capabilities and capacity for introspective step-by-step reasoning. As proofs of concept they make great strides, but they leave a few things wanting.
The primary contributions I would like to make are twofold:
- Allowing an LLM to read from a corpus of information and act according to that information.
- Enabling more robust reproducibility and modularity.
Conceptually, the user provides the AgenticGPT agent with an objective and a list of `Action`s that it can take; the agent figures out the rest, asking the user for clarification when it needs help.
You can contribute or use agents built on top of AgenticGPT in the registry. Right now the following AgenticGPT-based agents exist:
- `PlaywrightAgent`: Uses AgenticGPT to automate browser actions.
- Install the dependencies:

```
pip install -r requirements.txt
```

- Set up your OpenAI API key as the `OPENAI_API_KEY` environment variable.
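For example, on macOS/Linux the key can be exported in your shell (the key value below is a placeholder, not a real key):

```shell
# Placeholder value; substitute your actual OpenAI API key.
export OPENAI_API_KEY="your-api-key-here"
```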
AgenticGPT can be instantiated with the following signature:

```python
AgenticGPT(
    objective,
    actions_available=[],
    memory_dict={},
    model="gpt-3.5-turbo",
    embedding_model="text-embedding-ada-002",
    ask_user_fn=ask_user_to_clarify,
    max_steps=100,
    verbose=False,
)
```
All you have to do is give it a string `objective`, define a list of `Action`s, and optionally give it a `memory_dict` mapping `name` to `text` for it to remember. The agent is equipped with a few `Action`s by default, such as the ability to ask you for clarification if necessary, and a memory it can query to help achieve its objectives.
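As a sketch (the keys and text below are purely hypothetical), a `memory_dict` is just a plain mapping from a short name to a block of text:

```python
# A memory_dict maps a short name to a block of text that the agent can
# query while working toward its objective. Contents are illustrative only.
memory_dict = {
    "support_faq": "Support hours are 9am to 5pm, Monday through Friday.",
    "style_guide": "Respond in a friendly, concise tone.",
}

# It is then passed to the constructor, e.g.:
# agent = AgenticGPT("Answer support questions.", memory_dict=memory_dict)
```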
See some examples. TODO: If you want to run the examples, you have to move them into the root folder and then run `python <example_file>.py`.
Example:

```python
"""Example of using AgenticGPT to make a folder on the filesystem.
Also demonstrates how to ask the user for input and use it in the agent's
thoughts."""
import os

from agentic_gpt.agent import AgenticGPT
from agentic_gpt.agent.action import Action


def mkdir(folder):
    os.mkdir(folder)


actions = [
    Action(
        name="mkdir", description="Make a folder on the filesystem.", function=mkdir
    )
]

agent = AgenticGPT(
    "Ask the user what folder they want to make and make it for them.",
    actions_available=actions,
)
agent.run()
```
`Action`s are instantiated with a `name`, `description`, and `function`. The name, description, and function signature are then injected into the agent prompt to tell the agent what it can do. `Action` results are stored in context as variables, unless a dict answer is given with `{"context": ...}`, which sets the context accordingly.
You can then save the steps that the LLM generated using `agent.save_actions_taken("mkdir.json")` and reuse them using:
```python
agent = AgenticGPT(
    "Ask the user what folder they want to make and make it for them.",
    actions_available=actions,
)
agent.from_saved_actions("mkdir.json")
agent.replay()
```
See the request for comment for the original motivation and design considerations behind building this.
TODO: Add a diagram and explanation.
I took some notes while building this project to document some of my thoughts on working with and taming LLMs. If you're curious about the development journey (and maybe some prompt engineering secret sauce), check out the linked journal.
- Memory instantiation and routing of queries.
- Add "query memory" and "add to memory" default functions.
- Save and load routine to file.
- Write some initial docs. Be sure to add emojis because people can't get enough emojis.
- Create and document examples. Start setting up a library of actions.
- Support sentencetransformers and gpt-4.
- Don't make it incumbent on the user to make Actions return a context.
- Figure out a more modular way to solicit the user for feedback, maybe a default `ask_user_to_clarify` hook.
- Retry when there is an error.
- Create logic to condense context window if you get an error from the API.
- Create chatbot mode where it stops after every step and asks you how it's doing.
- Make some diagrams describing the architecture.
- Put on pypi.
- Test Memory functions: adding a document, querying all, loading document.
- Be careful what tools you expose to the agent, because it is running autonomously.
- Be careful what data you expose to the agent, because it may be processed by the LLM APIs under the hood.