-
Notifications
You must be signed in to change notification settings - Fork 43.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Command Base Class Interface #3824
Command Base Class Interface #3824
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 1 Ignored Deployment
|
there's over a handful of discussions and RFEs now discussing ideas to let Auto-GPT evolve itself by writing its own commands/plugins in one way or another. Right now, people are already doing the same thing anyway, it's just not as "direct", i.e. goes through separate steps of writing python code or executing shell commands. But we could just as well allow the system to write its own commands and then take it from there. This may sound like over-engineering to some folks, but that way, unnecessary looping (recurring issue here) could be prevented, too - because the system could simply extend itself with custom commands as needed. Possibly using a "namespace" based notation - i.e. namespace.command( [args] ) - because there are already many people tinkering with massive command lists, and others tinkering with the idea of introducing categories of commands. Thus, requiring commands to use a namespace prefix, would at least help organize the whole system a bit more, while also giving additional hints to the LLM when selecting commands. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments.
We also keep seeing the LLM not replacing arguments or hallucinating imaginary command names (#1894) - so, it would probably make sense to check if the command exists and if the arguments were provided/replaced or not.
Here's some related background regarding issues in the current command system:
(these are primarily about gathering sufficient contextual information in order to deal with running commands via ssh on different hosts, using different platforms/OS, as well as different tools / tool versions) |
description (str): A brief description of what the command does. | ||
signature (str): The signature of the function that theicommand executes. Defaults to None. | ||
""" | ||
def __init__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could become a good place for alternate names (aliases) to live for hallucinated command names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting that these should then also be added to the command register?
class Command(abc.ABC):
def __init__(
self,
name: str,
description: str,
method: Callable[..., Any],
signature: str = "",
enabled: bool = True,
disabled_reason: Optional[str] = None
alternative_names: Optional[List[str]] = None
):
class CommandRegistry(abc.ABC):
...
def register_command(self, cmd: Command) -> None:
self.commands[cmd.name] = cmd
for alt_names in cmd.alternative_names:
self.commands[alt_names] = cmd
I haven't personally seen this happen too much, I also wonder if we'll get too many similar names in the command register. Happy to add it in though!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it should be handled the otherway as a fallback lookup for commands the ai issues that don't exist
Added to convo #3856. Should serve as a good example of what to do to avoid issues caused by poor IoC. |
@Boostrix - thanks for the information on both comments, I really like the command validator. I'm trying to keep the PRs very small and atomic, so I'm happy to add this in but I'm not sure if that is the purpose of this scope of work. @ntindle - I'm not sure what the protocol here is so happy to go either way on this. |
not disagreenig, just wanted to keep you in the loop - since there's so much redundant work being done currently, some features having up to 5 PRs by 5 different people - so sorry for the noise, but that's definitely a problem |
@anonhostpi - thanks for the comment! I am uncertain how to action this though. Do let me know :D |
Just keep doing what you are doing. Decoupling the source code like you guys are with the re-arch should help streamline future PRs open source contributions |
We for sure want command validation, but as mentioned, it's outside the scope, and I'm not even sure recovering from picking a bad command is the job of the command registry (I'd say the fault is on the language model side). Really appreciate the collation of the issues though, that's super helpful. |
@DarylRodrigo If we don't have a chance to chat, I'll try to follow up more later. But what we're after here is not the port of the implementation, which will come in a second round of PRs, but a definition of the interfaces (see the other re-arch PRs) that will be able to capture the essential logic of the existing system, allows for easy interoperability with the other system, and removes our dependence on global state. This is actually a pretty tricky, as I think the current implementation in the repo is a bit challenging. We have the extra concern of trying to get the command system work well with plugins, which relies on us figuring out with @ntindle exactly what the plugin interface should look like. I'll try to connect with you both for discussion, and will document what I'm thinking in PRs otherwise. |
def __pre_call__(self, *args, **kwargs) -> Tuple[Any, Dict[str, Any]]: | ||
# Return the unmodified *args and **kwargs by default, as a tuple and a dictionary | ||
return args, kwargs | ||
|
||
def __post_call__(self, command_result) -> CommandResult: | ||
# Return the unmodified command_result by default | ||
return command_result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if I wrapped your call via the plugin system with a decorator? IDK
Right, I do realize that those related RFEs are out of scope for something as basic as an abstract interface class, just wanted to be sure that people working on the underlying infrastructure, are made aware of ongoing talks elsewhere. Maybe, some sort of meta/draft PR would be better to collate related talks there ? The point being, there's a really thought provoking analysis from @valayDave that I'd consider "recommmended reading" for anyone considering to tackle improving the command system: #2987 (comment) It would be a pity if someone were to revisit the command system without being aware such thoughtful analyses. |
6017090
into
Significant-Gravitas:agent-state-encapsulation
all commands would ideally come with their own pytest based test, so that people could easily check whether their commands are working for diagnostic purposes - also, even at runtime that would be useful if the system could annotate its prompt to indicate the number of successful/failed executions (duration etc ) of a command to use that this could be used by a fitness function that learns which commands are working well, and which ones aren't. background: All commands should be annotated to encode their "costs" (such as execution duration, but also other resource utilization such as CPU/RAM requirements) That way, we can run a local fitness function, as per: If this is done at the action/command level, and the system has multiple actions available, this may lead to interesting behavior such as favoring a certain option over another one (think redis over pinecone). Or it preferring wget/curl over selenium based browsing for some websites to use less resources. |
FWIW, I have locally experimented with this to make it support not only return actual results, but a tuple of (result, errorcode) - the point being, we would then be able to chain commands together that way, and even execute loops by registering loops/conditions as arguments. You mentioned pre/post hooks: if we could chain commands together, this could work analogous to an actual shell: do_some_check && list_files With this sort of scheme, it would not be far-fetched to allow new commands to be registered on top of existings ones (or scripts), we could even allow python scripts to be registered as commands - that way, there would be much less trial and error once a script works and is made available that way. |
Overview
Changes
To discuss
PR Quality Checklist
Notes: