agent that both proposes and executes tools #2223
Doesn't this work already? I frequently use register_function with the same agent assigned to both caller and executor; the latter is a little misleadingly named, because such an agent doesn't even need a code execution configuration to use said tool.
It works in an unorthodox way. In your case the same agent would be selected to speak twice, given that it is the only one with access to the tool in question. So assuming the default group chat flexibility in agent selection, it works. The moment you start dictating agent order, though, it will fail. I have implemented a workaround to this issue with Society Of Mind agents (a single agent composed of multiple agents under the hood). One SoM agent is called that, under the hood, calls a tool caller and then a tool executor, returning the result. I could make a PR with it, but I feel a more integrated solution would be preferable, as SoM is experimental.
@shippy it works in group chat, but it takes two messages in the group chat for tool proposal and tool execution. @WebsheetPlugin probably wants to encapsulate tool proposal and execution inside one agent's inner conversation and return only a single message with the result to the outer chat. @GeorgSatyros thanks for sharing your experience. The SoM agent is experimental, while nested chat is in the core library. I suggest using nested chat to implement a new agent with the same functionality: https://microsoft.github.io/autogen/docs/tutorial/conversation-patterns#nested-chats
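To make the encapsulation idea concrete, here is a minimal framework-free sketch (hypothetical names, not the AutoGen API) of what a nested-chat self-executing agent does: an inner proposer step and an inner executor step, with only one message surfacing to the outer chat.

```python
# Hypothetical sketch: `propose` stands in for the LLM-backed tool proposer,
# `tools` maps tool names to plain Python callables.
def self_executing_agent(user_message, tools, propose):
    # Inner turn 1: the proposer (normally an LLM) picks a tool and arguments.
    tool_call = propose(user_message)  # e.g. {"name": "add", "arguments": {...}}
    # Inner turn 2: the executor runs the proposed call.
    result = tools[tool_call["name"]](**tool_call["arguments"])
    # Only a single message is returned to the outer conversation.
    return {"role": "assistant", "content": str(result)}
```

In real AutoGen the two inner turns would be a nested chat between a caller and an executor agent, but the outer group chat would see the same thing: one message containing the result.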
Odd: pretty sure all my use cases have been in multi-agent chats with defined transitions between agents, where the tool-executing agent wasn't allowed to speak twice (I think - I'll double-check).
@shippy I could have definitely elaborated more above, so let me fix that! @sonichi Sure, I was looking for an excuse to dive deeper into nested chats anyway! Is there any planned support for SoM going forward, or will the "agent composed of agents" niche be fulfilled by nested chats? Asking because I was considering contributing to SoM and that effort may be better spent on the core component instead.
Would be great to have a sample for this 👍🏼. |
I guess we already have it |
@GeorgSatyros it'll be great if you could make a PR to reimplement the SoM agent using nested chat. It'll be easier to maintain. The current SoM agent can retire after feature parity. |
@sonichi I agree, that would be a more graceful solution than deprecation. Will be opening a PR with a solution as soon as my schedule allows! |
Yes, I use speaker selection to decide the order of Agents. But again, for me, all this seemed unintuitive. And it's still not clear to me why another agent should execute the tool. Or why it should not be executed if selected. Is there a case where you want to select a tool by Agent A and execute it by Agent B or C or not execute it at all? |
For example, it makes it possible for Agent B to perform extra conversations with other agents or humans before executing. |
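A hypothetical sketch of that point (illustrative names only, not the AutoGen API): separating proposer from executor lets the executor consult a human or another agent and veto a proposed call before anything runs.

```python
# `approve` stands in for the extra conversation (human or agent review)
# that can happen between proposal and execution.
def propose_call(name, arguments):
    return {"name": name, "arguments": arguments}

def execute_with_approval(call, tools, approve):
    if not approve(call):
        return {"role": "tool", "content": "execution declined"}
    result = tools[call["name"]](**call["arguments"])
    return {"role": "tool", "content": str(result)}
```

With a single self-executing agent this review window disappears, which is fine for cheap read-only tools but risky for destructive ones.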
Ahhh, now I got it. I am actually counting each token twice :) so this would not be my use case, but finally I understand it. Makes sense. Just an observation: I feel that AutoGen is geared toward use cases where many tokens are being used, and it's kind of leaving out simpler use cases like the one mentioned above.
I think the self-executing agent based on nested chat is what you need. What do you think? |
Can you point us to docs or a sample for this @sonichi ? Thx. |
OK, the same one I linked to above 🤓 - thx!
Maybe we can make a special agent class that does self-execution using nested chat out of the box. |
By default, tool calls are sent onward to be processed. The alternative (non-nested) is to catch the tool calls before send, run them, and add the results back to the original message. Nested chat is cleaner, but not the minimal case. The only negative is if you want the agent to loop: invoke tool, react to tool results, invoke tool, react to tool results... But "invoke tool, add the tool results to the message that invoked them (clearing the tool calls as processed so the next agent won't redo them), and let someone else have a turn": that could be the true minimal "self-tool-executing" agent.
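The catch-before-send loop described above can be sketched framework-free (hypothetical helper, not an AutoGen hook): intercept an outgoing message carrying tool calls, run them, fold the results into the content, and clear the tool calls so the next agent won't redo them.

```python
# Arguments are kept as a plain dict here for simplicity; the OpenAI wire
# format carries them as a JSON string.
def self_execute_before_send(message, tools):
    msg = dict(message)  # shallow copy; don't mutate the caller's message
    for call in msg.pop("tool_calls", []) or []:
        fn = call["function"]
        result = tools[fn["name"]](**fn.get("arguments", {}))
        # Fold the result into the visible content so downstream agents see it.
        msg["content"] = (msg.get("content") or "") + f"\n[{fn['name']} -> {result}]"
    return msg
```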
@scruffynerf that is exactly what I'm trying to do at the moment. Quite easy if you are only interested in the tool output being within the "context" of the response. But adding and executing proper "tool_calls" in the message is much trickier, as OpenAI effectively requires two messages in the chat history per tool execution. As such you would need an agent that injects multiple messages into the conversation per call, so they cannot be included in a single "send". That is probably why a nested chat may be the more graceful solution in that case, where two agents do the execution under the hood instead.
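For reference, this is the two-entry shape OpenAI's chat format expects per executed tool call: an assistant message carrying `tool_calls` plus a matching `tool` message with the result. The helper name below is illustrative, but the message fields are the actual OpenAI format, and the `id`/`tool_call_id` pairing is why one execution can't fit in a single message.

```python
def tool_exchange(call_id, name, arguments_json, result):
    assistant_msg = {
        "role": "assistant",
        "content": None,  # may also carry text alongside the tool calls
        "tool_calls": [{
            "id": call_id,
            "type": "function",
            "function": {"name": name, "arguments": arguments_json},
        }],
    }
    # The result message must echo the id of the call it answers.
    tool_msg = {"role": "tool", "tool_call_id": call_id, "content": result}
    return [assistant_msg, tool_msg]
```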
The method I'm using for 'toolsfortoolless' (adding tool-response processing in the pre/post API for models/services that don't support tool_calls) can be used for this, if it fits your use case. Unsure. #2966 (comment) In it, I hook 'process_message_before_send' to add the tool_calls. You could do the same to process the tool calls, that is, to self-execute them, and remove the tool_call so nobody else sees it. While the OpenAI API has trained in a 'tool call' OR 'content' behavior, they admit it's not actually a hardcoded OR, just trained in. Officially the spec does allow for both at once; I linked to a discussion saying so elsewhere. So you could even prompt it to do something like this:
Or maybe you want two known tool_calls merged: then you get two tool_calls, and process the next-free-date function, which returns
And THAT text would be what everyone, including the calling LLM, would see, and it would look like that LLM did as asked. Seamlessly self-executed. That would otherwise have been at least 2-3 LLM calls. As I said, NOT ideal, but it might be a huge saving if you could cut LLM calls in half, or more, for very fixed processes you can predict and merge together without the LLM managing them. Not ideal because the call/response method of even a nested chat with an LLM-less tool-executor proxy lets the LLM take the results and pretty them up, but yes, that's extra calls back and forth. If you don't need that, you can short-circuit it.
I may even add an option to add 'self-executing' for toolsfortoolless, just to see how it works. |
Actually, mulling it over, making this a separate capability makes more sense. It's far easier to add 3-4 capabilities than have 1 with stuff you don't want, even if they all have flags to disable them |
I've been trying the state transitions feature and wanted to keep the graph as simple as possible, keeping possible transitions to 1 for much of the graph. I came up with this solution:

```python
# where applicable, make the same agent able to both invoke and execute the same function
agent.register_for_execution()(function)
agent.register_for_llm()(function)

def state_transition(last_speaker: Agent, groupchat: GroupChat):
    messages = groupchat.messages
    if "tool_calls" in messages[-1]:
        called = messages[-1]["tool_calls"][0]["function"]["name"]
        if called in last_speaker.function_map:
            return last_speaker
    return "auto"

groupchat = GroupChat(
    ...
    allowed_or_disallowed_speaker_transitions=allowed_transitions,
    speaker_transitions_type="allowed",
    speaker_selection_method=state_transition,
)
```

I'm new to autogen, but would be hesitant to try nested chats given the big jump in complexity managing state for web apps. This solution works well for me, keeping complexity low on both function calling and managing messages.
I like the concept of AutoGen and would like to use a couple of its features, but I currently just need a simple tool executor.
I notice that AutoGen leans heavily on the concept of generating code and executing it. Does it not have a simple way of executing a tool directly after it has been "selected", basically the same as OpenAI function calling? Does it use function calling at all?
I ask because, for my use case, it's sometimes enough to just select the correct tool. No need for extra checks, etc.
Originally posted by @WebsheetPlugin in #2208
Suggestion: Create a reply function `propose_and_execute_tools_nested_reply` which uses a nested chat between an AssistantAgent and a UserProxyAgent with human_input_mode="NEVER".
cc @qingyun-wu