
How to send and receive messages interactively #142

Closed
wwj718 opened this issue Sep 11, 2023 · 14 comments
@wwj718
Contributor

wwj718 commented Sep 11, 2023

In previous versions, an agent was instantiated and then added to the space, so the agent could be referenced from outside the space. We could "hold" an agent in an environment like Jupyter and interact with other agents through it. In the latest version, the space is responsible for instantiating the agent, and we can't seem to get a reference to it. How can we interactively communicate with other agents in the space?

Another benefit of referencing the agent is that it can be programmed dynamically, which seems to be helpful for early exploration.

Isolating the agent in a space is very consistent with actor-style programming: actors can only communicate with each other through messages and cannot call each other's functions directly.

In Erlang/Elixir, I remember that users can still send messages to specific actors from the REPL environment.

@wwj718
Contributor Author

wwj718 commented Sep 11, 2023

Could Space provide a method like get_agent_by_id? I'm not sure whether externally referencing an agent and dynamically calling its methods is compatible with the current process model.

@operand
Owner

operand commented Sep 11, 2023

This was a fundamental change and I'm glad you're bringing it up for discussion. I probably should've explained more of the reasoning in the release notes.

To put it briefly, instantiating the agent in the main process prevents multiprocessing or other concurrent processing approaches that do not share a memory space with the main process. References to local objects are not possible across processes.

So the API needed to change to where an Agent is instantiated within its own process and communicates entirely via incoming and outgoing message queues. If you look at the Space implementations, you'll see that they now provide a queue type that its agents use for communication.

The good thing is that by abstracting communication to rely solely on queues, we allow more possibilities for distributed processing.
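To illustrate the pattern, here is a simplified sketch (not the actual Space code; the names here are made up): each agent runs in its own process and talks to the space only through a pair of queues.

from multiprocessing import Process, Queue

def _agent_loop(name, inbound, outbound):
    # The agent runs in its own process and shares no memory with the space.
    # It only reads from its inbound queue and writes to its outbound queue.
    while True:
        message = inbound.get()      # blocks until the space routes a message here
        if message is None:          # sentinel used in this sketch to stop the loop
            break
        outbound.put({"from": name, "received": message})

if __name__ == "__main__":
    inbound, outbound = Queue(), Queue()
    process = Process(target=_agent_loop, args=("my_agent", inbound, outbound))
    process.start()

    inbound.put({"to": "my_agent", "action": {"name": "say", "args": {"content": "hi"}}})
    print(outbound.get())            # the agent answers over its outbound queue

    inbound.put(None)
    process.join()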

For example, one could create a new space type to distribute agents using a background job processing framework like Celery. In a situation like that it's even more clear that memory cannot be shared directly between the agent instances and the main application that launches them. This will also be the case when Javascript based agents are introduced.

So I do think that this is a better abstraction moving forward, and as you mentioned it's consistent with other Actor model implementations. That said, I agree that it makes certain things harder to do, like inspecting or altering an agent's state from another process. This is a common enough need that I'd like to explore how we might make this easier. I'm not sure yet what the best solution is.

I love your idea of creating a "proxy" agent that you can use to interact with an agent instance. I think something like that is a real possibility and it would help greatly with testing and debugging.

The multiprocessing library supports the concept of proxy objects, which almost fits the requirements. The problem is that it wouldn't work outside of a Python environment, for example with a JS-based agent, so I don't think this approach is worth pursuing.

I have a rough idea right now that might work reasonably well: a common eval action on all agents, which would allow one agent to run arbitrary code within the context of another agent.

Sending this eval action could be done directly from the main process, for example:

space._route({
    "to": "my_agent",
    "action": {
        "name": "eval",
        "args": {
            "code": "print(self.my_property)"
        }
    }
})

Notice that we call space._route() directly to send a message into the space. This approach works today and I use this in the test suite.
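For illustration, the handler on the receiving side could be as simple as the following sketch. Everything here is hypothetical (the method name, how it gets registered as an action, and the return path); it just shows running caller-supplied code in the agent's context and returning a serializable value.

class MyAgent(Agent):  # assuming Agent is the framework's agent base class
    def eval(self, code: str):
        # Execute caller-supplied code with access to this agent instance.
        # exec() itself returns None, so the snippet stores whatever it wants
        # to send back in a well-known local name, e.g. "result".
        scope = {"self": self, "result": None}
        exec(code, scope)        # e.g. code = "result = self.my_property"
        return scope["result"]   # must be serializable to cross the queue boundary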

The new request() method might also be useful for making this call synchronous. This would allow any code to be run in the context of another agent and for values to be returned, etc.

The limitations would be that arguments and return values need to be serializable. You also wouldn't be able to return a reference to an internal variable that you can modify directly.

I'm curious to hear your thoughts on all this. Does the reasoning for the API change make sense? And would something like the above meet your needs?

@wwj718
Contributor Author

wwj718 commented Sep 12, 2023

Thank you for your detailed explanation. I basically agree with all your views and decisions here.

If you look at the Space implementations, you'll see that they now provide a queue type that its agents use for communication.
The good thing is that by abstracting communication to rely solely on queues, we allow more possibilities for distributed processing.

Architecturally, this is a very clear and thorough design. It's very much in line with the actor model's philosophy.

I read the implementation details, and I like the design of multiprocess_space.py. As you said:

The good thing is that by abstracting communication to rely solely on queues.

Should AMQPSpace inherit from MultiprocessSpace? It seems better if _AgentAMQPProcess only exposes inbound_queue and outbound_queue, just like _AgentProcess does. Conceptually, it seems we should move connection.Consumer into AMQPSpace. Mechanically, is it possible to dynamically add a queue to connection.Consumer when an agent is added to the space? Or we could use the RabbitMQ wildcard "#" to subscribe to all topic messages and then route them to the appropriate agents (sketch below). That way we only need to maintain one connection in the space, which also makes problems such as disconnection and reconnection easier to handle.
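Roughly what I have in mind (only a sketch; the exchange name, queue name, and the agents dict mapping agent names to their inbound queues are all made up):

from kombu import Connection, Exchange, Queue

exchange = Exchange("agency", type="topic")
catch_all = Queue("space-router", exchange=exchange, routing_key="#")

def run_router(agents):
    # One connection for the whole space; every topic message arrives here
    # and is handed off to the addressed agent's inbound queue.
    with Connection("amqp://guest:guest@localhost//") as conn:
        def route(body, message):
            agents[body["to"]].put(body)
            message.ack()

        with conn.Consumer(catch_all, callbacks=[route], accept=["json"]):
            while True:
                conn.drain_events()   # reconnection handling omitted in this sketch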

So I do think that this is a better abstraction moving forward

Completely agree 😄

This is a common enough need that I'd like to explore how we might make this easier. I'm not sure yet what the best solution is.

I think there might be two different requirements here:

  1. Communicating and interacting with agents in the space from a client. This requirement is relatively easy to meet: since everything is just sending and receiving messages, we can follow the approach of receive_logs_topic.py and create an AMQP client that can send and receive messages. It does not have to be an agent; at the message level, they are the same (see the sketch after this list).

  2. We want to dynamically build agents in a REPL environment like Jupyter. Due to the dynamic nature of Python, we can dynamically call methods on instances, or even dynamically add methods. This programming style is very popular in the Smalltalk/LISP communities, but whether it can be applied to actor-style programming, I have no idea. Personally, I'm very keen to be able to interactively build agents during the development phase.
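For the first point, a minimal client in the style of receive_logs_topic.py might look like this (the exchange name, routing keys, and message fields are only illustrative, not necessarily what AMQPSpace actually uses):

import json
import pika

# A plain AMQP client: not an agent, it just publishes to and consumes from
# the same topic exchange that the space's agents use.
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="agency", exchange_type="topic")

# Subscribe to everything addressed to "me".
result = channel.queue_declare(queue="", exclusive=True)
channel.queue_bind(exchange="agency", queue=result.method.queue, routing_key="me")

# Send a message to an agent in the space.
channel.basic_publish(
    exchange="agency",
    routing_key="my_agent",
    body=json.dumps({"to": "my_agent", "from": "me",
                     "action": {"name": "say", "args": {"content": "hello"}}}),
)

def on_message(ch, method, properties, body):
    print("received:", json.loads(body))

channel.basic_consume(queue=result.method.queue,
                      on_message_callback=on_message, auto_ack=True)
channel.start_consuming()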

I'm curious to hear your thoughts on all this. Does the reasoning for the API change make sense? And would something like the above meet your needs?

Sounds great! "Code execution (interpreter) support" seems likely to satisfy both of the above needs. Looking forward to seeing this prototype soon!

@operand
Owner

operand commented Sep 12, 2023

Should AMQPSpace inherit from MultiprocessSpace? It seems better if _AgentAMQPProcess only exposes inbound_queue and outbound_queue, just like _AgentProcess does. Conceptually, it seems we should move connection.Consumer into AMQPSpace.

I agree. I plan to clean up the implementations as soon as possible. There's some obvious duplication.

Mechanically, is it possible to dynamically add a queue to connection.Consumer when an agent is added to the space? Or we could use the RabbitMQ wildcard "#" to subscribe to all topic messages and then route them to the appropriate agents. That way we only need to maintain one connection in the space, which also makes problems such as disconnection and reconnection easier to handle.

That could be a way to do it, yes. I'll keep this in mind in case we find more reason to make that change.

Sounds great! "Code execution (interpreter) support" seems likely to satisfy both of the above needs. Looking forward to seeing this prototype soon!

Thanks! I can't say it'll be "soon" but definitely as soon as possible. The React app rewrite and JS client will take some time. But it should be easy enough to add an eval action to play around with the idea above in the meantime.

@wwj718
Contributor Author

wwj718 commented Sep 14, 2023

The new request() method might also be useful for making this call synchronous.

Regarding request(), I have some doubts. request() introduces two message ids: request_id and response_id. Are they necessary? Or is it enough to just use meta.id?

Jupyter's approach seems worth learning from (Messaging in Jupyter): it uses msg_id for both request and reply. Each message gets a UUID as its msg_id by default. When the sender later receives a message carrying the same msg_id, it knows this is the return value (response); that is synchronous mode. If it doesn't care whether there is a return value, that is asynchronous mode. My SyncAgent basically adopts this idea.

Several places would be simpler and more consistent if we adopted a similar approach to Jupyter's.
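To be concrete, here is a minimal sketch of the correlation pattern (nothing here is the framework's actual API; it only illustrates the idea):

import uuid

_pending = {}  # msg_id -> reply, for senders that chose to wait

def send(outbound, to, action, wait=False):
    msg_id = uuid.uuid4().hex
    outbound.put({"meta": {"id": msg_id}, "to": to, "action": action})
    if wait:
        _pending[msg_id] = None     # synchronous mode: we expect a reply with this id
    return msg_id

def on_receive(message):
    msg_id = message["meta"]["id"]
    if msg_id in _pending:          # the reply reuses the original msg_id...
        _pending[msg_id] = message  # ...so the waiting sender can pick up the value
    # otherwise nobody is waiting: asynchronous mode, handle or ignore as usual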

@operand
Owner

operand commented Sep 14, 2023

@wwj718 thanks for this feedback. I agree with your points.

Let me play around with an implementation that uses meta.id like you're suggesting, and we can discuss it. I'll open a PR for this as soon as I can and follow up here.

[edit] I paused working on a PR. See below.

@operand
Owner

operand commented Sep 14, 2023

Okay, now I remember why I separated meta.request_id from meta.id. The reasoning was that I felt it was useful to be able to identify when an incoming message is expecting a return value, as all request() messages do.

This problem stems from wanting to return an action's return value to the caller. It comes down to the fact that in Python, a method returns None by default, and there's no way to determine whether None was intentionally returned via a return statement or whether the method simply doesn't return a value.

You can see where I deal with this issue on this line.

There are different ways we can handle this. A couple possibilities:

  1. We can always send a response with the return value after every message. This would result in a lot of extra messages being sent since every message would be followed by a response message. I think this is why I was unsure about this approach and opted to identify these messages differently.

    This may not be too bad though since responses get filtered in the _receive() method. If a given response does not correlate with a known request in _pending_responses, then it falls back to using the handle_action_value callback.

    This would work and be consistent, but with the downside of many extra messages and calls being made to handle_action_value.

  2. We could use a special format for the meta.id that would identify it as a request. In fact, the request_id already does this by prepending request-- to the id.

    So in this approach the meta.id for a request would look something like request--123. And then we can use this fact to determine whether we send back the return value or not. e.g.

    if message["meta"]["id"].startswith("request--") or return_value is not None:
      # send back the return value

    In this approach we could also ignore return values unless the message is a request. This is possible today, but I've made the assumption that if a non-None return value is present, we should respond with it. We might want to be more explicit about the expectation here.

Regardless, I agree that we should probably get rid of the meta.request_id field and only use the meta.id field for identification. And I agree we should probably go ahead and populate meta.id by default from now on.

I'm not sure I agree with re-using the same meta.id in the response message though. I think they should be different fields where meta.response_id would refer to the original's meta.id. The response would have its own unique meta.id.

Looking at the jupyter documentation, it looks like the way they deal with this is to include the entire "parent header" which itself contains the original message id. So while we're not returning the entire header, the response_id should be enough to correlate messages on the client.
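Concretely, I'm picturing message shapes roughly like this (the field values and the response action are just placeholders for discussion):

# A request and its response; the response carries its own meta.id and a
# meta.response_id pointing back at the original message's meta.id.
request = {
    "meta": {"id": "123"},
    "to": "my_agent",
    "action": {"name": "eval", "args": {"code": "print(self.my_property)"}},
}

response = {
    "meta": {"id": "456", "response_id": "123"},
    "to": "sender",
    "action": {"name": "response", "args": {"value": None}},
}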

What do you think? I can work on a PR after we settle on what we'd like to do.

@wwj718
Contributor Author

wwj718 commented Sep 15, 2023

You can see where I deal with this issue on this line.

It seems meaningful to return return_value by default. When return_value is None, None still carries information: it means that the action is completed. The sender decides whether to use this information: when it has request semantics it can process it, and when it doesn't care it can ignore it. From an encapsulation perspective, this knowledge stays entirely inside the sender object.

We can always send a response with the return value after every message. This would result in a lot of extra messages being sent since every message would be followed by a response message.

Alan Kay said optimization is sweet trouble. Smalltalk is built entirely on the idea that "everything is an object; objects communicate through messages; objects interpret the messages they understand." Within the language, hundreds of thousands of objects may be born and die every second, and millions of messages flow between objects. Performance won't be an issue, and if we encounter one, the potential for optimization is huge.

We could use a special format for the meta.id that would identify it as a request.

It may be better to separate the semantics of the request from the semantics of the message id (separation of concerns). As you said, "We might want to be more explicit about the expectation here." For example, we could add a request parameter to action.args, such as request: true.
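Something like this (just a sketch of the idea; the field name and placement are only a suggestion):

message = {
    "meta": {"id": "123"},
    "to": "my_agent",
    "action": {
        "name": "say",
        # Explicit flag saying "please send back the return value", instead of
        # encoding that expectation in the id format.
        "args": {"content": "hello", "request": True},
    },
}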

Looking at the jupyter documentation, it looks like the way they deal with this is to include the entire "parent header" which itself contains the original message id.

It's a bit like the early days of agency.

@wwj718
Contributor Author

wwj718 commented Sep 15, 2023

Performance won't be an issue, and if we encounter one, the potential for optimization is huge.

By switching from RabbitMQ to a message delivery system like Syndicate, performance can be greatly improved; its performance is even enough to support operating-system-level infrastructure.

@operand
Owner

operand commented Sep 15, 2023

woah. This is seriously great feedback! I think you nailed it.

It seems meaningful to return return_value by default. When return_value is None, None still carries information: it means that the action is completed.

Great point. It keeps it really simple and yeah.. it does feel like I'm prematurely optimizing.

Regarding Syndicate: I'm extremely interested. I wasn't aware of it. I figured I couldn't be the first to want to build a language-spanning actor model system, and there it is... And it already supports Python and TypeScript!

I don't think I'm going to switch gears just yet, but I want to look into using Syndicate or other transport technologies pretty soon. I think after the React/JS updates, I'll be taking some time to reconsider architecture and tech. We're still in an experimental phase, but after some front-end tech is in place, then the broader shape of the framework is largely there, and I think that'll be a good time to reconsider technologies.

I'm going to put together a PR to clean up the request semantics to always return the response. I'll ping you to see if you want to take a look when it's ready.

In the future, I'm thinking that whenever bigger changes are coming, I'll make sure to put up a PR and leave it open for feedback for at least a day or two before merging. I think that'll be a better practice so that we can iron out issues like this before a release.

@wwj718 I really appreciate the feedback you've given here. You have a really great perspective and I'm so thankful you're thinking about this critically. I need that! :)

@wwj718
Contributor Author

wwj718 commented Sep 15, 2023

I'm so happy that we finally reached a consensus on this topic.

I don't think I'm going to switch gears just yet, but I want to look into using Syndicate or other transport technologies pretty soon. I think after the React/JS updates, I'll be taking some time to reconsider architecture and tech. We're still in an experimental phase, but after some front-end tech is in place, then the broader shape of the framework is largely there, and I think that'll be a good time to reconsider technologies.

I very much agree with your considerations. At this stage, we should pay more attention to the clarity of structure and concepts; it will be easy to replace individual components later.

I'm going to put together a PR to clean up the request semantics to always return the response. I'll ping you to see if you want to take a look when it's ready.

I'm always willing to participate in discussions 😄

@operand
Owner

operand commented Sep 17, 2023

I'm in the middle of working on the request/response changes and I just wanted to share an idea I had regarding the other topic in this thread, the idea of a proxy agent.

While I think the eval approach above is still worthwhile, I think we could come up with a way to allow instantiating an agent in the main process/thread, similar to the behavior before. It might actually just be an option when creating the agent. One limitation might be that you could only do this with at most one agent, though I'm not sure.

So with this idea you could create a sort of "foreground" agent, which would allow you to build applications that require interaction in the foreground, like a terminal process or something like that. It should work with the Python REPL as well.

I realize that running all agents in the background has made certain things difficult or even impossible. This approach would have made the Gradio app changes easier and I wouldn't have had to disable functionality, so I think it has real use cases. I'll spend some time exploring this after the request changes.

@operand
Owner

operand commented Sep 18, 2023

PR #144 is up to improve the request behavior. I'm giving it a day or so to wait for feedback if anyone's inclined.

@operand
Owner

operand commented Sep 18, 2023

@wwj718 @hidaris I opened a separate issue regarding the difficulties from not being able to instantiate in the main thread. #146

I'll need to tinker a little, but I think a solution should be possible.

@operand operand closed this as completed Sep 19, 2023