
Remote message broadcasting #92

Closed
stuartcarnie opened this issue Feb 8, 2017 · 7 comments

Comments

@stuartcarnie
Contributor

stuartcarnie commented Feb 8, 2017

Abstract

Sending a single message to multiple remote PIDs can consume potentially large amounts of memory and CPU resources. We propose to add support for batching these messages so they can be serialized and deserialized once per node.

Background

Broadcast Messages

Group routers allow the user to add arbitrary PIDs to receive messages based on the routing strategy of the router. These PIDs could reside on many remote nodes. Certain scenarios allow a single message to be forwarded to all of the PIDs, which can be very inefficient if many of them are remote. In the case of a broadcast router, messages are always forwarded to all the routees. In addition, a router that receives a router.Broadcast message will forward the inner message to all its routees.

As an example, consider a group broadcast router with 1,000 PIDs spread evenly across 10 remote nodes. The source node has to serialize the message 1,000 times, and each remote node has to deserialize it 100 times. This can significantly increase memory and CPU usage on all involved nodes.
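The arithmetic behind that example can be sketched as follows. This is only an illustration: the `cost` type and both function names are made up here, routees are assumed to be spread evenly across nodes, and the batched case assumes at most one serialization per node, as the abstract proposes.

```go
package main

import "fmt"

// cost counts how often a single broadcast message is (de)serialized.
// Hypothetical type, for illustration only.
type cost struct {
	SourceSerializations    int // on the sending node
	PerNodeDeserializations int // on each remote node
}

// naiveCost models one wire message per routee.
// Assumes routees are spread evenly across nodes.
func naiveCost(routees, nodes int) cost {
	return cost{
		SourceSerializations:    routees,
		PerNodeDeserializations: routees / nodes,
	}
}

// batchedCost models one envelope per remote node.
func batchedCost(routees, nodes int) cost {
	return cost{
		SourceSerializations:    nodes, // at most once per node
		PerNodeDeserializations: 1,
	}
}

func main() {
	fmt.Printf("naive:   %+v\n", naiveCost(1000, 10))
	fmt.Printf("batched: %+v\n", batchedCost(1000, 10))
}
```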

Proposal

BroadcastEnvelope

We propose creating a new message which represents a broadcast of an inner message to a set of PIDs on a remote node:

message BroadcastEnvelope {
  string          typeName = 1;
  bytes           messageData = 2;
  repeated string IDs = 3;
}

The typeName and messageData represent the serialized protobuf message that will be reconstituted at the remote and delivered to the local PIDs. The IDs array holds only the Id component of each PID, since the Address is the same for every recipient on that node and is therefore redundant.

An actor on the remote node will deliver the inner message to all PIDs listed in the IDs array.
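A minimal sketch of what that receiving side could look like, assuming a Go struct that mirrors the proposed protobuf message. The `Decoder` type, the `Deliver` function and the `deliverLocal` callback are hypothetical stand-ins for protobuf decoding and local mailbox delivery; none of them are existing APIs.

```go
package main

import "fmt"

// BroadcastEnvelope mirrors the proposed protobuf message.
type BroadcastEnvelope struct {
	TypeName    string
	MessageData []byte
	IDs         []string
}

// Decoder rebuilds a message from its type name and bytes.
// Hypothetical stand-in for protobuf deserialization.
type Decoder func(typeName string, data []byte) (interface{}, error)

// Deliver decodes the payload once and hands the same decoded value to
// every local ID listed in the envelope. deliverLocal is a hypothetical
// callback representing the local process lookup and mailbox post.
func Deliver(env BroadcastEnvelope, decode Decoder, deliverLocal func(id string, msg interface{})) error {
	msg, err := decode(env.TypeName, env.MessageData)
	if err != nil {
		return fmt.Errorf("decode %q: %w", env.TypeName, err)
	}
	for _, id := range env.IDs {
		deliverLocal(id, msg)
	}
	return nil
}

func main() {
	decodes := 0
	decode := func(typeName string, data []byte) (interface{}, error) {
		decodes++ // count how often the payload is decoded
		return string(data), nil
	}
	received := map[string]interface{}{}
	env := BroadcastEnvelope{
		TypeName:    "example.Hello",
		MessageData: []byte("hi"),
		IDs:         []string{"a", "b", "c"},
	}
	err := Deliver(env, decode, func(id string, msg interface{}) { received[id] = msg })
	if err != nil {
		panic(err)
	}
	fmt.Println("decodes:", decodes, "recipients:", len(received))
}
```

Note that every recipient gets the same decoded value, so this only works if messages are treated as immutable.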

PIDSet.Tell

Add a Tell method to PIDSet, allowing efficient broadcast of messages to a set of PIDs.
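One way the sending side could work is sketched below. The `PID` and `PIDSet` types are simplified stand-ins rather than the library's real types, and the `Tell` signature is an assumption: it takes an already-serialized payload plus a `send` callback so the example stays free of protobuf and networking. Handling of PIDs that live on the local node is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// PID is a simplified stand-in for an actor PID: node address plus local id.
type PID struct{ Address, Id string }

// PIDSet is a simplified stand-in for the real set type.
type PIDSet struct{ pids []PID }

// BroadcastEnvelope mirrors the proposed protobuf message.
type BroadcastEnvelope struct {
	TypeName    string
	MessageData []byte
	IDs         []string
}

// Tell groups the PIDs by Address and emits one envelope per remote node.
// The payload is serialized once by the caller and shared by all envelopes.
// send is a hypothetical callback that writes an envelope to a node.
func (s *PIDSet) Tell(typeName string, data []byte, send func(address string, env BroadcastEnvelope)) {
	byAddr := make(map[string][]string)
	for _, p := range s.pids {
		byAddr[p.Address] = append(byAddr[p.Address], p.Id)
	}
	addrs := make([]string, 0, len(byAddr))
	for a := range byAddr {
		addrs = append(addrs, a)
	}
	sort.Strings(addrs) // deterministic order, only to keep the example stable
	for _, a := range addrs {
		send(a, BroadcastEnvelope{TypeName: typeName, MessageData: data, IDs: byAddr[a]})
	}
}

func main() {
	set := &PIDSet{pids: []PID{
		{Address: "node1:8080", Id: "worker-1"},
		{Address: "node2:8080", Id: "worker-2"},
		{Address: "node1:8080", Id: "worker-3"},
	}}
	set.Tell("example.Hello", []byte("payload"), func(address string, env BroadcastEnvelope) {
		fmt.Printf("%s <- %v\n", address, env.IDs)
	})
}
```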

@cpx86
Contributor

cpx86 commented Feb 8, 2017

I think we should synchronize this with the other idea of wrapping all messages in an envelope, as proposed in #69.

For example, how would we handle message headers for a broadcast message sent over the wire? Could we build broadcasting into the general message envelope by having a list of recipients? Or would it be better to wrap one of the envelopes inside the other (and if so, in which order)?

I'm leaning towards either merging the two envelopes, or wrapping the broadcast envelope in the message envelope.

@stuartcarnie
Contributor Author

I think the two are separate. I see BroadcastEnvelope as an implementation detail for optimizing the delivery of a message to multiple PIDs that exist on one or more remote nodes. We would generate an envelope for each remote node, with the subset of PIDs residing on that node.

As I understand it, a Message is sent via Tell, Request, etc., to a single PID. There could be middleware that inspects a header with a list of PIDs to forward this message to, but I don't see that as the same thing. One thing worth noting: if that middleware used a PIDSet to broadcast the message, it would benefit from the same optimizations through the proposed PIDSet.Tell API.

@rogeralsing
Collaborator

As I see this, there are two ways this could be implemented:

  1. Which I guess is what @cpx86 is leaning towards.

We have a LocalEnvelope, a RemoteEnvelope and a BroadcastEnvelope, all of which carry the message header information to the receiver.

Or

  2. We separate them to be only LocalEnvelope and RemoteEnvelope, and make the BroadcastEnvelope a separate thing with no headers.

This would mean that if we broadcast to a remote node, the broadcast envelope would have to contain a RemoteEnvelope internally in order to also carry the header information.

@rogeralsing
Collaborator

One thing we could do is let the current MessageEnvelope (in Remote) take a list of IDs instead of a single PID target.

So a message could be sent to 1-n actors on a remote node.

Not sure how this would affect allocations in the default (1 target) case.
But in theory, there is not really any difference between having 1 or more targets.

@stuartcarnie
Contributor Author

Q: What is the difference between a LocalEnvelope and a RemoteEnvelope?

MessageEnvelope could work – as it also represents a single message to be delivered. It could be optimized to take one or more Ids instead of a single PID, which I doubt would add noticeable overhead.

@rogeralsing
Collaborator

A LocalEnvelope could carry a Message interface{} directly; a RemoteEnvelope must serialize it.
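A rough sketch of that distinction, purely as an illustration. The field names, the `Header` map and the `ToRemote` helper are assumptions made for this example, not existing types, and `encoding/json` stands in for protobuf serialization.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LocalEnvelope stays in-process, so it can hold the message value as-is.
// Hypothetical shape.
type LocalEnvelope struct {
	Header  map[string]string
	Message interface{}
}

// RemoteEnvelope crosses the wire, so the message must be carried as bytes
// plus a type name for reconstruction. Hypothetical shape.
type RemoteEnvelope struct {
	Header      map[string]string
	TypeName    string
	MessageData []byte
}

// ToRemote serializes the local message using the supplied marshal function.
// marshal is a stand-in for whatever serializer is actually used.
func ToRemote(l LocalEnvelope, typeName string, marshal func(interface{}) ([]byte, error)) (RemoteEnvelope, error) {
	data, err := marshal(l.Message)
	if err != nil {
		return RemoteEnvelope{}, fmt.Errorf("marshal %q: %w", typeName, err)
	}
	return RemoteEnvelope{Header: l.Header, TypeName: typeName, MessageData: data}, nil
}

func main() {
	local := LocalEnvelope{
		Header:  map[string]string{"trace-id": "abc"},
		Message: map[string]int{"count": 1},
	}
	// JSON is used here only to keep the example dependency-free.
	remote, err := ToRemote(local, "example.Count", json.Marshal)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: %d bytes\n", remote.TypeName, len(remote.MessageData))
}
```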

@rogeralsing
Collaborator

Not planned for now


3 participants