sync_send: retry feature #68

Closed
mavam opened this Issue Aug 31, 2012 · 9 comments

Member

mavam commented Aug 31, 2012

Do you think it makes sense to add a retry feature to sync_send? In the same vein as after, the user could specify how many times the runtime should try to send the same message again by adding a retry(n) partial function.

Such a feature would allow for more resilience in distributed systems where not all messages are expected to arrive.

Owner

Neverlord commented Sep 4, 2012

I'm not quite sure what you mean by retry. libcppa uses only reliable messaging for messages between actors. One could implement a group communication module based on UDP, but you cannot send a synchronous message to a group anyway. All messages sent to actors directly will eventually arrive unless you've lost the connection to the receiving node.

If you want to retry a message because the receiver didn't handle it in time, you could just increase the timeout.

Member

mavam commented Sep 5, 2012

All messages sent to actors directly will eventually arrive unless you've lost the connection to the receiving node

What about indirect sends? Consider the actor chain A -> B -> C -> D, where A performs a sync_send and B and C just do forward_to until the message hits D. A expects a reply to its sync_send, but if C goes down it has no direct way of figuring out what's going on. I understand that a timeout specification here avoids a deadlock.

After the timeout in A fires, a common reaction would be to try sending the message again. However, this would mean the user has to manage tuples, buffering them until the "ACK" arrives. I was imagining a libcppa mechanism where the user can tell libcppa to keep the message until a non-timeout reply comes in (or until some other condition becomes true). Does that make a little more sense?
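The manual bookkeeping this would spare the user can be sketched in plain C++ (no libcppa types involved; all names here are hypothetical):

```cpp
#include <map>
#include <string>

// Hypothetical sender-side bookkeeping: every synchronous message is
// buffered until a non-timeout reply ("ACK") arrives, so that it can be
// resent after a timeout without the user keeping a separate copy.
struct pending_sends {
    std::map<int, std::string> buffer; // request id -> buffered message

    void sent(int id, std::string msg) { buffer[id] = std::move(msg); }

    // a real reply arrived: drop the buffered copy
    void acked(int id) { buffer.erase(id); }

    // timeout fired: hand the buffered copy back for a resend
    const std::string* timed_out(int id) const {
        auto i = buffer.find(id);
        return i != buffer.end() ? &i->second : nullptr;
    }
};
```

The point of the proposal is that the runtime would do this transparently, so the user never touches `pending_sends`-style code.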

Owner

Neverlord commented Sep 6, 2012

Ok, I can see your point now. This is related to #51. If a synchronous message cannot be delivered, an error message should be sent. However, consider this example:

sync_send(foo, atom("bar")).then(
  on_arg_match >> [](const string& answer) {
    cout << "the answer is " << answer << ", my friend" << endl;
  },
  on("FAILURE", arg_match) >> [=](sync_error err) {
    cout << "failure during sync_send: " << to_string(err) << endl;
    if (last_sender() != foo) {
      // message was forwarded from foo to another actor
    }
    else {
      // foo terminated
    }
  },
  after... // should become optional, fires only if message was delivered but not handled
);

How would you avoid livelocks in the presence of a retry feature? If foo does not notice that it forwards to an unreachable / terminated actor, your actor would end up sending the same message over and over again. You might retry it manually once in case you didn't receive the error from foo. But an automatic retry feature would make it too easy to create endless loops, would it not?
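For comparison, the manually bounded retry suggested above could be sketched like this (plain C++; nothing here is libcppa API, and the names are invented):

```cpp
// Hypothetical bounded retry: give up after max_retries failed attempts
// instead of resending forever when the target stays unreachable.
// try_send() returns true on a real reply, false on timeout/error.
template <class Send>
bool send_with_retries(Send try_send, int max_retries) {
    for (int attempt = 0; attempt <= max_retries; ++attempt)
        if (try_send())
            return true;
    return false; // permanent failure: the caller decides what happens next
}
```

An explicit cap like `max_retries` is exactly what an automatic retry feature would make easy to forget.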

Member

mavam commented Sep 6, 2012

I agree that a plain retry would make it too easy to create livelocks. Going through a FAILURE handler makes the error status more explicit, which I like.

However, inside the error handler, I still see value in providing a function resend(foo, ...), retry(...), or perhaps just calling last_sent() to access the last synchronous message. Do you think that buffering the message until a reply arrives is not warranted? If not, do you see a clean way to make this optional? Otherwise, one would have to burden the user with keeping a copy of the tuple. But I would argue it's a common enough scenario to support.

Owner

Neverlord commented Oct 29, 2012

From my personal experience using libcppa, I'm using message handlers with mutable references quite often. Especially whenever I'm sending a string to an actor that the receiver then moves into a container, e.g., to aggregate replies from several sources. An automatic buffering of the last sent message would always force a copy instead of a move, which can drastically impact performance.
Making such a feature optional - by providing a mixin - would require some kind of indirection for outgoing messages. I'm not strictly opposed to enabling such "hooks", but I'm not quite sure whether this can be done without performance impact in case the feature is "turned off".

Member

mavam commented Oct 29, 2012

From my personal experience using libcppa, I'm using message handlers with mutable references quite often. Especially whenever I'm sending a string to an actor that the receiver then moves into a container, e.g., to aggregate replies from several sources.

Interesting. Do you mean that you move the tuple contents (e.g., a string) of last_dequeued on the receiver end into a container to enable dataflow-programming-like semantics without copying (i.e., pushing data through a pipeline)? I always assumed that one shall not tamper with the tuple because it could potentially be shared by multiple actors, but maybe I misunderstand what you are doing.

Owner

Neverlord commented Oct 30, 2012

Well, any_tuple has copy-on-write semantics. Whenever an actor demands write access to a tuple, it will (a) check whether the reference count is exactly 1, and (b) copy the tuple before modifying it if the reference count is > 1. "Demanding write access" means either calling any_tuple::get_as_mutable or taking a mutable reference in your message handler.

on(atom("doStuff"), arg_match) >> [=](string& stuff) { // mutable ref
    this->todo_list.push_back(move(stuff));
    // don't access last_dequeued() or stuff for anything else in this handler
}

The snippet above always moves stuff into a local container. However, stuff itself is a copy in case the tuple was shared by multiple actors. As long as the receiving actor holds the only reference to the tuple, it will move the original string directly.
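The refcount-then-copy rule can be reproduced in a few lines of standalone C++ on top of a shared_ptr (a sketch of the idea only, not libcppa's actual implementation):

```cpp
#include <memory>
#include <string>

// Sketch of copy-on-write: copy the payload only if it is still shared.
struct cow_string {
    std::shared_ptr<std::string> data;

    explicit cow_string(std::string s)
        : data(std::make_shared<std::string>(std::move(s))) {}

    // "Demanding write access": detach first if someone else holds a ref.
    std::string& get_as_mutable() {
        if (data.use_count() > 1)                       // shared -> copy first
            data = std::make_shared<std::string>(*data);
        return *data;                                   // unique -> write in place
    }
};
```

Copying a `cow_string` only bumps the reference count; the payload is duplicated lazily, at the first write while shared.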

Whenever you use the send function, you'll end up with exactly one reference to the tuple in the receiving actor. Moving the data can save a lot of copying. However, be careful whenever you mess around with move and mutable references in message handlers. ;)
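A standalone illustration of the move-out-of-a-mutable-reference pattern (plain C++; `handle_do_stuff` and `todo_list` stand in for the libcppa handler above):

```cpp
#include <string>
#include <utility>
#include <vector>

// Taking the argument as a mutable reference lets the handler steal the
// string's buffer via std::move instead of copying it. Afterwards `stuff`
// is in a valid but unspecified state, which is why it must not be read
// again inside the handler.
void handle_do_stuff(std::string& stuff, std::vector<std::string>& todo_list) {
    todo_list.push_back(std::move(stuff));
}
```

The same caveat as above applies: once moved from, the source string should only be destroyed or assigned a new value.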

Member

mavam commented Oct 30, 2012

Hah, if I had only known earlier that this was possible! I always thought the only way to get mutable access to an any_tuple would be via last_dequeued, and that this is certainly discouraged because it accesses internals. Up to this point, I created complicated data processing pipelines with "virtual" actors that call a hook before "sealing" the object into a const reference. Now that I know I could have just transferred ownership by moving the object into the tuple, much simpler and more contained architectures come to mind.

When you get a chance, would you mention that feature in the manual?

In this light, I now understand your concerns regarding a potential retry feature. If libcppa had some notion of a weak_tuple that could lock an any_tuple iff its reference count is non-zero, it might be possible to keep the last tuple around on the sending side and access it when the send failed. But even that would incur bookkeeping overhead that not every scenario needs.

I am fine with closing the ticket for now and perhaps revisiting it in the future.

Owner

Neverlord commented Oct 30, 2012

Well, adding bookkeeping overhead for each any_tuple instance would seriously impact performance. However, I hope you can make use of this feature now to simplify your architecture. :)

I guess it's one of those things that seem obvious if you know all the internals, but really aren't to other developers. Maybe it's time for an "advanced" section in the manual with hints for performance increases and not-so-obvious idioms.

@Neverlord Neverlord closed this Oct 30, 2012

@Neverlord Neverlord added wontfix and removed other labels Jul 30, 2014
