Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-consider separate, ordered mailbox #94

Closed
thomaseizinger opened this issue Jun 21, 2022 · 10 comments · Fixed by #207
Closed

Re-consider separate, ordered mailbox #94

thomaseizinger opened this issue Jun 21, 2022 · 10 comments · Fixed by #207

Comments

@thomaseizinger
Copy link
Collaborator

With #85, we are introducing 3 mailboxes: ordered, prioritized and broadcast.

Ordered exists because BinaryHeaps don't preserve insertion order and thus, messages with the same priority (like 0 as the default) could be processed in a different order, assuming they are used with split_receiver and the processing thus happens async.

I am claiming that this is way too much of a detail to expose to the user (which has to eventually learn that there are 3 mailboxes) so my suggestion would be to:

a) Use a data structure that preserves insertion order
b) Accept the re-ordering of messages

I am not data structure expert but couldn't we for example also use an atomic counter to tag each incoming message and sort the heap based on priority and message counter? That should preserve the sending order even among messages with the same priority.

@Restioson
Copy link
Owner

Couldn't we for example also use an atomic counter to tag each incoming message and sort the heap based on priority and message counter?

In theory, yes, but how do you propose to handle overflows? This was the main issue I ran into thinking about this option

@thomaseizinger
Copy link
Collaborator Author

Couldn't we for example also use an atomic counter to tag each incoming message and sort the heap based on priority and message counter?

In theory, yes, but how do you propose to handle overflows? This was the main issue I ran into thinking about this option

Reset it every time the queue is empty perhaps?

Am I too naive in saying that we could also just not handle them? The "problem" would be a slight out of order handling of same-priority messages.

From experience, actors receive messages from various addresses concurrently, so the exact order in which they will be handled is unclear anyway so this wouldn't even be noticeable.

But if we accept this behaviour in an edge case, we might also just accept it all the time :D

@thomaseizinger
Copy link
Collaborator Author

b) Accept the re-ordering of messages

Expanding more on this:

Usage of split_receiver makes message delivery asynchronous and I think by definition, timing of events doesn't matter in an async system. (I've found this but it doesn't have many follow-up references.)

In other words, I would argue that someone using split_receiver explicitly gives up control over the ordering of their messages.

We may want to debate the name of that function again because it doesn't communicate that right away, only conceptually.

@thomaseizinger
Copy link
Collaborator Author

If we would implement #92 (comment), then this issue would go away because the BinaryHeap could be replaced by a VecDeque per message priority.

@thomaseizinger
Copy link
Collaborator Author

@Restioson What is your latest position on this?

I am gonna try and summarize our current knowledge:

Separate handling of messages with default priority has the advantage that they will be delivered to the actor in the order they have been sent even if the user uses split_receiver.

I see two problems with this feature:

  1. We are special-casing only the default priority. If I do the same thing with two messages of priority 5, they are not getting the same treatment.
  2. The potential re-ordering only happens if the user chooses to await the handler invocation asynchronously. I think by definition, that means they don't care about the order in which these messages are handled, otherwise they would have chosen synchronous execution, right?

From a user perspective, the feature is incomplete (only applies to default priority) and from a maintainer perspective, it adds a fair bit of complexity to the codebase which is why I'd like to remove it.

@thomaseizinger
Copy link
Collaborator Author

Friendly ping @Restioson :)

@thomaseizinger
Copy link
Collaborator Author

I propose to merge ordered and priority into a unicast mailbox. I think this can wait for after 0.6 as it doesn't affect the public API although it would be nice to do before given that the "priority" feature is only introduced in master and doesn't exist in 0.5.

@Restioson
Copy link
Owner

Hi, sorry for getting to this so late. I am back from holiday now :)

I think the main thing around having a unified mailbox would be that we would need to find some way to implement the two features (priority & no priority) in one fell swoop while still retaining message ordering, as there are some existing usecases which use message ordering. We could try the VecDeque solution and see what the performance/memory usage impact would be versus the current solution. I do anticipate it would use more memory and allocate more, needing one new VecDeque per priority that the user is using. Maybe there are other priority queue structures that also preserve insertion order?

@thomaseizinger
Copy link
Collaborator Author

Hi, sorry for getting to this so late. I am back from holiday now :)

I think the main thing around having a unified mailbox would be that we would need to find some way to implement the two features (priority & no priority) in one fell swoop while still retaining message ordering

As I tried to argue above, message ordering is guaranteed if you await the response of the message. If you say split_receiver, you are in asynchronous land where ordering goes out the window by design.

Can you explain what ordering you want to retain?

Also, I think we shouldn't look at it as no priority vs priority. Priority 0 is just another priority and not no priority :)

@thomaseizinger
Copy link
Collaborator Author

thomaseizinger commented Feb 23, 2023

Here are meeting "notes" from an out-of-band meeting: https://hackmd.io/@AeK4eNTZTGGJXAT4cxS_Zw/r1MvCrHCi.

The conclusion of the meeting is:

  • We agree that the ordered & priority mailboxes should be merged:
    • It should be easier to understand for users as there is no explanation needed why initializing with capacity may end up consuming capacity * 3 memory.
    • It removes the feature disparity where only "default" priority messages retained ordering
  • We want to have a function like split_receiver in the library because:
    • Back-pressure is important. We cannot just tell users to tokio::spawn(address.send()) as that breaks the back-pressure chain if we f.e. read messages from a socket.
    • Spawning requires knowledge about the executor and we want to be an executor-agnostic library. We also want to make it possible for users to further build executor-agnostic libraries on top of xtra which would not be possible if split_receiver-like functionality is not offered out of the box.
  • Merging the mailboxes removes the "ordering" guarantee in combination with split_receiver. We agreed that we need a better name for the function to convey this:
    • Ideally, the name is goal-oriented. What happens is "background-processing" or "asynchronous handling" but "async" is a very overloaded term within the Rust space so we would like to avoid that, esp. because xtra defines itself as being an "async-first" library but this is aimed at the support for async fn handlers.
    • The name should imply or at least suggest that the actor may handle several message sent using that function in any order.

Some naming suggestions that came up during the meeting:

  • send().queue(): Suggests that we get a space in the mailbox but that processing is separate. It also has a co-notation of ordering though because most queue's are FIFO which pretty much rules this one out. It is also weird that queue is a "configuration" of send and not an alternative. We do however want to retain the simplicity of Address only offering Address:send and Address::queue.
  • send().fire_and_forget(): Suggests that we will not wait for the response which is in line with what we want. It is kind of odd that you can still get the response by waiting for the returned receiver.
  • send().eventually(): Suggests that processing will happen "eventually" which has not notion of ordering. There is no prior art that I know about so this one may be very hard to discover.

An idea that just came to my mind:

addr.send().spawn():

  • spawn has prior art in the ecosystem of doing something "in the background".
  • Both threads and tasks can await the result through a handle returned from spawn, same as the receiver returned from send().

What may not be intuitive is that people might think "what executor / thread is this going to be spawned on?" when in reality it isn't actually "spawned" as such but just queued, albeit in a non-FIFO queue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants