Ending sessions with max_round. #309

Closed
joschisan opened this issue Apr 7, 2023 · 3 comments

Comments

@joschisan
Contributor

joschisan commented Apr 7, 2023

Hello everyone,

I need to run sessions sequentially to create an atomic broadcast. However, the time to finalization is critical, and we therefore cannot increase the round_delay until we finalize a preset number of items in a session. The alternative seems to be to set max_round to, for example, 3000 and finalize a variable number of items per session. In this case, however, the behaviour of the algorithm is unclear to me. Does a node keep transmitting messages after it has finalized its last Data item, so that the other nodes can finish the session as well? Or do I risk that some nodes will not be able to finish their session if I use max_round to end it? In my case I need the session to let all other nodes catch up until either all of them have finished or I terminate it manually.

Please let me know if there is a developer discord or similar to discuss questions like this.

@joschisan changed the title from "Session without increasing round delay." to "Ending sessions with max_round." on Apr 7, 2023
@timorl
Collaborator

timorl commented Apr 11, 2023

I don't quite understand your use case. It might help if you gave an example of what you want to happen with your data ordering.

Some partial answers:

  1. max_round hard-ends a session; it's very unlikely you want to hit it for any purpose.
  2. You might want to transmit "end session" proposals/items and terminate the session as soon as they are ordered, although this approach might be irrelevant to your use case (as I said, I didn't quite get it).
  3. You can influence how fast the algorithm progresses by changing how the DataProvider works. This mostly means you can slow down finalization at some points (and make it faster at others by lowering round_delay), which might be useful depending on what exactly you want to do.
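
To make point 2 concrete, here is a minimal sketch of the "end session" item idea. It does not use the real AlephBFT API: `SessionItem` and `collect_session` are hypothetical names, and the input is assumed to be the stream of items in the order the session finalized them.

```rust
/// Hypothetical item type submitted for ordering (not an AlephBFT type).
#[derive(Debug, Clone, PartialEq, Eq)]
enum SessionItem {
    /// Regular application payload.
    Data(Vec<u8>),
    /// Marker proposing that the session should end.
    EndSession,
}

/// Walks the items in finalized order and collects the payloads that were
/// ordered before the first `EndSession` marker. The returned flag is true
/// if a marker was seen, i.e. the session may now be terminated; anything
/// ordered after the marker is ignored.
fn collect_session<I>(ordered: I) -> (Vec<Vec<u8>>, bool)
where
    I: IntoIterator<Item = SessionItem>,
{
    let mut payloads = Vec::new();
    for item in ordered {
        match item {
            SessionItem::Data(bytes) => payloads.push(bytes),
            SessionItem::EndSession => return (payloads, true),
        }
    }
    (payloads, false)
}
```

Because every node sees the same order, every node cuts the session at the same marker, which is what makes the marker usable as a session boundary.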

@joschisan
Contributor Author

joschisan commented Apr 11, 2023

I would like to implement a session manager that allows nodes that missed several sessions to catch up. My idea is to hash all ordered items of a session and create a threshold signature over the round number concatenated with the hash. Lagging nodes can then always verify a session's items against the signature, even if the session was terminated long ago. To create the threshold signature I need N - f nodes to complete the session, so it seems a node cannot terminate a session immediately after it has obtained the last item; it needs to keep transmitting messages until N - f nodes have completed the session as well. As I said, latency is critical, so the session cannot be terminated by a preset number of items; we could use your partial answer number 2 for this. However, it seems to me that in theory we could also end the session after a fixed number of rounds via max_round, except that the algorithm terminates immediately when I use it. Using max_round instead of "end session" items would just make for a cleaner implementation on my side. Transmitting messages after max_round has been hit, until the session is manually terminated, therefore seems like the more versatile behavior.
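
A rough sketch of the message that would be threshold-signed, under stated assumptions: `items_hash`, `message_to_sign` and `signature_threshold` are hypothetical helpers, `DefaultHasher` is only a non-cryptographic stand-in for a real hash function, and f is taken as (N - 1) / 3, the usual BFT bound, which the thread itself does not state.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash over all ordered items of a session, in order.
/// ASSUMPTION: DefaultHasher is NOT cryptographically secure; a real
/// implementation would use a cryptographic hash instead.
fn items_hash(ordered_items: &[Vec<u8>]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for item in ordered_items {
        item.hash(&mut hasher);
    }
    hasher.finish()
}

/// The bytes to sign: the number (big-endian) concatenated with the hash.
fn message_to_sign(number: u64, ordered_items: &[Vec<u8>]) -> [u8; 16] {
    let mut msg = [0u8; 16];
    msg[..8].copy_from_slice(&number.to_be_bytes());
    msg[8..].copy_from_slice(&items_hash(ordered_items).to_be_bytes());
    msg
}

/// Number of signature shares required, N - f.
/// ASSUMPTION: f = (N - 1) / 3, i.e. N >= 3f + 1.
fn signature_threshold(n: usize) -> usize {
    let f = n.saturating_sub(1) / 3;
    n - f
}
```

A lagging node would recompute `message_to_sign` from the items it received and check it against the threshold signature; the signature scheme itself is out of scope for this sketch.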

@timorl
Collaborator

timorl commented Apr 11, 2023

The threshold signature part sounds solid, but probably not directly relevant to the rest of the problem. (Depending on your setting you might want to use the rmc crate to create it, it's quite well suited for such uses.)

Unfortunately max_round terminates the algorithm in a pretty brutal way. In particular, a session that ends because of max_round is likely to have items at its end that will never be ordered, so it doesn't work as a session-ending criterion.

If the increased latency during session change is not acceptable, you might want to run two sessions in parallel (starting the new session slightly before the previous one ends; the details depend on your settings), submitting items to be ordered in both of them, and then joining the two orders after the "end session" item is ordered in the first. The join would remove all duplicated items ordered in the new session and then append the rest of that order on top of the old one. This should avoid (almost?) all session-switching lag, but adds some complication to the code. We toyed with a system like that and we are quite confident the mathematics checks out, but in our use case the session change lag is not particularly troublesome, so we didn't end up implementing it.
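
The join described above could look roughly like this. It is only a sketch: `join_orders` is a hypothetical helper, and `old_order` is assumed to already be cut at the old session's "end session" item.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Joins the order of the old session with the order of the overlapping
/// new session: items from `new_order` that were already ordered in
/// `old_order` are dropped, the rest are appended in their original order.
/// Repeats within `new_order` itself are also dropped.
fn join_orders<T: Clone + Eq + Hash>(old_order: &[T], new_order: &[T]) -> Vec<T> {
    let mut seen: HashSet<T> = old_order.iter().cloned().collect();
    let mut joined = old_order.to_vec();
    for item in new_order {
        // `insert` returns false when the item is already in the set.
        if seen.insert(item.clone()) {
            joined.push(item.clone());
        }
    }
    joined
}
```

Since both input orders are identical on every node, the joined order is identical too, as long as every node cuts the old order at the same point.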

Kind of important – this ABFT implementation is not strictly censorship resistant (all nodes will end up with the same order, but it's not guaranteed all items submitted will eventually be ordered), so if you are submitting unique items and none of them can be lost it won't work for you. We do have a weak censorship resistance, in that if you submit an item to f+1 nodes it is guaranteed to eventually be ordered (modulo the max_round thing mentioned above). It's also relatively unlikely any items get lost in practice, but there is no formal guarantee. We are planning to eventually implement proper censorship resistance, we know how to do it and it's not even that big of a change, but we currently have other priorities.
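
As an illustration of the f+1 rule, a client could pick its submission targets like this. The helper names are hypothetical, and f = (N - 1) / 3 is an assumption (the usual BFT bound) rather than something stated in this thread.

```rust
/// Maximum number of faulty nodes tolerated among `n` nodes.
/// ASSUMPTION: n >= 3f + 1, so f = (n - 1) / 3.
fn max_faulty(n: usize) -> usize {
    n.saturating_sub(1) / 3
}

/// Returns the first f + 1 nodes of the committee: submitting an item to
/// all of them means at least one recipient is not faulty.
fn submission_targets<T>(nodes: &[T]) -> &[T] {
    let count = (max_faulty(nodes.len()) + 1).min(nodes.len());
    &nodes[..count]
}
```

Which f + 1 nodes are chosen does not matter for the guarantee; taking the first ones is only the simplest choice.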

Since it might be easier to talk about this in more of an IM style as you suggested, you can contact me on Matrix as @timorl:cardinal.ems.host. There is some Discord thing, but I don't frequent it and it's not quite for devs of ABFT, perhaps @DamianStraszak can say something more about this.
