Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ames: always ack recork pleas #6364

Merged
merged 2 commits into from Mar 3, 2023
Merged

ames: always ack recork pleas #6364

merged 2 commits into from Mar 3, 2023

Conversation

yosoyubik
Copy link
Contributor

@yosoyubik yosoyubik commented Mar 1, 2023

Fixes #6363

We have a reproduction betwen ~halbex-palbep and ~rontul-dantyr-norsyr-torryn, I'll put this fix in ~halbex-palheb to confirm that it work and will then open it for review.

@yosoyubik yosoyubik changed the title anes: always ack recork pleas ames: always ack recork pleas Mar 1, 2023
@yosoyubik
Copy link
Contributor Author

~halbex-palheb is looking good, and with better performance since it has already acked thousands of recorks that many ships were sending to it.

@belisarius222
Copy link
Contributor

I'm wondering if there's a race condition here:

  1. subscriber sends %plea, then %cork immediately afterward; both messages are in flight simultaneously
  2. publisher receives the %plea and nacks it.
  3. publisher receives the %cork and acks it.
  4. both acks get dropped by the network
  5. subscriber waits until timeout, then re-sends %plea
  6. publisher hears repeat %plea, but it's already marked as corked so it responds with a positive ack instead of a nack

This is only possible if Ames will send a %cork before all the previous messages have been acked. I don't remember whether it does that or not. If this is in fact a problem, we could handle it by doing the following: when publisher receives a packet on a corked bone, check whether it's a %cork message -- if it is, ack it; otherwise, drop it. This would be legitimate because %cork messages always fit into a single packet.

@yosoyubik
Copy link
Contributor Author

yosoyubik commented Mar 2, 2023

We found an issue with the solution proposed above. No-oping on the non-cork plea will mean that it's going to retry forever. One solution would be to guarantee that a cork plea is never sent if there are live packets in the pump, so the publisher will only remove the flow when all previous messages have been processed.

This would happen in +feed-packets:make-message-pump, we will need to do something like this:

    =/  future-cork=?
      ?&  ::  in-flight packets
          ::
          ?=(^ live.packet-pump-state.state)
          ::  are we trying to send a cork?
          ::
          ?=(^ ;;((soft [%$ path %cork ~]) (cue message-blob)))
      ==
    ?:  future-cork
      ::  put the cork back in the queue and no-op
      ::
      =.  unsent-messages.state  (~(put to unsent-messages.state) message-blob)
      message-pump

One concern here is how much will affect call cue on possibly many messages (but only if there are live packets)

@yosoyubik
Copy link
Contributor Author

A different solution we have considered is to enforce on the receiver of a plea that there will be only 1 possible message in the future instead of the current 10. This will drop any future messages and guarantee that corks (the last message of a flow) are never processed before previous messages have been acked.

@yosoyubik
Copy link
Contributor Author

yosoyubik commented Mar 3, 2023

After some thinking, the solution currently on this PR (always acking a plea send on a corking bone) seems to be fine because there's always going to be two scenarios here:

  • a %watch plea, followed by a %cork (if we get kicked because of congestion)
  • a %watch plea, followed by a %leave, and then a %cork.

For both cases, whether to ack or nack the %plea message (in the scenario where the initial ack got lost, and the subscriber is trying to send it again) before the cork is irrelevant, if the flow has already been corked on the publisher.

@jalehman
Copy link
Member

jalehman commented Mar 3, 2023

It sounds like this is ready for review then @yosoyubik — if so, please mark as Ready for Review.

@belisarius222 belisarius222 marked this pull request as ready for review March 3, 2023 16:31
@belisarius222 belisarius222 merged commit 59e2d93 into develop Mar 3, 2023
@belisarius222 belisarius222 deleted the i/6363/ack-recork branch March 3, 2023 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ames: publisher doesn't ack cork pleas for corked flows
3 participants