
ames: ignore message-num.task for %drop tasks #6996

Draft · wants to merge 5 commits into master
Conversation

yosoyubik (Contributor)

tl;dr: not working yet, but I wanted to get some eyes on it first, since this behavior has been in place since the %alef rewrite.

When a message is nacked (e.g. seq=25), we add it to nax.sink and create a %naxplanation message (e.g. seq=1, the first naxplanation ever sent) that contains the nacked sequence number (25), and send it to the original sender on the naxplanation flow.

When the ack for the naxplanation (seq=1) comes back, we know it was an ack on a naxplanation bone, and we then drop the sequence number of this naxplanation message (seq=1) instead of that of the original message (seq=25) that was nacked.

This leaves a range of messages in nax.sink that will remain there until the naxplanation sequence numbers catch up with the actual message numbers from the reference flow.

We are also removing messages that might never have been nacked, but that seems OK, since those messages won't be in nax.sink anyway. It does mean, however, that there can be gaps in the sequence of messages we have nacked, so the presence of messages in nax.sink doesn't let us reason: "if a sequence number higher than anything in the queue is absent from nax.sink, it has never been nacked."
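The bookkeeping described above can be sketched in a small model. This is a hypothetical Python illustration, not Hoon from the PR; names like `MessageSink`, `on_nack`, and `sent` are invented for the sketch, with `nax` standing in for nax.sink.

```python
class MessageSink:
    """Toy model of the nacked-message bookkeeping described above."""

    def __init__(self):
        self.nax = set()             # stands in for nax.sink: nacked seqs on the reference flow
        self.next_naxplanation = 1   # sequence counter on the naxplanation flow
        self.sent = {}               # naxplanation seq -> nacked seq it explains

    def on_nack(self, seq):
        # e.g. seq=25: record the nack and send a %naxplanation carrying 25
        self.nax.add(seq)
        self.sent[self.next_naxplanation] = seq
        self.next_naxplanation += 1

    def ack_naxplanation_current(self, nax_seq):
        # behavior described in the PR: drop the naxplanation's own
        # sequence number (e.g. 1), which may never have been nacked
        self.nax.discard(nax_seq)

    def ack_naxplanation_intended(self, nax_seq):
        # intended behavior: drop the original nacked message number (25)
        self.nax.discard(self.sent[nax_seq])


sink = MessageSink()
sink.on_nack(25)                    # message 25 is nacked; naxplanation seq=1 goes out
sink.ack_naxplanation_current(1)    # drops seq 1, a no-op here
assert 25 in sink.nax               # 25 lingers until naxplanation seqs catch up
```

Under this model, 25 only leaves the set once 25 naxplanations have been acked, which is the "range of messages that remain in nax.sink" effect described above.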

In any case, this behavior doesn't seem to have caused any (known?) issues, and as of this commit the change introduces some problems of its own (which I'm looking into right now). So as it stands I wouldn't say we should fix this before Directed Messaging is released, since we could introduce other problems. But I wanted some eyes on it, since part of this machinery dates back to the %alef rewrite (e.g. nax.peer-state, which was never used, only nax.sink in the .message-sink) and got dropped when it was merged into %ames.

(PS: wip, currently does not work)

(PPS: a space-leak fix targeted these nacked messages directly and removed anything that had already been acked)

@yosoyubik (Contributor, Author)

Correction:

It turns out that what I thought was a bug introduced by this PR is actually not: it also happens on master.

If we repeatedly poke a ship that's offline (~7–8 times) and it then comes back online, and the pokes are nacked, we send a naxplanation together with each nack. (Receiving the naxplanation is what increases current on the sender of the poke, allowing it to send more messages and to release (n)acks from the queue of pending acks to be delivered to the vane.) At some point, somehow, we skip sending a naxplanation.

Because of this we don't increase current, and the subsequent poke-nacks get enqueued in pending-vane-ack and are never delivered to the vane, since we are still waiting to hear that missing naxplanation.
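The stall can be sketched as follows. Again a hypothetical Python model, not ames code: `Sender`, `release`, and `got_naxplanation` are invented names, with `current` and `pending_vane_ack` modeling the state mentioned above.

```python
from collections import deque


class Sender:
    """Toy model of the in-order release of nacks to the vane."""

    def __init__(self):
        self.current = 1                 # next message whose (n)ack can be released
        self.pending_vane_ack = deque()  # nacked seqs awaiting delivery to the vane
        self.got_naxplanation = set()    # seqs whose naxplanation has been heard

    def on_nack(self, seq):
        self.pending_vane_ack.append(seq)
        self.release()

    def on_naxplanation(self, seq):
        self.got_naxplanation.add(seq)
        self.release()

    def release(self):
        # a nack at .current is only delivered once its naxplanation has
        # been heard; a skipped naxplanation stalls everything behind it
        while (self.pending_vane_ack
               and self.pending_vane_ack[0] == self.current
               and self.current in self.got_naxplanation):
            self.pending_vane_ack.popleft()
            self.current += 1


s = Sender()
for seq in (1, 2, 3):
    s.on_nack(seq)       # three pokes nacked while the ship was offline
s.on_naxplanation(1)     # naxplanation for 2 never arrives
# only message 1 is released; 2 and 3 stay stuck in pending-vane-ack
```

In this model, once the naxplanation for seq 2 is skipped, current never advances past 2, so every later nack queues up behind it, matching the behavior described above.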
