Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gall: don't ack %leave for non-running agents #6953

Merged
merged 2 commits into from Apr 3, 2024
Merged

gall: don't ack %leave for non-running agents #6953

merged 2 commits into from Apr 3, 2024

Conversation

yosoyubik
Copy link
Contributor

@yosoyubik yosoyubik commented Mar 26, 2024

In %gall, %pokes to a non-running agent are dropped, and a %flub task is sent to %ames to "forget" about the message that has been sent by decreasing the heard (but not acked) sequence number, and removing the pending-ack from its queue.

If the %poke is a %leave(s), %gall automatically %acks it (giving a %done to %ames) before sending the %flub, which means that %ames has already remove the pending-ack, and we are going to crash on the %leave, and send a %nack, making the subscriber to retry the %leave forever, every ~m2.

The fix: we just check that the queue is not empty. don't ack %leave for non-running agents

:: so %ames has already removed the pending ack from its queue
::
~(nap to pending-vane-ack.state)
sink(last-heard.state (dec last-heard.state))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than patching this up here, I think it'd be better to change %gall to not unconditionally ack %leave in the case where the app is suspended.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this was my immediate reaction too

@yosoyubik yosoyubik changed the title ames: on flub, check if pending-vane-ack queue is not empty gall: don't ack %leave for non-running agents Mar 26, 2024
Copy link
Member

@joemfb joemfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will do

@pkova pkova merged commit 3c926e5 into develop Apr 3, 2024
1 check passed
@pkova pkova deleted the yu/fix-flub branch April 3, 2024 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants