Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manual acknowledgments #57

Merged
merged 5 commits into from
Jul 2, 2021
Merged

Manual acknowledgments #57

merged 5 commits into from
Jul 2, 2021

Conversation

fracasula
Copy link
Contributor

This PR is to address this issue: #53

We want the ability to manually control the PUBACK/PUBREC transmission in order to avoid message loss.

The idea is to acknowledge a message only after it has been processed successfully by a client.

Changes

  • the default behaviour is not changed, acks will be sent in order automatically by the client, right after routing
  • if the users want to manually enable acknowledgments they just have to set EnableManualAcknowledgment to true and then call Ack() in any order. the client will take care of sending the acks to the server back in right order.
    • this is something I've seen implemented in other MQTT clients (e.g. HiveMQ Java client)
  • a background routine will periodically try to flush the acks in order, by acquiring a lock, as often as configured via the SendAcksInterval attribute

Sidenotes

  • this code has been running for several days with a few clients in our staging environment with no issues (with manual acks enabled)
  • we're planning to go live in production with this branch in 1-2 weeks here at Netdata which should give us a certain degree of certainty that it works and performs well. of course we'll keep monitoring it afterwards as well.

@alsm
Copy link
Contributor

alsm commented Jun 8, 2021

Thanks, I'll take a look at it, have a poke, see how it feels.

@fracasula
Copy link
Contributor Author

@alsm I just wanted to give you a quick update. We've been running this code in production for a few days now.
The only issues that we came across was about a panic unrelated to these changes. I've fixed it here as well but I'm planning to remove the commit from here and rebase once we manage to merge this hotfix.

I've created a Grafana dashboard to monitor these new deployments as well, specifically to see if there would be missing acknowledgments. Given that our VerneMQ is set to drop messages once a specific threshold of unacked messages is reached, looking at these kind of metrics help us detect anomalies.

Screenshot from 2021-06-16 15-14-39

We couldn't see any dropped messages and the PUBACKs received by VerneMQ seems to be consistent with what the previous version was giving us.

As a sidenote, I got the confirmation from the HiveMQ guys that they buffer acknowledgments similarly to what I'm doing here. I think this should give us enough confidence given that they had the client behaving like that for a long time now.

@alsm
Copy link
Contributor

alsm commented Jun 23, 2021

I like what this does, I've just been wondering if there's a nice way to do it to make it more customisable. For example Pinger, Auther, Persistence and Router are all interfaces, while the code you've proposed meets your particular need do I/we want that to be the only method available. If I can't come up with something shortly I may merge this as is with a view to coming back to it again in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants