-
Notifications
You must be signed in to change notification settings - Fork 283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit number of redelivery attempts #728
Comments
ACK it.. this is not being sarcastic ;-). Really, if your application keeps getting a redelivered message and is unable to acknowledge it because say processing of it is failing, and the application really decides that it does not want to have to deal with it, simply ACK it. The server will stop the redelivery. Messages are not removed from the message log because they are ack'ed by consumers. Messages are always in the message log since they may have to be delivered to new consumers starting at any sequence in the log: https://github.com/nats-io/nats-streaming-server#message-log The only advantage of having such configuration in the server would be when you try to limit the redelivery of a message to an application that has exited without calling Close() (therefore server does not know). But with pings from server to client, server will ultimately detect that the client is not responding and will suspend redelivery. The default heartbeat interval is 30seconds. If that's too high, it can be lowered through configuration. |
I understand what you are saying, but this means that the app needs to be aware and keep track of the messages for a period of time in order to detect repeated delivery attempts (even the failed ones) up to a maximum. My first thought was to keep the message id in Redis for a while for idempotency checks. This is not very elegant and it scales up to a point. A better mechanism would be in my opinion to keep track on the server of number of redeliveries and either server or the client to have a mechanism to decide if more redeliveries are allowed. This would imply to have an explicit Nack command or to allow the client to instruct the server to deliver based on a specific policy. Probably, what I need here is to have the properties of a queuing system in nats server. |
@robertmircea if you're getting the same message redelivered over and over, chances are, a single subscriber will see it multiple times - so you don't need to create a counter on Redis for every single message you receive, only the ones that you're seeing twice (meaning they're getting redelivered in a loop). Keep an in-memory counter of the messages you've seen with a cuckoo filter and if you see a message twice, then you start counting its redeliveries on Redis. This should scale a lot better since it won't hit Redis everytime, only when you suspect that a message is getting redelivered. Once you hit enough redelivery attempts, you It won't perfectly track the number of redeliveries if you have many subscribers (since each subscriber will have its own counter that needs to see a message twice), but it should be good enough and scale well. |
if i have invalid message, for example payload of it created with buggy service, consumers can't success ack this message. |
Again, from a consumer perspective, if you want to stop the server to redeliver this message, simply ack it. When the message callback is invoked, and say your application cannot process it, you still can call msg.Ack() on it. What you can't do, is remove a "random" message from a message log. |
my question more from devops perspective, for example i see in prometheus that some service have big error counter in subscribe handler. what can i do? i can't rewrite service now. so what workaround to temporary fix this issue , why service rewritten and deployed? |
I understand this, but this strategy means that on first failed message processing I should remove ( |
Yes,does it poossible to count delivery attemps per durable group? |
I can check attemps and do some client side action, for example serialize event to store and ack message |
Same need. Any updates on this? |
@sujeshthekkepatt I commented on the other issue related to this: #789 |
@kozlovic while I understand that limiting the number of redelivery attempts or implementing any kind of dead-letter queue is out of scope, would it be feasible for NATS Streaming to provide a counter of redeliveries as suggested here (possibly as a new field rather than replacing the existing |
@bmcstdio that would be great. I have done a work around using the same concept where we manually add some metadata like number of retries etc and use that fields for pushing to multi stage poison queues and later to a dead letter queue. But as you said having this built in would be perfect. I think they are working on a new revised version of Stan. |
@bmcstdio Possibly. The caveat is that this number would likely be "valid" only during the runtime of a server. Meaning that I don't think that it would be feasible to persist the delivery count (but maybe?). Also, in clustering mode, when leader changes, it may not have the redelivery count (again, if that info is not stored/replicated). In this worst case scenario (valid only at runtime/leader election), would that still be valuable? |
@kozlovic I think it would still be valuable, yes, but of course the absolute best would be to persist that information. I am not familiar at all with the NATS/NATS Streaming code base, but I'd be willing to contribute to this somehow (either implementing, reviewing or testing). |
Redeliveries counter would be nice to have for us as well! Even if only during the runtime (not persistent). Thanks! |
This is the first step to support server setting a RedeliveryCount in PubMsg when those are redelivered. See nats-io/nats-streaming-server#728 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
@bmcstdio @sujeshthekkepatt @kell18 Ok, I have decided to add RedeliveryCount (only runtime at the moment, and won't survive a leader election if leader is different node). Starting with the client's repo that holds the PubMsg protobuf: nats-io/stan.go#295. |
Server PR will be submitted shortly after we agree on the field name/type in the client repo. Thanks! |
@bmcstdio @sujeshthekkepatt @kell18 Question for you guys.. the redelivery count is per subscription, but in the case of a queue group, would you expect the redelivery count to be for the group, or individual member? |
@kozlovic I'd expect it to be for the group. |
For group.
пт, 20 дек. 2019 г., 18:58 Bruno M. Custódio <notifications@github.com>:
… @kozlovic <https://github.com/kozlovic> I'd expect it to be for the group.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#728?email_source=notifications&email_token=AADVQGYCB2BN42SAUSZRTH3QZTTUHA5CNFSM4GPTUCO2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEHNJTDY#issuecomment-567974287>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AADVQGY4NWZXEC5BZAZBOZTQZTTUHANCNFSM4GPTUCOQ>
.
|
For group for sure! |
Ok, I hear you, but in case you did not notice, if you use clustering, as of now, a message is already redelivered to the same member (subject to change in the future, but as of now that is the case), so that will not make much of a difference :-) |
Guys... closing this one in favor of #996 |
What would be the best strategy to limit number of redeliveries of the same message to subscribers which are failing to ack?
For example: If the messages is not acked several times while attempting delivery to client, I want the message to be redelivered for up to x times and then dropped.
Derived question: Is it possible to keep a counter on server of how many times a message has been redelivered to client?
The text was updated successfully, but these errors were encountered: