pubsub: getter for all paused publishers on a topic, or method to resume all #5905
If the issue is not knowing which ordering key has failed, would it be helpful if the errors returned from `Publish` included the failed ordering key?
Not in my use case. In that context we would generally still know what we had tried to publish when we got the error, since we would be waiting on that result to mark a "thing" complete. The issue I'm facing is more that applications can get into an apparent "broken until restart" condition because of gRPC timeouts (I have an Enterprise Support ticket open looking into why we have seen a sudden increase in those timeouts since early April, but of course I am also looking to make our applications more robust). Calling the resume before each publish, or group of publishes, integrated into our retry logic, is an option, but it seems poor for performance: it involves various mutex and map hits each time, when the vast majority of the time it is not necessary. A higher-level API would let us do this only when it is actually needed.
I think I'm a little confused here and would appreciate some more context. Could you describe a bit more why this is preferred to handling the error within the publish goroutine? Also, if you need to handle this elsewhere, could you keep track of the failed ordering keys in a set/map and later call `ResumePublish` for each? Thanks!
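The suggestion above (track failed ordering keys in a set/map, then resume them later) could be sketched like this. The `pausedKeys` type and its method names are assumptions for illustration; only `Topic.ResumePublish(key)` is a real client call, and it is mentioned here only in a comment.

```go
package main

import (
	"fmt"
	"sync"
)

// pausedKeys is a concurrency-safe set of ordering keys whose publishes
// have failed. The type and method names are illustrative, not client API.
type pausedKeys struct {
	mu   sync.Mutex
	keys map[string]struct{}
}

func newPausedKeys() *pausedKeys {
	return &pausedKeys{keys: make(map[string]struct{})}
}

// MarkFailed records a key after observing a failed publish result.
func (p *pausedKeys) MarkFailed(key string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.keys[key] = struct{}{}
}

// Drain returns all recorded keys and clears the set, so the caller can
// then invoke Topic.ResumePublish(key) for each returned key.
func (p *pausedKeys) Drain() []string {
	p.mu.Lock()
	defer p.mu.Unlock()
	out := make([]string, 0, len(p.keys))
	for k := range p.keys {
		out = append(out, k)
	}
	p.keys = make(map[string]struct{})
	return out
}

func main() {
	p := newPausedKeys()
	p.MarkFailed("order-1")
	p.MarkFailed("order-1") // duplicates collapse into one entry
	p.MarkFailed("order-2")
	fmt.Println(len(p.Drain())) // both distinct keys are returned once
	fmt.Println(len(p.Drain())) // the set is empty after draining
}
```

As the later comments note, the complexity in practice comes from wiring every publishing component through a shared instance of such a set.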
A couple of bits of context for my use case that may help.
Thanks for the clarification. Let me bring this up with some folks (including the Node library maintainer) and get back to you.
Actually, sorry, one more point that I brought up earlier but wasn't addressed: given that you know which ordering keys failed on Publish, is it not possible to keep track of these in a set? Since these errors are not happening that often, there shouldn't be much overhead to keeping this around.
Yes, and that is something we're looking into in our application. There are some complexities there due to potentially multiple components publishing. And there is a bit of frustration when the client already tracks which keys are paused internally but doesn't expose that state.
Yeah, I understand the frustration. However, for the majority of use cases, the best pattern is to handle the error at the point where the publish result is returned. We're not keen on exposing a getter for this internal state.
That makes sense. How would you feel about exposing methods for this, e.g. a getter for the paused ordering keys and a bulk resume?
I think exposing those methods is kind of in the same realm as exposing internal state. Metrics and alerting is something we are looking at more broadly, and some information within the client library may be exposed for summary purposes, but I'm not sure it would necessarily include just the list of paused ordering keys. The request to expose these additional methods is something we'll keep in mind, and we'll see if we get other similar requests for such use cases.
I had a question from a coworker that I wanted to pass along here. For their example, isn't the proper use of ordering keys for them to publish just three messages, each of which would be a set or list of their original msgs: [{msg1, msg2, msg3}, {msg4, msg5, ..., msg400}, and {msg401, msg402, msg403}]? I think this question is slightly unrelated to the original issue, but I wanted to bring it up anyway. Is it possible you might want to batch these messages?
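The batching idea in that question could be sketched like this: group the individual payloads by ordering key so that only one combined message per key needs to be published. `msg` and `batchByKey` are illustrative names (the real `pubsub.Message` carries `Data []byte` and an `OrderingKey` string).

```go
package main

import "fmt"

// msg is a simplified stand-in for pubsub.Message.
type msg struct {
	Key  string // ordering key
	Data string // payload
}

// batchByKey groups payloads so that one combined message per ordering
// key can be published instead of many individual ones.
func batchByKey(msgs []msg) map[string][]string {
	out := make(map[string][]string)
	for _, m := range msgs {
		out[m.Key] = append(out[m.Key], m.Data)
	}
	return out
}

func main() {
	in := []msg{
		{"a", "msg1"}, {"a", "msg2"}, {"a", "msg3"},
		{"b", "msg4"}, {"b", "msg5"},
	}
	batches := batchByKey(in)
	// Two ordering keys, so two combined messages to publish.
	fmt.Println(len(batches), len(batches["a"]), len(batches["b"]))
}
```

Fewer in-flight messages per key also means fewer opportunities for a single failure to pause a long queue of pending ordered publishes.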
Closing this issue since this isn't something we're looking to implement at this time. |
Is your feature request related to a problem? Please describe.
When ordered publishers get stuck on an error, it's difficult to build good application recovery code, because we can neither enumerate the paused publishers nor resume them in bulk.
Describe the solution you'd like
One or both of:
- a getter that returns all ordering keys currently paused on a topic
- a method that resumes publishing for all paused ordering keys at once
Describe alternatives you've considered
Additional context
The same basic problem is affecting us in the NodeJS client: googleapis/nodejs-pubsub#1524