
revert(message)!: remove nack delay parameter #668

Merged: 5 commits merged into googleapis:master on Jul 1, 2019

Conversation

@callmehiphop (Contributor) commented Jun 24, 2019

  • Tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jun 24, 2019
codecov bot commented Jun 24, 2019

Codecov Report

Merging #668 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #668      +/-   ##
==========================================
- Coverage   97.79%   97.79%   -0.01%     
==========================================
  Files          14       14              
  Lines         861      860       -1     
  Branches      179      179              
==========================================
- Hits          842      841       -1     
  Misses          2        2              
  Partials       17       17
Impacted Files Coverage Δ
src/subscriber.ts 96.29% <100%> (-0.04%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c92a09a...1e183b2. Read the comment docs.

@callmehiphop callmehiphop added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 25, 2019
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 25, 2019
codecov bot commented Jun 25, 2019

Codecov Report

(Same results as the report above after the forced re-run: merging #668 into master will decrease coverage by <.01%, diff coverage 100%, src/subscriber.ts at 96.29% (-0.04%).)

Powered by Codecov. Last update ede5784...96e72f4. Read the comment docs.

@callmehiphop callmehiphop merged commit ca8fe65 into googleapis:master Jul 1, 2019
@manovotny commented
@JustinBeckwith @bcoe @callmehiphop can one of you please explain why this was reverted / removed? I tried searching issues to figure out why on my own, but couldn't find anything.

I believe our team utilized this because the global Pub/Sub delay setting wasn't working properly. Has that been fixed now, and is that why this is no longer needed? Even if the global setting has been fixed, wouldn't there still be value in a lower-level setting for overriding it or performing an incremental backoff?

Thanks in advance! 🙏

@callmehiphop (Contributor, Author) commented
@manovotny I believe this came at the request of the PubSub team. @anguillanneuf do you have any insights here?

@mgabeler-lee-6rs (Contributor) commented
Yes ... not allowing a "retry delay" is very frustrating on the app side. If the app isn't ready to handle the message now, why would it be reasonable to think it will be ready in 10 milliseconds or so?

It's not like the low-level API doesn't have this capability; hiding it from the "friendly" interface is unfortunate.
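
A minimal sketch of the low-level capability referenced here, using the generated v1 client; the subscription path, ackId, and delay value are placeholders:

    import {v1} from '@google-cloud/pubsub';

    const subClient = new v1.SubscriberClient();

    // Push redelivery of a leased message out by ~60 seconds instead of
    // nacking it immediately (a nack amounts to a modAck with deadline 0).
    async function delayRedelivery(subscriptionPath: string, ackId: string): Promise<void> {
      await subClient.modifyAckDeadline({
        subscription: subscriptionPath, // e.g. subClient.subscriptionPath(projectId, subName)
        ackIds: [ackId],
        ackDeadlineSeconds: 60,
      });
    }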

@callmehiphop (Contributor, Author) commented
@mgabeler-lee-6rs technically the API doesn't offer such functionality; the client was just using the API in a way the backend team never intended, which is why they asked us to remove it. The team definitely sees value in this kind of functionality, so that's not to say it won't resurface in some form one day.

I know that using setTimeout + nack will decrease the speed at which you can process messages, but it is probably the best option at this point in time.
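
A minimal sketch of that workaround; the subscription name, delay, and the canProcess/handle helpers are hypothetical stand-ins for application logic:

    import {PubSub, Message} from '@google-cloud/pubsub';

    // Hypothetical application logic, declared only so the sketch compiles.
    declare function canProcess(m: Message): boolean;
    declare function handle(m: Message): void;

    const pubsub = new PubSub();
    const subscription = pubsub.subscription('my-sub'); // placeholder name

    subscription.on('message', (message: Message) => {
      if (canProcess(message)) {
        handle(message);
        message.ack();
      } else {
        // Hold the message and nack it later. It stays leased (and counts
        // against flow control limits) until the nack actually fires.
        setTimeout(() => message.nack(), 30_000); // ~30s before redelivery is requested
      }
    });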

@mgabeler-lee-6rs (Contributor) commented
We tried setTimeout + nack. The problem isn't that it decreases the speed at which we can process messages; it's that it prevents us from ever seeing the messages we need to process before the ones we can't process right away.

@callmehiphop (Contributor, Author) commented
@mgabeler-lee-6rs just to clarify, you are saying that you get stuck at the flow control limits, since you're waiting for all the pending nacks to fire before new messages can come in?
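
For reference, those limits come from the subscriber's flow control options; a sketch with illustrative values and a placeholder subscription name:

    import {PubSub} from '@google-cloud/pubsub';

    const pubsub = new PubSub();

    // At most maxMessages may be outstanding (leased but not yet acked or
    // nacked) at once; messages parked behind a delayed nack occupy these
    // slots for the entire delay.
    const subscription = pubsub.subscription('my-sub', {
      flowControl: {
        maxMessages: 100,
        maxBytes: 10 * 1024 * 1024, // 10 MiB
      },
    });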

@mgabeler-lee-6rs (Contributor) commented Aug 2, 2019

Right: if we only have the memory budget for X messages, and we need to have 10*X messages "in progress" before we see a message that is a prerequisite for one of the first X, we are up the proverbial creek.

Edit: We also found empirically that we tend to get just-nacked messages back more often than the other "later" messages we want to process first.

But if we can tell those first 9*X messages to come back in a few seconds, or maybe a minute, we can blast through and handle the prerequisite messages no problem.

This is one of several things we learned in a painful incident that involved a backlog of over eight million messages on a single subscription one day (another being that nacking messages at all is hazardous to the subscription's health).

It is also just quite icky to have to use application memory in our Kubernetes cluster to do message queueing on behalf of ... a message queueing system.

@callmehiphop (Contributor, Author) commented
@anguillanneuf @kamalaboulhosn have you seen this issue in any of the other clients?

@kamalaboulhosn (Contributor) commented
It seems like this is trying to use the nack delay as a means to get around the fact that Cloud Pub/Sub does not currently support ordering. Using the delay to overcome the lack of ordering would be considered an anti-pattern. If this is a requirement for you, I would recommend pinning to a version of the library that supports it until we provide a solution that better meets your needs.

@mgabeler-lee-6rs (Contributor) commented
@kamalaboulhosn Indeed ... for newer code we have found better patterns to avoid this issue. And pinning at 0.30.1 is exactly what we have done for now :)
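
A sketch of what that pin preserves; the delay argument on nack below, and its units, are an assumption based on this PR's title and the reverted feature, not current documented API:

    import {PubSub, Message} from '@google-cloud/pubsub';

    const pubsub = new PubSub();

    // Pre-revert (<= 0.30.1), nack accepted an optional delay argument;
    // the parameter below is an assumption about that removed signature.
    pubsub.subscription('my-sub').on('message', (message: Message) => {
      message.nack(10); // hypothetical: request redelivery after ~10 seconds
    });

Pinning here means an exact "@google-cloud/pubsub": "0.30.1" entry in package.json, with no ^ or ~ range, so installs don't float past the removal.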

Labels: cla: yes (This human has signed the Contributor License Agreement.)

7 participants