[spicedb] Fix intermittent "Error: 4 DEADLINE_EXCEEDED...Waiting for LB pick" #20637
…5.0 -> 1.2.2 Tool: gitpod/catfood.gitpod.cloud
- instead of doing retries on two levels, rely on the gRPC-level retries
- to mitigate the loss of insights, introduce createDebugLogInterceptor
- client options: use sane defaults derived from the documentation instead of the excessive ones we had in place before
- use "waitForReady" option: it should a) make our calls more responsive on re-connects, while b) - because we keep re-trying on DEADLINE_EXCEEDED - should be as reliable as before
Tool: gitpod/catfood.gitpod.cloud
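As a rough illustration of what "rely on the gRPC-level retries" combined with "waitForReady" could look like, here is a minimal sketch of a gRPC service config with a retry policy. The function names and all numeric values are hypothetical and not taken from this PR. The sketch assumes the standard gRPC service config JSON format and the `grpc.service_config` and `grpc.enable_retries` channel options. The PR may instead set `waitForReady` per call rather than in the service config.

```typescript
// Hypothetical sketch: all concrete values below are assumptions, not the PR's settings.
interface RetryPolicy {
    maxAttempts: number;
    initialBackoff: string; // duration string, e.g. "0.1s"
    maxBackoff: string;
    backoffMultiplier: number;
    retryableStatusCodes: string[];
}

interface MethodConfig {
    name: { service?: string; method?: string }[];
    waitForReady?: boolean;
    retryPolicy?: RetryPolicy;
}

// Builds a service config that applies one retry policy to all methods.
function buildServiceConfig(): { methodConfig: MethodConfig[] } {
    return {
        methodConfig: [
            {
                // An empty name entry is meant as the default for every service/method.
                name: [{}],
                // Queue calls while the channel is (re-)connecting instead of failing fast.
                waitForReady: true,
                retryPolicy: {
                    maxAttempts: 5,
                    initialBackoff: "0.1s",
                    maxBackoff: "2s",
                    backoffMultiplier: 2,
                    // Keep retrying on DEADLINE_EXCEEDED, as the commit message describes.
                    retryableStatusCodes: ["UNAVAILABLE", "DEADLINE_EXCEEDED"],
                },
            },
        ],
    };
}

// Channel options in the shape a grpc-js client constructor accepts as its options argument.
function buildClientOptions(): Record<string, string | number> {
    return {
        "grpc.service_config": JSON.stringify(buildServiceConfig()),
        "grpc.enable_retries": 1,
    };
}
```

With retries handled at this level, an application-level retry loop around each call becomes redundant, which is the motivation given in the commit message for dropping the second retry layer.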
Tool: gitpod/catfood.gitpod.cloud
…k" error Tool: gitpod/catfood.gitpod.cloud
This is ready to be merged now, see https://linear.app/gitpod/issue/CLC-370/grpcgrcp-js-deadline-exceeded-waiting-for-lb-pick#comment-7e018052 for details on how this has been tested and the results.
looks good to me 👍
/unhold
Description
This PR is a fresh attempt at fixing the gRPC error
Error: 4 DEADLINE_EXCEEDED...Waiting for LB pick
that we see every couple of weeks between server and spicedb. It always seems to be triggered by the spicedb pods restarting while we have active/new requests coming in.
The initial assumption was that bumping grpc/grpc-js would most likely fix it, as there were a couple of potential fixes between 1.10.8 and 1.12.6 [1, 2, 3, 4, ....].
It turned out we still get the Waiting for LB pick error, where the client stalls for ~120s before connecting to upstream again, although the pod is up and Ready after ~5-8s.
So we are now catching that specific case and retrying the call with a fresh client.
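The "catch that specific case and retry with a fresh client" approach can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the helper names (isWaitingForLbPick, callWithFreshClientOnStall, getClient, resetClient) are hypothetical, and the error is assumed to carry a numeric code plus the "Waiting for LB pick" text in its message or details.

```typescript
// gRPC status code 4, as seen in "Error: 4 DEADLINE_EXCEEDED".
const DEADLINE_EXCEEDED = 4;

// Minimal shape of the error fields this sketch inspects (assumption).
interface GrpcLikeError {
    code?: number;
    message?: string;
    details?: string;
}

// Returns true only for DEADLINE_EXCEEDED errors that mention the LB pick stall.
function isWaitingForLbPick(err: unknown): boolean {
    const e = err as GrpcLikeError | undefined;
    if (!e || e.code !== DEADLINE_EXCEEDED) {
        return false;
    }
    const text = `${e.message ?? ""} ${e.details ?? ""}`;
    return text.includes("Waiting for LB pick");
}

// Runs the call once; if it fails with the LB pick stall, retries once on a fresh client.
// getClient returns the current (possibly stalled) client, resetClient creates a new one.
async function callWithFreshClientOnStall<C, T>(
    getClient: () => C,
    resetClient: () => C,
    call: (client: C) => Promise<T>,
): Promise<T> {
    try {
        return await call(getClient());
    } catch (err) {
        if (!isWaitingForLbPick(err)) {
            throw err; // unrelated errors propagate unchanged
        }
        return await call(resetClient());
    }
}
```

Retrying only once and only for this exact error keeps the workaround narrow: other failures still surface as before, and the gRPC-level retries remain the primary mechanism.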
Next steps:
load test in cluster (while manually tearing down spicedb)
Related Issue(s)
Fixes CLC-370
How to test
Documentation
Preview status
gitpod:summary
Build Options
Build
Run the build with werft instead of GHA
Run Leeway with --dont-test
Publish
Installer
Add desired feature flags to the end of the line above, space separated
Preview Environment / Integration Tests
If enabled this will build install/preview
If enabled this will create the environment on GCE infra
Saves cost. Untick this only if you're really sure you need a non-preemptible machine.
Valid options are all, workspace, webapp, ide, jetbrains, vscode, ssh. If enabled, with-preview and with-large-vm will be enabled.
/hold