[spicedb] Fix intermittent "Error: 4 DEADLINE_EXCEEDED...Waiting for LB pick" #20637

Merged
merged 4 commits into from
Mar 17, 2025

Conversation

geropl
Member

@geropl geropl commented Feb 27, 2025

Description

This PR is a fresh attempt at fixing the gRPC error Error: 4 DEADLINE_EXCEEDED...Waiting for LB pick, which we see every couple of weeks between server and spicedb. It always seems to be triggered by the spicedb pods restarting while we have active/new requests coming in.

Most likely it's fixed by bumping grpc/grpc-js, as there were a couple of potential fixes between 1.10.8 and 1.12.6 [1, 2, 3, 4, ....].

It turned out we still get the Waiting for LB pick error, where the client stalls for ~120s before connecting to upstream again, although the pod is up and Ready after ~5-8s.

So we are catching that specific case now, and retrying the call with a fresh client.
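
A minimal sketch of the "catch that specific case and retry with a fresh client" idea, not the code from this PR. The names `isWaitingForLbPick` and `FreshClientRetrier`, the attempt limit, and matching on the error text are all assumptions made for illustration; the sketch is deliberately independent of grpc-js so it can run on its own.

```typescript
// Numeric value of the gRPC DEADLINE_EXCEEDED status code.
const DEADLINE_EXCEEDED = 4;

// Minimal shape of a gRPC-style error (assumption: code + details/message).
interface GrpcLikeError {
    code?: number;
    details?: string;
    message?: string;
}

// Hypothetical helper: true only for DEADLINE_EXCEEDED errors whose text
// mentions "Waiting for LB pick".
function isWaitingForLbPick(err: unknown): boolean {
    if (typeof err !== "object" || err === null) {
        return false;
    }
    const e = err as GrpcLikeError;
    const text = `${e.details ?? ""} ${e.message ?? ""}`;
    return e.code === DEADLINE_EXCEEDED && text.includes("Waiting for LB pick");
}

// Hypothetical wrapper: caches one client and replaces it with a fresh one
// whenever a call fails with the specific error above.
class FreshClientRetrier<C> {
    private client: C | undefined;

    constructor(
        private readonly createClient: () => C,
        private readonly maxAttempts: number = 3,
    ) {}

    async call<T>(fn: (client: C) => Promise<T>): Promise<T> {
        let lastError: unknown;
        for (let attempt = 1; attempt <= this.maxAttempts; attempt++) {
            if (this.client === undefined) {
                this.client = this.createClient();
            }
            const client = this.client;
            try {
                return await fn(client);
            } catch (err) {
                if (!isWaitingForLbPick(err)) {
                    throw err; // unrelated errors are not retried here
                }
                lastError = err;
                // Drop the stalled client; a real client should also be closed.
                this.client = undefined;
            }
        }
        throw lastError;
    }
}
```

With a real client, `createClient` would construct the SpiceDB client and the dropped instance would be closed before being discarded; both are left out here to keep the sketch self-contained.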

Next steps:

  • load test in cluster (while manually tearing down spicedb)
  • ❓ controlled, temporary rollout to an internal installation

Related Issue(s)

Fixes CLC-370

How to test

Documentation

Preview status

gitpod:summary

Build Options

Build
  • /werft with-werft
    Run the build with werft instead of GHA
  • leeway-no-cache
  • /werft no-test
    Run Leeway with --dont-test
Publish
  • /werft publish-to-npm
  • /werft publish-to-jb-marketplace
Installer
  • analytics=segment
  • with-dedicated-emulation
  • workspace-feature-flags
    Add desired feature flags to the end of the line above, space separated
Preview Environment / Integration Tests
  • /werft with-local-preview
    If enabled this will build install/preview
  • /werft with-preview
  • /werft with-large-vm
  • /werft with-gce-vm
    If enabled this will create the environment on GCE infra
  • /werft preemptible
    Saves cost. Untick this only if you're really sure you need a non-preemptible machine.
  • with-integration-tests=all
    Valid options are all, workspace, webapp, ide, jetbrains, vscode, ssh. If enabled, with-preview and with-large-vm will be enabled.
  • with-monitoring

/hold


socket-security bot commented Feb 27, 2025

Updated dependencies detected. Learn more about Socket for GitHub ↗︎

Package New capabilities Transitives Size Publisher
npm/@authzed/authzed-node@0.15.0 → 1.2.2 None +3 6.47 MB authzednpm
npm/@grpc/grpc-js@1.10.8 → 1.12.6 None +33 8.54 MB grpc-packages, murgatroid99, nicolasnoble

View full report↗︎

@geropl geropl force-pushed the gpl/370-grpcjs branch 2 times, most recently from 560e4dc to a2465ac Compare March 6, 2025 09:27
@roboquat roboquat added size/XL and removed size/L labels Mar 6, 2025
Contributor

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the meta: stale This issue/PR is stale and will be closed soon label Mar 16, 2025
geropl added 4 commits March 17, 2025 08:05
…5.0 -> 1.2.2

Tool: gitpod/catfood.gitpod.cloud
 - instead of doing retries on two levels, rely on the gRPC-level retries
 - to mitigate the loss of insights, introduce createDebugLogInterceptor
 - client options: use sane defaults derived from the documentation instead of the excessive ones we had in place before
 - use "waitForReady" option: it should a) make our calls more responsive on re-connects, while b) being as reliable as before, because we keep re-trying on DEADLINE_EXCEEDED
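
For orientation, gRPC-level retries combined with waitForReady can be expressed through channel options roughly as sketched below. This is an illustration, not the configuration from these commits: the variable names are hypothetical and every number (attempts, backoff, keepalive) is a placeholder. The option keys (`grpc.service_config`, `grpc.enable_retries`, `grpc.keepalive_time_ms`, `grpc.keepalive_timeout_ms`) and the service config fields follow the gRPC service config format as supported by grpc-js.

```typescript
// Service config enabling gRPC-level retries and waitForReady for all methods.
// All values are illustrative placeholders.
const retryServiceConfig = {
    methodConfig: [
        {
            name: [{}], // an empty name entry applies the config to every method
            waitForReady: true, // queue calls until the channel is ready
            retryPolicy: {
                maxAttempts: 5,
                initialBackoff: "0.1s",
                maxBackoff: "1s",
                backoffMultiplier: 2,
                retryableStatusCodes: ["UNAVAILABLE", "DEADLINE_EXCEEDED"],
            },
        },
    ],
};

// Channel options as they would be passed to a grpc-js client constructor.
const clientOptions: Record<string, string | number> = {
    "grpc.service_config": JSON.stringify(retryServiceConfig),
    "grpc.enable_retries": 1,
    "grpc.keepalive_time_ms": 30_000,
    "grpc.keepalive_timeout_ms": 5_000,
};
```

Retrying on DEADLINE_EXCEEDED only helps if each attempt gets its own deadline, so the per-call deadline and the retry policy need to be chosen together.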

Tool: gitpod/catfood.gitpod.cloud
Tool: gitpod/catfood.gitpod.cloud
@geropl
Member Author

geropl commented Mar 17, 2025

This is ready to be merged now, see https://linear.app/gitpod/issue/CLC-370/grpcgrcp-js-deadline-exceeded-waiting-for-lb-pick#comment-7e018052 for details on how this has been tested and the results.

Contributor

@corneliusludmann corneliusludmann left a comment

looks good to me 👍

@geropl
Member Author

geropl commented Mar 17, 2025

/unhold

@roboquat roboquat merged commit 5d557f7 into main Mar 17, 2025
34 checks passed
@roboquat roboquat deleted the gpl/370-grpcjs branch March 17, 2025 09:13