crdb: add a connection-balancing retry-aware connection pool #1294
Merged
github-actions (bot) added the area/datastore, area/dependencies, and area/tooling labels on May 1, 2023
ecordell force-pushed the crdb-balanced-retry branch 2 times, most recently from 5379e47 to a720a00 (May 3, 2023 12:23)
ecordell force-pushed the crdb-balanced-retry branch 10 times, most recently from 8306700 to dc0b09d (May 6, 2023 20:35)
This was referenced May 6, 2023
ecordell force-pushed the crdb-balanced-retry branch from dc0b09d to 65691b7 (May 6, 2023 20:50)
ecordell force-pushed the crdb-balanced-retry branch 4 times, most recently from c73972d to eb29ad2 (May 10, 2023 19:06)
ecordell force-pushed the crdb-balanced-retry branch 2 times, most recently from c574f6e to 3b1e714 (May 12, 2023 18:01)
ecordell force-pushed the crdb-balanced-retry branch 7 times, most recently from 90539ce to 6b5924d (May 18, 2023 21:29)
vroldanbet previously approved these changes on May 19, 2023
After some serious load-testing of this thing, I believe this is ready to go! Fantastic work @ecordell 👏🏻 💯
ecordell force-pushed the crdb-balanced-retry branch from 6b5924d to 4de7d06 (May 19, 2023 17:02)
ecordell force-pushed the crdb-balanced-retry branch 4 times, most recently from 56113ce to 336c67e (May 19, 2023 17:11)
vroldanbet reviewed on May 19, 2023
ecordell force-pushed the crdb-balanced-retry branch 2 times, most recently from 3f922a9 to 98b5d4a (May 19, 2023 18:49)
We used a "transaction factory" everywhere but never used it to create a transaction (ever since we switched to implicit transactions). This removes the extra abstraction. It also gives the DBReader interface a better name, since it can be used for more than just reading.
This ensures that small network blips don't inadvertently mark a node as unhealthy, and that new nodes coming online don't get flooded with new connections.
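The blip tolerance described in the commit message above can be illustrated with a small sketch. This is not the code from this PR: NodeHealth, its methods, and the threshold of 3 are hypothetical names and values chosen for illustration. The sketch assumes a node is only treated as unhealthy after several consecutive failures, so a single transient error does not flip its state. It does not cover the second half of the message (limiting how quickly connections move to a new node).

```go
package main

import (
	"fmt"
	"sync"
)

// NodeHealth is a hypothetical tracker (not a SpiceDB type). A node is
// reported unhealthy only after `threshold` consecutive failures.
type NodeHealth struct {
	mu        sync.Mutex
	threshold int
	failures  map[uint32]int // consecutive failures per node ID
}

// NewNodeHealth returns a tracker with the given failure threshold.
func NewNodeHealth(threshold int) *NodeHealth {
	return &NodeHealth{threshold: threshold, failures: make(map[uint32]int)}
}

// RecordFailure counts one more consecutive failure for the node.
func (h *NodeHealth) RecordFailure(node uint32) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.failures[node]++
}

// RecordSuccess clears the failure streak for the node.
func (h *NodeHealth) RecordSuccess(node uint32) {
	h.mu.Lock()
	defer h.mu.Unlock()
	delete(h.failures, node)
}

// Healthy reports whether the node is still below the failure threshold.
func (h *NodeHealth) Healthy(node uint32) bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	return h.failures[node] < h.threshold
}

func main() {
	h := NewNodeHealth(3) // assumed threshold, for illustration only

	h.RecordFailure(1)        // a single blip
	fmt.Println(h.Healthy(1)) // true

	h.RecordFailure(1)
	h.RecordFailure(1)        // three failures in a row
	fmt.Println(h.Healthy(1)) // false

	h.RecordSuccess(1)        // node recovered
	fmt.Println(h.Healthy(1)) // true
}
```

A consecutive-failure counter is only one way to get this behavior; a time-windowed error rate would be another reasonable choice.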
ecordell force-pushed the crdb-balanced-retry branch from 98b5d4a to b084db5 (May 19, 2023 20:05)
vroldanbet approved these changes on May 22, 2023
Labels
area/CLI: Affects the command line
area/datastore: Affects the storage system
area/dependencies: Affects dependencies
area/tooling: Affects the dev or user toolchain (e.g. tests, ci, build tools)
This PR adds connection balancing and a RetryPool that exposes a query interface that can correctly perform CRDB connection reset / retry logic.

Together, these allow SpiceDB to have long-lived connections to CRDB (longer than the 5m maximum without this) but still survive the loss of Cockroach nodes (especially during an upgrade). The additional connection management doesn't require admin permission and doesn't require any additional roundtrips (in contrast to the previous approaches in #1283 and #1284).
Cockroach's view of connections during a CRDB upgrade:
vs. SpiceDB's view:
API availability during the upgrade:
It's always possible that a request will be unlucky and unable to find a good connection to use before timing out, but it seems to be within acceptable levels (and is expected when we're reactive to changes like this).
Fixes #1295