New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

distsqlrun: swallow uncertainty errors when draining #32401

Merged
merged 1 commit into from Nov 26, 2018

Conversation

Projects
None yet
4 participants
@andreimatei
Copy link
Member

andreimatei commented Nov 15, 2018

For the motivation, quoting from the patch:

We're already draining, so we don't care about whatever data generated
this uncertainty error. Besides generally seeming like a good idea,
doing this allows us to offer a nice guarantee to SQL clients: a
read-only query that produces at most one row, run as an implicit txn,
never produces retriable errors, regardless of the size of the row being
returned (in relation to the size of the result buffer on the
connection). One would naively expect that to be true: either the error
happens before any rows have been delivered to the client, in which case
the auto-retries kick in, or, if a row has been delivered, then the
query is done and so how can there be an error? What our naive friend is
ignoring is that, if it weren't for this code, it'd be possible for a
retriable error to sneak in after the query's limit has been satisfied
but while processors are still draining. Note that uncertainty errors
are not retried automatically by the leaf TxnCoordSenders (i.e. by
refresh txn interceptor).

Release note: None

@andreimatei andreimatei requested review from cockroachdb/core-prs as code owners Nov 15, 2018

@cockroach-teamcity

This comment has been minimized.

Copy link
Member

cockroach-teamcity commented Nov 15, 2018

This change is Reviewable

@andreimatei

This comment has been minimized.

Copy link
Member

andreimatei commented Nov 15, 2018

Everything but the last commit is elsewhere.

@jordanlewis

This comment has been minimized.

Copy link
Member

jordanlewis commented Nov 15, 2018

I'm tagging @asubiotto on this as well.

@andreimatei andreimatei force-pushed the andreimatei:distsql-retry-err branch from 4d4c7a6 to aa35dd7 Nov 15, 2018

@jordanlewis
Copy link
Member

jordanlewis left a comment

:lgtm:

Reviewed 11 of 11 files at r1, 5 of 5 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/distsqlrun/processors.go, line 711 at r3 (raw file):

			if err := meta.Err; err != nil {
				// We only look for UnhandledRetryableErrors. Local reads (which would
				// be transormed by the Root TxnCoordSender into HandledRetryableErrors)

nit: s/transormed/transformed/


pkg/sql/pgwire/conn.go, line 105 at r1 (raw file):

// TODO(andrei): This setting is under "sql.defaults", but there's no way to
// control the setting on a per-connection basis. We should introduce a
// corresponding session variable.

Can you do this as part of this PR? It feels incomplete otherwise, and hard to test with. Or maybe it's hard for plumbing reasons?


pkg/sql/pgwire/server.go, line 52 at r2 (raw file):

// control the setting on a per-connection basis. We should introduce a
// corresponding session variable.
var connResultsBufferSize = settings.RegisterByteSizeSetting(

No big deal, but looks like you amended the 2nd commit instead of the first to move this setting into server.go...

@andreimatei andreimatei force-pushed the andreimatei:distsql-retry-err branch from aa35dd7 to dc95259 Nov 26, 2018

@andreimatei
Copy link
Member

andreimatei left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/distsqlrun/processors.go, line 711 at r3 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

nit: s/transormed/transformed/

done


pkg/sql/pgwire/conn.go, line 105 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Can you do this as part of this PR? It feels incomplete otherwise, and hard to test with. Or maybe it's hard for plumbing reasons?

The first 2 commits are https://reviewable.io/reviews/cockroachdb/cockroach/32366, btw.
I'm not interested in adding the session variable myself, because of both effort and also because I don't think it'd be particularly useful at the session level. What you really want is using it at the query level, but how to do that is not clear.


pkg/sql/pgwire/server.go, line 52 at r2 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

No big deal, but looks like you amended the 2nd commit instead of the first to move this setting into server.go...

well it was on purpose; in the first commit the setting made sense in a different file

distsqlrun: swallow uncertainty errors when draining
For the motivation, quoting from the patch:

We're already draining, so we don't care about whatever data generated
this uncertainty error. Besides generally seeming like a good idea,
doing this allows us to offer a nice guarantee to SQL clients: a
read-only query that produces at most one row, run as an implicit txn,
never produces retriable errors, regardless of the size of the row being
returned (in relation to the size of the result buffer on the
connection). One would naively expect that to be true: either the error
happens before any rows have been delivered to the client, in which case
the auto-retries kick in, or, if a row has been delivered, then the
query is done and so how can there be an error? What our naive friend is
ignoring is that, if it weren't for this code, it'd be possible for a
retriable error to sneak in after the query's limit has been satisfied
but while processors are still draining. Note that uncertainty errors
are not retried automatically by the leaf TxnCoordSenders (i.e. by
refresh txn interceptor).

Release note (sql change): Some categories of SELECT queries that return 0 or 1
rows (namely, queries by a PK or a unique index or LIMIT 1 queries) are
now guaranteed to not return retriable errors when running as implicit
transactions (i.e., outside of a BEGIN...COMMIT block).

@andreimatei andreimatei force-pushed the andreimatei:distsql-retry-err branch from dc95259 to 3e59c35 Nov 26, 2018

@andreimatei
Copy link
Member

andreimatei left a comment

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

craig bot pushed a commit that referenced this pull request Nov 26, 2018

Merge #32401
32401: distsqlrun: swallow uncertainty errors when draining r=andreimatei a=andreimatei

For the motivation, quoting from the patch:

We're already draining, so we don't care about whatever data generated
this uncertainty error. Besides generally seeming like a good idea,
doing this allows us to offer a nice guarantee to SQL clients: a
read-only query that produces at most one row, run as an implicit txn,
never produces retriable errors, regardless of the size of the row being
returned (in relation to the size of the result buffer on the
connection). One would naively expect that to be true: either the error
happens before any rows have been delivered to the client, in which case
the auto-retries kick in, or, if a row has been delivered, then the
query is done and so how can there be an error? What our naive friend is
ignoring is that, if it weren't for this code, it'd be possible for a
retriable error to sneak in after the query's limit has been satisfied
but while processors are still draining. Note that uncertainty errors
are not retried automatically by the leaf TxnCoordSenders (i.e. by
refresh txn interceptor).

Release note: None

Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
@craig

This comment has been minimized.

Copy link

craig bot commented Nov 26, 2018

Build succeeded

@craig craig bot merged commit 3e59c35 into cockroachdb:master Nov 26, 2018

3 checks passed

GitHub CI (Cockroach) TeamCity build finished
Details
bors Build succeeded
Details
license/cla Contributor License Agreement is signed.
Details

@andreimatei andreimatei deleted the andreimatei:distsql-retry-err branch Nov 26, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment