sql: add SPLIT AT #8938
Does this need some code that ensures SPLIT is only being run outside a BEGIN? |
Reviewed 9 of 9 files at r1. sql/split.go, line 52 [r1] (raw file):
You need to check for aggregate functions here. Use AggregateInExpr(). If you arbitrarily decide to support neither sub-queries nor placeholders for prepare/execute, then you must reject them explicitly in the code (and confirm via a test that they are rejected properly; right now I fear your code would cause a panic). For this, use the same logic as in MakeColumnDefDescs (sqlbase/table.go) calling SanitizeVarFreeExpr(). Or you could decide to support them, and move the Eval call to a new splitNode's Start() method, add a startSubQueries() somewhere and populate expandPlan() accordingly. sql/split.go, line 68 [r1] (raw file):
This AdminSplit() call should go to Start() and not stay in the constructor. Comments from Reviewable |
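The suggestion to keep side effects out of the constructor and perform them in Start() can be sketched roughly as follows. This is only an illustration under stated assumptions: the `splitter` interface, the `splitNode` fields, `newSplitNode`, and `recordingSplitter` are hypothetical stand-ins and are not the types used in this PR.

```go
package main

// splitter is a hypothetical stand-in for whatever client exposes the
// admin split call; it is not the real KV client interface.
type splitter interface {
	AdminSplit(key []byte) error
}

// splitNode is a hypothetical plan node. It only remembers what to do.
type splitNode struct {
	db   splitter
	key  []byte
	done bool
}

// newSplitNode records the split key but performs no side effects, so
// merely planning (or preparing) the statement does not split anything.
func newSplitNode(db splitter, key []byte) *splitNode {
	return &splitNode{db: db, key: key}
}

// Start performs the split. It runs only when the plan is executed.
func (n *splitNode) Start() error {
	if err := n.db.AdminSplit(n.key); err != nil {
		return err
	}
	n.done = true
	return nil
}

// recordingSplitter is a test double that records the keys it was asked
// to split at.
type recordingSplitter struct {
	keys [][]byte
}

func (r *recordingSplitter) AdminSplit(key []byte) error {
	r.keys = append(r.keys, key)
	return nil
}
```

With this shape, constructing the node is cheap and repeatable, and the split happens exactly once per execution.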
I don't think it's an issue if you run a split inside a transaction. Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed. Comments from Reviewable |
Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed. sql/split_at_test.go, line 44 [r1] (raw file):
Woah no that's the short way to a flaky test. I'm not sure what the better way is though. sql/split_at_test.go, line 85 [r1] (raw file):
There needs to be more logic in here to actually validate the split is taking place. Comments from Reviewable |
Nice. Should this be an
This could also be extended to splitting an index:
|
Nice, that was fast! That should be I kind of like the The values should be parenthesized (similar to the I'm fine with the fact that the split occurs outside of any current transaction. An alternate implementation would be to write the proposed split point into a system table somewhere and have the split queue consult this table (similar to what it does for splitting at table boundaries today). However, at least for our current usage splitting synchronously is better. What permissions are required for Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed. sql/split_at_test.go, line 85 [r1] (raw file):
|
Nice, this will be very useful for writing tests! Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed. sql/split_at_test.go, line 35 [r1] (raw file):
can use sqlrunner to make these shorter, e.g. cockroach/sql/distsql/server_test.go Line 43 in 5498634
sql/split_at_test.go, line 44 [r1] (raw file):
|
Our existing Permissions are Added parens to values. I also added two result columns with the raw key bytes and pretty version. This was to help in testing to verify that the range is split, but I think is just useful in general when using this feature. Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed. sql/split.go, line 52 [r1] (raw file):
|
r := sqlutils.MakeSQLRunner(t, sqlDB) |
sql/split_at_test.go, line 44 [r1] (raw file):
Previously, RaduBerinde wrote…
I agree, things can occasionally be much slower than the average run, especially on CircleCI. Also, this means that a "normal" client that creates a table and then runs SPLIT might encounter the same error? Seems like SPLIT needs to wait for some update to complete as part of its implementation.
Assuming this error comes from AdminSplit, maybe we should run that in a retry loop?
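A retry loop of the kind suggested here might look like the sketch below. It is a minimal illustration, not the code in this PR: `retrySplit`, its parameters, and the `errNotReady` sentinel are hypothetical names, and the assumption is that the split fails with a recognizable transient error until some update has propagated.

```go
package main

import (
	"errors"
	"time"
)

// errNotReady is a hypothetical transient error standing in for whatever
// the split call returns while the needed update has not yet propagated.
var errNotReady = errors.New("split: not ready yet")

// retrySplit calls split until it succeeds, fails with a non-transient
// error, or maxAttempts is exhausted. The wait between attempts doubles
// each time, starting at initialBackoff.
func retrySplit(split func() error, maxAttempts int, initialBackoff time.Duration) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	backoff := initialBackoff
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = split(); err == nil {
			return nil
		}
		// Only the transient error is worth retrying.
		if !errors.Is(err, errNotReady) {
			return err
		}
		if attempt < maxAttempts {
			time.Sleep(backoff)
			backoff *= 2
		}
	}
	return err
}
```

Bounding the number of attempts keeps a persistent failure from hanging the statement, which matters more for a normal client than for a test.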
sql/split_at_test.go, line 85 [r1] (raw file):
Previously, bdarnell (Ben Darnell) wrote…
How about a SHOW SPLITS command (or something like that)?
Review status: 1 of 12 files reviewed at latest revision, 5 unresolved discussions, some commit checks failed. sql/split_at_test.go, line 44 [r1] (raw file):
|
So now I like this code even more than earlier. My concern is that making the statement return result rows now conflicts with making it a variant of ALTER. The conflict comes from the general SQL principle that DDL statements do not return results, and ALTER (in all its variants) is supposed to be a DDL statement. From the documentation / teaching perspective, having to say "ALTER is a DDL statement, except if you use SPLIT" feels wrong: it's very asymmetric and rather arbitrary. Mind that I do not know of technical reasons why a variant of ALTER could not be a non-DDL statement while the others are, so my argument at this point could be deemed "academic". Yet I am concerned about consistency and the ease for users to understand what we're doing. An unknown risk could also be that external SQL tools/drivers decide whether to parse return results based on which statement was sent, instead of using the pgwire statement tag properly. Of course it's theoretical, but there is really no precedent whatsoever for ALTER not always being a DDL statement, so we're sailing in uncharted territory by changing this. Meanwhile I understand you implemented this to ease testing. There are really two positions to take:
My own preference goes for option 2. What do you think? Reviewed 11 of 11 files at r2. sql/split.go, line 74 [r2] (raw file):
You can change this to sql/parser/sql.y, line 1483 [r2] (raw file):
Ok, here I don't have a very strong feeling, but I'd like to suggest prefixing the node name with "AlterTable", like "AlterTableSplit", for consistency with the other Alter node. Same with the planNode btw, perhaps "alterTableSplitNode" instead of just "splitNode". Don't overthink it too much though, your call. But please also consider my comment at the top of the review. Comments from Reviewable |
By the way, regardless of which direction you go, I would suggest defining this builtin function to compute the key encoding anyway, and simplifying your split implementation by requiring just 1 expression of type string as argument to SPLIT (and letting the user specify either an already-encoded constant string literal argument, or a function call to the builtin). If you don't want to do this right away please let me know, I can then file a separate issue to do this in a follow-up. Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed. Comments from Reviewable |
I don't care what the syntax is, though, if we wanted to make it a non ALTER for that reason. But both Ben and Peter appear to think ALTER is the best. However maybe they have new opinions now because of the return data. My vote is to leave things as they are even in the face of your arguments. One problem with adding a SQL function that does what you describe is a user must also specify a table and optional index. I'm not sure how to do that in a function without adding some special syntax (for example similar to the EXTRACT function which supports arbitrary keywords in it). I think a family of functions exposing lower-level information about cockroach would be neat, but I'm not sure how it would work in this case. Review status: 10 of 12 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed. sql/split.go, line 74 [r2] (raw file):
|
cc @sploiselle @jesse: for docs I'd like to see a section about which statements are DDL and which are not, with special mention of ALTER TABLE .. SPLIT in there. See #8990 for the followup with special functions. Reviewed 2 of 2 files at r3. sql/parser/sql.y, line 1483 [r2] (raw file):
|
rename_stmt: |
Reviewed 6 of 6 files at r4. Comments from Reviewable |
@@ -171,6 +171,14 @@ func (ts *TestServer) RPCContext() *rpc.Context {
	return nil
}

// DistSender returns the DistSender used by the TestServer.
func (ts *TestServer) DistSender() *kv.DistSender {
Why is there both DistSender() and GetDistSender?
Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful. server/testserver.go, line 175 [r4] (raw file):
|
Review status: 10 of 13 files reviewed at latest revision, 4 unresolved discussions, some commit checks pending. server/testserver.go, line 175 [r4] (raw file):
|
This enables easy splitting of the KV space by specifying a table and values of its primary index. Fixes #8860
@bdarnell @petermattis Would one of you like to take a final look at this? |
The syntax looks good to me now. I'll trust the other reviewers on the rest of the change. Review status: 10 of 13 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful. Comments from Reviewable |