vt: case preserve column names #1688
LGTM on splitquery after you address my comments.
@erzel code review comments
Review status: 4 of 47 files reviewed at latest revision, 3 unresolved discussions. go/vt/tabletserver/splitquery/full_scan_algorithm.go, line 198 [r1] (raw file):
Reviewed 40 of 47 files at r1.
go/cistring/cistring.go, line 15 [r1] (raw file):
Can you add to the doc comment an explanation of why you chose the "extra storage" implementation rather than "lowercase on compare"? Is it mainly for the CPU during comparisons? Or do you expect the conversion-on-compare to put extra load on the GC?
go/cistring/cistring.go, line 18 [r1] (raw file):
At lunch, we discussed the zero byte version of this, used here: https://github.com/youtube/vitess/blob/master/go/vt/mysqlctl/replication/replication.go#L31 Will that work here?
go/cistring/cistring.go, line 22 [r1] (raw file):
This should be called
go/cistring/cistring.go, line 34 [r1] (raw file):
The name
go/cistring/cistring.go, line 46 [r1] (raw file):
For cases where you want to lowercase once and compare in a loop, maybe you could recommend a pattern like this:
    ciInput := ci.NewString(input)
    for ... {
        if ciName.Equal(ciInput) {
            ...
        }
    }
This has the benefit of avoiding manual calls to
go/cistring/cistring.go, line 48 [r1] (raw file):
Do you ever need to compare
go/vt/sqlparser/ast.go, line 1746 [r1] (raw file):
If you embed, you won't have to duplicate the
    type ColIdent struct {
        cistring.CIString
    }
Is there a reason you prefer wrapping over embedding?
test/vtgatev3_test.py, line 359 [r1] (raw file):
Is there a test for giving the case used in schema if they do
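To make the "lowercase once, compare in a loop" suggestion concrete, here is a minimal sketch of the "extra storage" approach. It assumes a type that stores both the original and the lowercased form; the names `NewString`, `Original`, `Equal`, `EqualString`, and `FindColumn` are illustrative and may not match the actual go/cistring API.

```go
package main

import "strings"

// CIString is a hypothetical case-insensitive string. It keeps the original
// spelling plus a copy that is lowercased once ("extra storage").
type CIString struct {
	val     string // original case, preserved for display
	lowered string // lowercased at construction, used for comparisons
}

// NewString lowercases the input a single time.
func NewString(s string) CIString {
	return CIString{val: s, lowered: strings.ToLower(s)}
}

// Original returns the string with its case preserved.
func (s CIString) Original() string { return s.val }

// Equal compares the stored lowercase forms; it does no conversion.
func (s CIString) Equal(in CIString) bool { return s.lowered == in.lowered }

// EqualString lowercases the plain string on every call, which is the
// per-comparison cost the loop pattern avoids.
func (s CIString) EqualString(in string) bool {
	return s.lowered == strings.ToLower(in)
}

// FindColumn converts the input once, then compares inside the loop.
// It returns the index of the matching column, or -1 if there is none.
func FindColumn(columns []CIString, input string) int {
	ciInput := NewString(input)
	for i, col := range columns {
		if col.Equal(ciInput) {
			return i
		}
	}
	return -1
}
```

With this shape, the conversion cost is paid once per input rather than once per comparison, while `Original` still returns the case the caller supplied.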
@enisoc code review comments
Review status: 23 of 47 files reviewed at latest revision, 11 unresolved discussions. go/cistring/cistring.go, line 15 [r1] (raw file):
LGTM for the parts outside splitquery.
Review status: 44 of 47 files reviewed at latest revision, 4 unresolved discussions. go/cistring/cistring.go, line 18 [r1] (raw file):
Use the newly suggested zero-size construct that prevents the struct from being compared with itself. The previous construct used a nil function pointer that consumed one word.
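The difference between the two constructs can be sketched as follows. The type and function names here are hypothetical; the zero-size field follows the pattern in the replication.go file linked in the review, and it is placed first because a zero-size field at the end of a struct can cause padding.

```go
package main

import "unsafe"

// plain is the baseline: comparable with ==, no extra storage.
type plain struct {
	val string
}

// withFunc sketches the earlier approach: a func field makes the struct
// non-comparable, but the (nil) function pointer still occupies one word.
type withFunc struct {
	_   func()
	val string
}

// withZeroSize sketches the zero-size approach: a zero-length array whose
// element type contains a slice is non-comparable and takes no space.
type withZeroSize struct {
	_   [0]struct{ notComparable []byte }
	val string
}

// ExtraBytesFunc reports how many bytes the func field adds.
func ExtraBytesFunc() uintptr {
	return unsafe.Sizeof(withFunc{}) - unsafe.Sizeof(plain{})
}

// ExtraBytesZeroSize reports how many bytes the zero-size field adds.
func ExtraBytesZeroSize() uintptr {
	return unsafe.Sizeof(withZeroSize{}) - unsafe.Sizeof(plain{})
}

// Uncommenting the following would fail to compile, because neither
// struct type is comparable:
//
//	var a, b withZeroSize
//	_ = a == b
```

Both variants turn an accidental `==` into a compile error; only the size differs.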
Review status: 42 of 47 files reviewed at latest revision, 4 unresolved discussions. go/cistring/cistring.go, line 18 [r1] (raw file):
Reviewed 2 of 2 files at r4. Comments from Reviewable |
Signed-off-by: Vitess Cherry-Pick Bot <vitess-cherrypick-bot@planetscale.com> Co-authored-by: Vitess Cherry-Pick Bot <vitess-cherrypick-bot@planetscale.com>
@erzel for split query
@michael-berlin for vtworker
@enisoc and @alainjobart for the rest
I've created two interchangeable types: cistring.CIString and sqlparser.ColIdent. Those who don't want to depend on sqlparser are supposed to use CIString. Others can use either type depending on which is more convenient. sqlparser.ColIdent has the necessary functions to be in the AST.
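One way two such types can stay interchangeable is for the AST type to embed the base type, as suggested in the review. The sketch below shows that option only; whether the merged code wraps or embeds is not stated here, and `NewCIString`, `NewColIdent`, and `Format` are hypothetical names standing in for whatever the AST actually requires.

```go
package main

import "strings"

// CIString is a hypothetical stand-in for cistring.CIString.
type CIString struct {
	val, lowered string
}

// NewCIString stores the original and a lowercased copy.
func NewCIString(s string) CIString {
	return CIString{val: s, lowered: strings.ToLower(s)}
}

// Original returns the case-preserved string.
func (s CIString) Original() string { return s.val }

// Equal compares case-insensitively.
func (s CIString) Equal(in CIString) bool { return s.lowered == in.lowered }

// ColIdent embeds CIString, so Original and Equal are promoted and do not
// need to be duplicated on the AST type.
type ColIdent struct {
	CIString
}

// NewColIdent builds a ColIdent from a plain string.
func NewColIdent(s string) ColIdent {
	return ColIdent{CIString: NewCIString(s)}
}

// Format is a hypothetical AST-specific method: it renders the identifier
// backtick-quoted, using the preserved case.
func (c ColIdent) Format() string {
	return "`" + c.Original() + "`"
}
```

Code that must not depend on sqlparser would use only `CIString`; code that already has a `ColIdent` can reach the embedded value through its `CIString` field.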
Cases are mostly preserved. There are some exceptions:
The general approach I've used is to convert case-insensitive data into CI types as early as possible. This way, the compiler is likely to catch more accidental comparisons. The main caveat is that this does not work in situations where a variable gets anonymized into an interface. I found a few places: Printf, DeepEqual & JSON. Hopefully, there shouldn't be more. If there are, we just have to fix forward.
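The interface caveat can be illustrated with a small sketch. It assumes a CI type that stores both the original and lowercased forms (hypothetical names throughout); the exact problems found in the PR are not described here, so this only shows why the type checker stops helping once a value is passed as an interface.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// CIString is a hypothetical case-insensitive string type.
type CIString struct {
	val, lowered string
}

// NewString stores the original and a lowercased copy.
func NewString(s string) CIString {
	return CIString{val: s, lowered: strings.ToLower(s)}
}

// Equal compares case-insensitively.
func (s CIString) Equal(in CIString) bool { return s.lowered == in.lowered }

// String lets fmt verbs such as %v and %s print the original case
// instead of the struct's internal fields.
func (s CIString) String() string { return s.val }

// DeepEqualSeesCase reports whether two values that are Equal are still
// treated as different by reflect.DeepEqual, which takes interface
// arguments and compares every field, including the original case.
func DeepEqualSeesCase() bool {
	a, b := NewString("ID"), NewString("id")
	return a.Equal(b) && !reflect.DeepEqual(a, b)
}

// Printed formats a CIString through fmt, which also takes interface
// arguments and relies on the String method above.
func Printed() string {
	return fmt.Sprintf("%v", NewString("UserName"))
}
```

In this sketch, tests that rely on `reflect.DeepEqual` would need values built with the same original case, or a custom comparison.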
The worker code uses protobufs to store the schema. So, column names have remained normal strings there.
I've added new tests using some reasonable judgment.