sql: Add builtin function to return locality of current node #37310
Comments
CC @awoods187
Another possible idea is determining the type at SQL analysis time. For example, say the locality is:
Then we'd create a tuple datum like this, with a labeled tuple type:
This would allow more natural SQL expressions like this:
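The elided examples might be reconstructed as follows (a sketch; the locality string and the labeled-tuple notation are taken from the commit message quoted later in this thread, and the exact type syntax is an assumption):

```sql
-- Suppose the node was started with locality: region=east,datacenter=us-east-1
-- locality() would then return a labeled tuple whose type is derived at
-- analysis time, roughly: tuple{string AS region, string AS datacenter}.
-- That permits natural field access:
SELECT * FROM charges
WHERE region = (locality()).region  -- folds to 'east' on this node
  AND id = $1;
```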
@knz, do you have an opinion on how this function should work? I think that the second idea is more usable. However, it might set a new precedent, since we'd be deriving a more specific return type during type checking, based on the sema context, which would need to carry new information for this. I think I'll create a PR with this approach and let people react there. I'd like to use this function in a demo I'm building for a talk.
There is currently no easy way to programmatically inspect the locality of the current node. It appears in `crdb_internal.gossip_nodes`, but it's hard to work with because it's JSON, not constant-foldable, and keyed by node id.

This commit adds a new `locality` builtin function that returns the hierarchical location of the current node as a tuple of labeled values, ordered from most inclusive to least inclusive. For example: `region=east,datacenter=us-east-1`.

When building geo-distributed applications, this enables a very nice way to automatically assign the partition key, as illustrated below:

```sql
CREATE TABLE charges (
    region STRING NOT NULL DEFAULT (locality()).region,
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    ...
```

The DEFAULT expression for the `region` column automatically inserts the region value from the current node's locality. It also enables queries like this, that only touch rows in the current region (which is necessary to avoid cross-region hops):

```sql
SELECT * FROM charges WHERE region = (locality()).region AND id = $1
```

The locality is constant, so the optimizer is able to fold column access to a constant value, which can then be used to select an optimal index.

Resolves cockroachdb#37310

Release note (sql change): Adds a new `locality` builtin function that returns the hierarchical location of the current node as a tuple of labeled values, ordered from most inclusive to least inclusive.
Woah, there is a problem here: you are defining a dynamic type. Even with full knowledge of the SQL schema (the declarative context), there is no way to predict the type signature of the result when looking at the SQL text on the client. In fact, the type signature and the validity of the SQL query as a whole may be different depending on the node where you happen to be connected. Think about this: if the client connects through a load balancer, they'll get 100% query failure on some nodes. This is not great design! pg labeled tuples were not meant to be used as associative arrays (which is what you are doing here). My recommendation is either to introduce an actual associative array type, or instead to define the result of the function as a 2-dimensional array. This way the client can use an access pattern whose type signature is the same on every node.
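A rough sketch of what the array-based alternative might look like (hypothetical; the element layout and the multi-dimensional subscript syntax here are assumptions, shown only to illustrate the shape of the proposal):

```sql
-- locality() returns an array of [key, value] pairs, e.g.
--   ARRAY[['region', 'east'], ['datacenter', 'us-east-1']]
-- Entries are accessed positionally, so the query's type signature
-- is identical regardless of which node serves the connection:
SELECT l[1][1] AS key, l[1][2] AS value
FROM (SELECT locality() AS l);
```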
I initially had similar thoughts to you, which is why I first did it exactly as you suggest (I have commits with both implementations). However, the more I thought about it, the more I saw the value in this other approach. Below are my arguments; you can decide if they're convincing. This type is known at analysis time, so it is not a dynamic type by the traditional definition (a type not known until runtime). I'm saying that the locality would be considered part of the declarative context, just as the types of the placeholders are. That is the way the user thinks about locality as well, since they use it in zone configs. As for the argument that, depending on which nodes the clients connect to, they could get different results:
Why is giving different results preferable to sometimes raising an error and sometimes not? Ultimately, we took a shortcut with localities. To be consistent with how SQL schema works, we really should have had a statement like (don't get hung up on syntax, just want to communicate the idea):
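The elided statement might have looked something like this (purely hypothetical syntax, invented here only to illustrate the idea of declaring the locality hierarchy up front, cluster-wide):

```sql
-- Hypothetical: declare the cluster's locality hierarchy once, so that
-- every node's startup locality must conform to these declared levels.
ALTER CLUSTER CONFIGURE LOCALITY LEVELS (region, datacenter);
```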
Then we wouldn't even be having this discussion. We're treating locality when starting up nodes as if it's a dynamic thing, but it's really not. We really should be erroring if you try to start up a node with locality fields that don't match other nodes'. But now we've got it in our heads that there's some kind of dynamism here, when there really isn't.
@RaduBerinde, @jordanlewis, see here for more discussion.
Continuing the discussion from #37369.
So it seems to me that JSON is the superior solution so far. The syntax |
I think GitHub had a glitch; several comments (including my last) just disappeared. Also, I see Andy's comment is reportedly posted "6 hours from now".
I like the JSON solution a lot. Also, I think that the syntax `locality()->'field'` is more readable than `(locality()).field`.
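Under the JSON proposal, queries would use the standard Postgres JSON operators (a sketch; the return shape of `locality()` here is an assumption):

```sql
-- Assume locality() returns JSONB like:
--   {"region": "east", "datacenter": "us-east-1"}
-- -> extracts a field as JSONB; ->> extracts it as a string:
SELECT * FROM charges
WHERE region = locality()->>'region'
  AND id = $1;
```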
-- Raphael 'kena' Poss
I like @andy-kimball's point. I also agree with @knz that JSON is much easier to read.
The JSON solution looks OK, but I find that anytime we use JSON I've got to go to the docs to remember how to access fields. The JSON and relational data models make for a rough fit. The more I work with constraints and localities, the more I think we've missed the boat on making them intuitive to use. Each person who's tried using the new locality-sensitive optimizer feature (Andy W, Rich, Roko, Nate) has been confused by some aspect of how to set it up. One of the reasons is due to the strange syntax:
instead of something that feels more natural with SQL, such as:
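The two statements being contrasted were elided; a rough sketch follows (the first follows CockroachDB's actual `CONFIGURE ZONE` form, while the second is hypothetical syntax invented to illustrate the "more natural" alternative):

```sql
-- Today: constraints are expressed as a quoted list embedded in a string.
ALTER PARTITION us_east OF TABLE charges
    CONFIGURE ZONE USING constraints = '[+region=us-east1]';

-- A more SQL-native alternative (hypothetical syntax):
ALTER PARTITION us_east OF TABLE charges
    CONFIGURE ZONE WHERE region = 'us-east1';
```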
I'd expect to use strongly-typed localities, so that I get an error if I get something wrong. I know I'm pulling a bunch of extra stuff into what seems like a simple issue, but it's context to help others understand why I think we should at least consider more strongly-typed handling of localities.
I appreciate that the issue has multiple aspects, and that using and creating zone configs are intimately related. However, please consider:
I think that's lack of familiarity. It's actually pretty simple and regular. Just don't pull locality labels into the type system. That's a bad idea all around.
Your answer seems to be mixing locality with attributes. My understanding is that they're distinct and are passed separately on the command line:
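For reference, the two flags in question (the flag names are CockroachDB's actual `cockroach start` flags; the specific values are illustrative):

```shell
# Locality is hierarchical and ordered; attributes are a flat list.
cockroach start --locality=region=us-east1,zone=us-east1-b \
                --attrs=ram:64gb,ssd
```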
I agree with you 100% that arbitrary attributes should not become part of the type system. The proposed
I'd say that it's possible/desirable for certain nodes to have more keys than the common set. For example, suppose there are 3 regions, each with just 1 server, and later on additional servers are added in one of the regions. I expect that the nodes in that region will get additional attributes to distinguish them within the region, while the other nodes will remain with just a region attribute.
Meanwhile, the argument still stands: regardless of "mistakes" made, the same query should type-check identically at a given timestamp regardless of the node where you issue it.
This is why I brought up the idea above. Given that any sort of statically declared, strongly-typed solution like this is one or more releases away, I'm going to propose that we sidestep some of these questions by implementing only the function that Radu described. That's easy to understand and optimize, and will fit nicely with whatever we decide to do in the future.
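The function in question was ultimately implemented as `crdb_internal.locality_value`, per the commit message later in this thread; usage looks like the following (the `PRIMARY KEY` layout is an illustrative assumption):

```sql
-- Returns the value for the given locality key on the gateway node.
SELECT crdb_internal.locality_value('region');

-- Usable directly in a DEFAULT expression:
CREATE TABLE charges (
    region STRING NOT NULL DEFAULT crdb_internal.locality_value('region'),
    id     UUID   NOT NULL DEFAULT gen_random_uuid(),
    PRIMARY KEY (region, id)
);
```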
I haven't seen a convincing argument against JSON.
Here are my reasons for not wanting to add
Users will need to know that they should use
I see, thanks. The YAML/JSON thing is pretty bad indeed. I see the value in holding off on this. We may have some of the same backward-compatibility issues with
Yes, I like the suggestion to put the new function in the `crdb_internal` namespace.
There is currently no easy way to programmatically inspect the locality of the current node. It appears in `crdb_internal.gossip_nodes`, but it's hard to work with because it's JSON, not constant-foldable, and keyed by node id.

This commit adds a new `crdb_internal.locality_value` builtin function that returns the value of the locality key given as its argument.

When building geo-distributed applications, this enables a very nice way to automatically assign the partition key, as illustrated below:

```sql
CREATE TABLE charges (
    region STRING NOT NULL DEFAULT crdb_internal.locality_value('region'),
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    ...
```

The DEFAULT expression for the `region` column automatically inserts the region value from the current node's locality. It also enables queries like this, that only touch rows in the current region (which is necessary to avoid cross-region hops):

```sql
SELECT * FROM charges WHERE region = crdb_internal.locality_value('region') AND id = $1
```

The optimizer folds the function to a constant value, which can then be used to select an optimal index.

Resolves cockroachdb#37310

Release note (sql change): Adds a new `locality_value` builtin function that returns the value of the locality key given as its argument.
37369: sql: Add crdb_internal.locality_value builtin function r=andy-kimball a=andy-kimball

37513: opt: maintain a set of grouping cols instead of a list r=ridwanmsharif a=ridwanmsharif Fixes #37317 and possibly also fixes #37444. This commit changes the semantics of the grouping columns stored in scope to only save the columns if they're unique. Release note: None

Co-authored-by: Andrew Kimball <andyk@cockroachlabs.com> Co-authored-by: Ridwan Sharif <ridwan@cockroachlabs.com>
Suggested feature
There is currently no easy way to programmatically inspect the locality of the current node (it appears in `crdb_internal.gossip_nodes`, but it's hard to work with). I think we should add some kind of builtin `locality` function that returns the tiers of the locality in a way that can easily be manipulated in SQL. Here's a strawman implementation that returns an ARRAY of (key, value) TUPLEs:

An example use case

When building geo-distributed applications, this would enable a very nice way to automatically assign the partition key, as illustrated below. This is a credit card `charges` table that is partitioned on `region`. The `DEFAULT` expression for the `region` column automatically inserts the region value from the current node's locality (e.g. `cloud=gce,region=us-east1,zone=us-east1-b`). There are undoubtedly other use cases for this as well.
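The strawman elided above might have looked roughly like this (a reconstruction; the return representation and the tuple field access are assumptions based only on the "ARRAY of (key, value) TUPLEs" description):

```sql
-- Strawman signature: locality() -> ARRAY of (key STRING, value STRING)
-- e.g. ARRAY[('cloud','gce'), ('region','us-east1'), ('zone','us-east1-b')]
-- A caller could pick out one tier by key:
SELECT (t).value
FROM unnest(locality()) AS t
WHERE (t).key = 'region';
```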
Notes
We need to take care to always evaluate this on the Gateway node so that it's not shipped via DistSQL to other nodes (where each node could evaluate it to a different value).