-
Notifications
You must be signed in to change notification settings - Fork 71
unify SQL and pipe name inference and simplify single by-only agg output #6263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit unifies the way l-values are inferred for expressions that omit the LHS of a field assignment or the AS clause in a select SELECT expression. ANSI SQL does not have specific guidance on how to infer arbitrary expressions so we blended in some of the existing logic from the pipe syntax. We also changed how "this" is handled. Instead of raising an error, the l-value becomes "that". We also adapted the heuristic for a single column agg with no groupings to work also for a single grouping with no aggs (i.e., emitting the values instead of a single-column record). These two changes means that "by this" now works and behaves like distinct. These changes also means that there's more typing with operators like cut when you want to preserve the object path, e.g., cut x.y.z:=x.y.z, but less typing when you don't, e.g., cut x.y.z to mean cut z:=x.y.z. Overall, this improves the consistency and orthogonality of the language.
| output: | | ||
| {y:123,"upper(s)":"HELLO, WORLD","sqrt(2)/z":2.8284271247461903} | ||
| {y:123,upper:"HELLO, WORLD","sqrt(2)/z":2.8284271247461903} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure there'll always be ways to trip over this, but the loss of the (s) feels like maybe a step backwards since now it's much easier to hit this:
$ echo '{x:{y:123},s:"Hello, world", r:"bye", z:0.5}' | super -c 'values {x.y,upper(s),upper(r),sqrt(2)/z}' -
record expression: duplicate field: "upper" at line 1, column 22:
values {x.y,upper(s),upper(r),sqrt(2)/z}
~~~~~~~~
But I guess a temporary problem since we'll eventually get to #5977?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes temporary but I didn't realize the extent of the impact here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the name of this test should change (or be deleted entirely?) since its original raison d'etre seemed to be confirming that query produced an error and as your change shows it no longer does that.
Per our discussion, I tried to change cut to take path expressions instead of assignments but this had an impact on the optimizer tests. When changing cut to values in these tests, they were no longer optimized the same way, which means the optimizer needs some work. It needed this work anyway, but rather than do this here, I backed out some of the cut changes and will defer the updates to a subsequent PR.
The GET /query/describe route sends a response containing the output keys for aggregation operations but Zui is interested in the input keys, so send those instead. This wasn't a problem for Zui for most queries until #6263, which changed the output key for "aggregate by a.b" from a.b (the path of the input key) to b (the last path component of the input key).
The GET /query/describe route sends a response containing the output keys for aggregation operations but Zui is interested in the input keys, so send those instead. This wasn't a problem for Zui for most queries until #6263, which changed the output key for "aggregate by a.b" from a.b (the path of the input key) to b (the last path component of the input key).
This commit unifies the way l-values are inferred for expressions that omit the LHS of a field assignment or the AS clause in a select SELECT expression. ANSI SQL does not have specific guidance on how to infer arbitrary expressions so we blended in some of the existing logic from the pipe syntax.
We also changed how "this" is handled. Instead of raising an error, the l-value becomes "that". We also adapted the heuristic for a single column agg with no groupings to work also for a single grouping with no aggs (i.e., emitting the values instead of a single-column record). These two changes means that "by this" now works and behaves like distinct.
Overall, this improves the consistency and orthogonality of the language.
Fixes #6157
Fixes #5315