Collaborator

@mccanne mccanne commented Sep 22, 2025

This commit unifies the way l-values are inferred for expressions that omit the LHS of a field assignment or the AS clause in a SELECT expression. ANSI SQL does not have specific guidance on how to infer names for arbitrary expressions, so we blended in some of the existing logic from the pipe syntax.
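A rough way to picture the unified inference, judging only from the test output and examples quoted in this thread: a dotted path takes its last component, a lone function call takes the function name, "this" becomes "that", and anything else keeps its expression text. The sketch below is in Python and is not the project's implementation; `infer_name` and `_single_call_name` are hypothetical names, and the string matching is a stand-in for whatever the real parser does with its AST.

```python
import re

def _single_call_name(expr):
    """Return the function name if expr is exactly one call like f(...), else None.

    Assumption: parentheses inside string literals are not handled.
    """
    m = re.match(r"([A-Za-z_]\w*)\(", expr)
    if m is None or not expr.endswith(")"):
        return None
    depth = 0
    for i, ch in enumerate(expr):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            # The call must close on the final character, so "f(a)+g(b)" is rejected.
            if depth == 0 and i != len(expr) - 1:
                return None
    return m.group(1) if depth == 0 else None

def infer_name(expr):
    """Hypothetical name inference for an expression with no LHS or AS clause."""
    expr = expr.strip()
    if expr == "this":
        # Per the PR description: no error, the l-value becomes "that".
        return "that"
    if re.fullmatch(r"[A-Za-z_]\w*(\.[A-Za-z_]\w*)*", expr):
        # Dotted path: use the last path component.
        return expr.rsplit(".", 1)[-1]
    # Lone function call: use the function name; otherwise keep the expression text.
    return _single_call_name(expr) or expr
```

With these assumptions, `infer_name("x.y")` gives `"y"`, `infer_name("upper(s)")` gives `"upper"`, and `infer_name("sqrt(2)/z")` stays `"sqrt(2)/z"`, which matches the field names in the updated test output quoted in this PR.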

We also changed how "this" is handled. Instead of raising an error, the l-value becomes "that". We also adapted the heuristic for a single-column agg with no groupings so that it also works for a single grouping with no aggs (i.e., emitting the values instead of a single-column record). Together, these two changes mean that "by this" now works and behaves like distinct.
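To make the "behaves like distinct" point concrete, here is a small Python sketch of a single grouping key with no aggregate functions. It is an illustration under assumptions, not the engine's code: `group_by_only` is a hypothetical name, JSON text is used as a stand-in for structural value equality, and input order is preserved only for readability (the real operator may not guarantee any order).

```python
import json

def group_by_only(values, key=lambda v: v):
    """Group by one key with no aggs and emit each distinct key as a bare value.

    With the default key (the whole input value, i.e. "by this"),
    the result is the distinct input values rather than one-field records.
    """
    seen = set()
    out = []
    for v in values:
        k = key(v)
        # Serialize with sorted keys so equal records compare equal.
        marker = json.dumps(k, sort_keys=True)
        if marker not in seen:
            seen.add(marker)
            out.append(k)  # bare value, not {"that": k}
    return out
```

For example, `group_by_only([1, 2, 1, {"a": 1}, {"a": 1}])` returns `[1, 2, {"a": 1}]`.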

Overall, this improves the consistency and orthogonality of the language.

Fixes #6157
Fixes #5315

These changes also mean that there's more typing with operators like
cut when you want to preserve the object path, e.g., cut x.y.z:=x.y.z,
but less typing when you don't, e.g., cut x.y.z now means cut z:=x.y.z.
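The two spellings of cut in this commit message can be sketched over nested Python dicts as follows. This only illustrates the semantics described in the message (a later commit in this thread backs out some of the cut changes), and `cut`, `get_path`, and `set_path` are hypothetical helpers that accept dotted paths only.

```python
def get_path(rec, path):
    """Read a nested value by following a list of field names."""
    for name in path:
        rec = rec[name]
    return rec

def set_path(rec, path, value):
    """Write a nested value, creating intermediate records as needed."""
    for name in path[:-1]:
        rec = rec.setdefault(name, {})
    rec[path[-1]] = value

def cut(rec, *assignments):
    """Each assignment is 'rhs' or 'lhs:=rhs', where both sides are dotted paths."""
    out = {}
    for a in assignments:
        lhs, sep, rhs = a.partition(":=")
        if not sep:
            # No explicit LHS: infer it from the last component of the path.
            rhs = lhs
            lhs = rhs.rsplit(".", 1)[-1]
        set_path(out, lhs.split("."), get_path(rec, rhs.split(".")))
    return out
```

Given `rec = {"x": {"y": {"z": 1}}}`, `cut(rec, "x.y.z")` yields `{"z": 1}`, while `cut(rec, "x.y.z:=x.y.z")` yields `{"x": {"y": {"z": 1}}}`.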

output: |
{y:123,"upper(s)":"HELLO, WORLD","sqrt(2)/z":2.8284271247461903}
{y:123,upper:"HELLO, WORLD","sqrt(2)/z":2.8284271247461903}
Contributor

I'm sure there'll always be ways to trip over this, but the loss of the (s) feels like maybe a step backwards since now it's much easier to hit this:

$ echo '{x:{y:123},s:"Hello, world", r:"bye", z:0.5}' | super -c 'values {x.y,upper(s),upper(r),sqrt(2)/z}' -
record expression: duplicate field: "upper" at line 1, column 22:
values {x.y,upper(s),upper(r),sqrt(2)/z}
                     ~~~~~~~~

But I guess a temporary problem since we'll eventually get to #5977?
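The collision can be reproduced in miniature. The Python sketch below compares the two naming policies visible in the test output above (expression text for calls before the change, function name after) and rejects duplicate field names the way a record expression does. The helper names are hypothetical and the regexes only cover simple, un-nested calls.

```python
import re

# One call with no nested parentheses, e.g. upper(s); an assumption for brevity.
CALL = re.compile(r"^([A-Za-z_]\w*)\([^()]*\)$")

def old_name(expr):
    """Before this PR: paths use the last component, everything else keeps its text."""
    return expr.rsplit(".", 1)[-1] if re.fullmatch(r"[\w.]+", expr) else expr

def new_name(expr):
    """After this PR: a lone call is named after its function."""
    m = CALL.match(expr)
    return m.group(1) if m else old_name(expr)

def field_names(exprs, namer):
    """Name each expression and fail on the first duplicate field."""
    names = []
    for e in exprs:
        n = namer(e)
        if n in names:
            raise ValueError(f'duplicate field: "{n}"')
        names.append(n)
    return names
```

With `exprs = ["x.y", "upper(s)", "upper(r)", "sqrt(2)/z"]`, the old policy produces four distinct names, while the new policy raises on the second `upper`, as in the error shown above.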

Collaborator Author

Yes temporary but I didn't realize the extent of the impact here.

Contributor

@philrz philrz Sep 23, 2025

I assume the name of this test should change (or be deleted entirely?) since its original raison d'etre seemed to be confirming that query produced an error and as your change shows it no longer does that.

Per our discussion, I tried to change cut to take path expressions
instead of assignments but this had an impact on the optimizer tests.
When changing cut to values in these tests, they were no longer
optimized the same way, which means the optimizer needs some work.
It needed this work anyway, but rather than do this here, I backed out
some of the cut changes and will defer the updates to a subsequent PR.
@mccanne mccanne merged commit 5163d68 into main Sep 24, 2025
5 checks passed
@mccanne mccanne deleted the unify-name-inference branch September 24, 2025 02:09
nwt added a commit that referenced this pull request Nov 17, 2025
The GET /query/describe route sends a response containing the output
keys for aggregation operations but Zui is interested in the input keys,
so send those instead.

This wasn't a problem for Zui for most queries until #6263, which
changed the output key for "aggregate by a.b" from a.b (the path of the
input key) to b (the last path component of the input key).
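The distinction between the two keys can be shown with a tiny Python sketch. It does not reflect the actual shape of the GET /query/describe response, which is not shown here; `grouping_keys` is a hypothetical helper that only illustrates input key versus output key for a dotted grouping path.

```python
def grouping_keys(path_expr):
    """Return (input_key, output_key) as lists of path components.

    The input key is the full path that is read; the output key, after
    the name-inference change, is just the last path component.
    """
    input_key = path_expr.split(".")
    output_key = input_key[-1:]
    return input_key, output_key
```

For "aggregate by a.b", `grouping_keys("a.b")` returns `(["a", "b"], ["b"])`: the path a.b is what a client interested in input keys would need, and b is what the operator now emits.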
Development

Successfully merging this pull request may close these issues.

Dotted path expr in field assignment should create value named for last element in path
Cast shorthand in record expression creates field name "cast"

4 participants