optbuilder: support unparenthesized composite-type field access in PL/pgSQL#169592
optbuilder: support unparenthesized composite-type field access in PL/pgSQL#169592rafiss wants to merge 1 commit intocockroachdb:masterfrom
Conversation
|
Merging to
After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here |
|
Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link) |
In PL/pgSQL, accessing a field of a composite-typed variable previously
required surrounding the variable name in parentheses, e.g. `(v).x`
instead of `v.x`. The unparenthesized form returned an
`UndefinedRelationError` because SQL's `name1.name2` syntax is parsed
as `relation.column`, never as `column.field`.
Postgres works around this in PL/pgSQL by installing parser hooks
(`plpgsql_pre_column_ref` / `plpgsql_post_column_ref` in
`src/pl/plpgsql/src/pl_comp.c`) that resolve a `ColumnRef` against the
PL/pgSQL variable namespace when SQL resolution fails or finds an
ambiguous match. This commit reproduces the post-column-ref behavior at
the optbuilder layer:
- `scope` gains a `plpgsqlVarRef` hook field that the PL/pgSQL
builder installs while building SQL expressions and statements.
- When `scope.VisitPre` encounters a `ColumnItem` whose qualified
SQL resolution fails, the hook is consulted; if the prefix names
a composite-typed PL/pgSQL variable with a matching field label,
the node is rewritten to a `ColumnAccessExpr` over the variable
(the equivalent of `(v).x`) and the visitor re-resolves it
normally.
- When SQL resolution succeeds and the hook also matches, an
`AmbiguousColumn` error is raised, matching Postgres' behavior
when a PL/pgSQL variable shadows a real column in scope.
- When the hook reports that the prefix names a PL/pgSQL variable
but no rewrite applies (scalar variable, or composite without the
given field), the visitor suppresses the existing
"surround in parentheses" hint, since parens would not help in
those cases. Postgres similarly emits no hint there.
The "surround in parentheses" hint continues to fire for the original
case it was added for: an SQL composite-typed column accessed without
parens outside of PL/pgSQL. That case is a fundamental SQL grammar
limitation shared with Postgres and is not addressed here.
Out of scope (potential follow-ups): block-label-qualified references
(`outer.v.x`), `v.*` whole-row references, and the
`#variable_conflict` directive.
Resolves: cockroachdb#114687
Epic: CRDB-49018
Release note (sql change): PL/pgSQL routines can now access fields of
composite-typed variables without surrounding the variable name in
parentheses (for example `v.x` in addition to the previously required
`(v).x`), matching PostgreSQL behavior. When a PL/pgSQL variable
shadows a column with the same name in a SQL statement inside the
routine, the reference is now rejected as ambiguous instead of being
silently resolved.
e2b188f to
f28bb8b
Compare
DrewKimball
left a comment
There was a problem hiding this comment.
Cool!
@DrewKimball reviewed 6 files and all commit messages, and made 4 comments.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on rafiss).
pkg/sql/logictest/testdata/logic_test/plpgsql_record line 453 at r1 (raw file):
# Unparenthesized field access on a composite-typed PL/pgSQL variable. # Regression test for #114687. subtest unparenthesized_composite_access
It would be good to test successful variable resolution in other statements as well:
- Standalone SQL statement
- SELECT ... INTO
- Bound statement for a cursor
pkg/sql/opt/optbuilder/scope.go line 223 at r1 (raw file):
r := s.builder.allocScope() r.parent = s r.plpgsqlVarRef = s.plpgsqlVarRef
nit: rather than having to propagate this field during scope pushing, could we walk the parent pointers until we see a non-nil plpgsqlVarRef?
Another approach might be to mark scopeColumns as being PL/pgSQL vars, attempt to look up that column by name in response to UndefinedRelationError, and then make the replacement if the resolved scopeColumn turns out to be a variable.
pkg/sql/opt/optbuilder/scope.go line 1109 at r1 (raw file):
panic(errors.WithHint(resolveErr, "to access a field of a composite-typed column or variable, "+ "surround the column/variable name in parentheses: (varName).fieldName"))
nit: Now that we're handling PL/pgSQL variables separately, let's remove the mention of variables here.
In PL/pgSQL, accessing a field of a composite-typed variable previously required surrounding the variable name in parentheses, e.g.
(v).xinstead ofv.x. The unparenthesized form returned anUndefinedRelationErrorbecause SQL'sname1.name2syntax is parsed asrelation.column, never ascolumn.field.Postgres works around this in PL/pgSQL by installing parser hooks that resolve a
ColumnRefagainst the PL/pgSQL variable namespace when SQL resolution fails or finds an ambiguous match. This commit reproduces the post-column-ref behavior at the optbuilder layer:scopegains aplpgsqlVarRefhook field that the PL/pgSQL builder installs while building SQL expressions and statements.scope.VisitPreencounters aColumnItemwhose qualified SQL resolution fails, the hook is consulted; if the prefix names a composite-typed PL/pgSQL variable with a matching field label, the node is rewritten to aColumnAccessExprover the variable (the equivalent of(v).x) and the visitor re-resolves it normally.AmbiguousColumnerror is raised, matching Postgres' behavior when a PL/pgSQL variable shadows a real column in scope.The "surround in parentheses" hint continues to fire for the original case it was added for: a composite-typed column accessed without parens outside of PL/pgSQL. That case is a fundamental SQL grammar limitation shared with Postgres and is not changed here.
Resolves: #114687
Epic: CRDB-49018
Release note (sql change): PL/pgSQL routines can now access fields of composite-typed variables without surrounding the variable name in parentheses (for example
v.xin addition to the previously required(v).x), matching PostgreSQL behavior. When a PL/pgSQL variable shadows a column with the same name in a SQL statement inside the routine, the reference is now rejected as ambiguous instead of being silently resolved.