Lateral pg19 by yjhjstz · Pull Request #13 · quantumiodb/pgorca

yjhjstz · 2026-06-20T00:09:13Z

No description provided.

…ed commute

PG-style LATERAL in the FROM clause now goes through ORCA instead of falling back to the standard planner. Three changes are needed together to make this work:

1. gpopt/translate/CTranslatorQueryToDXL.cpp
   Remove the blanket "LATERAL unsupported" raise. Outer references from a LATERAL RTE's inner Query already resolve via the parent translator's CMappingVarColId (initialised in the constructor), so no separate plumbing is needed for the Var-to-ColRef mapping.

2. libgpopt/src/translate/CTranslatorExprToDXL.cpp (PdxlnNLJoin)
   Generalise the existing DPE-PartitionSelector branch that turns a NL into an IndexNLJ when the inner side has outer refs to the outer child's output. Any inner subtree whose outer references intersect the outer child's output now goes the same route: outer_refs get registered in m_phmcrdxlnIndexLookup so the inner scalar translator emits a resolved Ident, and the NL emits PARAM_EXEC nest params (under EopttraceIndexedNLJOuterRefAsParams, which pg_orca enables by default). Also drops the GPOS_ASSERT_IMP that forbade outer refs in non-index NL inner children; that condition is now legal.

3. libgpopt/src/xforms/CXformInnerJoinCommutativity.{cpp,h}
   Tighten the xform's promise: refuse to commute an InnerJoin when one child's outer references reach into the other child's output columns. A plain CLogicalInnerJoin is symmetric in ORCA's algebra, so without this guard the join enumerator happily produced the swapped orientation for a LATERAL-shaped join. Putting the correlated side on the NL outer is unexecutable: the executor opens outer first and the outer-ref columns are still unbound.

After this commit the six LATERAL shapes (uncorrelated, equi-correlated, constant projection, TVF outer-ref arg, LIMIT, LEFT JOIN LATERAL) all optimise to a correct ORCA plan, though equi-correlated cases still emit NL+Materialize rather than HashJoin; that decorrelation comes in the follow-up commit.
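
The commutativity guard described for CXformInnerJoinCommutativity can be modelled with plain column-id sets. This is a minimal sketch, not the ORCA code: `ColSet`, `ChildProps`, `Intersects` and `CanCommute` are hypothetical names standing in for CColRefSet and the derived properties of each join child.

```cpp
#include <algorithm>
#include <cassert>
#include <set>

// Hypothetical stand-in for an ORCA column-reference set (CColRefSet).
using ColSet = std::set<int>;

// Simplified model of one join child: the columns it produces and the
// outer references it still needs from somewhere else.
struct ChildProps {
    ColSet output;
    ColSet outer_refs;
};

// True when the two sets share at least one column id.
bool Intersects(const ColSet& a, const ColSet& b) {
    return std::any_of(a.begin(), a.end(),
                       [&b](int col) { return b.count(col) > 0; });
}

// Commuting is allowed only if neither child depends on the other's output.
// Outer refs that point outside the join (e.g. to a grandparent) do not block it.
bool CanCommute(const ChildProps& left, const ChildProps& right) {
    return !Intersects(left.outer_refs, right.output) &&
           !Intersects(right.outer_refs, left.output);
}
```

With a child `x` producing columns {1, 2} and a LATERAL child whose outer refs are {1}, `CanCommute` returns false, so the swapped orientation is never offered to the enumerator in this model.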

LATERAL with a top-level CLogicalSelect on the right side (the typical "FROM x, LATERAL (SELECT ... WHERE inner = x.col)" pattern) was being turned into a plain CLogicalInnerJoin, which forced the optimizer onto the NL+Materialize path. PG's own planner pulls these LATERALs up into a HashJoin; ORCA can do the same once the algebra is right.

libgpopt/src/translate/CTranslatorDXLToExpr.cpp (PexprLogicalJoin)
   Heuristic gate: if the right child's top operator is a CLogicalSelect whose outer references intersect the left child's output, rebuild the join as CLogicalInnerApply / CLogicalLeftOuterApply instead of a plain Join. CXformInnerApply2InnerJoin / CXformLeftOuterApply2LeftOuterJoin then pull the correlated predicate out of the Select and lower the result to a Join, which the cost model picks as HashJoin.

   Restricting to the "top is CLogicalSelect" shape keeps the other LATERAL forms (TVF outer-ref args, LIMIT/Sort, constant projection) on the plain-Join path, where the previous commit's commutativity guard handles them via NL + nest params.

libgpopt/include/gpopt/xforms/CXformApply2Join.h (CreateCorrelatedApply)
   Guard against TApply::PdrgPcrInner() == nullptr. Apply objects built from LATERAL have no inner scalar colref (LATERAL returns a relation, not a scalar), so the scalar-subquery-shaped correlated-apply form here doesn't apply: no-op instead of dereferencing the null pointer.

Effect on the six LATERAL cases: equi-correlated inner-LATERAL drops from NL+Materialize to a clean HashJoin matching PG; LEFT LATERAL drops to HashRightJoin (still carries a triplicated Hash Cond pending the LOJ inferred-pred dedup in the next commit). Other shapes unchanged.
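
The heuristic gate in PexprLogicalJoin reduces to a two-condition test. The sketch below is an illustration under assumptions: `TopOp`, `JoinForm`, `SharesColumn` and `ChooseJoinForm` are hypothetical names, and the right child is reduced to its top operator tag plus its outer-reference set.

```cpp
#include <algorithm>
#include <cassert>
#include <set>

using ColSet = std::set<int>;  // hypothetical stand-in for CColRefSet

// Hypothetical tags for the top operator of the LATERAL (right) child.
enum class TopOp { Select, Limit, Project, TableFunction };

// Which logical form the translator builds for the join.
enum class JoinForm { PlainJoin, Apply };

bool SharesColumn(const ColSet& a, const ColSet& b) {
    return std::any_of(a.begin(), a.end(),
                       [&b](int col) { return b.count(col) > 0; });
}

// Build an Apply only for "top is Select" AND "correlated to the left child".
// Every other shape stays a plain Join (NL + nest params path).
JoinForm ChooseJoinForm(const ColSet& left_output, TopOp right_top,
                        const ColSet& right_outer_refs) {
    if (right_top == TopOp::Select &&
        SharesColumn(right_outer_refs, left_output)) {
        return JoinForm::Apply;
    }
    return JoinForm::PlainJoin;
}
```

Keeping the gate this narrow means a correlated LIMIT or TVF child still returns `JoinForm::PlainJoin`, which matches the commit's statement that those shapes remain on the plain-Join path.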

PexprInferPredicates extends a join's predicate with extras derived from constraint propagation (e.g. the commuted form of an equality, transitive closures, etc). After predicate push-through, MakeJoinWithoutInferredPreds strips the redundant ones back out via PexprRemoveImpliedConjuncts, keyed on equivalence classes.

CanRemoveInferredPredicates was hard-coded to InnerJoin only, left over from when LOJ semantics were considered too tricky to dedupe. The original note "currently, only inner join is included, but we can add more later" acknowledged the limitation. LeftOuterJoin's null-preserving side cares about which qualifying tuples pair up, not about how many copies of an equivalent equality predicate the matcher evaluates, so the dedup is semantically safe.

Symptom: LEFT JOIN LATERAL (... WHERE inner = outer.col) produced a HashRightJoin whose Hash Cond was a 3-way AND of equivalent equalities:

    Hash Cond: ((lt2.a = lt1.x) AND (lt2.a = lt1.x) AND (lt2.a = lt1.x))

Three forms entered the predicate during preprocessing (original a=x, commuted x=a from constraint inference, and a re-pushed copy), and the LOJ branch of MakeJoinWithoutInferredPreds was a no-op. After this commit the predicate is the single (lt2.a = lt1.x).
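
Deduplication keyed on equivalence classes can be illustrated with a small union-find over column ids. This is a simplified model, not PexprRemoveImpliedConjuncts itself: `Equality`, `EquivClasses` and `RemoveImpliedEqualities` are hypothetical names, and only column-to-column equalities are considered.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// An equality conjunct "first = second" between two column ids.
using Equality = std::pair<int, int>;

// Minimal union-find over column ids (hypothetical helper).
class EquivClasses {
public:
    int Find(int col) {
        auto it = parent_.find(col);
        if (it == parent_.end()) {
            parent_[col] = col;  // first time seen: its own class
            return col;
        }
        if (it->second == col) return col;
        int root = Find(it->second);
        parent_[col] = root;     // path compression
        return root;
    }

    // Merges the classes of a and b; returns false if they were already equal.
    bool Union(int a, int b) {
        int ra = Find(a);
        int rb = Find(b);
        if (ra == rb) return false;
        parent_[ra] = rb;
        return true;
    }

private:
    std::map<int, int> parent_;
};

// Keeps an equality only if it tells us something the kept ones do not imply.
std::vector<Equality> RemoveImpliedEqualities(
        const std::vector<Equality>& conjuncts) {
    EquivClasses classes;
    std::vector<Equality> kept;
    for (const Equality& eq : conjuncts) {
        if (classes.Union(eq.first, eq.second)) kept.push_back(eq);
    }
    return kept;
}
```

Feeding it the three forms from the symptom (a=x, x=a, a=x) leaves a single conjunct, because the commuted and re-pushed copies connect columns that are already in the same class.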

PexprPruneUnusedComputedColsRecursive walks the expression top-down with a required-columns set, dropping CScalarProjectElements that nothing upstream consumes. The required set was built from each operator's own PcrsLocalUsed and its scalar children's used columns (via CExpressionHandle::PcrsUsedColumns), neither of which captures outer references that one relational child holds against another sibling.

For a LATERAL whose inner references a computed column from a derived table on the outer side, the chain is:

    LogicalApply / LogicalJoin
    ├── LogicalProject(dv = val * 2)
    │   └── LogicalGet(na)
    └── LogicalSelect(filter: nb.id = dv)
        └── LogicalGet(nb)

The inner Select's DeriveOuterReferences() = {dv}, but the Apply's PcrsUsedColumns() returns only the columns from the scalar predicate (true) and PcrsLocalUsed (empty). The pruner descends into the outer Project with `dv` absent from pcrsReqd → defined - required = {dv} → the Project gets stripped. The dangling CScalarIdent "dv" then crashes DXL→PlStmt translation with "Attribute number N not found".

Fix: before recursing into children, fold each relational child's DeriveOuterReferences() into pcrsReqd. Those refs are columns the child needs from its siblings, so siblings' producers must be preserved. Includes outer refs that escape this operator entirely (genuine refs to the grandparent); those just stay in pcrsReqd as we descend. They have no producer at this level and pruning logic only acts on Project / GbAgg defined columns, so the extra entries are harmless.

Symptom: `SELECT count(*) FROM (SELECT id, val*2 AS dv FROM t) a, LATERAL (SELECT * FROM s WHERE s.id = a.dv) x` fell back to PG with "DXL-to-PlStmt Translation: Attribute number 8 not found in project list". After this commit the query lowers to a clean HashJoin under ORCA.
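
The effect of folding sibling outer references into the required set can be reproduced with two set operations. This is a sketch under assumptions: `ColSet`, `FoldChildOuterRefs` and `PrunableColumns` are hypothetical helpers, and the column ids are arbitrary.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

using ColSet = std::set<int>;  // hypothetical stand-in for CColRefSet

// Adds every relational child's outer references to the required set
// before the pruner descends into the children.
ColSet FoldChildOuterRefs(ColSet required,
                          const std::vector<ColSet>& child_outer_refs) {
    for (const ColSet& refs : child_outer_refs) {
        required.insert(refs.begin(), refs.end());
    }
    return required;
}

// Columns a Project defines that nothing requires: defined - required.
ColSet PrunableColumns(const ColSet& defined, const ColSet& required) {
    ColSet prunable;
    std::set_difference(defined.begin(), defined.end(),
                        required.begin(), required.end(),
                        std::inserter(prunable, prunable.end()));
    return prunable;
}
```

In the example tree, the outer Project defines {dv} and the inner Select's outer references are {dv}. Without the fold, `PrunableColumns` returns {dv} and the Project is stripped; with the fold it returns the empty set and the producer survives.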

Adds a dedicated lateral.sql / lateral.out regression test under test/schedule covering the six base LATERAL shapes (uncorrelated, equi-correlated, scalar projection, TVF outer-ref arg, LIMIT, LEFT JOIN LATERAL) plus twelve nested / composite variants:

- 2-level and 3-level chains (each LATERAL references its immediate outer; or one LATERAL references both outer and middle)
- LATERAL nested inside another LATERAL
- LATERAL containing an inner JOIN (decorrelates to HashJoin chain)
- LEFT JOIN LATERAL with a nested LATERAL + LIMIT
- LATERAL containing an aggregate
- LATERAL with non-equi range correlation
- 3-level chain ending in a TVF
- LATERAL inside EXISTS
- LEFT LATERAL with an inner filter that excludes everything
- LATERAL referencing a derived-table computed column under an aggregate (regression for the sibling-correlated outer-ref pruning bug fixed in PexprPruneUnusedComputedColsRecursive)

The expected file pins the actual plan shape (Hash Join / Hash Right Join / Function Scan / NL with nest params, etc.), so any future change that regresses the commutativity guard, the selective Apply conversion, the LOJ inferred-pred dedup, or the preprocessor outer-ref preservation will diff visibly.

Three back-to-back fresh-instance runs show the plans are deterministic at ~150 ms.

CLogicalInnerApply / CLogicalLeftOuterApply built from LATERAL have no inner scalar colref (LATERAL returns a relation, not a scalar), so m_pdrgpcrInner is nullptr. PopCopyWithRemappedColumns blindly called CUtils::PdrgpcrRemap on it, which dereferences the null array in Release builds and SIGSEGVs (Debug builds catch it at the assert).

The crash is reached whenever ORCA needs to deep-copy a LATERAL-derived Apply with column remapping, e.g. when a CLogicalCTEConsumer inlines a producer whose body contains the Apply:

    WITH t AS (SELECT * FROM a, LATERAL (SELECT * FROM b WHERE ...) s)
    SELECT count(*) FROM t;

Guard against nullptr m_pdrgpcrInner and rebuild the copy with the 1-arg ctor (the 2-arg form asserts pdrgpcrInner is non-null+non-empty).

Found by walking the LATERAL edge-case matrix; lateral.sql now covers this shape under E2.
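
The shape of the null guard can be sketched outside ORCA. Everything here is hypothetical: `ApplySketch` stands in for the Apply operator, a null `inner_cols` plays the role of a null m_pdrgpcrInner, and the "keep unmapped columns as-is" behaviour is an assumption of this sketch rather than a statement about CUtils::PdrgpcrRemap.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Hypothetical model of an Apply operator. inner_cols is null when the
// Apply was built from LATERAL (no inner scalar column).
struct ApplySketch {
    std::unique_ptr<std::vector<int>> inner_cols;
};

// Copies the operator, remapping column ids through `remap`.
ApplySketch CopyWithRemappedColumns(const ApplySketch& src,
                                    const std::map<int, int>& remap) {
    ApplySketch copy;
    if (!src.inner_cols) {
        // Guard: nothing to remap, return the "1-arg ctor" form untouched
        // instead of dereferencing a null array.
        return copy;
    }
    copy.inner_cols.reset(new std::vector<int>());
    for (int col : *src.inner_cols) {
        auto it = remap.find(col);
        // Assumption of this sketch: unmapped columns are kept unchanged.
        copy.inner_cols->push_back(it != remap.end() ? it->second : col);
    }
    return copy;
}
```

The point of the early return is that the LATERAL-derived case never reaches the loop, which is the only place the array is dereferenced.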

Adds an "Edge cases" section to lateral.sql covering shapes that came out of a LATERAL edge-case sweep:

    E1   varlevelsup=2: LATERAL nested in a correlated scalar subquery
    E2   LATERAL inside a CTE body (locks in the InnerApply nullptr-PdrgPcrInner copy-with-remap crash that this commit pairs with)
    E3   VALUES + LATERAL
    E4   LATERAL + GROUP BY at outer
    E5   LATERAL + GROUPING SETS
    E6   LATERAL + window function
    E7   UNION ALL inside the LATERAL body
    E8   LATERAL with DISTINCT outside
    E9   LATERAL top-N per outer (ORDER BY ... LIMIT 1)
    E10  INSERT...SELECT with LATERAL
    E11  3-level LATERAL where the grandchild references the outermost
    E12  LATERAL unnest(array)

All twelve go through ORCA without fallback. Three back-to-back fresh-instance runs settle at ~211 ms; the plan shapes are stable.

Edge cases that intentionally do NOT land here:

- CTE inside a LATERAL body -> pre-existing ORCA limitation "Operator CTE with outer references not supported"
- PREPARE/EXECUTE with $params -> pre-existing limitation (requires optimizer_enable_query_parameter, not wired in pg_orca)
- FOR UPDATE on an aggregate query -> SQL-level rejection

Adds test/sql/pg_lateral.sql, a verbatim port of the "Test LATERAL" block from PostgreSQL upstream's src/test/regress/sql/join.sql (the chunk introduced by the "-- Test LATERAL" header), plus inline setup for int2_tbl / int4_tbl / int8_tbl / tenk1 / onerow mirroring upstream's test_setup.sql and the top of join.sql. tenk1's 10000-row payload is loaded via COPY from $PG_REGRESS_SQL/data/tenk.data so the test stays in sync with PG when that file changes.

Locks in roughly 95 queries covering:

- basic equi-correlated LATERAL with tenk1 / int4_tbl / int8_tbl
- lateral-versus-parent scope resolution (the int8_tbl q1/q2 case)
- LATERAL with TVF args, UNION ALL, VALUES, GROUPING SETS-adjacent aggregates, JOIN inside LATERAL
- lateral references requiring pullup at outer-join boundaries
- PlaceHolderVar nesting and the bug #9041 postponed-quals case
- dummy/empty inner rels (bug #15694)
- LATERAL with VALUES tuple containing outer refs to both sides of an enclosing LEFT JOIN
- intentional SQL-level rejections (missing LATERAL keyword, RIGHT/FULL JOIN with LATERAL, ambiguous column refs, UPDATE/DELETE LATERAL restrictions)

Current ORCA coverage on this set: 15 queries optimised by ORCA (visible Optimizer: pg_orca marker), 22 queries fall back to PG, all correctness preserved.

Fallback breakdown:

- 14 "DXL-to-PlStmt Translation: Attribute number N not found in project list": same class as the sibling-correlated outer-ref bug fixed in CExpressionPreprocessor, but in more complex shapes involving LeftOuter + LATERAL VALUES referencing both sides of the outer join; ORCA's enumerator places the LATERAL on a side that can't see one of its referenced columns at execution time
- 2 "Whole-row variable" (pre-existing ORCA limitation, e.g. coalesce(i) on a record type)
- 2 "no plan has been computed for required properties" (enum gap)
- 4 intentional PG-side SQL rejections

Three back-to-back fresh-instance runs land at ~370 ms with stable plans. Future fixes to the still-falling-back patterns will show up as fewer fallback INFO lines and more Optimizer: pg_orca markers in the expected output diff.

Statement timeout of 20s guards against ORCA picking a pathologically bad plan that would otherwise hang the whole regression suite.

CJoinOrderDPv2 enumerates join orders by combining subsets of the NAryJoin's atoms via dynamic programming. It tracked LOJ right-child dependencies (an LOJ's right side must be paired with its left), but did not track the more general LATERAL-style dependency where one atom holds outer references to another sibling atom's output columns.

Without this check the enumerator happily formed subsets like {x, lateral_ref_to_y} from a query

    SELECT * FROM int8_tbl x
    LEFT JOIN (SELECT q1, coalesce(q2,0) q2 FROM int8_tbl) y ON x.q2 = y.q1,
    LATERAL (VALUES (x.q1, y.q1, y.q2)) v(xq1, yq1, yq2);

The chosen physical plan placed the LATERAL VALUES inside a Nested Loop whose outer was just x, but the VALUES references colids produced by y, which is on a different side of the enclosing LeftOuter. At execution time those refs are unbound; DXL→PlStmt catches it as "Attribute number N not found in project list" and falls back to the PG planner.

Precompute per-atom sibling requirements in the DPv2 constructor:

    outer_refs_i        = atom_i.DeriveOuterReferences()
    sibling_refs        = outer_refs_i − m_outer_refs    // refs to NAryJoin siblings, not refs escaping the join
    sibling_required[i] = { j | sibling_refs ∩ atom_j.DeriveOutputColumns() ≠ ∅ }

In GetJoinExpr, reject any candidate join whose combined atom set is missing a required sibling of one of its members. The DP table then never enumerates the unexecutable subset, and the LATERAL atom can only enter the join once all its required siblings are already in.

Effect on test/sql/pg_lateral.sql (the PostgreSQL upstream LATERAL section ported over): fallbacks drop from 22 to 12, ORCA-handled EXPLAINs go from 15 to 17, and the "DXL-to-PlStmt Translation: Attribute number N not found" pattern from this query shape disappears. Other ORCA tests, the PG --pg-tests suite, and cost_align.sh are unchanged.

Three back-to-back fresh-instance pg_lateral runs land at ~358 ms with deterministic plans.
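
The sibling-requirement precomputation and the rejection test translate almost directly into set code. The following is a minimal sketch of that logic, not the CJoinOrderDPv2 implementation: `Atom`, `ComputeSiblingRequired` and `IsCandidateAllowed` are hypothetical names, atoms are identified by their index, and columns by integer ids.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

using ColSet = std::set<int>;           // stand-in for CColRefSet
using AtomSet = std::set<std::size_t>;  // a subset of NAryJoin atoms, by index

// Simplified model of one NAryJoin atom.
struct Atom {
    ColSet output;
    ColSet outer_refs;
};

// sibling_required[i] = { j | (outer_refs_i - join_outer_refs)
//                              intersects atom_j.output }
std::vector<AtomSet> ComputeSiblingRequired(const std::vector<Atom>& atoms,
                                            const ColSet& join_outer_refs) {
    std::vector<AtomSet> required(atoms.size());
    for (std::size_t i = 0; i < atoms.size(); ++i) {
        // Drop refs that escape the join; only sibling refs matter here.
        ColSet sibling_refs;
        std::set_difference(atoms[i].outer_refs.begin(),
                            atoms[i].outer_refs.end(),
                            join_outer_refs.begin(), join_outer_refs.end(),
                            std::inserter(sibling_refs, sibling_refs.end()));
        for (std::size_t j = 0; j < atoms.size(); ++j) {
            if (j == i) continue;
            for (int col : sibling_refs) {
                if (atoms[j].output.count(col) > 0) {
                    required[i].insert(j);
                    break;
                }
            }
        }
    }
    return required;
}

// A candidate atom set is allowed only if every member's required
// siblings are also members.
bool IsCandidateAllowed(const AtomSet& candidate,
                        const std::vector<AtomSet>& required) {
    for (std::size_t member : candidate) {
        for (std::size_t needed : required[member]) {
            if (candidate.count(needed) == 0) return false;
        }
    }
    return true;
}
```

Modelling the query above as atoms x (index 0), y (index 1) and v (index 2, with outer refs into both x and y), the subset {0, 2} is rejected while {0, 1} and {0, 1, 2} are allowed, so v can only join once both of its producers are present.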

Re-captures test/expected/pg_lateral.out after the DPv2 sibling-visibility enforcement. Bottom-line change:

    fallback INFO lines:          22 -> 12  (-10)
    "Optimizer: pg_orca" markers: 15 -> 17  (+2)

The "DXL-to-PlStmt Translation: Attribute number N not found in project list" class of failure that came from the LeftOuter + LATERAL VALUES atom-subset bug is now gone. Remaining 12 fallbacks are unrelated patterns (Whole-row variable, "no plan computed for required properties", and the four intentional PG-side SQL errors from upstream's join.sql).

cr-gpt · 2026-06-20T00:09:16Z

It seems you are using me but have not set OPENAI_API_KEY in Variables/Secrets for this repo. You can follow the README for more information.

CI ran pg_lateral with $PG_REGRESS_SQL unset, so the

    \set tenkdata `echo "$PG_REGRESS_SQL/data/tenk.data"`

trick expanded to "/data/tenk.data" and the server-side COPY failed:

    ERROR: could not open file "/data/tenk.data" for reading: No such file or directory

Copy PostgreSQL's tenk.data (670 KB, 10 000 rows) into the repo at test/data/tenk.data and locate it via pg_regress's :abs_srcdir variable (populated from PG_ABS_SRCDIR with \getenv). That env var is set unconditionally by pg_regress to the --inputdir argument, so the COPY now resolves wherever the test runs.

The file is verbatim from PG REL_18_3 src/test/regress/data/tenk.data; since the test header already declares the section as a port of PG's join.sql LATERAL block, shipping the matching data file alongside keeps the test reproducible without an external postgres source tree.

Verified locally with three back-to-back fresh-instance --orca-tests runs: pg_lateral now passes in ~350 ms on each.

cr-gpt · 2026-06-20T00:37:06Z

It seems you are using me but have not set OPENAI_API_KEY in Variables/Secrets for this repo. You can follow the README for more information.

…tchlevels

CI's Debug build (apt postgresql-server-dev-18, likely 18.4+) renders the EXPLAIN VERBOSE output of

    select * from int4_tbl i left join lateral
      (select coalesce(i) from int2_tbl j where i.f1 = j.f1) k on true;

as

    Output: i.f1, (i.*)
    ...
    Output: j.f1, i.*

while the older PG18.3 (local dev box) emits the COALESCE wrapper:

    Output: i.f1, (COALESCE(i.*))
    ...
    Output: j.f1, COALESCE(i.*)

This is a PG-side deparser/simplification change: newer PG18 elides COALESCE on a whole-row reference inside a LATERAL output list when the planner can prove the row is non-null in that position. The underlying query goes through ORCA's "Whole-row variable" fallback, so the output comes from PG's planner+executor; ORCA doesn't see this difference.

Add a matchsubs rule that strips COALESCE(<ident>.*) wrappers in the test output so the same expected file works regardless of which PG18 patchlevel CI installs.
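
The normalisation that the matchsubs rule performs can be illustrated with a regular expression. This is only a sketch of the substitution, written with std::regex: the real rule lives in the test harness and its syntax is not shown in this PR, and `StripWholeRowCoalesce` is a hypothetical name.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Rewrites COALESCE(<ident>.*) to <ident>.* so that both PG18 patchlevel
// renderings compare equal after normalisation.
std::string StripWholeRowCoalesce(const std::string& line) {
    // Group 1 captures "<ident>.*", e.g. "i.*".
    static const std::regex wrapper(R"re(COALESCE\((\w+\.\*)\))re");
    return std::regex_replace(line, wrapper, "$1");
}
```

Applied to the PG18.3 line `Output: i.f1, (COALESCE(i.*))`, the function yields `Output: i.f1, (i.*)`, which is the newer rendering; lines without the wrapper pass through unchanged.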

cr-gpt · 2026-06-20T01:20:03Z

It seems you are using me but have not set OPENAI_API_KEY in Variables/Secrets for this repo. You can follow the README for more information.

yjhjstz added 10 commits June 19, 2026 14:51

yjhjstz merged commit f705c04 into main Jun 23, 2026
9 checks passed

yjhjstz merged 12 commits into main from lateral-pg19
