Test: join_precedence
File: sqlparser_common.rs
Description:
The following query:
SELECT *
FROM t1
NATURAL JOIN t5
INNER JOIN t0 ON (t0.v1 + t5.v0) > 0
WHERE t0.v1 = t1.v0;
is canonicalized for the Snowflake dialect as:
SELECT * FROM t1 NATURAL JOIN (t5 INNER JOIN t0 ON (t0.v1 + t5.v0) > 0) WHERE t0.v1 = t1.v0
However, this rewrite is not semantics-preserving.
NATURAL JOIN implicitly creates a join condition based on all columns
that have the same name in both inputs.
By moving t0 inside the right side of the NATURAL JOIN,
the set of columns visible to the NATURAL JOIN changes.
As a result, additional implicit join predicates may be introduced,
changing the query result.
This behavior violates the expectation that canonicalization should
not alter query semantics.
Minimal reproducible example:
CREATE OR REPLACE TEMP TABLE t1 (v0 INT, v1 INT);
CREATE OR REPLACE TEMP TABLE t5 (v0 INT);
CREATE OR REPLACE TEMP TABLE t0 (v1 INT);
INSERT INTO t1 VALUES (1, 100), (1, 200);
INSERT INTO t5 VALUES (1);
INSERT INTO t0 VALUES (1), (2);
-- Q1: original query
SELECT COUNT(*) AS cnt
FROM t1
NATURAL JOIN t5
INNER JOIN t0 ON (t0.v1 + t5.v0) > 0
WHERE t0.v1 = t1.v0;
-- Q2: semantics-preserving canonical form
SELECT COUNT(*) AS cnt
FROM (t1 NATURAL JOIN t5)
INNER JOIN t0 ON (t0.v1 + t5.v0) > 0
WHERE t0.v1 = t1.v0;
-- Q3: canonical form currently produced for Snowflake
SELECT COUNT(*) AS cnt
FROM t1
NATURAL JOIN (t5 INNER JOIN t0 ON (t0.v1 + t5.v0) > 0)
WHERE t0.v1 = t1.v0;
-- Expected: Q1 == Q2
-- Actual: Q3 returns a different result
Test: join_precedence
File: sqlparser_common.rs
Description:
The following query:
SELECT *
FROM t1
NATURAL JOIN t5
INNER JOIN t0 ON (t0.v1 + t5.v0) > 0
WHERE t0.v1 = t1.v0;
is canonicalized for the Snowflake dialect as:
SELECT * FROM t1 NATURAL JOIN (t5 INNER JOIN t0 ON (t0.v1 + t5.v0) > 0) WHERE t0.v1 = t1.v0
However, this rewrite is not semantics-preserving.
NATURAL JOIN implicitly creates a join condition based on all columns
that have the same name in both inputs.
By moving t0 inside the right side of the NATURAL JOIN,
the set of columns visible to the NATURAL JOIN changes.
As a result, additional implicit join predicates may be introduced,
changing the query result.
This behavior violates the expectation that canonicalization should
not alter query semantics.
Minimal reproducible example:
CREATE OR REPLACE TEMP TABLE t1 (v0 INT, v1 INT);
CREATE OR REPLACE TEMP TABLE t5 (v0 INT);
CREATE OR REPLACE TEMP TABLE t0 (v1 INT);
INSERT INTO t1 VALUES (1, 100), (1, 200);
INSERT INTO t5 VALUES (1);
INSERT INTO t0 VALUES (1), (2);
-- Q1: original query
SELECT COUNT(*) AS cnt
FROM t1
NATURAL JOIN t5
INNER JOIN t0 ON (t0.v1 + t5.v0) > 0
WHERE t0.v1 = t1.v0;
-- Q2: semantics-preserving canonical form
SELECT COUNT(*) AS cnt
FROM (t1 NATURAL JOIN t5)
INNER JOIN t0 ON (t0.v1 + t5.v0) > 0
WHERE t0.v1 = t1.v0;
-- Q3: canonical form currently produced for Snowflake
SELECT COUNT(*) AS cnt
FROM t1
NATURAL JOIN (t5 INNER JOIN t0 ON (t0.v1 + t5.v0) > 0)
WHERE t0.v1 = t1.v0;
-- Expected: Q1 == Q2
-- Actual: Q3 returns a different result