[SPARK-29648][SQL][TESTS] Port limit.sql #26311
Conversation
```sql
-- Test null limit and offset. The planner would discard a simple null
-- constant, so to ensure the executor is exercised, do this:
-- [SPARK-29650] Discard a NULL constant in LIMIT
select * from int8_tbl limit (case when random() < 0.5 then bigint(null) end);
```
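The comment above describes why the NULL is wrapped in a `case`/`random()` expression: a bare NULL limit is foldable and could be discarded at plan time, while a non-deterministic expression must survive to execution. A toy sketch of that planner behavior (an illustration under stated assumptions, not Spark's actual optimizer code):

```python
# Toy model of the behavior the test comment describes: a plain NULL limit
# expression is foldable and can be handled at plan time, but wrapping it in
# an expression that calls random() makes it non-deterministic, so the
# planner must leave it for the executor to evaluate. The expressions are
# modeled as plain strings for illustration only.

def is_foldable(expr: str) -> bool:
    """An expression is foldable when it contains no non-deterministic call."""
    return "random()" not in expr

def plan_limit(expr: str) -> str:
    """Return where the limit expression gets evaluated in this toy model."""
    return "folded at plan time" if is_foldable(expr) else "evaluated by executor"

print(plan_limit("bigint(null)"))
print(plan_limit("case when random() < 0.5 then bigint(null) end"))
```

Under this model, only the wrapped form forces the executor path, which is exactly what the regression test wants to exercise.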
Test build #112892 has finished for PR 26311 at commit
retest this please
Test build #112898 has finished for PR 26311 at commit
Test build #112894 has finished for PR 26311 at commit
```sql
-- select sum(tenthous) as s1, sum(tenthous) + random()*0 as s2
-- from tenk1 group by thousand order by thousand limit 3;
select sum(tenthous) as s1, sum(tenthous) + random()*0 as s2
```
Although I believe the Apache Spark behavior (`random()*0` == `0.0`) is more natural, do we have a JIRA issue for this difference from PostgreSQL, where `random()*0` == `0`?
Sorry, but I missed your point; the result seems to be the same:
```sql
select sum(tenthous) as s1, sum(tenthous) + random()*0 as s2
from tenk1 group by thousand order by thousand limit 3;
```
```
  s1   |  s2
-------+-------
 45000 | 45000
 45010 | 45010
 45020 | 45020
(3 rows)
```
Also, I tried the queries below:
```
postgres=# select * from (values (1)) t(v) where random() * 0.0 = 0.0;
 v
---
 1
(1 row)
```
```
scala> sql("""select * from (values (1)) t(v) where random() * 0.0 = 0.0""").show()
+---+
|  v|
+---+
|  1|
+---+
```
What's the difference here?
The generated result is not the same in this PR.
+1, LGTM (except one minor request to add a JIRA ID comment for the difference).
Then, merged to master! Thank you, @maropu!
What changes were proposed in this pull request?
This PR ports limit.sql from the PostgreSQL regression tests: https://github.com/postgres/postgres/blob/REL_12_STABLE/src/test/regress/sql/limit.sql
The expected results can be found at: https://github.com/postgres/postgres/blob/REL_12_STABLE/src/test/regress/expected/limit.out
Why are the changes needed?
To check behaviour differences between Spark and PostgreSQL.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Passed the Jenkins tests, and compared the output with the PostgreSQL results.