[SPARK-27934][SQL][TEST] Port case.sql #24782

wangyum · 2019-06-03T14:08:08Z

What changes were proposed in this pull request?

This PR ports case.sql from the PostgreSQL regression tests: https://github.com/postgres/postgres/blob/REL_12_BETA1/src/test/regress/sql/case.sql

The expected results can be found in the link: https://github.com/postgres/postgres/blob/REL_12_BETA1/src/test/regress/expected/case.out

While porting the test cases, we found one PostgreSQL-specific feature that does not exist in Spark SQL:

SPARK-27930: Add built-in Math Function: RANDOM

How was this patch tested?

N/A

wangyum · 2019-06-03T14:12:22Z

sql/core/src/test/resources/sql-tests/inputs/pgSQL/case.sql

+-- Test the case statement
+--
+-- There are 2 join condition is missing in this test case. we set spark.sql.crossJoin.enabled=true.
+set spark.sql.crossJoin.enabled=true;


We need to set spark.sql.crossJoin.enabled=true; otherwise the following query fails:

-- !query 30
SELECT '' AS Five, NULLIF(a.i,b.i) AS `NULLIF(a.i,b.i)`, NULLIF(b.i, 4) AS `NULLIF(b.i,4)`
  FROM CASE_TBL a, CASE2_TBL b
-- !query 30 schema
struct<>
-- !query 30 output
org.apache.spark.sql.AnalysisException
Detected implicit cartesian product for INNER join between logical plans
Project [i#x]
+- Relation[i#x,f#x] parquet
and
Project [i#x]
+- Relation[i#x,j#x] parquet
Join condition is missing or trivial.
Either: use the CROSS JOIN syntax to allow cartesian products between these relations, or: enable implicit cartesian products by setting the configuration variable spark.sql.crossJoin.enabled=true;

wangyum · 2019-06-03T14:18:11Z

sql/core/src/test/resources/sql-tests/results/pgSQL/case.sql.out

+SELECT CASE WHEN i > 100 THEN 1/0 ELSE 0 END FROM case_tbl
+-- !query 20 schema
+struct<CASE WHEN (i > 100) THEN (CAST(1 AS DOUBLE) / CAST(0 AS DOUBLE)) ELSE CAST(0 AS DOUBLE) END:double>
+-- !query 20 output


PostgreSQL throws ERROR: division by zero: https://github.com/postgres/postgres/blob/REL_12_BETA1/src/test/regress/expected/case.out#L98-L99
I created SPARK-27923 to track this: https://issues.apache.org/jira/browse/SPARK-27923

wangyum · 2019-06-03T14:20:29Z

sql/core/src/test/resources/sql-tests/inputs/pgSQL/case.sql

+  FROM CASE_TBL a, CASE2_TBL b
+  WHERE COALESCE(f,b.i) = 2;
+
+-- We don't support update now.


Should we skip the UPDATE cases? I added a comment here.

Hi, @gatorsmile and @wangyum. Half of the file consists of comments that are irrelevant to Apache Spark. Do we need to keep all of these invalid comments? The original file will also change from time to time. For cases that are simply invalid (with no SPARK JIRA), shall we skip adding comments?

wangyum · 2019-06-03T14:20:51Z

sql/core/src/test/resources/sql-tests/inputs/pgSQL/case.sql

+
+-- SELECT * FROM CASE_TBL;
+
+-- We don't support the features below:


Should we skip these cases? I added a comment here.

SparkQA · 2019-06-03T17:13:41Z

Test build #106111 has finished for PR 24782 at commit 58e9c2d.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2019-06-03T21:18:04Z

Could you fix the following PR description? case.sql instead of AGGREGATES.sql?

This PR is to port AGGREGATES.sql from PostgreSQL regression tests.

wangyum · 2019-06-04T00:42:40Z

Thank you @dongjoon-hyun

dongjoon-hyun · 2019-06-09T02:29:25Z

sql/core/src/test/resources/sql-tests/inputs/pgSQL/case.sql

+-- https://github.com/postgres/postgres/blob/REL_12_BETA1/src/test/regress/sql/case.sql
+-- Test the case statement
+--
+-- There are 2 join condition is missing in this test case. we set spark.sql.crossJoin.enabled=true.


is missing -> which is missing or missed?

we set -> We set.

BTW, the relation between the missing join conditions and crossJoin is not clear.

How about rewriting it to the following text?

This test suite contains two Cartesian products without using explicit CROSS JOIN syntax. Thus, we set spark.sql.crossJoin.enabled to true.
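To make the suggested wording concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine. This is an assumption for illustration only: it is not Spark, SQLite never rejects comma joins, and the rows below are hypothetical rather than the data in case.sql. It only shows that a comma-separated FROM list with no join condition returns the same rows as an explicit CROSS JOIN, which is the implicit Cartesian product that spark.sql.crossJoin.enabled=true allows.

```python
import sqlite3

# Stand-in engine (SQLite, not Spark) with hypothetical rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE case_tbl (i INTEGER, f REAL);
    CREATE TABLE case2_tbl (i INTEGER, j INTEGER);
    INSERT INTO case_tbl VALUES (1, 10.1), (2, 20.2), (3, -30.3);
    INSERT INTO case2_tbl VALUES (1, -1), (2, -2);
""")

# Implicit Cartesian product: comma join with no join condition.
implicit = conn.execute(
    "SELECT NULLIF(a.i, b.i), NULLIF(b.i, 4) "
    "FROM case_tbl a, case2_tbl b ORDER BY a.i, b.i"
).fetchall()

# The same query written with explicit CROSS JOIN syntax.
explicit = conn.execute(
    "SELECT NULLIF(a.i, b.i), NULLIF(b.i, 4) "
    "FROM case_tbl a CROSS JOIN case2_tbl b ORDER BY a.i, b.i"
).fetchall()

# Both forms return every pairing of the 3 x 2 rows.
print(len(implicit), implicit == explicit)
```

In Spark the first form is the one that raises the AnalysisException unless the configuration is set, so the comment in case.sql should say that the queries rely on implicit Cartesian products rather than that a condition "is missing".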

dongjoon-hyun · 2019-06-09T02:29:51Z

sql/core/src/test/resources/sql-tests/inputs/pgSQL/case.sql

+--
+-- CASE
+-- https://github.com/postgres/postgres/blob/REL_12_BETA1/src/test/regress/sql/case.sql
+-- Test the case statement


I know this is copied from the original, but for readability, the case statement -> the CASE statement

gatorsmile · 2019-06-10T09:51:02Z

sql/core/src/test/resources/sql-tests/inputs/pgSQL/case.sql

+
+-- [SPARK-27930] Add built-in Math Function: RANDOM
+-- SELECT '7' AS `None`,
+--   CASE WHEN random() < 0 THEN 1


Can we first use rand()?

OK. Rewrote it to use rand().

gatorsmile · 2019-06-10T09:56:05Z

sql/core/src/test/resources/sql-tests/inputs/pgSQL/case.sql

+--   'begin return $1; end' language plpgsql volatile;
+
+-- SELECT CASE
+--   (CASE vol('bar')


This test case is pretty useful. Could we use a udf here?

Yes. Added vol to SQLQueryTestSuite.
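As a rough illustration of what such a UDF does, the sketch below registers an identity function named vol in SQLite through Python's sqlite3 module. This is a stand-in and not the code added to SQLQueryTestSuite; the nested CASE query is a hypothetical example in the spirit of the commented-out test, and the calls list exists only to show that the function runs at query time.

```python
import sqlite3

calls = []  # records each argument, only to show vol runs at query time


def vol(value):
    """Identity function, mirroring 'begin return $1; end'."""
    calls.append(value)
    return value


conn = sqlite3.connect(":memory:")
# Registered without deterministic=True, so the engine may not assume
# repeated calls with the same input give the same result.
conn.create_function("vol", 1, vol)

# Hypothetical nested CASE that depends on the UDF result.
query = """
SELECT CASE (CASE vol('bar')
               WHEN 'foo' THEN 'it was foo!'
               WHEN 'bar' THEN 'it was bar!'
             END)
         WHEN 'it was foo!' THEN 'foo recognized'
         WHEN 'it was bar!' THEN 'bar recognized'
         ELSE 'unrecognized'
       END
"""
result = conn.execute(query).fetchone()[0]
print(result)
```

Because the function is opaque to the planner, the inner CASE cannot be folded to a constant ahead of time, which is what makes this kind of test case useful.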

SparkQA · 2019-06-10T15:41:11Z

Test build #106350 has finished for PR 24782 at commit 89e3a9e.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2019-06-11T07:48:58Z

sql/core/src/test/resources/sql-tests/results/pgSQL/case.sql.out

+
+
+-- !query 21
+SELECT CASE WHEN i > 100 THEN 1/0 ELSE 0 END FROM case_tbl


Your comment https://github.com/apache/spark/pull/24782/files#r289867687 should be moved here.

gatorsmile · 2019-06-11T07:56:23Z

LGTM

Merged to master.

## What changes were proposed in this pull request?

This PR ports case.sql from the PostgreSQL regression tests: https://github.com/postgres/postgres/blob/REL_12_BETA1/src/test/regress/sql/case.sql

The expected results can be found in the link: https://github.com/postgres/postgres/blob/REL_12_BETA1/src/test/regress/expected/case.out

While porting the test cases, we found one PostgreSQL-specific feature that does not exist in Spark SQL:

- [SPARK-27930](https://issues.apache.org/jira/browse/SPARK-27930): Add built-in Math Function: RANDOM

## How was this patch tested?

N/A

Closes apache#24782 from wangyum/SPARK-27934.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>

wangyum added 3 commits June 2, 2019 23:36

Add case.sql

f8fe7f1

Add result

06c493b

Fix file name.

58e9c2d

dongjoon-hyun reviewed Jun 9, 2019

dongjoon-hyun changed the title ~~[SPARK-27934][SQL] Port case.sql~~ [SPARK-27934][SQL][TEST] Port case.sql Jun 9, 2019

gatorsmile reviewed Jun 10, 2019

wangyum added 2 commits June 10, 2019 18:09

Merge remote-tracking branch 'upstream/master' into SPARK-27934

09a83bc

address comment

89e3a9e

gatorsmile reviewed Jun 11, 2019

gatorsmile closed this in 6284ac7 Jun 11, 2019

dongjoon-hyun added the SQL label Feb 5, 2020
