Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-42874][SQL] Enable new golden file test framework for analysis for all input files #40496

Closed
wants to merge 24 commits into from

Conversation

dtenedor
Copy link
Contributor

What changes were proposed in this pull request?

This PR enables the new golden file test framework for analysis for all input files.

Background:

  • In [SPARK-42791][SQL] Create a new golden file test framework for analysis #40449 we added the ability to exercise the analyzer on the SQL queries in existing golden files in the sql/core/src/test/resources/sql-tests/inputs directory, writing separate output test files in the new sql/core/src/test/resources/sql-tests/analyzer-results directory in additional to the original output directory for full end-to-end query execution results.
  • That PR also added an allowlist of input files to include in this new dual-run mode.
  • In this PR, we remove that allowlist exercise the new dual-run mode for all the input files. We also extend the analyzer testing to support separate test cases in ANSI-mode, TimestampNTZ, and UDFs.

Why are the changes needed?

This improves test coverage and helps prevent against accidental regressions in the future as we edit the code.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

This PR adds testing only.

@github-actions github-actions bot added the SQL label Mar 20, 2023
@HyukjinKwon
Copy link
Member

LGTM if tests pass

commit

commit

commit

commit

commit

commit

commit

commit

commit
@amaliujia
Copy link
Contributor

LGTM

@dtenedor
Copy link
Contributor Author

@HyukjinKwon the tests are passing now, this is ready to merge if you are ready :)

@LuciferYang
Copy link
Contributor

@dtenedor Wait a few minutes for me to check with Scala 2.13 manually

@LuciferYang
Copy link
Contributor

@dtenedor Wait a few minutes for me to check with Scala 2.13 manually

done, should be ok ~

Copy link
Contributor

@LuciferYang LuciferYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@HyukjinKwon
Copy link
Member

Merged to master.

@HyukjinKwon
Copy link
Member

I think ANSI test fails after this PR:

[info] - timestampNTZ/datetime-special.sql_analyzer_test *** FAILED *** (31 milliseconds)
[info]   timestampNTZ/datetime-special.sql_analyzer_test
[info]   Expected "...date(999999, 3, 18, [false) AS make_date(999999, 3, 18)#x, make_date(-1, 1, 28, fals]e) AS make_date(-1, ...", but got "...date(999999, 3, 18, [true) AS make_date(999999, 3, 18)#x, make_date(-1, 1, 28, tru]e) AS make_date(-1, ..." Result did not match for query #1
[info]   select make_date(999999, 3, 18), make_date(-1, 1, 28) (SQLQueryTestSuite.scala:777)
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
[info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
[info]   at org.scalatest.funsuite.AnyFunSuite.newAssertionFailedException(AnyFunSuite.scala:1564)
[info]   at org.scalatest.Assertions.assertResult(Assertions.scala:847)
[info]   at org.scalatest.Assertions.assertResult$(Assertions.scala:842)
[info]   at org.scalatest.funsuite.AnyFunSuite.assertResult(AnyFunSuite.scala:1564)
[info]   at org.apache.spark.sql.SQLQueryTestSuite.$anonfun$readGoldenFileAndCompareResults$3(SQLQueryTestSuite.scala:777)
[info]   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)

https://github.com/apache/spark/actions/runs/4496107425/jobs/7910457741

@dtenedor mind taking a look please? cc @gengliangwang

@dtenedor
Copy link
Contributor Author

dtenedor commented Mar 24, 2023 via email

@dtenedor
Copy link
Contributor Author

@HyukjinKwon I ran the test locally and it passes. Maybe it is fixed at head now?

@LuciferYang
Copy link
Contributor

[info] - timestampNTZ/datetime-special.sql_analyzer_test *** FAILED *** (11 milliseconds)
[info]   timestampNTZ/datetime-special.sql_analyzer_test
[info]   Expected "...date(999999, 3, 18, [false) AS make_date(999999, 3, 18)#x, make_date(-1, 1, 28, fals]e) AS make_date(-1, ...", but got "...date(999999, 3, 18, [true) AS make_date(999999, 3, 18)#x, make_date(-1, 1, 28, tru]e) AS make_date(-1, ..." Result did not match for query #1
[info]   select make_date(999999, 3, 18), make_date(-1, 1, 28) (SQLQueryTestSuite.scala:777)
[info]   org.scalatest.exceptions.TestFailedException:

@dtenedor

Seems not fixed, run

SPARK_ANSI_SQL_MODE=true build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"

can reproduce the failure

@dtenedor
Copy link
Contributor Author

Looks like @LuciferYang fixed it with #40552. Thanks so much for the fix!

HyukjinKwon pushed a commit that referenced this pull request Mar 27, 2023
…nto w/ and w/o `ansi` suffix to pass sql analyzer test in ansi mode

### What changes were proposed in this pull request?
After #40496, run

```
SPARK_ANSI_SQL_MODE=true build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
```

There is one test faild with `spark.sql.ansi.enabled = true`

```
[info] - timestampNTZ/datetime-special.sql_analyzer_test *** FAILED *** (11 milliseconds)
[info]   timestampNTZ/datetime-special.sql_analyzer_test
[info]   Expected "...date(999999, 3, 18, [false) AS make_date(999999, 3, 18)#x, make_date(-1, 1, 28, fals]e) AS make_date(-1, ...", but got "...date(999999, 3, 18, [true) AS make_date(999999, 3, 18)#x, make_date(-1, 1, 28, tru]e) AS make_date(-1, ..." Result did not match for query #1
[info]   select make_date(999999, 3, 18), make_date(-1, 1, 28) (SQLQueryTestSuite.scala:777)
[info]   org.scalatest.exceptions.TestFailedException:
```

The failure reason is the last parameter of function `MakeDate` is `failOnError: Boolean = SQLConf.get.ansiEnabled`.

So this pr split `timestampNTZ/datetime-special.sql` into w/ and w/o ansi to mask this test difference.

### Why are the changes needed?
Make SQLQueryTestSuite test pass with `spark.sql.ansi.enabled = true`.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manual checked `SPARK_ANSI_SQL_MODE=true build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"`

Closes #40552 from LuciferYang/SPARK-42921.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants