Skip to content

Commit

Permalink
[SPARK-28894][SQL][TESTS] Add a clue to make it easier to debug via J…
Browse files Browse the repository at this point in the history
…enkins's test results

### What changes were proposed in this pull request?

See https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109834/testReport/junit/org.apache.spark.sql/SQLQueryTestSuite/

![Screen Shot 2019-08-28 at 4 08 58 PM](https://user-images.githubusercontent.com/6477701/63833484-2a23ea00-c9ae-11e9-91a1-0859cb183fea.png)

```xml
<?xml version="1.0" encoding="UTF-8"?>
<testsuite hostname="C02Y52ZLJGH5" name="org.apache.spark.sql.SQLQueryTestSuite" tests="3" errors="0" failures="0" skipped="0" time="14.475">
    ...
    <testcase classname="org.apache.spark.sql.SQLQueryTestSuite" name="sql - Scala UDF" time="6.703">
    </testcase>
    <testcase classname="org.apache.spark.sql.SQLQueryTestSuite" name="sql - Regular Python UDF" time="4.442">
    </testcase>
    <testcase classname="org.apache.spark.sql.SQLQueryTestSuite" name="sql - Scalar Pandas UDF" time="3.33">
    </testcase>
    <system-out/>
    <system-err/>
</testsuite>
```

Root cause seems a bug in SBT - it truncates the test name based on the last dot.

sbt/sbt#2949
https://github.com/sbt/sbt/blob/v0.13.18/testing/src/main/scala/sbt/JUnitXmlTestsListener.scala#L71-L79

I tried to find a better way but couldn't find. Therefore, this PR proposes a workaround by appending the test file name into the assert log:

```diff
  [info] - inner-join.sql *** FAILED *** (4 seconds, 306 milliseconds)
+ [info]   inner-join.sql
  [info]   Expected "1	a
  [info]   1	a
  [info]   1	b
  [info]   1[]", but got "1	a
  [info]   1	a
  [info]   1	b
  [info]   1[	b]" Result did not match for query #6
  [info]   SELECT tb.* FROM ta INNER JOIN tb ON ta.a = tb.a AND ta.tag = tb.tag (SQLQueryTestSuite.scala:377)
  [info]   org.scalatest.exceptions.TestFailedException:
  [info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:528)
```

It will at least prevent us to search full logs to identify which test file is failed by clicking filed test.

Note that this PR does not fully fix the issue but only fix the logs on its failed tests.

### Why are the changes needed?
To debug Jenkins logs easier. Otherwise, we should open full logs and search which test was failed.

### Does this PR introduce any user-facing change?
It will print out the file name of failed tests in Jenkins' test reports.

### How was this patch tested?
Manually tested but Jenkins tests are required in this PR.

Now it at least shows which file it is:

![Screen Shot 2019-08-30 at 10 16 32 PM](https://user-images.githubusercontent.com/6477701/64023705-de22a200-cb73-11e9-8806-2e98ad35adef.png)

Closes apache#25630 from HyukjinKwon/SPARK-28894-1.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
  • Loading branch information
HyukjinKwon authored and dongjoon-hyun committed Aug 30, 2019
1 parent 3b07a4e commit 7cc0f0e
Showing 1 changed file with 35 additions and 30 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -342,39 +342,44 @@ class SQLQueryTestSuite extends QueryTest with SharedSparkSession {
stringToFile(resultFile, goldenOutput)
}

// Read back the golden file.
val expectedOutputs: Seq[QueryOutput] = {
val goldenOutput = fileToString(new File(testCase.resultFile))
val segments = goldenOutput.split("-- !query.+\n")

// each query has 3 segments, plus the header
assert(segments.size == outputs.size * 3 + 1,
s"Expected ${outputs.size * 3 + 1} blocks in result file but got ${segments.size}. " +
s"Try regenerate the result files.")
Seq.tabulate(outputs.size) { i =>
QueryOutput(
sql = segments(i * 3 + 1).trim,
schema = segments(i * 3 + 2).trim,
output = segments(i * 3 + 3).replaceAll("\\s+$", "")
)
// This is a temporary workaround for SPARK-28894. The test names are truncated after
// the last dot due to a bug in SBT. This makes easier to debug via Jenkins test result
// report. See SPARK-28894.
withClue(s"${testCase.name}${System.lineSeparator()}") {
// Read back the golden file.
val expectedOutputs: Seq[QueryOutput] = {
val goldenOutput = fileToString(new File(testCase.resultFile))
val segments = goldenOutput.split("-- !query.+\n")

// each query has 3 segments, plus the header
assert(segments.size == outputs.size * 3 + 1,
s"Expected ${outputs.size * 3 + 1} blocks in result file but got ${segments.size}. " +
s"Try regenerate the result files.")
Seq.tabulate(outputs.size) { i =>
QueryOutput(
sql = segments(i * 3 + 1).trim,
schema = segments(i * 3 + 2).trim,
output = segments(i * 3 + 3).replaceAll("\\s+$", "")
)
}
}
}

// Compare results.
assertResult(expectedOutputs.size, s"Number of queries should be ${expectedOutputs.size}") {
outputs.size
}

outputs.zip(expectedOutputs).zipWithIndex.foreach { case ((output, expected), i) =>
assertResult(expected.sql, s"SQL query did not match for query #$i\n${expected.sql}") {
output.sql
}
assertResult(expected.schema,
s"Schema did not match for query #$i\n${expected.sql}: $output") {
output.schema
// Compare results.
assertResult(expectedOutputs.size, s"Number of queries should be ${expectedOutputs.size}") {
outputs.size
}
assertResult(expected.output, s"Result did not match for query #$i\n${expected.sql}") {
output.output

outputs.zip(expectedOutputs).zipWithIndex.foreach { case ((output, expected), i) =>
assertResult(expected.sql, s"SQL query did not match for query #$i\n${expected.sql}") {
output.sql
}
assertResult(expected.schema,
s"Schema did not match for query #$i\n${expected.sql}: $output") {
output.schema
}
assertResult(expected.output, s"Result did not match for query #$i\n${expected.sql}") {
output.output
}
}
}
}
Expand Down

0 comments on commit 7cc0f0e

Please sign in to comment.