Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-18292][SQL] LogicalPlanToSQLSuite should not use resource dependent path for golden file generation #15789

Closed
wants to merge 2 commits into from

Conversation

dongjoon-hyun
Copy link
Member

@dongjoon-hyun dongjoon-hyun commented Nov 6, 2016

What changes were proposed in this pull request?

LogicalPlanToSQLSuite uses the following command to update the existing answer files.

SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "hive/test-only *LogicalPlanToSQLSuite"

However, after introducing getTestResourcePath, it fails to update the previous golden answer files in the predefined directory. This issue aims to fix that.

How was this patch tested?

It's a testsuite update. Manual.

@dongjoon-hyun
Copy link
Member Author

Hi, @gatorsmile .
I create a PR for LogicalPlanToSQLSuite part from #15546 .

@SparkQA
Copy link

SparkQA commented Nov 6, 2016

Test build #68233 has finished for PR 15789 at commit bd7258d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Oh, you mean the the following. I'll fix this PR again. Sorry, @gatorsmile .

  private val baseResourcePath = {
    // If regenerateGoldenFiles is true, we must be running this in SBT and we use hard-coded
    // relative path. Otherwise, we use classloader's getResource to find the location.
    if (regenerateGoldenFiles) {
      java.nio.file.Paths.get("src", "test", "resources", "sql-tests").toFile
    } else {
      val res = getClass.getClassLoader.getResource("sql-tests")
      new File(res.getFile)
    }
  }

@dongjoon-hyun
Copy link
Member Author

Retest this please

@dongjoon-hyun
Copy link
Member Author

It's fixed like SQLQueryTestSuite.

@dongjoon-hyun
Copy link
Member Author

Retest this please.

@SparkQA
Copy link

SparkQA commented Nov 6, 2016

Test build #68232 has finished for PR 15789 at commit 725cd45.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

There occurs four randomized aggregation tests failures due to OutOfMemoryError. But, it seems to be irrelevant. I'll wait the next test running currently.

[info] - randomized aggregation test - [typed, with partial + safe] - with grouping keys - with non-empty input *** FAILED *** (1 second, 512 milliseconds)
[info]   org.apache.spark.SparkException: Job aborted due to stage failure: Task 4 in stage 1897.0 failed 1 times, most recent failure: Lost task 4.0 in stage 1897.0 (TID 6408, localhost, executor driver): java.lang.OutOfMemoryError: Java heap space

@dongjoon-hyun
Copy link
Member Author

Retest this please

@SparkQA
Copy link

SparkQA commented Nov 6, 2016

Test build #68237 has finished for PR 15789 at commit bd7258d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// If regenerateGoldenFiles is true, we must be running this in SBT and we use hard-coded
// relative path. Otherwise, we use classloader's getResource to find the location.
if (regenerateGoldenFiles) {
java.nio.file.Paths.get("src", "test", "resources", "sqlgen").toFile.getCanonicalPath
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see. This seems OK. I think it's OK to just do new File("src/test/resources/sqlgen") here instead, since that's what other tests do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for review, @srowen .
For this one, there is a request in #15546 to follow the way SQLQueryTestSuite.

// If regenerateGoldenFiles is true, we must be running this in SBT and we use hard-coded
// relative path. Otherwise, we use classloader's getResource to find the location.
if (regenerateGoldenFiles) {
java.nio.file.Paths.get("src", "test", "resources", "sqlgen").toFile.getCanonicalPath
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to confirm that you already tested it in your private branch, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked that. But, to make it sure, I'll check it again right now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. I tested this in my private branch in a separate folder. When I changed the query for range, the output file is changed like the following.

SPARK-18292-SQLGEN:SPARK-18292$ pwd
/Users/dhyun/SPARK-18292-SQLGEN
SPARK-18292-SQLGEN:SPARK-18292$ git branch
* SPARK-18292
  master
SPARK-18292-SQLGEN:SPARK-18292$ pwd
/Users/dhyun/SPARK-18292-SQLGEN
SPARK-18292-SQLGEN:SPARK-18292$ git status
On branch SPARK-18292
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   sql/hive/src/test/resources/sqlgen/range.sql
    modified:   sql/hive/src/test/scala/org/apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala

@gatorsmile
Copy link
Member

LGTM except a comment.

@dongjoon-hyun
Copy link
Member Author

Thank you, @srowen and @gatorsmile .
I replied the comment.
For java.nio.file.Paths.get("src", "test", "resources", "sqlgen").toFile.getCanonicalPath, could you give me advice again if I need to change that again?

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh OK, let's leave it then.

@dongjoon-hyun
Copy link
Member Author

Thank you, @srowen !

@rxin
Copy link
Contributor

rxin commented Nov 6, 2016

Can you explain what the actual issue is? I read the description and still have no idea.

@dongjoon-hyun
Copy link
Member Author

Hi, @rxin .

Yep. The actual issue is the current code generates the new golden files in the following folder. So, we cannot identify which file is changed and cannot commit the newly generated files easily.

.../spark/sql/hive/target/scala-2.11/test-classes/sqlgen

When we added this module initially, we faced the exactly same issue and decided to use the absolute path and to assume the following command in the source tree root.

SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "hive/test-only *LogicalPlanToSQLSuite"

@srowen
Copy link
Member

srowen commented Nov 8, 2016

I'll merge if there are no further comments. To restate -- basically the test helper function returns a path that is perfectly fine for reading test data, and is flexible because it doesn't depend on what the current working dir is. However it won't work as a location to write to, which is only rarely ever needed like in cases like this.

@dongjoon-hyun
Copy link
Member Author

Thank you, @srowen !

@srowen
Copy link
Member

srowen commented Nov 9, 2016

Merged to master/2.1

asfgit pushed a commit that referenced this pull request Nov 9, 2016
…ndent path for golden file generation

## What changes were proposed in this pull request?

`LogicalPlanToSQLSuite` uses the following command to update the existing answer files.

```bash
SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "hive/test-only *LogicalPlanToSQLSuite"
```

However, after introducing `getTestResourcePath`, it fails to update the previous golden answer files in the predefined directory. This issue aims to fix that.

## How was this patch tested?

It's a testsuite update. Manual.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #15789 from dongjoon-hyun/SPARK-18292.

(cherry picked from commit 02c5325)
Signed-off-by: Sean Owen <sowen@cloudera.com>
@dongjoon-hyun
Copy link
Member Author

Thank you for review and merging, @srowen , @gatorsmile , and @rxin .

@asfgit asfgit closed this in 02c5325 Nov 9, 2016
@dongjoon-hyun dongjoon-hyun deleted the SPARK-18292 branch November 16, 2016 19:36
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…ndent path for golden file generation

## What changes were proposed in this pull request?

`LogicalPlanToSQLSuite` uses the following command to update the existing answer files.

```bash
SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "hive/test-only *LogicalPlanToSQLSuite"
```

However, after introducing `getTestResourcePath`, it fails to update the previous golden answer files in the predefined directory. This issue aims to fix that.

## How was this patch tested?

It's a testsuite update. Manual.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes apache#15789 from dongjoon-hyun/SPARK-18292.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants