[SPARK-32976][SQL][FOLLOWUP] SET and RESTORE hive.exec.dynamic.partition.mode for HiveSQLInsertTestSuite to avoid flakiness #30843

Closed
wants to merge 2 commits into from

Conversation

yaooqinn
Member

What changes were proposed in this pull request?

As #29893 (comment) mentioned:

We need to call spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict") before executing this suite; otherwise, test("insert with column list - follow table output order + partitioned table") will fail.
The reason it does not fail currently is that some test cases [running before this suite] do not change hive.exec.dynamic.partition.mode back to its default value, strict. However, the order of test suite execution is not deterministic.
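
The set-and-restore idea described above can be sketched without a Hive setup. This is a minimal sketch, not the merged code: `FakeConf` is a hypothetical stand-in for `spark.conf`, and `withNonStrictMode` is an illustrative helper. Only the key name and the `strict`/`nonstrict` values come from the discussion.

```scala
import scala.collection.mutable

// Hypothetical stand-in for spark.conf (not a Spark class); it mimics only
// the getOption/set/unset calls that this sketch needs.
final class FakeConf {
  private val settings = mutable.Map.empty[String, String]
  def getOption(key: String): Option[String] = settings.get(key)
  def set(key: String, value: String): Unit = settings.update(key, value)
  def unset(key: String): Unit = { settings.remove(key); () }
}

object PartitionModeSketch {
  val Key = "hive.exec.dynamic.partition.mode"

  // Illustrative helper: remember the current value, force "nonstrict" while
  // `body` runs, then restore the previous state even if `body` throws.
  def withNonStrictMode[T](conf: FakeConf)(body: => T): T = {
    val original = conf.getOption(Key)
    conf.set(Key, "nonstrict")
    try body
    finally {
      original match {
        case Some(v) => conf.set(Key, v) // a value existed before: put it back
        case None    => conf.unset(Key)  // nothing was set before: remove the key
      }
    }
  }
}
```

In a test suite, the save-and-set half would presumably live in `beforeAll()` and the restore half in `afterAll()`, so that suites running afterwards see the same configuration regardless of execution order.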

Why are the changes needed?

avoid flakiness in tests

Does this PR introduce any user-facing change?

no

How was this patch tested?

existing tests

…ion.mode for Hive related TestSuites to avoid flakiness
@github-actions github-actions bot added the SQL label Dec 18, 2020
@yaooqinn
Member Author

cc @cloud-fan @gatorsmile @maropu thanks

@SparkQA

SparkQA commented Dec 18, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37623/

@SparkQA

SparkQA commented Dec 18, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37623/

@SparkQA

SparkQA commented Dec 18, 2020

Test build #133024 has finished for PR 30843 at commit a5b3846.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yaooqinn yaooqinn changed the title from "[SPARK-32976][SQL][FOLLOWUP] SET and RESTORE hive.exec.dynamic.partition.mode for Hive related TestSuites to avoid flakiness" to "[SPARK-32976][SQL][FOLLOWUP] SET and RESTORE hive.exec.dynamic.partition.mode for HiveSQLInsertTestSuite to avoid flakiness" Dec 18, 2020
@@ -21,5 +21,23 @@ import org.apache.spark.sql.SQLInsertTestSuite
import org.apache.spark.sql.hive.test.TestHiveSingleton

class HiveSQLInsertTestSuite extends SQLInsertTestSuite with TestHiveSingleton {

private var originalPartitionMode = ""
Member

I think you can just use null instead of an empty string.

private var originalPartitionMode: String = _

here and

spark.conf.get("hive.exec.dynamic.partition.mode", null)

later.

Member

You can also do something like:

Option(spark.conf.get("hive.exec.dynamic.partition.mode", null))

and later:

originalPartitionMode
  .map(v => spark.conf.set("hive.exec.dynamic.partition.mode", v))
  .getOrElse(spark.conf.unset("hive.exec.dynamic.partition.mode"))

up to you
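
For readers unfamiliar with the idiom: `Option(x)` yields `None` when `x` is `null`, which is why a `get(key, null)` call can be wrapped in `Option(...)` as suggested above. A small sketch under that assumption follows; `lookup` and `restore` are hypothetical names, and an immutable `Map` stands in for the session configuration.

```scala
object OptionRestoreSketch {
  val Key = "hive.exec.dynamic.partition.mode"

  // Hypothetical lookup that returns null for a missing key, mirroring a
  // get(key, null) style call.
  def lookup(conf: Map[String, String], key: String): String =
    conf.getOrElse(key, null)

  // Restore step written with fold: drop the key when nothing was saved,
  // otherwise put the saved value back. Returns the updated map.
  def restore(conf: Map[String, String], saved: Option[String]): Map[String, String] =
    saved.fold(conf - Key)(v => conf + (Key -> v))
}
```

The `map(...).getOrElse(...)` chain in the suggestion achieves the same effect through side effects; `fold` or a pattern match is an alternative spelling, not a required change.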

Member Author

I copied this from the HiveCharVarchar test suite. Shall we change it there too?

Member

Not in this PR, because it is a follow-up of SPARK-32976. To touch other test suites, we need to file another JIRA.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Merged to master/3.1.

dongjoon-hyun pushed a commit that referenced this pull request Dec 19, 2020
…ion.mode for HiveSQLInsertTestSuite to avoid flakiness

### What changes were proposed in this pull request?

As #29893 (comment) mentioned:

> We need to call spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict") before executing this suite; otherwise, test("insert with column list - follow table output order + partitioned table") will fail.
The reason it does not fail currently is that some test cases [running before this suite] do not change hive.exec.dynamic.partition.mode back to its default value, strict. However, the order of test suite execution is not deterministic.
### Why are the changes needed?

avoid flakiness in tests

### Does this PR introduce _any_ user-facing change?

no
### How was this patch tested?

existing tests

Closes #30843 from yaooqinn/SPARK-32976-F.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit dd44ba5)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@yaooqinn
Member Author

Er... Shall we wait for Jenkins or GA? 😁

@dongjoon-hyun
Member

Oh.. My bad.

@dongjoon-hyun
Member

Thanks. I double-checked by quickly running the suite manually.

$ build/sbt -Phive "hive/testOnly *.HiveSQLInsertTestSuite"
...
[info] HiveSQLInsertTestSuite:
...
[info] Run completed in 21 seconds, 835 milliseconds.
[info] Total number of tests run: 8
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 8, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 8, Failed 0, Errors 0, Passed 8
[success] Total time: 256 s (04:16), completed Dec 19, 2020, 8:13:26 AM

@SparkQA

SparkQA commented Dec 19, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37674/

@SparkQA

SparkQA commented Dec 19, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37674/

@SparkQA

SparkQA commented Dec 19, 2020

Test build #133074 has finished for PR 30843 at commit bafdbe2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -21,5 +21,20 @@ import org.apache.spark.sql.SQLInsertTestSuite
import org.apache.spark.sql.hive.test.TestHiveSingleton

class HiveSQLInsertTestSuite extends SQLInsertTestSuite with TestHiveSingleton {

private val originalPartitionMode = spark.conf.getOption("hive.exec.dynamic.partition.mode")
Contributor

In sql/core, a SparkSession is created for each test suite, so we can't use spark in the class constructor. In sql/hive it's OK because all test suites share one global SparkSession, but let's avoid a pattern that only works in sql/hive.
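
The initialization-order concern can be illustrated with plain Scala. This is a rough sketch that assumes the session only becomes available once `beforeAll()` has run; `Session`, `SuiteBase`, `EagerSuite`, and `DeferredSuite` are hypothetical names, not Spark or ScalaTest classes.

```scala
// Hypothetical stand-in for a session that carries configuration.
final class Session(val conf: Map[String, String])

abstract class SuiteBase {
  // Mimics a session that does not exist until beforeAll() is called.
  protected var session: Session = null
  def beforeAll(): Unit = {
    session = new Session(Map("mode" -> "strict"))
  }
}

// Reads the session in the constructor: this throws, because the field is
// still null when the class body runs.
class EagerSuite extends SuiteBase {
  val original: Option[String] = session.conf.get("mode")
}

// Reads the session inside beforeAll(), after the base class has created it.
class DeferredSuite extends SuiteBase {
  var original: Option[String] = None
  override def beforeAll(): Unit = {
    super.beforeAll()
    original = session.conf.get("mode")
  }
}
```

Capturing the original value inside `beforeAll()` works in both situations, whereas the constructor-time read only works when a session already exists before the suite is instantiated.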

Member Author

Oh right, this should be avoided
