Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-4037][SQL] Removes the SessionState instance created in HiveThriftServer2 #2887

Closed
wants to merge 4 commits into from

Conversation

liancheng
Copy link
Contributor

HiveThriftServer2 creates a global singleton SessionState instance and overrides HiveContext to inject the SessionState object. This messes up SessionState initialization and causes problems.

This PR replaces the global SessionState with HiveContext.sessionState to avoid the initialization conflict. Also HiveContext reuses existing started SessionState if any (this is required by SparkSQLCLIDriver, which uses specialized CliSessionState).

@SparkQA
Copy link

SparkQA commented Oct 22, 2014

QA tests have started for PR 2887 at commit 04b374d.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 22, 2014

QA tests have finished for PR 2887 at commit 04b374d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22024/
Test FAILed.

@SparkQA
Copy link

SparkQA commented Oct 22, 2014

QA tests have started for PR 2887 at commit 0c5bd26.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 22, 2014

QA tests have finished for PR 2887 at commit 0c5bd26.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22032/
Test PASSed.

s"LOAD DATA LOCAL INPATH '$dataFilePath' OVERWRITE INTO TABLE test",
"CACHE TABLE test")
val queries =
s"""SET spark.sql.shuffle.partitions=3;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This SET command is used as a regression test of SPARK-4037.

@zhzhan
Copy link
Contributor

zhzhan commented Oct 24, 2014

@liancheng I was just aware of this issue. In the commit this morning #2241, I have following change in HiveContext.scala, because I didn't realize that the state has changed to lazy eval. Could you check whether it effects your PR, and revert it back if necessary?

  • // Session state must be initilized before the CommandProcessor is created .
  • SessionState.start(sessionState)

@liancheng
Copy link
Contributor Author

Sure, will rebase this PR to fix that.

if (state.err == null) state.err = new PrintStream(outputBuffer, true, "UTF-8")
(state.getConf, state)
}
.get
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should SessionState.start(state) be invoked here? Otherwise, it relies on other code to initialize the state, and we lose the track whether the state is initialized, and other code may call SessionState.start() multiple times, resulting in unexpected behavior. Correct me if I am wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't call SessionState.start(...) here because before Hive 0.13.1 support is merged in, it's always called when necessary in HiveContext.runHive. Just noticed that this call was removed in the most recent master branch. I consider this change dangerous because the current SessionState may be switched at some other place and cause subtle errors.

However, add another SessionState.start(...) call here is harmless and does help to track when and whether the session state is properly initialized.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, we may call SessionState.start with the session state multiple times. In my point of view, we should check whether SessionState.get is null or not, and then decide whether we need to call SessionState.start.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider initializing the session state in the constructor, maybe by forcing the realization of the lazy val. Matei just bumped into a problem where running "SHOW TABLES" as the first command null pointers. Lets make sure that this PR fixes those sorts of problems too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, all commands/DDLs that are directly delegated to Hive suffer this issue, and this PR fixes it.

@zhzhan
Copy link
Contributor

zhzhan commented Oct 27, 2014

@liancheng I send a PR #2967 against similar issue. Please take a look. My major concern is that the SessionState.start(state) may be invoked multiple times. If you think PR #2967 is duplicated with this one, please let me know and I will close it.

@liancheng
Copy link
Contributor Author

Hey @zhzhan, left some comments both here and in your PR. Considering all the reasons I mentioned in these comments, would you mind to close #2967? Hive SessionState is really an annoying and error prone abstraction, thanks all the same for working on this somewhat tedious but important issue!

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22357 has started for PR 2887 at commit 49b1c5b.

  • This patch merges cleanly.

proc match {
case driver: Driver =>
driver.init()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I check the function getCommandProcessor and found it has already initialize the driver, and we don't need to init it again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks.

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22357 has finished for PR 2887 at commit 49b1c5b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22357/
Test FAILed.

@liancheng
Copy link
Contributor Author

retest this please

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22360/
Test FAILed.

@liancheng
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22361 has started for PR 2887 at commit 49b1c5b.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22361 has finished for PR 2887 at commit 49b1c5b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22361/
Test FAILed.

@liancheng
Copy link
Contributor Author

According to the line number information in the Hive failure output, the test was executed against Hive 0.13.1, while the assembly was built with -Phive-0.12.0.

// Spark SQL Hive support uses a single `SessionState` for all Hive operations and breaks
// session isolation under multi-user scenarios (i.e. HiveThriftServer2).
// TODO Fix session isolation
SessionState.start(sessionState)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comments. Here we may start the same sessionState in the same thread multiple times.
How do you think following check?

if (SessionState.get == null) SessionState.start(sessionState)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, sessionState do gets started within the same thread multiple times. However, at least for Hive 0.12.0, I think SessionState.start(sessionState) should be an idempotent operation, calling it multiple times shouldn't hurt. Maybe this doesn't hold anymore for Hive 0.13.1.

Another issue that really puzzles me is this line from the Jenkins test failure log:

    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:342)

The line number suggests that apparently Hive 0.13.1 was used (see here), but this Jenkins build was triggered with

-Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -Phive-0.12.0

Anyway, I'll follow your suggestion to avoid starting sessionState multiple times and try again. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, to avoid multiple start, I guess this is better?

if (SessionState.get() != sessionState) {
  SessionState.start(sessionState)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think now the default run-test is hive-0.13. So that is right.
Didn't go through the run-test detail, but there is another line
SBT_MAVEN_PROFILES_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see where the problem is: the assembly jar is built with -Phive-0.12.0 [1] while the test is executed with -Phive [2], which implies -Phive-0.13.1.

[1]

BUILD_MVN_PROFILE_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive -Phive-0.12.0"

[2]
SBT_MAVEN_PROFILES_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I check with some hive experts here, and was told that the session state should not be started again and again. It may work, but not guaranteed. In hive, there is no such use case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for this important information!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, just reproduced the above Jenkins build failure locally by:

  1. Building the assembly jar with -Phadoop-2.3 -Phive-0.12.0
  2. Running ./sbt/sbt -Phadoop-2.3 -Phive "hive/test-only *.HiveCompatibilitySuite -- -z groupby1"

Will fix dev/run-tests altogether.

@zhzhan
Copy link
Contributor

zhzhan commented Oct 28, 2014

@liancheng With the check if (SessionState.get() != sessionState), ./sbt/sbt -Phadoop-2.3 -Phive "hive/test-only *.HiveCompatibilitySuite -- -z groupby1" is running OK. Thanks for clean this up.

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22393 has started for PR 2887 at commit a28fef5.

  • This patch merges cleanly.

@liancheng
Copy link
Contributor Author

@zhzhan Building assembly jar with -Phive-0.12.0 but run tests without it is introduced in marmbrus@8f6b09a, and was done on purpose. Avoiding multiple SessionState.start() do fixes this failure on my machine. Trying my luck on Jenkins. Thanks for the suggestion :)

@SparkQA
Copy link

SparkQA commented Oct 28, 2014

Test build #22395 has started for PR 2887 at commit 8446675.

  • This patch merges cleanly.

@liancheng
Copy link
Contributor Author

@marmbrus This should be ready to go once Jenkins nods.

@SparkQA
Copy link

SparkQA commented Oct 29, 2014

Test build #22393 has finished for PR 2887 at commit a28fef5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22393/
Test PASSed.

@SparkQA
Copy link

SparkQA commented Oct 29, 2014

Test build #22395 has finished for PR 2887 at commit 8446675.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22395/
Test PASSed.

@marmbrus
Copy link
Contributor

Okay, I tested this manually with a SHOW TABLES on a fresh context and it seems to init correctly. Merging this to master. Thanks!

@marmbrus
Copy link
Contributor

Actually... sorry going to wait for the hive 13 thrift server PR so that we run tests on the server as a sanity check.

@marmbrus
Copy link
Contributor

test this please

@SparkQA
Copy link

SparkQA commented Oct 31, 2014

Test build #22638 has started for PR 2887 at commit 8446675.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 31, 2014

Test build #22638 has finished for PR 2887 at commit 8446675.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22638/
Test PASSed.

@liancheng
Copy link
Contributor Author

@marmbrus This should be ready to go.

// Spark SQL Hive support uses a single `SessionState` for all Hive operations and breaks
// session isolation under multi-user scenarios (i.e. HiveThriftServer2).
// TODO Fix session isolation
if (SessionState.get() != sessionState) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case this condition will be true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, Spark SQL CLI uses a global CliSessionState instance, which inherits from SessionState. Also, this can be useful to fix the session isolation problem HiveThriftServer2 currently suffers from.

@marmbrus
Copy link
Contributor

marmbrus commented Nov 1, 2014

Thanks! Merged to master.

@asfgit asfgit closed this in ad0fde1 Nov 1, 2014
liancheng added a commit to liancheng/spark that referenced this pull request Nov 9, 2014
…riftServer2

`HiveThriftServer2` creates a global singleton `SessionState` instance and overrides `HiveContext` to inject the `SessionState` object. This messes up `SessionState` initialization and causes problems.

This PR replaces the global `SessionState` with `HiveContext.sessionState` to avoid the initialization conflict. Also `HiveContext` reuses existing started `SessionState` if any (this is required by `SparkSQLCLIDriver`, which uses specialized `CliSessionState`).

Author: Cheng Lian <lian@databricks.com>

Closes apache#2887 from liancheng/spark-4037 and squashes the following commits:

8446675 [Cheng Lian] Removes redundant Driver initialization
a28fef5 [Cheng Lian] Avoid starting HiveContext.sessionState multiple times
49b1c5b [Cheng Lian] Reuses existing started SessionState if any
3cd6fab [Cheng Lian] Fixes SPARK-4037

Conflicts:
	sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLEnv.scala
	sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala
marmbrus pushed a commit to marmbrus/spark that referenced this pull request Nov 11, 2014
This PR backports apache#2843 to branch-1.1. The key difference is that this one doesn't support Hive 0.13.1 and thus always returns `0.12.0` when `spark.sql.hive.version` is queried.

6 other commits on which apache#2843 depends were also backported, they are:

- apache#2887 for `SessionState` lifecycle control
- apache#2675, apache#2823 & apache#3060 for major test suite refactoring and bug fixes
- apache#2164, for Parquet test suites updates
- apache#2493, for reading `spark.sql.*` configurations

Author: Cheng Lian <lian@databricks.com>
Author: Cheng Lian <lian.cs.zju@gmail.com>
Author: Michael Armbrust <michael@databricks.com>

Closes apache#3113 from liancheng/get-info-for-1.1 and squashes the following commits:

d354161 [Cheng Lian] Provides Spark and Hive version in HiveThriftServer2 for branch-1.1
0c2a244 [Michael Armbrust] [SPARK-3646][SQL] Copy SQL configuration from SparkConf when a SQLContext is created.
3202a36 [Michael Armbrust] [SQL] Decrease partitions when testing
7f395b7 [Cheng Lian] [SQL] Fixes race condition in CliSuite
0dd28ec [Cheng Lian] [SQL] Fixes the race condition that may cause test failure
5928b39 [Cheng Lian] [SPARK-3809][SQL] Fixes test suites in hive-thriftserver
faeca62 [Cheng Lian] [SPARK-4037][SQL] Removes the SessionState instance created in HiveThriftServer2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants