Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-4908][SQL]narrow the scope of synchronized for PR 3834 #4001

Closed
wants to merge 1 commit into from

Conversation

baishuo
Copy link
Contributor

@baishuo baishuo commented Jan 12, 2015

compared with #3834, this PR narrow the scope of synchronized

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@liancheng
Copy link
Contributor

ok to test

@liancheng
Copy link
Contributor

This LGTM, thanks!

@SparkQA
Copy link

SparkQA commented Jan 12, 2015

Test build #25403 has started for PR 4001 at commit 4bfa306.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Jan 12, 2015

Test build #25403 has finished for PR 4001 at commit 4bfa306.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25403/
Test PASSed.

@liancheng
Copy link
Contributor

But why this PR is marked as "hotfix"? Does it break anything?

@marmbrus
Copy link
Contributor

Yeah, generally hot fixes are reserved for things like "the build is broken and we need to fix it now"

@baishuo baishuo changed the title [SPARK-4908][SQL][hotfix]narrow the scope of synchronized for PR 3834 [SPARK-4908][SQL]narrow the scope of synchronized for PR 3834 Jan 14, 2015
@baishuo
Copy link
Contributor Author

baishuo commented Jan 14, 2015

Hi @liancheng and @marmbrus I had remove [hotfix] from the title.

@baishuo
Copy link
Contributor Author

baishuo commented Jan 14, 2015

Indeed, the code passed all the test when I do test locally, I add [hotfix] to title just because I want to illustrate that this is not the final solution of [SPARK-4908]

@baishuo
Copy link
Contributor Author

baishuo commented Jan 15, 2015

Hi @marmbrus ,can this PR be merged? :)

@marmbrus
Copy link
Contributor

Do we know that HiveShim.getCommandProcessor( and stuff are threadsafe? Since we only use this for DDL operations and not actually querying this seems more risky than worthwhile to me. Thoughts?

@liancheng
Copy link
Contributor

HiveShim.getCommandProcess delegates to methods defined in CommandProcessorFactory, which tries to find a cached Driver object and initialize it. The underlying Driver cache map is synchronized. However, I'm not quite sure whether Driver is thread-safe. Also, HiveServer2 actually creates a new Driver instance for every SQL statement and never caches them. Considering all the above, I'd agree that the risks is greater than the benefits. A better solution for this is to avoid using HiveShim.getCommandProcess (which caches Driver objects) but mimicing what HiveServer2 does and create new Driver instances for every SQL statement.

@liancheng
Copy link
Contributor

@baishuo Would you mind closing this PR? The Driver.plan field is not properly synchronized; otherwise we wouldn't need the synchronization at all. And other parts of Driver are not obviously thread-safe. Considering DDL operations is infrequent, synchronize the whole function is both safe and performance insensitive.

@baishuo
Copy link
Contributor Author

baishuo commented Jan 21, 2015

no problem,clolse it :)

@baishuo baishuo closed this Jan 21, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants