[SPARK-6749] [SQL] Make metastore client robust to underlying socket connection loss #6912
Test build #35340 has finished for PR 6912 at commit
// We use hive's conf for compatibility.
private val retryLimit = conf.getIntVar(HiveConf.ConfVars.METASTORETHRIFTFAILURERETRIES)
private val retryDelayMillis = 1000 * conf.getIntVar(
  HiveConf.ConfVars.METASTORE_CLIENT_CONNECT_RETRY_DELAY)
Oh, Hive 0.14 changed it to
METASTORE_CLIENT_CONNECT_RETRY_DELAY("hive.metastore.client.connect.retry.delay", "1s",
    new TimeValidator(TimeUnit.SECONDS),
    "Number of seconds for the client to wait between consecutive connection attempts"),
Unfortunately, we can't use getTimeVar() yet because we compile against Hive 0.13.
Test build #35512 has finished for PR 6912 at commit
Test build #35585 has finished for PR 6912 at commit
Test this please.
Test build #35599 has finished for PR 6912 at commit
    s"(${retryLimit - numTries} tries remaining)", e)
Thread.sleep(retryDelayMillis)
try {
  client = Hive.get(conf, true)
If we call this method from a thread other than the one that created this client wrapper, the conf may be different (the class loader associated with the conf may be different). Let me think about the potential impact of this change.
How about we use the Hive conf returned by state.getConf? This is the conf we used to create the original client.
Done
Test build #35628 has finished for PR 6912 at commit
LGTM. I am merging this to master.
This works around a bug in the underlying RetryingMetaStoreClient (HIVE-10384) by refreshing the metastore client on Thrift exceptions. We attempt to emulate the proper Hive behavior by retrying only as configured in the HiveConf.
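
For readers skimming the thread, the refresh-and-retry idea can be sketched without any Hive dependency. This is only an illustration under stated assumptions, not the code merged in this PR: `RefreshingClient`, `withRetry`, `createClient`, and `isConnectionError` are hypothetical names, and the predicate stands in for whatever check the real wrapper uses to recognize a Thrift failure. The factory is captured once at construction, which mirrors the review suggestion to rebuild the client from the conf that created the original one.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

// Hypothetical wrapper, for illustration only; none of these names come from Spark or Hive.
final class RefreshingClient[C](
    createClient: () => C,                    // captured once, so every refresh uses the same settings
    retryLimit: Int,                          // retries allowed after the first attempt
    retryDelayMillis: Long,                   // pause before each refresh
    isConnectionError: Throwable => Boolean)  // assumed test for "connection was lost"
{
  private var client: C = createClient()

  /** Runs `f` against the current client, refreshing the client on connection errors. */
  def withRetry[A](f: C => A): A = synchronized { attempt(f, 0) }

  @tailrec
  private def attempt[A](f: C => A, numTries: Int): A =
    Try(f(client)) match {
      case Success(result) => result
      case Failure(e) if isConnectionError(e) && numTries < retryLimit =>
        Thread.sleep(retryDelayMillis)
        client = createClient() // assumption: a failure here simply propagates to the caller
        attempt(f, numTries + 1)
      case Failure(e) => throw e // not a connection error, or retries exhausted
    }
}
```

A caller would wrap each metastore operation, for example `wrapper.withRetry(c => c.someCall())`, where `someCall` is a placeholder. Errors that the predicate does not match are rethrown immediately, so only connection loss triggers a refresh.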