-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-37561][SQL] Avoid loading all functions when obtaining hive's DelegationToken #34822
Conversation
Can one of the admins verify this patch? |
cc @sunchao |
val hive = Hive.get(conf, classOf[HiveConf]) | ||
val hive = conf match { | ||
case hiveConf: HiveConf => | ||
Hive.getWithoutRegisterFns(hiveConf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we will have to mimic
spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
Lines 204 to 211 in 116255d
try { | |
classOf[Hive].getMethod("getWithoutRegisterFns", classOf[HiveConf]) | |
.invoke(null, conf).asInstanceOf[Hive] | |
} catch { | |
// SPARK-37069: not all Hive versions have the above method (e.g., Hive 2.3.9 has it but | |
// 2.3.8 don't), therefore here we fallback when encountering the exception. | |
case _: NoSuchMethodException => | |
Hive.get(conf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I thought at first that code that has nothing to do with HiveClientImpl
does not need to be compatible with lower versions. I also use reflection to solve compatibility problems.
val hive = Hive.get(conf, classOf[HiveConf]) | ||
val hive = conf match { | ||
case hiveConf: HiveConf => | ||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we reduce the code duplication by moving HiveClientImpl.getHive
to its companion object and then call the method here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
/** | ||
* Initialize Hive through Configuration. | ||
* first try to use getWithoutRegisterFns to initialize to avoid loading all functions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: first
-> First
Hive.get(hiveConf) | ||
} | ||
case _ => | ||
Hive.get(conf, classOf[HiveConf]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can also call getWithoutRegisterFns
for this path? by first converting the Configuration
to a HiveConf
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (pending CI).
Merged, thanks! |
What changes were proposed in this pull request?
Use
Hive.getWithoutRegisterFns
instead ofHive.get
to avoid loading all permanent functions from the hive meta store.Why are the changes needed?
At present, when obtaining the delegationToken of hive, all functions will be loaded.
This is unnecessary, it takes time to load the function, and it also increases the burden on the hive meta store.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
manual test