-
Notifications
You must be signed in to change notification settings - Fork 4.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HIVE-22074: Slow compilation due to IN to OR transformation #746
Conversation
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java
Outdated
Show resolved
Hide resolved
|
||
HiveConf conf; | ||
try { | ||
conf = Hive.get().getConf(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of obtaining the conf object here statically, let's just pass the int value using the ctx.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jcamachor ctx
is passed on to genExprNode
by the callers. There are several callers which uses genExprNode
. For this we will have to update all of these callers to set the config in TypeCheckCtx
.
Fetching HiveConf only occurs if the expression at hand is GenericUDFIn
and CBO wasn't executed therefore this shouldn't add much overhead.
Let me know if you think otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better to pass this value from the callers in the context. You would not need to change all callers; if value is not passed, rewriting could be skipped. I see mainly two advantages of doing this:
- if transformation is never happening, we will not be retrieving the conf and this value for every IN clause in a query (note that
isCBOExecuted
method is misleading, the value returned isfoldExpr
boolean which isfalse
sometimes even for calls coming from CBO cf. first line ingenFilterRelNode
method inCalcitePlanner
), and - removing the static call to Hive object from within the folding logic.
I see there are other calls to Hive.get()
in the class, that information should probably be moved to context too.
These can all be tackled together in a follow-up, but I think since we are cleaning up this logic, it would make sense to do it at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I'll create a separate jira and tackle this in a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jcamachor I have created https://issues.apache.org/jira/browse/HIVE-22135 for refactoring Hive.get().getConf().
For the current patch I have gotten green run. Let me know if everything looks good to you.
0ca6ba0
to
9040661
Compare
9040661
to
58a978f
Compare
No description provided.