[CARBONDATA-2134] Prevent implicit column filter list from getting serialized while submitting task to executor #1935
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3535/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2299/
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3377/
   *
   * @param expression
   */
  public void removeInExpressionFromFilterExpression(Expression expression) {
This method does not belong here. It is better to do this in scanrdd only.
4630dbf to 252daa4
…and deserializing to executor to improve query performance
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3585/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2348/
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3434/
LGTM
…rialized while submitting task to executor. Problem: In the current store, blocklet pruning happens in the driver and no further pruning takes place on the executor side, yet the implicit column filter list is still sent to the executor. As the list grows, the cost of serializing and deserializing it increases, which can impact query performance. Solution: Remove the list from the filter expression before submitting the task to the executor. This closes #1935
Problem
In the current store, blocklet pruning happens in the driver and no further pruning takes place on the executor side. However, the implicit column filter list is still sent to the executor. As the list grows, the cost of serializing and deserializing it increases, which can impact query performance.
Solution
Remove the list from the filter expression before submitting the task to the executor.
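For illustration only, the idea can be sketched in plain Java. Everything below is hypothetical: `Expr`, `And`, `In`, `True`, `stripImplicitLists`, and `serializedSize` are simplified stand-ins invented for this sketch, not CarbonData's actual `Expression` classes or the code in this PR. It assumes the implicit `In` node can be replaced with an always-true node, because pruning has already been done in the driver.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ImplicitFilterSketch {
  // Hypothetical, simplified filter tree (not CarbonData's Expression hierarchy).
  interface Expr extends Serializable {}

  // Always-true placeholder that carries no data.
  static final class True implements Expr {}

  static final class And implements Expr {
    final Expr left;
    final Expr right;
    And(Expr left, Expr right) { this.left = left; this.right = right; }
  }

  // IN filter; 'implicit' marks a filter on an implicit column whose value
  // list was only needed for pruning in the driver.
  static final class In implements Expr {
    final String column;
    final boolean implicit;
    final List<String> values;
    In(String column, boolean implicit, List<String> values) {
      this.column = column;
      this.implicit = implicit;
      this.values = new ArrayList<>(values);
    }
  }

  // Returns a copy of the tree in which every implicit IN node is replaced
  // by True, so its value list is never serialized with the task.
  static Expr stripImplicitLists(Expr e) {
    if (e instanceof In && ((In) e).implicit) {
      return new True();
    }
    if (e instanceof And) {
      And a = (And) e;
      return new And(stripImplicitLists(a.left), stripImplicitLists(a.right));
    }
    return e;
  }

  // Size in bytes of the Java-serialized form, as a rough proxy for task size.
  static int serializedSize(Expr e) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(e);
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    return bytes.size();
  }

  public static void main(String[] args) {
    List<String> ids = new ArrayList<>();
    for (int i = 0; i < 10000; i++) {
      ids.add("blocklet-" + i);
    }
    List<String> cities = new ArrayList<>();
    cities.add("a");
    Expr filter = new And(new In("city", false, cities), new In("positionId", true, ids));
    System.out.println("before: " + serializedSize(filter) + " bytes");
    System.out.println("after:  " + serializedSize(stripImplicitLists(filter)) + " bytes");
  }
}
```

The sketch builds a new tree instead of mutating the original, so the driver can keep using the full filter for pruning while only the stripped copy is shipped with the task. Whether the real change mutates or copies the expression is not stated here.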
Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:
No
No
No
UT added
NA