[CARBONDATA-2134] Prevent implicit column filter list from getting serialized while submitting task to executor #1935

manishgupta88 · 2018-02-06T08:49:51Z

Problem
In the current store, blocklet pruning happens in the driver and no further pruning takes place on the executor side. However, the implicit column filter list is still sent to the executor. As the list grows, the cost of serializing and deserializing it increases, which can impact query performance.

Solution
Remove the list from the filter expression before submitting the task to the executor.
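
The idea can be illustrated with a minimal, self-contained sketch. The classes below (`Expr`, `ImplicitInExpr`) and the helpers `stripImplicitLists` and `serializedSize` are hypothetical stand-ins invented for this example; they are not the CarbonData expression classes, and this is not the code in this PR. The sketch only assumes that the filter is a serializable tree and that one node type carries the list of entries already selected by driver-side pruning.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are CarbonData classes.
public class ImplicitFilterPruneSketch {

  /** A node in a simplified, serializable filter expression tree. */
  static class Expr implements Serializable {
    final String name;
    final List<Expr> children = new ArrayList<>();
    Expr(String name) { this.name = name; }
    Expr add(Expr child) { children.add(child); return this; }
  }

  /** Filter on an implicit column; carries the list used for driver-side pruning. */
  static class ImplicitInExpr extends Expr {
    List<String> values;
    ImplicitInExpr(List<String> values) { super("implicit-in"); this.values = values; }
  }

  /** Recursively replaces the value list of every implicit-column filter with an empty list. */
  static void stripImplicitLists(Expr expr) {
    if (expr == null) {
      return;
    }
    if (expr instanceof ImplicitInExpr) {
      ((ImplicitInExpr) expr).values = new ArrayList<>();
    }
    for (Expr child : expr.children) {
      stripImplicitLists(child);
    }
  }

  /** Returns the number of bytes produced by Java serialization of the object. */
  static int serializedSize(Serializable obj) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(obj);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.size();
  }

  public static void main(String[] args) {
    // Build a filter whose implicit-column node holds many entries (made-up ids).
    List<String> ids = new ArrayList<>();
    for (int i = 0; i < 10000; i++) {
      ids.add("blocklet-" + i);
    }
    Expr filter = new Expr("and").add(new Expr("col = 'x'")).add(new ImplicitInExpr(ids));

    int before = serializedSize(filter);
    stripImplicitLists(filter);  // done once pruning has finished, before the task is shipped
    int after = serializedSize(filter);
    System.out.println("serialized bytes before=" + before + ", after=" + after);
  }
}
```

The sketch mutates the tree in place, so it should only run after driver-side pruning has finished using the list.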

Be sure to complete the following checklist to help us incorporate your contribution quickly and easily:

Any interfaces changed? No
Any backward compatibility impacted? No
Document update required? No
Testing done: UT added
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA: NA

CarbonDataQA · 2018-02-06T09:28:15Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3535/

CarbonDataQA · 2018-02-06T09:34:09Z

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2299/

ravipesala · 2018-02-06T10:17:20Z

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3377/

ravipesala · 2018-02-08T06:55:39Z

hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java

+   *
+   * @param expression
+   */
+  public void removeInExpressionFromFilterExpression(Expression expression) {


This method does not belong here. Better to do it in scanrdd only.

CarbonDataQA · 2018-02-08T09:39:49Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3585/

CarbonDataQA · 2018-02-08T09:43:02Z

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2348/

ravipesala · 2018-02-08T13:39:11Z

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3434/

ravipesala · 2018-02-09T07:31:44Z

LGTM

manishgupta88 force-pushed the executor_filter_list_serialization branch from 4630dbf to 252daa4 Compare February 8, 2018 08:39

Modified code to prevent implicit column array list from serializing and deserializing to executor to improve query performance (252daa4)

asfgit closed this in 11a795c Feb 9, 2018
