[CARBONDATA-1282]Choose BatchedDatasource scan only if schema fits for codegen #1148

ashokblend · 2017-07-08T10:58:06Z

Problem
When a table has a large number of columns, say 2000, Spark fails with a code generation error during a full-scan query because the size of the generated code exceeds 64KB.
Solution
The code has two scan implementations, BatchedDataSourceScan and RowDataSourceScan. BatchedDataSourceScan is used when code generation is supported; otherwise RowDataSourceScan is used. Spark itself checks that spark.sql.codegen.wholeStage is enabled and that the number of columns does not exceed the configured limit, spark.sql.codegen.maxFields. Hence we need to add one more check, against spark.sql.codegen.maxFields, before choosing BatchedDataSourceScan.
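A minimal sketch of the selection rule described above, written as a standalone model rather than the actual patch. The names `ScanKind`, `CodegenConf`, `ScanChooser` and `chooseScan` are hypothetical and exist only for illustration; the two fields stand in for spark.sql.codegen.wholeStage and spark.sql.codegen.maxFields, and the limit of 100 in the usage lines is just an example value.

```scala
// Hypothetical, simplified model of the scan selection; not CarbonData's actual code.
sealed trait ScanKind
case object BatchedDataSourceScan extends ScanKind
case object RowDataSourceScan extends ScanKind

// Stand-in for the two Spark SQL settings named in the description.
final case class CodegenConf(
    wholeStageEnabled: Boolean, // models spark.sql.codegen.wholeStage
    maxFields: Int)             // models spark.sql.codegen.maxFields

object ScanChooser {
  // Pick the batched scan only when whole-stage codegen is enabled AND the
  // schema is narrow enough for codegen; otherwise fall back to the row scan.
  def chooseScan(conf: CodegenConf, numColumns: Int): ScanKind =
    if (conf.wholeStageEnabled && numColumns <= conf.maxFields) BatchedDataSourceScan
    else RowDataSourceScan
}
```

With an example limit of 100 fields, `ScanChooser.chooseScan(CodegenConf(true, 100), 2000)` falls back to `RowDataSourceScan`, while a 50-column table under the same settings still gets `BatchedDataSourceScan`.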
Testing
Tested manually.

asfgit · 2017-07-08T10:58:07Z

Can one of the admins verify this patch?

asfgit · 2017-07-08T10:58:07Z

Can one of the admins verify this patch?

CarbonDataQA · 2017-07-08T11:08:47Z

Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/372/

CarbonDataQA · 2017-07-08T11:11:33Z

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2960/

gvramana · 2017-07-10T08:44:59Z

LGTM

Choose BatchedDatasource scan only if schema fits for codegen

3c20e3d

ashokblend changed the title ~~[WIP]Choose BatchedDatasource scan only if schema fits for codegen~~ [CARBONDATA-1282]Choose BatchedDatasource scan only if schema fits for codegen Jul 10, 2017

asfgit closed this in 619f1f9 Jul 10, 2017
