
[CARBONDATA-2809][DataMap] Block rebuilding for bloom/lucene and preagg datamap #2594

Closed (2 commits)

Conversation

@xuchuanyin (Contributor) commented Aug 1, 2018:

Problems & Root Cause:

For a non-lazy index datamap (i.e. not deferred rebuild, which is the default), the datamap data is generated immediately after:

  1. the datamap is created, or
  2. the main table is loaded.

So there is no need to rebuild such a datamap. In fact, triggering a rebuild for these datamaps fails because the old datamap data already exists.

The situation is the same for the pre-aggregate datamap.

Solution:

In CarbonData 1.4.1 we block rebuilding for the bloom/lucene and pre-aggregate datamaps, and we also block creating them with 'deferred rebuild'.
Later we will optimize full rebuilding and incremental rebuilding for these datamaps.
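
For context, here is a minimal, self-contained sketch of the kind of guard described above. The trait, case classes, object and method below are placeholders invented for this example, not the actual CarbonData classes; only the guard condition mirrors this PR.

// Illustrative sketch: an explicit rebuild is rejected for datamaps that are
// populated automatically when the datamap is created or the main table is loaded.
trait DataMapProvider
case class IndexDataMapProvider() extends DataMapProvider        // stands in for bloomfilter/lucene
case class PreAggregateDataMapProvider() extends DataMapProvider
case class DataMapSchema(dataMapName: String, isLazy: Boolean)

object RebuildGuard {
  def checkRebuildAllowed(schema: DataMapSchema, provider: DataMapProvider): Unit = {
    val autoRefreshed = (!schema.isLazy && provider.isInstanceOf[IndexDataMapProvider]) ||
      provider.isInstanceOf[PreAggregateDataMapProvider]
    if (autoRefreshed) {
      // fail fast with a clear message instead of hitting errors caused by old datamap data
      throw new UnsupportedOperationException(
        s"Non-lazy datamap ${schema.dataMapName} does not support manual rebuild")
    }
  }
}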

Be sure to complete all of the following checklist items to help us incorporate
your contribution quickly and easily:

  • Any interfaces changed?
    NO
  • Any backward compatibility impacted?
    NO
  • Document update required?
    NO
  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    NO
    - How it is tested? Please attach test report.
    Tested locally
    - Is it a performance related change? Please attach the performance test report.
    NO
    - Any additional information to help reviewers in testing this change.
    NA
  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
    NA

@CarbonDataQA:

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7706/

@CarbonDataQA:

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6432/

@ravipesala (Contributor):

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6098/

@xuchuanyin xuchuanyin changed the title [CARBONDATA-2809][DataMap] Skip rebuilding for non-lazy datamap [CARBONDATA-2809][DataMap] Skip rebuilding for non-lazy index datamap Aug 2, 2018
@ravipesala (Contributor):

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6108/

@CarbonDataQA:

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7715/

@CarbonDataQA:

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6441/

Review thread on the following change:

// for non-lazy index datamap, the data of datamap will be generated immediately after
// the datamap is created or the main table is loaded, so there is no need to
// rebuild this datamap.
if (!schema.isLazy && provider.isInstanceOf[IndexDataMapProvider]) {
Contributor: Even if it is another datamap, like the pre-aggregate datamap, we should not rebuild it, right?

Contributor (Author): For MV, the current implementation requires a rebuild. For preagg, I'm not sure about its implementation, so I left it as it is.

Contributor: Right now a rebuild call on a pre-aggregate datamap throws "NoSuchDataMapException". Please handle it to give a correct message, since rebuild is not required for pre-aggregate datamaps either.

Contributor (Author): OK.

@xuchuanyin xuchuanyin changed the title [CARBONDATA-2809][DataMap] Skip rebuilding for non-lazy index datamap [CARBONDATA-2809][DataMap] Block rebuilding for bloom/lucene and preagg datamap Aug 3, 2018
@CarbonDataQA:

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6489/

@CarbonDataQA:

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7765/

@ravipesala (Contributor):

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6147/

@ravipesala (Contributor):

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6148/

@ravipesala (Contributor):

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6177/

As manual refresh currently only works correctly for MV and has bugs with
other types of datamap such as preaggregate, timeseries, lucene and
bloomfilter, we block 'deferred rebuild' for those types and also block
the rebuild command for them.
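
For illustration, a small self-contained sketch of the create-time check described in this commit; the provider-name strings, object and method names below are assumptions for this example, not the actual CarbonData API.

// Illustrative sketch: reject CREATE DATAMAP ... WITH DEFERRED REBUILD for datamap
// types that do not support deferred rebuild yet.
object DeferredRebuildGuard {
  private val unsupportedForDeferredRebuild =
    Set("preaggregate", "timeseries", "lucene", "bloomfilter")

  def validateDeferredRebuild(providerName: String, deferredRebuild: Boolean): Unit = {
    if (deferredRebuild && unsupportedForDeferredRebuild.contains(providerName.toLowerCase)) {
      throw new UnsupportedOperationException(
        s"DEFERRED REBUILD is not supported on $providerName datamap")
    }
  }
}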
@ravipesala (Contributor):

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6178/

@CarbonDataQA:

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7799/

@CarbonDataQA:

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6523/

@ravipesala (Contributor):

@xuchuanyin Please check MVTests; it is failing.

The MV datamap will be rebuilt in deferred mode no matter whether the deferred
flag is set or not.
@xuchuanyin (Contributor, Author):

@ravipesala
Fixed.
The root cause is that MV is actually 'deferred rebuild', but we did not specify that while creating the datamap.
To keep the behavior consistent, we now enable 'deferred rebuild' for the MV datamap no matter whether the user sets the flag or not.
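
For clarity, a tiny self-contained sketch of the fix described above (the provider name "mv", object and method names are placeholders, not the actual patch): the deferred-rebuild flag for MV is forced to true regardless of the user's input.

// Illustrative sketch: MV is always treated as deferred rebuild, other datamap
// types keep whatever the user specified.
object MvLazyFlagResolver {
  def resolveDeferredFlag(providerName: String, userAskedDeferred: Boolean): Boolean =
    if (providerName.equalsIgnoreCase("mv")) true else userAskedDeferred
}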

@ravipesala (Contributor):

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6182/

@CarbonDataQA:

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7803/

@CarbonDataQA:

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6527/

@brijoobopanna (Contributor):

retest this please

@CarbonDataQA:

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7806/

@CarbonDataQA:

Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6530/

@jackylk (Contributor) commented Aug 7, 2018:

LGTM

@asfgit closed this in abcd4f6 on Aug 7, 2018
asfgit pushed a commit that referenced this pull request Aug 9, 2018
[CARBONDATA-2809][DataMap] Block rebuilding for bloom/lucene and preagg datamap

As manual refresh currently only works correctly for MV and has bugs with
other types of datamap such as preaggregate, timeseries, lucene and
bloomfilter, we block 'deferred rebuild' for those types and also block
the rebuild command for them.

Fix bugs in deferred rebuild for MV

The MV datamap will be rebuilt in deferred mode no matter whether the deferred
flag is set or not.

This closes #2594