[CARBONDATA-2809][DataMap] Block rebuilding for bloom/lucene and preagg datamap #2594
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7706/
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6432/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6098/
052b683 to 2bc8b1d
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6108/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7715/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6441/
// for non-lazy index datamap, the data of datamap will be generated immediately after
// the datamap is created or the main table is loaded, so there is no need to
// rebuild this datamap.
if (!schema.isLazy && provider.isInstanceOf[IndexDataMapProvider]) {
Even for other datamaps, such as the preaggregate datamap, we should not rebuild, right?
For MV, the current implementation requires a rebuild.
For preagg, I'm not sure about its implementation, so I left it as it is.
Right now, calling rebuild on a pre-aggregate datamap throws "NoSuchDataMapException". Please handle this case and return a correct message, since a rebuild is not required for pre-aggregate either.
OK.
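One way to address the request above is a guard that rejects the rebuild up front with a descriptive message, instead of letting the call fall through to "NoSuchDataMapException". The following is only a minimal sketch: `ProviderKind`, `DataMapInfo`, and `RebuildGuard` are hypothetical names invented for illustration and are not CarbonData classes, and the message wording is an assumption.

```scala
// Hypothetical stand-ins for illustration only; these are NOT CarbonData classes.
sealed trait ProviderKind
case object MV extends ProviderKind
case object PreAggregate extends ProviderKind
case object Timeseries extends ProviderKind
case object Lucene extends ProviderKind
case object BloomFilter extends ProviderKind

final case class DataMapInfo(name: String, kind: ProviderKind)

object RebuildGuard {
  // Returns Right(()) when a manual rebuild may proceed, or Left(message)
  // describing why the rebuild is rejected.
  def checkRebuild(dm: DataMapInfo): Either[String, Unit] = dm.kind match {
    case MV => Right(()) // per this thread, only MV currently needs a manual rebuild
    case other =>
      Left(s"Rebuild is not supported for datamap '${dm.name}' of type $other: " +
        "its data is generated automatically, so no rebuild is required")
  }
}
```

With this shape, `RebuildGuard.checkRebuild(DataMapInfo("agg1", PreAggregate))` yields a `Left` carrying a message that names the datamap, which the command could surface to the user.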
2bc8b1d to 519d909
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6489/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7765/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6147/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6148/
11a90c1 to 1da08df
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6177/
Manual refresh currently works correctly only for MV; it has bugs with other types of datamap such as preaggregate, timeseries, lucene, and bloomfilter. We will therefore block 'deferred rebuild' for those datamaps and also block the rebuild command for them.
1da08df to bbe5c3c
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6178/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7799/
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6523/
@xuchuanyin Please check MVTests; it is failing.
An MV datamap is always rebuilt in deferred mode, regardless of whether the deferred flag is set.
@ravipesala
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/6182/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7803/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6527/
retest this please
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7806/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6530/
LGTM
…gg datamap

Manual refresh currently works correctly only for MV; it has bugs with other types of datamap such as preaggregate, timeseries, lucene, and bloomfilter. We will therefore block 'deferred rebuild' for those datamaps and also block the rebuild command for them.

Fix bugs in deferred rebuild for MV

An MV datamap is always rebuilt in deferred mode, regardless of whether the deferred flag is set.

This closes #2594
Problems & RootCause:
For a non-lazy index datamap (that is, one created without deferred rebuild, which is the default), the datamap data is generated immediately after the datamap is created or the main table is loaded, so there is no need to rebuild it.
In fact, triggering a rebuild for these datamaps fails because the old data already exists.
The situation for preagg is the same.
Solution:
We will block rebuilding for bloom/lucene and preagg datamaps, and block creating them with 'deferred rebuild', in CarbonData 1.4.1.
Later we will optimize full and incremental rebuilding for these datamaps.
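The create-time half of this solution can be sketched as follows. This is a rough illustration under stated assumptions, not the code in this PR: `CreateRequest`, `DataMapPlan`, and `CreateGuard` are hypothetical names, and the provider is modelled as a plain string with "mv" assumed to denote the MV datamap.

```scala
// Hypothetical model for illustration only; NOT the CarbonData API.
final case class CreateRequest(name: String, provider: String, deferredRebuild: Boolean)
final case class DataMapPlan(name: String, provider: String, isLazy: Boolean)

object CreateGuard {
  // Assumption: "mv" is the provider name that identifies an MV datamap.
  private val MvProvider = "mv"

  // Validates a create request and decides whether the datamap is lazy.
  def plan(req: CreateRequest): Either[String, DataMapPlan] = {
    val isMv = req.provider.equalsIgnoreCase(MvProvider)
    if (isMv) {
      // MV is always treated as deferred, whether or not the flag was given.
      Right(DataMapPlan(req.name, req.provider, isLazy = true))
    } else if (req.deferredRebuild) {
      // Other providers may not be created with 'deferred rebuild'.
      Left(s"'deferred rebuild' is not supported for provider '${req.provider}'")
    } else {
      // Non-lazy datamap: its data is generated on create and on table load.
      Right(DataMapPlan(req.name, req.provider, isLazy = false))
    }
  }
}
```

Returning an `Either` keeps the rejection reason next to the decision, so the caller can turn a `Left` into whatever exception type the command layer uses.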
Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:
NO
NO
NO
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
NO
- How it is tested? Please attach test report.
Tested in local
- Is it a performance related change? Please attach the performance test report.
NO
- Any additional information to help reviewers in testing this change.
NA
NA