-
Notifications
You must be signed in to change notification settings - Fork 703
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CARBONDATA-1971][Blocklet Prunning] Measure Null value blocklet prunning optimization #1751
base: master
Are you sure you want to change the base?
Conversation
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2496/ |
Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1271/ |
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2658/ |
Retest this please. |
@@ -29,6 +29,8 @@ | |||
|
|||
private byte[][] maxValues; | |||
|
|||
private byte[] nullValues; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add comment
@@ -71,6 +71,8 @@ | |||
*/ | |||
byte[][] getColumnsMinValue(); | |||
|
|||
byte[] getColumnsNullValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add comment
f1be641
to
9efbc5b
Compare
Retest this please. |
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2503/ |
Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1279/ |
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2507/ |
Build Success with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1282/ |
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2671/ |
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2672/ |
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1322/ |
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2560/ |
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2723/ |
@@ -45,6 +45,7 @@ struct BlockletBTreeIndex{ | |||
struct BlockletMinMaxIndex{ | |||
1: required list<binary> min_values; //Min value of all columns of one blocklet Bit-Packed | |||
2: required list<binary> max_values; //Max value of all columns of one blocklet Bit-Packed | |||
3: optional list<byte> null_value; // Null Bool turn on if null value present. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better use binary and try to use BitSet for null values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can use PresenceMeta to store null value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Stored as a Bitset
@@ -39,8 +39,8 @@ object ExampleUtils { | |||
val warehouse = s"$rootPath/examples/spark2/target/warehouse" | |||
|
|||
CarbonProperties.getInstance() | |||
.addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd HH:mm:ss") | |||
.addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd") | |||
.addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy-MM-dd HH:mm:ss") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't change the example class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| SELECT * | ||
| FROM carbon_table | ||
| WHERE stringfield = 'spark' AND decimalField > 40 | ||
| SELECT * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't change this example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rectified
@@ -25,6 +25,8 @@ | |||
|
|||
Object getMax(); | |||
|
|||
byte getNull(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better use boolean and change method to isNull
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do it. But in later phase of execution the Boolean will again get converted to Byte[]. So it doesn't make that much difference.
@@ -71,6 +82,10 @@ private void updateMeasureMinMax(EncodedColumnPage[] measures) { | |||
return dimensionMaxValue; | |||
} | |||
|
|||
public byte[] getDimentionNullValue() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo getDimensionNullValue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rectified
List<String> partitions = new ArrayList<>(); | ||
DataMapRow row = unsafeRow.getRow(2); | ||
DataMapRow row = unsafeRow.getRow(3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add constant to partition index
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -99,6 +99,10 @@ public BlockletDataRefNodeWrapper(List<TableBlockInfo> blockInfos, int index, | |||
return null; | |||
} | |||
|
|||
@Override public byte[] getColumnsNullValue() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change byte[] to BitSet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -46,7 +46,7 @@ boolean applyFilter(RowIntf value, int dimOrdinalMax) | |||
* @param blockMinValue | |||
* @return BitSet | |||
*/ | |||
BitSet isScanRequired(byte[][] blockMaxValue, byte[][] blockMinValue); | |||
BitSet isScanRequired(byte[][] blockMaxValue, byte[][] blockMinValue, byte[] blockNullValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better wrap these parameters to new BlockletStats object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -43,5 +43,6 @@ BitSet isFilterValuesPresentInBlockOrBlocklet(byte[][] maxValue, byte[][] minVal | |||
* | |||
* @return | |||
*/ | |||
Boolean isFilterValuesPresentInAbstractIndex(byte[][] maxValue, byte[][] minValue); | |||
Boolean isFilterValuesPresentInAbstractIndex(byte[][] maxValue, byte[][] minValue, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here also use BlockletStats object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -34,7 +34,7 @@ | |||
* @return | |||
*/ | |||
BitSet isFilterValuesPresentInBlockOrBlocklet(byte[][] maxValue, byte[][] minValue, | |||
String uniqueBlockPath); | |||
byte[] nullValue, String uniqueBlockPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here also use BlockletStats object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2992/ |
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1762/ |
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3031/ |
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3036/ |
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1805/ |
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3432/ |
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3050/ |
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/4294/ |
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5183/ |
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/3973/ |
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4466/ |
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7125/ |
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5901/ |
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/53/ |
Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8802/ |
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/735/ |
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/557/ |
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1781/ |
Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9846/ |
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2005/ |
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2182/ |
Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10257/ |
Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/3564/ |
Build Failed with Spark 2.1.0, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.1/80/ |
Build Failed with Spark 2.3.2, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/79/ |
Build Failed with Spark 2.4.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.4/79/ |
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6020/ |
Recognise Null values in blocklets so that it can help in filter prunning.
Exclude filter isScanRequired is modified in order to help driver for blocklet prunning.
Any interfaces changed? Yes
Any backward compatibility impacted? No
Document update required? No
Testing done Yes
Please provide details on
- Whether new unit test cases have been added or why no new tests are required? Yes
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA