[CARBONDATA-2633][BloomDataMap] Fix bugs in bloomfilter for dictionary/sort/date/TimeStamp column #2403
Conversation
retest this please
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5411/
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5321/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6490/
retest this please
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6508/
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5340/
retest this please
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6516/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5348/
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5433/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5351/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6520/
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5434/
@@ -175,6 +297,7 @@ public boolean isScanRequired(FilterResolverIntf filterExp) {
public void clear() {
bloomIndexList.clear();
bloomIndexList = null;
remove it
-  private BloomQueryModel(String columnName, DataType dataType, Object filterValue) {
+  private BloomQueryModel(String columnName, byte[] filterValue) {
please describe the filterValue
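As an illustration of the requested description, a sketch of what the javadoc on `filterValue` could say (hypothetical class, not the actual CarbonData `BloomQueryModel`):

```java
// Hypothetical sketch, not the real CarbonData class: shows the kind of
// documentation the review asks for on filterValue.
public class BloomQueryModelSketch {
  private final String columnName;
  // filterValue: the filter literal already converted to carbon's internal
  // byte representation (per this PR: surrogate-key bytes for dictionary/date
  // columns, plain bytes for sort columns and ordinary dimensions, value bytes
  // for measures), so it matches the bytes indexed in the bloom filter.
  private final byte[] filterValue;

  public BloomQueryModelSketch(String columnName, byte[] filterValue) {
    this.columnName = columnName;
    this.filterValue = filterValue;
  }

  public String getColumnName() { return columnName; }

  public byte[] getFilterValue() { return filterValue; }
}
```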
Force-pushed 6c33eb4 to 009e333 (compare)
@jackylk I refactored the commit based on our discussion, please check. Some tests were added to clarify the scenarios.
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5488/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6661/
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5521/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6664/
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5491/
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5524/
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5525/
retest this please
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6671/
Force-pushed 854598c to b9d0976 (compare)
retest this please
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5498/
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5529/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6674/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5501/
@@ -69,6 +86,27 @@
indexBloomFilters = new ArrayList<>(indexColumns.size());
initDataMapFile();
resetBloomFilters();

keyGenerator = segmentProperties.getDimensionKeyGenerator();
Can we optimize this instead of passing the whole SegmentProperties
into this Writer class? Please check @ravipesala
Here we use the keyGenerator, ColumnarSpitter and dimensions for this segment.
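For illustration, the narrowed-dependency alternative being discussed could look like this (all names hypothetical, not CarbonData APIs; the reply above notes the writer needs the key generator, the columnar splitter and the dimensions, which is why passing `SegmentProperties` was kept):

```java
import java.util.List;

// Hypothetical sketch of the reviewer's suggestion: pass the writer only the
// pieces it actually uses instead of the whole SegmentProperties.
public class BloomWriterDepsSketch {
  interface KeyGen { byte[] generateKey(int[] dictValues); }

  final KeyGen keyGenerator;         // stand-in for the dimension key generator
  final List<String> dimensionNames; // stand-in for List<CarbonDimension>

  BloomWriterDepsSketch(KeyGen keyGenerator, List<String> dimensionNames) {
    this.keyGenerator = keyGenerator;
    this.dimensionNames = dimensionNames;
  }
}
```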
datamap/bloom/pom.xml
Outdated
<version>${project.version}</version>
</dependency>
<!--note: guava 14.0.1 is omitted during assembly.
The compile scope here is for building and running test-->
you have not added the compile scope
Oh, this line should be removed. It was previously used for the guava cache. Will fix it.
row.update(1, index);
} else if (value.equals(nullFormat)) {
row.update(1, index);
return 1;
Suggest creating a constant for the null value (1)
OK~
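A minimal sketch of that suggestion (the constant name is hypothetical):

```java
// Hypothetical sketch: replace the magic number 1 returned for nulls with a
// named constant.
public class NullValueSketch {
  public static final int NULL_VALUE = 1; // hypothetical constant name

  // Returns NULL_VALUE when the raw value is null or equals the configured
  // null format string, 0 otherwise.
  public static int ordinalFor(String value, String nullFormat) {
    if (value == null || value.equals(nullFormat)) {
      return NULL_VALUE;
    }
    return 0;
  }
}
```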
For a dictionary column, carbon converts the literal value to a dict value, then converts the dict value to an MDK value, and finally stores the MDK value as the internal value in the carbon file. For sort columns and date columns, the value has also been encoded. In the bloomfilter datamap we therefore index on the encoded data, that is to say: for dictionary/date columns we use the surrogate key as the bloom index key; for sort columns and ordinary dimensions we use the plain bytes as the bloom index key; for measures we convert the value to bytes and use that as the bloom index key. Changes made:
1. FieldConverters were refactored to extract common value-convert methods.
2. BloomQueryModel was optimized to support converting literal values to internal values.
3. Fix bugs for int/float/date/timestamp as bloom index columns.
4. Fix bugs for dictionary/sort columns as bloom index columns.
5. Add tests.
6. Block (deferred) rebuild for the bloom datamap (it contains bugs that are not fixed in this commit; another PR has been raised).
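The keying rule above can be sketched roughly as follows (all names are hypothetical, not CarbonData APIs; surrogate keys are shown as 4-byte ints and measures as 8-byte longs purely for illustration):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the bloom index keying rule described above.
public class BloomKeySketch {
  enum ColumnKind { DICTIONARY_OR_DATE, PLAIN_DIMENSION, MEASURE }

  static byte[] indexKey(ColumnKind kind, Object value) {
    switch (kind) {
      case DICTIONARY_OR_DATE:
        // dictionary/date: index the surrogate key assigned during loading
        return ByteBuffer.allocate(4).putInt((Integer) value).array();
      case PLAIN_DIMENSION:
        // sort columns and ordinary dimensions: index the plain bytes
        return value.toString().getBytes(StandardCharsets.UTF_8);
      case MEASURE:
        // measures: convert the value to bytes and index those
        return ByteBuffer.allocate(8).putLong(((Number) value).longValue()).array();
      default:
        throw new IllegalArgumentException("unknown kind");
    }
  }
}
```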
Force-pushed b9d0976 to 569e738 (compare)
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/6696/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/5523/
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5551/
LGTM
- [CARBONDATA-2587][CARBONDATA-2588] Local Dictionary Data Loading support. Added code to support local dictionary data loading for primitive and complex types. Manual testing was done in a 3-node setup; UTs will be raised in a different PR. This closes apache#2402
- [CARBONDATA-2647][CARBONDATA-2648] Add support for COLUMN_META_CACHE and CACHE_LEVEL in create table and alter table properties. Support for configuring COLUMN_META_CACHE and CACHE_LEVEL in create and alter table set properties DDL, plus describe formatted display support for both. Create table syntax: CREATE TABLE [dbName].tableName (col1 String, col2 String, col3 int, …) STORED BY 'carbondata' TBLPROPERTIES ('COLUMN_META_CACHE'='col1,col2,…', 'CACHE_LEVEL'='BLOCKLET'). Alter table set properties syntax: ALTER TABLE [dbName].tableName SET TBLPROPERTIES ('COLUMN_META_CACHE'='col1,col2,…', 'CACHE_LEVEL'='BLOCKLET'). This closes apache#2418
- [CARBONDATA-2549] Bloom: remove guava cache and use CarbonCache. The bloom cache was implemented with a guava cache, but carbon has its own LRU cache interfaces so that the system controls the cache as a whole instead of per feature. Replace the guava cache with the carbon LRU cache. This closes apache#2327
- [CARBONDATA-2608] Document update about the JSON writer, with examples. This closes apache#2409
- [CARBONDATA-2634][BloomDataMap] Add datamap properties to show datamap output. This closes apache#2404
- [CARBONDATA-2647][CARBONDATA-2648] Fix cache level display in the describe formatted command. 1. Correct the CACHE_LEVEL display, which always showed BLOCK even when the value was configured as BLOCKLET. 2. Correct the method arguments to pass dbName first and then tableName. 3. Added a test case for blocking column_meta_cache and cache_level on child datamaps. This closes apache#2426
- [CARBONDATA-2669] Local dictionary store size optimisation and other functional issues. Problems: when all column data is empty and the columns are not in sort columns, the local dictionary store was larger than the no-dictionary store; page-level dictionary merging missed some dictionary values in a blocklet because an AND operation was done on the bitset; null values were not added in LV, so new dictionary values were generated for nulls; the local dictionary generator was thread-specific. Solution: added RLE for unsorted dictionary values to reduce size; an OR operation is now performed while merging dictionary values; added LV for null values; made the local dictionary generator task-specific. This closes apache#2427
- [CARBONDATA-2585][CARBONDATA-2586][Local Dictionary] Local dictionary support for alter table, preaggregate, varchar datatype, and alter table set/unset commands, with all related validations. All tests were executed on a 3-node cluster; UTs and SDV test cases are added in the same PR. This closes apache#2401
- [HOTFIX] Fixed compilation issues and a bloom clear issue; fixed tests. This closes apache#2428
- [CARBONDATA-2635][BloomDataMap] Support different index datamaps on the same column. A user can create index datamaps from different providers on one column (for example a bloomfilter datamap and a lucene datamap), but not two bloomfilter datamaps on one column. This closes apache#2405
- [CARBONDATA-2646][DataLoad] Change the log level from ERROR to WARN for some expected tasks while loading data into a table with the 'sort_column_bounds' property. This closes apache#2407
- [CARBONDATA-2545] Fix some spelling errors in CarbonData. This closes apache#2419
- [CARBONDATA-2629] Support SDK carbon reader reading data from HDFS and S3 with filter functions. Previously the SDK carbon reader only supported reading local data with filters and threw an exception for HDFS and S3. This closes apache#2399
- [CARBONDATA-2644][DataLoad] Add an invalid-value check for the carbon.load.sortMemory.spill.percentage parameter. This closes apache#2397
- [CARBONDATA-2653][BloomDataMap] Fix incorrect blocklet number in bloomfilter. In the non-deferred rebuild scenario, the last bloomfilter index file has already been written in onBlockletEnd; there is no need to write it again, otherwise an extra blocklet number is generated in the bloom index file. This closes apache#2408
- [CARBONDATA-2674][Streaming] Streaming with merge index enabled does not consider the merge index file while pruning. This closes apache#2429
- [CARBONDATA-2606][Complex DataType Enhancements] Fix ComplexDataType projection pushdown. Problem 1: pushdown failed when the table schema contains column names in upper case; solution: change column names to lower case. Problem 2: if a struct contains an array, only the parent column should be pushed down; solution: check for ArrayType or GetArrayItem in the complex column and, if any ArrayType is found, push down the parent column. This closes apache#2421
- [CARBONDATA-2633][BloomDataMap] Fix bugs in bloomfilter for dictionary/sort/date/timestamp columns. For a dictionary column, carbon converts the literal value to a dict value, then converts the dict value to an MDK value, and finally stores the MDK value as the internal value in the carbon file. For other columns, carbon converts the literal value to an internal value using a field converter. Since the bloomfilter datamap stores internal values, at query time we must convert the literal value in the filter to the internal value in order to match what is stored in the datamap. Changes made: 1. FieldConverters were refactored to extract common value-convert methods. 2. BloomQueryModel was optimized to support converting literal values to internal values. 3. Fix bugs for int/float/date/timestamp as bloom index columns. 4. Fix bugs for dictionary/sort columns as bloom index columns. 5. Add tests. 6. Block (deferred) rebuild for the bloom datamap (it contains bugs that are not fixed in this commit). This closes apache#2403
- [HOTFIX][32K] Maintain proper mapping of varchar and no-dictionary columns across all dimensions when creating the sort data rows instance. Problem: the column mapping for varchar and no-dictionary columns among the existing dimensions was incorrect. Solution: remove an unwanted counter variable and map the correct index to varchar and no-dictionary columns based on the number of dimensions. This closes apache#2395
- [CARBONDATA-2650][Datamap] Fix negative number of skipped blocklets. Currently the default blocklet datamap is used to prune blocklets, then other index datamaps are applied. But the other index datamaps work at segment scope, so in some scenarios their pruned result is larger than that of the default datamap, producing a negative number of skipped blocklets in the explain output. We now intersect the results after pruning, and finish pruning early if the pruned result size is zero. This closes apache#2410
- [CARBONDATA-2654][Datamap] Optimize explain output for queries with datamaps. Previously, if a query hit multiple datamaps, the explain command only showed the first one. Now all datamaps hit by the query are shown. This closes apache#2411
- [CARBONDATA-2687][BloomDataMap][Doc] Update the bloomfilter datamap document. A previous PR changed the cache behaviour from guava-cache to carbon-cache; this PR updates the document accordingly and removes the cache description. This closes apache#2446
- [CARBONDATA-2684] Distinct count fails on complex columns. Fixes a code generator error thrown when a select filter contains more than one count-distinct of a complex column with a group by clause. This closes apache#2449
- [CARBONDATA-2645] Segregate block and blocklet cache. The driver caches metadata based on CACHE_LEVEL: if CACHE_LEVEL is BLOCK, only carbondata file metadata is cached in the driver; if BLOCKLET, metadata for (number of carbondata files × number of blocklets per file) is cached. This closes apache#2437
- [CARBONDATA-2675][32K] Support configuring long_string_columns when creating a datamap. When a datamap is created with a select statement, a long string column is defined as StringType in the result dataframe if selected. This PR allows setting the long_string_columns property in dmproperties. This closes apache#2432
- [CARBONDATA-2683][32K] Fix data conversion for varchar. Spark uses org.apache.spark.unsafe.types.UTF8String internally for the string datatype; in carbon, the varchar datatype should do the same conversion as the string datatype, otherwise it may throw an exception. This closes apache#2438
- [CARBONDATA-2657][BloomDataMap] Fix bugs in loading and querying with empty values on bloom index columns; convert null values to corresponding values. This closes apache#2413
- [CARBONDATA-2585][CARBONDATA-2586][Local Dictionary] Added test cases for local dictionary support for alter table, set, unset and preaggregate, covering all related validations. This closes apache#2422
- [CARBONDATA-2606][Complex DataType Enhancements] Fixed projection pushdown when a select filter contains a struct column. Problem: if the filter contains a struct column that is not in the projection list, only null values are stored for it and the select query result is null. Solution: push down the parent column of the corresponding struct type if any struct column is present in the filter list. This closes apache#2439
- [CARBONDATA-2642] Added a configurable lock path property. A new property, "carbon.lock.path", lets the user configure the lock path; refactored the code to create a separate implementation for S3CarbonFile. This closes apache#2642
- [CARBONDATA-2686] Implement left join on MV datamap. This closes apache#2444
- [CARBONDATA-2660][BloomDataMap] Add a test for querying on a longstring bloom index column. Filtering on longstring bloom index columns is already supported by PR apache#2403; this only adds a test for it. This closes apache#2416
- [CARBONDATA-2689] Added validations for complex columns in alter set statements. Issue: alter set statements did not validate complex datatype columns correctly. Fix: added a recursive method to validate string and varchar child columns of complex datatype columns. This closes apache#2450
- [CARBONDATA-2681][32K] Fix loading failure with global/batch sort when the table has long string columns. In SortStepRowHandler, global/batch sort uses convertRawRowTo3Parts instead of convertIntermediateSortTempRowTo3Parted; varcharDimCnt was not added to noDictArray, causing the error "Problem while converting row to 3 parts". This closes apache#2435
- [CARBONDATA-2658][DataLoad] Fix bugs in spilling in-memory pages. The carbon.load.sortMemory.spill.percentage parameter accepts values in the range 0-100; in-memory pages are merged and spilled to disk according to this configuration. This closes apache#2414
- [CARBONDATA-2666] Updated the rename command so that the table directory is not renamed; rename now only changes metadata. This closes apache#2420
- [CARBONDATA-2637][BloomDataMap] Fix bugs in deferred rebuild for the bloomfilter datamap. When ISSUE-2633 was implemented, deferred rebuild was disabled for the bloomfilter datamap due to unhandled bugs. This commit fixes those bugs and brings the feature back. Since the bloomfilter datamap indexes the carbon native raw bytes, the original literal values must be converted to carbon native bytes both in loading and querying. This closes apache#2425
- [CARBONDATA-2701] Refactor code to store only minimal required info in the block and blocklet cache. 1. Keep only minimal information in the block and blocklet cache. 2. Introduced a JVM-level segment properties holder: since it is a heavy object, a new segment properties object is created only when the schema or cardinality of a table changes. This closes apache#2454
- [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602] Local dictionary query support. Supports non-filter queries, filter queries, queries on complex columns for primitive-type local dictionary columns, local dictionary on varchar columns, and the vector reader on local dictionary. This closes apache#2447
- [CARBONDATA-2585][CARBONDATA-2586] Fix local dictionary support for preaggregate and set local dictionary info in the column schema. Sets the dict info of each column in column schema read and write for backward compatibility. This closes apache#2451
- [CARBONDATA-2711] carbonFileList is not initialized when updatetablelist is called. Bug fix: carbon is not initialized within the updatetablelist method when executing 'SELECT table_name FROM information_schema.tables WHERE table_schema = 'tmp_sbu_vadmdb'' from the command line. This closes apache#2468
- [CARBONDATA-2685][DataMap] Parallelize datamap rebuild processing for segments. Previously one spark job was started per segment and all jobs ran serially, so rebuilding many historical segments took a long time. Now one task is started per segment and all tasks run in parallel within one spark job. This closes apache#2443
- [CARBONDATA-2706][BloomDataMap] Clear bloom index files after the corresponding segment is deleted and cleaned. This closes apache#2461
- [CARBONDATA-2715][LuceneDataMap] Fix a bug in search mode with the lucene datamap on Windows. When comparing two paths, the file separator differs on Windows, producing empty pruned blocklets; this PR ignores the file separator. This closes apache#2470
- [CARBONDATA-2703][Tests] Clean up the environment after tests: 1. reset session parameters after tests; 2. clean up output after tests. This closes apache#2458
- [CARBONDATA-2607][Complex Column Enhancements] Complex primitive datatype adaptive encoding, to store complex types more effectively and make reading more efficient. Primitive types inside complex types are now separate pages: previously a complex column was a single byte-array column page, now all sub-levels are stored as separate pages with their respective datatypes. No-dictionary primitive datatypes inside complex columns are processed through adaptive encoding (previously only snappy compression was applied). For no-dictionary primitives inside complex columns, only the value is saved, except String and Varchar, which are saved as byte arrays; previously all sub-levels were saved in length-and-value format inside a single byte array. Currently only struct and array type column pages are saved as byte arrays; all other primitives except string and varchar are saved at their fixed datatype length. Added support in the safe and unsafe fixed-length column pages for a growing dynamic array, to support the array datatype. Co-authored-by: sounakr <sounakr@gmail.com> This closes apache#2417
[CARBONDATA-2587][CARBONDATA-2588] Local Dictionary Data Loading support What changes are proposed in this PR Added code to support Local Dictionary Data Loading for primitive type Added code to support Local Dictionary Data Loading for complex type. How this PR is tested Manual testing is done in 3 Node setup. UT will be raised in different PR This closes apache#2402 [CARBONDATA-2647] [CARBONDATA-2648] Add support for COLUMN_META_CACHE and CACHE_LEVEL in create table and alter table properties Things done as part of this PR Support for configuring COLUMN_META_CACHE in create and alter table set properties DDL. Support for configuring CACHE_LEVEL in create and alter table set properties DDL. Describe formatted display support for COLUMN_META_CACHE and CACHE_LEVEL Any interfaces changed? Create Table Syntax CREATE TABLE [dbName].tableName (col1 String, col2 String, col3 int,…) STORED BY ‘carbondata’ TBLPROPERTIES (‘COLUMN_META_CACHE’=’col1,col2,…’, 'CACHE_LEVEL'='BLOCKLET') Alter Table set properties Syntax ALTER TABLE [dbName].tableName SET TBLPROPERTIES (‘COLUMN_META_CACHE’=’col1,col2,…’, 'CACHE_LEVEL'='BLOCKLET') This closs apache#2418 [CARBONDATA-2549] Bloom remove guava cache and use CarbonCache Currently, bloom cache is implemented using guava cache, carbon has its own lru cache interfaces and complete sysytem it controls the cache intstead of controlling feature wise. So replace guava cache with carbon lru cache. This closes apache#2327 [CARBONDATA-2608]Document update about Json Writer with examples. Document update about Json Writer with examples. This closes apache#2409 [CARBONDATA-2634][BloomDataMap] Add datamap properties in show datamap outputs add datamap properties in show datamap outputs This closes apache#2404 [CARBONDATA-2647] [CARBONDATA-2648] Fix cache level display in describe formatted command 1. Correct CACHE_LEVEL display in describe formatted command. It was always displays BLOCK even though val was configured BLOCKLET. 2. 
Correct the method arguments to pass dbName first and then tableName. 3. Added test case for blocking column_meta_cache and cache_level on child dataMaps. This closes apache#2426 [CARBONDATA-2669] Local Dictionary Store Size optimisation and other function issues Problems Local dictionary store size issue. When all column data is empty and columns are not present in sort columns local dictionary size was more than no dictionary dictionary store size. Page level dictionary merging Issue While merging the page used dictionary values in a blocklet it was missing some of the dictionary values, this is because, AND operation was done on bitset Local Dictionary null values Null value was not added in LV because of this new dictionary values was getting generated for null values Local dictionary generator thread specific Solution: Added rle for unsorted dictionary values to reduce the size. Now OR operation is performed while merging the dictionary values Added LV for null values Local dictionary generator task specific This closes apache#2427 [CARBONDATA-2585][CARBONDATA-2586][Local Dictionary]Local dictionary support for alter table, preaggregate, varchar datatype, alter set and unset What changes were proposed in this pull request? In this PR, local dictionary support is added for alter table, preaggregate, varChar datatype, alter table set and unset command UTs are added for local dictionary load support All the validations related to above features are taken care in this PR How was this patch tested? All the tests were executed in 3 node cluster. 
UTs and SDV test cases are added in the same PR This closes apache#2401 [HOTFIX] Fixed compilation issues and bloom clear issue Fixed test This closes apache#2428 [CARBONDATA-2635][BloomDataMap] Support different index datamaps on same column User can create different provider based index datamaps on one column, for example user can create bloomfilter datamap and lucene datamap on one column, but not able to create two bloomfilter datamap on one column. This closes apache#2405 [CARBONDATA-2646][DataLoad]change the log level while loading data into a table with 'sort_column_bounds' property,'ERROR' flag change to 'WARN' flag for some expected tasks. change the log level while loading data into a table with 'sort_column_bounds' property,'ERROR' flag change to 'WARN' flag for some expected tasks. This closes apache#2407 [CARBONDATA-2545] Fix some spell error in CarbonData This closes apache#2419 [CARBONDATA-2629] Support SDK carbon reader read data from HDFS and S3 with filter function Now SDK carbon reader only support read data from local with filter function, it will throw exception when read data from HDFS and S3 with filter function This PR support it: Support SDK carbon reader read data from HDFS and S3 with filter function This closes apache#2399 [CARBONDATA-2644][DataLoad]ADD carbon.load.sortMemory.spill.percentage parameter invalid value check This closes apache#2397 [CARBONDATA-2653][BloomDataMap] Fix bugs in incorrect blocklet number in bloomfilter In non-deferred reuibuild scenario, the last bloomfilter index file has already been written onBlockletEnd, no need to write again, otherwise an extra blocklet number will be generated in the bloom index file. 
This closes apache#2408 [CARBONDATA-2674][Streaming]Streaming with merge index enabled does not consider the merge index file while pruning This closes apache#2429 [CARBONDATA-2606][Complex DataType Enhancements]Fix for ComplexDataType Projection PushDown Problem1: Fix for ComplexDataType Projection PushDown when Table Schema contains ColumnName in UpperCase Solution: Change ColumnName to Lowercase Problem2: If Struct contains Array, pushdown only parent column Solution: Check for ArrayType or GetArrayItem in the Complex Column, if any ArrayType is found, then pushdown parent column This closes apache#2421 [CARBONDATA-2633][BloomDataMap] Fix bugs in bloomfilter for dictionary/sort/date/TimeStamp column for dictionary column, carbon convert literal value to dict value, then convert dict value to mdk value, at last it stores the mdk value as internal value in carbonfile. for other columns, carbon convert literal value to internal value using field-converter. Since bloomfilter datamap stores the internal value, during query we should convert the literal value in filter to internal value in order to match the value stored in bloomfilter datamap. Changes are made: 1.FieldConverters were refactored to extract common value convert methods. 2.BloomQueryModel was optimized to support converting literal value to internal value. 3.fix bugs for int/float/date/timestamp as bloom index column 4.fix bugs in dictionary/sort column as bloom index column 5.add tests 6.block (deferred) rebuild for bloom datamap (contains bugs that does not fix in this commit) This closes apache#2403 [HOTFIX][32K]maintain proper mapping for varChar Columns and noDictionary Columns for all the dimensions while creating sort data rows instance Problem: when creating the column mapping for varChar columns and no dictionary columns for existing dimensions, the mapping is incorrect. 
Solution: remove unwanted variable counter and map correct index to varChar columns and noDictionary columns based on the number of dimensions This closes apache#2395 [CARBONDATA-2650][Datamap] Fix bugs in negative number of skipped blocklets Currently in carbondata, default blocklet datamap will be used to prune blocklets. Then other indexdatamap will be used. But the other index datamap works for segment scope, which in some scenarios, the size of pruned result will be bigger than that of default datamap, thus causing negative number of skipped blocklets in explain query output. Here we add intersection after pruning. If the pruned result size is zero, we will finish the pruning. This closes apache#2410 [CARBONDATA-2654][Datamap] Optimize output for explaining querying with datamap Currently if we have multiple datamaps and the query hits all the datamaps, carbondata explain command will only show the first datamap and all the datamaps are not shown. In this commit, we show all the datamaps that are hitted in this query. This closes apache#2411 [CARBONDATA-2687][BloomDataMap][Doc] Update document for bloomfilter datamap In previous PR, cache behaviour for bloomfilter datamap has been changed: changed from guava-cache to carbon-cache. This PR update the document for bloomfilter datamap and remove the description for cache. This closes apache#2446 Code Generator Error is thrown when Select filter contains more than one count of distinct of ComplexColumn with group by Clause [CARBONDATA-2684] [PR-2442] Distinct count fails on complex columns This PR fixes Code Generator Error thrown when Select filter contains more than one count of distinct of ComplexColumn with group by Clause This closes apache#2449 [CARBONDATA-2645] Segregate block and blocklet cache Things done as part of this PR Segregate block and blocklet cache. In this driver will cache the metadata based on CACHE_LEVEL. 
If CACHE_LEVEL is set to BLOCK, only the carbondata files' metadata is cached in the driver. If CACHE_LEVEL is set to BLOCKLET, metadata for (number of carbondata files * number of blocklets in each carbondata file) is cached in the driver. This closes apache#2437

[CARBONDATA-2675][32K] Support configuring long_string_columns when creating a datamap. When a datamap is created with a select statement, a long string column is defined with StringType in the result dataframe if that column is selected. This PR allows setting the long_string_columns property in dmproperties. This closes apache#2432

[CARBONDATA-2683][32K] Fix data conversion problem for varchar. Spark uses org.apache.spark.unsafe.types.UTF8String for the string datatype internally. In carbon, the varchar datatype should do the same conversion as the string datatype, otherwise it may throw an exception. This closes apache#2438

[CARBONDATA-2657][BloomDataMap] Fix bugs in loading and querying with empty values on bloom index columns. Convert null values to their corresponding values. This closes apache#2413

[CARBONDATA-2585][CARBONDATA-2586][Local Dictionary] Added test cases for local dictionary support for alter table, set, unset and pre-aggregate. All validations related to these features are taken care of in this PR. This closes apache#2422

[CARBONDATA-2606][Complex DataType Enhancements] Fixed projection pushdown when a select filter contains a Struct column. Problem: if the select filter contains a Struct column that is not in the projection list, only a null value is stored for the struct column given in the filter, and the select query result is null. Solution: push down the parent column of the corresponding struct type if any struct column is present in the filter list.
This closes apache#2439

[CARBONDATA-2642] Added configurable lock path property. A new property, "carbon.lock.path", is exposed to let the user configure the lock path. Refactored code to create a separate implementation for S3CarbonFile. This closes apache#2642

[CARBONDATA-2686] Implement left join on MV datamap. This closes apache#2444

[CARBONDATA-2660][BloomDataMap] Add test for querying on a longstring bloom index column. Filtering on a longstring bloom index column is already supported by PR apache#2403; here we only add a test for it. This closes apache#2416

[CARBONDATA-2689] Added validations for complex columns in alter set statements. Issue: alter set statements were not validating complex datatype columns correctly. Fix: added a recursive method to validate string and varchar child columns of complex datatype columns. This closes apache#2450

[CARBONDATA-2681][32K] Fix loading failure when using global/batch sort on a table with long string columns. In SortStepRowHandler, global/batch sort uses convertRawRowTo3Parts instead of convertIntermediateSortTempRowTo3Parted. varcharDimCnt was not added to noDictArray, causing the error "Problem while converting row to 3 parts". This closes apache#2435

[CARBONDATA-2658][DataLoad] Fix bugs in spilling in-memory pages. The parameter carbon.load.sortMemory.spill.percentage accepts values in the range 0-100; according to this configuration, in-memory pages are merged and spilled to disk. This closes apache#2414

[CARBONDATA-2666] Updated the rename command so that the table directory is not renamed. Rename no longer renames the table folder; it only changes metadata. This closes apache#2420

[CARBONDATA-2637][BloomDataMap] Fix bugs in deferred rebuild for the bloomfilter datamap. When ISSUE-2633 was implemented, deferred rebuild was disabled for the bloomfilter datamap due to unhandled bugs. In this commit, we fix those bugs and bring the feature back.
Since the bloomfilter datamap creates its index on carbon native raw bytes, we have to convert the original literal value to carbon native bytes both in loading and in querying. This closes apache#2425

[CARBONDATA-2701] Refactor code to store minimal required info in block and blocklet cache. 1. Refactored code to keep only minimal information in the block and blocklet cache. 2. Introduced a segment properties holder at JVM level to hold the segment properties. As it is a heavy object, a new segment properties object is created only when the schema or cardinality of a table changes. This closes apache#2454

[CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602] Local dictionary query support. Supported non-filter queries on local dictionary. Supported filter queries on local dictionary. Supported queries on complex columns for primitive-type local dictionary columns. Local dictionary support on varchar columns. Supported vector reader on local dictionary. This closes apache#2447

[CARBONDATA-2585][CARBONDATA-2586] Fix local dictionary support for preaggregate and set local dictionary info in column schema. This PR fixes local dictionary support for preaggregate tables and sets the dictionary info of each column in the column schema read and write paths for backward compatibility.
This closes apache#2451

[CARBONDATA-2711] carbonFileList is not initialized when updatetablelist is called. Bug fix: carbon is not initialized within the updatetablelist method when we execute 'SELECT table_name FROM information_schema.tables WHERE table_schema = 'tmp_sbu_vadmdb' from the command line. This closes apache#2468

[CARBONDATA-2685][DataMap] Parallelize datamap rebuild processing for segments. Currently in carbondata, while rebuilding a datamap, one spark job is started for each segment and all the jobs are executed serially. If there are many historical segments, the rebuild takes a lot of time. Here we optimize the datamap rebuild procedure to start one task for each segment, so all tasks can run in parallel within one spark job. This closes apache#2443

[CARBONDATA-2706][BloomDataMap] Clear bloom index files after a segment is deleted. Bloom index files are cleared after the corresponding segment is deleted and cleaned. This closes apache#2461

[CARBONDATA-2715][LuceneDataMap] Fix bug in search mode with lucene datamap on Windows. When comparing two paths, the file separator differs on Windows, causing empty pruned blocklets. This PR ignores the file separator. This closes apache#2470

[CARBONDATA-2703][Tests] Clean up env after tests. 1. Reset session parameters after tests. 2. Clean up output after tests. This closes apache#2458

[CARBONDATA-2607][Complex Column Enhancements] Complex primitive datatype adaptive encoding. This PR stores complex types more effectively so that reading becomes more efficient. The changes are: primitive types inside complex types are stored as separate pages. Previously there was a single byte-array column page for a complex column; now all sub-levels inside the complex data types are stored as separate pages with their respective datatypes. No-dictionary primitive datatypes inside complex columns are processed through adaptive encoding; previously only snappy compression was applied.
For all primitive datatypes inside a complex column, if the column is no-dictionary, only the value is saved, except for String and Varchar which are saved as byte arrays. Previously all sub-levels were saved in length-and-value format inside a single byte array. Currently only Struct and Array type column pages are saved as byte arrays; all other primitives except String and Varchar are saved at their respective fixed datatype length. Added support for the safe and unsafe fixed-length column pages to back a growing dynamic-array implementation; this is done to support the Array datatype. Co-authored-by: sounakr <sounakr@gmail.com> This closes apache#2417
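The layout change above (one length-value byte array for all complex children vs. per-datatype fixed-width pages) can be sketched with a toy encoding comparison. This is illustrative only, under assumed names (`encode_lv`, `encode_fixed_ints`); it is not CarbonData's actual page implementation:

```python
import struct

def encode_lv(values):
    # Old layout: every child value packed as [4-byte length][value bytes]
    # inside one byte-array page, regardless of datatype.
    out = bytearray()
    for v in values:
        out += struct.pack(">i", len(v)) + v
    return bytes(out)

def encode_fixed_ints(values):
    # New layout for fixed-width primitives (e.g. int children):
    # just the values at their fixed datatype length, no length prefixes.
    return b"".join(struct.pack(">i", v) for v in values)

ints = [10, 20, 30, 40]
lv_page = encode_lv([struct.pack(">i", v) for v in ints])
fixed_page = encode_fixed_ints(ints)

print(len(lv_page), len(fixed_page))  # 32 16 -> length prefixes double the page size here
```

Besides the size saving, a fixed-width page lets a reader seek to row i at offset `i * 4` directly, which is what makes the per-datatype pages cheaper to decode than a single length-value blob.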
[CARBONDATA-2587][CARBONDATA-2588] Local dictionary data loading support. What changes are proposed in this PR: added code to support local dictionary data loading for primitive types, and added code to support local dictionary data loading for complex types. How this PR is tested: manual testing is done in a 3-node setup; UTs will be raised in a different PR. This closes apache#2402

[CARBONDATA-2647] [CARBONDATA-2648] Add support for COLUMN_META_CACHE and CACHE_LEVEL in create table and alter table properties. Things done as part of this PR: support for configuring COLUMN_META_CACHE in the create and alter table set properties DDL; support for configuring CACHE_LEVEL in the create and alter table set properties DDL; describe formatted display support for COLUMN_META_CACHE and CACHE_LEVEL. Any interfaces changed? Create table syntax: CREATE TABLE [dbName].tableName (col1 String, col2 String, col3 int, …) STORED BY 'carbondata' TBLPROPERTIES ('COLUMN_META_CACHE'='col1,col2,…', 'CACHE_LEVEL'='BLOCKLET'). Alter table set properties syntax: ALTER TABLE [dbName].tableName SET TBLPROPERTIES ('COLUMN_META_CACHE'='col1,col2,…', 'CACHE_LEVEL'='BLOCKLET'). This closes apache#2418

[CARBONDATA-2549] Bloom: remove guava cache and use CarbonCache. Currently the bloom cache is implemented using a guava cache, but carbon has its own LRU cache interfaces so that the system controls the cache as a whole instead of per feature. So replace the guava cache with the carbon LRU cache. This closes apache#2327

[CARBONDATA-2608] Document update about the JSON writer, with examples. This closes apache#2409

[CARBONDATA-2634][BloomDataMap] Add datamap properties in show datamap outputs. This closes apache#2404

[CARBONDATA-2647] [CARBONDATA-2648] Fix cache level display in the describe formatted command. 1. Correct the CACHE_LEVEL display in describe formatted: it always displayed BLOCK even when the value was configured as BLOCKLET. 2.
Correct the method arguments to pass dbName first and then tableName. 3. Added a test case for blocking column_meta_cache and cache_level on child datamaps. This closes apache#2426

[CARBONDATA-2669] Local dictionary store size optimisation and other functional issues. Problems: 1. Local dictionary store size: when all column data is empty and the columns are not present in sort columns, the local dictionary store size was larger than the no-dictionary store size. 2. Page-level dictionary merging: while merging the dictionary values used by the pages in a blocklet, some dictionary values were missed because an AND operation was done on the bitset. 3. Local dictionary null values: the null value was not added in LV format, so new dictionary values were generated for null values. 4. The local dictionary generator was thread specific. Solution: added RLE for unsorted dictionary values to reduce the size; an OR operation is now performed while merging the dictionary values; added LV for null values; made the local dictionary generator task specific. This closes apache#2427

[CARBONDATA-2585][CARBONDATA-2586][Local Dictionary] Local dictionary support for alter table, preaggregate, varchar datatype, alter set and unset. What changes were proposed in this pull request? In this PR, local dictionary support is added for alter table, preaggregate, the varchar datatype, and the alter table set and unset commands. UTs are added for local dictionary load support. All validations related to these features are taken care of in this PR. How was this patch tested? All tests were executed on a 3-node cluster.
UTs and SDV test cases are added in the same PR. This closes apache#2401

[HOTFIX] Fixed compilation issues and a bloom clear issue. Fixed tests. This closes apache#2428

[CARBONDATA-2635][BloomDataMap] Support different index datamaps on the same column. A user can create index datamaps from different providers on one column, for example a bloomfilter datamap and a lucene datamap, but cannot create two bloomfilter datamaps on one column. This closes apache#2405

[CARBONDATA-2646][DataLoad] Change the log level while loading data into a table with the 'sort_column_bounds' property: the 'ERROR' flag is changed to 'WARN' for some expected tasks. This closes apache#2407

[CARBONDATA-2545] Fix some spelling errors in CarbonData. This closes apache#2419

[CARBONDATA-2629] Support SDK carbon reader reading data from HDFS and S3 with a filter function. Previously the SDK carbon reader only supported reading data from local storage with a filter function, and threw an exception when reading from HDFS or S3 with a filter. This PR adds that support. This closes apache#2399

[CARBONDATA-2644][DataLoad] Add an invalid-value check for the carbon.load.sortMemory.spill.percentage parameter. This closes apache#2397

[CARBONDATA-2653][BloomDataMap] Fix incorrect blocklet number in bloomfilter. In the non-deferred rebuild scenario, the last bloomfilter index file has already been written in onBlockletEnd, so there is no need to write it again; otherwise an extra blocklet number is generated in the bloom index file.
This closes apache#2408
This closes apache#2439 [CARBONDATA-2642] Added configurable Lock path property A new property is being exposed which will allow the user to configure the lock path "carbon.lock.path" Refactored code to create a separate implementation for S3CarbonFile. This closes apache#2642 [CARBONDATA-2686] Implement Left join on MV datamap This closes apache#2444 [CARBONDATA-2660][BloomDataMap] Add test for querying on longstring bloom index column Filtering on longstring bloom index column is already supported in PR apache#2403, here we only add test for it. This closes apache#2416 [CARBONDATA-2689] Added validations for complex columns in alter set statements Issue: Alter set statements were not validating complex dataType columns correctly. Fix: Added a recursive method to validate string and varchar child columns of complex dataType columns. This closes apache#2450 [CARBONDATA-2681][32K] Fix loading problem using global/batch sort fails when table has long string columns In SortStepRowHandler, global/batch sort use convertRawRowTo3Parts instead of convertIntermediateSortTempRowTo3Parted. varcharDimCnt was not add up to noDictArray cause error: Problem while converting row to 3 parts. This closes apache#2435 [CARBONDATA-2658][DataLoad] Fix bugs in spilling in-memory pages the parameter carbon.load.sortMemory.spill.percentage configured the value range 0-100,according to configuration merge and spill in-memory pages to disk This closes apache#2414 [CARBONDATA-2666] updated rename command so that table directory is not renamed rename will not rename table folder but only changes metadata This closes apache#2420 [CARBONDATA-2637][BloomDataMap] Fix bugs for deferred rebuild for bloomfilter datamap Previously when we implement ISSUE-2633, deferred rebuild for bloom datamap is disabled for bloomfilter datamap due to unhandled bugs. In this commit, we fixed the bugs and bring this feature back. 
Since bloomfilter datamap create index on the carbon native raw bytes, we have to convert original literal value to carbon native bytes both in loading and querying. This closes apache#2425 [CARBONDATA-2701] Refactor code to store minimal required info in Block and Blocklet Cache 1. Refactored code to keep only minimal information in block and blocklet cache. 2. Introduced segment properties holder at JVM level to hold the segment properties. As it is heavy object, new segment properties object will be created only when schema or cardinality is changed for a table. This closes apache#2454 [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]Local dictionary query Support Supported Non filter query for local dictionary Supported Filter query on local dictionary Supported Query on complex column for primitive type local dictionary columns Local Dictionary support on Varchar columns Supported Vector reader on local dictionary [CARBONDATA-2589][CARBONDATA-2590][CARBONDATA-2602]Local dictionary query Support Supported Non filter query for local dictionary Supported Filter query on local dictionary Supported Query on complex column for primitive type local dictionary columns Local Dictionary support on Varchar columns Supported Vector reader on local dictionary This closes apache#2447 [CARBONDATA-2585][CARBONDATA-2586]Fix local dictionary support for preagg and set localdict info in column schema This PR fixes local dictionary support for preaggregate and set the column dict info of each column in column schema read and write for backward compatibility. 
This closes apache#2451 [CARBONDATA-2711] carbonFileList is not initalized when updatetablelist call bug fix: carbon is not initalized within updatetablelist method when we execute 'SELECT table_name FROM information_schema.tables WHERE table_schema = 'tmp_sbu_vadmdb' from command line This closes apache#2468 [CARBONDATA-2685][DataMap] Parallize datamap rebuild processing for segments Currently in carbondata, while rebuilding datamap, one spark job will be started for each segment and all the jobs are executed serailly. If we have many historical segments, the rebuild will takes a lot of time. Here we optimize the procedure for datamap rebuild and start one start for each segments, all the tasks can be done in parallel in one spark job. This closes apache#2443 [CARBONDATA-2706][BloomDataMap] clear bloom index files after segment is deleted clear bloom index files after corresponding segment is deleted and cleaned This closes apache#2461 [CARBONDATA-2715][LuceneDataMap] Fix bug in search mode with lucene datamap in windows While comparing two pathes, the file separator is different in windows, thus causing empty pruned blocklets. This PR will ignore the file separator This closes apache#2470 [CARBONDATA-2703][Tests] Clear up env after tests 1.reset session parameters after test 2.clean up output after test This closes apache#2458 [CARBONDATA-2607][Complex Column Enhancements] Complex Primitive DataType Adaptive Encoding In this PR the improvement was done to save the complex type more effectively so that reading becomes more efficient. The changes are: Primitive types inside complex types are separate pages. Previously it was a single byte array column page for a complex column. Now all sub-levels inside the complex data types are stored as separate pages with their respective datatypes. No Dictionary Primitive DataTypes inside Complex Columns will be processed through Adaptive Encoding. Previously only snappy compression was applied. 
All Primitive datatypes inside complex if it is now dictionary, only value will be saved except String, Varchar which is saved as ByteArray. Previously all sub-levels are saved as Length And Value Format inside a single Byte Array. Currently only Struct And Array type column pages are saved in ByteArray. All other primitive except String and varchar are saved in respective fixed datatype length. Support for the Safe and Unsafe Fixed length Column Page to support growing dynamic array implementation. This is done to support Array datatype. Co-authored-by: sounakr <sounakr@gmail.com> This closes apache#2417
In carbondata:
For a dictionary column, carbon converts the literal value to a dictionary value, then
converts the dictionary value to an MDK value, and finally stores the MDK value as the
internal value in the carbon file.
For other columns, carbon converts the literal value to an internal value using the
field converter.
Since the bloomfilter datamap stores the internal value, during query we
must convert the literal value in the filter to the internal value in order to
match the value stored in the bloomfilter datamap.
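The mismatch described above can be sketched in a toy example. Note that `ToyBloomFilter`, `toInternalValue`, and the surrogate-key mapping below are hypothetical stand-ins invented for illustration, not CarbonData's actual classes: they only show why probing the bloom index with the raw literal bytes misses, while probing with the converted internal value matches.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class BloomConvertSketch {

    // Minimal bloom filter over a 1024-bit set with two hash probes.
    static class ToyBloomFilter {
        private final BitSet bits = new BitSet(1024);

        void put(byte[] value) {
            bits.set(hash(value, 17));
            bits.set(hash(value, 31));
        }

        boolean mightContain(byte[] value) {
            return bits.get(hash(value, 17)) && bits.get(hash(value, 31));
        }

        private int hash(byte[] value, int seed) {
            int h = seed;
            for (byte b : value) {
                h = h * 31 + b;
            }
            return h & 1023; // low 10 bits index into the 1024-bit set
        }
    }

    // Pretend dictionary: literal -> surrogate key.
    static final Map<String, Integer> DICT = new HashMap<>();
    static {
        DICT.put("beijing", 1);
        DICT.put("shanghai", 2);
    }

    // Stand-in for the load-side conversion: the literal is mapped to its
    // dictionary surrogate key, serialized as 4 big-endian bytes
    // (the "internal value" the bloom index is actually built on).
    static byte[] toInternalValue(String literal) {
        int surrogate = DICT.get(literal);
        return new byte[] {
            (byte) (surrogate >>> 24), (byte) (surrogate >>> 16),
            (byte) (surrogate >>> 8), (byte) surrogate };
    }

    public static void main(String[] args) {
        ToyBloomFilter bloom = new ToyBloomFilter();
        // Load side: index the internal value, not the literal bytes.
        bloom.put(toInternalValue("beijing"));

        // Query side: probing with the raw literal bytes misses ...
        System.out.println(bloom.mightContain("beijing".getBytes(StandardCharsets.UTF_8)));
        // ... while converting the literal to the internal value first matches.
        System.out.println(bloom.mightContain(toInternalValue("beijing")));
    }
}
```

This is the essence of the fix: the query side must run the filter literal through the same conversion the load side used before hashing it against the bloom index.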
Changes are made:
1. FieldConverters were refactored to extract common value-convert methods.
2. BloomQueryModel was optimized to support converting a literal value to the internal value.
3. Fix bugs for int/float/date/timestamp as bloom index columns.
4. Fix bugs for dictionary/sort columns as bloom index columns.
5. Add tests.
6. Block (deferred) rebuild for bloom datamap (contains bugs that are not fixed in this commit).
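For date columns in particular, the indexed internal value is a number rather than the literal string. The sketch below is a hypothetical stand-in for that field conversion (it assumes days-since-epoch as the internal representation, which is an illustration, not CarbonData's exact encoding): the point is only that a DATE literal must be converted to its numeric form before it can match what the bloom index stored.

```java
import java.time.LocalDate;

public class DateInternalValueSketch {
    // Hypothetical stand-in for a date field converter: the DATE literal
    // is indexed as an integer internal value (here: days since epoch),
    // not as its string bytes.
    static int toInternalDays(String literal) {
        return (int) LocalDate.parse(literal).toEpochDay();
    }

    public static void main(String[] args) {
        System.out.println(toInternalDays("1970-01-01")); // 0
        System.out.println(toInternalDays("1970-01-02")); // 1
    }
}
```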
Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:
Any interfaces changed?
No
Any backward compatibility impacted?
Yes, because the encoding of the index value has changed. Besides, deferred rebuild of the bloom datamap has been blocked in this PR and will be done later.
Document update required?
No
Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
Added tests
- How it is tested? Please attach test report.
Tested in local machine
- Is it a performance related change? Please attach the performance test report.
Query performance with bloomfilter may decrease, because it involves an extra value-conversion procedure
- Any additional information to help reviewers in testing this change.
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.