[CARBONDATA-1480] Min Max Index Example for DataMap #1359
Conversation
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/32/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/153/
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/782/
/**
 * End of block notification, fired once the index has been created.
 */
void onBlockEndWithIndex(String blockId, String directoryPath);
Why is this method required? Why isn't onBlockEnd enough?
onBlockEnd is called once the block is written. onBlockEndWithIndex is called once the index is also written, i.e. after the carbondata file has been written out.
I did not get the meaning of "index" here; this datamap is supposed to be independent of other indexes. I think the onBlockEnd event is enough for writing the index file.
But during onBlockEnd the carbonindex is not yet written, so we won't be able to access the carbonindex files. In the example I am gathering information from the carbonindex files too.
It is better to keep a hook after the index files are written as well. In future we may need some more hooks at different points.
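The ordering argument above can be sketched as a toy listener. All names here are hypothetical stand-ins, not CarbonData's actual DataMapWriter API: the point is only that onBlockEnd fires when the data file is written but before the carbonindex file exists, while onBlockEndWithIndex fires after the index is written, so only the second hook can safely read carbonindex files.

```java
import java.util.ArrayList;
import java.util.List;

// Toy listener mirroring the two hooks under discussion (hypothetical names,
// not the real CarbonData interface).
interface BlockEventListener {
    void onBlockEnd(String blockId);                      // data file written, index not yet
    void onBlockEndWithIndex(String blockId, String dir); // index file written too
}

public class HookOrderDemo {
    // Simulate the write flow and record the order in which the hooks fire.
    static List<String> run() {
        final List<String> events = new ArrayList<>();
        BlockEventListener listener = new BlockEventListener() {
            public void onBlockEnd(String blockId) {
                events.add("onBlockEnd:" + blockId);          // carbonindex unreadable here
            }
            public void onBlockEndWithIndex(String blockId, String dir) {
                events.add("onBlockEndWithIndex:" + blockId); // safe to read carbonindex now
            }
        };
        listener.onBlockEnd("block-0");                          // carbondata file flushed
        listener.onBlockEndWithIndex("block-0", "/store/seg_0"); // carbonindex flushed
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

A datamap whose build needs the carbonindex contents must hang off the second event; hooking only the first would read files that do not exist yet.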
@@ -31,7 +31,8 @@
 /**
  * It is called to load the data map to memory or to initialize it.
  */
-  void init(String filePath) throws MemoryException, IOException;
+  void init(String blockletIndexPath, String customIndexPath, String segmentId)
The filePath is supposed to be either the index folder name or the index file name, so I don't think this extra information is required here. Also, blockletIndexPath is not supposed to be passed, as the carbonindex already exists in the other datamap and we are supposed to use it.
For Min Max Index creation I am taking input from the regular carbonindex file too, e.g. for segment properties. So by design one parameter can be the primitive index path and the other can be the new custom index file path.
It should be independent of other indexes.
In this example, along with the min and max information I am keeping a few more pieces of information for building the blocklet. Both indexes are independent, but in the current example implementation I read the min/max index and then also read the carbonindex in order to get the column cardinality and segment properties. These values are used to form the blocklet used for pruning.
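The core of the min/max idea being discussed can be sketched as follows. This is a minimal single-column illustration, not the PR's actual code: each blocklet records a min and a max, and a filter value prunes every blocklet whose range cannot contain it. The real example additionally pulls segment properties and column cardinality from the regular carbonindex file, as described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal min/max pruning sketch for a single long-typed column.
public class MinMaxPruneDemo {
    static final class BlockletMinMax {
        final int id; final long min; final long max;
        BlockletMinMax(int id, long min, long max) {
            this.id = id; this.min = min; this.max = max;
        }
    }

    // Keep only blocklets whose [min, max] range may contain the filter value.
    static List<Integer> prune(List<BlockletMinMax> index, long value) {
        List<Integer> candidates = new ArrayList<>();
        for (BlockletMinMax b : index) {
            if (value >= b.min && value <= b.max) {
                candidates.add(b.id);
            }
        }
        return candidates;
    }

    static List<BlockletMinMax> sampleIndex() {
        return Arrays.asList(
            new BlockletMinMax(0, 1, 100),
            new BlockletMinMax(1, 101, 200),
            new BlockletMinMax(2, 50, 150));
    }

    public static void main(String[] args) {
        // Blocklets 1 and 2 can contain 120; blocklet 0 is pruned away.
        System.out.println(prune(sampleIndex(), 120));
    }
}
```

The surviving ids are what a min/max datamap would hand on; turning them into full Blocklet objects is where the carbonindex-derived metadata comes in.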
@Override
public void init(AbsoluteTableIdentifier identifier, String dataMapName) {
  this.identifier = identifier;
  cache = CacheProvider.getInstance()
What is the use of this cache when it is not used anywhere?
Removed.
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/171/
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/47/
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/801/
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/105/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/229/
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/860/
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/113/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/237/
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/868/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/238/
Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/114/
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/869/
 * @param blockletId
 * @return
 */
List<Blocklet> pruneBlockletFromBlockId(FilterResolverIntf filterExp, int blockletId);
What is blockletId? I don't think this method is required in the DataMap interface.
blockletId is the output of the Min Max DataMap, and the same is passed to BlockletDataMap in order to form the complete blocklet.
Instead of declaring pruneBlockletFromBlockId in the DataMap interface, the same can be made a local function.
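The two-stage handoff described above can be sketched roughly. Names and types here are hypothetical simplifications; the real interplay between the min/max datamap and BlockletDataMap is more involved. Stage one (min/max pruning) yields candidate blocklet ids; stage two resolves each id against a blocklet-level index to produce the full blocklet details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the two-stage pruning handoff: the min/max stage produces blocklet
// ids, and a second lookup (standing in for BlockletDataMap) turns each id
// into full blocklet details. All names are hypothetical.
public class TwoStagePruneDemo {
    // Stand-in for the detailed Blocklet the second stage returns.
    static final class BlockletDetail {
        final int id; final String path;
        BlockletDetail(int id, String path) { this.id = id; this.path = path; }
    }

    // Stage 2: resolve candidate ids against the blocklet-level index;
    // ids with no entry are silently dropped.
    static List<BlockletDetail> resolve(List<Integer> candidateIds,
                                        Map<Integer, String> blockletPaths) {
        List<BlockletDetail> result = new ArrayList<>();
        for (int id : candidateIds) {
            String path = blockletPaths.get(id);
            if (path != null) {
                result.add(new BlockletDetail(id, path));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> blockletPaths = new HashMap<>();
        blockletPaths.put(1, "part-0-1.carbondata");
        blockletPaths.put(2, "part-0-2.carbondata");
        // Ids 1 and 2 survived the min/max stage; id 7 has no entry and is dropped.
        for (BlockletDetail d : resolve(Arrays.asList(1, 2, 7), blockletPaths)) {
            System.out.println(d.id + " -> " + d.path);
        }
    }
}
```

Folding the id-to-blocklet step into a local helper, rather than exposing pruneBlockletFromBlockId on the public DataMap interface, is the simplification the reviewers are asking for.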
@sounakr can you make it simpler? Please add a datamap that can just return blocklet details with block + blockletId. Let's work on integration in another PR.
@sounakr I feel the same as Ravindra; let's make the example as simple as possible, so that developers can understand the concept of a datamap and its usage in a short time.
@ravipesala and @jackylk, sure, will make it simple. Will check if some more interfaces need to be opened.
Retest this please
Datamap Example. Implementation of Min Max Index through Datamap, and using the index while pruning. This closes #1359
Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1094/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2310/
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2545/
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1955/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3154/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3189/
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2286/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3523/
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3369/