[CARBONDATA-1480]Min Max Index Example for DataMap #1359

Closed
wants to merge 1 commit into from

sounakr
Contributor

@sounakr sounakr commented Sep 14, 2017

DataMap example: an implementation of a Min Max index through DataMap, and use of that index while pruning.
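
For readers new to the idea, min/max pruning can be sketched roughly as follows. This is only an illustration of the technique, not the code in this PR: `BlockletMinMax`, `MinMaxPruneSketch`, and `pruneEquals` are hypothetical names, and a single int column with an equality filter is assumed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-blocklet statistics for one int column; not a CarbonData class. */
final class BlockletMinMax {
  final String blockId;
  final int blockletId;
  final int min;
  final int max;

  BlockletMinMax(String blockId, int blockletId, int min, int max) {
    this.blockId = blockId;
    this.blockletId = blockletId;
    this.min = min;
    this.max = max;
  }
}

final class MinMaxPruneSketch {
  /** Keeps only the blocklets whose [min, max] range can contain the value. */
  static List<BlockletMinMax> pruneEquals(List<BlockletMinMax> index, int value) {
    List<BlockletMinMax> hits = new ArrayList<>();
    for (BlockletMinMax entry : index) {
      // A blocklet can be skipped when the value falls outside its range.
      if (value >= entry.min && value <= entry.max) {
        hits.add(entry);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    List<BlockletMinMax> index = new ArrayList<>();
    index.add(new BlockletMinMax("block-0", 0, 1, 100));
    index.add(new BlockletMinMax("block-0", 1, 101, 200));
    index.add(new BlockletMinMax("block-1", 0, 150, 300));
    // Prints the block/blocklet pairs that still need to be scanned.
    for (BlockletMinMax hit : pruneEquals(index, 160)) {
      System.out.println(hit.blockId + "/" + hit.blockletId);
    }
  }
}
```

With the sample data in `main`, a filter on the value 160 leaves `block-0/1` and `block-1/0` to scan and skips `block-0/0`.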

@sounakr sounakr changed the title [CARBONDATA-1480]Min Max DataMap [CARBONDATA-1480]Min Max Index Example for DataMap Sep 14, 2017
@QACarbonData

Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/32/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/153/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/782/

/**
* End of block notification when index got created.
*/
void onBlockEndWithIndex(String blockId, String directoryPath);
Contributor

Why is this method required? Why isn't onBlockEnd enough?

Contributor Author

onBlockEnd is called once the block is written. onBlockEndWithIndex is called later, once the index has also been written, after the carbondata file is written out.

Contributor

I did not understand what "index" means here. This datamap is supposed to be independent of other indexes. I think the onBlockEnd event is enough for writing the index file.

Contributor Author

But at onBlockEnd the carbonindex file is not yet written, so we won't be able to access it. In the example I am gathering information from the carbonindex files too.
It is better to keep a hook after the index files are written as well. In future we may need more hooks at different points.
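
The ordering being discussed in this thread can be sketched as below. It is a simulation under stated assumptions, not the writer code from this PR: `BlockWriteListener`, `RecordingListener`, `BlockWriterSketch`, and the single-argument `onBlockEnd` signature are hypothetical, and only the two hook names and the `onBlockEndWithIndex` parameters come from the discussion. No files are written.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical listener; only the two method names come from the discussion. */
interface BlockWriteListener {
  void onBlockEnd(String blockId);

  void onBlockEndWithIndex(String blockId, String directoryPath);
}

/** Records each callback so the ordering can be inspected. */
final class RecordingListener implements BlockWriteListener {
  final List<String> events = new ArrayList<>();

  @Override
  public void onBlockEnd(String blockId) {
    events.add("onBlockEnd:" + blockId);
  }

  @Override
  public void onBlockEndWithIndex(String blockId, String directoryPath) {
    events.add("onBlockEndWithIndex:" + blockId + "@" + directoryPath);
  }
}

final class BlockWriterSketch {
  /** Simulates writing one block; the file writes themselves are omitted. */
  static void writeBlock(String blockId, String directoryPath, BlockWriteListener listener) {
    // Step 1 (simulated): the data file for the block has been written.
    listener.onBlockEnd(blockId);
    // Step 2 (simulated): the index file has been written as well.
    listener.onBlockEndWithIndex(blockId, directoryPath);
  }

  public static void main(String[] args) {
    RecordingListener listener = new RecordingListener();
    writeBlock("block-0", "/tmp/segment_0", listener);
    for (String event : listener.events) {
      System.out.println(event);
    }
  }
}
```

In this sketch a datamap that needs the index file would do its work in the second callback, which is the author's argument for the extra hook. The reviewer's position is that a datamap independent of other indexes only needs the first.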

@@ -31,7 +31,8 @@
   /**
    * It is called to load the data map to memory or to initialize it.
    */
-  void init(String filePath) throws MemoryException, IOException;
+  void init(String blockletIndexPath, String customIndexPath, String segmentId)
Contributor

The filePath is supposed to be either the index folder name or the index file name, so I don't think this extra information is required here.
Also, blockletIndexPath is not supposed to be passed in: the carbonindex already exists in another datamap, and we are supposed to use that.

Contributor Author

For Min Max index creation I am also taking input, such as segment properties, from the regular carbonindex file. So by design one parameter can be the primitive index path and the other the new custom index file path.

Contributor

It should be independent of other indexes.

Contributor Author

In this example, along with the min and max information, I am keeping some additional information for building the blocklet. Both indexes are independent, but in the current example implementation I read the Min Max index and then also read the carbonindex in order to get the column cardinality and segment properties. These values are used to form the blocklet used for pruning.

@Override
public void init(AbsoluteTableIdentifier identifier, String dataMapName) {
this.identifier = identifier;
cache = CacheProvider.getInstance()
Contributor

What is the use of this cache when it is not used anywhere?

Contributor Author

Removed.

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/171/

@CarbonDataQA

Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/47/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/801/

@CarbonDataQA

Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/105/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/229/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/860/

@CarbonDataQA

Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/113/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/237/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/868/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/238/

@CarbonDataQA

Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/114/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/869/

* @param blockletId
* @return
*/
List<Blocklet> pruneBlockletFromBlockId(FilterResolverIntf filterExp, int blockletId);
Contributor

What is blockletId? I don't think this method is required in the datamap.

Contributor Author

BlockletId is the output of Min Max DataMap and the same is passed to BlockletDataMap in order to form the complete blocklet.
Instead of declaring the method pruneBlockletFromBlockId in DataMap, the same can be made a local function to blockletId.
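
The hand-off described in this reply, where one datamap emits blocklet ids and a second lookup turns them into full blocklet details, can be sketched roughly as follows. Everything here is hypothetical: `BlockletResolverSketch`, the `blockId#blockletId` key format, and the string "details" are stand-ins chosen for illustration and are not CarbonData APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BlockletResolverSketch {
  /** Hypothetical key format combining a block id and a blocklet id. */
  static String key(String blockId, int blockletId) {
    return blockId + "#" + blockletId;
  }

  /** Looks up details for each pruned blocklet id; ids without an entry are skipped. */
  static List<String> resolve(
      Map<String, String> detailsByKey, String blockId, List<Integer> blockletIds) {
    List<String> resolved = new ArrayList<>();
    for (int blockletId : blockletIds) {
      String detail = detailsByKey.get(key(blockId, blockletId));
      if (detail != null) {
        resolved.add(detail);
      }
    }
    return resolved;
  }

  public static void main(String[] args) {
    // Stand-in for what a second datamap would know about each blocklet.
    Map<String, String> details = new HashMap<>();
    details.put(key("block-0", 0), "rows 0-999");
    details.put(key("block-0", 1), "rows 1000-1999");

    // Stand-in for the blocklet ids produced by min/max pruning.
    List<Integer> pruned = new ArrayList<>();
    pruned.add(1);

    System.out.println(resolve(details, "block-0", pruned));
  }
}
```

Keeping the resolution step outside the pruning interface, as in this sketch, is one way to read the suggestion that the method need not be declared on DataMap itself.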

@ravipesala
Contributor

ravipesala commented Sep 20, 2017

@sounakr can you make it simpler? Please add a datamap that just returns blocklet details with block + blockletId. Let's work on the integration in another PR.

@jackylk
Contributor

jackylk commented Sep 21, 2017

@sounakr I feel the same as Ravindra. Let's make the example as simple as possible, so that developers can understand the concept of a datamap and its usage in a short time.

@sounakr
Contributor Author

sounakr commented Sep 21, 2017

@ravipesala and @jackylk, sure, I will make it simple. I will check whether some more interfaces need to be opened.

@ravipesala
Contributor

@sounakr Please add example based on the PR #1376 .

@sounakr
Contributor Author

sounakr commented Sep 28, 2017

Retest this please

asfgit pushed a commit that referenced this pull request Dec 24, 2017
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes #1359
@CarbonDataQA

Build Failed with Spark 2.2.0, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1094/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2310/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/2545/

ravipesala pushed a commit to ravipesala/incubator-carbondata that referenced this pull request Jan 23, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes apache#1359
@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1955/

@ravipesala
Contributor

SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3154/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3189/

ravipesala pushed a commit to ravipesala/incubator-carbondata that referenced this pull request Feb 5, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes apache#1359
@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/2286/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/3523/

@ravipesala
Contributor

SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/3369/

asfgit pushed a commit that referenced this pull request Feb 8, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes #1359
asfgit pushed a commit that referenced this pull request Feb 9, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes #1359
asfgit pushed a commit that referenced this pull request Feb 9, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes #1359
jackylk pushed a commit to jackylk/incubator-carbondata that referenced this pull request Feb 27, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes apache#1359
jackylk pushed a commit to jackylk/incubator-carbondata that referenced this pull request Feb 27, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes apache#1359
@asfgit asfgit closed this Feb 27, 2018
jackylk pushed a commit to jackylk/incubator-carbondata that referenced this pull request Feb 28, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes apache#1359
jackylk pushed a commit to jackylk/incubator-carbondata that referenced this pull request Mar 2, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes apache#1359
sounakr added a commit to sounakr/incubator-carbondata that referenced this pull request Mar 2, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes apache#1359
asfgit pushed a commit that referenced this pull request Mar 4, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes #1359
asfgit pushed a commit that referenced this pull request Mar 4, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes #1359
sounakr added a commit to sounakr/incubator-carbondata that referenced this pull request Mar 5, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes apache#1359
ravipesala pushed a commit to ravipesala/incubator-carbondata that referenced this pull request Mar 8, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes apache#1359
asfgit pushed a commit that referenced this pull request Mar 8, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes #1359
anubhav100 pushed a commit to anubhav100/incubator-carbondata that referenced this pull request Jun 22, 2018
Datamap Example. Implementation of Min Max Index through Datamap. And Using the Index while prunning.

This closes apache#1359
6 participants