YCSB presplit strategy for HBase #142

Closed
aiyoh79 opened this Issue Sep 25, 2013 · 1 comment

aiyoh79 commented Sep 25, 2013

Hi,

Do you presplit the table when loading data? I tried using RegionSplitter with UniformSplit (first row = user000, last row = user999) to create 100 regions, but I noticed the load is not balanced across the regions.

There is a JIRA issue that mentions a ycsbsplit algorithm. Is this already implemented, or do we have to write our own implementation?

https://issues.apache.org/jira/browse/HBASE-4163

Any good suggestions for this?

aiyoh79

Collaborator

busbey commented May 17, 2015

You can use the HBase shell, as described in that issue, to presplit. For example, with 200 splits:

create 'usertable', 'family', {SPLITS => (1..200).map {|i| "user#{1000+i*(9999-1000)/200}"}, MAX_FILESIZE => 4*1024**3}
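
To see which split keys that expression produces, the same arithmetic can be run in plain Ruby outside the shell. This is only a sketch: the helper name `ycsb_split_keys` and its parameters are made up for illustration, and it assumes the 1000..9999 prefix range used in the command above.

```ruby
# Hypothetical helper (not part of HBase or YCSB): reproduces the split-key
# arithmetic from the shell command above so the keys can be inspected.
def ycsb_split_keys(n_splits, lo = 1000, hi = 9999)
  # Integer division spaces the numeric prefixes evenly between lo and hi.
  (1..n_splits).map { |i| "user#{lo + i * (hi - lo) / n_splits}" }
end

keys = ycsb_split_keys(200)
puts keys.first(3).inspect  # first few split points
puts keys.last              # final split point
puts keys.length + 1        # n split keys give n + 1 regions
```

With 200 splits this gives keys running from user1044 to user9999, so the table starts with 201 regions. Every key has a four-digit prefix, so byte-wise order matches numeric order.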

busbey closed this May 17, 2015
