ycsb presplit strategy for hbase #142

aiyoh79 opened this Issue Sep 25, 2013 · 1 comment



aiyoh79 commented Sep 25, 2013


Do you guys do any presplitting when loading data? I tried using RegionSplitter with UniformSplit (first row = user000, last row = user999) to create 100 regions, but I noticed that the load is not balanced across the regions.

There is a Jira issue that mentions using a ycsbsplit algorithm, but is that already implemented, or do we have to write our own implementation?


Any good suggestions for this?



busbey commented May 17, 2015

You can use the HBase shell to presplit, as described in that issue. For example, with 200 splits:

create 'usertable', 'family', {SPLITS => (1..200).map {|i| "user#{1000+i*(9999-1000)/200}"}, MAX_FILESIZE => 4*1024**3}
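To see which split keys that expression produces, the same arithmetic can be run in plain Ruby outside the shell. This is only a sketch: `split_keys` is a hypothetical helper name, and it assumes the row keys are `user` followed by decimal digits and are compared lexicographically.

```ruby
# Reproduce the split-key arithmetic from the shell command in plain Ruby.
# split_keys is a hypothetical helper, not part of HBase or YCSB.
def split_keys(count, low = 1000, high = 9999)
  # Integer division spreads `count` split points evenly over low..high.
  (1..count).map { |i| "user#{low + i * (high - low) / count}" }
end

keys = split_keys(200)
puts keys.first   # first split point
puts keys.last    # last split point
puts keys.length  # number of split points
```

Note that 200 split points give 201 regions, since HBase adds one region below the first key and one at or above the last. `MAX_FILESIZE => 4*1024**3` sets the region size limit to 4 GiB.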

busbey closed this May 17, 2015
