bucketing strategy #1553
An issue that we discussed in the past, but I am not sure we ever prototyped:
I'd like to try writing Parquet files from the soon-to-be-ready ADAM dataset API, bucketed by 10-megabase genomic regions using
I'll go ahead and experiment with this, but wanted to get anyone else's thoughts on whether this seems viable or worthwhile. I'm hoping to achieve the effect of a very coarse index.
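
To make the idea concrete, here is a minimal sketch of the bucket arithmetic in plain Python. It is not ADAM or Spark code: the function names (`region_bucket`, `bucket_records`, `buckets_for_query`), the `(contig, start)` record shape, and the 0-based half-open coordinates are all assumptions made for illustration.

```python
from collections import defaultdict

BUCKET_SIZE = 10_000_000  # 10 megabases, as proposed above


def region_bucket(position, bucket_size=BUCKET_SIZE):
    """Return the bucket number for a 0-based genomic position (hypothetical helper)."""
    if position < 0:
        raise ValueError("position must be non-negative")
    return position // bucket_size


def bucket_records(records, bucket_size=BUCKET_SIZE):
    """Group (contig, start) tuples by (contig, bucket number).

    Each key would correspond to one bucket of output files in the
    bucketed layout; here the groups are simply kept in memory.
    """
    buckets = defaultdict(list)
    for contig, start in records:
        buckets[(contig, region_bucket(start, bucket_size))].append((contig, start))
    return dict(buckets)


def buckets_for_query(start, end, bucket_size=BUCKET_SIZE):
    """Return bucket numbers overlapping the half-open range [start, end).

    This is the coarse-index lookup: only these buckets need to be read.
    """
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    first = start // bucket_size
    last = (end - 1) // bucket_size
    return list(range(first, last + 1))
```

With this layout, a query against a region on one contig only has to touch the buckets returned by `buckets_for_query` for that contig, which is the coarse-index effect I'm after. Whether that pruning pays off in practice depends on how the reader skips files, so treat this as a sketch of the arithmetic only.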