implement batch iterators #243
Apr 12, 2018
The difference is huge:
The numbers above are with a 1024-element buffer; the best buffer size still needs to be investigated.
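For readers unfamiliar with the idea, here is a minimal sketch of batch iteration with a caller-supplied buffer. It runs over a plain `long[]` bitset, not over the real containers, and `BitsetBatchCursor`, `nextBatch` and `sumOfPositions` are hypothetical names made up for this illustration rather than the API in this pull request. The point is only that the buffer size is a parameter of the consumer, so it is cheap to vary in a benchmark.

```java
final class BatchIterationSketch {

    /** Hypothetical batch cursor over a plain long[] bitset (illustration only). */
    static final class BitsetBatchCursor {
        private final long[] words;
        private int wordIndex = 0;
        private long current;

        BitsetBatchCursor(long[] words) {
            this.words = words;
            this.current = words.length > 0 ? words[0] : 0L;
        }

        /** Writes the next set positions into buffer and returns how many were written. */
        int nextBatch(int[] buffer) {
            int n = 0;
            while (n < buffer.length) {
                // Skip exhausted words; stop when the bitset is used up.
                while (current == 0L) {
                    if (++wordIndex >= words.length) {
                        return n;
                    }
                    current = words[wordIndex];
                }
                buffer[n++] = (wordIndex << 6) + Long.numberOfTrailingZeros(current);
                current &= current - 1; // clear the lowest set bit
            }
            return n;
        }
    }

    /** Consumes the cursor in batches of bufferSize and sums the set positions. */
    static long sumOfPositions(long[] words, int bufferSize) {
        BitsetBatchCursor cursor = new BitsetBatchCursor(words);
        int[] buffer = new int[bufferSize];
        long sum = 0;
        int n;
        while ((n = cursor.nextBatch(buffer)) > 0) {
            for (int i = 0; i < n; i++) {
                sum += buffer[i];
            }
        }
        return sum;
    }
}
```

The result does not depend on the buffer size, only the number of `nextBatch` calls does, which is what a buffer-size benchmark would be measuring.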
Apr 13, 2018
Yes, we should do this for a typical OLAP use case, such as computing the sum of a column of numeric values masked by a bitmap. Even though this iteration style is faster than the status quo, it leads to unpredictable access patterns into such a column of numbers, which will be difficult for JIT compilers to optimise. When iterating over runs, it should be possible to get good vectorised code for the sum.
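To make the contrast concrete, here is a rough sketch of the two access patterns for a masked sum. Everything in it is an assumption for illustration: the column is a plain `long[]`, the batch is an `int[]` of positions with a count, and runs are encoded as flat `(start, length)` pairs. None of these names or encodings come from the library.

```java
final class MaskedSumSketch {

    /**
     * Gather style: positions come from a batch buffer, so each read of
     * column[positions[i]] lands at a data-dependent index.
     */
    static long sumByPositions(long[] column, int[] positions, int count) {
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += column[positions[i]];
        }
        return sum;
    }

    /**
     * Run style: runs holds (start, length) pairs. The inner loop walks a
     * contiguous slice of the column, which is the shape a JIT compiler
     * has the best chance of vectorising.
     */
    static long sumByRuns(long[] column, int[] runs) {
        long sum = 0;
        for (int r = 0; r < runs.length; r += 2) {
            int start = runs[r];
            int end = start + runs[r + 1];
            for (int i = start; i < end; i++) {
                sum += column[i];
            }
        }
        return sum;
    }
}
```

Both methods compute the same sum when the positions and the runs describe the same set. Whether the run-style loop is actually vectorised depends on the JVM and would need to be checked in a benchmark.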
This is up to 10x faster than the standard iterator.
Sure. The batch iterator is always at least twice as fast as the standard iterator, and that holds across a range of bitmap sizes, proportions of run, array and bitmap containers, and buffer sizes. Buffers of 128-256 elements seem to do the best job. Depending on the contents and size of the bitmap, the batch iterator can be 10x faster. For example, a best case and worst case: