BSLab (including lzp-rollhash, mtf_shelwien and mtf_cuda) and Radix-sort benchmarks

@Bulat-Ziganshin Bulat-Ziganshin released this 18 Jun 08:18

BSL (the block-sorting lab) benchmark:

  • Many experiments with CUDA MTF implementations. Depending on the data, the best one is mtf_scalar, mtf_2buffers or mtf_4by8 (see results.txt).
  • CUDA MTF reaches a raw speed of 700 MB/s on ENWIK8 data, i.e. an effective speed of 1.5-2 GB/s, since the preceding RLE stage removes 60-70% of the BWT output.
  • CPU MTF algorithm by Eugene Shelwien: 150-200 MB/s raw speed, i.e. 500 MB/s effective speed per core.
  • A new rolling-hash-based LZP preprocessing algorithm, running at up to 500 MB/s per core.
  • An almost complete LZP+BWT/ST+RLE+MTF stack (only entropy coding is not yet implemented), which makes it possible to measure the speed and compression ratio of various stage combinations.
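
For readers unfamiliar with the MTF stage and with the raw versus effective speed figures above, here is a minimal C++ sketch. It is a naive textbook move-to-front transform, not any of the mtf_scalar, mtf_2buffers, mtf_4by8 or mtf_shelwien implementations from this release. The names `mtf_encode`, `mtf_decode` and `effective_speed` are hypothetical and chosen only for illustration. The speed helper assumes that "effective speed" means raw speed divided by the fraction of data that survives the preceding RLE stage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Naive move-to-front: each byte is replaced by its current rank in a
// recency list, then that byte is moved to the front of the list.
std::vector<std::uint8_t> mtf_encode(const std::vector<std::uint8_t>& input) {
    std::uint8_t order[256];
    std::iota(order, order + 256, 0);  // recency list starts as 0..255

    std::vector<std::uint8_t> output;
    output.reserve(input.size());
    for (std::uint8_t symbol : input) {
        std::size_t rank = 0;
        while (order[rank] != symbol) ++rank;  // linear search for the rank
        output.push_back(static_cast<std::uint8_t>(rank));
        // Shift the more recent symbols down by one and put this one first.
        for (std::size_t i = rank; i > 0; --i) order[i] = order[i - 1];
        order[0] = symbol;
    }
    return output;
}

// Inverse transform: look up the symbol at each rank, then apply the same
// list update as the encoder.
std::vector<std::uint8_t> mtf_decode(const std::vector<std::uint8_t>& ranks) {
    std::uint8_t order[256];
    std::iota(order, order + 256, 0);

    std::vector<std::uint8_t> output;
    output.reserve(ranks.size());
    for (std::uint8_t rank : ranks) {
        const std::uint8_t symbol = order[rank];
        output.push_back(symbol);
        for (std::size_t i = rank; i > 0; --i) order[i] = order[i - 1];
        order[0] = symbol;
    }
    return output;
}

// Assumed definition: speed measured against the data entering the
// preceding stage, when that stage removes `removed_fraction` of it.
double effective_speed(double raw_speed, double removed_fraction) {
    assert(removed_fraction >= 0.0 && removed_fraction < 1.0);
    return raw_speed / (1.0 - removed_fraction);
}
```

Under that assumption, `effective_speed(700.0, 0.6)` gives about 1750 MB/s, which is in line with the 1.5-2 GB/s range quoted for the CUDA MTF.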

Radix-sort benchmark: measures speed of the CUB radix sort with various parameters.
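
As background for the kind of algorithm this benchmark exercises, here is a minimal CPU sketch of a least-significant-digit radix sort in C++. It is not the CUB implementation and does not use the CUB API. The function name `radix_sort_u32` and the `digit_bits` parameter are hypothetical, chosen only to show one kind of parameter that such a sort can be tuned by.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// LSD radix sort of 32-bit unsigned keys, processing `digit_bits` bits per
// pass with a stable counting sort on each digit.
void radix_sort_u32(std::vector<std::uint32_t>& keys, unsigned digit_bits = 8) {
    assert(digit_bits >= 1 && digit_bits <= 16);
    const std::size_t buckets = std::size_t{1} << digit_bits;
    const std::uint32_t mask = static_cast<std::uint32_t>(buckets - 1);

    std::vector<std::uint32_t> buffer(keys.size());
    std::vector<std::size_t> offsets(buckets + 1);

    for (unsigned shift = 0; shift < 32; shift += digit_bits) {
        // Histogram of the current digit, stored one slot to the right.
        offsets.assign(buckets + 1, 0);
        for (std::uint32_t key : keys) ++offsets[((key >> shift) & mask) + 1];

        // Prefix sum: offsets[d] becomes the first output slot for digit d.
        for (std::size_t d = 0; d < buckets; ++d) offsets[d + 1] += offsets[d];

        // Stable scatter into the buffer, then make it the current array.
        for (std::uint32_t key : keys) buffer[offsets[(key >> shift) & mask]++] = key;
        keys.swap(buffer);
    }
}
```

A smaller `digit_bits` means more passes over the data with smaller histograms, and a larger one means fewer passes with larger histograms. That is the sort of trade-off a parameter sweep can expose.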

All GPU speeds are measured on a GF560Ti overclocked to 900 MHz. All CPU speeds are measured on a Haswell i7-4770.