Support segmentation and non-segmentation in more decompression kernels #13

eyalroz · 2017-05-01T20:32:06Z

(copied from Issue #163 in the kernel testbench)

At the moment, most of our decompressors can be used with segment anchors or without them - but not both:

Scheme	Segmented	Unsegmented
BITMAP	N/A	🗹
DELTA	🗹	🗷
DICT	🗷	🗹
FOR	🗹	🗷
MODEL	🗷	🗹
NS	N/A	🗹
NSV	🗹	🗷
RPE	🗹	🗷
RLE	🗹	🗷

First, segmented execution is important even when it is the only option offered, so MODEL and DICT should definitely have it. Beyond that, it would be nice to support, at least for the sake of completeness, unsegmented versions of the schemes that are currently segmented-only: especially DELTA, for benchmarking purposes, and RPE, for cases where the overall support of the column is so small that segmentation is mostly a hassle.
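To make the distinction concrete, here is a minimal host-side sketch of the two variants for DELTA. It is plain C++ rather than an actual kernel, and everything in it is an assumption made for illustration: the function names, the fixed `segment_length`, and the convention that each segment's anchor holds the value preceding the segment's first element. The real kernels may define anchors differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Unsegmented (hypothetical): a single baseline value; every output element
// depends on all deltas before it, i.e. a prefix sum over the whole column.
std::vector<int32_t> decompress_delta_unsegmented(
    const std::vector<int32_t>& deltas, int32_t baseline)
{
    std::vector<int32_t> out(deltas.size());
    int32_t running = baseline;
    for (std::size_t i = 0; i < deltas.size(); ++i) {
        running += deltas[i];
        out[i] = running;
    }
    return out;
}

// Segmented (hypothetical): one anchor per segment. Assumed convention:
// anchors[s] is the value preceding segment s's first element, so each
// segment can be decompressed independently of all the others.
std::vector<int32_t> decompress_delta_segmented(
    const std::vector<int32_t>& deltas,
    const std::vector<int32_t>& anchors,
    std::size_t segment_length)
{
    assert(segment_length > 0);
    std::vector<int32_t> out(deltas.size());
    for (std::size_t seg = 0; seg < anchors.size(); ++seg) {
        const std::size_t begin = seg * segment_length;
        const std::size_t end = std::min(begin + segment_length, deltas.size());
        int32_t running = anchors[seg];  // restart from this segment's anchor
        for (std::size_t i = begin; i < end; ++i) {
            running += deltas[i];
            out[i] = running;
        }
    }
    return out;
}
```

In the segmented form the outer loop carries no dependency between iterations, which is what would let each segment map to its own unit of parallel work. The unsegmented form needs no anchor array, but it serializes (or requires a scan over) the whole column.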

eyalroz · 2017-05-02T10:32:19Z

For the DICT scheme, we'll need to choose between uniformity and flexibility.

In the uniform extreme of the spectrum, we'll have:

- Fixed size dictionaries
- Fixed element size per dictionary
- An actual new dictionary copied in for every segment of the compressed data (even if it's very similar to the previous segment's dictionary)

And in the flexible extreme (or close to it) we'll have:

- A variable-length, and variable-width, array of dictionary entry data
- For each segment, a dictionary descriptor:
  - An indication of where the dictionary begins in the variable-length dictionary data
  - The dictionary's length (number of entries)
  - (Possibly) The dictionary index size in bytes or in bits; this could theoretically be deduced from the dictionary's length - but that depends too much, perhaps, on the decompressing software's capabilities

  ... and note that a segment might simply refer to the same dictionary as its predecessor; or we might even allow it to expand its predecessor's dictionary by starting at the same place and extending further.

I'm leaning toward the more flexible extreme.
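A rough host-side sketch of what the flexible layout could look like, under simplifying assumptions: entries are fixed-width `int32_t` values (variable-width entries are left out), the optional index-size field is omitted, and segments have a fixed length. The type and field names (`dictionary_descriptor`, `segmented_dict_column`, `offset`, `length`) are hypothetical and not taken from the existing code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-segment descriptor into the shared dictionary entry array.
struct dictionary_descriptor {
    std::size_t offset;  // where this segment's dictionary begins in `entries`
    std::size_t length;  // number of entries in this segment's dictionary
};

// Hypothetical container for a DICT-compressed column with per-segment dictionaries.
struct segmented_dict_column {
    std::vector<int32_t>               entries;         // all dictionaries' entry data, back to back
    std::vector<dictionary_descriptor> descriptors;     // one descriptor per segment
    std::vector<uint32_t>              indices;         // per-element index into the segment's dictionary
    std::size_t                        segment_length;  // elements per segment (assumed fixed)
};

// Reference decompression: look each index up in its own segment's dictionary.
std::vector<int32_t> decompress(const segmented_dict_column& col)
{
    assert(col.segment_length > 0);
    std::vector<int32_t> out(col.indices.size());
    for (std::size_t i = 0; i < col.indices.size(); ++i) {
        const dictionary_descriptor& d = col.descriptors[i / col.segment_length];
        assert(col.indices[i] < d.length);  // index must fall inside this segment's dictionary
        out[i] = col.entries[d.offset + col.indices[i]];
    }
    return out;
}
```

With this shape, a segment reuses its predecessor's dictionary by carrying the same `offset` and `length`, and extends it by keeping the `offset` and using a larger `length`; no entry data is copied in either case.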

eyalroz mentioned this issue May 2, 2017

Support changing dictionaries for the Dictionary compression scheme #7

Closed

eyalroz self-assigned this May 2, 2017

eyalroz added the enhancement label May 2, 2017
