
Feature request - Column Dictionary Compression #2684

Open
goranschwarz opened this issue Jun 3, 2020 · 1 comment

@goranschwarz

Feature request -- Column Dictionary Compression

The idea behind Dictionary Compression:

  • The "same" kind of idea as what the JVM may do with Strings (the literal pool, or interning)
  • Store each distinct column value only once (this technique is used by some column-store databases)
    • So when many rows share the same column content, the value is stored just once!
    • On update: old values may become "orphaned"; these could be removed during a "shutdown compress/defrag"
  • For "log" tables (nearly 100% inserts), which probably contain many rows with the same column content, this would be a tremendous space saver
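The idea above can be sketched in a few lines of Python. This is only an illustration of dictionary encoding as a concept, not H2's actual storage format; the class and method names are made up for the example.

```python
# Minimal sketch of column dictionary encoding (illustrative only;
# not H2's actual storage format).
class DictEncodedColumn:
    """Stores each distinct value once; rows hold small integer codes."""

    def __init__(self):
        self.dictionary = []        # code -> value (each value stored once)
        self.codes_by_value = {}    # value -> code (for fast lookup on insert)
        self.rows = []              # per-row codes instead of full values

    def append(self, value):
        code = self.codes_by_value.get(value)
        if code is None:
            code = len(self.dictionary)
            self.dictionary.append(value)
            self.codes_by_value[value] = code
        self.rows.append(code)

    def get(self, row):
        return self.dictionary[self.rows[row]]


# A "log" table scenario: many rows repeat the same message text.
col = DictEncodedColumn()
for i in range(1000):
    col.append("connection timed out" if i % 2 else "user logged in")

print(len(col.rows))        # 1000 rows
print(len(col.dictionary))  # only 2 distinct values actually stored
```

With 1000 rows but only 2 distinct strings, the column stores 1000 small integers plus 2 strings instead of 1000 strings, which is where the space saving comes from.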

At the column level, set a compression option; for example (the syntax could probably be improved):

create table t1 (
    id         int                               not null,
    c1         varchar(30)                       not null,
    c2         clob            compress=dict         null,
    c3         varchar(4000)   compress=dict         null
)

or possibly

create table t1 (
    id         int             not null,
    c1         varchar(30)     not null,
    c2         clob                null,
    c3         varchar(4000)       null
)
with dictionary_compression on (c2, c3)

Is this a good idea?

@katzyn
Contributor

katzyn commented Jun 4, 2020

It creates a lot of complexity and reduces performance in most of the cases where people use general-purpose database systems to store their data.

If you need simple, compact storage for character data with some search capabilities, you can use something like Apache Lucene instead.

But if you need a database, I suggest taking a fresh look at your schema and normalizing it instead. Better database design is superior to any storage optimization the DBMS itself can make.
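The normalization katzyn suggests can be sketched with Python's built-in sqlite3 module. The schema here (a `message` lookup table referenced from `log`) is an invented example, not taken from the thread: it achieves by hand, at the schema level, the same "store each distinct value once" effect the feature request asks the engine for.

```python
import sqlite3

# Sketch of schema normalization (illustrative schema): instead of
# repeating the same long message text in every log row, move it to a
# lookup table and reference it by id.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE message (id INTEGER PRIMARY KEY, text TEXT UNIQUE NOT NULL);
    CREATE TABLE log (id INTEGER PRIMARY KEY,
                      message_id INTEGER REFERENCES message(id));
""")

def log_event(text):
    # Store each distinct message once; reuse its id for repeat events.
    con.execute("INSERT OR IGNORE INTO message (text) VALUES (?)", (text,))
    (msg_id,) = con.execute(
        "SELECT id FROM message WHERE text = ?", (text,)).fetchone()
    con.execute("INSERT INTO log (message_id) VALUES (?)", (msg_id,))

for i in range(1000):
    log_event("connection timed out" if i % 2 else "user logged in")

(rows,) = con.execute("SELECT COUNT(*) FROM log").fetchone()
(distinct,) = con.execute("SELECT COUNT(*) FROM message").fetchone()
print(rows, distinct)  # 1000 log rows, 2 distinct messages stored
```

The trade-off is an extra join on read, which is the performance cost katzyn alludes to when arguing against pushing this into the engine.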
