
MDEV-11371 - Big column compressed(innodb) #261

Closed
wants to merge 3 commits

Conversation

GCSAdmin

Some big columns (blob/text/varchar/varbinary) waste a lot of space, so we introduce a "compressed" attribute into the column definition when creating or altering a table.
When a column is defined as compressed, the column data is compressed using zlib (other compression algorithms are not supported yet).
Compared with the compressed row format, we get a better compression ratio, better performance, and more flexibility.
For example:

create table tcompress (
  c1 int,
  c2 blob compressed,
  c3 text compressed,
  c4 text
) engine = innodb;

We implement this big-column compression in the following steps:

  1. Support the 'compressed' syntax, and save this attribute in the .frm file.
  2. Store the 'compressed' attribute in the InnoDB layer; we add a DATA_IS_COMPRESSED flag in prtype.
  3. If needed, compress in row_mysql_store_col_in_innobase_format and decompress in row_sel_store_mysql_field.
  4. Use a compress header to control how to compress/decompress.
    The compress header is 1 byte:
    Bit 7: always 1, meaning compressed;
    Bits 5-6: compression algorithm; always 0, meaning zlib (other compression algorithms may be supported in the future);
    Bits 0-3: number of bytes of "Record Original Length".
    Record Original Length: 1-4 bytes.
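
To make the header layout concrete, here is a minimal C sketch of encoding this prefix (the helper names are hypothetical, not the actual patch code, and little-endian storage of the length is an assumption):

    #include <stddef.h>
    #include <stdint.h>

    /* Sketch only.  Bit 7: always 1 (compressed); bits 5-6: algorithm
       (0 = zlib); bits 0-3: byte count of "Record Original Length". */
    static uint8_t make_compress_header(uint8_t algorithm, uint8_t len_bytes)
    {
        return (uint8_t)(0x80 | ((algorithm & 0x3) << 5) | (len_bytes & 0xF));
    }

    /* Writes the 1-byte header plus the original length (assumed
       little-endian here); returns the prefix size in bytes. */
    static size_t write_compress_prefix(uint8_t *dst, uint32_t original_len)
    {
        uint8_t len_bytes = original_len <= 0xFFu     ? 1 :
                            original_len <= 0xFFFFu   ? 2 :
                            original_len <= 0xFFFFFFu ? 3 : 4;
        dst[0] = make_compress_header(0 /* zlib */, len_bytes);
        for (uint8_t i = 0; i < len_bytes; i++)
            dst[1 + i] = (uint8_t)(original_len >> (8 * i));
        return (size_t)(1 + len_bytes);
    }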

We add a system global variable 'field_compress_min_len': a column value is compressed only if its data length exceeds 'field_compress_min_len'. The default is 128.
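
A rough illustration of that threshold check, as a sketch under assumptions (maybe_compress_col is a hypothetical name, not the patch's function; it reuses write_compress_prefix from the sketch above):

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>
    #include <zlib.h>

    /* Sketch: store raw if not above the threshold, otherwise emit
       header + original length + zlib-deflated body. */
    uint8_t *maybe_compress_col(const uint8_t *data, uint32_t len,
                                uint32_t field_compress_min_len,
                                size_t *out_len)
    {
        if (len <= field_compress_min_len) {      /* below threshold: raw copy */
            uint8_t *raw = malloc(len);
            if (raw) { memcpy(raw, data, len); *out_len = len; }
            return raw;
        }
        uLongf body_len = compressBound(len);
        uint8_t *buf = malloc(5 + body_len);      /* prefix is at most 5 bytes */
        if (!buf) return NULL;
        size_t prefix = write_compress_prefix(buf, len);
        if (compress(buf + prefix, &body_len, data, len) != Z_OK) {
            free(buf);
            return NULL;
        }
        *out_len = prefix + body_len;
        return buf;
    }
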
We also add 3 new errors:
ER_FIELD_TYPE_NOT_ALLOWED_AS_COMPRESSED_FIELD: only text/blob/varchar/varbinary columns may have the compressed attribute.
ER_FIELD_CAN_NOT_COMPRESSED_AND_INDEX: a column with the compressed attribute cannot be part of an index.
ER_FIELD_CAN_NOT_COMPRESSED_IN_CURRENT_ENGINESS: column compression is supported in InnoDB only.

create table t1(c1 blob compressed, c2 varchar(1000) compressed); means
that columns c1 and c2 of table t1 have the compress property and their
data is stored compressed. The compress property is transparent to the user.

fix format

set default_storage_engine = @default_storage_engine_old;

drop table t1,t2,t3;
Contributor

Please add more test cases covering the CHAR/MEDIUMBLOB/MEDIUMTEXT/LONGBLOB/LONGTEXT types.
Also, could you run a performance test and share the results?

Author

I will add more test cases.

@svoj

svoj commented Nov 29, 2016

Hi GCSAdmin,

Thanks for your contribution. A JIRA task has been created to track this pull request: https://jira.mariadb.org/browse/MDEV-11371

This task was added to 10.2.4 backlog, which is planned to be handled between 2016-12-15 and 2016-12-22.

Thanks,
Sergey

@svoj svoj changed the title Big column compressed(innodb) MDEV-11371 - Big column compressed(innodb) Nov 29, 2016
@svoj svoj self-assigned this Nov 29, 2016
@laurynas-biveinis
Contributor

Maybe you'd find something interesting in percona/percona-server@35d5d3f, test cases perhaps, although the feature grammar is incompatible.

@svoj

svoj commented Nov 29, 2016

@GCSAdmin, before we continue, could you explain what benefits column compression gives over two alternatives: InnoDB native compression and COMPRESS()/UNCOMPRESS()?

Are you using indexes over compressed columns? Is it at all supported in this patch?

@GCSAdmin
Author

GCSAdmin commented Nov 29, 2016

@svoj @plinux
In principle, column compression is equivalent to COMPRESS()/UNCOMPRESS(). The benefit of column compression is that it is transparent to the application layer: the application gets the benefits without requiring any changes.

Before we chose column compression, we compared page compression and column compression (using COMPRESS()/UNCOMPRESS()). The test results follow:
Storage testing (the data comes from a real game DB):
original data: 51G (data size)
page-compressed data: 24G
column-compressed data: 7.3G

Performance testing (the data comes from the real game DB): [results were attached as an image in the original comment]

As column compression gives a better compression ratio, better performance, and more flexibility, we chose it.

We do not currently support indexes over compressed columns.

@vuvova
Member

vuvova commented Nov 29, 2016

Also, why is it better than what Percona has (commit link above)? Percona's implementation is more complex, but it also allows much better compression with their external dictionary feature.

Why is it better than a special data type (we might have user-defined types in 10.3) that is "like a blob, but compresses everything before storage"?

And, anyway, this feature cannot possibly go into 10.2, it's too late for that. It can go into 10.3.

@janlindstrom
Contributor

janlindstrom commented Nov 29, 2016

Thank you for your contribution; however, I would implement compressed columns in the upper layer, not inside InnoDB. Benefits would include:

  • Reduced payload provided to InnoDB (and possibly other storage engines supporting this)
  • Reduced payload coming out of InnoDB
  • This would also work for replication, i.e. reduced payload on the binary log, as the column would be compressed in the binary log
  • Do we really need to always uncompress these columns into the InnoDB buffer pool?

@willhan123
Contributor

willhan123 commented Nov 29, 2016

@janlindstrom

You are right, but many users may expect to solve their problems in the database layer.
We can provide an optional storage-layer solution with column compression.

Do we really need to always uncompress these columns into the InnoDB buffer pool?
We only uncompress when needed, which gives more efficient use of memory.

@janlindstrom
Contributor

By upper layer I meant still inside the MariaDB server, i.e. an implementation somewhere in the sql directory.

@felixliang

@laurynas-biveinis thank you for your suggestion; we are reading Percona's test cases, and they may help.

@svoj

svoj commented Nov 29, 2016

Also note column compression in AliSQL: https://jira.mariadb.org/browse/MDEV-11381

when alter table adds a compressed column, inplace alter is not supported
@felixliang

@vuvova hi vuvova, it is better to implement big-column compression in the MariaDB server or InnoDB layer, because we can then do some optimization of data export and import using mysqldump.

For example, when the big column is compressed, we can add grammar to keep the data compressed while exporting it out of MySQL and to keep it compressed while importing it into MySQL.

To achieve this, when we use mysqldump to export, we use …; in this case, data that is compressed does not need to be uncompressed, and the generated backup SQL file, whose format is …, means that when we use the backup SQL file to restore, the compressed data need not be recompressed. This implementation means a lot to us.

@felixliang

@janlindstrom hi janlindstrom, the big-column compression feature can surely be implemented in the upper layer, i.e. the MariaDB server; we chose to implement it in the InnoDB layer because we think this implementation is simple enough to finish.

@svoj svoj added this to the 10.2 milestone Mar 1, 2017
@svoj

svoj commented Apr 21, 2017

@felixliang one of our developers wonders if you also considered a trigger (compress) + view (uncompress) solution. Why didn't it work for you? I assume because you want simpler syntax.

@svoj

svoj commented Apr 22, 2017

@felixliang, @plinux in your implementations you store the compression algorithm for every row (also the wrap flag in AliSQL).

Do you really need this information in every row, or is a per-table setting acceptable?

@laurynas-biveinis
Contributor

@svoj, FWIW in our implementation we decided to go with per-row algorithm info so that in the future, if new algorithms are implemented, an ALTER TABLE of an existing compressed table to a new algorithm could be a metadata-only operation, with rows rewritten in the new algorithm as they are updated.

@svoj

svoj commented Apr 22, 2017

@laurynas-biveinis thanks!

@felixliang

@felixliang, @plinux in your implementations you store the compression algorithm for every row (also the wrap flag in AliSQL).

Do you really need this information in every row, or is a per-table setting acceptable?

@svoj
hi svoj, we store the compression algorithm for every row (it supports at most 4 compression algorithms), so that when we do an "alter table" operation to change the column's compression algorithm, we don't need to copy data; we only need to change metadata.

but right now TMySQL doesn't yet support changing a column's compression algorithm instantly.

@felixliang

@felixliang one of our developers wonders if you also considered a trigger (compress) + view (uncompress) solution. Why didn't it work for you? I assume because you want simpler syntax.

@svoj
hi svoj, in our opinion the solution of using a trigger (compress) + view (uncompress) to solve the compression problem is not a good idea. It brings very complicated jobs to the DBA, and the trigger will add load to the DB servers.

@svoj

svoj commented May 8, 2017

@felixliang thanks for your answers. I'm almost done porting this to the SQL layer (mostly to class Field). Will send you an email with details soon.

@felixliang

@svoj

so you will pick up the Tencent Game DBA Team's implementation for MariaDB 10.3, right?

at the latest meetup, Monty said you would evaluate our implementation and AliSQL's, so I don't know which one you will choose?

@svoj

svoj commented May 8, 2017

@felixliang I evaluated the Tencent, Alibaba and Percona code bases. Unfortunately we can't take any implementation as is, because we want a storage engine independent solution.

To keep things simple we won't take the compression dictionary from Percona for this first implementation. It will be possible to add it later though.

Our first implementation will cover all Tencent and Alibaba requirements, except for Alibaba heap alloc (which can be added easily later anyway).

Syntax-wise we will be compatible with the Tencent patch, but we had to rename system variables.

.frm is not compatible with any implementation.

Same for data: generally we store the same information, but we reserve 4 bits for compression algorithm and we don't store compressed flag. In our implementation compression_algorithm == 0 means uncompressed.

@svoj

svoj commented May 8, 2017

One nice benefit of implementing this at SQL layer is that we can avoid data recompression in many cases when we need to copy data, like:

  • ALTER TABLE ALGORITHM=copy
  • CREATE TABLE ... SELECT
  • INSERT ... SELECT

@vinchen
Contributor

vinchen commented May 9, 2017

Hi, @svoj

We hope that the new implementation of blob compression in MariaDB can be binary compatible with what Tencent's patch does.

In our Tencent implementation, the compressed header has one bit for the compressed flag, 2 bits for the compression algorithm, and 4 bits for the byte count of "Record Original Length":
Bit 7: always 1, meaning compressed;
Bits 5-6: compression algorithm; always 0, meaning zlib (other compression algorithms may be supported in the future);
Bits 0-3: number of bytes of "Record Original Length".

Same for data: generally we store the same information, but we reserve 4 bits for compression algorithm and we don't store compressed flag. In our implementation compression_algorithm == 0 means uncompressed.

And does it mean that the higher 4 bits are the compression algorithm and the lower 4 bits are the byte count of "Record Original Length"?

If so, we think it can be binary compatible.

And zlib's algorithm should be 0x08, and it should be the default algorithm.
The first header byte should be:

Header Byte = (0x08 << 4) | bytes_of_original_length
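
For instance (illustration only, assuming the original length fits in one byte), the header of a zlib-compressed value would be:

    /* Needs <stdint.h>.  zlib (0x08) in the high 4 bits, one length byte. */
    uint8_t header = (uint8_t)((0x08 << 4) | 1);   /* == 0x81 */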

What do you think?

@svoj

svoj commented May 9, 2017

@vinchen, I understand your wish to make it binary compatible.

Our header format is as follows:

Generic compressed header format (1 byte):

Bits 1-4: algorithm specific bits
Bits 5-8: compression algorithm

If the compression algorithm is 0, then the header is immediately followed by
uncompressed data.

If the compression algorithm is zlib:

Bits 1-2: N, where N + 1 bytes are occupied by the original data length
Bit 3: unused
Bit 4: true if zlib wrapper is present
Bits 5-8: store 1 (zlib)

The difference is: in your implementation you reserve 4 bits for the byte count of the original data length, while in our implementation we reserve only 2 bits. We also store bytes_of_original_length - 1.
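
As a minimal sketch of that byte, assuming bit 1 is the least significant bit (mariadb_zlib_header is a hypothetical name, not the actual patch code):

    #include <stdint.h>

    /* Sketch only.  Bits 5-8: algorithm (1 = zlib); bit 4: zlib wrapper
       flag; bits 1-2: bytes_of_original_length - 1 (bit 3 unused). */
    static uint8_t mariadb_zlib_header(uint8_t len_bytes, int zlib_wrapper)
    {
        return (uint8_t)((1u << 4)                    /* algorithm = zlib */
                       | (zlib_wrapper ? 0x08u : 0u)  /* wrapper present */
                       | ((len_bytes - 1u) & 0x3u));  /* length byte count - 1 */
    }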

In theory making your implementation binary compatible with ours is not that complex, but we'll have to discuss it with Monty. Adding support for Alibaba and Percona headers is a lot more complex.

@svoj

svoj commented May 9, 2017

@felixliang, @GCSAdmin, @vinchen, @plinux patch is in bb-10.3-svoj: 733ddb9

Note that there are still a bunch of edge cases not covered (many are explained in the revision comment).
Please consider this patch a prototype for now: behaviour and storage formats may change.

Your feedback will be greatly appreciated.

@HugeFelix

HugeFelix commented May 11, 2017

hi @svoj

Maybe a storage-engine-independent solution is better, because it can support any storage engine.

But it is very interesting that the Tencent, Alibaba and Percona column compression implementations are very similar: all do it in the InnoDB layer.

And with an implementation in the InnoDB layer, we can also avoid data recompression in the following cases:

  1. ALTER TABLE ALGORITHM=copy
  2. CREATE TABLE ... SELECT
  3. INSERT ... SELECT

For cases 2 and 3 above, we may need to use a HINT to avoid data recompression.

Another thing: when we back up data logically, we use a HINT in the SELECT syntax to avoid data recompression.

@svoj

svoj commented May 12, 2017

@HugeFelix are there any reasons to keep it in InnoDB? The only reason I got so far is simplicity.

MariaDB compared to MySQL has a lot more storage engines available. Thus we have to care about all available storage engines equally.

@HugeFelix

Yes. Simplicity is important for us. And we really understand and support a storage-engine-independent solution in MariaDB.

@svoj

svoj commented May 13, 2017

@HugeFelix Nice, thanks! It was agreed to change row storage format to be compatible with Tencent. This is generic enough and doesn't cost us much effort.

The metadata (the value stored in unireg_check) hasn't been decided yet, but I guess we should come up with some nice solution there too.

@svoj
Copy link

svoj commented Aug 31, 2017

Pushed fdc4779

@svoj svoj closed this Aug 31, 2017