Implemented "Column Compression with optional Predefined Dictionary",
Blueprint:
https://blueprints.launchpad.net/percona-server/+spec/compressed-columns

Weixiang Zhai patch
https://groups.google.com/forum/#!topic/percona-discussion/NUqSFht1x0c
was taken as a base.

MySQL syntax extended with 'CREATE COMPRESSION_DICTIONARY <dic>('<data>')' /
'DROP COMPRESSION_DICTIONARY <dic>' commands.

Added new InnoDB information_schema pseudo-table 'xtradb_zip_dict'.

Added persistent storage for compression dictionaries based on
InnoDB internal table. The new table 'SYS_ZIP_DICT' has the
following definition:
CREATE TABLE SYS_ZIP_DICT(
  ID INT UNSIGNED NOT NULL,
  NAME CHAR(64) NOT NULL,
  DATA BLOB NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX SYS_ZIP_DICT_ID
  ON SYS_ZIP_DICT (ID);
CREATE UNIQUE INDEX SYS_ZIP_DICT_NAME
  ON SYS_ZIP_DICT (NAME);

"handler" class ("handler.h") extended with "create_zip_dict" method.
Added its implementation for the InnoDB engine "innobase_create_zip_dict"
("ha_innodb.cc").

Implemented automatic generation of compression dictionary ids.
Enforced a uniqueness constraint on compression dictionary names.

Extended SQL parser with support for
"... COLUMN_FORMAT COMPRESSED WITH COMPRESSION_DICTIONARY <dict>".

Added new InnoDB system table 'SYS_ZIP_DICT_COLS' which is
supposed to store information about compression dictionaries
used for particular table columns.
CREATE TABLE SYS_ZIP_DICT_COLS(
  TABLE_ID INT UNSIGNED NOT NULL,
  COLUMN_POS INT UNSIGNED NOT NULL,
  DICT_ID INT UNSIGNED NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX SYS_ZIP_DICT_COLS_COMPOSITE
  ON SYS_ZIP_DICT_COLS (TABLE_ID, COLUMN_POS);

Added "information_schema.xtradb_zip_dict_cols" read-only view
for this table.

Implemented "dict_create_add_zip_dict_reference()" function which
registers compression dictionary <-> compressed table column link
in 'SYS_ZIP_DICT_COLS' InnoDB system table.

"dict_create_add_zip_dict_reference()" function integrated
with
"CREATE TABLE ... (... COLUMN_FORMAT COMPRESSED WITH COMPRESSION_DICTIONARY <dict> ...)".

Added proper error handling for referencing non-existing
compression dictionaries.

"ha_innobase::open()" handler extended with additional logic
which populates compression dictionary data for table fields.

Implemented two new low level functions for fetching data from
'SYS_ZIP_DICT' / 'SYS_ZIP_DICT_COLS' InnoDB system tables:
- "dict_create_get_zip_dict_id_by_reference()"
- "dict_create_get_zip_dict_data_by_id()"
Added two transaction wrappers for them:
- "dict_get_dictionary_id_by_key()"
- "dict_get_dictionary_data_by_id()"

"compress_column.test" from the original patch renamed to
"xtradb_compressed_columns" and moved to "innodb" test suite.

Added new MTR test case "xtradb_compressed_columns_with_dictionaries"
into "innodb" test suite which checks basic compression dictionary DDL
statements along with "information_schema.xtradb_zip_dict%" extensions.

Added support for large (32K) dictionaries in "SYS_ZIP_DICT" system
table.

Added proper error handling for "CREATE COMPRESSION_DICTIONARY <dict>(...)"
statement when the length of the "<dict>" identifier exceeds 64 characters:
"Compression dictionary name '<dict>' is too long (max length = 64)".

Added proper error handling for
"CREATE COMPRESSION_DICTIONARY <dict>('<data>')" statement when the length
of the "<data>" constant exceeds 32506 bytes:
"Data for compression dictionary '<dict>' is too long (max length = 32506)".
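
The two length checks above can be illustrated with a minimal sketch. Only the
limits (64 and 32506) and the message texts are taken from this commit message;
the names "check_zip_dict_args", "zip_dict_arg_error" and "DictArgCheck" are
hypothetical and are not the server's actual functions or error codes.

```cpp
#include <cstddef>
#include <string>

// Limits quoted in the commit message.
constexpr std::size_t kMaxDictNameLength = 64;
constexpr std::size_t kMaxDictDataLength = 32506;

// Hypothetical result type; the server reports these cases via its own errors.
enum class DictArgCheck { Ok, NameTooLong, DataTooLong };

// Simplification: std::string::size() counts bytes, so the name check matches
// the 64-character limit only for single-byte names.
inline DictArgCheck check_zip_dict_args(const std::string& name,
                                        const std::string& data) {
  if (name.size() > kMaxDictNameLength) return DictArgCheck::NameTooLong;
  if (data.size() > kMaxDictDataLength) return DictArgCheck::DataTooLong;
  return DictArgCheck::Ok;
}

// Builds the message text quoted above for a failed check.
inline std::string zip_dict_arg_error(DictArgCheck result,
                                      const std::string& name) {
  switch (result) {
    case DictArgCheck::NameTooLong:
      return "Compression dictionary name '" + name +
             "' is too long (max length = " +
             std::to_string(kMaxDictNameLength) + ")";
    case DictArgCheck::DataTooLong:
      return "Data for compression dictionary '" + name +
             "' is too long (max length = " +
             std::to_string(kMaxDictDataLength) + ")";
    case DictArgCheck::Ok:
      break;
  }
  return std::string();  // No error text for a successful check.
}
```

In this sketch the name is checked before the data, so a statement that
violates both limits reports the name error; whether the server uses the same
order is an assumption.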

"innodb.xtradb_compressed_columns_with_dictionaries" extended with
additional checks for compression dictionary max name/data lengths.

Extended allowed SQL syntax for "CREATE COMPRESSION_DICTIONARY" statement.
It is now possible to pass both text literals ('<data>') and
variable references (@<var>) as parameters for this statement.

"innodb.xtradb_compressed_columns_with_dictionaries" MTR test file
extended with additional checks for variable parameters of different
types.

Implemented support for "DROP COMPRESSION_DICTIONARY <dict>" statement.
It is now possible to remove compression dictionaries if they are
not in use.

"handler" class extended with "drop_zip_dict" method which is supposed
to remove compression dictionaries. Added its InnoDB implementation
"innobase_drop_zip_dict()".

Added proper error handling for "DROP COMPRESSION_DICTIONARY <dict>"
when "<dict>" does not exist:
"Compression dictionary <dict> does not exist".

Added proper error handling for "DROP COMPRESSION_DICTIONARY <dict>"
for the case when there are existing tables with compressed column(s)
referencing "<dict>": "Compression dictionary <dict> is in use".

Enforced dictionary data checks in 'row_decompress_column()'.
When 'row_decompress_column()' detects that it needs compression
dictionary data and no such data is provided, the server now asserts
rather than silently ignoring the problem.

"DROP TABLE" handler extended with additional cleanup for
'SYS_ZIP_DICT_COLS' InnoDB system table.

Implemented "dict_create_remove_zip_dict_references_for_table()"
function which deletes all compression dictionary references
from the 'SYS_ZIP_DICT_COLS' for the given table ID.
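
The bookkeeping described so far (automatic ids, unique names, "does not
exist" / "is in use" errors on drop, and reference cleanup on "DROP TABLE") can
be modelled in memory as follows. This is a sketch under stated assumptions,
not the InnoDB implementation: "ZipDictCatalog", "DictStatus" and all member
names are hypothetical, and two std::map containers stand in for the
'SYS_ZIP_DICT' and 'SYS_ZIP_DICT_COLS' system tables.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical status codes standing in for the server's error handling.
enum class DictStatus { Ok, AlreadyExists, DoesNotExist, InUse };

class ZipDictCatalog {
 public:
  // Models CREATE COMPRESSION_DICTIONARY: unique name, auto-generated id.
  DictStatus create(const std::string& name, const std::string& data) {
    if (id_by_name_.count(name) != 0) return DictStatus::AlreadyExists;
    const std::uint32_t id = next_id_++;
    id_by_name_[name] = id;
    data_by_id_[id] = data;
    return DictStatus::Ok;
  }

  // Models a SYS_ZIP_DICT_COLS row: (TABLE_ID, COLUMN_POS) -> DICT_ID.
  DictStatus add_reference(std::uint32_t table_id, std::uint32_t column_pos,
                           const std::string& dict_name) {
    const auto it = id_by_name_.find(dict_name);
    if (it == id_by_name_.end()) return DictStatus::DoesNotExist;
    dict_by_column_[std::make_pair(table_id, column_pos)] = it->second;
    return DictStatus::Ok;
  }

  // Models DROP COMPRESSION_DICTIONARY: refused while any column uses it.
  DictStatus drop(const std::string& name) {
    const auto it = id_by_name_.find(name);
    if (it == id_by_name_.end()) return DictStatus::DoesNotExist;
    for (const auto& ref : dict_by_column_) {
      if (ref.second == it->second) return DictStatus::InUse;
    }
    data_by_id_.erase(it->second);
    id_by_name_.erase(it);
    return DictStatus::Ok;
  }

  // Models the DROP TABLE cleanup: delete every reference of one table.
  void remove_references_for_table(std::uint32_t table_id) {
    for (auto it = dict_by_column_.begin(); it != dict_by_column_.end();) {
      if (it->first.first == table_id) {
        it = dict_by_column_.erase(it);
      } else {
        ++it;
      }
    }
  }

 private:
  std::uint32_t next_id_ = 1;  // Assumption: ids simply count up from 1.
  std::map<std::string, std::uint32_t> id_by_name_;
  std::map<std::uint32_t, std::string> data_by_id_;
  std::map<std::pair<std::uint32_t, std::uint32_t>, std::uint32_t>
      dict_by_column_;
};
```

With this model, dropping a dictionary that a table column references returns
"InUse" until "remove_references_for_table()" has been called for that table,
which mirrors the order of operations the commit describes for "DROP TABLE".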

"xtradb_compressed_columns_with_dictionaries" MTR test case extended
with additional checks for proper cleanup in
"information_schema.xtradb_zip_dict_cols" after "DROP TABLE xxx" for tables
with references to compression dictionaries.
In addition, checks for "DROP DATABASE xxx", which implicitly performs
"DROP TABLE xxx", were also added.

MTR framework extended with checks for remaining compression dictionaries.
We now perform
"SELECT * FROM information_schema.xtradb_zip_dict ORDER BY name;" inside
"mtr.check_testcase()" stored procedure to make sure that all compression
dictionaries created during the test are properly cleaned.

"ALTER TABLE" handler extended to support dictionary-based compressed columns.

Field definition comparison logic for BLOB ("Field_blob" class) and TEXT fields
("Field_varstring" class) extended to take into account differences in
associated compression dictionaries.

"mysql_alter_table()" (similar to "ha_create_table()") now also updates
"zip_dict"-related info in table fields definition so that it would propagate
into InnoDB handler, where required compression dictionary references in
'SYS_ZIP_DICT_COLS' would be created.

"CREATE TABLE <new> LIKE <existing>" statement now also takes into account
associated compression dictionaries in the fields of the "<existing>" table
and replicates them in "<new>".
Moreover, for "CREATE TABLE <new> AS SELECT * FROM <existing>" statements,
compression dictionary associations in the "<new>" table will also be created.

Added "innodb.xtradb_compressed_columns_alter_table" MTR test case which
performs checks according to the test plan below.

For all 3 types of the ALGORITHM parameter (DEFAULT, INPLACE and COPY) check
the following:
1. Add a column with associated compression dictionary to an existing table.
2. Remove a column with associated compression dictionary from an existing
   table.
3. Add a new column to an existing table so that a column with associated
   compression dictionary would get shifted.
4. Remove a column from an existing table so that a column with associated
   compression dictionary would get shifted.
6. Explicitly move a column with associated compression dictionary to a new
   position.
   ALTER TABLE t CHANGE a a BLOB COLUMN_FORMAT COMPRESSED
     WITH COMPRESSION_DICTIONARY dict AFTER b
   Check both moving forward and backward.
7. Rename table
   ALTER TABLE t RENAME TO tt
8. Change column format:
   a. uncompressed                -> compressed
   b. uncompressed                -> compressed with dictionary
      (including non-existing)
   c. compressed                  -> uncompressed
   d. compressed                  -> compressed with dictionary
      (including non-existing)
   e. compressed with dictionary  -> uncompressed
   f. compressed with dictionary  -> compressed
   g. compressed with dictionary1 -> compressed with dictionary2
      (including non-existing)

"SHOW CREATE TABLE" handler now takes into account compression dictionaries
associated with fields.

"innodb.xtradb_compressed_columns_with_dictionaries" MTR test case extended with
checks for proper "SHOW CREATE TABLE" output.

"innodb.xtradb_compressed_columns_with_dictionaries" MTR test case extended with
checks that "CREATE TABLE" statement referencing a non-existing compression
dictionary must generate "ER_COMPRESSION_DICTIONARY_DOES_NOT_EXIST" error.

"innodb.xtradb_compressed_columns_alter_table" MTR test case also extended with
additional checks for referencing non-existing compression dictionaries in
various "ALTER TABLE" statements using all 3 values for ALGORITHM parameter
(DEFAULT, COPY and INPLACE).

"innodb.xtradb_compressed_columns_with_dictionaries" MTR test case extended with
additional checks for "CREATE TABLE ... LIKE ..." and
"CREATE TABLE ... AS SELECT * FROM ..." statements involving tables with
compression dictionaries.

We use native MySQL "TABLE_SHARE" mechanism to populate
"zip_dict_name" / "zip_dict_data" in "open_binary_frm()" via handler call to
"handler::update_field_defs_with_zip_dict_info()". These members are cleaned
in "TABLE_SHARE::destroy()".

Introduced new "handler::update_field_defs_with_zip_dict_info()" virtual
method which is overloaded in "ha_innobase" class and which is supposed to
fill table field definitions ("Field" class instances) with compression
dictionaries info.

Introduced new helper method "Field::has_associated_compression_dictionary()".

Minimal supported version in "SHOW CREATE TABLE" hint for compressed columns
set to "/*!50632".

Specifying "COLUMN_FORMAT COMPRESSED" attribute for an unsupported type
generates an error 'ER_UNSUPPORTED_COMPRESSED_COLUMN_TYPE'
"Can not define column '%-.192s' in compressed format"

A compressed column cannot be used as a part of a key.
'ER_COMPRESSED_COLUMN_USED_AS_KEY' error is triggered on attempt to do so.

Created new MTR test case 'innodb.xtradb_compressed_columns_ibd_sizes' which
shows that it is possible to achieve a better compression ratio using
compression dictionaries.

Added two new server variables:
- compressed_columns_zip_level -
  "Compression level used for compressed columns.  0 is no compression,"
  "1 is fastest and 9 is best compression. Default is 6."
- compressed_columns_threshold -
  "Compress column data if its length exceeds this value. Default is 96"
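
Read together, the two variables suggest a decision rule along the following
lines. This is a hedged sketch: the defaults (6 and 96) and the "exceeds"
wording come from the descriptions above, while the struct, the function name
and the treatment of level 0 as "skip compression entirely" are assumptions.

```cpp
#include <cstddef>

// Hypothetical holder for the two server variables, with the quoted defaults.
struct CompressedColumnSettings {
  unsigned zip_level = 6;      // 0 is no compression, 1 fastest, 9 best.
  std::size_t threshold = 96;  // Compress only data longer than this.
};

// Returns true when a column value of the given length would be compressed.
inline bool should_compress(std::size_t data_length,
                            const CompressedColumnSettings& settings) {
  // Assumption: level 0 means the value is stored uncompressed.
  if (settings.zip_level == 0) return false;
  // "Exceeds" is read as strictly greater than the threshold.
  return data_length > settings.threshold;
}
```

Under the defaults, a 96-byte value stays uncompressed and a 97-byte value is
compressed.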

Created a new 'innodb.xtradb_compressed_columns_sp' MTR test case which checks
if compression dictionary-related statements can be used inside stored
procedures and prepared statements.

The following statements are checked:
- 'CREATE COMPRESSION_DICTIONARY'
- 'DROP COMPRESSION_DICTIONARY'
- 'CREATE TABLE' (with a reference to a compression dictionary)
- 'DROP TABLE'  (with a reference to a compression dictionary)

Created a new 'innodb.xtradb_compressed_columns_read_only' MTR test case which
checks if data can be read from a table with compressed columns when server
is started with '--innodb-read-only'.

'innodb.xtradb_compressed_columns_sp' MTR test case extended with checks
for DML statements inside stored procedures and prepared statements.
We now check 'INSERT', 'DELETE' and cursor-based 'SELECT' in stored
procedures / functions.
We also check 'INSERT', 'DELETE' and 'SELECT' in parameterized
prepared statements.

Created a new 'innodb.xtradb_compressed_columns_unsupported_se' MTR test case
which checks for attempts to use compressed columns features on unsupported
storage engines.

'CREATE COMPRESSION_DICTIONARY' /  'DROP COMPRESSION_DICTIONARY' now return
'ER_ILLEGAL_HA_CREATE_OPTION' when these statements are executed with
unsupported default storage engine.

'check_engine()' static function in 'sql_table.cc' extended with additional
checks for compressed columns - if at least one field definition has
'compressed' attribute, then the handlerton must support compressed columns.
This helps to intercept attempts to create tables with compressed columns
using non-InnoDB engine and to add column compression attributes via
"ALTER TABLE" statements in existing tables created using non-InnoDB storage
engine.

Compressed columns MTR test cases extended with additional checks for changing
Storage Engine on tables with compressed columns.
'innodb.xtradb_compressed_columns_alter_table' now also checks for various
combinations of 'ALTER TABLE <table> ENGINE=<other_engine>'.
'innodb.xtradb_compressed_columns_with_dictionaries' now also checks for
various combinations of 'CREATE TABLE <new_table> LIKE <old_table>' and
'CREATE TABLE <new_table> AS SELECT ... FROM <old_table>' with
implicitly/explicitly specified storage engine.

Disabled "ALTER TABLE ... DISCARD/IMPORT TABLESPACE" statements for tables
with compressed columns.
"ER_ILLEGAL_HA" is returned for such operations.
"innodb.xtradb_compressed_columns_alter_table" MTR test case extended with
additional checks for transportable tablespace operations.

"TRUNCATE TABLE ..." handler extended with updating compression dictionary
references in 'SYS_ZIP_DICT_COLS' after changing table ID.
Corresponding checks added to the
"innodb.xtradb_compressed_columns_with_dictionaries" MTR test case.

Added new helper method "has_compressed_columns()" to the "TABLE" class.

It is now possible to determine if a column is compressed -
check if 'prtype' field in 'information_schema.innodb_sys_columns' has
bit 14 set (prtype & 16384 != 0).
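
For client code that reads this field, the same test can be written as a small
helper. A minimal sketch: only the bit position (14, i.e. 16384) comes from the
text above; the constant and function names are hypothetical.

```cpp
#include <cstdint>

// Bit 14 of the column type field marks a compressed column.
constexpr std::uint32_t kCompressedColumnFlag = 1u << 14;
static_assert(kCompressedColumnFlag == 16384u, "bit 14 equals 16384");

// True when the compressed-column bit is set in the given type value.
constexpr bool is_compressed_column(std::uint32_t prtype) {
  // In C++ the parentheses around the mask are required, because != binds
  // tighter than &.
  return (prtype & kCompressedColumnFlag) != 0;
}
```

Other bits in the value do not affect the result, so the helper can be applied
to the field as read.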

Added 2 new 'sys_vars' MTR test cases for
'innodb_compressed_columns_zip_level' and
'innodb_compressed_columns_threshold'.

Added a new MTR test case 'rpl.rpl_xtradb_compressed_columns' which checks
if compressed columns extensions (including compression dictionaries) work
properly in different replication modes.
percona-ysorokin committed Oct 12, 2016
1 parent 90cac65 commit 35d5d3f
Showing 110 changed files with 9,852 additions and 130 deletions.
3 changes: 3 additions & 0 deletions client/client_priv.h
@@ -108,6 +108,9 @@ enum options_client
   OPT_LOCK_FOR_BACKUP,
   OPT_CONNECTION_SERVER_ID,
   OPT_SSL_MODE,
+  OPT_ENABLE_COMPRESSED_COLUMNS,
+  OPT_ENABLE_COMPRESSED_COLUMNS_WITH_DICTIONARIES,
+  OPT_DROP_COMPRESSION_DICTIONARY,
   OPT_MAX_CLIENT_OPTION
 };
