Support index-only scans for collations other than _bin #28
Comments
Comment by yoshinorim
I like space optimization for cs collation, that is:
Comment by spetrunia
Exploring how to make a bi-directional mapping

latin1_swedish_ci, latin1_general_ci (and other collations) have these properties:

- in latin1_swedish_ci: 114 characters have weight conflicts. They form 31 weight groups; the group size varies between two members (19 groups) and ten members (4 groups).
- in latin1_general_ci: 56 characters have weight conflicts. They form 28 groups of two members each.
- latin1_general_cs has no weight conflicts (we can just enable index_only for it).

Constructing restore_data

value <=> (mem_comparable_form, restore_data).

Let's take one character of the "value". If its weight is shared with a set of characters $WEIGHT_GROUP, we can statically assign a number to each member of the set. That number can be stored in restore_data.
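The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy collation (ASCII letters whose weight is the uppercase letter) stands in for the real latin1 weight tables, and the names `build_tables`, `pack`, and `unpack` are hypothetical, not MyRocks functions.

```python
import string
from collections import defaultdict

# Toy collation (assumption): each ASCII letter's weight is its uppercase form,
# so 'a' and 'A' share one weight group of size two.
TOY_WEIGHTS = {ch: ch.upper() for ch in string.ascii_letters}

def build_tables(weight_of):
    """Group characters by weight and statically number each group member."""
    groups = defaultdict(list)
    for ch in sorted(weight_of):
        groups[weight_of[ch]].append(ch)
    index_in_group = {ch: groups[w].index(ch) for ch, w in weight_of.items()}
    return groups, index_in_group

def pack(value, weight_of, index_in_group):
    """value -> (mem_comparable_form, restore_data)."""
    key = "".join(weight_of[ch] for ch in value)
    restore_data = [index_in_group[ch] for ch in value]
    return key, restore_data

def unpack(key, restore_data, groups):
    """(mem_comparable_form, restore_data) -> value."""
    return "".join(groups[w][i] for w, i in zip(key, restore_data))

GROUPS, INDEX = build_tables(TOY_WEIGHTS)
```

With these tables, `pack("Foo", TOY_WEIGHTS, INDEX)` and `pack("FOO", TOY_WEIGHTS, INDEX)` produce the same key but different `restore_data`, and `unpack` recovers each original spelling. For groups of two members, one bit per character would be enough to store the number.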
Other 1-byte charsets

Some 1-byte charsets like latin1_german2_ci may map a single character into two bytes (most characters have a 1-byte mem_comparable_form, but some have a 2-byte one). It looks like our approach could be extended to handle those, too.
Comment by spetrunia
Unicode

// our charsets guru is currently not available, but we've had a discussion about this before and I've taken another look now.

Most important

utf8_general_ci

Sorts about 64K characters; non-trivial sorting is provided for 2816 characters. Basically, it extends the 1-byte charsets approach by using multiple "pages". Pages with non-trivial case conversions can be handled in the same way as was proposed for 1-byte charsets.

utf8_unicode_ci

This is a more complex collation, as it does things like 'ß' = 'ss' for German, etc.
Summary: This changes the PK storage format so that it can contain unpack_info. Currently, the storage format looks like this:

| null bits | records for non-packable or non-PK columns | checksum |

I am changing the format so that it looks like this:

| null bits | unpack_info | records for non-packable or non-PK columns | checksum |

The `unpack_info` field looks exactly the same as the one in secondary keys. It is a one-byte tag, followed by two bytes of length information, followed by the actual unpack information. Note that this field may be missing if none of the key columns have a make unpack function set.

The main change that needed to be done to support this is to have the `m_make_unpack_info_func` function be available in both `Rdb_field_packing` for `Rdb_key_def::pack_record` and `Rdb_field_encoder` for `ha_rocksdb::convert_record_to_storage_format`.

Note that this change isn't active since `PRIMARY_FORMAT_VERSION_UNPACK_INFO` is not the latest.

Test Plan:
Manually change latest to `PRIMARY_FORMAT_VERSION_UNPACK_INFO` and run: mysqltest.sh --testset=RocksDB
Create database with old version. Recompile code with new version set, and see if it starts up and runs correctly.

Reviewers: hermanlee4, spetrunia, jkedgar
Reviewed By: jkedgar
Subscribers: vasilep, webscalesql-eng
Differential Revision: https://reviews.facebook.net/D58503
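As a rough illustration of the layout described in this summary, the sketch below frames `unpack_info` as a one-byte tag, a two-byte length, and a payload, placed between the null bits and the column records. Several details are assumptions that the summary does not state: the tag value `0x02`, big-endian byte order, and the length counting only the payload. The helper names are hypothetical and are not the MyRocks functions.

```python
import struct

UNPACK_TAG = 0x02              # hypothetical tag value, not taken from the source
HEADER = struct.Struct(">BH")  # assumption: big-endian; length covers the payload only

def encode_value(null_bits, unpack_payload, columns, checksum):
    """Lay out: | null bits | unpack_info | column records | checksum |."""
    unpack_info = b""
    if unpack_payload is not None:  # the field may be missing entirely
        unpack_info = HEADER.pack(UNPACK_TAG, len(unpack_payload)) + unpack_payload
    return null_bits + unpack_info + columns + checksum

def read_unpack_info(buf, offset):
    """Return (payload, offset of the first byte after unpack_info)."""
    tag, length = HEADER.unpack_from(buf, offset)
    if tag != UNPACK_TAG:
        raise ValueError("unexpected unpack_info tag")
    start = offset + HEADER.size
    return buf[start:start + length], start + length
```

How a reader decides whether the field is present at all is not covered by the summary, so this sketch leaves that decision to the caller.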
The same summary also appears under Differential Revision: https://reviews.facebook.net/D59469
Issue by spetrunia
Friday Jan 30, 2015 at 12:54 GMT
Originally opened as MySQLOnRocksDB#25
Currently, index-only scans are supported for
For other collations (e.g. case-insensitive _ci collations), index-only scans are not supported. The reason is that it is not possible to restore the original column value from its mem-comparable key. For example, in latin1_general_ci, 'foo', 'Foo', and 'FOO' all have the mem-comparable form 'FOO'.
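A tiny sketch of why the key alone is not enough: the function below uses `str.upper()` as a stand-in for a case-insensitive collation's mem-comparable transform. That stand-in is an assumption for illustration only; the real latin1_general_ci weights come from collation tables, and `mem_comparable_ci` is a hypothetical name.

```python
def mem_comparable_ci(value: str) -> str:
    """Toy mem-comparable form for a case-insensitive collation (assumption)."""
    return value.upper()

originals = ["foo", "Foo", "FOO"]
keys = {mem_comparable_ci(v) for v in originals}

# All three spellings collapse to one key, so the key cannot tell them apart.
collapsed = len(keys) == 1
```

Because the transform is many-to-one, a scan that reads only the index key has no way to know which of the three spellings was stored.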
A possible solution could work like this:
See also:
Diffs:
https://reviews.facebook.net/D58269
https://reviews.facebook.net/D58503
https://reviews.facebook.net/D58875