Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Clone in Desktop Download ZIP

Loading…

Support index-only scans for collations other than _bin #25

Closed
spetrunia opened this Issue · 3 comments

3 participants

@spetrunia
Collaborator

Currently, index-only scans are supported for

  • numeric columns
  • varchar columns with BINARY, latin1_bin, utf8_bin collation.

for other collations (eg. case-insensitive, _ci collations), index-only scans are not supported. The reason for this is that is not possible to restore the original column value mem-comparable key. For example, in latin_general_ci both 'foo', 'Foo', and 'FOO' have mem-comparable form 'FOO'.

A possible solution could work like this:
1. In addition to value->mem_comparable_form function, develop value->(mem_comparable_form, restore_data) function. This is easy for some charsets.
2. store restore_data in RocksDB's value part of the key-value pair. We already used to store information about VARCHAR field length there, but now the value part is unused.

See also:

@yoshinorim
Owner

I like space optimization for cs collation, that is:

  • (mem_comparable_form) for cs collation, not storing any duplicates for cs collation
  • (mem_comparable_form, restore_data) for ci collation, and storing restore_data into value part of the key-value pair.
@spetrunia
Collaborator

Exploring how to make a bi-directional mapping

 value <=> (mem_comparable_form, restore_data).

latin1_swedish_ci, latin1_general_ci, (and other collations) have these properties:

  • strxfrm(char c) returns a character (a value between 0x00-0xFF).
  • The problem is that there are sets of characters X,Y,Z, ... where X!=Y, X!=Z, but strxfrm(X)==strxfrm(Y)==strxfrm(Z). In charset terminology, these characters have "the same weight" (let's call them a "weight group")

in latin1_swedish_ci: 114 characters have weight conflicts. They form 31 weight groups, size of the group varies between two members (19 groups) and ten (4 groups)

in latin1_general_ci: 56 characters have weight conflicts. They form 28 groups of two members each.

latin1_general_cs has no weight conflicts (we can just enable index_only for it).

Constructing restore_data

value <=> (mem_comparable_form, restore_data).

Let's take one character of the "value".
If its weight is unique, it can be restored from its mem-comparable form.

If its weight is shared with a set of characters $WEIGHT_GROUP, we can assign (statically) a number to each member of the set. The number can be stored in restore_data.

  • $WEIGHT_GROUP is usually small, so the number will only occupy a few bits.
  • by looking at restore_data, we can find its $WEIGHT_GROUP, and know how many bits (if any) are used to store the number of the value in the group.

Other 1-byte charsets

Some 1-byte charsets like latin1_german2_ci may map a single character into two bytes. (most of characters have 1-byte mem_comparable_form, but some have 2-byte). It looks like our approach could be extended to handle those, too.

@spetrunia
Collaborator

Unicode

// our charsets guru is currently not available, but we've had a discussion about this before and I've took another look now.

Most important

  • utf_bin - Already handled
  • utf8_general_ci - the "simpler" collation, provides basic sorting
  • utf8_unicode_ci - more complex
  • utf8mb4 - ??

utf8_general_ci

Sorts about 64K characters, non-trivial sorting provided for 2816 characters
(AFAIU, "alphabets").

Basically, it extends 1-byte charsets approach into using multiple "pages".
There are 11 256-element pages that have non-trivial case conversions or sorting rules such that weight(X)!=X.
For other pages, weight(X) = two_byte_form(X).

Pages with non-trivial case conversions can be handled in the same way as was proposed for 1-byte charsets.

utf8_unicode_ci

This is more complex collation as it does things like Beta='ss' for German, etc.
We may have difficulties with fully supporting it, because some languages have rather complex rules.
OTOH, these languages are not that common, so for those we could just store the source character in the restore_data. This will double the used space.

@spetrunia spetrunia referenced this issue from a commit
@spetrunia spetrunia Issue #25: Make ORDER BY DESC faster
Added support for reverse-ordered Column Families. If a Column Family
name starts with "rev:", then the data in it is stored in the reverse
order.
94fec57
@hermanlee hermanlee closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.