Commit
MDEV-30034 UNIQUE USING HASH accepts duplicate entries for tricky collations

- Adding a new argument "flag" to MY_COLLATION_HANDLER::strnncollsp_nchars() and a flag MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES. The flag defines whether strnncollsp_nchars() should emulate trailing spaces which were possibly trimmed earlier (e.g. in InnoDB CHAR compression). This is important for NOPAD collations.

  For example, with this input:
  - str1= 'a '  (Latin letter a followed by one space)
  - str2= 'a  ' (Latin letter a followed by two spaces)
  - nchars= 3

  if the flag is given, strnncollsp_nchars() will virtually restore one trailing space to str1, up to nchars (3) characters, and compare the two strings as equal:
  - str1= 'a  ' (one extra trailing space emulated)
  - str2= 'a  ' (as is)

  If the flag is not given, strnncollsp_nchars() does not add trailing virtual spaces, so in case of a NOPAD collation, str1 will be compared as less than str2 because it is shorter.

- Field_string::cmp_prefix() now passes the new flag. Field_varstring::cmp_prefix() and Field_blob::cmp_prefix() do not pass the new flag.

- The branch in cmp_whole_field() in storage/innobase/rem/rem0cmp.cc (which handles the CHAR data type) now also passes the new flag.

- Fixing UCA collations to respect the new flag. Other collations are possibly also affected, however I had no success in making an SQL script demonstrating the problem. Other collations will be extended to respect this flag in a separate patch later.

- Changing the meaning of the last parameter of Field::cmp_prefix() from "number of bytes" (internal length) to "number of characters" (user visible length).

  The code calling cmp_prefix() from handler.cc was wrong. After this change, the call in handler.cc became correct.

  The code calling cmp_prefix() from key_rec_cmp() in key.cc was adjusted according to this change.

- Old strnncollsp_nchars() related tests in unittest/strings/strings-t.c now pass the new flag. A few new tests were also added, without the flag.
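The effect of the flag on the 'a' example above can be sketched with a toy comparison function. This is only an illustration under simplifying assumptions: single-byte characters, plain byte-value ordering, and NOPAD semantics. The names `toy_strnncollsp_nchars` and `TOY_EMULATE_TRIMMED_TRAILING_SPACES` are hypothetical and are not the real MY_COLLATION_HANDLER interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for
   MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES. */
#define TOY_EMULATE_TRIMMED_TRAILING_SPACES 1u

/*
  Toy NOPAD comparison of two single-byte strings, limited to the first
  nchars characters. Returns a negative value, 0, or a positive value.
  When the flag is set, a string that ends before nchars is treated as if
  it were followed by spaces up to nchars characters.
*/
static int toy_strnncollsp_nchars(const unsigned char *a, size_t alen,
                                  const unsigned char *b, size_t blen,
                                  size_t nchars, unsigned flags)
{
  assert(a != NULL && b != NULL);
  int emulate = (flags & TOY_EMULATE_TRIMMED_TRAILING_SPACES) != 0;

  for (size_t i = 0; i < nchars; i++)
  {
    int a_end = i >= alen;
    int b_end = i >= blen;

    if (a_end && b_end)
      return 0;                 /* both strings ended before nchars */
    if (!emulate)
    {
      if (a_end) return -1;     /* NOPAD: the shorter string is less */
      if (b_end) return 1;
    }
    /* With the flag, a missing character is read as a virtual space. */
    unsigned char ca = a_end ? ' ' : a[i];
    unsigned char cb = b_end ? ' ' : b[i];
    if (ca != cb)
      return ca < cb ? -1 : 1;
  }
  return 0;                     /* equal within the first nchars characters */
}
```

With str1 = 'a' plus one space (length 2), str2 = 'a' plus two spaces (length 3) and nchars = 3, this sketch returns 0 when the flag is passed and a negative value when it is not, matching the behaviour described in the commit message.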
Showing 18 changed files with 659 additions and 237 deletions.
@@ -0,0 +1,85 @@
--echo #
--echo # MDEV-30034 UNIQUE USING HASH accepts duplicate entries for tricky collations
--echo #

# TEXT

if (`SELECT UPPER(@@storage_engine) != 'MEMORY'`)
{
EXECUTE IMMEDIATE REPLACE(
'CREATE TABLE t1 ( '
' a TEXT COLLATE <COLLATION>,'
'UNIQUE(a(3)))',
'<COLLATION>', @@collation_connection);
SHOW CREATE TABLE t1;
INSERT INTO t1 VALUES ('ss ');
--error ER_DUP_ENTRY
INSERT INTO t1 VALUES (_utf8mb3 0xC39F20)/*SZ+SPACE*/;
DROP TABLE t1;


EXECUTE IMMEDIATE REPLACE(
'CREATE TABLE t1 ( '
' a TEXT COLLATE <COLLATION>,'
'UNIQUE(a(3)) USING HASH)',
'<COLLATION>', @@collation_connection);
SHOW CREATE TABLE t1;
INSERT INTO t1 VALUES ('ss ');
--error ER_DUP_ENTRY
INSERT INTO t1 VALUES (_utf8mb3 0xC39F20)/*SZ+SPACE*/;
DROP TABLE t1;
}


# VARCHAR

EXECUTE IMMEDIATE REPLACE(
'CREATE TABLE t1 ( '
' a VARCHAR(2000) COLLATE <COLLATION>,'
'UNIQUE(a(3)))',
'<COLLATION>', @@collation_connection);
SHOW CREATE TABLE t1;
INSERT INTO t1 VALUES ('ss ');
--error ER_DUP_ENTRY
INSERT INTO t1 VALUES (_utf8mb3 0xC39F20)/*SZ+SPACE*/;
DROP TABLE t1;


EXECUTE IMMEDIATE REPLACE(
'CREATE TABLE t1 ( '
' a VARCHAR(2000) COLLATE <COLLATION>,'
'UNIQUE(a(3)) USING HASH)',
'<COLLATION>', @@collation_connection);
SHOW CREATE TABLE t1;
INSERT INTO t1 VALUES ('ss ');
--error ER_DUP_ENTRY
INSERT INTO t1 VALUES (_utf8mb3 0xC39F20)/*SZ+SPACE*/;
DROP TABLE t1;

# CHAR

# MyISAM is buggy on CHAR+BTREE+UNIQUE+PREFIX (see MDEV-30048), disable for now
# Other engines work fine

if (`SELECT UPPER(@@storage_engine) != 'MYISAM'`)
{
EXECUTE IMMEDIATE REPLACE(
'CREATE TABLE t1 ( '
' a CHAR(20) COLLATE <COLLATION>,'
'UNIQUE(a(3)))',
'<COLLATION>', @@collation_connection);
SHOW CREATE TABLE t1;
INSERT INTO t1 VALUES ('ss ');
INSERT INTO t1 VALUES (_utf8mb3 0xC39F20)/*SZ+SPACE*/;
DROP TABLE t1;
}

EXECUTE IMMEDIATE REPLACE(
'CREATE TABLE t1 ( '
' a CHAR(20) COLLATE <COLLATION>,'
'UNIQUE(a(3)) USING HASH)',
'<COLLATION>', @@collation_connection);
SHOW CREATE TABLE t1;
INSERT INTO t1 VALUES ('ss ');
INSERT INTO t1 VALUES (_utf8mb3 0xC39F20)/*SZ+SPACE*/;
DROP TABLE t1;
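
The hex literal `_utf8mb3 0xC39F20` used throughout the test is the UTF-8 encoding of U+00DF (the letter sharp s, annotated as SZ in the test comments) followed by a space. The expected ER_DUP_ENTRY errors presumably rely on the tested collation treating that letter as equal to 'ss'. The decoding itself can be checked with a small sketch; the helper name `toy_utf8_decode` is hypothetical and handles only 1-byte and 2-byte sequences, which is all this literal needs.

```c
#include <assert.h>
#include <stddef.h>

/*
  Decode a single UTF-8 sequence of 1 or 2 bytes.
  Stores the code point in *cp and returns the number of bytes consumed,
  or 0 if the input is empty, malformed, or longer than 2 bytes
  (longer sequences are out of scope for this sketch).
*/
static size_t toy_utf8_decode(const unsigned char *s, size_t len, unsigned *cp)
{
  assert(s != NULL && cp != NULL);

  if (len >= 1 && s[0] < 0x80)
  {
    *cp = s[0];                              /* ASCII byte */
    return 1;
  }
  if (len >= 2 && s[0] >= 0xC2 && s[0] <= 0xDF && (s[1] & 0xC0) == 0x80)
  {
    /* 110xxxxx 10yyyyyy -> xxxxxyyyyyy */
    *cp = ((unsigned) (s[0] & 0x1F) << 6) | (unsigned) (s[1] & 0x3F);
    return 2;
  }
  return 0;
}
```

Decoding the bytes C3 9F 20 with this helper yields U+00DF from the first two bytes and U+0020 from the third, so the inserted value is two characters long while 'ss ' is three.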