
Bug Report: OnlineDDL PK conversion results in table scans #16023

Closed
mattlord opened this issue May 29, 2024 · 22 comments · Fixed by #16597

@mattlord
Contributor

mattlord commented May 29, 2024

Overview of the Issue

When changing the character set and/or collation of a PK column, OnlineDDL creates a VReplication binlog filter which uses the from (original) charset rather than the to charset. This results in not being able to use the primary key on the shadow table when applying replicated events and can cause vplayer stalls (see #15974).

Reproduction Steps

git checkout main && make build

pushd examples/local

./101_initial_cluster.sh

primaryuid=$(vtctldclient GetTablets --keyspace commerce --tablet-type primary --shard "0" | awk '{print $1}' | cut -d- -f2 | bc)

mysql commerce -e "CREATE TABLE t1 (id varchar(128), id2 varchar(32) NOT NULL, updated_at datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, created_at datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, colstuff varchar(64) NOT NULL, PRIMARY KEY (id,id2))"

for _ in {1..1000}; do mysql commerce -e "insert into t1 (id, id2, colstuff) values ('${RANDOM}${RANDOM}${RANDOM}_id', '${RANDOM}${RANDOM}${RANDOM}_id2', '${RANDOM}${RANDOM}${RANDOM}_colstuff')"; done

command mysql -u root --socket ${VTDATAROOT}/vt_0000000${primaryuid}/mysql.sock -e "set @@global.general_log=ON"

# Change the charset of a PK column.
uuid=$(vtctldclient ApplySchema --ddl-strategy "vitess --postpone-completion" --sql "ALTER TABLE t1 MODIFY id varchar(128) character set ascii" commerce)

sleep 5

command mysql -u root --socket ${VTDATAROOT}/vt_0000000${primaryuid}/mysql.sock -e "select id, workflow, source, pos from _vt.vreplication where workflow = '${uuid}'\G"

command mysql -u root --socket ${VTDATAROOT}/vt_0000000${primaryuid}/mysql.sock -e "update vt_commerce.t1 set colstuff = 'foobar' order by rand() limit 20"

sleep 5

grep _vt_vrp_ ${VTDATAROOT}/vt_0000000${primaryuid}/data/$(hostname -s).log | grep -i update

Example output:

❯ command mysql -u root --socket ${VTDATAROOT}/vt_0000000${primaryuid}/mysql.sock -e "select id, workflow, source, pos from _vt.vreplication where workflow = '${uuid}'\G"
*************************** 1. row ***************************
      id: 1
workflow: 7f8b11f8_1dc8_11ef_a47d_920702940ee1
  source: keyspace:"commerce" shard:"0" filter:{rules:{match:"_vt_vrp_7f8b11f81dc811efa47d920702940ee1_20240529103405_" filter:"select convert(`id` using utf8mb4) as `id`, `id2` as `id2`, `updated_at` as `updated_at`, `created_at` as `created_at`, `colstuff` as `colstuff` from `t1`" convert_charset:{key:"id" value:{from_charset:"utf8mb4" to_charset:"ascii"}} source_unique_key_columns:"id,id2" target_unique_key_columns:"id,id2" source_unique_key_target_columns:"id,id2" force_unique_key:"PRIMARY"}}
     pos: MySQL56/72d489da-1dc8-11ef-8b36-4ae7a99a0651:1-1060

❯ command mysql -u root --socket ${VTDATAROOT}/vt_0000000${primaryuid}/mysql.sock vt_commerce -e "explain update _vt_vrp_7f8b11f81dc811efa47d920702940ee1_20240529103405_ set updated_at='2024-05-29 10:34:09', created_at='2024-05-29 10:33:52', colstuff='foobar' where id=(convert('12927272832499_id' using utf8mb4)) and id2='3024566117854_id2'\G"
*************************** 1. row ***************************
           id: 1
  select_type: UPDATE
        table: _vt_vrp_7f8b11f81dc811efa47d920702940ee1_20240529103405_
   partitions: NULL
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 260
          ref: NULL
         rows: 1000
     filtered: 100.00
        Extra: Using where

Binary Version

❯ vtgate --version
vtgate version Version: 20.0.0-SNAPSHOT (Git revision c666b1442b216413bd36f31926489cc0581af511 branch 'vplayer_batch_trx_timeout') built on Wed May 29 10:36:27 EDT 2024 by matt@pslord.local using go1.22.3 darwin/arm64

Operating System and Environment details

N/A

Log Fragments

N/A
@shlomi-noach
Contributor

Thank you, assigned myself to look into it.

@shlomi-noach
Contributor

OnlineDDL creates a VReplication binlog filter which uses the from (original) charset rather than the to charset.

I'd like to clarify a bit. The problem is not with the binlog filter: the filter tells vreplication how to read rows from the source table. We convert the source string to utf8 (mb4) irrespective of the source/target charsets because that aligns with how we read changes from the binlog (binlog entries are always utf8). So this conversion consolidates logic between vcopier and vplayer.

The only scenarios where we don't convert are when both source and target are compatible utf8 flavors or can trivially convert from one to another -- but that's just a side note, let's move on.

What needs to change is the way vplayer events are applied to the table. We know vplayer reads all strings as utf8 from the binary log (because MySQL always converts to utf8 in the binary logs). It then needs to un-convert them in the WHERE clause.

I actually thought that was already done in the past, but I will now take another look. I know for certain that we do un-convert string values in the INSERT/DELETE statement column values. So if anything, we'd need to apply the same to the WHERE clause columns.

@shlomi-noach
Contributor

Logging the statement generated by tablePlanBuilder for the above example:

update _vt_vrp_e0497d27226c11ef89c00a43f95f28a3_20240604122049_ set updated_at=:a_updated_at, created_at=:a_created_at, colstuff=:a_colstuff where id=(convert(:b_id using utf8mb4)) and id2=:b_id2

So it is in where id=(convert(:b_id using utf8mb4)) that we need to un-convert to the actual column charset/collation.
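A minimal sketch of that idea in Go: instead of hard-coding `using utf8mb4`, the WHERE-clause builder could convert the utf8 bind variable back into the target column's actual charset. The `buildWhereExpr` helper and its signature are hypothetical illustrations, not Vitess's actual tablePlanBuilder API.

```go
package main

import "fmt"

// buildWhereExpr is a hypothetical sketch: given a PK column and the
// charset of that column on the *target* table, emit the WHERE-clause
// comparison so MySQL compares in the column's own charset and can
// therefore use the PRIMARY KEY index.
func buildWhereExpr(col, bindVar, targetCharset string) string {
	if targetCharset == "" {
		// No conversion needed: compare the bind variable directly.
		return fmt.Sprintf("%s=%s", col, bindVar)
	}
	// Un-convert the utf8 value read from the binlog into the
	// target column's charset.
	return fmt.Sprintf("%s=convert(%s using %s)", col, bindVar, targetCharset)
}

func main() {
	// The shape seen in the issue: converting to utf8mb4 forces MySQL
	// to cast the ascii column instead, defeating the index.
	fmt.Println(buildWhereExpr("id", ":b_id", "utf8mb4"))
	// Sketch of the fix: convert into the column's real charset.
	fmt.Println(buildWhereExpr("id", ":b_id", "ascii"))
}
```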

@mattlord
Contributor Author

mattlord commented Jun 4, 2024

I'd like to clarify a bit. The problem is not with the binlog filter: the filter tells vreplication how to read rows from the source table. We convert the source string to utf8 (mb4) irrespective of the source/target charsets because that aligns with how we read changes from the binlog (binlog entries are always utf8). So this conversion consolidates logic between vcopier and vplayer.

It defines both what we want from the source and how we want to transform it for the target. It's the latter part that is the issue in this case. Binlog entries are binary, and go strings are always unicode/utf8 — perhaps that's what you mean here? My understanding is this... We end up with the update statement, which is a sequence of bytes interpreted in vcopier (go) as a unicode string but then sent over the wire to MySQL as a sequence of bytes again. In MySQL that sequence of bytes is interpreted using the connection charset / character_set_client — and then the individual values are converted to the character sets in the table definition for storage as needed. It's not clear to me why we need to explicitly tell MySQL to convert the incoming value to the schema character set. Perhaps we're doing that to deal with any edge cases and any connection charset / character_set_client in use? I'm assuming that there are good reasons and I just don't have them in my mind yet. 🙂

What needs to change is the way vplayer events are applied to the table. We know vplayer reads all strings as utf8 from the binary log (because MySQL always converts to utf8 in the binary logs). It then needs to un-convert them in the WHERE clause.

I don't yet see why we need to do any conversions. All strings in go are also unicode/utf8. We do not do any such conversions in VReplication — these are only done when explicitly specified by OnlineDDL today AFAIK (I could of course be wrong).

It's certainly possible that there's some VReplication side bug/gap in here, but I don't see that yet. Thanks!

@shlomi-noach
Contributor

We end up with the update statement, which is a sequence of bytes interpreted in vcopier (go) as a unicode string but then sent over the wire to MySQL as a sequence of bytes again. In MySQL that sequence of bytes is interpreted using the connection charset / character_set_client

I argue that this is immaterial. VCopier can do all the conversions it wants, it's just INSERTs and does not cause full table scans. The problem is with VPlayer, and the way it converts (or does not convert) utf8 values as seen in the binary log, into the actual charset, for columns that are part of the PKE.

@shlomi-noach
Contributor

We end up with the update statement, which is a sequence of bytes interpreted in vcopier (go) as a unicode string but then sent over the wire to MySQL as a sequence of bytes again. In MySQL that sequence of bytes is interpreted using the connection charset / character_set_client

Because the target charset is ascii. And so vplayer attempts to run:

update ... set ... where id=(convert(:b_id using utf8mb4)) and id2=:b_id2

But id on target is ascii and therefore MySQL casts id to utf8 to compute the comparison, which in turn negates the use of the index.

@mattlord
Contributor Author

mattlord commented Jun 4, 2024

I argue that this is immaterial. VCopier can do all the conversions it wants, it's just INSERTs and does not cause full table scans. The problem is with VPlayer, and the way it converts (or does not convert) utf8 values as seen in the binary log, into the actual charset, for columns that are part of the PKE.

My argument is that MySQL uses schema on write — which includes the character set. Why are we doing the CONVERT in the SQL statement itself?

We end up with the update statement, which is a sequence of bytes interpreted in vcopier (go) as a unicode string but then sent over the wire to MySQL as a sequence of bytes again. In MySQL that sequence of bytes is interpreted using the connection charset / character_set_client

Because the target charset is ascii. And so vplayer attempts to run:

update ... set ... where id=(convert(:b_id using utf8mb4)) and id2=:b_id2

The vplayer is only doing that because it's been told to in the binlog source definition.

But id on target is ascii and therefore MySQL casts id to utf8 to compute the comparison, which in turn negates the use of the index.

MySQL is only doing that conversion because we're explicitly telling it to. Which is the problem as I see it. 🙂

@mattlord
Contributor Author

mattlord commented Jun 4, 2024

If it's not clear, my main argument is that there's an incoming sequence of bytes defining the SQL statement. MySQL will coerce the individual values in that statement — if possible — from the client/connection charset to the column charset. It's not clear to me why we're trying to explicitly tell MySQL to do anything here when in the end we always want it to do what it should, no? Meaning to convert the value to the charset defined for the column. There may very well be good reason(s), mind you, I'm just trying to understand them.

@shlomi-noach
Contributor

@mattlord we read column values from MySQL into golang space. We do that in two ways:

  1. Via vcopier (select ... from the table)
  2. Via Vplayer (reading the binlog event)

We then have a single logic that applies to those values. When I began writing OnlineDDL for VReplication I found that this logic mostly assumed utf8. The problem is that in (2) you always read utf8, because of how binary logs encode values. But in (1) you get whatever charset you have on the table.

So values were then written to the backend table inconsistently and that was seen in various charset validation test cases for Online DDL, which we originally imported from gh-ost. vreplication was broken in this scenario.

This is why we now convert(col using utf8mb4) when reading string or enum values from the source table. Now both vcopier and vplayer consistently read the same values.

We absolutely have to deal with conversions because the binary log is utf8 irrespective of the real table charset. We furthermore need to overcome the connection charset transformation that MySQL applies.

OK so now that both use utf8, we need to make sure the value is converted back to the real charset in MySQL table columns. We do that correctly when we construct INSERT queries. We do that correctly when we construct UPDATE queries, but it looks to me like we skip the WHERE clause part. Similarly in DELETE we skip the WHERE clause part.

@mattlord
Contributor Author

mattlord commented Jun 4, 2024

We absolutely have to deal with conversions because the binary log is utf8 irrespective of the real table charset. We furthermore need to overcome the MySQL connection charset transformation (because it does).

OK, issues relating to the connection/client charset (set names foo) and conversions/coercions between that and the column character sets certainly make sense. That's what I suspected was at the heart of this. I suspect that a big part of this was also that at the time Vitess had no collation support (only unicode / utf8). The other points don't make sense to me, but I believe you. 🙂

@shlomi-noach
Contributor

The other points don't make sense to me,

Which of the points? vcopier vs. vplayer? Those I can attest to with full confidence. Again, if you SELECT col FROM tbl and col is latin1, you get some value. That value is stored in a golang utf string, but the bytes have form A. golang doesn't convert anything on your behalf. You just read some bytes.

When you read an INSERT INTO tbl (... col ...) VALUE ( ... thevalue ...) event from the binary log, the value you read is originally encoded in utf8. You read that value into a golang utf string. Again golang does nothing on your behalf, you just read some bytes. this is form B

I can tell you with absolute certainty that form A != form B. You get a different sequence of bytes. Form B, BTW, is correct, because you're reading utf8 into utf8.

You now have two different actual binary values, and in both cases you wish to INSERT those onto tbl. There is only one insert logic. The values you INSERT into the table will be different. It's a bug.

@shlomi-noach
Contributor

shlomi-noach commented Jun 4, 2024

@mattlord proof is super easy by the way. Remove the convert( ... ) part in vrepl.go, create a draft PR, see multiple onlineddl_vrepl_suite tests failing.

@mattlord
Contributor Author

mattlord commented Jun 4, 2024

The other points don't make sense to me,

Which of the points? vcopier vs. vplayer? Those I can attest to with full confidence. Again, if you SELECT col FROM tbl and col is latin1, you get some value. That value is stored in a golang utf string, but the bytes have form A. golang doesn't convert anything on your behalf. You just read some bytes.

When you read an INSERT INTO tbl (... col ...) VALUE ( ... thevalue ...) event from the binary log, the value you read is originally encoded in utf8.

Most of the time we're reading ROW events, but sometimes statements. I don't see where the bytes are serialized with a unicode encoding in the MySQL source (I may just be missing it). But that's also immaterial here I think.

You read that value into a golang utf string. Again golang does nothing on your behalf, you just read some bytes. this is form B

OK.

I can tell you with absolute certainty that form A != form B. You get a different sequence of bytes. Form B, BTW, is correct, because you're reading utf8 into utf8.

The byte representation for A is the same in latin1 and utf8: 0x41 (65 in decimal, binary 1000001). How that is compared to other things is determined by the collation (the default collation for the character set if none is specified). I suspect this is more about comparisons (whether a == A or a == å is determined by the collation).

mysql> select convert("A" using latin1), convert("A" using utf8mb4);
+---------------------------+----------------------------+
| convert("A" using latin1) | convert("A" using utf8mb4) |
+---------------------------+----------------------------+
| A                         | A                          |
+---------------------------+----------------------------+
1 row in set (0.00 sec)

mysql> select hex(convert("A" using latin1)), hex(convert("A" using utf8mb4));
+--------------------------------+---------------------------------+
| hex(convert("A" using latin1)) | hex(convert("A" using utf8mb4)) |
+--------------------------------+---------------------------------+
| 41                             | 41                              |
+--------------------------------+---------------------------------+
1 row in set (0.01 sec)

mysql> select convert("A" using latin1) = convert("A" using utf8mb4);
+--------------------------------------------------------+
| convert("A" using latin1) = convert("A" using utf8mb4) |
+--------------------------------------------------------+
|                                                      1 |
+--------------------------------------------------------+
1 row in set (0.00 sec)

You now have two different actual binary values, and in both cases you wish to INSERT those onto tbl. There is only one insert logic. The values you INSERT into the table will be different. It's a bug.

I still don't understand how we have two different binary values. It's a matter of how those bytes are interpreted at the point of storage or comparison.

@mattlord proof is super easy by the way. Remove the convert( ... ) part in vrepl.go, create a draft PR, see multiple onlineddl_vrepl_suite tests failing.

I believe that, I'm just trying to understand where the problem really lies and if there's a better way to deal with it today. I'm not disagreeing, I'm merely trying to understand. I'll run a local test that way to try and get a better feel. This may very well end up being considered a real/valid VReplication side bug and I want to understand it better so that I can think about how best to fix it. 🙂

@shlomi-noach
Contributor

However - what I'd like to reexamine is: once we have everything as utf8 in golang space (having read it as utf8 from the original table or from the binary log), why do we need to convert(... using utf8) again when writing to the target table?

@shlomi-noach
Contributor

Here's an example. I rewrote table_plan_builder to not convert(...) when writing to the target table (rowstreamer still converts to utf8mb4 when reading from the table). Running through onlineddl_vrepl_suite, the test TestSchemaChange/alter-charset-non-utf8-80 fails. The characters look different:

Jun 04 15:06:26         	            	-21	átesting-binlog	átesting-binlog	átesting-binlog	átesting-binlog	átesting-binlog
Jun 04 15:06:26         	            	-22	testátest-binlog	testátest-binlog	testátest-binlog	🍻😀	átesting-binlog
Jun 04 15:06:26         	            	-23	átesting-bnull	átesting-bnull	átesting-bnull	NULL	NULL
Jun 04 15:06:26         	            	+21	átesting-binlog	ĂĄtesting-binlog	átesting-binlog	átesting-binlog	átesting-binlog
Jun 04 15:06:26         	            	+22	testátest-binlog	testĂĄtest-binlog	testátest-binlog	🍻😀	átesting-binlog
Jun 04 15:06:26         	            	+23	átesting-bnull	ĂĄtesting-bnull	átesting-bnull	NULL	NULL

If I edit table_plan_builder to convert(...) into the target charset (in your example this was ascii) then again this test fails:

Jun 04 15:06:17         	            	-10	átesting-binlog	átesting-binlog	átesting-binlog	átesting-binlog	átesting-binlog
Jun 04 15:06:17         	            	-11	átesting-bnull	átesting-bnull	átesting-bnull	NULL	NULL
Jun 04 15:06:17         	            	+10	átesting-binlog	ĂĄtesting-binlog	átesting-binlog	átesting-binlog	átesting-binlog
Jun 04 15:06:17         	            	+11	átesting-bnull	ĂĄtesting-bnull	átesting-bnull	NULL	NULL
Jun 04 15:06:17         	            	 12	43f2d06c28c70faf9a8ede73745a1781	66bea8d0cbfe4ad13293e47732b39051	0c2f0eac2f84c9485cfcfb3c4

@shlomi-noach
Contributor

I'm not sure what to make of it anymore, this hurts my brain.

@shlomi-noach
Contributor

Side note, but maybe more important than this entire discussion: it may be the case that we should reject this type of schema migration, because the source and target table do not share identifiable column values in the PKE. That is, the id value in the source table is not identifiable with the id value in the target table.

@mattlord
Contributor Author

mattlord commented Jun 4, 2024

I think you're looking at the results of selecting those rows/values? I'll have to look. What you see on SELECT or generally on "display" is determined by the connection/client charset:

mysql> select * from chartest;
+----+------------+
| id | name       |
+----+------------+
|  1 | matt lord  |
|  2 | mått lord  |
|  3 | ⭐          |
+----+------------+
3 rows in set (0.00 sec)

mysql> set names latin1;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from chartest;
+----+-----------+
| id | name      |
+----+-----------+
|  1 | matt lord |
|  2 | m�tt lord  |
|  3 | ?         |
+----+-----------+
3 rows in set (0.00 sec)

In any event, I'm fine moving this to a VReplication bug and I'll put it in the prioritized queue. I AGREE, it gets very complicated and hairy. That's why I'm trying to look at it with fresh eyes rather than building on determinations/assumptions made years ago (which may of course turn out to be right/true).

@shlomi-noach
Contributor

New findings!

Consider this table:

drop table if exists tbl;
CREATE TABLE `tbl` (
`c1` varchar(12) NOT NULL DEFAULT '',
`c2` varchar(12) CHARACTER SET ascii COLLATE ascii_general_ci NOT NULL DEFAULT '',
`ts` datetime DEFAULT current_timestamp(),
UNIQUE KEY `c1c2_uidx` (`c1`, `c2`)
) ENGINE InnoDB,
CHARSET utf8mb4,
COLLATE utf8mb4_unicode_ci;

insert into tbl (c1, c2) values (left(sha1(rand()),12), left(sha1(rand()),12));
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;
insert into tbl (c1, c2) select left(sha1(rand()),12), left(sha1(rand()),12) from tbl;

Because c2 is ascii, vstreamer uses convert(c2 using utf8mb4) as follows:

select /*+ MAX_EXECUTION_TIME(3600000) */ c1, convert(c2 using utf8mb4) as c2, ts from tbl force index (`c1c2_uidx`) order by c1, c2;

Let's EXPLAIN this:

mysql> explain select /*+ MAX_EXECUTION_TIME(3600000) */ c1, convert(c2 using utf8mb4) as c2, ts from tbl force index (`c1c2_uidx`) order by c1, c2;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+
|  1 | SIMPLE      | tbl   | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 1910748 |   100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+

Using filesort is obviously bad, and the copy performance will tank.

But, why is this query tanking? Didn't we ask it to force index (c1c2_uidx) order by c1, c2?

This is where the fun starts. The problem is that we've aliased convert(c2 using utf8mb4) as c2. To prove it, see what happens if we do not make that alias:

mysql> explain select /*+ MAX_EXECUTION_TIME(3600000) */ c1, convert(c2 using utf8mb4), ts from tbl force index (`c1c2_uidx`) order by c1, c2;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+--------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows   | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+--------+----------+-------+
|  1 | SIMPLE      | tbl   | NULL       | index | NULL          | c1c2_uidx | 64      | NULL | 950012 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+--------+----------+-------+

So, in the aliased variation, order by c1, c2 really means order by c1, convert(c2 using utf8mb4), which then cannot use the index.

However, that aliasing is really useful, as it hides away the complexity of the conversion. I was thinking we could alias into something else, to then re-alias in code. However, I realized this is easily solved like so:

mysql> explain select /*+ MAX_EXECUTION_TIME(3600000) */ c1, convert(c2 using utf8mb4) as c2, ts from tbl force index (`c1c2_uidx`) order by tbl.c1, tbl.c2;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+---------+----------+-------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows    | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+---------+----------+-------+
|  1 | SIMPLE      | tbl   | NULL       | index | NULL          | c1c2_uidx | 64      | NULL | 1910748 |   100.00 | NULL  |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+---------+----------+-------+

Qualifying the columns with the table name un-aliases them and uses their original values, and MySQL is then happy to use the index.

Which means the fix is to merely qualify the sorting columns with the table name.

WDYT?
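The proposed qualification step can be sketched in a few lines of Go — an illustration of the idea, not the actual Vitess query builder. The `orderByClause` helper is hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// orderByClause qualifies each sort column with the table name so that
// MySQL resolves them to the real table columns rather than to any
// SELECT-list aliases (such as `convert(c2 using utf8mb4) as c2`),
// which keeps the index usable for the sort.
func orderByClause(table string, cols []string) string {
	qualified := make([]string, len(cols))
	for i, c := range cols {
		qualified[i] = fmt.Sprintf("%s.%s", table, c)
	}
	return "order by " + strings.Join(qualified, ", ")
}

func main() {
	// Emits: order by tbl.c1, tbl.c2
	fmt.Println(orderByClause("tbl", []string{"c1", "c2"}))
}
```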

@mattlord
Contributor Author

mattlord commented Aug 13, 2024

To be clear, rowstreamer is not choosing to do the conversion; OnlineDDL is telling it to do that. I still don't understand why we are converting a column from its character set to UTF8MB4. Perhaps it was done to catch an edge case: the column was previously a superset character set (implicitly from the table, database, or server default, or explicitly for the column), then an ALTER TABLE was done to change the column's character set to a subset (e.g. UTF8[MB4] to ASCII or latin1) — which impacts new records but does not change existing ones — but an ALTER TABLE CONVERT TO was not subsequently done to convert the existing values. Meaning that while the column definition is ASCII, it could have some values that contain code points beyond the ASCII range. That would be my guess, but I don't know. IMO this would only carry the underlying problem forward to the new table, assuming we aren't moving the column from ASCII to UTF8MB4 (maybe we are here). Anyway, we can assume there are good and valid reasons for doing so for now.

We could qualify the table name in the ORDER BY, but currently the ORDER BY is supposed to refer to the column value in the SELECT. If it's necessary to do the conversion for the field, then I would think it's also necessary to do it for the ORDER BY as that is critical for being able to correctly copy all of the initial data (we need to guarantee a consistent order to pick up where we left off using the LastPK value). So I'm hesitant to say that's a general solution.

@shlomi-noach
Contributor

To be clear, rowstreamer is not choosing to do the conversion, OnlineDDL is telling it to do that.

Agreed. It is relevant to the grand discussion in this issue, but it's not as relevant to the specific solution proposed in #16023 (comment).

I still don't understand why we are converting a column from its character set to UTF8MB4.

This circles back to the greater discussion, and we will need to understand the entire charset/collation issue better.

but an ALTER TABLE CONVERT TO was not subsequently done to convert the existing values?

I'm not sure I follow this specific example, but would like to keep focusing for now on the solution proposed in #16023 (comment).

Meaning that while the column definition is ASCII

ASCII is truly the exception to the charset issue, because ASCII is contained within UTF8 and doesn't actually need conversion. So for the purposes of this discussion, let's assume latin1 rather than ASCII.

but currently the ORDER BY is supposed to refer to the column value in the SELECT. If it's necessary to do the conversion for the field, then I would think it's also necessary to do it for the ORDER BY as that is critical for being able to correctly copy all of the initial data

we need to guarantee a consistent order to pick up where we left off using the LastPK value

That is correct. Bummer. Back to the drawing board.

@shlomi-noach
Contributor

We're having an internal discussion around this issue and have made some progress. Some highlights:

  • vreplication uses set names 'binary' at all places, including in vstreamer, which is why we need to convert textual values to utf8mb4 in vcopier.
  • vplayer does not use set names ... and just gets a utf8mb4 connection.
  • Contrary to what I used to believe, textual values in the binary log are stored in their original charsets; I had always thought they were utf8 in the binlog events. That was my assessment when I authored gh-ost, but let's put that investigation aside.
  • We therefore need to programmatically convert vplayer texts into utf8.
  • If we remove set names 'binary', then we don't need SQL convert() and we don't need to programmatically convert charsets on vstreamer/vcopier data.
  • We're looking into how to make this backwards compatible in a single version change (otherwise we need two versions to fix this).
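The "programmatically convert vplayer texts into utf8" bullet can be sketched for the simplest case, latin1, in pure Go. This is a hedged illustration, not Vitess's actual conversion code: `latin1ToUTF8` is a hypothetical helper, and a real implementation would need a mapping table per charset (latin2, cp1251, and so on).

```go
package main

import "fmt"

// latin1ToUTF8 decodes latin1 (ISO-8859-1) bytes into a UTF-8 Go
// string. latin1 is the easy case: every byte maps to the Unicode
// code point of the same value, so the decode is a byte-to-rune
// widening followed by Go's rune-to-utf8 encoding.
func latin1ToUTF8(b []byte) string {
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes)
}

func main() {
	latin1 := []byte{0xE1} // "á" in latin1
	s := latin1ToUTF8(latin1)
	// The utf8 form of the same text is two bytes: 0xC3 0xA1.
	fmt.Printf("%s %x\n", s, []byte(s)) // á c3a1
}
```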
