Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TIDB and MySQL implicit conversion results are different #37260

Closed
yilongrong opened this issue Aug 22, 2022 · 4 comments
Closed

TIDB and MySQL implicit conversion results are different #37260

yilongrong opened this issue Aug 22, 2022 · 4 comments
Labels
affects-4.0 This bug affects 4.0.x versions. affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.0 affects-6.1 affects-6.2 found/community The community has encountered this bug. severity/minor sig/execution SIG execution type/bug This issue is a bug.

Comments

@yilongrong
Copy link

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

CREATE TABLE tmp_t1 (
id bigint(20) NOT NULL,
PRIMARY KEY (id) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_bin ROW_FORMAT = Compact;
INSERT INTO tmp_t1 VALUES (1295772036504131620);
INSERT INTO tmp_t1 VALUES (1295772036504131621);

CREATE TABLE tmp_t2 (
id bigint(20) NOT NULL,
id_f varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NULL DEFAULT NULL,
PRIMARY KEY (id) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_bin ROW_FORMAT = Compact;
INSERT INTO tmp_t2 VALUES (11, '1295772036504131620');

SELECT * from tmp_t2 t2 LEFT JOIN tmp_t1 t1 on t2.id_f = t1.id;

2. What did you expect to see? (Required)

+----+---------------------+---------------------+
| id | id_f | id |
+----+---------------------+---------------------+
| 11 | 1295772036504131620 | 1295772036504131620 |
+----+---------------------+---------------------+

mysql> explain analyze SELECT * from tmp_t2 t2 LEFT JOIN tmp_t1 t1 on t2.id_f = t1.id;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Nested loop left join (cost=0.70 rows=1) (actual time=0.095..0.103 rows=1 loops=1)
-> Table scan on t2 (cost=0.35 rows=1) (actual time=0.050..0.057 rows=1 loops=1)
-> Filter: (cast(t2.id_f as double) = cast(t1.id as double)) (cost=0.35 rows=1) (actual time=0.041..0.041 rows=1 loops=1)
-> Single-row covering index lookup on t1 using PRIMARY (id=t2.id_f) (cost=0.35 rows=1) (actual time=0.031..0.031 rows=1 loops=1)
|
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

3. What did you see instead (Required)

+----+---------------------+---------------------+
| id | id_f | id |
+----+---------------------+---------------------+
| 11 | 1295772036504131620 | 1295772036504131621 |
| 11 | 1295772036504131620 | 1295772036504131620 |
+----+---------------------+---------------------+

mysql> explain analyze SELECT * from tmp_t2 t2 LEFT JOIN tmp_t1 t1 on t2.id_f = t1.id;
+------------------------------+---------+---------+-----------+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+------------+---------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+------------------------------+---------+---------+-----------+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+------------+---------+
| HashJoin_12 | 1.25 | 2 | root | | time:11ms, loops:2, build_hash_table:{total:10.6ms, fetch:10.6ms, build:14.9µs}, probe:{concurrency:5, total:53.2ms, max:10.7ms, probe:28.4µs, fetch:53.1ms} | left outer join, equal:[eq(Column#4, Column#5)] | 10.4 KB | 0 Bytes |
| ├─Projection_13(Build) | 1.00 | 1 | root | | time:10.5ms, loops:2, Concurrency:OFF | test.tmp_t2.id, test.tmp_t2.id_f, cast(test.tmp_t2.id_f, double BINARY)->Column#4 | 1016 Bytes | N/A |
| │ └─TableReader_15 | 1.00 | 1 | root | | time:10.5ms, loops:2, cop_task: {num: 1, max: 10.6ms, proc_keys: 1, rpc_num: 1, rpc_time: 10.6ms, copr_cache_hit_ratio: 0.00} | data:TableFullScan_14 | 250 Bytes | N/A |
| │ └─TableFullScan_14 | 1.00 | 1 | cop[tikv] | table:t2 | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 1, total_process_keys_size: 55, total_keys: 2, rocksdb: {delete_skipped_count: 0, key_skipped_count: 1, block: {cache_hit_count: 0, read_count: 0, read_byte: 0 Bytes}}} | keep order:false, stats:pseudo | N/A | N/A |
| └─Projection_16(Probe) | 2.00 | 2 | root | | time:10.6ms, loops:2, Concurrency:OFF | test.tmp_t1.id, cast(test.tmp_t1.id, double BINARY)->Column#5 | 380 Bytes | N/A |
| └─TableReader_18 | 2.00 | 2 | root | | time:10.6ms, loops:2, cop_task: {num: 1, max: 10.7ms, proc_keys: 2, rpc_num: 1, rpc_time: 10.7ms, copr_cache_hit_ratio: 0.00} | data:TableFullScan_17 | 215 Bytes | N/A |
| └─TableFullScan_17 | 2.00 | 2 | cop[tikv] | table:t1 | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 2, total_process_keys_size: 54, total_keys: 3, rocksdb: {delete_skipped_count: 0, key_skipped_count: 2, block: {cache_hit_count: 0, read_count: 0, read_byte: 0 Bytes}}} | keep order:false, stats:pseudo | N/A | N/A |
+------------------------------+---------+---------+-----------+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+------------+---------+
7 rows in set (0.02 sec)

4. What is your TiDB version? (Required)

6.1.0

@yilongrong yilongrong added the type/bug This issue is a bug. label Aug 22, 2022
@ti-chi-bot ti-chi-bot added may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.0 may-affects-6.1 may-affects-6.2 labels Aug 23, 2022
@aytrack aytrack added affects-4.0 This bug affects 4.0.x versions. affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.0 affects-6.1 affects-6.2 labels Aug 23, 2022
@ti-chi-bot ti-chi-bot removed may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-6.0 may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.2 labels Aug 23, 2022
@zanmato1984
Copy link
Contributor

What happens in this query is that:

  1. Both the bigint and the varchar are implicitly converted to double, and then the converted double are used to do the join comparison;
  2. Double type contains at most 17 digits precision so the value 1295772036504131620 and 1295772036504131621 lost their last 2 digits.

What's different between MySQL and TiDB is:

  1. TiDB faithfully uses the converted double to do the join comparison;
  2. Whereas MySQL logically uses the converted double, as shown in the execution plan, but there exists some unknown magic that makes the join take count of the last 2 (already lost) digits.

So who's correct? There is no simple answer. Actually this is one of the MySQL inconsistencies against itself. Let's change the query a little bit:

SELECT * from tmp_t2 t2 LEFT JOIN tmp_t1 t1 on cast(t2.id_f as double) = cast(t1.id as double);

in which we explicitly ask MySQL/TiDB to do the join using the converted double.

TiDB gives the same result as the original query, whereas MySQL is different:

+----+---------------------+---------------------+
| id | id_f                | id                  |
+----+---------------------+---------------------+
| 11 | 1295772036504131620 | 1295772036504131620 |
| 11 | 1295772036504131620 | 1295772036504131621 |
+----+---------------------+---------------------+

In the sense of self-consistency, I think TiDB performs better than MySQL. Therefore we may not take this as a bug.

And more importantly, we suggest using more accurate SQL statement to represent what you are trying to do, for example:

SELECT * from tmp_t2 t2 LEFT JOIN tmp_t1 t1 on cast(t2.id_f as signed) = t1.id;

to avoid the gray area of the MySQL inconsistency.

@zanmato1984
Copy link
Contributor

zanmato1984 commented Aug 23, 2022

I'm closing this issue for now as analyzed above. Feel free to reopen if you have new findings.

@chrysan
Copy link
Contributor

chrysan commented Dec 11, 2023

The reason of MySQL's inconsistency is that its plan uses "Single-row covering index lookup on t1 using PRIMARY".

EXPLAIN
-> Nested loop left join  (cost=0.70 rows=1) (actual time=0.035..0.043 rows=1 loops=1)
    -> Table scan on t2  (cost=0.35 rows=1) (actual time=0.012..0.015 rows=1 loops=1)
        -> Filter: (cast(t2.id_f as double) = cast(t1.id as double))  (cost=0.35 rows=1) (actual time=0.017..0.020 rows=1 loops=1)
                -> Single-row covering index lookup on t1 using PRIMARY (id=t2.id_f)  (cost=0.35 rows=1) (actual time=0.010..0.011 rows=1 loops=1)

If t1.id is not the primary key, the result is inconsistent.

@seiya-annie
Copy link

/found community

@ti-chi-bot ti-chi-bot bot added the found/community The community has encountered this bug. label Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-4.0 This bug affects 4.0.x versions. affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.0 affects-6.1 affects-6.2 found/community The community has encountered this bug. severity/minor sig/execution SIG execution type/bug This issue is a bug.
Projects
None yet
Development

No branches or pull requests

6 participants