In the GROUP BY clause, TiDB incorrectly handled ' ' and NULL #52938

sjyango · 2024-04-27T06:31:30Z

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

CREATE TABLE t0 (
c0 TEXT NOT NULL
);
INSERT INTO t0 VALUES (' '), ('dadfad'), ('2342dfad'), ('2dfad');

CREATE TABLE t1 (
c0 TEXT NOT NULL
);
INSERT INTO t1 VALUES ('xxx'), ('3gf'), (''), ('dddd');

SELECT count(t1.c0) FROM t1 LEFT OUTER JOIN t0 ON t0.c0 = t1.c0 GROUP BY t0.c0;

2. What did you expect to see? (Required)

+--------------+
| count(t1.c0) |
+--------------+
|            4 |
+--------------+

3. What did you see instead (Required)

MySQL> SELECT count(t1.c0) FROM t1 LEFT OUTER JOIN t0 ON t0.c0 = t1.c0 GROUP BY t0.c0;
+--------------+
| count(t1.c0) |
+--------------+
|            3 |
|            1 |
+--------------+
2 rows in set (0.004 sec)

4. What is your TiDB version? (Required)

MySQL> select tidb_version();
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tidb_version()                                                                                                                                                                                                                               |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Release Version: v8.0.0
Edition: Community
Git Commit Hash: 8ba1fa452b1ccdbfb85879ea94b9254aabba2916
Git Branch: HEAD
UTC Build Time: 2024-03-28 14:22:04
GoVersion: go1.21.6
Race Enabled: false
Check Table Before Drop: false
Store: tikv |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.002 sec)

yibin87 · 2024-04-29T06:44:33Z

Comparisons on the TEXT column seem to ignore whitespace. This is not a common case, so I am lowering the severity to major.

mysql> select * from t1 where t1.c0 = '                 ';
+----+
| c0 |
+----+
|    |
+----+
1 row in set (0.00 sec)
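
A hedged sketch of how one might check whether this match comes from the collation's pad attribute rather than from the TEXT type itself. It assumes a MySQL 8.0 session whose connection character set is utf8mb4; the two collation names are picked for illustration and do not come from the report.

```sql
-- Assumes MySQL 8.0 with a utf8mb4 connection character set.
-- PAD SPACE collations ignore trailing spaces in comparisons; NO PAD collations do not.
SELECT COLLATION_NAME, PAD_ATTRIBUTE
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE COLLATION_NAME IN ('utf8mb4_unicode_ci', 'utf8mb4_0900_ai_ci');

-- Compare an empty string with a single space under each collation.
SELECT '' = ' ' COLLATE utf8mb4_unicode_ci AS cmp_unicode_ci,
       '' = ' ' COLLATE utf8mb4_0900_ai_ci AS cmp_0900_ai_ci;
```

If the two comparisons disagree, the difference between servers is more likely a matter of default collation than of join or GROUP BY handling.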

yibin87 · 2024-04-29T06:46:01Z

/remove-severity critical

yibin87 · 2024-04-29T06:46:08Z

/severity major

fanrenhoo · 2024-04-29T13:48:44Z

/assign

xzhangxian1008 · 2024-05-08T02:03:04Z

select * from t1 left outer join t0 on t0.c0 = t1.c0;
Result in TiDB:

+------+------+
| c0   | c0   |
+------+------+
| xxx  | NULL |
| 3gf  | NULL |
|      |      |
| dddd | NULL |
+------+------+

Result in MySQL:

+------+------+
| c0   | c0   |
+------+------+
| xxx  | NULL |
| 3gf  | NULL |
|      | NULL |
| dddd | NULL |
+------+------+

fanrenhoo · 2024-05-08T13:00:19Z

This issue does not need a fix: if you create the tables with COLLATE utf8mb4_unicode_ci in MySQL, you get the same result as TiDB.
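
A minimal sketch of that check, assuming MySQL 8.0. The data and query are copied from the report; the only change is the explicit table collation, and the `DROP TABLE IF EXISTS` line is added here for convenience.

```sql
-- Sketch for MySQL 8.0: same data as the report, with an explicit PAD SPACE collation.
DROP TABLE IF EXISTS t0, t1;
CREATE TABLE t0 (c0 TEXT NOT NULL) COLLATE = utf8mb4_unicode_ci;
CREATE TABLE t1 (c0 TEXT NOT NULL) COLLATE = utf8mb4_unicode_ci;
INSERT INTO t0 VALUES (' '), ('dadfad'), ('2342dfad'), ('2dfad');
INSERT INTO t1 VALUES ('xxx'), ('3gf'), (''), ('dddd');

-- The query from the report, unchanged.
SELECT count(t1.c0) FROM t1 LEFT OUTER JOIN t0 ON t0.c0 = t1.c0 GROUP BY t0.c0;
```

Under a PAD SPACE collation the join condition treats '' and ' ' as equal, so one t1 row finds a match in t0 and GROUP BY t0.c0 should see two distinct keys instead of a single NULL group.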

xzhangxian1008 · 2024-05-09T03:05:53Z

/close

ti-chi-bot · 2024-05-09T03:05:56Z

@xzhangxian1008: Closing this issue.

xzhangxian1008 · 2024-05-09T03:07:51Z

TiDB and MySQL return the same results for the following statements under every collation available in TiDB, so TiDB's behavior is correct. Not a bug.

drop table t0;
CREATE TABLE t0 (
c0 TEXT NOT NULL COLLATE xxx
) collate=xxx;
INSERT INTO t0 VALUES (' '), ('dadfad'), ('2342dfad'), ('2dfad');
select * from t0 where c0='   ';

fanrenhoo · 2024-05-09T03:54:07Z

TiDB and MySQL return the same results for the following statements under every collation available in TiDB, so TiDB's behavior is correct. Not a bug.
drop table t0;
CREATE TABLE t0 (
c0 TEXT NOT NULL COLLATE xxx
) collate=xxx;
INSERT INTO t0 VALUES (' '), ('dadfad'), ('2342dfad'), ('2dfad');
select * from t0 where c0='   ';

Correct. Strictly speaking, we could also test the hash join case from the issue, because it goes through a different function at runtime.
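
One way to exercise that path, sketched under the assumption that the tables t0 and t1 from the report exist in a TiDB session. The optimizer hints are standard TiDB hints; pairing them with this query is an illustration, not something the thread ran.

```sql
-- Sketch for TiDB: run the report's query under different join algorithms.
SELECT /*+ HASH_JOIN(t1, t0) */ count(t1.c0)
FROM t1 LEFT OUTER JOIN t0 ON t0.c0 = t1.c0 GROUP BY t0.c0;

SELECT /*+ MERGE_JOIN(t1, t0) */ count(t1.c0)
FROM t1 LEFT OUTER JOIN t0 ON t0.c0 = t1.c0 GROUP BY t0.c0;

-- EXPLAIN shows which join operator the optimizer actually picked.
EXPLAIN SELECT /*+ HASH_JOIN(t1, t0) */ count(t1.c0)
FROM t1 LEFT OUTER JOIN t0 ON t0.c0 = t1.c0 GROUP BY t0.c0;
```

Comparing the two result sets against MySQL under the same collation would show whether both join implementations apply the same string comparison rules.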

sjyango added the type/bug This issue is a bug. label Apr 27, 2024

jebter added sig/execution SIG execution severity/critical labels Apr 28, 2024

ti-chi-bot bot added may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 may-affects-7.1 may-affects-7.5 may-affects-8.1 labels Apr 28, 2024

ti-chi-bot added affects-8.1 and removed may-affects-8.1 labels Apr 28, 2024

ti-chi-bot bot removed the severity/critical label Apr 29, 2024

ti-chi-bot bot added the severity/major label Apr 29, 2024

ti-chi-bot bot assigned fanrenhoo Apr 29, 2024

fanrenhoo mentioned this issue May 6, 2024

util: the incorrectly handle string compare with whitespace #53036

Closed

ti-chi-bot bot closed this as completed May 9, 2024

zanmato1984 removed affects-8.1 may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 labels May 9, 2024

zanmato1984 removed may-affects-7.1 may-affects-7.5 labels May 9, 2024
