Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In the GROUP BY clause, TiDB incorrectly handled ' ' and NULL #52938

Closed
sjyango opened this issue Apr 27, 2024 · 10 comments
Closed

In the GROUP BY clause, TiDB incorrectly handled ' ' and NULL #52938

sjyango opened this issue Apr 27, 2024 · 10 comments
Assignees
Labels
severity/major sig/execution SIG execution type/bug This issue is a bug.

Comments

@sjyango
Copy link

sjyango commented Apr 27, 2024

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

CREATE TABLE t0 (
c0 TEXT NOT NULL
);
INSERT INTO t0 VALUES (' '), ('dadfad'), ('2342dfad'), ('2dfad');

CREATE TABLE t1 (
c0 TEXT NOT NULL
);
INSERT INTO t1 VALUES ('xxx'), ('3gf'), (''), ('dddd');

SELECT count(t1.c0) FROM t1 LEFT OUTER JOIN t0 ON t0.c0 = t1.c0 GROUP BY t0.c0;

2. What did you expect to see? (Required)

+--------------+
| count(t1.c0) |
+--------------+
|            4 |
+--------------+

3. What did you see instead (Required)

MySQL> SELECT count(t1.c0) FROM t1 LEFT OUTER JOIN t0 ON t0.c0 = t1.c0 GROUP BY t0.c0;
+--------------+
| count(t1.c0) |
+--------------+
|            3 |
|            1 |
+--------------+
2 rows in set (0.004 sec)

4. What is your TiDB version? (Required)

MySQL> select tidb_version();
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tidb_version()                                                                                                                                                                                                                               |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Release Version: v8.0.0
Edition: Community
Git Commit Hash: 8ba1fa452b1ccdbfb85879ea94b9254aabba2916
Git Branch: HEAD
UTC Build Time: 2024-03-28 14:22:04
GoVersion: go1.21.6
Race Enabled: false
Check Table Before Drop: false
Store: tikv |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.002 sec)
@yibin87
Copy link
Contributor

yibin87 commented Apr 29, 2024

The comparison with text type column seems ignoring white spaces. Not common cases. Lower priority to major.

mysql> select * from t1 where t1.c0 = '                 ';
+----+
| c0 |
+----+
|    |
+----+
1 row in set (0.00 sec)

@yibin87
Copy link
Contributor

yibin87 commented Apr 29, 2024

/remove-severity critical

@yibin87
Copy link
Contributor

yibin87 commented Apr 29, 2024

/severity major

@fanrenhoo
Copy link
Contributor

/assign

@xzhangxian1008
Copy link
Contributor

select * from t1 left outer join t0 on t0.c0 = t1.c0;
Result in tidb:

+------+------+
| c0   | c0   |
+------+------+
| xxx  | NULL |
| 3gf  | NULL |
|      |      |
| dddd | NULL |
+------+------+

Result in mysql:

+------+------+
| c0   | c0   |
+------+------+
| xxx  | NULL |
| 3gf  | NULL |
|      | NULL |
| dddd | NULL |
+------+------+

@fanrenhoo
Copy link
Contributor

This issue do not need fix, cause create table with collate utf8mb4_unicode_ci in mysql, you will get the same result with tidb.

@xzhangxian1008
Copy link
Contributor

/close

Copy link

ti-chi-bot bot commented May 9, 2024

@xzhangxian1008: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-chi-bot ti-chi-bot bot closed this as completed May 9, 2024
@xzhangxian1008
Copy link
Contributor

We will get same results between tidb and mysql with the following statements with all collations in tidb. So the behaviour in tidb is correct. Not a bug.

drop table t0;
CREATE TABLE t0 (
c0 TEXT NOT NULL COLLATE xxx
) collate=xxx;
INSERT INTO t0 VALUES (' '), ('dadfad'), ('2342dfad'), ('2dfad');
select * from t0 where c0='   ';

@fanrenhoo
Copy link
Contributor

We will get same results between tidb and mysql with the following statements with all collations in tidb. So the behaviour in tidb is correct. Not a bug.

drop table t0;
CREATE TABLE t0 (
c0 TEXT NOT NULL COLLATE xxx
) collate=xxx;
INSERT INTO t0 VALUES (' '), ('dadfad'), ('2342dfad'), ('2dfad');
select * from t0 where c0='   ';

correct. strictly, we could try test with hashjoin case like the issue states also, cause it goes different func when running

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
severity/major sig/execution SIG execution type/bug This issue is a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants