Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nullif on two identical subselects produces non null value #52986

Closed
r33s3n6 opened this issue Apr 29, 2024 · 5 comments
Closed

nullif on two identical subselects produces non null value #52986

r33s3n6 opened this issue Apr 29, 2024 · 5 comments

Comments

@r33s3n6
Copy link

r33s3n6 commented Apr 29, 2024

1. Minimal reproduce step (Required)

Firstly, execute init.sql to create the table. Then executing error.sql yields unexpected results. Note that reproducing these results might not be entirely stable. Typically, it can be completed within three attempts. You can try executing error.sql multiple times or execute init.sql again to rebuild the table.

init.sql.txt
error.sql.txt

2. What did you expect to see? (Required)

The SQL statement applies nullif to two identical expressions select variance(c_k6tgklu10) from t_rl02, then counts the results. Since the result is NULL, the count should be 0.

3. What did you see instead (Required)

In the multi-node version, the count is 1.
output_re_main2.log
output_re_single2.log

4. What is your TiDB version? (Required)

Release Version: v8.0.0
Edition: Community
Git Commit Hash: 8ba1fa452b1ccdbfb85879ea94b9254aabba2916
Git Branch: HEAD
UTC Build Time: 2024-03-28 14:22:15
GoVersion: go1.21.4
Race Enabled: false
Check Table Before Drop: false
Store: tikv

topology:

distributed.yaml:

global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/tidb-deploy"
  data_dir: "/tidb-data"

pd_servers:
  - host: 10.0.2.31

tidb_servers:
  - host: 10.0.2.21

tikv_servers:
  - host: 10.0.2.11
  - host: 10.0.2.12
  - host: 10.0.2.13

monitoring_servers:
  - host: 10.0.2.8

grafana_servers:
  - host: 10.0.2.8

alertmanager_servers:
  - host: 10.0.2.8

tiflash_servers:
  - host: 10.0.2.32

single.yaml

global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/tidb-deploy"
  data_dir: "/tidb-data"

pd_servers:
  - host: 10.0.2.73

tidb_servers:
  - host: 10.0.2.72

tikv_servers:
  - host: 10.0.2.71

tiflash_servers:
  - host: 10.0.2.74

about us

We are the BASS team from the School of Cyber Science and Technology at Beihang University. Our main focus is on system software security, operating systems, and program analysis research, as well as the development of automated program testing frameworks for detecting software defects. Using our self-developed database vulnerability testing tool, we have identified the above-mentioned vulnerabilities in TiDB that may lead to database logic error.

@yibin87
Copy link
Contributor

yibin87 commented May 6, 2024

The error.sql.txt query's plan looks incorrect:

+--------------------------+----------+-----------+-----------------------------------------+--------------------------------+
| id                       | estRows  | task      | access object                           | operator info                  |
+--------------------------+----------+-----------+-----------------------------------------+--------------------------------+
| Projection_47            | 10000.00 | root      |                                         | 1->Column#35                   |
| └─IndexReader_51         | 10000.00 | root      |                                         | index:IndexFullScan_50         |
|   └─IndexFullScan_50     | 10000.00 | cop[tikv] | table:ref_2, index:c_uajclsr(c_uajclsr) | keep order:false, stats:pseudo |
+--------------------------+----------+-----------+-----------------------------------------+--------------------------------+

@yibin87
Copy link
Contributor

yibin87 commented May 6, 2024

/sig planner

@ti-chi-bot ti-chi-bot bot added the sig/planner SIG: Planner label May 6, 2024
@yibin87
Copy link
Contributor

yibin87 commented May 6, 2024

/remove-sig execution

@ti-chi-bot ti-chi-bot bot removed the sig/execution SIG execution label May 6, 2024
@time-and-fate
Copy link
Member

The core issue here is the (select variance(c_k6tgklu10) from t_rl02) subquery.
Since its output column type is DOUBLE, and tidb executes SQLs with intra-query concurrency, the order of calculations that happen in the query would be unstable, therefore the execution result will be unstable due to the precision loss of the floating number.

> select variance(c_k6tgklu10) from t_rl02;
+-----------------------+
| variance(c_k6tgklu10) |
+-----------------------+
|   0.24595041322314046 |
+-----------------------+
1 row in set (0.00 sec)

> select variance(c_k6tgklu10) from t_rl02;
+-----------------------+
| variance(c_k6tgklu10) |
+-----------------------+
|   0.24595041322314048 |
+-----------------------+
1 row in set (0.00 sec)

> select variance(c_k6tgklu10) from t_rl02;
+-----------------------+
| variance(c_k6tgklu10) |
+-----------------------+
|   0.24595041322314043 |
+-----------------------+
1 row in set (0.00 sec)

That makes the two same subqueries in the SQL may have different results. Then the nullif expression will be 1 sometimes and 0 otherwise. Finally, the query result would be unstable. Most of the time it will be all 1s, but sometimes it would also be 0s.

MySQL document (B.3.4.8 Problems with Floating-Point Values) also has descriptions on this issue and suggestions that resolve this issue by rewriting the SQL.

@time-and-fate
Copy link
Member

time-and-fate commented May 13, 2024

Currently, we'd like to consider this as the expected behavior and close this issue later.
Feel free to comment or reopen if you have further questions.

@time-and-fate time-and-fate closed this as not planned Won't fix, can't repro, duplicate, stale May 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants