Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: fix the lower bound when converting numbers less than 0 to unsigned integers #8544

Merged
merged 24 commits into from Jan 10, 2019

Conversation

@exialin
Copy link
Contributor

commented Dec 2, 2018

What problem does this PR solve?

Fix issue #6360 so that when set sql_mode = '', values less than 0 will be clipped to 0 for unsigned integer types. Besides, some other cases are also fixed.

  • Insert

MySQL:

mysql> create table t(a bigint unsigned);
Query OK, 0 rows affected (5.01 sec)

mysql> set sql_mode='';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> insert into t value(-1.111);
Query OK, 1 row affected, 1 warning (0.16 sec)

mysql> insert into t value('-1.111');
Query OK, 1 row affected, 1 warning (0.17 sec)

mysql> select * from t;
+------+
| a    |
+------+
|    0 |
|    0 |
+------+
2 rows in set (0.00 sec)

Previously in TiDB:

tidb> create table t(a bigint unsigned);
Query OK, 0 rows affected (0.01 sec)

tidb> set sql_mode='';
Query OK, 0 rows affected (0.01 sec)

tidb> insert into t value(-1.111);
Query OK, 1 row affected, 1 warning (0.00 sec)

tidb> insert into t value('-1.111');
Query OK, 1 row affected, 1 warning (0.00 sec)

tidb> select * from t;
+----------------------+
| a                    |
+----------------------+
| 18446744073709551615 |
|                 NULL |
+----------------------+
2 rows in set (0.00 sec)
  • Update

See issue #8623. If the problem described in that issue is fixed, the wrong behaviour can be fixed by rountines of this pull request (test cases should be added later).

  • Load data

Data file:

$ cat a.csv 
-1
-18446744073709551615
-18446744073709551616

MySQL:

mysql> create table t(a bigint unsigned);
Query OK, 0 rows affected (0.32 sec)

mysql> load data local infile '/home/exialin/Test/6360/a.csv' into table t;
Query OK, 3 rows affected, 3 warnings (0.07 sec)
Records: 3  Deleted: 0  Skipped: 0  Warnings: 3

mysql> select * from t;
+----------------------+
| a                    |
+----------------------+
|                    0 |
|                    0 |
| 18446744073709551615 |
+----------------------+
3 rows in set (0.00 sec)

Previously in TiDB:

tidb> load data local infile '/home/exialin/Test/6360/a.csv' into table t;
Query OK, 1 row affected, 1 warning (0.01 sec)

tidb> select * from t;
+------+
| a    |
+------+
| NULL |
| NULL |
| NULL |
+------+
3 rows in set (0.00 sec)

After this fix:

mysql> select * from t;
+------+
| a    |
+------+
|    0 |
|    0 |
|    0 |
+------+

Question: is the third row of MySQL reasonable?

  • Alter table

TiDB doesn't support the following alter table query currently. Just post for completeness.

MySQL:

mysql> create table t (a bigint);
Query OK, 0 rows affected (0.41 sec)

mysql> insert into t value (-1);
Query OK, 1 row affected (0.11 sec)

mysql> alter table t modify column a bigint unsigned;
ERROR 1264 (22003): Out of range value for column 'a' at row 1
mysql> set sql_mode='';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> alter table t modify column a bigint unsigned;
Query OK, 1 row affected, 1 warning (1.15 sec)
Records: 1  Duplicates: 0  Warnings: 1

mysql> select * from t;
+------+
| a    |
+------+
|    0 |
+------+
1 row in set (0.00 sec)

What is changed and how it works?

MySQL doc says:

If no restrictive modes are enabled, MySQL clips the value to the appropriate endpoint of the column data type range and stores the resulting value instead.

When strict SQL mode is not enabled, column-assignment conversions that occur due to clipping are reported as warnings for ALTER TABLE, LOAD DATA INFILE, UPDATE, and multiple-row INSERT statements.

So this makes the result different from other conversions like CAST() function. When the number to be inserted is less than 0, it should be clipped to 0.

TODO: the warning message has not been fixed yet. It would be better to fix them in other pull requests.

Check List

Tests

  • Unit test

Code changes

  • Has exported variable/fields change

Side effects

  • Increased code complexity. It seems that the implementation is a bit unclean.

This change is Reviewable

@sre-bot

This comment has been minimized.

Copy link

commented Dec 2, 2018

Hi contributor, thanks for your PR.

This patch needs to be approved by someone of admins. They should reply with "/ok-to-test" to accept this PR for running test automatically.

@exialin

This comment has been minimized.

Copy link
Contributor Author

commented Dec 2, 2018

Currently make dev will fail:

OOPS: 113 passed, 1 FAILED
--- FAIL: TestT (0.44s)
FAIL
coverage: 77.5% of statements
FAIL	github.com/pingcap/tidb/types	0.912s

But it doesn't show which line caused the error. I tried to pick them out but still missed some. So is there any way to locate the line?

@shenli

This comment has been minimized.

Copy link
Member

commented Dec 3, 2018

@exialin You can find out which case failed in the output of make dev.

@exialin exialin changed the title [WIP] types: fix the lower bound when converting numbers less than 0 to uns… *: fix the lower bound when converting numbers less than 0 to unsigned integers Dec 8, 2018
@tiancaiamao

This comment has been minimized.

Copy link
Contributor

commented Dec 8, 2018

PTAL @XuHuaiyu

@tiancaiamao

This comment has been minimized.

Copy link
Contributor

commented Dec 8, 2018

6,14c6,14
<     └─IndexJoin_24	0.00	root	inner join, inner:IndexLookUp_23, outer key:gad.aid, inner key:test.dd.aid, other cond:eq(test.dd.ip, gad.ip), gt(test.dd.t, gad.t)
<       ├─IndexLookUp_23	0.00	root	
<       │ ├─IndexScan_20	10.00	cop	table:dd, index:aid, dic, range: decided by [gad.aid gad.ip], keep order:false, stats:pseudo
<       │ └─Selection_22	0.00	cop	eq(test.dd.bm, 0), eq(test.dd.pt, "android"), gt(test.dd.t, 1478143908)
<       │   └─TableScan_21	10.00	cop	table:dd, keep order:false, stats:pseudo
<       └─IndexLookUp_33	0.00	root	
<         ├─IndexScan_30	3333.33	cop	table:gad, index:t, range:(1478143908,+inf], keep order:false, stats:pseudo
<         └─Selection_32	0.00	cop	eq(gad.bm, 0), eq(gad.pt, "android")
<           └─TableScan_31	3333.33	cop	table:st, keep order:false, stats:pseudo
\ No newline at end of file
---
>     └─IndexJoin_24	0.00	root	inner join, inner:IndexLookUp_23, outer key:gad.aid, inner key:test.dd.aid, other cond:eq(gad.ip, test.dd.ip), gt(test.dd.t, gad.t)
>       ├─IndexLookUp_33	0.00	root	
>       │ ├─IndexScan_30	3333.33	cop	table:gad, index:t, range:(1478143908,+inf], keep order:false, stats:pseudo
>       │ └─Selection_32	0.00	cop	eq(gad.bm, 0), eq(gad.pt, "android")
>       │   └─TableScan_31	3333.33	cop	table:st, keep order:false, stats:pseudo
>       └─IndexLookUp_23	0.00	root	
>         ├─IndexScan_20	10.00	cop	table:dd, index:aid, dic, range: decided by [gad.aid gad.ip], keep order:false, stats:pseudo
>         └─Selection_22	0.00	cop	eq(test.dd.bm, 0), eq(test.dd.pt, "android"), gt(test.dd.t, 1478143908)
>           └─TableScan_21	10.00	cop	table:dd, keep order:false, stats:pseudo
\ No newline at end of file

The order of explain result of the IndexJoin is not stable? @winoros

executor/write_test.go Show resolved Hide resolved
sessionctx/stmtctx/stmtctx.go Outdated Show resolved Hide resolved
@winoros

This comment has been minimized.

Copy link
Member

commented Dec 10, 2018

We're checking it.

@exialin

This comment has been minimized.

Copy link
Contributor Author

commented Dec 15, 2018

After more tests, I found there are two special cases hard to resolve:

  1. the value is slightly larger than max uint64

For example:

tidb> create table t(a bigint unsigned);
Query OK, 0 rows affected (0.03 sec)

tidb> insert into t value (18446744073709551616);
Query OK, 1 row affected (0.00 sec)


tidb> select * from t;
+---------------------+
| a                   |
+---------------------+
| 9223372036854775808 |
+---------------------+
1 row in set (0.00 sec)

tidb> insert into t value (18446744073709554616);
ERROR 1264 (22003): Out of range value for column 'a' at row 1

Here 18446744073709551616 (2^64) should cause an error, but it is truncated to an incorrect value. Because 18446744073709551616 is treated as a float, so it goes to types.ConvertFloatToUint. Due to precision loss of float, it doesn't cause an overflow error.

  1. when not in strict mode, values in certain range still cause errors instead of warnings

For example:

tidb> set sql_mode='';
Query OK, 0 rows affected (0.01 sec)

tidb> insert into t value (-9223372036854775809);
ERROR 1264 (22003): Out of range value for column 'a' at row 1
tidb> insert into t value (-18446744073709551615);
ERROR 1264 (22003): Out of range value for column 'a' at row 1

To be compatible with MySQL, here values in [-18446744073709551615, -9223372036854775809] should be clipped to 0. This is a rather special case that only happens for values in this range. Take a look at insert_common.go/evalRow:

val, err := expr.Eval(e.evalBuffer.ToRow())
if err = e.handleErr(e.insertColumns[i], &val, rowIdx, err); err != nil {
return nil, errors.Trace(err)
}
val1, err := table.CastValue(e.ctx, val, e.insertColumns[i].ToInfo())
if err = e.handleErr(e.insertColumns[i], &val, rowIdx, err); err != nil {
return nil, errors.Trace(err)
}

Other integer literals are evaluated correctly in expr.Eval, but values in this range will cause errors, hence not going to table.CastValue. For a value like -9223372036854775809, here expr is a ScalarFunction which represents builtinUnaryMinusIntSig, which means it is treated as a Minus expression and an integer. Since 9223372036854775809 overlows int64, an error will be returned.

I tried to do some workaround like making ScalarFunction.Eval return 0 and appending the error in warning queue, but it would fail other test cases and it isn't a clean way. Perhaps the best way is to evaluate values in this range as float.

@zz-jason Do you have any suggestion for these two cases?

@zz-jason

This comment has been minimized.

Copy link
Member

commented Jan 2, 2019

@exialin Can not find out any easy way to fix them either. Could you file a github issue to record these two cases so that one day we may be able to fix it?

@exialin

This comment has been minimized.

Copy link
Contributor Author

commented Jan 4, 2019

@zz-jason OK. Maybe after changes in this PR are merged? The two cases are somewhat related with this PR.

@codecov-io

This comment has been minimized.

Copy link

commented Jan 4, 2019

Codecov Report

Merging #8544 into master will decrease coverage by 0.02%.
The diff coverage is 55.55%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8544      +/-   ##
==========================================
- Coverage   67.59%   67.57%   -0.03%     
==========================================
  Files         363      363              
  Lines       75317    75326       +9     
==========================================
- Hits        50913    50901      -12     
- Misses      19918    19932      +14     
- Partials     4486     4493       +7
Impacted Files Coverage Δ
executor/load_data.go 82.35% <ø> (ø) ⬆️
executor/distsql.go 73% <100%> (ø) ⬆️
expression/errors.go 94.44% <100%> (ø) ⬆️
executor/executor.go 66.8% <100%> (+0.04%) ⬆️
types/convert.go 85.75% <25%> (-1.08%) ⬇️
types/datum.go 68.92% <38.46%> (ø) ⬆️
expression/builtin_cast.go 79.96% <75%> (-0.02%) ⬇️
ddl/delete_range.go 74.28% <0%> (-5.72%) ⬇️
executor/join.go 77.89% <0%> (-0.82%) ⬇️
executor/index_lookup_join.go 77.28% <0%> (-0.64%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b40b14...e1141b3. Read the comment docs.

exialin added 3 commits Jan 4, 2019
fix
@zz-jason

This comment has been minimized.

Copy link
Member

commented Jan 5, 2019

@zz-jason OK. Maybe after changes in this PR are merged? The two cases are somewhat related with this PR.

OK

exialin and others added 2 commits Jan 8, 2019
@winoros

This comment has been minimized.

Copy link
Member

commented Jan 9, 2019

/run-all-tests

@XuHuaiyu

This comment has been minimized.

Copy link
Contributor

commented Jan 9, 2019

The following sql failed, it seems to be unexpected.

CREATE TABLE `table1_int_autoinc` (
`col_int_unsigned_not_null_unique` int(10) unsigned NOT NULL,
UNIQUE KEY `col_int_unsigned_not_null_unique` (`col_int_unsigned_not_null_unique`)
) ;
DELETE FROM `table1_int_autoinc` WHERE `col_int_unsigned_not_null_unique` = -1438187520 ;
@XuHuaiyu XuHuaiyu requested review from jackysp and zz-jason Jan 9, 2019
@exialin

This comment has been minimized.

Copy link
Contributor Author

commented Jan 9, 2019

/run-all-tests

@exialin

This comment has been minimized.

Copy link
Contributor Author

commented Jan 9, 2019

/run-all-tests

@exialin

This comment has been minimized.

Copy link
Contributor Author

commented Jan 9, 2019

@XuHuaiyu CI test fails when executing UPDATE `table10_int_autoinc` SET `col_int` = 335740928 WHERE `col_int_unsigned_not_null_unique` = -798228480;, because -798228480 is incorrectly converted.
I'm wondering which is better: fix it in this PR or fix it together with #8623 later. Or fix #8623 in this PR...


// ShouldIgnoreError indicates whether we should ignore the error when type conversion overflows,
// so we can leave it for further processing like clipping values less than 0 to 0 for unsigned integer types.
func (sc *StatementContext) ShouldIgnoreError() bool {

This comment has been minimized.

Copy link
@jackysp

jackysp Jan 10, 2019

Member

It is better to clarify what kind of error in the function name.

This comment has been minimized.

Copy link
@exialin

exialin Jan 10, 2019

Author Contributor

Modified as you suggested.

Copy link
Member

left a comment

LGTM

@XuHuaiyu

This comment has been minimized.

Copy link
Contributor

commented Jan 10, 2019

/run-all-tests

Copy link
Member

left a comment

LGTM

@XuHuaiyu

This comment has been minimized.

Copy link
Contributor

commented Jan 10, 2019

It will be better to fix #8623 in an individual PR.

@zz-jason zz-jason merged commit 68ddb7f into pingcap:master Jan 10, 2019
4 checks passed
4 checks passed
ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
idc-jenkins-ci-tidb/build Jenkins job succeeded.
Details
license/cla Contributor License Agreement is signed.
Details
@zz-jason

This comment has been minimized.

Copy link
Member

commented Jan 10, 2019

@exialin Could you cherry-pick this commit to release-2.1 and release-2.0 branch?

@exialin

This comment has been minimized.

Copy link
Contributor Author

commented Jan 10, 2019

@zz-jason OK, I'll do it later and file a issue about the unresolved cases. Thanks for the review from all of you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.