sql: support -2^63 .. 2^64-1 integer type #3810

Totktonada · 2018-11-16T05:00:59Z

We need a SQL type to match native Tarantool's 'integer' space format / index format type, which represents -2^63 .. 2^64-1 range of integral numbers. Those number are encoded as int64 or uint64 msgpack type depending of value.

Now we support the following integral number types in Tarantool/SQL: INT / INTEGER. SMALLINT and BIGINT intended to be supported after #3687.

There are the following ways to implement SQL type for the -2^63 .. 2^64-1 range:

Expand range for INT and INTEGER (it is now -2^63 .. 2^63-1). Expand BIGINT too. Don't sure about SMALLINT.
Expand BIGINT range, leave SMALLINT, INT, INTEGER as is.
Introduce a new SQL type, leave existing / planned integral SQL types as is.

I vote for the first way, because Tarantool/SQL type INTEGER is expected to represent the same range as native Tarantool's 'integer'.

Totktonada · 2018-11-16T05:57:16Z

@kostja says it should be INT, so it seems we'll going with the first way.

Korablev77 · 2018-11-30T17:36:33Z

Due to this feature we may observe strange behaviour during Lua-SQL interaction:

\set language sql
CREATE TABLE t (id INT PRIMARY KEY, a INT);
\set language lua
box.space.T:insert{1, 18446744073709551615ULL}
\set language sql
SELECT a + 1 FROM t;
---
  - [1.844674407371e+19]
...

kyukhin · 2019-01-25T13:07:51Z

Let's do this:

SQL's INTEGER should cover same range as Lua's: -2^63..2^64-1
In the scope of this issue we do not introduce new types in SQL
In case, when calculated INTEGER value become out-of-range: runtime error is emitted by VDBE
Output for type INTEGER should always be numeric, no conversion to FP value

Totktonada · 2019-02-18T13:29:08Z

Note re ODBC:

The C Data Types part of ODBC standard explicitly states that
SQLINTEGER is defined as long int and SQLUINTEGER is defined as
unsigned long int. How to expose our INTEGER (-2^63..2^64-1) SQL
type?

AFAIR, there is also a document around JDBC that provide some recommendations for mapping SQL types to Java ones that recommends to map INTEGER into long or int (don't sure), but it seems it is not mandatory. (TBD: find the document.)

However ODBC wording looks as requirement rather then recommendation. There is the sense here: we have native BigInteger in Java, but have no a native (and standard) type in C broader then int64_t / uint64_t.

Korablev77 · 2019-02-18T13:38:41Z

Another one evidence that suggested by @kostja and supported by @kyukhin implementation is inappropriate. We are going to suffer if we attempt to follow such weird (and likely to be wrong) range for INTEGER.

Since previous commit allows us to raise an error during msgpack decode inside VDBE, lets do this if decoded integer is out of [INT64_MIN, INT64_MAX] range and set "integer is overflowed" diagnostic message. Closes #3735 Workaround for #3810

Korablev77 · 2019-02-20T15:04:18Z

upd.: I've banned opportunity to operate on integers in range [INT64_MAX, UINT64_MAX] in scope of this issue: #3735, even if they are fetched from space tuples. So examples like:

box.space.T:insert{1, 18446744073709551615ULL}
\set language sql
SELECT a + 1 FROM t;
---
  - [1.844674407371e+19]
...

now are resulting in error: 'Failed to execute SQL statement: integer is overflowed'

If this is enough for 2.1.1, I suggest to move this ticket to 2.2.

Since previous commit allows us to raise an error during msgpack decode inside VDBE, lets do this if decoded integer is out of [INT64_MIN, INT64_MAX] range and set "integer is overflowed" diagnostic message. Closes #3735 Workaround for #3810

pgulutzan · 2019-03-21T18:53:40Z

I suggest changing the error message from "integer is overflowed" to "integer overflow".

Fixes an error in the conversion functions. The cast to integer didn't take into account the 'unsigned' bit. Fixes the error in overflow check inside the sqlMulInt64(). Makes the overflow check more precisely in sql_atoi64(). Fixes the error message, and affected tests. Part of #3810

Removes unused code. Part of #3810

Adds functions setting and retrieving unsigned value. Adds a new member to struct sql_bind and struct Mem to keep an unsigned value atoi functions return 0 on success and -1 on error and return the signed/unsigned property by pointer. The cast operations take the unsigned into account. Fixes the affected tests. The sql function avg() uses uint64. Part of #3810

Returns correct text name for unsigned data type. Applies a coding standard. Improves mapping of vdbe data types to sql data types. Part of #3810

Adds tests. Closes #3810

Adapts auxiliary functions for supporting arithmetic operations with unsigned integers. VDBE distinguishes signed and unsigned integers. SELECT query correctly returns values greater than INT64_MAX. Part of #3810

Fixes an error in the conversion functions. The cast to integer didn't take into account the 'unsigned' bit. Fixes the error in overflow check inside the sqlMulInt64(). Makes the overflow check more precisely in sql_atoi64(). Fixes the error message, and affected tests. Part of #3810

Removes unused code. Part of #3810

We are going to allow using unsigned values in SQL and extend range of INTEGER type. Hence, we should be able to parse and operate on integers in range of [2^63, 2^64 - 1]. Current mechanism which involves sql_atoi64() function doesn't allow this. Let's refactor this function: firstly, get rid of manual parsing and use strtoll() and strtoull() functions from standard library. Then, let's return sign of parsed literal. In case of success now function returns 0, -1 otherwise. This patch also inlines sql_dec_or_hex_to_i64() to place of its only usage: it makes code cleaner and more straightforward. Needed for #3810 Needed for #4015

As it was stated in the previous commit message, we are going to support operations on unsigned values. Since unsigned and signed integers have different memory representations, to provide correct results of arithmetic operations we should be able to tell whether value is signed or not. This patch introduces new type of value placed in VDBE memory cell - MEM_UInt. This flag means that value is integer and greater than zero, hence can be fitted in range [0, 2^64 - 1]. Such approach would make further replacing MEM_* flags with MP_ format types quite easy: during decoding and encoding msgpack we assume that negative integers have MP_INT type and positive - MP_UINT. We also add and refactor several auxiliary helpers to operate on integers. Note that current changes don't add ability to operate on unsigned integers - it is still unavailable. Needed for #3810 Needed for #4015

Let's patch internal VDBE routines which add, subtract, multiply, divide and calculate the remainder of division to allow them take operands of unsigned type. In this respect, each operator now accepts signs of both operands and return sign of result. Part of #3810 Part of #4015