New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: Make INT an alias for INT8 #32831

Merged
merged 1 commit into from Dec 10, 2018

Conversation

Projects
None yet
5 participants
@bobvawter
Copy link
Contributor

bobvawter commented Dec 4, 2018

This is the first in a series of changes to allow INT to eventually be
redefined as INT4.

Currently, a "naked" INT is generally treated as a 64-bit int type, but INT is
separate from INT8. This change eliminates INT as a distinct type in favor of
always using INT8. This is analogous to how we treat FLOAT/FLOAT8.

The coltypes.Int global has been removed and we should no longer produce type
objects that have a zero Width. The existing interpretation of zero-Width types
as 64-bit values will continue indefinitely to preserve backwards-compatibility
with existing descriptors.

In general, the changes to tests are due to the widespread use of INT as an
arbitary datatype. Except in a few cases where we are testing how "INT" is
interpreted as a datatype, the tests have been changed to simply use INT8, to
avoid another major round of text fixups as work on #26925 continues.

Release note (sql change): The INT type is now treated as an alias for INT8.

@bobvawter bobvawter requested review from mjibson and knz Dec 4, 2018

@bobvawter bobvawter requested review from cockroachdb/cli-prs as code owners Dec 4, 2018

@cockroach-teamcity

This comment has been minimized.

Copy link
Member

cockroach-teamcity commented Dec 4, 2018

This change is Reviewable

@mjibson
Copy link
Member

mjibson left a comment

Overall LGTM

{"INTEGER", "CREATE TABLE a (b INT8)", &coltypes.TInt{Width: 64}},
}
for i, d := range testData {
sql := fmt.Sprintf("CREATE TABLE a (b %s)", d.str)

This comment has been minimized.

@mjibson

mjibson Dec 4, 2018

Member

I encourage subtests here. That will allow using t.Fatal instead of t.Error and removing the various continues.

bobvawter added a commit to bobvawter/cockroach that referenced this pull request Dec 5, 2018

sql: Add default_int_size to control INT alias
This is a follow-up from cockroachdb#32831 to support cockroachdb#26925.

Release note (sql change): A new session variable `default_int_size` and
cluster setting `sql.default_int_size` have been added to control how the INT
type is interpreted. The default value, 8, causes the INT type in inputs to be
interpreted as an alias for INT8, which has been the historical default.
Clients that expect INT to be a 32-bit value, can set `default_int_size` to 4,
which will cause INT to be an alias for INT4. Please note that due to cockroachdb#32846,
`SET default_int_size` does not take effect until the next statement batch is
executed.

bobvawter added a commit to bobvawter/cockroach that referenced this pull request Dec 5, 2018

sql: Add default_int_size to control INT alias
This is a follow-up from cockroachdb#32831 to support cockroachdb#26925.

Release note (sql change): A new session variable `default_int_size` and
cluster setting `sql.default_int_size` have been added to control how the INT
type is interpreted. The default value, 8, causes the INT type in inputs to be
interpreted as an alias for INT8, which has been the historical default.
Clients that expect INT to be a 32-bit value, can set `default_int_size` to 4,
which will cause INT to be an alias for INT4. Please note that due to cockroachdb#32846,
`SET default_int_size` does not take effect until the next statement batch is
executed.
@knz
Copy link
Member

knz left a comment

Very nicely done. Just a few minor comments remaining.

Reviewed 130 of 130 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/serial.go, line 74 at r1 (raw file):

		// If unique_rowid() or virtual sequences are requested, we have
		// no choice but to use the full-width integer type, no matter
		// which serial size was requested, otherwise the values will not fit.

You need to add a new TODO here; when the type SERIAL is specified and the default int size is 4, we may want to flip the behavior here and force the use of a real SQL sequence.

Please also file a follow-up issue so we can investigate this further with example client apps interested in this change.


pkg/sql/distsqlplan/physical_plan.go, line 906 at r1 (raw file):

func equivalentTypes(c, other *sqlbase.ColumnType) bool {
	// Convert a "naked" INT type to INT8
	if c.SemanticType == sqlbase.ColumnType_INT && c.Width == 0 {

lhs := *c

then modify lhs in-place

(this avoids escaping to the heap. Please check the result with the escape analysis tool. In case of doubt double-check with the original author of the code.)


pkg/sql/sem/tree/normalize_test.go, line 44 at r1 (raw file):

		{`(a)`, `a`},
		{`((((a))))`, `a`},
		{`CAST(NULL AS INT2)`, `CAST(NULL AS INT8)`},

Add a comment to explain why this happens.


pkg/sql/sem/tree/type_check_test.go, line 45 at r1 (raw file):

		{`NULL || 'hello'`, `NULL`},
		{`NULL || 'hello'::bytes`, `NULL`},
		{`NULL::int4`, `NULL::INT4`},

this one is ok

@bobvawter bobvawter force-pushed the bobvawter:int4-2-2 branch from 20cea89 to 155dc4f Dec 6, 2018

@bobvawter
Copy link
Contributor

bobvawter left a comment

@BramGruneir, whom I forgot to invite to the review.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/serial.go, line 74 at r1 (raw file):

Previously, knz (kena) wrote…

You need to add a new TODO here; when the type SERIAL is specified and the default int size is 4, we may want to flip the behavior here and force the use of a real SQL sequence.

Please also file a follow-up issue so we can investigate this further with example client apps interested in this change.

Done.


pkg/sql/distsqlplan/physical_plan.go, line 906 at r1 (raw file):

Previously, knz (kena) wrote…

lhs := *c

then modify lhs in-place

(this avoids escaping to the heap. Please check the result with the escape analysis tool. In case of doubt double-check with the original author of the code.)

Done.


pkg/sql/sem/tree/col_types_test.go, line 106 at r1 (raw file):

Previously, mjibson (Matt Jibson) wrote…

I encourage subtests here. That will allow using t.Fatal instead of t.Error and removing the various continues.

Done.


pkg/sql/sem/tree/normalize_test.go, line 44 at r1 (raw file):

Previously, knz (kena) wrote…

Add a comment to explain why this happens.

Done.

@knz

knz approved these changes Dec 7, 2018

Copy link
Member

knz left a comment

:lgtm: modulo rebase and resolution of the remaining CI failures

Reviewed 24 of 24 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/serial.go, line 76 at r2 (raw file):

		// which serial size was requested, otherwise the values will not fit.
		//
		// TODO: Follow up with https://github.com/cockroachdb/cockroach/issues/32534

We have a linter it should be TODO(bob): ...

@bobvawter bobvawter force-pushed the bobvawter:int4-2-2 branch 2 times, most recently from d3e73fd to 00997e7 Dec 10, 2018

sql: Make INT an alias for INT8
This is the first in a series of changes to allow INT to eventually be
redefined as INT4.

Currently, a "naked" INT is generally treated as a 64-bit int type, but INT is
separate from INT8. This change eliminates INT as a distinct type in favor of
always using INT8. This is analogous to how we treat FLOAT/FLOAT8.

The coltypes.Int global has been removed and we should no longer produce type
objects that have a zero Width. The existing interpretation of zero-Width types
as 64-bit values will continue indefinitely to preserve backwards-compatibility
with existing descriptors.

In general, the changes to tests are due to the widespread use of INT as an
arbitary datatype. Except in a few cases where we are testing how "INT" is
interpreted as a datatype, the tests have been changed to simply use INT8, to
avoid another major round of text fixups as work on #26925 continues.

Release note (sql change): The INT type is now treated as an alias for INT8.

@bobvawter bobvawter force-pushed the bobvawter:int4-2-2 branch from 00997e7 to d333976 Dec 10, 2018

@bobvawter

This comment has been minimized.

Copy link
Contributor

bobvawter commented Dec 10, 2018

bors r+

craig bot pushed a commit that referenced this pull request Dec 10, 2018

Merge #32831
32831: sql: Make INT an alias for INT8 r=bobvawter a=bobvawter

This is the first in a series of changes to allow INT to eventually be
redefined as INT4.

Currently, a "naked" INT is generally treated as a 64-bit int type, but INT is
separate from INT8. This change eliminates INT as a distinct type in favor of
always using INT8. This is analogous to how we treat FLOAT/FLOAT8.

The coltypes.Int global has been removed and we should no longer produce type
objects that have a zero Width. The existing interpretation of zero-Width types
as 64-bit values will continue indefinitely to preserve backwards-compatibility
with existing descriptors.

In general, the changes to tests are due to the widespread use of INT as an
arbitary datatype. Except in a few cases where we are testing how "INT" is
interpreted as a datatype, the tests have been changed to simply use INT8, to
avoid another major round of text fixups as work on #26925 continues.

Release note (sql change): The INT type is now treated as an alias for INT8.

Co-authored-by: Bob Vawter <bob@cockroachlabs.com>
@craig

This comment has been minimized.

Copy link

craig bot commented Dec 10, 2018

Build succeeded

@craig craig bot merged commit d333976 into cockroachdb:master Dec 10, 2018

3 checks passed

GitHub CI (Cockroach) TeamCity build finished
Details
bors Build succeeded
Details
license/cla Contributor License Agreement is signed.
Details

bobvawter added a commit to bobvawter/cockroach that referenced this pull request Dec 10, 2018

sql: Add default_int_size to control INT alias
This is a follow-up from cockroachdb#32831 to support cockroachdb#26925.

Release note (sql change): A new session variable `default_int_size` and
cluster setting `sql.default_int_size` have been added to control how the INT
type is interpreted. The default value, 8, causes the INT type in inputs to be
interpreted as an alias for INT8, which has been the historical default.
Clients that expect INT to be a 32-bit value, can set `default_int_size` to 4,
which will cause INT to be an alias for INT4. Please note that due to cockroachdb#32846,
`SET default_int_size` does not take effect until the next statement batch is
executed.

bobvawter added a commit to bobvawter/cockroach that referenced this pull request Dec 10, 2018

sql: Add default_int_size to control INT alias
This is a follow-up from cockroachdb#32831 to support cockroachdb#26925.

Release note (sql change): A new session variable `default_int_size` and
cluster setting `sql.default_int_size` have been added to control how the INT
type is interpreted. The default value, 8, causes the INT type in inputs to be
interpreted as an alias for INT8, which has been the historical default.
Clients that expect INT to be a 32-bit value, can set `default_int_size` to 4,
which will cause INT to be an alias for INT4. Please note that due to cockroachdb#32846,
`SET default_int_size` does not take effect until the next statement batch is
executed.

@bobvawter bobvawter deleted the bobvawter:int4-2-2 branch Dec 10, 2018

bobvawter added a commit to bobvawter/cockroach that referenced this pull request Dec 10, 2018

sql: Add default_int_size to control INT and SERIAL aliases.
This is a follow-up from cockroachdb#32831 to support cockroachdb#26925.

Release note (sql change): A new session variable `default_int_size` and
cluster setting `sql.defaults.default_int_size` have been added to control how
the INT and SERIAL types are interpreted. The default value, 8, causes the INT
and SERIAL types to be interpreted as aliases for INT8 and SERIAL8, which have
been the historical defaults for CockroachDB.  PostgreSQL clients that expect
INT and SERIAL to be 32-bit values, can set `default_int_size` to 4, which will
cause INT and SERIAL to be aliases for INT4 and SERIAL4.  Please note that due
to issue cockroachdb#32846, `SET default_int_size` does not take effect until the next
statement batch is executed.

bobvawter added a commit to bobvawter/cockroach that referenced this pull request Dec 12, 2018

sql: Add default_int_size to control INT and SERIAL aliases.
This is a follow-up from cockroachdb#32831 to support cockroachdb#26925.

The combination of `default_int_size` interacts with
`experimental_serial_normalization`. Sequence types that are ultimately based
on `unique_rowid()` must always be INT8 types in order to hold the returned
values.  Only in the case where `default_int_size=4
experimental_serial_normalization='sql_sequence'` will an INT4 column be
created in response to a SERIAL type.

Release note (sql change): A new session variable `default_int_size` and
cluster setting `sql.defaults.default_int_size` have been added to control how
the INT and SERIAL types are interpreted. The default value, 8, causes the INT
and SERIAL types to be interpreted as aliases for INT8 and SERIAL8, which have
been the historical defaults for CockroachDB.  PostgreSQL clients that expect
INT and SERIAL to be 32-bit values, can set `default_int_size` to 4, which will
cause INT and SERIAL to be aliases for INT4 and SERIAL4.  Please note that due
to issue cockroachdb#32846, `SET default_int_size` does not take effect until the next
statement batch is executed.

bobvawter added a commit to bobvawter/cockroach that referenced this pull request Dec 12, 2018

sql: Add default_int_size to control INT and SERIAL aliases.
This is a follow-up from cockroachdb#32831 to support cockroachdb#26925.

The combination of `default_int_size` interacts with
`experimental_serial_normalization`. Sequence types that are ultimately based
on `unique_rowid()` must always be INT8 types in order to hold the returned
values.  Only in the case where `default_int_size=4
experimental_serial_normalization='sql_sequence'` will an INT4 column be
created in response to a SERIAL type.

Release note (sql change): A new session variable `default_int_size` and
cluster setting `sql.defaults.default_int_size` have been added to control how
the INT and SERIAL types are interpreted. The default value, 8, causes the INT
and SERIAL types to be interpreted as aliases for INT8 and SERIAL8, which have
been the historical defaults for CockroachDB.  PostgreSQL clients that expect
INT and SERIAL to be 32-bit values, can set `default_int_size` to 4, which will
cause INT and SERIAL to be aliases for INT4 and SERIAL4.  Please note that due
to issue cockroachdb#32846, `SET default_int_size` does not take effect until the next
statement batch is executed.

bobvawter added a commit to bobvawter/cockroach that referenced this pull request Dec 12, 2018

sql: Add default_int_size to control INT and SERIAL aliases.
This is a follow-up from cockroachdb#32831 to support cockroachdb#26925.

The combination of `default_int_size` interacts with
`experimental_serial_normalization`. Sequence types that are ultimately based
on `unique_rowid()` must always be INT8 types in order to hold the returned
values.  Only in the case where `default_int_size=4
experimental_serial_normalization='sql_sequence'` will an INT4 column be
created in response to a SERIAL type.

Release note (sql change): A new session variable `default_int_size` and
cluster setting `sql.defaults.default_int_size` have been added to control how
the INT and SERIAL types are interpreted. The default value, 8, causes the INT
and SERIAL types to be interpreted as aliases for INT8 and SERIAL8, which have
been the historical defaults for CockroachDB.  PostgreSQL clients that expect
INT and SERIAL to be 32-bit values, can set `default_int_size` to 4, which will
cause INT and SERIAL to be aliases for INT4 and SERIAL4.  Please note that due
to issue cockroachdb#32846, `SET default_int_size` does not take effect until the next
statement batch is executed.

bobvawter added a commit to bobvawter/cockroach that referenced this pull request Dec 12, 2018

sql: Add default_int_size to control INT and SERIAL aliases.
This is a follow-up from cockroachdb#32831 to support cockroachdb#26925.

The combination of `default_int_size` interacts with
`experimental_serial_normalization`. Sequence types that are ultimately based
on `unique_rowid()` must always be INT8 types in order to hold the returned
values.  Only in the case where `default_int_size=4
experimental_serial_normalization='sql_sequence'` will an INT4 column be
created in response to a SERIAL type.

Release note (sql change): A new session variable `default_int_size` and
cluster setting `sql.defaults.default_int_size` have been added to control how
the INT and SERIAL types are interpreted. The default value, 8, causes the INT
and SERIAL types to be interpreted as aliases for INT8 and SERIAL8, which have
been the historical defaults for CockroachDB.  PostgreSQL clients that expect
INT and SERIAL to be 32-bit values, can set `default_int_size` to 4, which will
cause INT and SERIAL to be aliases for INT4 and SERIAL4.  Please note that due
to issue cockroachdb#32846, `SET default_int_size` does not take effect until the next
statement batch is executed.

craig bot pushed a commit that referenced this pull request Dec 12, 2018

Merge #32848
32848: sql: Add default_int_size to control INT alias r=bobvawter a=bobvawter

This is a follow-up from #32831 to support #26925.

The combination of `default_int_size` interacts with
`experimental_serial_normalization`. Sequence types that are ultimately based
on `unique_rowid()` must always be INT8 types in order to hold the returned
values.  Only in the case where `default_int_size=4
experimental_serial_normalization='sql_sequence'` will an INT4 column be
created in response to a SERIAL type.

Release note (sql change): A new session variable `default_int_size` and
cluster setting `sql.defaults.default_int_size` have been added to control how
the INT and SERIAL types are interpreted. The default value, 8, causes the INT
and SERIAL types to be interpreted as aliases for INT8 and SERIAL8, which have
been the historical defaults for CockroachDB.  PostgreSQL clients that expect
INT and SERIAL to be 32-bit values, can set `default_int_size` to 4, which will
cause INT and SERIAL to be aliases for INT4 and SERIAL4.  Please note that due
to issue #32846, `SET default_int_size` does not take effect until the next
statement batch is executed.



Co-authored-by: Bob Vawter <bob@cockroachlabs.com>
@druud

This comment has been minimized.

Copy link

druud commented Dec 19, 2018

Don't conflate SERIAL and INT.

Reserve the freedom to implement SERIAL as a (locally!) ever-increasing (globally!) unique value (with gaps), without restricting it to INT, or to UUID.

@knz

This comment has been minimized.

Copy link
Member

knz commented Dec 19, 2018

@druud this behavior is not changed. As-is SERIAL will continue to map to INT8 DEFAULT unique_rowid() which is the feature you request. The pg behavior (using an INT4 sequence) is only activated if the serial_normalization option is also changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment