Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/parser: Add new DOidWrapper type and use for OID and Name types #12641

Merged

Conversation

nvanbenschoten
Copy link
Member

@nvanbenschoten nvanbenschoten commented Jan 3, 2017

Replaces #12281.
Resolves #12207.

This change adds the parser.DOidWrapper and parser.tOidWrapper
types, which wrap Datums and Types respectively, allowing custom Oid
values to be attached. The reason the types were introduced was to
permit the introduction of Datum types with new Object IDs while
maintaining identical behavior to current Datum types. Specifically,
it obviates the need to:

  • define a new parser.Datum type.
  • define a new parser.Type type.
  • support operations and functions for the new Type.
  • support mixed-type operations between the new Type and the old Type.

Instead, DOidWrapper allows a standard Datum to be wrapped with a new Oid.
This approach provides two major advantages:

  • performance of the existing Datum types are not affected because they
    do not need to have custom oid.Oids added to their structure.
  • the introduction of new Datum aliases is straightforward and does not
    require
    additions to typing rules or type-dependent evaluation behavior.

Types that currently benefit from DOidWrapper are:

  • DOid => DOidWrapper(*DInt, oid.T_oid)
  • DName => DOidWrapper(*DString, oid.T_name)

This change is Reviewable

@knz
Copy link
Contributor

knz commented Jan 3, 2017

Quite a good job. Well done.

I assume that you have checked the output of git grep '.(\*parser.D' and git grep '.(\*D' to see if there any remaining casts.

I wonder if pursuant to #12145 and #12530, the types int2 int4 etc shouldn't also be wrapped types around int. Same for text around string. What do you think?


Reviewed 6 of 6 files at r1, 29 of 29 files at r2, 16 of 16 files at r3, 17 of 17 files at r4, 1 of 1 files at r5, 5 of 5 files at r6.
Review status: all files reviewed at latest revision, 17 unresolved discussions, some commit checks failed.


pkg/sql/analyze.go, line 1435 at r2 (raw file):

			// a LIKE 'foo%' -> a >= "foo" AND a < "fop"
			if d, ok := right.(parser.Datum); ok {
				if s, ok := parser.AsDString(d); ok {

You could simplify this by making AsDString accept an Expr.


pkg/sql/limit.go, line 118 at r2 (raw file):

			// folding prior to type checking.
			if d, ok := datum.src.(parser.Datum); ok {
				if i, ok := parser.AsDInt(d); ok {

Same as earlier, AsDInt could take Expr as argument.


pkg/sql/pg_catalog.go, line 868 at r3 (raw file):

CREATE TABLE pg_catalog.pg_namespace (
	oid INT,
	nspname NAME NOT NULL,

I don't understand why you removed the DEFAULT - or perhaps why the DEFAULT clause was there in the first place :)


pkg/sql/pg_catalog.go, line 1262 at r3 (raw file):

CREATE TABLE pg_catalog.pg_type (
	oid INT,
	typname NAME NOT NULL,

see my previous comment


pkg/sql/parser/datum.go, line 238 at r2 (raw file):

// be used instead of direct type assertions wherever a *DInt wrapped by a
// *DOidWrapper is possible.
func AsDInt(d Datum) (DInt, bool) {

If I were you I would use the name AsDInt for what you currently call MustBeDInt
Then rename the other one MaybeDInt

(Same for AsDString / MaybeDString).


pkg/sql/parser/datum.go, line 1894 at r2 (raw file):

		panic("cannot wrap DNull with an Oid")
	case *DOidWrapper:
		// Assure that *DOidWrappers never nest, meaning that unwrapping

Like in my other comments I'd suggest erroring out if this ever happens instead of accepting it silently.


pkg/sql/parser/datum.go, line 1900 at r2 (raw file):

	case *DString:
	default:
		panic("currently only *DInt and *DString are hooked up to work with *DOidWrapper. " +

Put the explanation in a comment and use a simpler panic message. The person encountering the panic can then look at the code where it was triggered.


pkg/sql/parser/datum.go, line 1931 at r2 (raw file):

		return 1
	}
	v, ok := other.(*DOidWrapper)

The inverse condition reads simpler:

if v, ok := other.(*DOidWrapper); ok {
    return d.Wrapped.Compare(v.Wrapped)
}
return d.Wrapped.Compare(other)

pkg/sql/parser/datum.go, line 1941 at r2 (raw file):

func (d *DOidWrapper) Prev() (Datum, bool) {
	if prev, ok := d.Wrapped.Prev(); ok {
		return wrapWithOid(prev, d.Oid), true

Make wrapWithOid return nil if its 1st argument is nil, then you can simply write prev, ok := d.Wrapped.Prev(); return wrapWithOid(prev, d.Oid), ok


pkg/sql/parser/datum.go, line 1987 at r2 (raw file):

// Size implements the Datum interface.
func (d *DOidWrapper) Size() uintptr {
	return unsafe.Sizeof(*d)

unsafe.Sizeof(*d) + d.Wrapped.Size()


pkg/sql/parser/eval.go, line 2385 at r2 (raw file):

	}

	// TODO(nvanbenschoten) This should be a switch on the ResolvedType of d.

Why not changing it right away then?


pkg/sql/parser/type.go, line 410 at r1 (raw file):

	case TypeString:
		return oid.T__text
	case TypeAny:

I think T_anyarray can be returned in the default case. For example pg's array_agg returns T_anyarray always.


pkg/sql/parser/type.go, line 449 at r1 (raw file):

// Oid implements the Type interface.
func (TTable) Oid() oid.Oid {
	panic("TODO(knz) unknown Oid for TTable")

return oid.T_anyelement (see the advertised return type for unnest in pg_proc)


pkg/sql/parser/type.go, line 502 at r2 (raw file):

		panic("cannot wrap TypeAny with an Oid")
	case tOidWrapper:
		// Assure that tOidWrappers never nest, meaning that unwrapping

I'm not sure it even makes sense to re-wrap a type by calling wrapTypeWithOid on a tOidWrapper. Instead of accepting this case silently I'd suggest a panic to check that it indeed never happens.


pkg/sql/parser/type.go, line 110 at r4 (raw file):

	// PGOIDType in expr.go.
	// TODO(nvanbenschoten) This should be fixed.
	TypePGOID = wrapTypeWithOid(TypeInt, oid.T_oid)

In the intermediate commit(s) where both TypePGOID and TypeOid still exist side by side, please also add a comment to explain that TypePGOID is superseded by TypeOid and will be removed in a subsequent commit.


pkg/sql/pgwire/types.go, line 614 at r3 (raw file):

			return d, errors.Errorf("unsupported name format code: %d", code)
		}
	case oid.T_bytea:

And what about T_oid here.


pkg/sql/sqlbase/structured.proto, line 42 at r3 (raw file):

    TIMESTAMPTZ = 9;
    COLLATEDSTRING = 10;  // Collated STRING, CHAR, and VARCHAR
    NAME = 11;

Given the amount of redundancy caused by STRING and NAME existing side by side, or INT and OID, I would suggest investigating if we could have one field in the column type with its "behavior type" (the code that runs) and another one with its "advertised type". Have the latter nullable so that if it doesn't exist yet it defaults to the behavior type, for backward compatibility.

Either in this PR or file a followup issue.


Comments from Reviewable

@jordanlewis
Copy link
Member

Reviewed 6 of 6 files at r1, 29 of 29 files at r2, 16 of 16 files at r3, 17 of 17 files at r4, 1 of 1 files at r5, 5 of 5 files at r6.
Review status: all files reviewed at latest revision, 18 unresolved discussions, some commit checks failed.


pkg/sql/pg_catalog.go, line 868 at r3 (raw file):

Previously, knz (kena) wrote…

I don't understand why you removed the DEFAULT - or perhaps why the DEFAULT clause was there in the first place :)

I did something similar in my NAME patch - agreed that there was no reason to have this in the first place. We have no name literal syntax, so the default doesn't work.


pkg/sql/parser/datum.go, line 238 at r2 (raw file):

Previously, knz (kena) wrote…

If I were you I would use the name AsDInt for what you currently call MustBeDInt
Then rename the other one MaybeDInt

(Same for AsDString / MaybeDString).

I'm personally happy about the Must naming. I think a panic is a bad enough consequence that it warrants naming a function that could do so something like Must or FooOrPanic.


pkg/sql/parser/datum.go, line 1876 at r2 (raw file):

// TODO during review: other option would be to keep tOidWrapper but create new Datum
// types for DOid and DName. The benefit would be that we could avoid needing to attach
// an Oid onto each Datum instance and that methods like String could be overridden for

Being able to override String seems like it will likely come up at one point or another, but I don't like the downsides you mention, which seem to erase most of the gains that you've achieved with this factorization.

Couldn't we still modify the String behavior based on the inner datum type by modifying what the wrapper's String does based on the inner OID?


pkg/sql/parser/datum.go, line 1894 at r2 (raw file):

Previously, knz (kena) wrote…

Like in my other comments I'd suggest erroring out if this ever happens instead of accepting it silently.

+1


Comments from Reviewable

@cuongdo
Copy link
Contributor

cuongdo commented Jan 11, 2017

@nvanbenschoten Any chance you can progress this soon? It'd be great to get further along with our Sequelize support.

@nvanbenschoten
Copy link
Member Author

@cuongdo yep, I'm hoping to get this in by early next week.

@nvanbenschoten
Copy link
Member Author

Thanks for the reviews!


Review status: 0 of 38 files reviewed at latest revision, 18 unresolved discussions.


pkg/sql/analyze.go, line 1435 at r2 (raw file):

Previously, knz (kena) wrote…

You could simplify this by making AsDString accept an Expr.

Good call. Done.


pkg/sql/limit.go, line 118 at r2 (raw file):

Previously, knz (kena) wrote…

Same as earlier, AsDInt could take Expr as argument.

Done.


pkg/sql/pg_catalog.go, line 868 at r3 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I did something similar in my NAME patch - agreed that there was no reason to have this in the first place. We have no name literal syntax, so the default doesn't work.

Yeah, I was looking at Jordan's change when I removed DEFAULT. He had correctly recognized that the DEFAULT clause was useless. It was there because I was much more strict about information_schema tables when they were originally implemented (I tried to get the CREATE TABLE results to match with MySQL). Since then it has become clear that identical schemas are unnecessary.


pkg/sql/parser/datum.go, line 1876 at r2 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Being able to override String seems like it will likely come up at one point or another, but I don't like the downsides you mention, which seem to erase most of the gains that you've achieved with this factorization.

Couldn't we still modify the String behavior based on the inner datum type by modifying what the wrapper's String does based on the inner OID?

Ah great point! We could do pretty much exactly what we do for tOidWrapper.String. Added a comment in Datum.Format to stress that point.


pkg/sql/parser/datum.go, line 1894 at r2 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

+1

Done.


pkg/sql/parser/datum.go, line 1900 at r2 (raw file):

Previously, knz (kena) wrote…

Put the explanation in a comment and use a simpler panic message. The person encountering the panic can then look at the code where it was triggered.

Done.


pkg/sql/parser/datum.go, line 1931 at r2 (raw file):

Previously, knz (kena) wrote…

The inverse condition reads simpler:

if v, ok := other.(*DOidWrapper); ok {
    return d.Wrapped.Compare(v.Wrapped)
}
return d.Wrapped.Compare(other)

Done.


pkg/sql/parser/datum.go, line 1941 at r2 (raw file):

Previously, knz (kena) wrote…

Make wrapWithOid return nil if its 1st argument is nil, then you can simply write prev, ok := d.Wrapped.Prev(); return wrapWithOid(prev, d.Oid), ok

Done.


pkg/sql/parser/datum.go, line 1987 at r2 (raw file):

Previously, knz (kena) wrote…

unsafe.Sizeof(*d) + d.Wrapped.Size()

Done.


pkg/sql/parser/eval.go, line 2385 at r2 (raw file):

Previously, knz (kena) wrote…

Why not changing it right away then?

Added as the new first commit.


pkg/sql/parser/type.go, line 410 at r1 (raw file):

Previously, knz (kena) wrote…

I think T_anyarray can be returned in the default case. For example pg's array_agg returns T_anyarray always.

Done.


pkg/sql/parser/type.go, line 449 at r1 (raw file):

Previously, knz (kena) wrote…

return oid.T_anyelement (see the advertised return type for unnest in pg_proc)

Done.


pkg/sql/parser/type.go, line 502 at r2 (raw file):

Previously, knz (kena) wrote…

I'm not sure it even makes sense to re-wrap a type by calling wrapTypeWithOid on a tOidWrapper. Instead of accepting this case silently I'd suggest a panic to check that it indeed never happens.

Done.


pkg/sql/parser/type.go, line 110 at r4 (raw file):

Previously, knz (kena) wrote…

In the intermediate commit(s) where both TypePGOID and TypeOid still exist side by side, please also add a comment to explain that TypePGOID is superseded by TypeOid and will be removed in a subsequent commit.

Done.


pkg/sql/pgwire/types.go, line 614 at r3 (raw file):

Previously, knz (kena) wrote…

And what about T_oid here.

Done.


pkg/sql/sqlbase/structured.proto, line 42 at r3 (raw file):

Previously, knz (kena) wrote…

Given the amount of redundancy caused by STRING and NAME existing side by side, or INT and OID, I would suggest investigating if we could have one field in the column type with its "behavior type" (the code that runs) and another one with its "advertised type". Have the latter nullable so that if it doesn't exist yet it defaults to the behavior type, for backward compatibility.

Either in this PR or file a followup issue.

That's actually exactly how I initially implemented this, but it quickly became apparent that the approach caused more issues than it solved. To demonstrate why this is, take a look at all references to ColumnType_OID. Out of the 7 places that the value is used, 5 of them would need special handling for DOids (either calling parser.NewDOid or returning parser.TypeOid). Adding a new "advertised type" field would mean that we would need to break these "flat" enumeration switch statements into nested switches, which makes the code less readable and less pleasant to work with.


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Jan 22, 2017

Reviewed 1 of 29 files at r2, 31 of 33 files at r7, 1 of 16 files at r8, 14 of 19 files at r9, 1 of 1 files at r10, 4 of 5 files at r11, 1 of 1 files at r12.
Review status: 37 of 38 files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.


pkg/sql/parser/builtins.go, line 1564 at r7 (raw file):

		},
	},
	"pg_catalog.format_type": {

I don't get how this format_type() built-in belongs to this PR. Where is it used?


pkg/sql/sqlbase/structured.proto, line 42 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

That's actually exactly how I initially implemented this, but it quickly became apparent that the approach caused more issues than it solved. To demonstrate why this is, take a look at all references to ColumnType_OID. Out of the 7 places that the value is used, 5 of them would need special handling for DOids (either calling parser.NewDOid or returning parser.TypeOid). Adding a new "advertised type" field would mean that we would need to break these "flat" enumeration switch statements into nested switches, which makes the code less readable and less pleasant to work with.

Well one would argue that the current state with all this redundancy is not especially pleasant to work with either.
Let's still investigate this further: #13045


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Jan 22, 2017

LGTM except the change to builtins.go which I don't understand.


Review status: 37 of 38 files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.


Comments from Reviewable

This change adds the `parser.DOidWrapper` and `parser.tOidWrapper`
types, which wrap Datums and Types respectively, allowing custom Oid
values to be attached. The reason the types were introduced was to
permit the introduction of Datum types with new Object IDs while
maintaining identical behavior to current Datum types. Specifically,
it obviates the need to:
- define a new `parser.Datum` type.
- define a new `parser.Type` type.
- support operations and functions for the new Type.
- support mixed-type operations between the new Type and the old Type.

Instead, DOidWrapper allows a standard Datum to be wrapped with a new Oid.
This approach provides two major advantages:
- performance of the existing Datum types are not affected because they
  do not need to have custom oid.Oids added to their structure.
- the introduction of new Datum aliases is straightforward and does not
  require additions to typing rules or type-dependent evaluation behavior.

Types that currently benefit from DOidWrapper are:
- `DOid  => DOidWrapper(*DInt,    oid.T_oid)`
- `DName => DOidWrapper(*DString, oid.T_name)`
This change uses DOidWrapper to add a new DName type. It then uses this
type in pg_catalog wherever Postgres does.
This change uses DOidWrapper to add a new DOid type. It then uses
this type in pg_catalog wherever Postgres does.
This change cleans up `OidPseudoType` and makes its distinction from
`OidColType` more clear.
@nvanbenschoten
Copy link
Member Author

TFTR!


Review status: 12 of 38 files reviewed at latest revision, 5 unresolved discussions.


pkg/sql/parser/builtins.go, line 1564 at r7 (raw file):

Previously, knz (kena) wrote…

I don't get how this format_type() built-in belongs to this PR. Where is it used?

I think what you're seeing is the result of a rebase. I didn't add this in here.


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Jan 24, 2017

Reviewed 9 of 9 files at r13.
Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


Comments from Reviewable

@nvanbenschoten nvanbenschoten merged commit 1fa5eba into cockroachdb:master Jan 24, 2017
@nvanbenschoten nvanbenschoten deleted the nvanbenschoten/oidWrapper branch January 24, 2017 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants