Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/sem/types: make all types non-comparable #32124

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

BramGruneir
Copy link
Member

This is WIP and should not be merged.

I'm putting it out there to generate some discussion before I go any further.

So this is my attempt to fix #27293 and is a continuation from #27300.

Being able to use == to compare some types and not other is dangerous and since Go doesn't allow us to override == and when types are not comparable it only shows up as a runtime error.

I tried to use the linked list trick to make tuples comparable, but sadly that won't work because tuples have optional labels that are ignored for the actual type equality checking. Furthermore, if we performed this fix for tuples, we could still run into problem with the other already non-comparable types.

So the idea here is to go the other way and to for the use of an equality function every time. For now I was calling it Identical but I'm all for using another name.

This approach is not perfect but it does expose, hopefully, every instance in which we do a type comparison.

Please note that there are a lot of opportunities to speed things up, especially all of the switch statements by exporting the other types like tBool and tInt, etc. But I'll do that once I find and fix all the bugs.

Also, this of course will still have to be benchmarked to ensure that there is no massive penalty for this change.

@knz suggested going one step further and adding a linter to ensure we never compare types. I'd be down for that if we proceed in this direction, but I'm pretty sure I've rooted out most of the places already.

Also, there are still a few bugs remaining, specifically to do with OID matching. I'll route them out later.

@andy-kimball, @RaduBerinde, @jordanlewis and @knz. Please take a look at what I have so far here and if any of you have a better solution, I'd be all for it.

Types.go was starting to get unmanagable so I've taken any non-standard type
and pulled it into its own file.

Release note: None
@BramGruneir BramGruneir added the do-not-merge bors won't merge a PR with this label. label Nov 2, 2018
@BramGruneir BramGruneir requested review from a team November 2, 2018 21:03
@BramGruneir BramGruneir requested a review from a team as a code owner November 2, 2018 21:03
@BramGruneir BramGruneir requested review from a team November 2, 2018 21:03
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@BramGruneir
Copy link
Member Author

So I've been thinking about this for the last week and I don't like the direction it's going in. We do need to ensure we don't perform the comparisons but adding type asserts all over is not going to significantly slow things down.
Here's an alternate proposal:

  • add comparable() bool to the types.T interface, which returns if a type is comparable or not
  • add a single function types.Equal(x, y types.T) bool to the types library
  • add a linter to ensure we never perform type equality directly, we should always use the types.Equal function
  • add the non-comparable hack [0][]byte to all types that shouldn't be comparable to ensure if they are ever compared directly, we panic
  • Leave the switch statements alone for now. There are only a few and perhaps there is some way to pull those into the types library as well, or an improved linter, but that can be done at a future time

Furthermore, we could consider adding type.Equivalent(x, y types.T) bool which would first check for == any then call types.Equal. Then we could remove the equivalent function from the interface as well. There's no need to add a linter here, since these are already function calls. But unless I'm mistaken, they all do the exact same check.

We also need to spend some time determining when we should call unwarpType. It doesn't seem to be consistent right now.

I'd appreciate some comments on this.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I get the [0][]byte thing. We don't use struct values, we use pointer-to-struct values, which you can always compare with == regardless if the structs are comparable.

Regarding the switches - those might be faster if they were changed to be type switches instead of value switches. We can also consider adding a T.TypeID() method that returns an enum and using that for switches.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@bobvawter
Copy link
Member

This is based on a conversation in Slack.

The current approach has some awkward bifurcation between parameterized and non-parameterized types. For instance, the var types.Bool T field is used both as a singleton instance of types.T as well as a sentinel value for type matching. However, there's no analogous var types.Tuple T field, because Tuple doesn't make sense as a singleton.

What if we add a concept of a "meta-type" (prototype?) that we use for type-matching so that identifying instances of both parameterized and non-parameterized types can be accomplished the same way?

package main

import (
	"fmt"
)

// T represents a SQL type.
type T interface {
	GetMetatype() M
}

// M represents a meta-type; that is, the "raw" type of a parameterized
// type like TTuple. M instances will be singletons and == comparable.
type M interface {
	isM() // Internal to prevent arbitrary casts.
}

// TM is a combination of the T and M interfaces for types which
// have no parameters and for which a singleton instance can be used.
// TM instances are == comparable.
type TM interface {
	T
	M
}

var (
	// The existing, top-level T fields, except for AnyArray, are changed to the TM type.
	Bool TM = &tBool{}
	Int  TM = &tInt{}
	// We add top-level M fields for all parameterized types.
	Tuple M = &mImpl{}
)

type tBool struct{}

func (m *tBool) GetMetatype() M { return m }
func (*tBool) isM()             {}

type tInt struct{}

func (t *tInt) GetMetatype() M { return t }
func (*tInt) isM()             {}

type TTuple struct {
	Args   []T
	Labels []string
}

func (*TTuple) GetMetatype() M { return Tuple }

// Just some trivial implementation of M
type mImpl struct{}

var _ M = &mImpl{}

func (mImpl) isM() {}

func main() {
	fmt.Println(Bool.GetMetatype() == Bool.GetMetatype())
	fmt.Println(Bool.GetMetatype() == Int.GetMetatype())
	what(Bool)
	what(Int)
	what(&TTuple{Args: []T{Int}, Labels: []string{"i"}})
}

func what(t T) {
	switch t.GetMetatype() {
	case Bool:
		fmt.Println("bool")
	case Int:
		fmt.Println("int")
	case Tuple:
		fmt.Printf("tuple %v\n", t.(*TTuple).Labels)
	}
}

This gives us values that are easy to perform == tests against.

We could also add a collection of cast-or-nil methods to T such as IsTuple() *TTuple which would effect the cast or return null. That way, you could write code like

if tuple := someType.IsTuple(); tuple != nil {
  // Do stuff
}

We'd only really need those helpers for parameterized types, since tBool and friends aren't exported.

@bobvawter
Copy link
Member

bobvawter commented Nov 14, 2018

Alternatively / additionally, is there any compelling reason to not export tBool and friends for parity with TTuple? Then we could use switch x := someType.(type) {case *TBool: case *TTuple:} constructs where it makes sense to do so, but still have map-compatible sentinel values where they make sense.

@RaduBerinde
Copy link
Member

@bobvawter what's the advantage of having the M type compared to defining a meta-type enum and having a MetaTypeID() method in T?

@bobvawter
Copy link
Member

@RaduBerinde, the collection of top-level M values would form a type-id enum, albeit of pointers rather than ints, and, I hope, minimize code-thrash since all of the existing code that wants to use T merely as an enum (i.e. == comparable) would only have to be updated to use M to make its intention clear, rather than switching wholesale to a separate collection of values.

It may indeed be the case that this is too cute for its own good, and a wholly separate enum is the right way to go.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you try it out and see what the generated code looks like? If there is no surprise wrt heap allocations on passing values by interface, and the switch statements look code (both to the human reader and after code generation), I'd be in favor of the additional cleanliness.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@knz knz moved this from Triage to Candidate next milestone in (DEPRECATED) SQL Front-end, Lang & Semantics Nov 15, 2018
@knz knz moved this from Candidate next milestone to Monday pile in (DEPRECATED) SQL Front-end, Lang & Semantics Dec 13, 2018
@knz knz moved this from Monday pile to Candidate next milestone in (DEPRECATED) SQL Front-end, Lang & Semantics Dec 17, 2018
@tbg tbg added the X-noremind Bots won't notify about PRs with X-noremind label Jun 19, 2019
@bobvawter
Copy link
Member

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge bors won't merge a PR with this label. X-noremind Bots won't notify about PRs with X-noremind
Projects
No open projects
(DEPRECATED) SQL Front-end, Lang & Se...
  
Candidate next milestone
Development

Successfully merging this pull request may close these issues.

sql: panic: comparing uncomparable type types.TTuple
6 participants