Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: go/types: define identity of package-level Named types in terms of Package.Path equality #57497

Open
adonovan opened this issue Dec 28, 2022 · 1 comment
Labels
Milestone

Comments

@adonovan
Copy link
Member

adonovan commented Dec 28, 2022

(Status: half-baked food for thought.)

Currently, the types.Identity function applied to two *Named types tests that they point to the same variable. This is the classic symbolic programming approach used in compilers since early Lisp days. However it makes for an inflexible library package interface because it requires that all possible pairs of Named types that might need to be compared for equality belong to the same "realm" of objects, where a realm is essentially the map used by a types.Importer that defines the canonical mapping of package paths to package objects.

Consider an application that wishes to type check hundreds of source packages, in parallel, using export data, and then perform assignability tests on pairs of types from different packages. Each source package has its own importer, and thus its own realm of Named objects, so it is not possible to intermingle objects from these realms correctly. For example, you can't ask whether a concrete type in one realm satisfies an interface type from another realm because, even though they might both possess a method f(*bytes.Buffer), they would not agree that the two Buffer types are the same.

This means that various relations that could easily be well defined over go/types data structures, including types.Identity, types.Assignable, and types.MissingMethod, and types.LookupFieldOrMethod, don't work as expected. The application described above must either load all packages of interest in a single realm (entailing redundant work, reduced parallelism, and increased peak memory usage), or find another way to encode the query without using go/types predicates.

Is this restriction essential? I wonder whether we could relax the notion of type identity used by the predicates mentioned above so that they work naturally across realms. Two Named types declared at package level would instead be considered identical iff their names are identical and the Paths of the Packages to which they belong are equal. Pointer identity would just be a fast special case.

Named types local to a function are trickier since there's no obviously ideal way to express the correspondence of the same declaration encountered by the type checker at two different times. Using source positions seems fragile. Perhaps something analogous to DeBruijn indices could work: a computed name for each local type based on the path up the scope tree to the top-level enclosing function, which of course has an Object.Id that is valid across realms.

Though local types are trickier, they are also less important, as the need to apply predicates (e.g. IsAssignable) to a pair of local types from different packages is rare--though nonetheless conceivably useful in a whole-program analysis.

It's worth noting that if we ignore local types for now, the four predicates above could probably be implemented as described entirely outside go/types, if we were to fork them and edit the object identity comparison.

@griesemer @findleyr @mdempsky @timothy-king @zpavlinovic @dominikh

@gopherbot gopherbot added this to the Proposal milestone Dec 28, 2022
@timothy-king
Copy link
Contributor

(my own half baked thoughts)

Perhaps something analogous to DeBruijn indices could work: a computed name for each local type based on the path up the scope tree to the top-level enclosing function,

Seems like the order of the package's files + what is the set of files for this package [tags, _test.go, etc] is going to come into play no matter what when you assign an index to a name the package's scope. File order would work if tedious. But that is basically a source position.

A long as we are brainstorming, is the unit of type checking still a package? Maybe gopls benefit from a per file view?

But if we stick to a per package unit, assigning equality to Named based on (package, index) seems reasonable, so does (package, name). For efficiency of checking equivalence, it is not clear whether the packages would benefit from being in the "same realm" or not. The two paths that need to be fastish 'are these packages are identical or distinct?'. If there is a package realm, pointer equality/id is straightforward. If there is not a realm, disequality can be fast in practice by hashing the path. Equality for packages will either get expensive or there is a cache hidden somewhere.

Yet another half baked idea, if you have an explicit realm id you can recapture checking within a realm quickly. Within a realm, you can assume uniqueness. So x, y *Named are not pointer equal, then they are equal only if they are from distinct realms and are slow-path-equal (which would be walking the scopes, names, etc. in the other realm). I don't think this would help the two main cases of every package gets its own importer or there is only one importer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

3 participants