Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: go/types: lazy resolution of imported objects #46449

Open
mdempsky opened this issue May 29, 2021 · 6 comments
Open

proposal: go/types: lazy resolution of imported objects #46449

mdempsky opened this issue May 29, 2021 · 6 comments
Labels
Projects
Milestone

Comments

@mdempsky
Copy link
Member

@mdempsky mdempsky commented May 29, 2021

cmd/compile uses an indexed export format that allows lazily resolving imported objects so that only the objects that are actually used by user code need to be loaded into memory. However, the go/types API doesn't currently support lazy import resolution, so the gcexportdata importer still needs to eagerly load all objects. (It's able to use the index to skip loading of objects that were already loaded by another importer though.)

This proposal is to add a new function to facilitate lazy import resolution:

func (s *Scope) InsertLazy(name string, resolve func() Object) Object

Semantically from the end user's point of view, this function behaves like:

func (s *Scope) InsertLazy(name string, resolve func() Object) Object {
    obj := resolve()
    if name != obj.Name() {
        panic("invalid resolved object name")
    }
    return s.Insert(obj)
}

except it's unspecified when/whether resolve is actually called.

Importer implementations can then use this function to create lazy objects to add to the Package.Scope without having to fully load all details right away. go/types then assumes responsibility for noticing when it has a lazy object and calling resolve to get the actual object.

For example, a partial sketch of how this might really work:

type lazyObject func() Object

// Unused methods to implement Object.
func (*lazyObject) Pkg() *Package { panic("unresolved object") }
// ...

func (s *Scope) InsertLazy(name string, resolve func() Object) Object {
    if alt := s.elems[name]; alt != nil {
        return alt
    }
    if s.elems == nil {
        s.elems = make(map[string]Object)
    }
    s.elems[name] = lazyObject(resolve)
    return nil
}

func (s *Scope) Lookup(name string) Object {
    obj := s.elems[name]
    if resolve, ok := obj.(lazyObject); ok {
        obj = resolve()
        if name != obj.Name() {
            panic("invalid resolved object name")
        }
        s.elems[name] = obj
        if obj.Parent() == nil {
            obj.setParent(s)
        }
    }
    return obj
}

Other internal code that directly accesses Scope.elems would need to be similarly updated to handle lazyObjects. User code should be unaffected though: the public APIs will always return fully resolved objects.

One possible concern with lazy resolution though is the resolve function for any unresolved objects will need to hang onto the importer resources for constructing them. Which in turn means the Scope, Package, and any other Objects from that same package will also keep those resources live. I would expect this is likely still an improvement over the status quo (i.e., holding onto a single flat byte slice, rather than the complete object/type graph).

Another possible future extension would be for the Scope to only keep a weak reference to the resolved object, and then re-resolve it on demand as needed. I think this would be transparent both to users and importer implementations (as long as the resolve method is safe to invoke multiple times).

/cc @griesemer @findleyr

@gopherbot gopherbot added this to the Proposal milestone May 29, 2021
@mdempsky
Copy link
Member Author

@mdempsky mdempsky commented May 29, 2021

@dominikh points out that this has implications for concurrent calls to Scope.Lookup, which I hadn't considered.

But I think we have a bunch of options on how to keep that working. E.g., either synchronize on the Scope (with a fast path for scopes without any lazy objects) or turn lazyObject into a separate struct with its own internal synchronization.

We can also add a flag to either turn on/off lazy resolution (depending on whether we make lazy importing the default or not), though I think it's probably preferable if we can just make lazy importing always work.

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals May 29, 2021
@gopherbot
Copy link

@gopherbot gopherbot commented May 29, 2021

Change https://golang.org/cl/323569 mentions this issue: [dev.typeparams] cmd/compile: lazy import resolution for types2

@mdempsky
Copy link
Member Author

@mdempsky mdempsky commented May 29, 2021

CL 323569 has a working example of using this extension for types2. I expect the same design would map easily to go/types.

Two observations from updating the types2 importer to use InsertLazy:

  1. Initially, I thought I could just add lazy objects for the directly imported package's symbols, and then lazily add indirect imports as needed. But that caused this test to fail, because it imports "net/http" and then expects sync.Mutex to be Lookup'able:

imports := make(map[string]*types2.Package)
_, err := Import(imports, "net/http", ".", nil)
if err != nil {
t.Fatal(err)
}
mutex := imports["sync"].Scope().Lookup("Mutex").(*types2.TypeName).Type()

This wasn't hard to fix, and fixing it actually made the importer somewhat simpler (able to get rid of some extra maps), and is probably necessary anyway to make concurrent Lookups safe. But it surprised me, so I thought I'd note it.

  1. Previously the gcexportdata importer broke type cycles by declaring the TypeName early so it would be visible later via Scope.Lookup when importing the underlying type and any associated methods. However, the proposed API doesn't support this, because the object won't be inserted until the resolve function returns.

    Instead I changed to to keep a separate "predeclared" map that holds TypeNames until they've been declared properly. This is a little clumsy, but it turns out to also be useful for indicating when it's safe to Complete any interfaces. And handling recursive defined types is always a little clumsy because of cycles. So I don't think this was too bad of a solution.

@gopherbot
Copy link

@gopherbot gopherbot commented May 30, 2021

Change https://golang.org/cl/323629 mentions this issue: [dev.typeparams] cmd/compile: lazy type definitions for types2

@mdempsky
Copy link
Member Author

@mdempsky mdempsky commented May 30, 2021

CL 323629 further adds another API to types2, conceptually equivalent to:

func NewTypeNameLazy(pos token.Pos, pkg *Package, name string, resolve func(named *Named) (tparams []*TypeParam, underlying Type, methods []*Func)) *TypeName {
    obj := NewTypeName(pos, pkg, name, nil)
    named := NewNamed(obj, nil, nil)

    // [This interface for `resolve` is just what was easiest to prototype.
    // It's definitely clumsy though, and I think ideally we'd eliminate
    // either the `named` parameter or the results.]
    tparams, underlying, methods := resolve(named)

    named.SetTParams(named)
    named.SetUnderlying(underlying)
    for _, method := range methods {
        named.AddMethod(method)
    }

    return obj
}

except that again resolve may be called later (or never).

Not only does this allow avoiding resolving defined types whose definitions go unused, but it also completely eliminates the complexity of handling recursive types within the importer, instead letting go/types manage these details. I've also wanted to add lazy type loading to cmd/compile for a while, but it hasn't been practical.

Finally, after prototyping it, I realized that it actually seems somewhat similar to the current instance stuff, so maybe they can actually share a common implementation.

mdempsky added a commit to mdempsky/go that referenced this issue May 30, 2021
This CL adds a new Scope.InsertLazy method that allows lazily
constructing objects only when actually needed. It also updates the
types2 importer to use it.

Updates golang#46449.

Change-Id: I4d1e8e649f6325a11790d25fd90c39fa07c8d41d
mdempsky added a commit to mdempsky/go that referenced this issue May 30, 2021
This CL extends the previous commit to also support lazy loading of
the underlying type and method set of defined types.

Nicely, this eliminates the awkwardness of needing to handle recursive
types within the importer. Instead, types2 is able to handle those
details.

The current prototype is a little fragile though. It works, but while
debugging, I kept running into issues with reentrant calls into
sync.Once.Do, which were a pain to debug. If we decide to do this for
real, we probably want to replace sync.Once with something that can
print useful diagnostics and stack traces rather than just
mysteriously hang.

Finally, it looks like the details are actually somewhat similar to
the lazy "instance" stuff used for generics, so maybe they can
actually use a common idea instead of duplicating the idea.

Updates golang#46449.

Change-Id: I7752c668a9169206318d4a777e4a0db2eb1e8ff7
mdempsky added a commit to mdempsky/go that referenced this issue May 31, 2021
This CL adds a new Scope.InsertLazy method that allows lazily
constructing objects only when actually needed. It also updates the
types2 importer to use it.

Updates golang#46449.

Change-Id: I4d1e8e649f6325a11790d25fd90c39fa07c8d41d
mdempsky added a commit to mdempsky/go that referenced this issue May 31, 2021
This CL extends the previous commit to also support lazy loading of
the underlying type and method set of defined types.

Nicely, this eliminates the awkwardness of needing to handle recursive
types within the importer. Instead, types2 is able to handle those
details.

The current prototype is a little fragile though. It works, but while
debugging, I kept running into issues with reentrant calls into
sync.Once.Do, which were a pain to debug. If we decide to do this for
real, we probably want to replace sync.Once with something that can
print useful diagnostics and stack traces rather than just
mysteriously hang.

Finally, it looks like the details are actually somewhat similar to
the lazy "instance" stuff used for generics, so maybe they can
actually use a common idea instead of duplicating the idea.

Updates golang#46449.

Change-Id: I7752c668a9169206318d4a777e4a0db2eb1e8ff7
mdempsky added a commit to mdempsky/go that referenced this issue Jun 2, 2021
This CL adds a new Scope.InsertLazy method that allows lazily
constructing objects only when actually needed. It also updates the
types2 importer to use it.

Updates golang#46449.

Change-Id: I4d1e8e649f6325a11790d25fd90c39fa07c8d41d
mdempsky added a commit to mdempsky/go that referenced this issue Jun 2, 2021
This CL extends the previous commit to also support lazy loading of
the underlying type and method set of defined types.

Nicely, this eliminates the awkwardness of needing to handle recursive
types within the importer. Instead, types2 is able to handle those
details.

The current prototype is a little fragile though. It works, but while
debugging, I kept running into issues with reentrant calls into
sync.Once.Do, which were a pain to debug. If we decide to do this for
real, we probably want to replace sync.Once with something that can
print useful diagnostics and stack traces rather than just
mysteriously hang.

Finally, it looks like the details are actually somewhat similar to
the lazy "instance" stuff used for generics, so maybe they can
actually use a common idea instead of duplicating the idea.

Updates golang#46449.

Change-Id: I7752c668a9169206318d4a777e4a0db2eb1e8ff7
gopherbot pushed a commit that referenced this issue Jun 4, 2021
This CL adds three new functions to the types2 API to support lazy
import resolution:

1. A new Scope.InsertLazy method to allow recording that Objects exist
in a particular Scope (in particular, package scopes) without having
to yet fully construct those objects. Instead, types2 will call the
provided `resolve` function if/when the object is actually needed.

2. Similarly, a new NewTypeNameLazy function to create TypeName
objects without yet instantiating their underlying Named
instance.

3. Finally, an InstantiateLazy method, that allows creating type
instances without requiring any of the types to be expanded right
away. Importantly, this requires providing a types2.Checker argument
to handle recursive types correctly.

The APIs as-is are a bit clumsy (esp. NewTypeNameLazy), but seem to
work well for cmd/compile's needs. In particular, they simplify some
of the complexities of handling recursive type definitions within the
importer.

Also, the current prototype is a bit fragile. It uses sync.Once to
manage concurrent lazy resolution, which is frustrating to debug in
the presence of reentrancy issues. It also means the importer needs to
deal with concurrency as well. These aren't issues for types2 though
as cmd/compile only walks the type-checked AST sequentially.

Finally, it looks like some of the details of lazy type names are
similar to the lazy "instance" stuff used for generics, so maybe
there's opportunity for unifying them under a more general (but still
internal) lazy type mechanism.

I had originally intended for this CL to also update the types2
importer, but (1) it doesn't have access to the types2.Checker
instance needed to call InstantiateLazy, and (2) it creates a new
TypeName/TypeParam at each use rather than reusing them, which
evidently works with types2.Instantiate but not
types2.(*Checker).instantiate (i.e., InstantiateLazy). I spent a while
trying to fix these issues, but kept running into more subtle
issues. Instead, I've included my WIP "unified IR" CL as a followup CL
that demonstrates these Lazy methods (see noder/reader2.go).

Updates #46449.

Change-Id: I4d1e8e649f6325a11790d25fd90c39fa07c8d41d
Reviewed-on: https://go-review.googlesource.com/c/go/+/323569
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Trust: Matthew Dempsky <mdempsky@google.com>
Trust: Robert Griesemer <gri@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
@gopherbot
Copy link

@gopherbot gopherbot commented Jun 11, 2021

Change https://golang.org/cl/327053 mentions this issue: [dev.typeparams] cmd/compile: add "check" field to noder.gcimports

gopherbot pushed a commit that referenced this issue Jun 11, 2021
The unified IR importer needs access to the *types2.Checker instance
to lazily construct objects and types. Eventually, maybe the
types2.Importer API can be extended to add the Checker as another
parameter (or more likely something like an ImportConfig struct), but
right now we can handle this ourselves as long as we forgo the
types2.(*Config).Check convenience wrapper.

Updates #46449.

Change-Id: I89c41d5d47c224a58841247cd236cd9f701a23a1
Reviewed-on: https://go-review.googlesource.com/c/go/+/327053
Trust: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Proposals
Incoming
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants