Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/go/ssa: generics support #48525

Open
timothy-king opened this issue Sep 21, 2021 · 10 comments
Open

x/tools/go/ssa: generics support #48525

timothy-king opened this issue Sep 21, 2021 · 10 comments
Assignees
Milestone

Comments

@timothy-king
Copy link
Contributor

@timothy-king timothy-king commented Sep 21, 2021

This is a tracking issue for supporting generics within the x/tools/go/ssa package. This depends heavily on proposals for "go/ast" #47781 and "go/types" #47916.

A reasonable goal is to have ssa available at roughly the same time as the release of generics so tool authors can take advantage of it. Earlier so tool authors that build on top of ssa can support generics at the same time as the release would be even better.

@timothy-king timothy-king added this to the Go1.18 milestone Sep 21, 2021
@timothy-king timothy-king self-assigned this Sep 21, 2021
@timothy-king
Copy link
Contributor Author

@timothy-king timothy-king commented Sep 21, 2021

There are two relevant contexts we need to consider:

  1. Whole program tools and packages like x/tools/go/callgraph/{...} that will see the original source and all of the possible instantiation sites.
  2. x/tools/go/analysis/passes/buildssa and similar passes that can have access to just the exported type data for imported packages. ATM in ssa, these are primarily used to build stub functions for functions in the imported packages.

@timothy-king
Copy link
Contributor Author

@timothy-king timothy-king commented Sep 21, 2021

My current idea is to "monomorphize" or "stencil" the instances of the generics functions. This would create a copy of the function specialized for each type instantiation.

Cons:

  • More copies of the function.
  • Further from the source than an uninstantiated version.
    Pros:
  • Simple for the ssa package to implement.
  • Expect it to be relatively simple for ssa users to adjust their existing checkers/analysis.

As for whether there will be too many copies of functions with this approach, it is not yet clear. I am somewhat optimistic it will not. For whole program analysis, this is roughly k=1 context-sensitivity in all of instantiation locations (which strikes me as a reasonable thing to try anyways).

For incremental program analysis (like buildssa), ssa needs to at least provide a stub for generic functions defined in other packages and locally instantiated. Providing an empty stub for each instantiation won't be too much blowup. Whether such empty stubs would be sufficient for ssa users is not yet clear.

@scott-cotton
Copy link

@scott-cotton scott-cotton commented Sep 22, 2021

I think that stencilling has the advantage of preserving the signatures of the ast whereas dictionaries or gcshape stencilling would have fewer copies and be more aligned with the current compiler implementation. However, these are questions of implementation of instantiation whilst the API for using ssa with generics is still open.

For incremental program analysis (like buildssa), ssa needs to at least provide a stub for generic functions defined in other packages and locally instantiated. Providing an empty stub for each instantiation won't be too much blowup. Whether such empty stubs would be sufficient for ssa users is not yet clear.

I am also not sure whether empty stubs would be sufficient. The ssa interpreter comes to mind as one which it seems would require code instantiations, the pointer package may be more easily adapted with code instantiations (it is based on a notion of type size which would become variable where it is not now). Maybe "ssa.value instantiations" instead of code instantiations could be an alternative.

Are there thoughts about providing an instantiation API to users of SSA vs instantiating ast/types and using that at ssa construction time? I would be inclined to explore using the ast/types instantiation APIs to provide an instantiation API on ssa.

@scott-cotton
Copy link

@scott-cotton scott-cotton commented Sep 22, 2021

For incremental program analysis (like buildssa), ssa needs to at least provide a stub for generic functions defined in other packages and locally instantiated. Providing an empty stub for each instantiation won't be too much blowup. Whether such empty stubs would be sufficient for ssa users is not yet clear.

I am also not sure whether empty stubs would be sufficient. The ssa interpreter comes to mind as one which it seems would require code instantiations, the pointer package may be more easily adapted with code instantiations (it is based on a notion of type size which would become variable where it is not now). Maybe "ssa.value instantiations" instead of code instantiations could be an alternative.

Ah wait, the interpreter is whole-program. Not so sure about the pointer package, it is in one sense whole-program whereas in another it can be applied to libraries.

@timothy-king
Copy link
Contributor Author

@timothy-king timothy-king commented Sep 22, 2021

Not so sure about the pointer package, it is in one sense whole-program whereas in another it can be applied to libraries.

x/tools/go/pointer falls into the whole-program paradigm.

Zooming out a bit. If we have an instantiation "p.F[local]" within a package "q", the only stubs solution for buildssa does means we would never see an instance of "p.F[local]". This would create some holes for analysis: like incremental pointer analysis, "is there a reachable instruction that modifies this field", etc.

@timothy-king
Copy link
Contributor Author

@timothy-king timothy-king commented Sep 22, 2021

the pointer package may be more easily adapted with code instantiations (it is based on a notion of type size which would become variable where it is not now

Non-instantiated generic functions [if we choose to have them] do not need to be considered reachable/real code. So tools could dodge this by skipping them. (Not in love with this, but I think it would be okay?)

@timothy-king
Copy link
Contributor Author

@timothy-king timothy-king commented Sep 22, 2021

Are there thoughts about providing an instantiation API to users of SSA vs instantiating ast/types and using that at ssa construction time?

@scott-cotton I don't think I understand the question well enough to answer. Can you give more details?

@scott-cotton
Copy link

@scott-cotton scott-cotton commented Sep 22, 2021

Are there thoughts about providing an instantiation API to users of SSA vs instantiating ast/types and using that at ssa construction time?

@scott-cotton I don't think I understand the question well enough to answer. Can you give more details?

The intuition I have is to explore making ssa instantiable for consumers of ssa. Something perhaps (very roughly) like ssa.Instantiate(v ssa.Value, ty1, ty2,...) which would do the stencilling (or whatever implementation). Not sure whether you are considering this, but if it is a desirable thing to provide then implementing that may also serve buildssa, whereas if the instantiations are hidden in buildssa, built by instantiating on the ast and then produce the ssa, then providing this seems like it would be harder. This idea is just an intuition I have, it is not really fleshed out...

@scott-cotton
Copy link

@scott-cotton scott-cotton commented Sep 22, 2021

Not so sure about the pointer package, it is in one sense whole-program whereas in another it can be applied to libraries.

x/tools/go/pointer falls into the whole-program paradigm.

Zooming out a bit. If we have an instantiation "p.F[local]" within a package "q", the only stubs solution for buildssa does means we would never see an instance of "p.F[local]". This would create some holes for analysis: like incremental pointer analysis, "is there a reachable instruction that modifies this field", etc.

Regarding whole program vs incremental, I think it will help to clarify some things, so we are talking about the same properties:

  1. Is whole program analysing behaviour only reachable from main entry points or does it also include having the whole source but no entry-points? For example, a package with no mains and all its dependants -- is it whole program or not?
  2. Does incremental mean bottom-up (depended upon first, like printf checker detecting and propagating wrappers) or where units of analysis are piece-wise independent (like prove in the compiler on a loop)?

On point 1), godoc -analysis=pointer works on library code and uses x/tools/go/pointer
while in principal the basis of the analysis is only valid for whole program in terms of reachable code from a given set of entry points. It is run anyway and provides information.

Yes I agree the buildssa stubs approach would create holes in what you are calling incremental analysis. I also agree that stubs may be preferable for other tools using ssa from buildssa on non-main programs. That combination of things may also make it interesting to consider an instantiation api as noted in the previous comment: with such a thing callers could decide what to instantiate and if and when.

@timothy-king
Copy link
Contributor Author

@timothy-king timothy-king commented Oct 5, 2021

Something perhaps (very roughly) like ssa.Instantiate(v ssa.Value, ty1, ty2,...) which would do the stencilling (or whatever implementation).

I think it might be possible to provide this as a feature for *ssa.Function, and *ssa.Type. Arbitrary Values will result in unconnected instructions. I do think it would be helpful to have a clear use case before expanding the API with this though. SSA's API tends not to expose much for users to modify. A convincing case would be something that (1) comes up in practice and (2) is very hard to otherwise support. I think we can wait on this as a feature.

Regarding whole program vs incremental, I think it will help to clarify some things, so we are talking about the same properties:

The distinction I am really interested in is:

  1. All packages.Packages used for building are transitively loaded and built. This is what I was somewhat inaccurately calling wholeprogram.
  2. The current packages.Package is available for 1 package and gcexportdata is available for direct imports (and supports go/types.Importer). This is what I'm referring to as incremental.

godoc -analysis=pointer would fall into the first bucket. buildssa falls into the second bucket.

That combination of things may also make it interesting to consider an instantiation api as noted in the previous comment: with such a thing callers could decide what to instantiate and if and when.

For buildssa, the source[/ast] needed for generation from a direct[/transitive] import is not immediately available. What would we stencil from? It needs to be serialized somewhere in the on disk cache. This is solvable in a variety of ways, but it not currently supported.

A practical implication of this is that we currently expect that in a different package from a generic function, one will only see the stub for an instantiation "math.Max[int]". If "math" did not do this specific instantiation, "math.Max[int]", there will be no instantiated body for "math.Max[int]" at any point available to an analysis.Analyzer. These Analyzers would be expected to summarize "Max" on the Pass looking at "math" and apply this summary/Fact on the instantiation. These are going to be more challenging to write as one would need to deal with type parameters.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants