Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: go/importer: provide a registry for x/tools importers to plug into #52163

Open
mdempsky opened this issue Apr 5, 2022 · 14 comments
Open

Comments

@mdempsky
Copy link
Contributor

mdempsky commented Apr 5, 2022

Right now the situation with go/types.Importer implementations is:

  1. The standard library provides importers that only support the current export data format.
  2. The x/tools repo provides additional importers that support older formats too, needed for users that deploy tools that need to support multiple toolchain revisions.

However, there's currently no mechanism where the x/tools importers can fallback to the stdlib importers for current export data format. This makes active development of new export data formats much more painful, because we need to keep the x/tools importers up-to-date even before the new format is finalized.

It would be nice if instead x/tools only provided legacy file formats that are no longer supported by the standard library.

I don't have a concrete suggestion yet of what a solution looks like here. But filing this issue so we can discuss it.

@gopherbot gopherbot added this to the Proposal milestone Apr 5, 2022
@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Apr 5, 2022
@findleyr
Copy link
Contributor

findleyr commented Apr 6, 2022

Couldn't we invert the relationship, and just have the x/tools importer call go/importer for data formats that go/importer supports? I.e. no need to change go/importer, except perhaps to expose the format version(s) it supports.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/398615 mentions this issue: all: disable failing tests on linux-amd64-unified builder

@mdempsky
Copy link
Contributor Author

mdempsky commented Apr 6, 2022

Couldn't we invert the relationship, and just have the x/tools importer call go/importer for data formats that go/importer supports?

That seems fine with me too. I don't have any strong opinions on API.

Ultimately, the goal I care about is that we don't need to add support to x/tools for the Go 1.XX export data file format until after Go 1.XX is actually released and finalized, and we want to start making changes for the next Go 1.YY export data file format. Whereas right now, the x/tools importer needs to understand even WIP export data file formats.

Note that we've tossed around before the idea of making export data files non-self-contained, so importers need to potentially know to find/open multiple files. E.g., the x/tools/go/gcexportdata.Read API would probably be insufficient, because it presupposes being able to read everything from a single io.Reader. I suspect this would be easier to accommodate with a registry design.

@rsc
Copy link
Contributor

rsc commented Apr 13, 2022

It sounds like we can fix this problem without an API change, as @findleyr suggested. Are we confident that approach is possible? If so, we should probably decline this proposal.

@rsc
Copy link
Contributor

rsc commented Apr 13, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals (old) Apr 13, 2022
@findleyr
Copy link
Contributor

Thinking more about delegating to go/importer from x/tools/go/internal/gcimporter (my proposal in #52163 (comment)), I have some additional concerns. If we were to implement this right now, we'd have to encode a support matrix into x/tools, to the effect of "after go1.19 go/importer implements the unified export data format", so delegate to it whenever the export data version is 3. That's problematic for two reasons:

  1. It is liable to introduce some tricky version skew during the development cycle.
  2. We don't know the data version until we start reading. go/importer.Lookup returns an io.ReadCloser, so we'd have to call the lookup function, read a little bit, then potentially close the data reader and delegate to go/importer, which would call the Lookup again. That feels a bit wrong, and might be problematic in practice.

So I am starting to agree that a registry design might be cleaner. Is there precedent for extending the standard library in this way?

@mdempsky
Copy link
Contributor Author

So I am starting to agree that a registry design might be cleaner. Is there precedent for extending the standard library in this way?

https://pkg.go.dev/database/sql#Register was my motivation for suggesting a registry initially.

@findleyr
Copy link
Contributor

https://pkg.go.dev/database/sql#Register was my motivation for suggesting a registry initially.

Aha, thanks.

But we'd presumably register by indexed export data version. Older export data versions are detected by header, so how would we register support for these legacy versions?

@mdempsky
Copy link
Contributor Author

mdempsky commented Apr 18, 2022

Yeah, the existing file format detection logic is a bit of a mess, and probably we don't want to bake that into the registry API.

I'm thinking going forward, it probably makes sense to define a new export data framing file format, which can have an explicit format ID, like an IANA media type or FourCC code. Then importers just need to register that they know how to parse a particular format ID.

The standard library would need to know how to detect (but not necessarily parse) the existing legacy formats that don't explicitly encode the format ID. But that's a relatively small amount of work, I think.

(The existing __.PKGDEF file format is kind of a historical outgrowth of how export data used to be encoded directly into the .o files, and so had to hide from the linker and other tools that parsed object files. But it's been its been completely detached from that for years now: https://go.dev/cl/102236.)

@rsc
Copy link
Contributor

rsc commented May 4, 2022

It doesn't sound like there is agreement about the path forward here. Does anyone want to work on a concrete API proposal?

Lookup may return an io.ReadCloser, but if we want indexed (random access) export data, it seems like we probably are going to require Seek anyway? So maybe we can check the version and seek back and call the right importer?

@rsc
Copy link
Contributor

rsc commented May 18, 2022

Let's put this on hold for a concrete API proposal.

@rsc rsc moved this from Active to Hold in Proposals (old) May 18, 2022
@rsc
Copy link
Contributor

rsc commented May 18, 2022

Placed on hold.
— rsc for the proposal review group

@dr2chase
Copy link
Contributor

dr2chase commented Jun 9, 2022

I'm looking at this, still with an excess of ignorance, but as I understand the lifecycle of an export data format, IF we were to implement the fallback-to-stdlib-for-unknown-formats proposal, after an export data format is settled, it is ported into go/tools. I'm assuming go/tools tries to figure out the format, sees nothing it recognizes, and then tries the standard library.

And, that trying the standard library would be something that normally should only activate during compiler development, because otherwise we'll have a tool whose abilities/behavior depend upon which version of Go was used to compile it (e.g., go.mod in tools currently specifies 1.17, clearly we are not surprised if it is compiled by 1.17 else we would bump that to 1.18)

The glitch, as I understand it, is that in the case of punting to the standard library, we need to "seek" a stream that is not necessarily seekable -- except that it looks like the code in x/tools has already read the entire thing into a byte slice, so why not just wrap that?

Clearly I'm missing something; I don't see what would justify a registry here, and even if I didn't have a byte slice to work with, I would reset the stream by (1) see if the existing stream is seekable, and seek or (2) only the header has been read, so just wrap the not-seekable stream in a new ReadCloser that first returns the already read bytes, then the unread remainder. Either of these choices is considered to be tolerably vile because this is only for the case of work-in-progress export data formats.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/412821 mentions this issue: internal/gcimporter: add support for reading unified IR export data

gopherbot pushed a commit to golang/tools that referenced this issue Jul 15, 2022
This does not include writing export data in the unified IR format.

Most of this change is an import of code from the Go implementation,
with minor tweaks and gaskets added.

This does not (yet) address the registry issue mentioned in golang/go#52163.

Updates golang/go#52163.

Change-Id: I98030e6c9ff35c6ff678b8a7ce9b653b18e65e17
Reviewed-on: https://go-review.googlesource.com/c/tools/+/412821
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
gopls-CI: kokoro <noreply+kokoro@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Hold
Development

No branches or pull requests

5 participants