proposal: cmd/go: add a workspace mode #45713

Open
matloob opened this issue Apr 22, 2021 · 28 comments

@matloob
Contributor

@matloob matloob commented Apr 22, 2021

Detailed Design Doc: Proposal: Multi-Module Workspaces in cmd/go

High level overview:

I propose adding a new workspace mode in the go command for editing multiple modules. The presence of a go.work file in the working directory or a containing directory will put the go command into workspace mode. The go.work file specifies a set of local modules that comprise a workspace. When invoked in workspace mode, the go command will always select these modules and a consistent set of dependencies.

This is intended to help with workflows making changes across multiple modules and with editor support for those workflows.

This is what an example go.work file would look like:

go 1.17

directory (
    ./baz // foo.org/bar/baz
    ./tools // golang.org/x/tools
)

replace golang.org/x/net => example.com/fork/net v1.4.5

This adds two modules to the workspace. If the user's current working directory is under the directory containing this go.work, the go command will be in workspace mode and use both the modules defined by ./baz/go.mod and ./tools/go.mod as main modules, regardless of which module the user is currently in (unless workspace mode is turned off or a different workspace is chosen with the proposed new -workfile flag). The replace would override any replaces in the main modules' go.mod files.
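
As an illustration of the file layout above, here is a minimal sketch that pulls the directory entries out of go.work-style text. It is not the go command's parser: `parseDirectories` and `exampleWork` are hypothetical names, and the sketch assumes only single-line and parenthesized directory directives with `//` comments.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseDirectories returns the paths named by "directory" directives.
// Simplifying assumption: anything after "//" on a line is a comment.
func parseDirectories(src string) []string {
	var dirs []string
	inBlock := false
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := sc.Text()
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i] // drop the trailing comment
		}
		line = strings.TrimSpace(line)
		switch {
		case line == "":
			continue
		case inBlock && line == ")":
			inBlock = false
		case inBlock:
			dirs = append(dirs, line)
		case line == "directory (":
			inBlock = true
		case strings.HasPrefix(line, "directory "):
			dirs = append(dirs, strings.TrimSpace(strings.TrimPrefix(line, "directory ")))
		}
	}
	return dirs
}

// exampleWork reproduces the go.work example from the proposal text.
const exampleWork = `go 1.17

directory (
    ./baz // foo.org/bar/baz
    ./tools // golang.org/x/tools
)

replace golang.org/x/net => example.com/fork/net v1.4.5
`

func main() {
	for _, dir := range parseDirectories(exampleWork) {
		fmt.Println("workspace module directory:", dir)
	}
}
```

Each returned path would then be expected to contain a go.mod file whose module becomes one of the workspace's main modules.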

Related issues

#32394 x/tools/gopls: support multi-module workspaces

Issue #32394 is about gopls' support for multi-module workspaces. gopls currently allows users to provide a "workspace root" which is a directory it searches for go.mod files to build a supermodule from. Alternatively, users can create a gopls.mod file in their workspace root that gopls will use as its supermodule. This proposal creates a concept of a workspace that is similar to that of gopls but is understood by the go command, so that users can have a consistent configuration across their editor and direct invocations of the go command.

#44347 proposal: cmd/go: support local experiments with interdependent modules; then retire GOPATH

Issue #44347 proposes adding a GOTINKER mode to the go command. Under the proposal, if GOTINKER is set to a directory, the go command will resolve import paths and dependencies in modules by looking first in a GOPATH-structured tree under the GOTINKER directory before looking at the module cache. This would allow users who want to have a GOPATH-like workflow to build a GOPATH at GOTINKER, but still resolve most of their dependencies (those not in the GOTINKER tree) using the standard module resolution system. It also provides for a multi-module workflow for users who put their modules under GOTINKER and work in those modules.

This proposal also tries to provide some aspects of the GOPATH workflow and to help with multi-module workflows. A user could put the modules that they would put under GOTINKER in that proposal into their go.work files to get a similar experience to the one they'd get under the GOTINKER proposal. A major difference between the proposals is that in GOTINKER modules would be found by their paths under the GOTINKER tree instead of being explicitly listed in the go.work file. But both proposals provide for a set of replaced module directories that take precedence over the module versions that would normally be resolved by MVS, when working in any of those modules.

#26640 cmd/go: allow go.mod.local to contain replace/exclude lines

The issue of maintaining user-specific replaces in go.mod files was brought up in #26640. It proposes an alternative go.mod.local file so that local changes, such as added replaces, could be made without the risk of their being committed in go.mod itself. The go.work file provides users a place to put many of the local changes that would be put in the proposed go.mod.local file.

@gopherbot gopherbot added this to the Proposal milestone Apr 22, 2021
@mvdan
Member

@mvdan mvdan commented Apr 23, 2021

Thanks for working on this, Michael! It's clear that many people want a solution to multi-module development, given all the activity in various issues over the last couple of years.

For those who just arrived here, note that this was discussed over the previous week on the golang-tools list, too. Some points raised there resulted in edits to the proposal draft.

Overall, I really like the idea. Below are some thoughts. As a summary, I think the proposal is trying to fix a few too many issues at once.

When doing a build operation under workspace mode the go command will try to find a go.mod file. If a go.mod file is found, its containing directory must be declared with a directory directive in the go.work file.

I still wonder why we need directory (...) to be an explicit list. Using a module in workspace mode that isn't listed in go.work is an error, so why not just make directory implicitly mean all modules under the directory where the go.work file is located? Akin to how go.mod implicitly includes all packages under its directory (with some basic exception rules).

If the answer is "it's too expensive to find all modules every time", I'd love to see some proof of that. I'd also like to see if caching could be a solution, similar to go list ./.... I know you mentioned alternatives like directory ./... or go work init filling in the directory list, but either of those seems unnecessary if we can implicitly do the right thing.

go.work files allow users to operate in directories outside of any modules but still use the workspace build list. This makes it easy for users to have a GOPATH-like user experience by placing a go.work file in their home directory linking their modules together.

We touched on this in the email list, and you added this section, so as a follow-up - I'm still not convinced that the first version of go.work needs to support doing builds outside of any of the modules. Do we have any particular use cases in mind other than "be more like GOPATH"? For example, before modules I'd never run go build all or go test all from the root of my GOPATH, and I similarly do not expect to do it in workspace mode.

And if I want to build or test one of the modules, I'd want to cd into that module anyway, to make sure I'm using its go.mod file - replace directives, etc.

#39005 proposal: cmd/go: introduce a build configurations file

This issue proposes to add a mechanism to specify configurations for builds, such as build tags. This might be something that can be added in a future extension of go.work files.

I don't think that would be a good idea, to be honest. I'm starting to believe that build configurations (e.g. what build tags to use in go build) aren't only useful at the module level, and by extension not only at the workspace level either. Maybe a Go module is entirely portable by default, but one of its packages explicitly targets js/wasm only. The build config should separate the package. Similarly, when working across many modules, it's reasonable for them to fall under different build configurations.

A "build config" also has a purpose, at least when proposed in #39005: for the original author to announce to the users what are the supported ways to build their packages. Without that information, one can only assume that all build tag and GOOS/GOARCH combinations are possible, which quickly explodes exponentially. Since most users would not publish go.work in a VCS repository, the usefulness of a "build config" to give information to users is lost.

All in all, I don't think this proposal should ever aim to solve the "build config" problem.

The go.work file provides users a place to put many of the local changes that would be put in the proposed go.mod.local file.

This one is an interesting thought. This only works if we declare that go.work files should not be committed into VCS, though, and the proposal currently does not say that.

For example, if we're OK with monorepos publishing go.work files, how can that work as a place to put local changes?

@ohir

@ohir ohir commented Apr 23, 2021

(Note: comments are related to the proposal document. Please take all "must", "should", "*-not" placed below as prepended with a humble IMO.)

If -workfile is set to off, workspace mode will be disabled. If it is auto (the default), workspace mode will be enabled if a file named go.work is found in the current directory (or any of its parent directories), and disabled otherwise. If -workfile names a path to an existing file that ends in .work, workspace mode will be enabled.
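
The three cases of the proposed -workfile flag can be sketched as a small resolver. This is a hedged illustration, not the go command's code: `resolveWorkFile` and the `exists` callback are hypothetical, and returning an error for a path that is missing or does not end in .work is an assumption, since the quoted text does not say what happens in that case.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// resolveWorkFile returns the workspace file to use, or "" when workspace
// mode is disabled. flag is the value of the proposed -workfile flag and
// exists reports whether a file is present at a path.
func resolveWorkFile(flag, cwd string, exists func(string) bool) (string, error) {
	switch flag {
	case "off":
		return "", nil
	case "", "auto":
		// Check cwd, then each parent directory, for a file named go.work.
		for dir := filepath.Clean(cwd); ; dir = filepath.Dir(dir) {
			if candidate := filepath.Join(dir, "go.work"); exists(candidate) {
				return candidate, nil
			}
			if filepath.Dir(dir) == dir { // reached the top of the tree
				return "", nil
			}
		}
	default:
		// Assumption: anything else must name an existing *.work file.
		if !strings.HasSuffix(flag, ".work") || !exists(flag) {
			return "", fmt.Errorf("-workfile=%q: not an existing .work file", flag)
		}
		return flag, nil
	}
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		fmt.Println("cannot determine working directory:", err)
		return
	}
	onDisk := func(p string) bool {
		info, err := os.Stat(p)
		return err == nil && !info.IsDir()
	}
	file, err := resolveWorkFile("auto", cwd, onDisk)
	switch {
	case err != nil:
		fmt.Println("error:", err)
	case file == "":
		fmt.Println("workspace mode disabled")
	default:
		fmt.Println("workspace mode enabled by", file)
	}
}
```

Passing the existence check as a function keeps the lookup logic separate from the filesystem, which also makes the "go.work two floors above" case easy to reproduce in a test.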

Workspace mode should NOT be turned on without a human demanding so. Proposed defaults with "auto" and go.work presence make for traps if there is go.work present in vcs but we want to build/test in module mode: we may forget to add a flag "workspace=off" using tools by hand. Or there was an uncommitted go.work we forgot to copy, or we checked out an older revision, then we think we continue within the workspace while we now do not. The former trap might bite the most when we'd set to work deep into the tree but go.work still hangs two floors above us: we might think we're testing this module in module mode (oh, no go.work here) while we are testing this module with dependencies replaced by an active go.work directive.

Make intent explicit, either by mandating that the option always be present or by environment.
Allow the user to decide on a "by presence" default, e.g. with GOWORK="."


The directory directive takes an absolute or relative path to a directory containing a go.mod file as an argument.

The go.work file should not take an absolute path in the directory directive, only paths relative to the place of the go.work file. Otherwise having go.work committed to the repository freezes filesystem layout at a go.work mandated state, as "no restriction on where the directory is located" is proposed too. ("No restriction, anywhere" in practice says that after a longer while of hiatus user might be missing pieces of her workspace; with no friendly go get to rescue.)


The replace directives in the go.work are applied in addition to and with higher precedence than replaces in the workspace modules. A replace directive in the go.work file overrides replace directives in workspace modules applying to the same module or module version. If two or more workspace modules replace the same module or module version with different module versions or directories, and there is not an overriding replace in the go.work file, the go command will report an error.
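
The precedence rule quoted above can be sketched as a merge over replacement tables. This is a simplified illustration under stated assumptions: `Replacements` and `mergeReplaces` are hypothetical names, and each key is treated as an opaque string (a module path, or a path plus version), so the interaction between versioned and unversioned replaces of the same module is not modeled.

```go
package main

import (
	"fmt"
	"sort"
)

// Replacements maps a replaced module (or module version) to its replacement.
type Replacements map[string]string

// mergeReplaces combines the replaces of every workspace module with those of
// go.work. A go.work entry always wins. Two workspace modules that disagree
// about a target are an error unless go.work overrides that target.
func mergeReplaces(work Replacements, modules map[string]Replacements) (Replacements, error) {
	names := make([]string, 0, len(modules))
	for name := range modules {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order for error messages

	merged := Replacements{}
	owner := map[string]string{} // workspace module that supplied merged[target]
	for _, name := range names {
		for target, repl := range modules[name] {
			if _, overridden := work[target]; overridden {
				continue // go.work decides; module-level conflicts do not matter
			}
			if prev, ok := merged[target]; ok && prev != repl {
				return nil, fmt.Errorf("conflicting replacements for %s: %q (from %s) and %q (from %s)",
					target, prev, owner[target], repl, name)
			}
			merged[target] = repl
			owner[target] = name
		}
	}
	for target, repl := range work {
		merged[target] = repl
	}
	return merged, nil
}

func main() {
	modules := map[string]Replacements{
		"./baz":   {"golang.org/x/net": "../net"},
		"./tools": {"golang.org/x/net": "/tmp/net"},
	}
	work := Replacements{"golang.org/x/net": "example.com/fork/net v1.4.5"}

	merged, err := mergeReplaces(work, modules)
	fmt.Println(merged, err)

	_, err = mergeReplaces(Replacements{}, modules)
	fmt.Println(err)
}
```

In the first call the go.work replace hides the disagreement between the two modules; in the second call, with no override, the same disagreement is reported as an error.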

To my understanding, the ultimate goal of "workspace" is to produce versioned code for reproducible builds, hence at least the unversioned replacements should be disallowed in go.mods of modules we currently work on in the "workspace". Allowing for two or more sets of replacements overlaid would be a costly mistake.

IOW: either we commit to progress in work with set of go.mod replacements (current state), or we commit to work using workspaces and put all our replacements in go.work. Tools should guard us from the mix.

In my imagination, the "workspace" workflow (ie. workspace=on) should be a simple loop:

  1. make code work;
  2. test locally; if test != ok {; goto 1 };
  3. tag changes and fix go.mods accordingly
  4. push tagged changes upstream for CI to consume from modules proxy/product repo;

After 4 the CI pipeline should use the same code we tagged at 3!

If code to be built with workspace "=on" and "=off" does not converge at 4, we've lost the track of what we're shipping. Tools should somehow prevent that. A bare minimum is to disallow unversioned replacements in go.mods of modules pulled into workspace. The real shield would be to move all replacements from go.mod to go.work — hardly possible now.

@matloob
Contributor Author

@matloob matloob commented Apr 23, 2021

@mvdan, Thanks for the feedback! A few replies:

I still wonder why we need directory (...) to be an explicit list. Using a module in workspace mode that isn't listed in go.work is an error, so why not just make directory implicitly mean all modules under the directory where the go.work file is located? Akin to how go.mod implicitly includes all packages under its directory (with some basic exception rules).

If the answer is "it's too expensive to find all modules every time", I'd love to see some proof of that. I'd also like to see if caching could be a solution, similar to go list ./.... I know you mentioned alternatives like directory ./... or go work init filling in the directory list, but either of those seems unnecessary if we can implicitly do the right thing.

I'm not convinced that the right behavior is to add all the modules under the directory, even if it were free: if we added all the modules under the directory, then it would be much harder (I think impossible?) to build a workspace that includes a module, but not any of the modules contained in subdirectories. So if we added all the modules in a directory we'd have to add another mechanism to remove modules from the workspace. And if we're going to do that it seems to me that the default should be to add a module at a time, and maybe consider if we want to do a directory tree in the future, if it turns out to be necessary.
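
To make the trade-off concrete, here is a minimal sketch of the implicit alternative: treating every go.mod under the workspace root as a workspace module. The names `findModules` and `demoTree` and the in-memory file tree are hypothetical, and this is not how the go command works.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"sort"
	"testing/fstest"
)

// findModules returns every directory in fsys that contains a go.mod file.
func findModules(fsys fs.FS) ([]string, error) {
	var dirs []string
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() && d.Name() == "go.mod" {
			dirs = append(dirs, path.Dir(p))
		}
		return nil
	})
	sort.Strings(dirs)
	return dirs, err
}

// demoTree is an in-memory stand-in for a workspace directory that holds a
// nested module (baz/internal/gen) below another module (baz).
func demoTree() fs.FS {
	return fstest.MapFS{
		"baz/go.mod":              {Data: []byte("module foo.org/bar/baz\n")},
		"baz/internal/gen/go.mod": {Data: []byte("module foo.org/bar/baz/internal/gen\n")},
		"tools/go.mod":            {Data: []byte("module golang.org/x/tools\n")},
		"tools/main.go":           {Data: []byte("package main\n")},
	}
}

func main() {
	mods, err := findModules(demoTree())
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	for _, m := range mods {
		fmt.Println("implicit workspace module:", m)
	}
}
```

The walk reports the nested module alongside its parent, which is the case described above: with implicit discovery, including baz without baz/internal/gen would need a separate exclusion mechanism.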

I think the name of the directory directive is unfortunately misleading, because it does seem to imply that we're adding all the modules under a directory. But it's the best name I could come up with: unfortunately module is already taken by the go.mod file to have a different meaning.

We touched on this in the email list, and you added this section, so as a follow-up - I'm still not convinced that the first version of go.work needs to support doing builds outside of any of the modules. Do we have any particular use cases in mind other than "be more like GOPATH"? For example, before modules I'd never run go build all or go test all from the root of my GOPATH, and I similarly do not expect to do it in workspace mode.

And if I want to build or test one of the modules, I'd want to cd into that module anyway, to make sure I'm using its go.mod file - replace directives, etc.

I'm wondering what we'd get by disabling builds outside any of the modules? It's something that essentially comes for 'free' from the rest of the design because the go.work file already has to explicitly specify its set of directories, and the build list does not depend on the 'current' module. And keeping this in reinforces the point that if you are in the same workspace, cding into the module doesn't change the go.mod file or the set of replace directives that are used.

I don't think that would be a good idea, to be honest. I'm starting to believe that build configurations (e.g. what build tags to use in go build) aren't only useful at the module level, and by extension not only at the workspace level either. Maybe a Go module is entirely portable by default, but one of its packages explicitly targets js/wasm only. The build config should separate the package. Similarly, when working across many modules, it's reasonable for them to fall under different build configurations.

A "build config" also has a purpose, at least when proposed in #39005: for the original author to announce to the users what are the supported ways to build their packages. Without that information, one can only assume that all build tag and GOOS/GOARCH combinations are possible, which quickly explodes exponentially. Since most users would not publish go.work in a VCS repository, the usefulness of a "build config" to give information to users is lost.

All in all, I don't think this proposal should ever aim to solve the "build config" problem.

I think that the workspace file could be useful to allow the developer to specify configuration to tooling, because it is developer specific configuration, but this proposal does not intend to solve the build config problem. It's definitely out of scope for the design. If it's too distracting for the conversation I can remove the reference from the related issues section in the doc.

This one is an interesting thought. This only works if we declare that go.work files should not be committed into VCS, though, and the proposal currently does not say that.

For example, if we're OK with monorepos publishing go.work files, how can that work as a place to put local changes?

The doc does recommend that "most go.work files should exist outside of any repository" and I do think that go.work files should rarely be checked into VCS. (Though it's buried in one paragraph in the doc so maybe I could do a better job of specifying that.)

If a monorepo publishes a go.work file, a user can still put a go.work file in a parent directory of the repository and run the go command from there. This is another one of the benefits of allowing workspace builds outside of any particular module. It allows there to be multiple workspace 'directories' that a particular module can participate in without needing to pass -workfile. Of course a user can also pass in -workfile, either directly or through GOFLAGS.

@matloob
Contributor Author

@matloob matloob commented Apr 23, 2021

@ohir, thanks for the comments!

Workspace mode should NOT be turned on without a human demanding so.

I definitely understand the issue about users not realizing which workspace they're in and accidentally ending up with the wrong build. I think this is most likely to happen if the workspace file is checked into a repository. (Otherwise, the user created the workspace file themselves, and opted-in to workspace mode). Would you agree with that?

I think in general checking in a workspace file to a repository should be rare: because it is really only useful for multi-module repositories which themselves are not the common case, and because it will affect the build configuration for every developer of a module, it's something that should only be done if it fits into the workflow of most developers.

The go.work file should not take absolute path in the directory directive — only paths relative to the place of the go.work file.

Do you think this would be a problem for go.work files that are not checked into a repository? Of course, I agree that absolute paths should not show up in a go.work file that's checked into a repository (for that matter, there shouldn't be any relative paths that point outside the repository). If such a change was committed, I expect that it would be quickly reverted because it likely wouldn't work for any other developers than the one who originally committed it!

Here is a case for absolute paths: some editors might want to make temporary workspace files because a user just opened two files in different modules. In that case the workspace file would be put in a temporary directory, and it would make more sense to add absolute paths rather than trying to construct a relative path from the temporary directory to the location of the module. And on Windows, I believe it's not possible to construct relative paths across volumes, so the editor would have to attempt to make a temporary directory on the same volume, something that might not be easy to do.
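
A minimal sketch of the choice an editor would face when writing such a temporary workspace file: prefer a relative path and fall back to the absolute one when no relative path can be constructed. `directoryArg` is a hypothetical helper. `filepath.Rel` returns an error when the target cannot be made relative to the base, which is what happens on Windows when the two paths are on different volumes.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// directoryArg returns the path to write in a directory directive of a
// go.work file stored in workDir. It prefers a path relative to workDir and
// falls back to moduleDir itself when no relative path exists.
func directoryArg(workDir, moduleDir string) string {
	rel, err := filepath.Rel(workDir, moduleDir)
	if err != nil {
		return moduleDir
	}
	return rel
}

func main() {
	// A temporary workspace directory and a module elsewhere on disk.
	fmt.Println(directoryArg("/tmp/editor-ws", "/home/u/src/baz"))
}
```

Even when a relative path exists, a chain of `..` segments from a temporary directory is fragile, which is part of the argument for allowing absolute paths in files that are never checked in.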

I wonder if this is something an outside tool can catch, like a sort of workspace vet command?

To my understanding, the ultimate goal of "workspace" is to produce versioned code for reproducible builds, hence at least the unversioned replacements should be disallowed in go.mods of modules we currently work on in the "workspace". Allowing for two or more sets of replacements overlaid would be a costly mistake.

Oh! I think the doc must not have been clear: the ultimate goal is not to produce versioned code for reproducible builds, but to assist in workflows on multiple modules. To get the reproducible build you still need a single go.mod file that a user will build from. I wonder if you have ideas for how to make the intention more clear in the document.

Because replacements are made for developing modules, I think users would be surprised to not have them in workspaces. I think it's a bit unfortunate that they need to be supported in workspaces, but it's hard for me to see a better solution.

If code to be built with workspace "=on" and "=off" does not converge at 4, we've lost the track of what we're shipping. Tools should somehow prevent that. A bare minimum is to disallow unversioned replacements in go.mods of modules pulled into workspace. The real shield would be to move all replacements from go.mod to go.work — hardly possible now.

I think having replaces in multiple places is a good idea, but (unless I'm misunderstanding) this seems to encourage users to check in go.work files in single module workspaces, which could impede users setting up their own workspaces (see my reply to @mvdan )

@mvdan
Member

@mvdan mvdan commented Apr 23, 2021

if we added all the modules under the directory, then it would be much harder (I think impossible?) to build a workspace that includes a module, but not any of the modules contained in subdirectories

I see. I can't say I can think of any scenario where I'd need this, but I've also avoided nested Go modules at any cost.

it's the best name I could come up with

How about include, import, or just modules?

I'm wondering what we'd get by disabling builds outside any of the modules?

I'm looking at this from the other side; why should we support this edge case right away if we don't have a strong use case for it? Doing builds inside the workspace but outside any module is something we could add at a later time if users really ask for it, but we wouldn't be able to take it back if we later find out it's confusing or unnecessary.

I think that the workspace file could be useful to allow the developer to specify configuration to tooling, because it is developer specific configuration, but this proposal does not intend to solve the build config problem. It's definitely out of scope for the design. If it's too distracting for the conversation I can remove the reference from the related issues section in the doc.

I'd remove that section, personally. Nowadays I tend to think that build config shouldn't need to be attached at the module level, and the same applies to the workspace level.

The doc does recommend that "most go.work files should exist outside of any repository" and I do think that go.work files should rarely be checked into VCS. (Though it's buried in one paragraph in the doc so maybe I could do a better job of specifying that.)

I think making that recommendation clearer is a good idea, as it's probably one of the first questions users will have.

If a monorepo publishes a go.work file, a user can still put a go.work file in a parent directory of the repository and run the go command from there. This is another one of the benefits of allowing workspace builds outside of any particular module. It allows there to be multiple workspace 'directories' that a particular module can participate in without needing to pass -workfile. Of course a user can also pass in -workfile, either directly or through GOFLAGS.

Hmm, I find that recommendation for users to be a tad weird. We're introducing nested workspaces and builds outside any module, all for the sake of being able to satisfy the use case of local-only replace directives. It feels like a bit of a stretch :)

@neild
Contributor

@neild neild commented Apr 24, 2021

The presence of a go.work file in the working directory or a containing directory will put the go command into workspace mode.

Does this imply that every go command invocation which reads a go.mod needs to check for go.work in every directory up to the fs root?

I’ve worked on systems with slow network filesystems where that would be expensive.

@matloob
Contributor Author

@matloob matloob commented Apr 27, 2021

it's the best name I could come up with

How about include, import, or just modules?

include could work. Added it as an option in the "Open issues" section

I'm wondering what we'd get by disabling builds outside any of the modules?

I'm looking at this from the other side; why should we support this edge case right away if we don't have a strong use case for it? Doing builds inside the workspace but outside any module is something we could add at a later time if users really ask for it, but we wouldn't be able to take it back if we later find out it's confusing or unnecessary.

I really don't think of this as an edge case, it's just a simple way to determine how to set the build list: "Prefer workspace mode (ie search for a go.work), if found use its build list. If not found, use module mode (ie search for a go.mod), and use its build list". I think we'd need to have a stronger reason to add the extra case of making an extra rule to disable it, for example cases that it could become confusing for.

I think that the workspace file could be useful to allow the developer to specify configuration to tooling, because it is developer specific configuration, but this proposal does not intend to solve the build config problem. It's definitely out of scope for the design. If it's too distracting for the conversation I can remove the reference from the related issues section in the doc.

I'd remove that section, personally. Nowadays I tend to think that build config shouldn't need to be attached at the module level, and the same applies to the workspace level.

Got it. I'll start by making it much more clear that we have no intention of addressing that use case in this proposal or extending go.work in the future to address it.

The doc does recommend that "most go.work files should exist outside of any repository" and I do think that go.work files should rarely be checked into VCS. (Though it's buried in one paragraph in the doc so maybe I could do a better job of specifying that.)

I think making that recommendation clearer is a good idea, as it's probably one of the first questions users will have.

Done, in the latest revision, and strengthened it to advocate against checking in go.work in vcs period.

If a monorepo publishes a go.work file, a user can still put a go.work file in a parent directory of the repository and run the go command from there. This is another one of the benefits of allowing workspace builds outside of any particular module. It allows there to be multiple workspace 'directories' that a particular module can participate in without needing to pass -workfile. Of course a user can also pass in -workfile, either directly or through GOFLAGS.

Hmm, I find that recommendation for users to be a tad weird. We're introducing nested workspaces and builds outside any module, all for the sake of being able to satisfy the use case of local-only replace directives. It feels like a bit of a stretch :)

Thinking about this much more in light of your comments and @ohir's I think recommending go.work files to be checked in to repos was a mistake so I'm going back and uniformly recommending against it.

@meling

@meling meling commented Apr 27, 2021

Caveat: I've skimmed the proposal but haven't really tried to understand all the nuances here, so this may well be too simplistic.

My suggestion would be to try really, really hard to do this without introducing the go.work file. Here is an idea that maybe could work, using only go.mod with some minor additions.

Here workspace is a magic module name that the go command can interpret in a similar way to the go.work file.

module workspace

go 1.18

require (
	./baz	foo.org/bar/baz
	./tools	golang.org/x/tools
)

If the go.mod file for some module has require entries with a local path, these would be used; I guess this would be similar to a replace entry.

module github.com/relab/gorums

go 1.18

require (
	./baz	foo.org/bar/baz
	./tools	golang.org/x/tools
	google.golang.org/grpc v1.36.1
	google.golang.org/protobuf v1.26.0
)

As I mentioned at the top, I don't know if this would work for all scenarios, but I would strongly encourage finding a simple design.

Obviously, I understand adding another field to the require entries makes parsing a bit more challenging. Still, I think there are reasonable solutions to this, e.g., paths must start with ./ or ../ or /. Alternatively, one could add a => between the path and the module name, if necessary.

@matloob
Contributor Author

@matloob matloob commented Apr 27, 2021

The presence of a go.work file in the working directory or a containing directory will put the go command into workspace mode.

Does this imply that every go command invocation which reads a go.mod needs to check for go.work in every directory up to the fs root?

I’ve worked on systems with slow network filesystems where that would be expensive.

I want to understand this a bit better: my assumption for these cases is that most builds (or lists, or other go command operations that need the info in a go.mod) will need to open many files and directories, and the lookups done for go.work wouldn't dominate. Is that not the case? Are there operations or filesystem layouts that would be particularly bad here?

@neild
Contributor

@neild neild commented Apr 27, 2021

I want to understand this a bit better: my assumption for these cases is that most builds (or lists, or other go command operations that need the info in a go.mod) will need to open many files and directories, and the lookups done for go.work wouldn't dominate. Is that not the case? Are there operations or filesystem layouts that would be particularly bad here?

I've been on systems where stats on files under $HOME were reasonably fast, but $HOME/.. was slow due to being a network automount. I recall AFS as being one offender here, although it's been quite a while and perhaps I'm unfairly maligning it.

Perhaps this isn't a case that matters any more these days; multiuser systems are certainly out of vogue.

@matloob
Contributor Author

@matloob matloob commented Apr 27, 2021

@meling I'd like to understand better why you'd like to avoid introducing the go.work file? I think making a kind of magic go.mod file instead could be more confusing and make things more complicated.

I don't totally understand the second example: what does it mean when you have both types of requires in a 'regular' module?

@mvdan
Member

@mvdan mvdan commented Apr 27, 2021

I really don't think of this as an edge case, it's just a simple way to determine how to set the build list: "Prefer workspace mode (ie search for a go.work), if found use its build list. If not found, use module mode (ie search for a go.mod), and use its build list".

Intuitively, that's not how I thought about it. Should workspace mode really kick in if there's a go.work file somewhere in the parent directory chain, but there are absolutely no go.mod files anywhere on the filesystem?

If you think about go.work as a replacement to GOPATH, then it seems reasonable to want to use workspace mode outside any module. But if you think of it as a way to develop multiple modules at once, then it still doesn't make sense to me. From my point of view, the special case is to allow this extra mode of operation, not to disallow it :)

strengthened it to advocate against checking in go.work in vcs period.

Just thinking out loud: this does make the story for monorepos worse, because in those cases it can be easier for developers to share a go.work file by committing it at the root of the VCS repository. It seems like there might be intended use cases here which are at odds with each other.

@jimmyfrasche
Member

@jimmyfrasche jimmyfrasche commented Apr 27, 2021

Maybe there could be an env var that lets you specify a list of directories and the go command treats those directories as the root when performing searches up the directory tree for go.work or go.mod or anything else that comes up in the future. It seems like a general setting and might deserve its own thread, even if this proposal is what would cause it to be needed more than it is now.
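
To make the suggestion above concrete, here is a minimal sketch of an upward search that stops at a configured set of directories. It is only an illustration of the idea, not anything the go command does: the variable name GOSEARCHROOTS and the function findUpBounded are hypothetical, and the sketch assumes a listed root directory is itself still searched before the walk stops.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findUpBounded walks from dir toward the filesystem root looking for a file
// called name. It stops after checking any directory listed in roots, so
// nothing above a root is ever consulted. The exists callback reports whether
// a path exists, which keeps the search logic independent of the real disk.
func findUpBounded(dir, name string, roots []string, exists func(string) bool) (string, bool) {
	stop := make(map[string]bool, len(roots))
	for _, r := range roots {
		stop[filepath.Clean(r)] = true
	}
	dir = filepath.Clean(dir)
	for {
		if exists(filepath.Join(dir, name)) {
			return dir, true
		}
		parent := filepath.Dir(dir)
		if stop[dir] || parent == dir {
			// Either a configured root or the filesystem root was reached.
			return "", false
		}
		dir = parent
	}
}

func main() {
	// GOSEARCHROOTS is a made-up variable name used only for this sketch.
	roots := filepath.SplitList(os.Getenv("GOSEARCHROOTS"))
	onDisk := func(p string) bool {
		_, err := os.Stat(p)
		return err == nil
	}
	wd, err := os.Getwd()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if dir, ok := findUpBounded(wd, "go.work", roots, onDisk); ok {
		fmt.Println("found go.work in", dir)
	} else {
		fmt.Println("no go.work found below the configured roots")
	}
}
```

With $HOME listed as a root, the search would never stat anything in $HOME/.., which is the slow-automount case mentioned earlier in the thread.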

@matloob
Contributor Author

@matloob matloob commented Apr 27, 2021

I really don't think of this as an edge case, it's just a simple way to determine how to set the build list: "Prefer workspace mode (ie search for a go.work), if found use its build list. If not found, use module mode (ie search for a go.mod), and use its build list".

Intuitively, that's not how I thought about it. Should workspace mode really kick in if there's a go.work file somewhere in the parent directory chain, but there are absolutely no go.mod files anywhere on the filesystem?

If there are no go.mod files on the filesystem, then the workspace must be empty or invalid (because every directory directive must point to a directory with a go.mod file). If that's the case, the only operations that make sense from the go command are those that apply "globally", such as go install module@version.

More generally: the rules are similar to what go.mod does. There could be alternative ways to decide how to enter a "single build list for multiple local modules" mode, but it doesn't seem necessary for such a mechanism to only work when the current directory is contained in one of the component modules.
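
The lookup rule quoted above ("search for a go.work; if not found, search for a go.mod") can be sketched roughly as follows. This is an illustration only, not the cmd/go implementation: the names findUp and selectMode are hypothetical, and the sketch ignores the proposed -workfile flag and any way of turning workspace mode off.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findUp walks from dir toward the filesystem root and returns the first
// directory that contains a file called name. The exists callback reports
// whether a path exists, which keeps the search logic testable.
func findUp(dir, name string, exists func(string) bool) (string, bool) {
	dir = filepath.Clean(dir)
	for {
		if exists(filepath.Join(dir, name)) {
			return dir, true
		}
		parent := filepath.Dir(dir)
		if parent == dir { // reached the root
			return "", false
		}
		dir = parent
	}
}

// selectMode prefers workspace mode (a go.work in dir or a parent) and falls
// back to module mode (a go.mod in dir or a parent). It returns the mode name
// and the directory holding the file that selected it.
func selectMode(dir string, exists func(string) bool) (string, string) {
	if r, ok := findUp(dir, "go.work", exists); ok {
		return "workspace", r
	}
	if r, ok := findUp(dir, "go.mod", exists); ok {
		return "module", r
	}
	return "none", ""
}

func main() {
	onDisk := func(p string) bool {
		info, err := os.Stat(p)
		return err == nil && !info.IsDir()
	}
	wd, err := os.Getwd()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	mode, root := selectMode(wd, onDisk)
	fmt.Printf("mode=%s root=%s\n", mode, root)
}
```

Note that selectMode never asks whether the starting directory is inside one of the workspace's modules, which is the behavior being debated here.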

If you think about go.work as a replacement to GOPATH, then it seems reasonable to want to use workspace mode outside any module. But if you think of it as a way to develop multiple modules at once, then it still doesn't make sense to me. From my point of view, the special case is to allow this extra mode of operation, not to disallow it :)

It's true that multiple modules can still be developed at once even if we add the restriction, so that it doesn't solely address the problem of developing multiple modules at once. But it's a smaller set of rules for a user to remember ("when I'm under this go.work file" rather than "when I'm under this go.work file, and contained in one of its modules"). I think a lot of users would think of it as an arbitrary restriction if they got the error message saying that it's not supported outside of a module's directory.

I'm interested in what cases this could lead to user confusion or other errors that we should be worried about.

strengthened it to advocate against checking in go.work in vcs period.

Just thinking out loud: this does make the story for monorepos worse, because in those cases it can be easier for developers to share a go.work file by committing it at the root of the VCS repository. It seems like there might be intended use cases here which are at odds with each other.

One future option could be to make an external tool that produces a go.work file listing all the modules recursively in a directory to easily link together a repo.

@matloob
Contributor Author

@matloob matloob commented Apr 27, 2021

Maybe there could be an env var that lets you specify a list of directories and the go command treats those directories as the root when performing searches up the directory tree for go.work or go.mod or anything else that comes up in the future. It seems like a general setting and might deserve its own thread, even if this proposal is what would cause it to be needed more than it is now.

Yeah I think that would be a reasonable option if that ends up being a problem, though I hope that this proposal isn't the reason why we would need it.

@myitcv
Member

@myitcv myitcv commented Apr 27, 2021

so I'm going back and uniformly recommending against it.

I'm somewhat confused 😄 . Because it was my understanding that the workflow of being able to check in a go.work file that would then be shared amongst developers of a multi-module repo, e.g. x/tools, was a fundamental use case we were looking to cover here. Are we now saying that's not a valid use case? And that instead, each developer needs to create such a file themselves?

I might well be alone, but I've somewhat lost track of the original use cases driving this proposal, and therefore the extent to which this proposal satisfies those problems/concerns. In particular, the UX before and after, who is expected to create go.work files in a given use case and when, is the go.work file added as the workspace to the editor, what further in-editor configuration is necessary, when/where -workfile needs to be specified on the command line etc.

@matloob
Contributor Author

@matloob matloob commented Apr 28, 2021

Because it was my understanding that the work flow of being able to check in a go.work file that would then be shared amongst developers of a multi-module repo, e.g. x/tools was a fundamental use case we were looking to cover here.

To confirm, this is the workflow of someone wanting to work across modules in a multi-module workspace, right? If so, the proposal definitely aims to address this use case. We could create a tool to create go.work from the modules recursively contained in a directory. Like a more robust version of this:

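#!/bin/sh
# Print a go.work file listing every module found under the directory "$1".
# Modules whose path contains "testdata" are skipped.
find "$1" -name go.mod | awk '
        BEGIN { print "go 1.17\n\ndirectory (" }
  !/testdata/ { printf "\t./%s\n", substr($0, 1, length($0) - length("/go.mod")) }
          END { print ")" }
'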

Then someone working in a module like tools would run multimodulework tools when they clone the tools repo and they'd have a go.work file, outside the repo, that links together the tools and gopls modules. Would that address the use case?
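
A "more robust version" of such a generator might look like the following Go sketch. It is not an existing tool: the function names findModules and renderGoWork are hypothetical, the output uses the draft directory directive as written in this thread, and the Go version is hard-coded to 1.17 to match the script. It assumes the root argument is a path relative to where the go.work file will live.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// findModules returns every directory under root that contains a go.mod
// file, as slash-separated paths that keep the root prefix. Directories
// named testdata are skipped entirely.
func findModules(root string) ([]string, error) {
	var dirs []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() && d.Name() == "testdata" {
			return filepath.SkipDir
		}
		if !d.IsDir() && d.Name() == "go.mod" {
			dirs = append(dirs, filepath.ToSlash(filepath.Dir(path)))
		}
		return nil
	})
	return dirs, err
}

// renderGoWork formats a go.work file listing dirs in sorted order.
// Plain relative paths get a "./" prefix; ".", "../..." and absolute
// paths are written unchanged.
func renderGoWork(goVersion string, dirs []string) string {
	sorted := append([]string(nil), dirs...)
	sort.Strings(sorted)
	var b strings.Builder
	fmt.Fprintf(&b, "go %s\n\ndirectory (\n", goVersion)
	for _, d := range sorted {
		if d != "." && !strings.HasPrefix(d, "/") && !strings.HasPrefix(d, "../") {
			d = "./" + d
		}
		fmt.Fprintf(&b, "\t%s\n", d)
	}
	b.WriteString(")\n")
	return b.String()
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	dirs, err := findModules(root)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(renderGoWork("1.17", dirs))
}
```

Run from the parent of a tools checkout with the argument tools, and with the output redirected to go.work, this would produce a workspace file outside the repo, as described above.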

@myitcv
Member

@myitcv myitcv commented Apr 28, 2021

We could create a tool to create go.work from the modules recursively contained in a directory.

Indeed, but that impacts the overall UX of the solution. I'm not arguing for/against such a tool, but to my mind the ultimate workflow needs to be clear to anyone reading this proposal. Not least because this is (I think?) the first mention of such a tool, and I assume it's beyond the scope of this proposal?

Would that address the use case?

I think this is a question for those folks who have to deal with this workflow. Is the fact that every developer now has to create a go.work file themselves using a tool they need to install, updating that file periodically, a less painful approach than the current workflow?

@meling

@meling meling commented Apr 28, 2021

@meling I'd like to understand better why you'd like to avoid introducing the go.work file? I think making a kind of magic go.mod file instead could be more confusing and make things more complicated.

I prefer fewer file types in my folders, and having to examine two different file types for dependencies and replace directives, etc seems unnecessary, when there is already a file type for the Go module system.

As for the "magic" workspace go.mod file, maybe my suggestion was too magic since it would change the interpretation of the require entries. An alternative format could instead be:

module workspace

go 1.18

workspace (
	./baz	foo.org/bar/baz
	./tools	golang.org/x/tools
)

Obviously, the workspace keyword isn't important; it can be something else, e.g. modules as proposed above.

In fact, an alternative format could reuse the replace directive:

module workspace

go 1.18

replace (
	foo.org/bar/baz => ./baz
	golang.org/x/tools => ./tools
)

PS: I'm assuming that a workspace module can be specified without any require entries in "workspace" mode.

I don't totally understand the second example: what does it mean when you have both types of requires in a 'regular' module?

My idea was that entries with path names could serve the same role as replace directives; you wouldn't need to specify replace directives if you specified a path name in the require directive. It isn't important since there is already a way to do it with the replace directive, but as you point out in the proposal, working with replace directives can often be awkward.

@matloob
Contributor Author

@matloob matloob commented Apr 28, 2021

@myitcv Got it. I've updated the doc to mention that an external tool can potentially help with the creation of go.work files for multi-module repos.

@meling A go.work file has fundamentally different semantics than a go.mod file. A go.mod provides the definition of a single module, while go.work links together several modules. Putting these specifications in different files helps better delineate the differences.

@rsc
Contributor

@rsc rsc commented Apr 28, 2021

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rhcarvalho
Contributor

@rhcarvalho rhcarvalho commented Apr 28, 2021

@matloob thanks for fleshing out the proposal.

#45713 (comment)

The replace would override any replaces
in the main modules' go.work files.

Wondering if this might be a typo, I didn't find clarification in the other comments -- did you mean "override any replaces in the main modules' go.mod files"?

@matloob
Contributor Author

@matloob matloob commented Apr 29, 2021

@rhcarvalho Ooh yeah it's absolutely a typo! Edited the comment to fix! Thanks!

@meling

@meling meling commented Apr 30, 2021

@meling A go.work file has fundamentally different semantics than a go.mod file. A go.mod provides the definition of a single module, while go.work links together several modules. Putting these specifications in different files helps better delineate the differences.

Well, I have to disagree that the semantics between go.work and go.mod are fundamentally different. They are quite similar in their objectives. Further, it is quite common that a single file type can encode different semantics. For instance, the go.mod file already supports different "semantics," such as replace and exclude.

@zevdg
Contributor

@zevdg zevdg commented May 1, 2021

I still wonder why we need directory (...) to be an explicit list. Using a module in workspace mode that isn't listed in go.work is an error, so why not just make directory implicitly mean all modules under the directory where the go.work file is located? Akin to how go.mod implicitly includes all packages under its directory (with some basic exception rules).

If the answer is "it's too expensive to find all modules every time", I'd love to see some proof of that. I'd also like to see if caching could be a solution, similar to go list ./.... I know you mentioned alternatives like directory ./... or go work init filling in the directory list, but either of those seems unnecessary if we can implicitly do the right thing.

I'm not convinced that the right behavior is to add all the modules under the directory, even if it were free: if we added all the modules under the directory, then it would be much harder (I think impossible?) to build a workspace that includes a module, but not any of the modules contained in subdirectories. So if we added all the modules in a directory we'd have to add another mechanism to remove modules from the workspace. And if we're going to do that it seems to me that the default should be to add a module at a time, and maybe consider if we want to do a directory tree in the future, if it turns out to be necessary.

In discouraging multi-module projects from checking in go.work, I think you inadvertently made the case for a directory tree feature much stronger. Assuming there is a tool that walks a directory tree and generates a go.work file that includes all its submodules, in order to stay up to date, it would need to be re-run after any sub-module was added or removed. If the go.work file is checked in, then only the person who actually added or removed the module needs to make the change. If it's not checked in, then every contributor would have to re-run the command to update their workspace. To be fair, modules probably aren't added or removed from multi-module repos too often, but if I want a reliable workflow for a multi-module repo, I'd need to defensively re-run the go.work generator command after every pull from upstream and every branch switch just in case someone else added or removed a module in that update or branch.

However, if something like

go 1.17

directory ./...

were valid and added all submodules under go.work to the workspace, then contributors' workspaces would include newly added modules without needing to either constantly regenerate OR check in go.work.

To state it more generally, if we do commit to discouraging multi-module projects from checking in go.work, then we'd want to ensure both

  • A) that it's easy for new contributors to generate initial go.work files and
  • B) that those go.work files only need to be updated in exceedingly rare cases since project maintainers don't have a good way to signal to all their contributors that their workspaces need to be updated.

If we ignore these 2 problems, then multi-module repo owners will simply ignore our advice to not check in go.work.

Problem A) is easy. As you've said, this could be done by an external tool, but it's something that go work init should handle, at least in the common cases IMO. Problem B) is the harder part and fundamentally impossible without a crystal ball. As you noted, some projects may want to specifically exclude certain modules, so although allowing directory ./... insulates users from needing to update their go.work file in the case of a newly added module that should be included in their workspace, it doesn't insulate them from the case of a newly added module in the repo that shouldn't be included in their workspace. That said, accidentally including a new unnecessary module in the workspace seems both rarer and less likely to cause problems than not including a new module that contributors would expect to be included. Given that, maybe it's ok to not handle this case cleanly.

Long story short, there's enough incidental complexity arising from the decision to discourage multi-module repos from checking in their go.work that it may be worth revisiting that recommendation.

Disclaimer: This opinion is not backed by any hard data.
I have a gut feeling that the minimal go.work file above is a special case in that it's exactly what several users seem to want, and it's also the behavior that many multi-module repo owners would encourage their contributors to use. If all the modules in a repo are tightly coupled and released together (which I suspect is one of the main reasons people group multiple modules in the same repository), then it is IMO obviously the most intuitive configuration for working on the project. If my suspicions here are correct, then it would make sense for the generation of that exact go.work (or one that behaves like it) to be a Critical User Journey for this feature, and not something delegated to an external tool.

@rsc
Contributor

@rsc rsc commented May 5, 2021

The ./... is too expensive. It incurs an unbounded large amount of file walking just to get the go command started. I don't need to prove that with measurements. It won't scale, and we want Go tools to scale.

@matloob
Contributor Author

@matloob matloob commented May 5, 2021

@zevdg I agree with Russ that ./... is too expensive. But I don't think it's necessary either. Of the two issues you listed

  • A) that it's easy for new contributors to generate initial go.work files and
  • B) that those go.work files only need to be updated in exceedingly rare cases since project maintainers don't have a good way to signal to all their contributors that their workspaces need to be updated.

We agree that A is relatively easy, so that leaves B.

Splitting out a "submodule" from another module should be a rare operation. It often causes confusion and other workflow issues (for instance, updating the version dependencies between the submodules to correctly pick up changes in the modules). And in the cases in which a module does need to be split out, there are safeguards to prevent confusion. As specified in the design document, the go command will error out when it is run from a module directory if the module is not in the go.work file. That requires the user to explicitly turn off workspace mode or add the module to their workspace. And it would be easy for tooling like gopls to show users a warning if they've opened a file that doesn't belong to the current workspace.
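
The safeguard described above (an error when the current module is not listed in go.work) could be approximated by a check like the one below. This is a sketch under assumptions, not the go command's logic: checkModule is a hypothetical name, the error text is invented, and the sketch compares cleaned paths only, without resolving symlinks.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// checkModule returns an error when moduleDir is not one of the directories
// listed in the workspace. workDir is the directory holding go.work, and
// relative entries are interpreted relative to it.
func checkModule(workDir string, entries []string, moduleDir string) error {
	target := filepath.Clean(moduleDir)
	for _, e := range entries {
		if !filepath.IsAbs(e) {
			e = filepath.Join(workDir, e)
		}
		if filepath.Clean(e) == target {
			return nil
		}
	}
	return fmt.Errorf("module directory %s is not listed in %s",
		target, filepath.Join(workDir, "go.work"))
}

func main() {
	entries := []string{"./tools", "./tools/gopls"}

	// A module that is part of the workspace passes the check.
	if err := checkModule("/work", entries, "/work/tools/gopls"); err == nil {
		fmt.Println("tools/gopls: in workspace")
	}

	// A newly split-out module that nobody added to go.work is reported.
	if err := checkModule("/work", entries, "/work/tools/newmod"); err != nil {
		fmt.Println("error:", err)
	}
}
```

An editor integration could run the same kind of check when a file is opened and show a warning instead of failing.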

So I think the cases that all module developers will want to update their go.work should be very rare, and when it does happen, we can have safeguards to catch and help users.

@hherman1

@hherman1 hherman1 commented May 6, 2021

I've only skimmed the discussion, so my apologies. I just wanted to say that my first impression is that I'm sad I will have to learn about a new go toolchain file that will often be seen in my directories, because it makes it harder for me to understand the whole world. It may still be worth adding, but I think it's important to consider the cost. I'm very fond of how simple the go build tools/API is, especially compared to gradle, which I spend much more time with, and I hope it stays that way!
