proposal: cmd/go: add global ignore mechanism for Go tooling ecosystem #42965

Open
burdiyan opened this issue Dec 3, 2020 · 36 comments

@burdiyan

burdiyan commented Dec 3, 2020

Problem

For non-trivial (often multi-language) projects it's often desirable to make all the Go tools (including gopls) ignore certain directories.

Examples include the huge number of files within node_modules, or the bazel-* directories generated by Bazel. These make many operations with ./... wildcards take longer than desired. Also, gopls often eats up a lot of CPU in VS Code, depending on what you are doing.

Prior Art

This is something that has been discussed in several issues before, but it seems people couldn't agree on a solution.

Some tools have grown their own solutions, which causes fragmentation and is cumbersome.

For example, goimports has its own machinery for this, the .goimportsignore file. But it does not work with Go modules.

Other tools have a hard-coded list of directories to ignore, like .git and so on.

It seems like having a global solution that all the Go ecosystem could understand would make sense to solve this kind of problem.

Recently a workaround for this was to place a dummy go.mod file in the directories you wanted to ignore. But this is not easily portable between users of the project, because often these directories can be re-created on the user's machine and aren't even checked-in. Asking people to sprinkle some go.mod files all around every time is cumbersome.

@robpike was against creating more dot files (#30058 (comment)).

Proposed Solution

Here are some options for implementing this.

  1. Use go.mod file for specifying directories to ignore. (Rejected because go.mod is not a catch-all config file like package.json in NodeJS).
  2. Use a separate .goignore file. (This would go against Rob's desire to avoid new dot files. Although it is in the spirit of other tools (.dockerignore, .gitignore, .bazelignore, etc.), it raised concerns, which are discussed in this thread.)
  3. Use the go.work file that's coming in the upcoming Go 1.18 release.
  4. Have a separate go.ignore file that would specify directories to ignore.

/cc @tj @stamblerre

@gopherbot gopherbot added this to the Proposal milestone Dec 3, 2020
@mvdan
Member

mvdan commented Dec 3, 2020

But this is not easily portable between users of the project, because often these directories can be re-created on the user's machine and aren't even checked-in. Asking people to sprinkle some go.mod files all around every time is cumbersome.

I'm not sure that I understand this argument. Presumably, it's a program that creates and fills those directories, since they have to contain a significant amount of files for you to really want to ignore them in Go. If they were just a handful of files created manually by a human, it would be a negligible cost for Go to walk those and realise there are no Go packages there.

So, given that it is a program or script creating those large directories, why not add a touch ${dir}/go.mod at the end? That seems easy enough at a high level, at least.

I'm proposing to add this configuration into the existing go.mod file.

This is unlikely to happen, see #42343 (comment).

Another solution could be a global .goignore file. This would go against Rob's desire to avoid new dot files, but would be in the spirit of other tools that have files like .dockerignore, .gitignore, .bazelignore, etc.

I have to admit that I dislike this option. It's bad enough that all these other tools use separate ignore files.

@burdiyan
Author

burdiyan commented Dec 3, 2020

I'm not sure that I understand this argument. Presumably, it's a program that creates and fills those directories, since they have to contain a significant amount of files for you to really want to ignore them in Go. If they were just a handful of files created manually by a human, it would be a negligible cost for Go to walk those and realise there are no Go packages there.

So, given that it is a program or script creating those large directories, why not add a touch ${dir}/go.mod at the end? That seems easy enough at a high level, at least.

@mvdan It is indeed a program that creates these directories. But it's a program that you don't control normally. Wrapping well-known tools like npm install with your own script only to put an empty go.mod in there doesn't seem right.

On the other hand by placing arbitrary files in these directories you're invading the territory of other tools. What if that program checks the integrity of the directory and would break seeing a random unknown file? It's not the case with node_modules but breaking into structures created by other programs, only to work around your own problem doesn't seem right either.

I understand the objection about go.mod. I was not aware of @rsc's statement.

I have to admit that I dislike this option. It's bad enough that all these other tools use separate ignore files.

Could you elaborate on why you think it's bad? It may not be the most elegant solution, but it's common practice, well-understood and somewhat expected.

If we already have .goimportsignore, why not standardize it into something that can be handled and understood by the whole ecosystem of Go tools?

@bcmills
Member

bcmills commented Dec 3, 2020

It is indeed a program that creates these directories. But it's a program that you don't control normally. Wrapping well-known tools like npm install with your own script only to put an empty go.mod in there doesn't seem right.

Wouldn't you need to wrap the tool to add a .goignore file anyway? (Given that you need to inject a file, why does it matter whether it is named .goignore or go.mod?)

@burdiyan
Author

burdiyan commented Dec 3, 2020

@bcmills My proposal is to add a file in the root of the project, not in the directory being ignored. So it would be checked in. Like .gitignore in Git. Basically the idea is to list the paths to ignore in that file, and check it in.

@bcmills
Member

bcmills commented Dec 3, 2020

...ok? But why would you not also check in the injected go.mod files?

@mvdan
Member

mvdan commented Dec 3, 2020

Could you elaborate on why you think it's bad? It may not be the most elegant solution, but it's common practice, well-understood and somewhat expected.

The common practice is to litter repositories with dot files. That does not mean we should do the same, making the problem worse :) Go already has multiple mechanisms to ignore entire directories (. or _ prefixes, and dropping empty go.mod files), so there needs to be a really good reason to add another method.
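
As a rough illustration of the existing mechanisms mentioned above, the following Go sketch approximates which directories a ./... style walk still visits. It is not the go command's actual implementation, and the function names (skippedByName, isNestedModule, walkedDirs) are made up for this example. It also skips testdata, which the go command documents alongside the . and _ prefixes, and it leaves out other rules such as vendor handling.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// skippedByName reports whether a directory with this base name is
// passed over when a "..." wildcard is expanded.
func skippedByName(name string) bool {
	return strings.HasPrefix(name, ".") ||
		strings.HasPrefix(name, "_") ||
		name == "testdata"
}

// isNestedModule reports whether dir has its own go.mod file, which
// makes it a separate module that ./... in the parent does not enter.
func isNestedModule(dir string) bool {
	_, err := os.Stat(filepath.Join(dir, "go.mod"))
	return err == nil
}

// walkedDirs lists the directories under root that are still visited
// after applying the two rules above. The root itself is always kept.
func walkedDirs(root string) ([]string, error) {
	var dirs []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			return nil
		}
		if path != root && (skippedByName(d.Name()) || isNestedModule(path)) {
			return filepath.SkipDir
		}
		dirs = append(dirs, path)
		return nil
	})
	return dirs, err
}

func main() {
	dirs, err := walkedDirs(".")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, d := range dirs {
		fmt.Println(d)
	}
}
```

Under these rules node_modules and bazel-* directories are still walked, since neither name starts with . or _, which is the gap this proposal is about.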

@burdiyan
Author

burdiyan commented Dec 3, 2020

...ok? But why would you not also check in the injected go.mod files?

@bcmills because often directories to ignore aren't checked in.

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Dec 3, 2020
@burdiyan
Author

burdiyan commented Dec 4, 2020

The common practice is to litter repositories with dot files. That does not mean we should do the same, making the problem worse :) Go already has multiple mechanisms to ignore entire directories (. or _ prefixes, and dropping empty go.mod files), so there needs to be a really good reason to add another method.

@mvdan IMHO, having a dot file in one place, that is trackable, is less of an evil, than sprinkling empty go.mod files all over the place, ad-hoc, and breaking into opinions of other tools.

Thinking about the pros and cons of implementing such a feature, I'm struggling to see any cons (probably due to my ignorance), besides having to spend the time to implement it. I'd appreciate it if anyone could shed some light on this so I can understand the implications.

@psigen

psigen commented Mar 19, 2021

I have a use case where I have a multi-language repo where not all of the developers are touching the go components.

I don't think it is reasonable to ask my docker, bazel, and nodejs developers to all wrap their normal tooling in scripts that touch extra files in their build directories, nor ask them to try to rename their standard build directories to match existing go conventions, some of which conflict with the other tool conventions.

It seems like there should be a way to specify how to ignore certain files or directories that does not require modifying the content of those files or directories, because the ignored content is not being managed by go and may have its own conflicting conventions and lifecycle.

@rsc
Contributor

rsc commented Aug 18, 2021

@psigen Go wants a directory tree that belongs to it. In a multi-language repo, why not create a top-level go/ directory?

@psigen

psigen commented Aug 21, 2021

@rsc: because my projects are not organized that way. I have services like:

service1/
    backend/  # golang
    debug-cli/
    proto/
service2/
    backend/  # python
    debug-ui/
    proto/
webapp/
    frontend/ 

I know what you are asking, which is why not reorganize to:

proto/
    service1/
    service2/
golang/
    service1-svc/
    service1-cli/
python/
    service2/
nodejs/
    webapp/
    service1/debug-ui/

And the answer, (besides "that's a lot of work right now") is that it is not how our ownership is structured.

It is not convenient to have duplication in the CODEOWNERS files, .gitignore patterns that look like **/service1/foo*, cross-directory-tree docs links, etc. all in service of golang. It makes PR reviews harder when related changes happen all over the directory tree. It forces docker build contexts to all need to be at the root of the entire source tree, and makes live-rebuilds in tools like Tilt and Skaffold much more difficult to author.

I could go on, but I'm really just reiterating the core premise of this proposal:

For non-trivial (often multi-language) projects it's often desirable to make all the Go tools (including gopls) ignore certain directories.

@4k1k0

4k1k0 commented Sep 3, 2021

I use the serverless framework to deploy lambdas on AWS. Some plugins that I have to use contain Go files. So when I run go mod download or go mod tidy, dependencies required by the Go files inside the node_modules directory get added to my go.mod file. It would be great to define a way to exclude directories from Go modules.

node_modules
  serverless
    lib
      plugins
        create
          templates
            aws-go
              main.go
cmd
  main.go
pkg
  dirA
  dirB
  dirC

@cespare
Contributor

cespare commented Sep 3, 2021

@rsc

@psigen Go wants a directory tree that belongs to it. In a multi-language repo, why not create a top-level go/ directory?

Reading this statement does not make me happy. It goes against the entire premise of the code organization at my company.

We use a monorepo with projects in several languages. Some Go programs live inside projects written in other languages. Sometimes we rewrite a project from one language to another. Some projects use a combination of Go and other languages (imagine a website written using both Go and JavaScript extensively). We already have an organizational hierarchy within the monorepo that is based around purpose and ownership, not language.

This all worked fine with $GOPATH: the repo was inside its own single $GOPATH segment and a top-level vendor directory contained a single version of all shared dependencies.

Moving to modules has raised some challenges, but mostly it has worked. The whole repo is one module so we use a fixed, shared set of dependencies. One issue we faced is that the go mod tidy and other commands printed out a bunch of irrelevant spam (#35941) -- we sent a fix for that. Other issues we see involve various kinds of slowness in gopls (#46438 describes one particular issue I have). But mostly it works fine, and ISTM that the remaining issues are surmountable if the folks working on the tools care about making them work well in the presence of mixed-language source trees (and until now it seemed to me that they mostly do!).

But when I read "Go wants a directory tree that belongs to it", it sounds like you don't think this use case matters as far as the standard Go tools are concerned. I don't know how we could possibly adapt our repo to a "Go code all belongs in its own tree" model. Probably we wouldn't -- I imagine that if push came to shove, we'd look into alternative build tools.

@dolmen
Contributor

dolmen commented Dec 16, 2021

In #50225 I'm raising concerns about the resources (network, disk space) wasted on every developer's machine because the module zips contain many irrelevant files.

Check this list of files that are in your Go modules cache:

find $(go env GOMODCACHE)/*.* -type f ! -name '*.go' ! -name 'go.mod' ! -name 'go.sum' ! -name 'list.lock' ! -name 'v*.mod' ! -name 'v*.info' ! -name 'v*.zip' ! -name 'v*.ziphash' ! -name 'v*.lock' ! -name 'LICENSE*' ! -name 'README*' -print

I have more than 200,000 useless files on my machine.

This also impacts CI builds (download time and space for new dependencies; strong module caching has to be enabled to reduce the problem).

@burdiyan
Author

burdiyan commented Dec 17, 2021

While similar, I think this proposal is a bit different from yours @dolmen, in the sense that here I mostly care about ignoring directories, not specific files, and definitely not for specific packages like x/mod/zip. Still, the same solution could solve both problems.

@burdiyan
Author

burdiyan commented Dec 17, 2021

BTW, go.work is coming in the next Go release. Maybe this feature could be implemented in there eventually? Or maybe a separate go.ignore file? Looks like a better approach than .goignore for sure!

I updated the initial comment.

@dolmen
Contributor

dolmen commented Dec 17, 2021

BTW, go.work is coming in the next Go release. Maybe this feature could be implemented in there eventually?

I consider go.work as a development tool for your local development environment. Which means it is a file I would not commit in the repo.

Instead, ignore patterns must be available for tools that download the code from a VCS (for publishing on a proxy, or for filling the module cache, see #50225), so the ignore patterns must be always available in the repository.

So go.work would not be a good place for ignore patterns.

@burdiyan
Author

burdiyan commented Dec 18, 2021

@dolmen While I suspect that go.work is meant to be checked in (I'm not sure about it), I think you're right that the ignore stuff should probably be in a separate place, because not all projects would want to have go.work. Then maybe go.ignore is the remaining option that would make some people happy, and the rest (those who don't like the idea of dot files) at least not angry about it :)

@burdiyan burdiyan changed the title proposal: global ignore mechanism for Go tool ecosystem proposal: global ignore mechanism for Go tooling ecosystem Dec 18, 2021
@antichris

antichris commented Dec 19, 2021

@burdiyan, go.work is indeed not meant to be checked in:

These go.work files should not be checked into the repositories so that they don't override the workspaces users explicitly define. Checking in go.work files could also lead to CI/CD systems not testing the actual set of version requirements on a module and that version requirements among the repository's modules are properly incremented to use changes in the modules. And of course, if a repository contains only a single module, or unrelated modules, there's not much utility to adding a go.work file because each user may have a different directory structure on their computer outside of that repository.
Proposal: Multi-Module Workspaces in cmd/go §Multiple modules in the same repository that depend on each other

@jaronsummers

jaronsummers commented Feb 4, 2022

It is kind of frustrating that the responses from Go contributors are uniformly "Everyone else on earth is wrong, they should change to accommodate our design choices."

No matter how inelegant another dotfile is, it solves the problem in a universal way that will work for all repository structures and build tools. None of the proposed alternatives even attempt to do the same.

I currently just don't run gopls and try to minimize how often I have to write Go, which is not a "solution" that is available to everyone.

@hyangah
Contributor

hyangah commented Feb 18, 2022

Some of us discussed the problem this proposal aims to address, i.e., allowing certain directories to be excluded when running go with patterns that include the ... wildcard.

We agree this is a problem for some tools (e.g. gopls, and others that accept go's import path patterns). Many tools developed their own ways of configuring exclusion rules (e.g. gopls has directoryFilters), but this is still not sufficient if they depend on a go invocation with a ... pattern underneath.

@bcmills had a great idea during the discussion - go already has the overlay mechanism (see the summary of the feature by @matloob and also the -overlay flag description in go command help page). That can be used as the directory exclusion mechanism. For exclusion, place an empty value; for inclusion, set identity mapping. gopls can implement this by applying the already existing directoryFilters setting, and I guess other tools can do the same. (x/tools/go/packages supports overlay)
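
A minimal sketch of that idea in Go, under some assumptions: it takes a directory to hide, collects the .go files below it, and prints an overlay that maps each one to an empty value. The Replace key is the format the -overlay flag reads; the helper names (hideFiles, goFilesUnder) and the file names in the usage comment are hypothetical. This is an illustration, not how gopls handles its directory filters.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// overlay mirrors the JSON document read by the -overlay build flag.
type overlay struct {
	Replace map[string]string `json:"Replace"`
}

// hideFiles maps every given path to "", which the go command treats
// as if the file did not exist.
func hideFiles(paths []string) ([]byte, error) {
	ov := overlay{Replace: make(map[string]string, len(paths))}
	for _, p := range paths {
		ov.Replace[p] = ""
	}
	return json.Marshal(ov)
}

// goFilesUnder collects the absolute paths of all .go files below dir.
func goFilesUnder(dir string) ([]string, error) {
	var files []string
	err := filepath.WalkDir(dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() && strings.HasSuffix(d.Name(), ".go") {
			abs, err := filepath.Abs(path)
			if err != nil {
				return err
			}
			files = append(files, abs)
		}
		return nil
	})
	return files, err
}

func main() {
	// Usage (hypothetical file names):
	//   go run mkoverlay.go node_modules > overlay.json
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: mkoverlay <dir-to-hide>")
		os.Exit(2)
	}
	files, err := goFilesUnder(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	out, err := hideFiles(files)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(string(out))
}
```

The result could then be passed to a command that accepts build flags, for example go list -overlay=overlay.json ./... . The overlay only hides the listed files from package loading; this sketch does not establish whether it also avoids the cost of walking the directory.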

The overlay config isn't as flexible as the glob patterns many dotfiles accept, but I think it still provides a sufficient knob that tools can play with. What do you think?

#50225 (for mechanism to fine tune the scope of a module) was mentioned during the discussion, but I don't think that is the goal of this proposal. For example, I think it's possible one wants to speed up gopls by excluding a directory but want to still keep it in the distributed module (directories containing asset files, etc) or the directory doesn't affect module distribution at all (ephemeral directories such as node_modules or bazel directories created during build).

@jaronsummers I think the Go team is trying to understand the problem better, not dismiss or ignore problems users are facing in the real world.

@paralin
Contributor

paralin commented Jun 16, 2022

I'm sure this was already mentioned here, but for clarity:

The simplest example of this is a node_modules directory that happens to have any Go code in it. When running "go mod tidy", the Go files in node_modules are scanned and their dependencies are included in go.mod. Ignoring node_modules would be the most obvious application of some .goignore feature.

@amery

amery commented Jun 16, 2022

it was mentioned in the past but adding ignore support to go.mod would be flexible, no hardcoded rules, and no new magic files.

@antichris

antichris commented Jun 16, 2022

@amery

it was mentioned in the past

And it was already rejected in the past:

... because go.mod is not a catch-all config file like package.json in NodeJS
#42965 (comment)

@amery

amery commented Jun 16, 2022

@antichris every solution has been rejected because developers don't recognize the problem. go.mod is not a catch-all and .goignore is.. another file

@burdiyan
Author

burdiyan commented Jun 20, 2022

I'd vote for go.ignore! It's another file, but it's not a dot-file, which was the main concern of Rob, and others I believe.

@amery

amery commented Jun 20, 2022

I'd vote for go.ignore! It's another file, but it's not a dot-file, which was the main concern of Rob, and others I believe.

as long as it can be used to specify patterns to ignore I'm happy

@antichris

antichris commented Jun 20, 2022

How about a go env variable, e.g. GOIGNOREFILE, that could be set to point to an arbitrary ignore pattern file with .gitignore-compatible syntax?

Users could then assign GOIGNOREFILE=go.ignore (or .goignore or even .gitignore), if they chose to do so, and no new "another" (dot or not) file is forced on anyone out of the box. It would also be possible to settle on an OOtB default value for it eventually, without breaking established workflows, yet providing a much needed relief in the meantime.
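
To make the suggestion concrete, here is a hedged Go sketch of how a tool might honor such a variable. GOIGNOREFILE is hypothetical: the go command does not read it, and parseIgnore and ignored are names invented for this example. The matching covers only a small subset of .gitignore syntax (comments, blank lines, trailing slashes, and path.Match globs on path elements and leading path portions); negation and anchoring are not handled, so it is not a compatible implementation.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path"
	"strings"
)

// parseIgnore returns the non-empty, non-comment lines of an ignore
// file, with any trailing slash removed.
func parseIgnore(content string) []string {
	var patterns []string
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		patterns = append(patterns, strings.TrimSuffix(line, "/"))
	}
	return patterns
}

// ignored reports whether the slash-separated relative path rel matches
// a pattern. Each pattern is tried against every single path element and
// against every leading portion of the path.
func ignored(patterns []string, rel string) bool {
	elems := strings.Split(path.Clean(rel), "/")
	for _, p := range patterns {
		for i, e := range elems {
			if ok, _ := path.Match(p, e); ok {
				return true
			}
			if ok, _ := path.Match(p, strings.Join(elems[:i+1], "/")); ok {
				return true
			}
		}
	}
	return false
}

func main() {
	// Hypothetical variable from the comment above; nothing in the
	// Go toolchain sets or reads it.
	name := os.Getenv("GOIGNOREFILE")
	if name == "" {
		fmt.Println("GOIGNOREFILE is not set")
		return
	}
	data, err := os.ReadFile(name)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	patterns := parseIgnore(string(data))
	for _, rel := range os.Args[1:] {
		fmt.Printf("%s ignored=%v\n", rel, ignored(patterns, rel))
	}
}
```

With an ignore file containing node_modules/ and bazel-*, paths such as node_modules/serverless/lib and bazel-out/bin would be reported as ignored, while cmd/main.go would not.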

@dee-kryvenko

dee-kryvenko commented Jun 23, 2022

Surprised there weren't any devops use cases yet, so I'll step in.

Many great projects are based on Go, among them docker, kubernetes, helm and terraform. It is only natural that many of us who deal with these tools on a daily basis become fluent in Go over time and start to use it more. A particular example of that is Terratest, a great framework used to write integration tests for Dockerfiles, helm charts and terraform modules. Terratest itself is a Go module, so the tests are executed with go test ./....

What that means is that every helm chart or terraform module repository is initialized as a Go module, even though it doesn't have any Go code (other than tests).

I saw somewhere a mention that Go excludes folders named with . in the front. I am not sure why, but that isn't really happening. Maybe it is just a gopls-specific issue - I don't know. Terraform creates a .terraform folder, under which it creates subfolders and checks out the other modules this module depends on. My VS Code in gopls mode consumes a lot of CPU and generates lots of warnings for duplicated modules, because the TF modules in my VS Code workspace are indeed duplicated under the .terraform folders of other modules that use them. This makes the whole setup so painful. And what if some other tool like terraform were to use a folder without . in the front, just like npm, and there wasn't a way to modify its behavior?

This is such an easy feature to add that this discussion, spread across multiple tickets, has already consumed ten times more of everyone's time than it would take one person to just implement it. I hope my struggle is not due to Go developers' arrogance and that they are indeed trying to understand the problem, but somehow I find it hard to believe. I feel like to address this issue we will first need the help of a licensed therapist who would conduct a series of sessions with the maintainers and help them understand that, despite their awesomeness and undeniable historical contribution to humanity's legacy, the universe does not revolve around them and there are other tools in existence that Go needs to peacefully coexist with.

@ianlancetaylor
Contributor

ianlancetaylor commented Jun 23, 2022

@dee-kryvenko Please be charitable and respectful, per the Go Community Code of Conduct. Criticize the arguments, not the people. Thanks.

@dee-kryvenko

This comment was marked as off-topic.

@mvdan
Member

mvdan commented Jun 23, 2022

@dee-kryvenko your last comment is off-topic for the proposal at hand, so I've hidden it to keep the thread on topic. If you have any thoughts on the code of conduct, I would suggest emailing conduct@golang.org or golang-dev@googlegroups.com.

I saw somewhere a mention that Go excludes folders named with . in the front. I am not sure why, but that isn't really happening. Maybe it is just a gopls-specific issue - I don't know.

If that is happening, that is a separate bug that needs to be fixed. Please file it as a new issue with details.

@findleyr
Contributor

findleyr commented Jun 24, 2022

A lot of gopls' problems mentioned here can be fixed by improved handling of existing mechanisms: go.work files and directoryFilters settings. We're aware of several places where those settings are not handled correctly, resulting in gopls loading unnecessary data. Several of these will be fixed in the next gopls release (v0.9.0). Others will take longer.

In particular: one missing piece is hiding directories filtered by directoryFilters from the go command using overlays, as suggested by @bcmills and described by @hyangah in #42965 (comment).

Given that go.work and directoryFilters already exist, I think we should mostly disconnect this proposal from gopls' performance problems: having a global ignore mechanism does not necessarily solve gopls' problems, nor are gopls' problems unsolvable without a global ignore mechanism*.

*I'll caveat that it may still be that the integration of go.work+directoryFilters is not great, for example because of our inability to translate wildcard filters to overlays.

@silverwind

silverwind commented Jul 21, 2022

I see intermittent test failures when files in node_modules change while things like go test run (here in a parallel CI job):

go test -timeout 360s -cover ./...
pattern ./...: open node_modules/esbuild-darwin-64: no such file or directory

There absolutely must be a way to get Go tooling to ignore certain directories and files. node_modules can easily contain upwards of 100k files, and its presence will slow down all go commands that search for Go files by at least a few seconds, even on a fast SSD.

@antichris

This comment was marked as off-topic.

@mvdan
Member

mvdan commented Jul 21, 2022

Again, please keep it civil.

And please avoid adding comments that repeat what has already been said. To add your +1, add a reaction at the top: https://github.com/golang/go/wiki/NoPlusOne

@rsc rsc changed the title proposal: global ignore mechanism for Go tooling ecosystem proposal: cmd/go: add global ignore mechanism for Go tooling ecosystem Aug 10, 2022
jynnantonix added a commit to wormhole-foundation/wormhole that referenced this issue Aug 18, 2022
The wormhole sdk is a new go module in the sdk/ directory.  This
initially contains the *_consts.go files from the common package in the
top-level sdk package and the entire vaa package as a sub-package.

For go reasons this needs to be in the sdk directory itself (rather than
a sdk/go subdir).  To prevent the go tooling from looking into the other
non-go subdirs, add an empty go.mod file in each one.  See
golang/go#42965 for more details on why we
can't have nice things.
jynnantonix added a commit to wormhole-foundation/wormhole that referenced this issue Aug 22, 2022
jynnantonix added a commit to wormhole-foundation/wormhole that referenced this issue Aug 30, 2022
jynnantonix added a commit to wormhole-foundation/wormhole that referenced this issue Aug 30, 2022