Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/gopls: workspace/didChangeWatchedFiles RegisterCapability globs are too broad #41504

Open
myitcv opened this issue Sep 20, 2020 · 11 comments
Open

Comments

@myitcv
Copy link
Member

@myitcv myitcv commented Sep 20, 2020

What version of Go are you using (go version)?

$ go version
go version devel +06f7e655d1 Fri Sep 18 07:56:50 2020 +0000 linux/amd64
$ go list -m golang.org/x/tools
golang.org/x/tools v0.0.0-20200918232735-d647fc253266
$ go list -m golang.org/x/tools/gopls
golang.org/x/tools/gopls v0.0.0-20200918232735-d647fc253266

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/myitcv/.cache/go-build"
GOENV="/home/myitcv/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/myitcv/gostuff/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/myitcv/gostuff"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/myitcv/gos"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/myitcv/gos/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/myitcv/gostuff/src/github.com/myitcv/govim/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build962924290=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Given the following setup:

-- blah/blah.go --
package blah

const Message = "hello"
-- blah/go.mod --
module blah.com

go 1.16
-- go.mod --
module mod.com

go 1.16

require blah.com v0.0.0-00010101000000-000000000000

replace blah.com => ./blah
-- go.sum --
-- main.go --
package main

import (
	"fmt"

	"blah.com"
)

func main() {
	fmt.Println(blah.Message)
}

I loaded govim.

What did you expect to see?

Precise workspace/didChangeWatchedFiles globls in any RegisterCapability call.

What did you see instead?

RegisterCapability: &protocol.RegistrationParams{
    Registrations: {
        {
            ID:              "workspace/didChangeWatchedFiles-0",
            Method:          "workspace/didChangeWatchedFiles",
            RegisterOptions: map[string]interface {}{
                "watchers": []interface {}{
                    map[string]interface {}{
                        "globPattern": "**/*.{go,mod,sum}",
                        "kind":        float64(7),
                    },
                    map[string]interface {}{
                        "globPattern": "/home/myitcv/gostuff/src/github.com/myitcv/playground/blah/**/*.{go,mod,sum}",
                        "kind":        float64(7),
                    },
                    map[string]interface {}{
                        "globPattern": "/home/myitcv/gostuff/src/github.com/myitcv/playground/**/*.{go,mod,sum}",
                        "kind":        float64(7),
                    },
                },
            },
        },
    },
}

Side note: the working directory of govim (and therefore gopls) in this case was /home/myitcv/gostuff/src/github.com/myitcv/playground, so the first and third watchers appear to be duplicates.

If I understand the glob spec, **/*.{go,mod,sum} means "all .go, .mod and .sum files in all subdirectories".

The recursive descent into subdirectories does not stop at directories whose names start with . or _, or those that contain go.mod files. This can, and does, make this watch very expensive if, for example, you have a directory like node_modules that contains lots of files that are and always will be totally irrelevant to gopls (the advice here is to put a go.mod in that directory).

I suspect this situation is somewhat a function of the LSP spec, but can these globs be made more specific in some way?

An approach to .gitignore patterns would work for example.


cc @stamblerre

FYI @leitzler

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Sep 21, 2020

I think we can ignore any of the directories that match these cases: https://github.com/golang/tools/blob/ccabf82fa1aef57afc4686de4b7881718f7b7d27/internal/lsp/cache/view.go#L662.

Ignoring something like node_modules is more difficult, and likely not doable if you open a module that contains a node_modules directory.

@stamblerre stamblerre modified the milestones: Unreleased, gopls/v1.0.0 Sep 21, 2020
@myitcv
Copy link
Member Author

@myitcv myitcv commented Sep 22, 2020

I think we can ignore any of the directories that match these cases

When you say "we", do you mean gopls? If so, that's too late because the client will already have setup watchers on those directories, and that's the cost I'm raising here.

Ignoring something like node_modules is more difficult, and likely not doable if you open a module that contains a node_modules directory.

As I explained in the original description, the advice with node_modules directories is to place a go.mod file in that directory (rather than special case ignore it), hence pruning it from the watch space.

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Sep 22, 2020

I think we can ignore any of the directories that match these cases

When you say "we", do you mean gopls? If so, that's too late because the client will already have setup watchers on those directories, and that's the cost I'm raising here.

Yep, sorry for not being clearer - I linked that function to show the different cases that we know the Go command ignores and that we could also ignore in the GlobPattern.

Ignoring something like node_modules is more difficult, and likely not doable if you open a module that contains a node_modules directory.

As I explained in the original description, the advice with node_modules directories is to place a go.mod file in that directory (rather than special case ignore it), hence pruning it from the watch space.

Yep--I thought you might be asking for an alternative method of ignoring it.

@stamblerre stamblerre added this to Non-critical in vscode-go: gopls by default Nov 10, 2020
@stamblerre stamblerre moved this from Non-critical to Needs Triage in vscode-go: gopls by default Nov 11, 2020
@stamblerre stamblerre removed this from the gopls/vscode-go milestone Nov 11, 2020
@stamblerre stamblerre removed this from Needs Triage in vscode-go: gopls by default Nov 11, 2020
@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Nov 11, 2020

@myitcv: Would you be willing to contribute a CL for this? We're happy to do it, but it likely won't be prioritized for a while. The file watching glob patterns can be found here, and the cases that should be ignored are here.

@stamblerre stamblerre added this to the gopls/unplanned milestone Nov 11, 2020
@myitcv
Copy link
Member Author

@myitcv myitcv commented Nov 11, 2020

@stamblerre I'm very unlikely to get to this for some time unfortunately. So please feel free to offer to anyone else who might be interested in taking this on. Thanks for checking in any case.

@jhchabran
Copy link

@jhchabran jhchabran commented Feb 23, 2021

Hello, I am interested in taking on this issue.

I took the time to look around and run some tests on my own, and it appears that the glob patterns are not flexible enough to match recursively while excluding the pattern the go command excludes (_, . and testdata) (1).

AFAIK, the remaining approach I see is to watch every folder individually (like it's currently done to handle the deletion events on certain clients) with a less restrictive glob, allowing to match subfolder creation in order to update the watches accordingly.

Because we're manually iterating through all known folders of the workspace, we can reject the ones that need to be ignored. Here is a snippet to illustrate this explanation:

(context: https://github.com/golang/tools/blob/master/internal/lsp/cache/snapshot.go#L674)

var dirNames []string
for uri := range s.allKnownSubdirs(ctx) {
	// only watch it if it's not to be ignored, starts with _, . or is named testdata. 
	if !s.IgnoredFile(uri) { 
		dirNames = append(dirNames, uri.Filename())
	} 
}

// yes, we're watching a/b/* because we want subfolders creation events to refresh the watcher. 
patterns[fmt.Sprintf("{%s}{,/*}", strings.Join(dirNames, ","))] = struct{}{}

Ending the glob with a wildcard (a/b/*) means that gopls will receive unwanted filesystem events. We'd have to discard them obviously (the handler seems the right place to do so).

@stamblerre, Is that the kind of solution you had in mind back then? I am unsure about this trade-off, reducing the cost on the client-side with narrower globs but add unnecessary notifications on the server-side.

I don't want to create unnecessary noise, so if that's ok I can go on and submit a CL!

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Feb 25, 2021

@jhchabran: Thank you so much for doing this detailed investigation! Really appreciate that you've done a deep dive here. I think you are right that we don't want to begin watching everything with the * pattern--that would be too expensive if users have monorepos with lots of different types of files.

And in fact, I don't think it should be necessary to switch to the * pattern instead of the *.{go,mod,sum} pattern--we don't actually need to worry about creation events for new folders (and we currently don't). We should only need to care when a new Go file is created in that subfolder, and at that point, we can update the glob patterns to add that folder (if it's not ignored). Does that make sense? Please let me know if I am missing something.

@jhchabran
Copy link

@jhchabran jhchabran commented Feb 25, 2021

Thanks for your feedback 🙏

TL;DR: I found out that I missed an aspect of gopls behavior in my tests which prevented me to see why your above suggestion works. Still, there's a slight behavior change, but that may be inconsequential. I kept my original explanation because it showcases the subtle change in question.

Original comment, highlights delay in gopls being aware of new files when they are created outside the editor UI

We should only need to care when a new Go file is created in that subfolder, and at that point, we can update the glob patterns to add that folder (if it's not ignored)

The problem I see here is that AFAIU, we can't know about the file being created in the subfolder learn when using only {path1,path2,...}/*.{go,mod,sum} as opposed to **/*.{go,mod,sum}. It won't catch go files that have been created manually in a subfolder of path1, path2, .. because the glob won't match on that subfolder.

But if done through the editor (drag and drop, or right-click create a folder then new file), gopls will receive an event from the editor and thus will refresh the watched folders.

To be clearer, through my tests, I observed that:

  • glob is {/home/me/code/path1,/home/me/code/path2,...}/*.{go,mod,sum}
    • adding /home/me/code/path1/foo.go: ok, glob matches it
    • adding /home/me/code/path1/bar/foo.go: not ok, glob does not matches it
      • but if created through the editor UI, a 'textDocument/didOpen' will be fired by the editor, which will trigger a didChangeWatchedFiles
    • if that was done with mv /tmp/bar /home/me/code/path1, it will be never registered unless the editor opens it. That's where the problem lies.
  • glob is {**/*.{go,mod,sum}
    • adding /home/me/code/path1/foo.go: ok, glob matches it
    • adding /home/me/code/path1/bar/foo.go: ok, glob matches it in both cases (mv or added with the editor ui)

💡 I just discovered while double-checking myself that if I reference somewhere in the code a symbol that has been declared in the newly added subfolder/bar/foo.go (let's say bar.Foo()), the editor will send something (a documentLink/resolve I presume?) which will then lead to refreshing the watched folders since gopls now has knowledge about this new package.

@stamblerre I believe that this 👆 must be what I have been missing since the beginning and now everything makes sense.

➡️ With that in mind, and assuming that it's ok to not have gopls be aware of new files in a newly created package until there's a reference toward it (1) then the costly ** can be removed and we can drop the ignored filenames along the way.

Please tell me if I missed anything!

(1) typing the package name followed by a dot in VScode is enough to achieve that

@danp
Copy link
Contributor

@danp danp commented Feb 25, 2021

I'd be interested to test any glob changes with emacs and lsp-mode/eglot as I'm able. Both have had some trouble in the past (see joaotavora/eglot#602 and emacs-lsp/lsp-mode#1979).

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Feb 26, 2021

@jhchabran: Thank you for continuing to test out these complicated cases! I didn't realize that those glob patterns wouldn't get file creation events for subdirectories like the /home/me/code/path1/bar/foo.go case you described. That would be a regression I think--gopls needs to get that file creation event even if the file is never opened.

It's starting to seem like it would be unnecessarily complicated to get this working, and I bet that other language servers have run into this too. I filed an issue with VS Code to ask how to ignore certain directories in a Glob Pattern: microsoft/language-server-protocol#1210. Let's wait for a response there before moving forward here. Sorry that this issue has turned out to be so complicated! If you'd like to find another issue to work on, that's totally fine and encouraged :)

@jhchabran
Copy link

@jhchabran jhchabran commented Feb 28, 2021

Ok, it totally makes sense! I'll wait for some updates on this and will have a look at the other issues in the meantime then 😊👍

@stamblerre No worries, the issue being more complicated than it looked was part of the fun for me, I'm glad that I may have helped to move further on this topic by digging into it.

I'm quite curious to see what will they answer on the issue in the eventuality of not having missed anything in what can be done with the glob patterns or elsewhere. I believe the topic had been previously discussed over here microsoft/language-server-protocol#354.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants