Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/gopls: race-like removal of first line by codeAction #35114

Open
myitcv opened this issue Oct 23, 2019 · 19 comments
Assignees
Milestone

Comments

@myitcv
Copy link
Member

@myitcv myitcv commented Oct 23, 2019

What version of Go are you using (go version)?

$ go version
go version devel +03bb3e9ad1 Wed Oct 16 06:29:51 2019 +0000 linux/amd64
$ go list -m golang.org/x/tools
golang.org/x/tools v0.0.0-20191023143423-ff611c50cd12
$ go list -m golang.org/x/tools/gopls
golang.org/x/tools/gopls v0.1.8-0.20191023143423-ff611c50cd12

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/myitcv/.cache/go-build"
GOENV="/home/myitcv/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/myitcv/gostuff"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/myitcv/gos"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/myitcv/gos/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/myitcv/gostuff/src/github.com/myitcv/govim/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build521776351=/tmp/go-build -gno-record-gcc-switches"

What did you do?

In govim we have a test that effectively "pastes" the following contents into a new, unsaved buffer called main.go (hence does not exist on disk) and then saves the file:

package main

func main() {
}

In the test however we sometimes see the first line, the package clause, being removed by the codeAction that fires on save. Here is the relevant line from the gopls log:

[Trace - 17:14:28.146 PM] Received response 'textDocument/codeAction - (2)' in 73ms.
Result: [{"title":"Organize Imports","kind":"source.organizeImports","edit":{"changes":{"file:///tmp/go-test-script040847720/script-format_on_save_new_file/main.go":[{"range":{"start":{"line":0,"character":0},"end":{"line":1,"character":0}},"newText":""}]}}}]

What did you expect to see?

The file contents preserved by the codeAction.

What did you see instead?

The package clause being removed 1 time in 5 this test runs.


cc @stamblerre

@gopherbot gopherbot added this to the Unreleased milestone Oct 23, 2019
@gopherbot gopherbot added the Tools label Oct 23, 2019
@stamblerre

This comment has been minimized.

Copy link
Contributor

@stamblerre stamblerre commented Oct 24, 2019

Thanks for reporting. Does govim use the opt-in go-diff setting for the diff algorithm?

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Oct 24, 2019

No we do not

Apologies, I meant to link to the entire gopls log file for the test. That's now here:

https://gist.github.com/myitcv/f8d3bd52e3a31c30b2befe7909eae285

I'm not sure this can be diff related though can it? Because it doesn't fail every time, only 1 in 5 times (roughly). Hence my assumption this is a race-like error?

@stamblerre

This comment has been minimized.

Copy link
Contributor

@stamblerre stamblerre commented Oct 24, 2019

Do you have any sense of which CL caused this to become an issue? It does sound like a race, but the behavior here is very strange. My best guess is that there is some sort of race where it tries to do goimports on an empty file, but I'm not convinced that goimports would even work in that case.

Also, why is the didChange split into 2 parts here?

[Trace - 20:58:48.415 PM] Sending notification 'textDocument/didChange'.
Params: {"textDocument":{"version":1,"uri":"file:///tmp/go-test-script389776774/script-format_on_save_new_file/main.go"},"contentChanges":[{"range":{"start":{"line":1,"character":0},"end":{"line":1,"character":0}},"text":"package main\n\nfunc main() {\n}\n"}]}


[Trace - 20:58:48.416 PM] Sending notification 'textDocument/didChange'.
Params: {"textDocument":{"version":2,"uri":"file:///tmp/go-test-script389776774/script-format_on_save_new_file/main.go"},"contentChanges":[{"range":{"start":{"line":0,"character":0},"end":{"line":1,"character":0}},"text":""}]}
@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Oct 27, 2019

Also, why is the didChange split into 2 parts here?

Ooh, I had totally missed this, good spot. The second of those changes sent by govim accounts for the first line being removed.

{"textDocument":{"version":2,"uri":"file:///tmp/go-test-script389776774/script-format_on_save_new_file/main.go"},"contentChanges":[{"range":{"start":{"line":0,"character":0},"end":{"line":1,"character":0}},"text":""}]}

This is actually a programming error in the govim test itself. Now fixed.

But here's the interesting thing: pre this fix both didChange notifications have always been sent to gopls. i.e. this govim test error has been there for some time, even when this test passes in govim. They are always sent in exactly this order as well.

So this I think reveals quite a nasty race bug somewhere in gopls. Because the codeAction that happens subsequently is, for some reason, not acting on the code that results from did(Open|Change) notifications that happened before.

@stamblerre

This comment has been minimized.

Copy link
Contributor

@stamblerre stamblerre commented Oct 28, 2019

Does passing for this test mean that the code remained unchanged? Or that the code action responds with the expected edits? There may be a weird interaction here with the goimports library actually. If it detects a missing package declaration, it will try to fill one in and run goimports with it.

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Oct 29, 2019

The situation is somewhat complicated by the fact there is an error in the govim test, and a further bug in govim. But it does still seem to indicate a problem in gopls. Let me try and describe the scenario:

  • the test creates a new buffer called main.go and pastes in the following contents:
package main

func main() {
}
  • the buffer's contents in Vim are exactly as shown above
  • but because of bug in govim, the two actions that represent the "paste" are relayed to gopls as:
package main

func main() {
}

and

<delete line 1>

Hence gopls believes the file contents are:


func main() {
}
  • the test then runs the code action as part of saving the file
  • in ~1 in 5 cases, the code action returns no error and the following change:
[]protocol.CodeAction{
    {
        Title:       "Organize Imports",
        Kind:        "source.organizeImports",
        Diagnostics: nil,
        IsPreferred: false,
        Edit:        &protocol.WorkspaceEdit{
            Changes: &map[string][]protocol.TextEdit{
                "file:///tmp/go-test-script863275998/script-format_on_save_new_file/main.go": {
                    {
                        Range: protocol.Range{
                            Start: protocol.Position{},
                            End:   protocol.Position{Line:1, Character:0},
                        },
                        NewText: "",
                    },
                },
            },
            DocumentChanges: nil,
        },
        Command: (*protocol.Command)(nil),
    },
}
  • this then causes govim to apply the change, meaning it changes the Vim buffer to:

func main() {
}

But this then breaks the test from govim's perspective because it's expecting the content to be:

package main

func main() {
}

which is how we were alerted to this

  • For the other ~4 in 5 times, we receive an error from the code action:
no parsed files for package mod.com

So I think the question is, why does gopls behave randomly? Because in both cases (test passing or failing) it always sees the same changes from govim.

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Oct 30, 2019

Ok, some further analysis, which invalidates much of what is above. TL;DR version

  • there is no bug/problem with Vim/govim
  • the race-like behaviour does appear to be the root cause of the failures we're seeing

The scenario we have is actually as follows:

  • the test starts with an empty module, i.e. a go.mod but no other files
  • the test creates a new buffer called main.go and reads the following contents:
package main

func main() {
}

This leaves the Vim buffer with the following, note the first blank line:

  
package main

func main() {
}

The test then adjusts for this to remove the first blank line, meaning the buffer is in the following state:

package main

func main() {
}
  • govim sends these two changes (because they are processed as two changes in Vim) to gopls:
{"textDocument":{"version":1,"uri":"file:///tmp/go-test-script121089769/script-format_on_save_new_file/main.go"},"contentChanges":[{"range":{"start":{"line":1,"character":0},"end":{"line":1,"character":0}},"text":"package main\n\nfunc main() {\n}\n"}]}
{"textDocument":{"version":2,"uri":"file:///tmp/go-test-script121089769/script-format_on_save_new_file/main.go"},"contentChanges":[{"range":{"start":{"line":0,"character":0},"end":{"line":1,"character":0}},"text":""}]

Note the first change references LSP line 1.

So the net effect of these two changes leaves gopls in the same state as Vim:

package main

func main() {
}

This is confirmed with some rudimentary logging in gopls

  • the test then runs the code action as part of saving the file
  • in ~1 in 5 cases, the code action returns no error and the following change:
[]protocol.CodeAction{
    {
        Title:       "Organize Imports",
        Kind:        "source.organizeImports",
        Diagnostics: nil,
        IsPreferred: false,
        Edit:        &protocol.WorkspaceEdit{
            Changes: &map[string][]protocol.TextEdit{
                "file:///tmp/go-test-script863275998/script-format_on_save_new_file/main.go": {
                    {
                        Range: protocol.Range{
                            Start: protocol.Position{},
                            End:   protocol.Position{Line:1, Character:0},
                        },
                        NewText: "",
                    },
                },
            },
            DocumentChanges: nil,
        },
        Command: (*protocol.Command)(nil),
    },
}
  • this then causes govim to apply the change, meaning it changes the Vim buffer to:

func main() {
}

This change is then relayed to gopls.

But this then breaks the test from govim's perspective because it's expecting the content to be:

package main

func main() {
}

which is how we were alerted to this

  • For the other ~4 in 5 times, we receive an error from the code action:
no parsed files for package mod.com

So I think the question are:

  • why does gopls behave randomly
  • why does gopls return an error when in the ~4 in 5 cases when it doesn't return any changes?

Because in both cases (test passing or failing) it always sees the same changes from govim and those changes leave it with the same contents each time.

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Oct 30, 2019

Here are two gopls logs from when the govim test passes/fails in case this helps to give any clues:

@stamblerre

This comment has been minimized.

Copy link
Contributor

@stamblerre stamblerre commented Oct 31, 2019

Thanks for the logs. Took a look at them, but nothing stands out yet. I put up a very basic CL with some additional logging as a starting point (see https://golang.org/cl/204457). It would be helpful to know what the imports code action thinks it is doing in this case.

I think that this bug might be a good application for @pjweinb's work on replaying gopls logs.

@stamblerre

This comment has been minimized.

Copy link
Contributor

@stamblerre stamblerre commented Oct 31, 2019

To summarize some offline discussion - there seem to be 2 errors here. The first is the "no parsed files" error for the "passing" case. The second is the modification of the file in the racey case. I am able to reproduce the first scenario with the log replayer, so I will focus on debugging it first.

It also seems like the cause of the race is gopls operating on stale file contents. @myitcv has bisected this race as stemming from https://golang.org/cl/201221.

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Oct 31, 2019

Change https://golang.org/cl/204557 mentions this issue: internal/lsp: do not format the file on import organization

gopherbot pushed a commit to golang/tools that referenced this issue Nov 4, 2019
Updates golang/go#30843
Updates golang/go#35114

Change-Id: Id3f66d20b1ada9e53298b2370214b23b87bb0680
Reviewed-on: https://go-review.googlesource.com/c/tools/+/204557
Run-TryBot: Rebecca Stambler <rstambler@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Cottrell <iancottrell@google.com>
@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Nov 7, 2019

Change https://golang.org/cl/204457 mentions this issue: internal/lsp: debugging CL for golang/go#35114

@stamblerre stamblerre self-assigned this Nov 7, 2019
@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Nov 7, 2019

We are seeing another instance of the error:

no parsed files for package mod.com

when calling codeAction to get suggested fixes.

Separately, assuming that the error is indeed an error, this returns us to the question of whether codeAction and Formatting should return errors in the case of parse failures. Because part of the issue with this having gone unnoticed for so long is that, per #35114 (comment), the error returned by codeAction is "swallowed" by govim.

Why do we swallow these errors in govim? If we didn't swallow errors, the the user would be unable to save a file that has parse errors (because most users have format-on-save turned on).

Unless there is some way that we determine that a file has parse errors and hence avoid the formatting/codeAction call on save?

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Nov 10, 2019

Further update, with x/tools and gopls 97ad0ed33101 I get the following panic with the same test:

panic: runtime error: index out of range [0] with length 0

goroutine 41 [running]:
golang.org/x/tools/internal/lsp/cache.(*session).bestView(0xc00025e580, 0xc0004da0a0, 0x4a, 0x0, 0x0)
   /home/myitcv/gostuff/pkg/mod/golang.org/x/tools@v0.0.0-20191109212701-97ad0ed33101/internal/lsp/cache/session.go:218 +0x19b
golang.org/x/tools/internal/lsp/cache.(*session).DidOpen(0xc00025e580, 0xd55d20, 0xc0002aa510, 0xc0004da0a0, 0x4a, 0x0, 0xc00037e390, 0x1, 0x8, 0xadef20, ...)
   /home/myitcv/gostuff/pkg/mod/golang.org/x/tools@v0.0.0-20191109212701-97ad0ed33101/internal/lsp/cache/session.go:255 +0x2b8
golang.org/x/tools/internal/lsp.(*Server).didOpen(0xc0002922a0, 0xd55d20, 0xc0002aa510, 0xc000368340, 0x0, 0xc0004dc000)
   /home/myitcv/gostuff/pkg/mod/golang.org/x/tools@v0.0.0-20191109212701-97ad0ed33101/internal/lsp/text_synchronization.go:29 +0x15e
golang.org/x/tools/internal/lsp.(*Server).DidOpen(0xc0002922a0, 0xd55d20, 0xc0002aa510, 0xc000368340, 0xc000368340, 0x0)
   /home/myitcv/gostuff/pkg/mod/golang.org/x/tools@v0.0.0-20191109212701-97ad0ed33101/internal/lsp/server.go:135 +0x49
golang.org/x/tools/internal/lsp/protocol.serverHandler.Deliver(0xd6ae80, 0xc0002922a0, 0xd55d20, 0xc0002aa510, 0xc0002909c0, 0xc0002aa500, 0xc000260280)
   /home/myitcv/gostuff/pkg/mod/golang.org/x/tools@v0.0.0-20191109212701-97ad0ed33101/internal/lsp/protocol/tsserver.go:111 +0x1e32
golang.org/x/tools/internal/jsonrpc2.(*Conn).Run.func1(0xc0002663c0, 0xc0002909c0, 0xc000292300, 0xd55d20, 0xc0002aa510, 0x0, 0x0, 0xc00017f120)
   /home/myitcv/gostuff/pkg/mod/golang.org/x/tools@v0.0.0-20191109212701-97ad0ed33101/internal/jsonrpc2/jsonrpc2.go:370 +0x170
created by golang.org/x/tools/internal/jsonrpc2.(*Conn).Run
   /home/myitcv/gostuff/pkg/mod/golang.org/x/tools@v0.0.0-20191109212701-97ad0ed33101/internal/jsonrpc2/jsonrpc2.go:354 +0x874
@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Nov 10, 2019

That panic appears to have been introduced in https://go-review.googlesource.com/c/tools/+/204822

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Nov 10, 2019

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Nov 10, 2019

Change https://golang.org/cl/206160 mentions this issue: internal/lsp: debugging for golang/go#35114

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Nov 10, 2019

I've managed to make a bit of progress on this by comparing the gopls logs of a "pass" and "fail" in the govim test. Thanks to @ianthehat for providing more context on slack:

cache ≈ disk (the files on the file system)
session ≈ disk + memory (the cache plus the overlays)
view ≈ disk + memory + config (the config controls how the files are understood, this is the first layer at which you can determine packages)
snapshot ≈ view at a specific point in time (every change to either the disk, memory or config causes a new snapshot)

so for consistent contents you need a snapshot not a view (no token needed, it’s inherent in the snapshot object itself)

originally the snapshot layer was integrated into the view, we only recently split it out, so there are lots of things that take a view that should probably take a snapshot instead, diagnostics is probably one of those things

Just as a refresher (more details in the comments above):

  • this test is about running codeAction followed by Formatting on a file that does not exist on disk
  • at first the file is empty (state A), then its contents get filled but with a blank first line (state B):
  
package main

func main() {
}

Then the first line is removed (state C):

package main

func main() {
}

Then we run codeAction, then (assuming no error from the previous step) we run Formatting.

  • the test "passes" most of the time but then fails ~1 time in 10
  • the test passes because most of the time codeAction returns an error and hence no call to Formatting is made and hence no edits are applied. The error returned is:
no parsed files for package mod.com

This does not appear to be correct, because if we create a file in state C then open Vim we do not see this error in the gopls logs.

  • the test fails when, ~1 time in 10, the code action does not return an error but then Formatting returns edits that appear to have been calculated with respect to the previous version of the buffer. The fact that on some occassions codeAction does not return an error further suggests that it's an error when it does.

Analysis

In https://go-review.googlesource.com/c/tools/+/206160 I've pushed up some backtrace logging to help with debugging.

Here are two gopls log files from a "passing" and failing run:

In the "passing" test the sequence of backtrace logs are, at a high level, as follows:

  • golang.org/x/tools/internal/lsp/cache.(*view).invalidateContent via didOpen
  • golang.org/x/tools/internal/lsp/cache.(*snapshot).load via golang.org/x/tools/internal/lsp.(*Server).diagnostics
  • golang.org/x/tools/internal/lsp/cache.(*view).invalidateContent via didChange
  • golang.org/x/tools/internal/lsp/cache.(*snapshot).load via golang.org/x/tools/internal/lsp.(*Server).diagnostics
  • golang.org/x/tools/internal/lsp/cache.(*view).invalidateContent via didChange
  • golang.org/x/tools/internal/lsp/cache.(*snapshot).load via golang.org/x/tools/internal/lsp.(*Server).diagnostics

Notice the interleaving of the diagnostic runs between the content updates. This results in codeAction returning no parsed files for package mod.com and hence does not call Formatting.

The failing sequence of backtrace logs are, at a high level, as follows:

  • golang.org/x/tools/internal/lsp/cache.(*view).invalidateContent via didOpen
  • golang.org/x/tools/internal/lsp/cache.(*view).invalidateContent via didChange
  • golang.org/x/tools/internal/lsp/cache.(*snapshot).load via golang.org/x/tools/internal/lsp.(*Server).diagnostics
  • golang.org/x/tools/internal/lsp/cache.(*snapshot).load via golang.org/x/tools/internal/lsp.(*Server).diagnostics
  • golang.org/x/tools/internal/lsp/cache.(*view).invalidateContent via didChange
  • golang.org/x/tools/internal/lsp/cache.(*snapshot).load via golang.org/x/tools/internal/lsp.(*Server).diagnostics
  • golang.org/x/tools/internal/lsp/cache.(*snapshot).load via golang.org/x/tools/internal/lsp.(*Server).codeAction

Notice the additional invalidation of content as part of the codeAction call.

Rough conclusions

  • At least based on what I've observed here there appears to be a race created by running the diagnostics. This shouldn't be a problem per se but because other calls do not appear to be fully utilising a snapshot the timing of the diagnostic run has side effects
  • The error returned by codeAction (no parsed files for package mod.com) 9 times in 10 seems to suggest that codeAction is not using the current content for the file (at the time of calling codeAction)
  • The diff returned by Formatting also seems to suggest that Formatting is not using the current content for the file at the time of calling (state C), because it returns a diff consistent with the contents at state B
  • The loading of config appears to be happening from the view and not a snapshot (https://github.com/golang/tools/blob/97ad0ed33101b39103eaf8aeab62b59dd1cfb89e/internal/lsp/cache/load.go#L58). For any code calling this async from the request thread (diagnostics at least falls into this category) I think this will be racey with respect to the expected config.
  • The overlay used as part of a packages.Load call appears to be built from the session and not a snapshot (https://github.com/golang/tools/blob/97ad0ed33101b39103eaf8aeab62b59dd1cfb89e/internal/lsp/cache/view.go#L126). For any code calling this async from the request thread (diagnostics at least falls into this category) I think this will be racey with respect to the expected file contents in the overlay.
@matloob

This comment has been minimized.

Copy link
Contributor

@matloob matloob commented Nov 11, 2019

The panic you're seeing looks like #35468. I'll send a fix for that soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.