Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/gopls: client receives old diagnostics #36601

Closed
myitcv opened this issue Jan 16, 2020 · 18 comments
Closed

x/tools/gopls: client receives old diagnostics #36601

myitcv opened this issue Jan 16, 2020 · 18 comments

Comments

@myitcv
Copy link
Member

@myitcv myitcv commented Jan 16, 2020

What version of Go are you using (go version)?

$ go version
go version devel +d2de9bd59c Thu Jan 16 04:02:37 2020 +0000 linux/amd64
$ go list -m golang.org/x/tools
golang.org/x/tools v0.0.0-20200116062425-473961ec044c
$ go list -m golang.org/x/tools/gopls
golang.org/x/tools/gopls v0.1.8-0.20200116062425-473961ec044c

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/myitcv/.cache/go-build"
GOENV="/home/myitcv/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/myitcv/gostuff"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/myitcv/gos"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/myitcv/gos/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/myitcv/gostuff/src/github.com/myitcv/govim/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build511293922=/tmp/go-build -gno-record-gcc-switches"

What did you do?

This may or may not be a problem; raising for completeness given previous CLs have sought to prevent old diagnostics being sent to the client.

Per conversations with @stamblerre we have a defence against this in place in govim so we are not seeing any problems in our tests. i.e. if these old diagnostics are ignored, the non-old diagnostics are as expected.

In a few tests we randomly see gopls sending govim old diagnostics. That is for a given file, we receive diagnostics for a version less than the latest version we received from gopls.

This doesn't appear to be specific to a given test/scenario.

Attached is a gopls log file from a test that looks to verify case insensitive completion:

gopls.log

The sequence of diagnostic notifications here is:

main.go v3
main.go v4
main.go v5
main.go v6
main.go v3
main.go v6
main.go v3
main.go v8
main.go v9

What did you expect to see?

Monotonically increasing version numbers for a file's diagnostics.

What did you see instead?

As above.


cc @stamblerre

FYI @leitzler

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Jan 16, 2020

@stamblerre I'll leave you to decide whether this is v0.3.0 or not. As I mention above we appear to be fine if we ignore these old diagnostics. But as I said, I'm raising for completeness' sake.

@gopherbot gopherbot added this to the Unreleased milestone Jan 16, 2020
@gopherbot gopherbot added the Tools label Jan 16, 2020
@stamblerre stamblerre modified the milestones: Unreleased, gopls unplanned, gopls/v1.0.0 Jan 16, 2020
@stamblerre

This comment has been minimized.

Copy link
Contributor

@stamblerre stamblerre commented Jan 16, 2020

This does seem like a bug, particularly because it's this v3 that keeps coming back. I'm not sure if it's a v0.3.0 blocker, but it's definitely something to look into soon.

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Jan 17, 2020

Following a conversation with @leitzler yesterday, I think we need to consider this a v0.3.0 blocker.

I will try to explain why by making clear my understanding on a few points. Because if any of this is wrong my argument might well fall to pieces!

  1. files that are not open in the editor have v == 0 in gopls
  2. files that are open in the editor have v >= 1 and this version is controlled by the LSP client
  3. open files that are subsequently closed are then have v == 0
  4. gopls only sends diagnostics for a given file if those diagnostics have changed, regardless of whether the file is open in the editor or not (correct as of 872a348c3885)
  5. hence the client might end up in a situation where an open file is v4 but the current diagnostics for that file sent from gopls are v2 (because there were no changes to the diagnostics as a result of the changes in v3 and v4, or any other files for that matter)
  6. diagnostics for a file can change without the version of that file changing, as a result of changes in other files
  7. couple points 4, 5 and 6 with the async nature of diagnostics and the client cannot know whether, at any given point in time, it has the current diagnostics for any given file; it only knows that, under normal operation, the "right" diagnostics will eventually arrive
  8. given point 6, the client cannot know how to order diagnostics for a file when the version number in the diagnostics does not change, regardless of whether the file is open or not
  9. given point 8, the server has to guarantee that it will only ever send diagnostics for any given file according to the order defined by the time of the snapshot against which the diagnostics were calculated
  10. correspondingly, the client has to blindly trust that the server is, at any given point in time, sending it "later" diagnostics for any given file
  11. it's possible that the client will not receive diagnostics for all versions of a file

On Slack last night, I think @heschik reached the same conclusion as point 9:

I think maybe we should put back the recency check on diagnostics, but do it on snapshot ID instead of file version.

Given point 8 above (and the conclusion in point 9) I therefore think it is wrong/impossible for the client to try and drop non-new diagnostics. Hence, per @leitzler's observation, we should not be merging govim/govim#663.

It was suggested in #36452 (comment) that the client should only apply diagnostics for a file if that version matches the current (from the client's perspective) file version. Given points 4, 5 and 7 above, I think we have to conclude this is not possible. Rather that the client has to apply diagnostics on a "best efforts" basis, in the knowledge that, under normal operation, the "right" diagnostics will eventually arrive and "put things right".

My conclusions from the above are therefore:

  • we have to make this issue a blocker for v0.3.0, and per @heschik implement a recency filter based on the time of snapshot against which diagnostics for a file were calculated
  • we should not merge govim/govim#663

Any corrections/comments etc on the above gratefully received 😄

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Jan 17, 2020

Change https://golang.org/cl/215298 mentions this issue: internal/lsp: add recency check to avoid sending old diagnostics

myitcv pushed a commit to myitcvforks/tools that referenced this issue Jan 17, 2020
This change uses snapshot IDs instead of file versions to determine if
diagnostics are stale.

Fixes golang/go#36601.

Change-Id: I7727a30222fecf2d7733fa013c6cee6338ffd968
@myitcv myitcv reopened this Jan 17, 2020
@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Jan 17, 2020

Rather bizarrely, we're still seeing this with golang/tools@633b092.

main.go v2
main.go v3
main.go v4
main.go v5
main.go v6
main.go v3
main.go v8
main.go v9

gopls log file attached: fail.log

@stamblerre stamblerre modified the milestones: gopls/v1.0.0, gopls/v0.3.0 Jan 17, 2020
@stamblerre

This comment has been minimized.

Copy link
Contributor

@stamblerre stamblerre commented Jan 17, 2020

Some quick observations: textDocument/didSave gets sent with versions 3, 5, 7, and 9. The stray diagnostics come for version 3, instead of version 7. My suspicion is that this relates to saving and the special cases with saving somehow.

@zikaeroh

This comment has been minimized.

Copy link
Contributor

@zikaeroh zikaeroh commented Jan 17, 2020

I too see something like this, but without any saving (maybe not the same, then?). A very quick example where I see a change to v3, and then gopls responds with an empty v1. I don't see any diagnostics for v2, presumably because they're the same as the "real" v1.

https://gist.github.com/zikaeroh/4146e6e154d622317322bab903c24fd8

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Jan 17, 2020

I'm able to reliably reproduce this locally and on Travis for various tests. I haven't spotted a pattern yet as to why/when this happens.

Reproducing locally via:

GOVIM_ERRLOGMATCH_WAIT=30s ./_scripts/captureLogs.sh /tmp/blah go test -count=1 ./...

Then finding affected tests via:

grep -r "Received non-new" /tmp/blah

In each script directory beneath /tmp/blah (that location is of course arbitrary) you will find govim, gppls and Vim log files.

My machine is not super powerful which might be significant. Whether you run with race or not also appears to not be significant. But adding -race to the go test call above you can, of course, create increased load which might make this more likely.

Removing the saves (as suggested by @stamblerre) also didn't appear to "fix" things. Instead we get a slightly different pattern of diagnostics for this test:

main.go v2
main.go v3
main.go v4
main.go v5
main.go v6
main.go v3
main.go v8
main.go v5

Indeed I'm not really sure there's much of a pattern.

Given the logic here:

https://github.com/golang/tools/blob/0cba7a3a9ee9d9a5917ed56b9668e7ad35eaa571/internal/lsp/diagnostics.go#L125

we know the snapshot ID of the diagnostics we receive must be greater than or equal to that we received previously. Indeed through some simple logging I can confirm in this case the snapshot ID is indeed greater.

So it appears the question we need to answer is: how can a later snapshot get an earlier version of a file?

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Jan 18, 2020

Some progress.

Last night Rebecca asked (via Slack):

Does that test change files back

As in, "during the test, do any of the files end up with the same contents as previous versions?"

I answered "no" - but that answer was wrong. Really sorry about that @stamblerre.

In fact, v7 of the file is identical to v3. And v9 is identical to v5. And that seems to correspond with the output above.

The good news is that, despite the wrong version numbers in the diagnostics, the result is correct. i.e. the diagnostics are what we are expecting.

But the wrong version numbers seem to suggest to me that we are caching stuff for too long. The fact that we are able to find a cache hit for something that happened four versions ago seems strange - because surely we should only be caching the last diagnostics for any given file? But I'm only guessing as to what might be happening here. Whatever the cause, we should fix the fact that old version numbers are sent as part of the diagnostics.

Therefore, subject to any further issues that @zikaeroh might report, this is probably not a v0.3.0 blocker.

What do you think @stamblerre?

@leitzler

This comment has been minimized.

Copy link
Contributor

@leitzler leitzler commented Jan 18, 2020

So basically back to #36476?

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Jan 18, 2020

Possibly, but now at least the diagnostics are correct, only the version is wrong.

@leitzler

This comment has been minimized.

Copy link
Contributor

@leitzler leitzler commented Jan 18, 2020

👍 I think that I never saw any incorrect diagnostics back then either, but might have missed that.

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Jan 18, 2020

Sorry, accidentally sent that last message too early.

Before our conclusions in #36601 (comment), we were making things worse by ignoring diagnostics for old file versions. It's quite possible, as you say, that the diagnostics were actually correct in those cases (but wouldn't have been guaranteed to be correct in all cases) and we were making them wrong by ignoring the diagnostics marked with old versions.

But given that a) we now filter diagnostics as a function of snapshot ID and b) we have removed the client filter (ignoring for one second that it wouldn't have worked properly in any case), we are at least able to see and hypothesise that only the file version number is wrong.

@zikaeroh

This comment has been minimized.

Copy link
Contributor

@zikaeroh zikaeroh commented Jan 18, 2020

In my case, I don't see that the diagnostics are correct. Here is a screencast and corresponding logs:

Starting with:

package main

func main() {
	x := 123
}

I see an error on x as it is unused. I hit enter after the 123. The error remains, as x is still unused. Now, I hit Ctrl+Z to undo the newline. The error goes away until I change the file to something else.

This is seen in the logs, where they start with didOpen:

{
  "textDocument": {
    "uri": "file:///home/jake/testproj/gopls/lastline/main.go",
    "languageId": "go",
    "version": 1,
    "text": "package main\n\nfunc main() {\n\tx := 123\n}\n"
  }
}

Followed by publishDiagnostics:

{
  "uri": "file:///home/jake/testproj/gopls/lastline/main.go",
  "version": 1,
  "diagnostics": [
    {
      "range": { "start": { "line": 3, "character": 1 }, "end": { "line": 3, "character": 2 } },
      "severity": 1,
      "source": "compiler",
      "message": "x declared but not used"
    }
  ]
}

Now, me hitting enter does:

{
  "textDocument": {
    "uri": "file:///home/jake/testproj/gopls/lastline/main.go",
    "version": 2
  },
  "contentChanges": [
    {
      "range": { "start": { "line": 3, "character": 9 }, "end": { "line": 3, "character": 9 } },
      "rangeLength": 0,
      "text": "\n\t"
    }
  ]
}

With no diagnostics sent, as they appear to be identical. Then, the undo results in the didChange:

{
  "textDocument": {
    "uri": "file:///home/jake/testproj/gopls/lastline/main.go",
    "version": 3
  },
  "contentChanges": [
    {
      "range": { "start": { "line": 3, "character": 9 }, "end": { "line": 4, "character": 1 } },
      "rangeLength": 2,
      "text": ""
    }
  ]
}

And the publishDiagnostics back:

{
  "uri": "file:///home/jake/testproj/gopls/lastline/main.go",
  "version": 1,
  "diagnostics": []
}

gopls is sending back "version 1 has no diagnostics", and not "version 3 has these diagnostics" (or sending nothing, since the diagnostics shouldn't have changed in any of these versions), the effect of which (in my editor) appears to be to just display nothing.

Maybe this is some other bug, as it's both old and wrong diagnostics. I plan to bisect this.

@zikaeroh

This comment has been minimized.

Copy link
Contributor

@zikaeroh zikaeroh commented Jan 18, 2020

Bisect says: https://go-review.googlesource.com/c/tools/+/214586

A log of the same done at the commit before that CL: https://gist.github.com/zikaeroh/deefe190ba7df51ef63e2646b5d3daad

In this log, there's only a single publishDiagnostics for the initial version and the singular error, presumably because 1, 2, and 3 all have the same error set. But in this case, gopls isn't "revising" the diagnostic list for version 1 after the fact.

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Jan 18, 2020

I can confirm we're also see #36601 (comment) in govim as of 0cba7a3a9ee9

@myitcv

This comment has been minimized.

Copy link
Member Author

@myitcv myitcv commented Jan 19, 2020

FWIW we're having success using a partial revert of CL 214586. I've pushed up https://go-review.googlesource.com/c/tools/+/215239 to show that partial revert.

But I totally defer to @stamblerre on the proper fix. We're only doing this temporarily to fix govim master (@zikaeroh pointed out the problem after we had merged our upgrade to x/tools/gopls)

@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Jan 21, 2020

Change https://golang.org/cl/215680 mentions this issue: internal/lsp: use correct file identities when computing diagnostics

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants
You can’t perform that action at this time.