Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/gopls: `go mod tidy` cache entry races with file system changes #40269

Open
leitzler opened this issue Jul 17, 2020 · 23 comments
Open

x/tools/gopls: `go mod tidy` cache entry races with file system changes #40269

leitzler opened this issue Jul 17, 2020 · 23 comments
Labels
Milestone

Comments

@leitzler
Copy link
Contributor

@leitzler leitzler commented Jul 17, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14.4 linux/amd64

golang.org/x/tools/gopls@v0.0.0-20200716193322-f2c07d7d8ec1

What did you do?

After updating x/tools & gopls from 0cc1aa72b347 to f2c07d7d8ec1 in govim we started to see flakes in our tempmodfile tests:

--- FAIL: TestScripts (5.29s)
    main_test.go:77: using gopls at "/tmp/gobin-gopls-installdir164365085/gopls"
    --- FAIL: TestScripts/scenario_tempmodfile (0.00s)
        --- FAIL: TestScripts/scenario_tempmodfile/modfile_changes (56.47s)
            testscript.go:382: 
                WORK=/artefacts/govim/cmd/govim/scenario_tempmodfile/script-modfile_changes
                # A simple test that verifies we get a diagnostic for imports
                # that are missing from our go.mod file (0.000s)
                # Open main.go and save to ensure we are actually doing something via gopls (0.225s)
                # Sleep because we don't have an event we are waiting for (30.000s)
                # Verify that there have been no changes to go.mod (0.000s)
                # Verify that we diagnostics for the import in main.go that can't
                # be satisfied by the go.mod on disk (25.785s)
                > vimexprwait errors.golden GOVIMTest_getqflist()
                [stderr]
                -[]
                +[
                +  {
                +    "bufname": "main.go",
                +    "col": 8,
                +    "lnum": 3,
                +    "module": "",
                +    "nr": 0,
                +    "pattern": "",
                +    "text": "example.com/blah is not in your go.mod file.",
                +    "type": "",
                +    "valid": 1,
                +    "vcol": 0
                +  }
                +]
                [exit status 1]
                FAIL: testdata/scenario_tempmodfile/modfile_changes.txt:20: unexpected command failure

Essence of the test case that runs with tempModfile set:

# Open main.go and save to ensure we are actually doing something via gopls
vim ex 'e main.go'
vim ex 'w'

# Sleep because we don't have an event we are waiting for
sleep $GOVIM_ERRLOGMATCH_WAIT

# Verify that there have been no changes to go.mod
cmp go.mod go.mod.golden

# Verify that we diagnostics for the import in main.go that can't
# be satisfied by the go.mod on disk
vimexprwait errors.golden GOVIMTest_getqflist()

-- go.mod --
module mod.com

-- go.mod.golden --
module mod.com

-- main.go --
package main

import "example.com/blah"

func main() {
    println(blah.Name)
}
-- errors.golden --
[
  {
    "bufname": "main.go",
    "col": 8,
    "lnum": 3,
    "module": "",
    "nr": 0,
    "pattern": "",
    "text": "example.com/blah is not in your go.mod file.",
    "type": "",
    "valid": 1,
    "vcol": 0
  }
]

Full logs from gopls is included in govim/govim#909.

@leitzler
Copy link
Contributor Author

@leitzler leitzler commented Jul 17, 2020

FYI - just updated the update pull request to include the latest commits as well, now at 6ddee643 - govim/govim#908

@myitcv
Copy link
Member

@myitcv myitcv commented Jul 17, 2020

A further three flakes as part of the merge of upgrading to golang/tools@6ddee64: https://github.com/govim/govim/actions/runs/172568134

@stamblerre stamblerre modified the milestones: Unreleased, gopls/v0.4.4 Jul 17, 2020
@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Jul 17, 2020

Thanks for the report. I'm a bit confused by the error messages here:


[Trace - 22:08:45.140 PM] Received notification 'window/logMessage'.
Params: {"type":1,"message":"2020/07/16 22:08:45 warning: diagnose go.mod: err: exit status 1: stderr: go: finding module for package example.com/blah\ngo: found example.com/blah in example.com/blah v1.0.0\nmod.com: no Go source files\n\n\tdirectory=/artefacts/govim/cmd/govim/scenario_tempmodfile/script-modfile_changes\n"}


[Trace - 22:08:45.141 PM] Received notification 'window/logMessage'.
Params: {"type":1,"message":"2020/07/16 22:08:45 warning: diagnose go.mod: err: exit status 1: stderr: go: finding module for package example.com/blah\ngo: found example.com/blah in example.com/blah v1.0.0\nmod.com: no Go source files\n\n\tdirectory=/artefacts/govim/cmd/govim/scenario_tempmodfile/script-modfile_changes\n"}

How does govim specify the GOPROXY?

Worth noting that we have an equivalent test in the gopls regression test suite, which we haven't noticed to be flaky: test here.

@leitzler
Copy link
Contributor Author

@leitzler leitzler commented Jul 17, 2020

The tests uses Rogers goproxytest fork - github.com/rogpeppe/go-internal/goproxytest so the module is defined as:

$ cat example.com_blah_v1.0.0.txt
-- .mod --
module example.com/blah

-- .info --
{"Version":"v1.0.0","Time":"2018-10-22T18:45:39Z"}

-- go.mod --
module example.com/blah

-- main.go --
package blah

const Name = "example.com/blah"
@leitzler
Copy link
Contributor Author

@leitzler leitzler commented Jul 17, 2020

Regarding the error messages you see, I had another failure log opened and it didn't show there. I also opened up the gopls log linked from govim/govim#909 (comment) and no such entry there either.

I'm not sure that it has anything to do with the flakes.

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Jul 17, 2020

How long do the tests wait for the diagnostics? Is the main.go file on disk when the tests starts, or does it only get saved later on?

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Jul 17, 2020

Basically, go mod tidy is going to fail if main.go is not on disk yet (hence the mod.com: no Go source files error message). It will run successfully any time after that (since it seems like main.go is then saved, and a number of other file changes then come in). Each didChangeWatchedFiles request will cancel the previous one, so the correct diagnostics should only show up after the work from the final one (looks like there are 5) is completed. Depending on how long the tests wait for diagnostics, I could see a scenario where the diagnostics never finish being computed before the server is shut down.

@leitzler
Copy link
Contributor Author

@leitzler leitzler commented Jul 17, 2020

Using the govim.log from govim/govim#909 (comment) so that we have a common point to look at, I copied the file to a gist as well.

main.go is on the disk from the start, it is opened in the editor:
https://gist.github.com/leitzler/651250b175f76cb722ed4dd9599277eb#file-govim-log-L587
govim sends DidOpen to gopls:
https://gist.github.com/leitzler/651250b175f76cb722ed4dd9599277eb#file-govim-log-L622
main.go is saved:
https://gist.github.com/leitzler/651250b175f76cb722ed4dd9599277eb#file-govim-log-L653
CodeAction() called:
https://gist.github.com/leitzler/651250b175f76cb722ed4dd9599277eb#file-govim-log-L660
...
a 30s delay:
https://gist.github.com/leitzler/651250b175f76cb722ed4dd9599277eb#file-govim-log-L816

and then the rest of the log trying to see if gopls did send any diagnostics, that doesn't happen

@leitzler
Copy link
Contributor Author

@leitzler leitzler commented Jul 17, 2020

Each didChangeWatchedFiles request will cancel the previous one, so the correct diagnostics should only show up after the work from the final one

Comparing the log above against a test where it works, both of them has the same didChangeWatchedFiles call as the last one:

not working:

2020-07-17T09:01:41.863543_#3: gopls.DidChangeWatchedFiles() call; params:
2020-07-17T09:01:41.863543_#3: &protocol.DidChangeWatchedFilesParams{
2020-07-17T09:01:41.863543_#3:     Changes: {
2020-07-17T09:01:41.863543_#3:         {URI:"file:///artefacts/govim/cmd/govim/scenario_tempmodfile/script-modfile_changes/main.go", Type:2},
2020-07-17T09:01:41.863543_#3:     },
2020-07-17T09:01:41.863543_#3: }

working:

2020-07-17T10:41:35.622519_#4: gopls.DidChangeWatchedFiles() call; params:
2020-07-17T10:41:35.622519_#4: &protocol.DidChangeWatchedFilesParams{
2020-07-17T10:41:35.622519_#4:     Changes: {
2020-07-17T10:41:35.622519_#4:         {URI:"file:///artefacts/govim/cmd/govim/scenario_tempmodfile/script-modfile_changes/main.go", Type:2},
2020-07-17T10:41:35.622519_#4:     },
2020-07-17T10:41:35.622519_#4: }

and for reference, when it works the PublishDiagnostics seems to arrive just a few milliseconds later:

2020-07-17T10:41:35.632730_#4: PublishDiagnostics callback: &protocol.PublishDiagnosticsParams{
2020-07-17T10:41:35.632730_#4:     URI:         "file:///artefacts/govim/cmd/govim/scenario_tempmodfile/script-modfile_changes/main.go",
2020-07-17T10:41:35.632730_#4:     Version:     0,
2020-07-17T10:41:35.632730_#4:     Diagnostics: {
2020-07-17T10:41:35.632730_#4:         {
2020-07-17T10:41:35.632730_#4:             Range: protocol.Range{
2020-07-17T10:41:35.632730_#4:                 Start: protocol.Position{Line:2, Character:7},
2020-07-17T10:41:35.632730_#4:                 End:   protocol.Position{Line:2, Character:25},
2020-07-17T10:41:35.632730_#4:             },
2020-07-17T10:41:35.632730_#4:             Severity:           2,
2020-07-17T10:41:35.632730_#4:             Code:               nil,
2020-07-17T10:41:35.632730_#4:             Source:             "go mod tidy",
2020-07-17T10:41:35.632730_#4:             Message:            "example.com/blah is not in your go.mod file",
2020-07-17T10:41:35.632730_#4:             Tags:               nil,
2020-07-17T10:41:35.632730_#4:             RelatedInformation: nil,
2020-07-17T10:41:35.632730_#4:         },
2020-07-17T10:41:35.632730_#4:     },
2020-07-17T10:41:35.632730_#4: }
@leitzler
Copy link
Contributor Author

@leitzler leitzler commented Jul 17, 2020

It wasn't obvious in my previous comments; the test waits 30 seconds for the diagnostics to complete so that shouldn't be an issue.

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Jul 17, 2020

Ok, thanks for the clarifications. I will try to write a gopls regression test that exactly mimics this flow to see if I can reproduce the issue. I am still a little bit confused though... here are the file change events that gopls sees for main.go:

[Trace - 09:01:41.667 AM] Sending notification 'textDocument/didOpen'.
Params: {"textDocument":{"uri":"file:///artefacts/govim/cmd/govim/scenario_tempmodfile/script-modfile_changes/main.go","languageId":"go","version":1,"text":"package main\n\nimport \"example.com/blah\"\n\nfunc main() {\n\tprintln(blah.Name)\n}\n"}}


[Trace - 09:01:41.860 AM] Sending notification 'workspace/didChangeWatchedFiles'.
Params: {"changes":[{"uri":"file:///artefacts/govim/cmd/govim/scenario_tempmodfile/script-modfile_changes/main.go","type":3}]}


[Trace - 09:01:41.862 AM] Sending notification 'workspace/didChangeWatchedFiles'.
Params: {"changes":[{"uri":"file:///artefacts/govim/cmd/govim/scenario_tempmodfile/script-modfile_changes/main.go","type":1}]}


[Trace - 09:01:41.862 AM] Sending notification 'workspace/didChangeWatchedFiles'.
Params: {"changes":[{"uri":"file:///artefacts/govim/cmd/govim/scenario_tempmodfile/script-modfile_changes/main.go","type":2}]}


[Trace - 09:01:41.863 AM] Sending notification 'textDocument/didSave'.
Params: {"textDocument":{"version":1,"uri":"file:///artefacts/govim/cmd/govim/scenario_tempmodfile/script-modfile_changes/main.go"}}

So, from what I can tell, we have:

  • main.go is opened
  • main.go is deleted from disk
  • main.go is created on disk
  • main.go is changed on disk
  • main.go is saved

Obviously gopls should handle whatever file changes come its way, but I just wanted to make sure this is intentional.

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Jul 17, 2020

To reproduce this, I think need information about the file changes that happen on disk (how is the file text edited?). What state are the files in when the code action and formatting requests happen?

@leitzler
Copy link
Contributor Author

@leitzler leitzler commented Jul 18, 2020

I'm not surprised by the file events, even if I can't explain from the top of my head why we get a delete/create pair. File watching is a mess.. The sequence looks the same both when the test pass and fail, and there hasn't been any changes recently on the govim side with regards to this.

Just some details on how events are mapped on Linux, where the flakes appear:

  • Rename / Remove => Remove
  • Chmod / Write => Changed
  • Create => Created

(how is the file text edited?)

The file isn't edited at all, it's just opened up in a running editor, and then saved back to disk. That's it.

Recap with the same log as before:

  1. "User" to VIM - "Open file"
  2. VIM to govim - an event that the file is opened
  3. govim to gopls - DidOpen()
  4. user to VIM - "Save file"
  5. VIM to govim - event that the file is being saved
  6. govim to gopls - CodeAction()
  7. govim to gopls - Formatting()
  8. govim to gopls - DidChangeWatchedFiles() - detected file removed/renamed
  9. govim to gopls - DidChangeWatchedFiles() - detected file created
  10. govim to gopls - DidChangeWatchedFiles() - detected file chmod/write
  11. VIM to govim - event that the file has been saved
  12. govim to gopls - DidSave()
  13. govim to gopls - DidChangeWatchedFiles() - detected file chmod/write

When the test fails it waits here for 30 seconds for diagnostics, as in the log linked. Otherwise gopls sends a PublishDiagnostics request after a few milliseconds here.

The order of events is the same when it works and when it doesn't.

@myitcv
Copy link
Member

@myitcv myitcv commented Jul 19, 2020

Thanks for the detailled analysis, @leitzler.

@stamblerre - I think the file change events are a red herring because the file is open in editor (and gopls knows that). Previously govim used to filter file change events for files already open, but per @findleyr they are no longer filtered govim/govim@fc0e07f (note this change was made back in February).

I'm not surprised by the file events, even if I can't explain from the top of my head why we get a delete/create pair. File watching is a mess.

I think it's probably a combination of how Vim writes files in certain situations and file watching. Both are a mess, but per my comment above, I think these details are actually irrelevant in this case.

The sequence looks the same both when the test pass and fail, and there hasn't been any changes recently on the govim side with regards to this.

This point is also relevant.

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Jul 19, 2020

@stamblerre - I think the file change events are a red herring because the file is open in editor (and gopls knows that).

This is true, but the go command still only sees files that are on disk. For the -modfile flag, we do use whatever content gopls is aware of so it should work either way. Still, my suspicion really is that the file change events are playing in some way with a cache key or affecting the way the output of the go command. We already have the equivalent regression test (without the file change events), and we've never seen it flake.

In any case, it would be really good to know what the exact file content changes are, because then I can reproduce this consistently with a gopls regression test. We have a little bit more control with those tests (we use progress notifications), so I will be able to reproduce this consistently, rather than debugging a flake.

@leitzler
Copy link
Contributor Author

@leitzler leitzler commented Jul 19, 2020

In any case, it would be really good to know what the exact file content changes are

The content never change.

Is it something missing in my comments above? Or some logging that we can activate that would help?

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Jul 19, 2020

I'm wondering about the DidChangeWatchedFiles notifications, since they don't have content associated with them. When the file is created again, does it have content in it or is it empty? What is the content that's written to the file in the two DidChangeWatchedFiles notifications that are changes (not creation/deletion)? Or is it just always the same?

@leitzler
Copy link
Contributor Author

@leitzler leitzler commented Jul 19, 2020

Always the same, the remove/create events are either an artifact from gvim or the filesystem watcher. It's just the same file with the same content through the entire test.

@gopherbot
Copy link

@gopherbot gopherbot commented Jul 19, 2020

Change https://golang.org/cl/243537 mentions this issue: internal/lsp: handle on-disk file deletions for opened files

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Jul 19, 2020

I think I caught the issue in https://golang.org/cl/243537. I've marked it only as "Updates" because I can't be sure that was the only issue, so once it's merged I'll follow up to see if that fixed the issue.

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Jul 19, 2020

Actually, this race exposes a deeper issue, which is that the way we cache go mod tidy diagnostics is not fully correct. We use the state of the workspace at the given snapshot to construct a cache key, but the go mod tidy command may actually run at a much later point in time. The contents of the files on disk at the point may be completely different -- that's the issue here -- main.go is deleted by the time we run the command, and as a result, we have the wrong cache key.

The best solution I can think of right now is to check if any state has changed on-disk before we run go mod tidy, and if so, to return an error and purge the entry from the cache. This wouldn't fully fix the race, but it might mitigate it. @heschik, @findleyr: What do you think?

Yet another case that could be fixed with #39958...

@stamblerre stamblerre modified the milestones: gopls/v0.4.4, gopls/v0.5.0 Jul 20, 2020
gopherbot pushed a commit to golang/tools that referenced this issue Jul 21, 2020
Previously, we only updated the opened file's overlay, but not the
snapshot. This meant that the snapshot was still operating with stale
data. Invalidating the snapshot creates a new snapshot with the correct
set of overlays.

The test is skipped because it will flake until we have a better caching
strategy for `go mod tidy` results.

Updates golang/go#40269

Change-Id: Ia8d1ae75127a1d18d8877923e7a5b25b7bd965ac
Reviewed-on: https://go-review.googlesource.com/c/tools/+/243537
Run-TryBot: Rebecca Stambler <rstambler@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Jul 22, 2020

https://golang.org/cl/243537 addressed some of the issues I noticed while debugging this test, but it doesn't resolve the problem. I think the next step here will be to add an API to the cache that allows us to delete entries, so that we can check if the workspace state changed between the time we cached and the time we ran the command.

I think that will likely happen after @heschik's changes to the cache, so in the meantime, I am hoping this is enough of an edge case that it shouldn't be a critical concern. Setting v1.0.0 as the milestone, as I think it would still be best to have this in by then.

@stamblerre stamblerre modified the milestones: gopls/v0.5.0, gopls/v1.0.0 Jul 22, 2020
@stamblerre stamblerre changed the title x/tools/gopls: possible race when using tempModfile x/tools/gopls: `go mod tidy` race with file system changes Jul 22, 2020
@stamblerre stamblerre changed the title x/tools/gopls: `go mod tidy` race with file system changes x/tools/gopls: `go mod tidy` cache entry races with file system changes Jul 22, 2020
@myitcv
Copy link
Member

@myitcv myitcv commented Jul 30, 2020

For completeness we are still seeing this as of 6467de6f59a7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.