Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: 'Access is denied' when renaming module cache directory #36568

Closed
kiwionly opened this issue Jan 15, 2020 · 88 comments
Closed

cmd/go: 'Access is denied' when renaming module cache directory #36568

kiwionly opened this issue Jan 15, 2020 · 88 comments
Labels
ExpertNeeded FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows
Milestone

Comments

@kiwionly
Copy link

What version of Go are you using (go version)?

13.6

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

Window 10, intel

go env Output
$ go env

What did you do?

I had a mod file which is 1.13

When I run go build, it download the dependency, however, it end with error like this. e.g.

cannot load golang.org/x/net/html: rename C:\Users\502380\go\pkg\mod\golang.org\x\net@v0.0.0-20190620200207-3b0461eec859.tmp-839788141 C:\Users\502380\go\pkg\mod\golang.org\x\net@v0.0.0-20190620200207-3b0461eec859:

What did you expect to see?

build should successful without any error.

What did you see instead?

#This is running in jetbrain goland

go: extracting golang.org/x/net v0.0.0-20190620200207-3b0461eec859
build command-line-arguments: cannot load golang.org/x/net/html: rename C:\Users\502380\go\pkg\mod\golang.org\x\net@v0.0.0-20190620200207-3b0461eec859.tmp-839788141 C:\Users\502380\go\pkg\mod\golang.org\x\net@v0.0.0-20190620200207-3b0461eec859: Access is denied.

@kiwionly kiwionly changed the title Cannot rename dependency mod from name@version.tmp-xxx to name@version Cannot build: when rename dependency mod from name@version.tmp-xxx to name@version, fail Jan 15, 2020
@cagedmantis cagedmantis changed the title Cannot build: when rename dependency mod from name@version.tmp-xxx to name@version, fail cmd/go: Cannot build: when rename dependency mod from name@version.tmp-xxx to name@version, fail Jan 15, 2020
@cagedmantis cagedmantis added this to the Backlog milestone Jan 16, 2020
@cagedmantis cagedmantis added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jan 16, 2020
@cagedmantis cagedmantis changed the title cmd/go: Cannot build: when rename dependency mod from name@version.tmp-xxx to name@version, fail cmd/go: cannot build when rename dependency mod from name@version.tmp-xxx to name@version Jan 16, 2020
@cagedmantis
Copy link
Contributor

@kiwionly Could you provide an example of the code that produced the error, preferably as a Go playground link?

@jayconrod
Copy link
Contributor

Seems like the same issue as microsoft/vscode-go#2897. We've never been able to reproduce this.

Could you post a listing of the directory C:\Users\502380\go\pkg\mod\golang.org\x\?
Are you able to reproduce this or did it just happen once? How about after clearing the module cache with go clean -modcache?

@jayconrod jayconrod changed the title cmd/go: cannot build when rename dependency mod from name@version.tmp-xxx to name@version cmd/go: ' Jan 16, 2020
@jayconrod jayconrod changed the title cmd/go: ' cmd/go: 'Access is denied' when renaming module cache directory Jan 16, 2020
@kiwionly
Copy link
Author

kiwionly commented Jan 17, 2020

@cagedmantis @jayconrod I am doing go build with module using Jetbrain Goland, some how command prompt also able to produce the error.

Note that my application contain a lot of dependency, and i need to rerun > go build , sometime it success, sometime it will fail with error message below ( depend on the package, some package will have issue, then I run go build again, no more issue/)

Here is the error when running in command prompt using go get, but actually the error is same ( except go build is loading a lot of dependency compare to go get below, but github.com/go-playground/locales is the one that always fail )

C:\Users\502380\GolandProjects\myproject>go get github.com/go-playground/universal-translator
go: finding github.com/go-playground/universal-translator v0.17.0
go: downloading github.com/go-playground/universal-translator v0.17.0
go: extracting github.com/go-playground/universal-translator v0.17.0
go: downloading github.com/go-playground/locales v0.13.0
go: extracting github.com/go-playground/locales v0.13.0
..\..\go\pkg\mod\github.com\go-playground\universal-translator@v0.17.0\errors.go:7:2: rename C:\Users\502380\go\pkg\mod\github.com\go-playground\locales@v0.13.0.tmp-022886503
 C:\Users\502380\go\pkg\mod\github.com\go-playground\locales@v0.13.0: Access is denied.

@bcmills
Copy link
Contributor

bcmills commented Jan 22, 2020

For the rename of that directory, we use a separate lockfile:

unlock, err := lockVersion(mod)

// Check whether the directory was populated while we were waiting on the lock.
fi, err = os.Stat(dir)

if err := os.Rename(tmpDir, dir); err != nil {

So the only possibilities I can see for this failure are:

  1. A bug in our lockedfile.Mutex or os.Rename implementation is causing the file-locks to not actually synchronize the other I/O operations.
  2. Something (perhaps Goland, or perhaps an antivirus scanner?) is accessing one or both of the directories to be renamed without holding the associated file lock.
  3. A bug in the Windows kernel or filesystem implementation is causing the file-locks to not actually synchronize the other I/O operations.

Of course, it is also possible that I have missed something.

For cause (1), we should fix our implementation.

For cause (2) or (3), we should ensure that a bug is filed against the appropriate other project, and then we could perhaps use robustio.Rename instead of os.Rename if we believe that will help. However, I would want some way to empirically reproduce and measure the problem and the mitigation to verify that it actually does help.

CC @jstarks @alexbrainman @zx2c4 @networkimprov in case they have any insights into how to debug this.

@bcmills bcmills added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. DevExp anything around developer experience ExpertNeeded and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Jan 22, 2020
@networkimprov
Copy link

networkimprov commented Jan 22, 2020

@bcmills is this the relevant WinAPI call for lockedfile.Mutex? How is it used here?
https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-lockfileex

Since the error is intermittent, it seems like retrying os.Rename() might be the solution. Unrelated directory updates could have saturated the NTFS journal (@jstarks is that a reasonable guess?)

A way to test that might be a set of goroutines, each repeatedly renaming a separate file between two unique names in the same directory; there's no possibility of name collisions, only concurrent directory updates.

PS: If that's the problem, I don't think it's a bug in NTFS per se...

@bcmills
Copy link
Contributor

bcmills commented Jan 22, 2020

Yes, lockedfile.Mutex does use LockFileEx internally. (The implementation is in src/cmd/go/internal/lockedfile/internal/filelock/filelock_windows.go.)

@bcmills
Copy link
Contributor

bcmills commented Jan 22, 2020

What would it mean to saturate the NTFS journal? If the program is outpacing the buffering capacity of the filesystem, I would expect windows.MoveFileEx to block until the buffer clears far enough to enqueue the operation...

@networkimprov
Copy link

My guess about NTFS may be wrong. It's the only reason I could imagine for unexplained rename failures. My test concept above might prove it...

@networkimprov
Copy link

Some info on NTFS journaling is here, tho not linked to a particular Windows version. It describes "Log Full", but isn't clear about the end result for filesystem ops which encounter it. Some are retried, but not all? I can't tell.

http://www.ntfs.com/transaction.htm

@alexbrainman
Copy link
Member

CC @jstarks @alexbrainman @zx2c4 @networkimprov in case they have any insights into how to debug this.

I don't have any good ideas.

I am not familiar with that code. And I don't have free time to learn the code to debug it.

Alex

@bcmills bcmills removed the DevExp anything around developer experience label Jan 23, 2020
@networkimprov
Copy link

@bcmills here's a test app which repeatedly renames a set of files in a single directory. It defaults to 10 concurrent renaming loops, and a command line argument lets you pick the quantity.

I tried this using Go 1.13.3 on Win7 with 10, 33, 100, and 250 loops, but saw no errors. Maybe it behaves differently on Win10, or a higher loop count is required? Maybe it should also write files, not just change directory entries? Or maybe it disproves my hypothesis :-)

I watched the Task Manager while it ran, and noticed that SearchIndexer.exe also consumes a lot of cpu time.

package main

import ( "fmt"; "os"; "strconv" )

const kDir = "zwinrnd/"

func main() {
   var err error
   aGr := 10
   if len(os.Args) > 1 {
      aGr, err = strconv.Atoi(os.Args[1])
      if err != nil { panic(err) }
   }
   err = os.MkdirAll(kDir, 0700)
   if err != nil { panic(err) }
   for a := 0; a < aGr; a++ {
      go func(c int) {
         cA, cB := 'a', 'b'
         cFd, err := os.Create(kDir + fmt.Sprintf("%d-%c", c, cA))
         if err != nil && !os.IsExist(err) { panic(err) }
         cFd.Close()
         for {
            err = os.Rename(kDir + fmt.Sprintf("%d-%c", c, cA),
                            kDir + fmt.Sprintf("%d-%c", c, cB))
            if err != nil {
               fmt.Println(err)
            } else {
               if cA == 'a' { cA, cB = 'b', 'a'
               } else       { cA, cB = 'a', 'b' }
            }
         }
      }(a)
   }
   select {}
}

@qmuntal
Copy link
Contributor

qmuntal commented Feb 11, 2020

I see this error many times in my Windows 10 workstation when downloading dependencies that are not cached locally, and it also happen to all my colleagues. In our case it seems to be related to the on-access scanner enterprise antivirus detecting as a threat the go modules cache renames. It is happening since the adoption of modules on Go 1.11.

The only solution I've found is to create an scanning exception for the go/pkg/mod folder that I activate when this error occurs.

@alfmos
Copy link

alfmos commented Feb 11, 2020

I don’t know if this can be useful. I posted on vscode-go a way to reproduce the issue.
microsoft/vscode-go#2897

@bcmills
Copy link
Contributor

bcmills commented Feb 12, 2020

@alfmos, do you have repro steps that do not involve indirection through VSCode? Ideally we should have a regression test that we can check in to the main go repository and run on the Go project's builders, which means that it cannot depend on going through a specific third-party IDE.

@alfmos
Copy link

alfmos commented Feb 18, 2020

@bcmills I've just tried from cmd and I get the error.

go version: 1.13.7 windows/amd64
go env:
C:\Users\alfonso moscato>go env
set GO111MODULE=on
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\alfonso moscato\AppData\Local\go-build
set GOENV=C:\Users\alfonso moscato\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=d:\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=NUL
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=d:\temp\go-build604923562=/tmp/go-build -gno-record-gcc-switches

Steps to reproduce the issue:
C:\Users\alfonso moscato>go get -v github.com/mdempsky/gocode
go: finding github.com/mdempsky/gocode latest
go: finding golang.org/x/tools latest
go: extracting golang.org/x/tools v0.0.0-20200216192241-b320d3a0f5a2
build github.com/mdempsky/gocode: cannot load golang.org/x/tools/go/gcexportdata: rename d:\go\pkg\mod\golang.org\x\tools@v0.0.0-20200216192241-b320d3a0f5a2.tmp-116032742 d:\go\pkg\mod\golang.org\x\tools@v0.0.0-20200216192241-b320d3a0f5a2: Access is denied.

Tell me what data you need and I'll collect them.
I could let you inspect my pc from remote if needed

@jayconrod
Copy link
Contributor

I spent some more timing looking at this today. Based on the MoveFileEx documentation and various threads on Windows Q&A sites and mailing list, I think the most likely explanation for this issue is that another process (probably antivirus or search indexer) is opening files in the module directory before it's completely extracted.

Here's our normal sequence of operations at a high level:

  • os.Stat the module version directory, for example, %GOPATH%\pkg\mod\golang.org\pkg\mod@v0.2.0. If it exists, we assume it's complete.
  • Download the module version's zip file, if it's not already in the cache.
  • os.OpenFile on the module version's lock file, possibly creating it.
  • LockFileEx to lock the lock file.
  • os.Stat the module version directory again. If it exists, another process got the lock and extracted the module zip first, so we assume the directory is complete.
  • os.RemoveAll on temporary module version directories with a ".tmp-" suffix. These were created by other processes which failed. We hold the lock for this version, so these directories should not be actively used. Errors are ignored here.
  • Create a temporary module version directory with a ".tmp-" suffix. Extract the module zip file there.
  • os.Rename the temporary module version directory to its final location. This calls MoveFileEx, which is where we're encountering the error.
  • Make the directory and its contents read-only.
  • UnlockFileEx on the lock file.
  • Close on the lock file.

We're getting ERROR_ACCESS_DENIED on the os.Rename call. I've verified this error is reported when os.Rename is called on a directory when another process has a file open in that directory. So while I can't confirm this is what's happening, it seems likely. Note that by default, the module cache is not in a hidden directory: %USERPROFILE%\pkg\mod will probably be read by search indexers, backup tools, and anything else that watches the user's home directory.

I thought about an alternate flow where we extract a module version to its final location, then create a special file indicating everything has been extracted. However, if we encounter an error extracting the module, we would need to call os.RemoveAll on the partial directory the next time, and that would fail if other processes have files open. So it's just a different kind of flaky.


I don't think we can fix this, but some mitigations are possible.

First, we can check if the error is ERROR_ACCESS_DENIED and add some information to the error message for this specific case.

Second, we can retry os.Rename a few times before reporting an error. It looks like cmd/go/internal/robustio.Rename can already do this.

@S1erraQ
Copy link

S1erraQ commented Apr 27, 2020

@S1erraQ if you had the problem before, ensure cleaning your complete mod cache before retrying

I did try that before posting. No effect. Also the output above is in a fresh go folder as shown.

@jayconrod
Copy link
Contributor

@S1erraQ Thanks, that confirms this is the same issue.

I missed something though: Go 1.14.2 is compatible with the new behavior triggered by modcacheunzipinplace=1, but it doesn't actually implement that new behavior yet.

If you're comfortable living on the bleeding edge, gotip can build a Go toolchain from the master branch which has the new behavior. This will also be in the 1.15 beta, which should be out in June.

@Hades32
Copy link

Hades32 commented Apr 27, 2020

I missed something though: Go 1.14.2 is compatible with the new behavior triggered by modcacheunzipinplace=1, but it doesn't actually implement that new behavior yet.

Really? I specifically searched for the commit and traced it to release 1.14.2 and could confirm a change in behavior ONLY after setting the env. Seemed like a big coincident...

@jayconrod
Copy link
Contributor

@Hades32 Probably a coincidence. There is a random factor here.

1.14.2 has CL 221820, which adds compatibility with the new behavior. It also increases the retry interval from 500ms to 2000ms.

CL 221157 is the new behavior on master. It hasn't been backported since it's too big/risky.

@Hades32
Copy link

Hades32 commented Apr 28, 2020

@jayconrod thanks for the explanation. I guess then the 2s interval is what truly helped me...

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/230537 mentions this issue: doc/go1.15: add notes for GOMODCACHE, modcacheunzipinplace

gopherbot pushed a commit that referenced this issue Apr 28, 2020
For #36568
For #34527

Change-Id: Ieea4b4a7644e9c957f48d08d2e172e39b571502f
Reviewed-on: https://go-review.googlesource.com/c/go/+/230537
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
xujianhai666 pushed a commit to xujianhai666/go-1 that referenced this issue May 21, 2020
For golang#36568
For golang#34527

Change-Id: Ieea4b4a7644e9c957f48d08d2e172e39b571502f
Reviewed-on: https://go-review.googlesource.com/c/go/+/230537
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@herohenu
Copy link

@kiwionly do u use Ubuntu 18.04 on windows 10 ?
do u use Mcafee Endpoint Security ?
if so just search and find ubuntu1804.exe in your system
then right click ubuntu1804.exe-> properties -> compatibility(second tab) -> Setting(select run as administration ) ,then click Application (A) , all is ok ;

if it still build fail ,please click security tab , and click Advaced(V)
make sure your account access has Complete control ,default all allow is TrustedInstaller user ,this will prevented by Mcafee ;

wish it helped all who met this err ,I had troubled by it for months ;

@kiwionly
Copy link
Author

kiwionly commented Jul 1, 2020

@kiwionly do u use Ubuntu 18.04 on windows 10 ?
do u use Mcafee Endpoint Security ?
if so just search and find ubuntu1804.exe in your system
then right click ubuntu1804.exe-> properties -> compatibility(second tab) -> Setting(select run as administration ) ,then click Application (A) , all is ok ;

if it still build fail ,please click security tab , and click Advaced(V)
make sure your account access has Complete control ,default all allow is TrustedInstaller user ,this will prevented by Mcafee ;

wish it helped all who met this err ,I had troubled by it for months ;

@herohenu I build GO on window 10 with McAfee, unfortunately I no admin right to change any settings.

Anyway, thx for your remind.

@jasonmf
Copy link

jasonmf commented Aug 26, 2020

Also hit this issue:

  • Windows 10 1903/18362.1016 with debian under WSL 1.
  • go version go1.15 linux/amd64
  • Running McAfee AV. There's both the on-demand scanning and currently a schedule scan running
  • I had gone so far as to rm -rf ~/go/* but that didn't help
  • Worked around with: GODEBUG=modcacheunzipinplace=1 go mod tidy

For fellow McAfee sufferers, this vague page may be related: https://kc.mcafee.com/corporate/index?page=content&id=KB91411

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/258798 mentions this issue: cmd/go/internal/modfetch: always extract module directories in place

@egodigitus
Copy link

Is the fix from https://golang.org/cl/258798 part of release go1.16beta1 ?

@bcmills
Copy link
Contributor

bcmills commented Jan 27, 2021

@egodigitus, yes, that change is included in go1.16beta1.

@egodigitus
Copy link

go1.16beta1 fixed the bug for me 👍
OS: Windows 10
Running golang inside WSL Ubuntu 20.04

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
ExpertNeeded FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows
Projects
None yet
Development

No branches or pull requests