Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: file write blocks subsequent Open() with "sharing violation" in OneDrive #35219

Open
networkimprov opened this issue Oct 28, 2019 · 22 comments

Comments

@networkimprov
Copy link

@networkimprov networkimprov commented Oct 28, 2019

What version of Go are you using (go version)?

$ go version
go version go1.13.3 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

Windows 10, amd64 (go env not available on that machine)

What did you do?

EDIT: updated reproducer

package main

import ( "io/ioutil"; "time" )

func main() {
   aFile := "z35219.txt"
   for {
      err := ioutil.WriteFile(aFile, []byte("abcdefghijklmnopqrstuvwxyz"), 0600)
      if err != nil { panic(err) }
      aTmr := time.NewTimer(10*time.Second)
      for aRead := true; aRead; {
         select {
         case <-aTmr.C:
            aRead = false
         default:
            _, err := ioutil.ReadFile(aFile)
            if err != nil { panic(err) }
         }
      }
   }
}

What did you expect to see?

No error; this is the case on Win7.

What did you see instead?

On Win10:

panic: open z35219.txt: The process cannot access the file because
 it is being used by another process.

goroutine 1 [running]: main.main() /.../z35219.go:17 +0x10a

Discussion

I see this intermittently in a real app.

EDIT: the following is related to the initial reproducer
I wondered whether GetFileAttributesEx() changed in Windows 10 (or 8) to disallow concurrent read access. But the above program still fails (maybe less often?) on Win10 if I patch stat() to only call CreateFile(), like so:

const READ_CONTROL, FILE_READ_ATTRIBUTES uint32 = 0x20000, 0x80 // from WinAPI docs
h, err = CreateFile(namep, READ_CONTROL|FILE_READ_ATTRIBUTES, FILE_SHARE_READ, nil,
                    OPEN_EXISTING, createFileAttrs, 0)

Those arguments are suggested here: https://stackoverflow.com/questions/52813820/createfile-fails-because-of-sharin-violation-because-some-other-os-process-is-us

Stat(): https://golang.org/src/os/stat_windows.go#L76
Open(): https://golang.org/src/syscall/syscall_windows.go#L266

@jstarks any ideas?

cc @alexbrainman @zx2c4 @bcmills
@gopherbot add release-blocker OS-Windows

@networkimprov networkimprov changed the title syscall: GetFileAttributes() blocks concurrent Open() with "sharing violation" os: Stat() blocks concurrent Open() with "sharing violation" Oct 28, 2019
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Oct 28, 2019

Is this problem new in 1.13?

@jstarks

This comment has been minimized.

Copy link

@jstarks jstarks commented Oct 28, 2019

I'm unable to reproduce this on the latest version of Windows 10 with Go 1.13.1 (I should probably update...).

@networkimprov, do you have third-party antivirus software installed?

@networkimprov

This comment has been minimized.

Copy link
Author

@networkimprov networkimprov commented Oct 28, 2019

Hi John, how long did you let the program run? It could take several minutes to show up

The machine seeing these failures is not mine (I only have Win7) but I've forwarded your Q.

Note Go 1.13.3 has a significant change to the Windows runtime, tho it's prob not a factor here.

@jstarks

This comment has been minimized.

Copy link

@jstarks jstarks commented Oct 29, 2019

I let it run overnight with no repro.

@slonopotamus

This comment has been minimized.

Copy link

@slonopotamus slonopotamus commented Oct 29, 2019

@networkimprov

You only get SHARING_VIOLATION from CreateFile if someone has the same file already open in a such access mode that is not listed in your sharing modes.

For example, I hit SHARING_VIOLATION in git-lfs/git-lfs#3880:
one process calls os.Rename while another tries to os.Open destination file.

os.Stat doesn't do CreateFile, so I think your problem is caused by something different. You could try using Process Monitor to record kernel calls related to your file to find out who is opening in a mode that interferes with your app.

@networkimprov

This comment has been minimized.

Copy link
Author

@networkimprov networkimprov commented Oct 29, 2019

Thanks for suggestion re Process Monitor.

Assuming this is due to a legitimate system config (ie not malware) any suggestions for fixes? Patching syscall.Open() to retry N times on sharing-violation in a loop with runtime.Gosched()?

one process calls os.Rename while another tries to os.Open destination file.

I tried really hard to get this fixed... #32088 :-(

@slonopotamus

This comment has been minimized.

Copy link

@slonopotamus slonopotamus commented Oct 29, 2019

From my current knowledge, retrying is the only working option. You may want to look at robustio.go file from Go, it has related stuff, but not exposed for public usage.

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Oct 29, 2019

But the above program still fails (maybe less often?) on Win10 if I patch stat() to only call CreateFile(),

Did you mean patch open? What business would stat have calling CreateFile?

@jstarks

This comment has been minimized.

Copy link

@jstarks jstarks commented Oct 29, 2019

stat calls CreateFile in a fallback path since apparently there are cases where GetFileAttributesEx fails.

@networkimprov

This comment has been minimized.

Copy link
Author

@networkimprov networkimprov commented Oct 29, 2019

@rsc WinAPI CreateFile() doesn't create files, it opens them :-)

@networkimprov

This comment has been minimized.

Copy link
Author

@networkimprov networkimprov commented Oct 29, 2019

The box showing this reportedly runs Norton Lifelock.
From systeminfo command: OS Version: 10.0.18362 N/A Build 18362

This is a better reproducer. The problem is triggered by ioutil.WriteFile().

package main

import ( "io/ioutil"; "time" )

func main() {
   aFile := "z35219.txt"
   for {
      err := ioutil.WriteFile(aFile, []byte("abcdefghijklmnopqrstuvwxyz"), 0600)
      if err != nil { panic(err) }
      aTmr := time.NewTimer(10*time.Second)
      for aRead := true; aRead; {
         select {
         case <-aTmr.C:
            aRead = false
         default:
            _, err := ioutil.ReadFile(aFile)
            if err != nil { panic(err) }
         }
      }
   }
}
@networkimprov networkimprov changed the title os: Stat() blocks concurrent Open() with "sharing violation" os: file open+write+close blocks subsequent Open() with "sharing violation" Oct 29, 2019
@iWdGo

This comment has been minimized.

Copy link
Contributor

@iWdGo iWdGo commented Oct 30, 2019

Issue doesn't reproduce on my laptop. The latest submitted code ran for a couple of hours without issue.

>ver

Microsoft Windows [Version 10.0.18362.418]
@jstarks

This comment has been minimized.

Copy link

@jstarks jstarks commented Oct 30, 2019

In the past I have seen antivirus software keep files open for analysis after the file has been closed, blocking subsequent opens. I do not know whether Norton Lifelock has this problem. It would be interesting to temporarily disable all such security software on the machine (this can be difficult!) and try the repro again. If it stops reproing, then the best course of action would be to report the issue to Symantec.

If it continues to repro, I would suggest looking for any other atypical driver software in the file system stack. Try running fltmc.exe to see if there are any interesting file system filters installed, for example--compare the result to a stock Windows installation.

In any case, I don't think there's any evidence of a Go bug here, and it seems inappropriate for Go to try to work around reliability issues with basic file system APIs. These bugs need to be fixed in the components that are at fault.

@slonopotamus

This comment has been minimized.

Copy link

@slonopotamus slonopotamus commented Oct 30, 2019

it seems inappropriate for Go to try to work around reliability issues with basic file system APIs

Go does that for its internal needs though.

@networkimprov

This comment has been minimized.

Copy link
Author

@networkimprov networkimprov commented Oct 30, 2019

Thanks to the folks trying to reproduce this. It's apparently due to OneDrive! After patching syscall.Open() to retry on sharing-violation, the reproducer eventually fails in Read() with "cloud operation was not completed before the time-out period". (Wow, crappy caching :-)

Here's my syscall.Open() patch:

diff --git a/src/syscall/syscall_windows.go b/src/syscall/syscall_windows.go
index af562a8..aa537e2 100644
--- a/src/syscall/syscall_windows.go
+++ b/src/syscall/syscall_windows.go
@@ -265,6 +265,8 @@ func makeInheritSa() *SecurityAttributes {
 }

+// enable retry on ERROR_SHARING_VIOLATION
+var Open_RetryErrorShare = false

 func Open(path string, mode int, perm uint32) (fd Handle, err error) {
        if len(path) == 0 {
@@ -311,8 +313,14 @@ func Open(path string, mode int, perm uint32) (fd Handle, err error) {
        default:
                createmode = OPEN_EXISTING
        }
-       h, e := CreateFile(pathp, access, sharemode, sa, createmode, FILE_ATTRIBUTE_NORMAL, 0)
-       return h, e
+       for i := 0; true; i++ {
+               fd, err = CreateFile(pathp, access, sharemode, sa, createmode, FILE_ATTRIBUTE_NORMAL, 0)
+               if err != Errno(32) || !Open_RetryErrorShare || i == 999 {
+                       break
+               }
+               runtime.Gosched()
+       }
+       return
 }

 func Read(fd Handle, p []byte) (n int, err error) {
@jstarks

This comment has been minimized.

Copy link

@jstarks jstarks commented Oct 30, 2019

Interesting. Are you trying this on a path that is backed up by OneDrive?

@networkimprov

This comment has been minimized.

Copy link
Author

@networkimprov networkimprov commented Oct 30, 2019

Yes, the reproducers were all running within a OneDrive tree. We're trying them again now in a local-only path...

@jstarks

This comment has been minimized.

Copy link

@jstarks jstarks commented Oct 30, 2019

I'll try to get a repro on OneDrive and report the issue internally.

@networkimprov

This comment has been minimized.

Copy link
Author

@networkimprov networkimprov commented Oct 31, 2019

We haven't reproduced either failure since moving the directory out of the OneDrive tree.

How likely this is to appear might depend on the quality of your connection to OneDrive, traffic on the LAN, etc.

@networkimprov

This comment has been minimized.

Copy link
Author

@networkimprov networkimprov commented Oct 31, 2019

This might be a more effective reproducer:

package main

import ( "io/ioutil"; "os"; "time" )

func main() {
   aFile := "z35219.txt"
   err := ioutil.WriteFile(aFile, []byte("abcdefghijklmnopqrstuvwxyz"), 0600)
   if err != nil { panic(err) }
   go func() {
      aTmr := time.NewTicker(5*time.Second)
      for {
         <-aTmr.C
         err := ioutil.WriteFile(aFile, []byte("abcdefghijklmnopqrstuvwxyz"), 0600)
         if err != nil { panic(err) }
      }
   }()
   for {
      aFd, err := os.Open(aFile)
      if err != nil { panic(err) }
      err = aFd.Close()
      if err != nil { panic(err) }
   }
}
@jstarks

This comment has been minimized.

Copy link

@jstarks jstarks commented Oct 31, 2019

I was able to repro this on OneDrive. We're looking into it internally.

@networkimprov networkimprov changed the title os: file open+write+close blocks subsequent Open() with "sharing violation" os: file write blocks subsequent Open() with "sharing violation" in OneDrive Oct 31, 2019
@bcmills bcmills added this to the Backlog milestone Nov 5, 2019
@networkimprov

This comment has been minimized.

Copy link
Author

@networkimprov networkimprov commented Nov 13, 2019

@jstarks, any developments? Is Go implicated?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.