dep init freeze for hours #947
Comments
hi! thanks for the issue. this is quite odd. did you run this on an existing project, or something new? if the former, what if any tool were you using previously? is this reproducible? no places in dep jump to mind where it could hang prior to generating any output. unless...@matjam, maybe a lockfile thing? strace output would also probably help enormously to clear this up. |
might be. maybe I should add some stderr logging if it's waiting for >60 seconds .. at least until you figure out the notification framework thing. |
I have the same problem! I use Go in a Docker container, and it is a very simple program. |
@matjam yeah, i'll relent and break my rule a bit more - we can add some print statements around the lock. probably fire one right away if it decides we have to wait, then...maybe one every 10s or so thereafter? |
@sdboyer okay, I'll give you a PR with something that does that. I'll use 15s if only to be contrary. We are just assuming that it's the locking thing, but it's the most likely culprit. I did do some research; the locking library is mostly correct in its behaviour; there is a potential race there, but I've not been able to quantify what the real risk is of hitting it. What is the adage? Race conditions are so rare that they always happen? Something like that. |
waiting for something, we at least print a message to stderr about it. Hopefully will make situations like what is described in golang#947 obvious.
ok, we have the warning message printing now when the lockfile is busy. @jonahfang, @cpapidas - could you update and see if you're still experiencing the indefinite hang? |
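A minimal sketch of the behaviour discussed above: report right away when the lock is busy, then repeat the notice at a fixed interval while polling. The names `waitForLock`, `try` and `logf` are hypothetical and are not taken from dep's source; the actual change may be structured differently.

```go
package main

import (
	"fmt"
	"time"
)

// waitForLock polls try until it reports success. It calls logf once as soon
// as the first attempt fails, then again each time the notice interval has
// elapsed since the previous message.
func waitForLock(try func() bool, poll, notice time.Duration, logf func(string)) {
	if try() {
		return // lock acquired immediately, nothing to report
	}
	start := time.Now()
	logf("lock is busy, waiting for it to be released")
	lastNotice := start
	for {
		time.Sleep(poll)
		if try() {
			return
		}
		if time.Since(lastNotice) >= notice {
			waited := time.Since(start).Round(time.Second)
			logf(fmt.Sprintf("still waiting for lock after %s", waited))
			lastNotice = time.Now()
		}
	}
}
```

In a caller, `logf` could write to stderr, for example `func(m string) { fmt.Fprintln(os.Stderr, m) }`, with a `notice` interval of 15 seconds as suggested in the thread.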
@sdboyer It keeps saying the following and still hangs:
|
@jonahfang OK, can we get some more details about the system?
@sdboyer It's this code in the lockfile package:
    // return value intentionally ignored, as ignoring it is part of the algorithm
    _ = os.Link(tmplock.Name(), name)
    fiTmp, err := os.Lstat(tmplock.Name())
    if err != nil {
        return err
    }
    fiLock, err := os.Lstat(name)
    if err != nil {
        // tell user that a retry would be a good idea
        if os.IsNotExist(err) {
            return ErrNotExist
        }
        return err
    }
So it's failing to link. @nightlyone any ideas? |
@jonahfang also, can you try the following:
$ go get github.com/nightlyone/lockfile
$ cd $GOPATH/src/github.com/nightlyone/lockfile
$ go test
You should see output like this:
$ go test
PASS
ok github.com/nightlyone/lockfile 0.011s |
@matjam , the test result of nightlyone/lockfile as follows:
|
@jonahfang yeah something is wrong with that filesystem. still need to see output of "mount" |
I am running it inside a container; there is no existing sm.lock file. I use image |
I run the Docker image on a MacBook Pro using Docker Toolbox:
|
My guess is your container doesn't allow writes to the filesystem for whatever reason? |
But glide and godep work fine for me. |
But when it's not in a container, it works fine. So... |
glide and godep work fine in the same Docker container. |
That really means nothing. Just because other things work doesn't mean the container isn't broken, or the way you're using it isn't broken, or that there isn't something wrong with your kernel, or ... That "operation not permitted" message is coming from the kernel. It's the response to a system call doing file operations on /go/src/github.com/nightlyone/lockfile/test_lockfile.pid.397504553. If the kernel is saying "no", that's a problem with either the container you're using, or Docker, or something else on your system. |
My Docker container uses ubuntu:14.04, and the filesystem is mounted through a VirtualBox volume. |
It's because it's trying to do a hard link inside the container. That container, or your system, doesn't allow that, so it's failing. |
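One way to confirm this on a given mount is a small probe that creates a temporary file and tries to hard-link it. This is an illustrative sketch, not part of dep or the lockfile package; `checkHardLinks` is a hypothetical helper, and `os.CreateTemp` assumes Go 1.16 or newer.

```go
package main

import (
	"fmt"
	"os"
)

// checkHardLinks reports whether dir accepts hard links by creating a
// temporary file there and linking a second name to it. An empty dir means
// the default temporary directory.
func checkHardLinks(dir string) error {
	tmp, err := os.CreateTemp(dir, "linkprobe-*")
	if err != nil {
		return fmt.Errorf("create probe file: %w", err)
	}
	tmp.Close()
	defer os.Remove(tmp.Name())

	// This is the same kind of call the lockfile package relies on; on a
	// filesystem without hard-link support it fails, typically with
	// "operation not permitted".
	target := tmp.Name() + ".link"
	if err := os.Link(tmp.Name(), target); err != nil {
		return fmt.Errorf("hard links unavailable: %w", err)
	}
	return os.Remove(target)
}
```

Calling `checkHardLinks` on the directory that holds the lock file (for example the `pkg/dep` directory under `$GOPATH`) should return a non-nil error on the affected setups described in this thread.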
@sdboyer I think we need to change the way we do locking, this is another issue that's being caused by the hard link in the lockfile package; there was the ReFS issue #913 and now this. @jonahfang are you running Windows 10 with ReFS for the underlying filesystem by any chance? |
No, I only use a Mac. |
Well, the container, or something, is not allowing hard links in the filesystem, so that's why it's failing. |
@jonahfang Can you please grab this library and test it in your container. You should get the following output:
$ go get gopkg.in/check.v1
$ go get github.com/theckman/go-flock
$ cd $GOPATH/src/github.com/theckman/go-flock
$ go test
OK: 7 passed
PASS
ok github.com/theckman/go-flock 0.012s |
@matjam go test passed! |
@matjam Any progress on this issue? |
I need someone with ReFS on Windows to test the same module before I can recommend to @sdboyer that we switch to go-flock. You'll have to find a workaround in the meantime. Not running inside the container that you're using, for example. Other containers seem to handle links just fine. |
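For comparison, an advisory lock based on flock(2) needs no hard link at all. The sketch below uses only the standard library's `syscall.Flock` and is Unix-only; it is not go-flock's API, and `tryFlock` and `unlock` are hypothetical names. Whether flock behaves correctly on a particular shared or virtual filesystem still has to be tested on that filesystem.

```go
package main

import (
	"os"
	"syscall"
)

// tryFlock opens the file at path (creating it if needed) and attempts a
// non-blocking exclusive flock(2) on it. On success it returns the open file
// that holds the lock. ok is false if another holder already has the lock.
func tryFlock(path string) (f *os.File, ok bool, err error) {
	f, err = os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o644)
	if err != nil {
		return nil, false, err
	}
	err = syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB)
	if err == syscall.EWOULDBLOCK {
		f.Close()
		return nil, false, nil // lock is held elsewhere
	}
	if err != nil {
		f.Close()
		return nil, false, err
	}
	return f, true, nil
}

// unlock releases the lock and closes the file.
func unlock(f *os.File) error {
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_UN); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}
```

The lock belongs to the open file, so the caller has to keep the returned `*os.File` open for as long as it wants to hold the lock.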
Was there a patch for this? I have this exact problem, running a docker container inside an ubuntu trusty vagrant box on a mac. |
Update :
And nothing in the vendor folder... So, better but not a workaround yet. |
Well, it looks like the failing call is linkat(AT_FDCWD, "/shared/gopath/pkg/dep/sm.lock.735551337", AT_FDCWD, "/shared/gopath/pkg/dep/sm.lock", 0) = -1 EPERM (Operation not permitted). /shared (vboxsf) is mounted by VirtualBox and my host machine is Mac OS X. |
@cpapidas Would appreciate it if you could test this PR and confirm that it works for you. |
@matjam I tried your PR and I still get the same error.
Also, if I run dep from the "official" master branch, I get the following error:
|
@cpapidas can you give us information about what system you're running dep on? Is this being run inside a container? |
Yes, I run this inside a Docker container. All the Go configuration (GOPATH, etc.) exists only in the container. docker version Also, keep in mind that I run Docker from macOS 10.12 or Vagrant 1.9.3. dockerfile
The run file just builds and runs the Go program. If you need anything else, please let me know. |
@cpapidas Can you give me the output of "docker info"? |
|
Okay. So, apparently aufs doesn't do either fcntl-style flocking or hard linking in a standard way. The notes in the "incompatible" section are relevant, if not particularly clear: http://aufs.sourceforge.net/aufs.html @sdboyer we will need to do something else. We can't use a global lock file like this as a way to arbitrate who gets to write changes. |
Please note that the bug seems to only reproduce on non-Linux Docker hosts. So I believe it is specific to vboxsf and not to aufs. |
@nightlyone actually I can reproduce in Termux, which is not running Docker. |
I'm facing the same problem with Mac OS (host) -> Vagrant (virtualbox) -> Ubuntu 16 (guest). No docker. The $GOPATH folder inside ubuntu is a virtual synced folder from the host OS. Running dep on the same folder directly on the host OS works. |
if you're running into this problem, a workaround should be to use the latest tip and set just, make very sure that you don't run two dep processes at once. it's reasonably likely to corrupt your |
FWIW ran into the same issue reported here while building I added my notes in case this helps future people landing on this error, work-around works.
> uname -a
FreeBSD wintermute.skunkwerks.at 12.0-CURRENT FreeBSD 12.0-CURRENT #17 r325010+e50db1d5c304(master): Thu Oct 26 13:59:17 UTC 2017 root@wintermute:/usr/obj/usr/src/sys/GENERIC amd64
> go version
go version go1.9.1 freebsd/amd64
> dep version
dep:
version : devel
build date :
git hash :
go version : go1.9.1
go compiler : gc
platform : freebsd/amd64
requested test repos
> go get github.com/nightlyone/lockfile
> cd $GOPATH/src/github.com/nightlyone/lockfile
> go test
Error locking lockfile: link /repos/go/src/github.com/nightlyone/lockfile/test_lockfile.pid.431102591 /repos/go/src/github.com/nightlyone/lockfile/test_lockfile.pid: operation not permitted
--- FAIL: TestBasicLockUnlock (0.00s)
--- FAIL: TestRogueDeletion (0.00s)
lockfile_test.go:129: link /repos/go/src/github.com/nightlyone/lockfile/test_lockfile.pid.949387529 /repos/go/src/github.com/nightlyone/lockfile/test_lockfile.pid: operation not permitted
--- FAIL: TestRogueDeletionDeadPid (0.00s)
lockfile_test.go:158: link /repos/go/src/github.com/nightlyone/lockfile/test_lockfile.pid.448157908 /repos/go/src/github.com/nightlyone/lockfile/test_lockfile.pid: operation not permitted
--- FAIL: TestRemovesStaleLockOnDeadOwner (0.00s)
lockfile_test.go:202: link /repos/go/src/github.com/nightlyone/lockfile/test_lockfile.pid.959225638 /repos/go/src/github.com/nightlyone/lockfile/test_lockfile.pid: operation not permitted
--- FAIL: TestInvalidPidLeadToReplacedLockfileAndSuccess (0.00s)
lockfile_test.go:231: unexpected error: link /repos/go/src/github.com/nightlyone/lockfile/test_lockfile.pid.332173640 /repos/go/src/github.com/nightlyone/lockfile/test_lockfile.pid: operation not permitted
FAIL
exit status 1
FAIL github.com/nightlyone/lockfile 0.004s
> go get gopkg.in/check.v1
> go get github.com/theckman/go-flock
> cd $GOPATH/src/github.com/theckman/go-flock
> go test
OK: 7 passed
PASS
ok github.com/theckman/go-flock 0.003s
results when using workaround:
\o/ thanks! |
oi. yeah, another reason to swap out the lib we use to do this 😢 |
This issue is still occurring. Has there been any progress on changing the locking mechanism? This is happening on Linux, no container. |
@RichyHBM Did you try |
Yes, that works as expected, but it shouldn't be the fix. |
The lock mechanism is still a problem with Docker (golang:1.10-alpine) on Windows 10 Pro (Hyper-V enabled). |
I get this on Windows on an exFAT filesystem as well. |
What version of Go (go version) and dep (git describe --tags) are you using?
go version go1.8 linux/amd64
dep latest version
What dep command did you run?
$ dep init -v
No output
What did you expect to see?
The new files:
* Gopkg.toml
* Gopkg.lock
* vendor (folder)
What did you see instead?
Nothing. The command was stuck there for hours, and I had to press Ctrl+C to exit and get the terminal back.