cmd/go/internal/lockedfile: spurious EDEADLK failures on AIX and Solaris #32817

Open
bcmills opened this issue Jun 27, 2019 · 4 comments

@bcmills bcmills commented Jun 27, 2019

From https://build.golang.org/log/f268986231f8f43781dacb0f5fe56772143ca24b (on aix-ppc64):

go test proxy running at GOPROXY=http://127.0.0.1:34444/mod
--- FAIL: TestScript (0.00s)
    --- FAIL: TestScript/mod_concurrent (1.51s)
        script_test.go:191: 
            # Concurrent builds should succeed, even if they need to download modules. (0.924s)
            > go build ./x &
            > go build ./y
            [stderr]
            go: can't lock version list lockfile: Lock $WORK/gopath/pkg/mod/cache/download/golang.org/x/text/@v/list.lock: deadlock condition if locked
            [exit status 1 (core dumped)]
            FAIL: testdata/script/mod_concurrent.txt:5: unexpected command failure
            
FAIL
FAIL	cmd/go	106.612s

I suspect that this is actually an AIX kernel bug (namely, being overly pessimistic about the deadlock properties of file locks in a multi-threaded program), but perhaps we should consider some sort of workaround.

CC @Helflym @jayconrod

@Helflym Helflym commented Jun 28, 2019

Yeah, I'm already aware of this bug. I know that it can happen sometimes on AIX 7.2 TL0.
However, I think this is the first time I've seen it on AIX 7.2 TL2 (the AIX version on the builder)...

I made a workaround a few months back (cf. CL 152718). It could be updated and merged if we need a quick workaround while we try to find a true fix (if there is one).

@bradfitz bradfitz commented Nov 4, 2019

Approximately the same thing on Illumos:

https://build.golang.org/log/2b8036635d51414fe4f6f3421dee2e27bc2a6577

It returns "deadlock situation detected/avoided", which is the error EDEADLK.

/cc @jclulow

@bcmills bcmills changed the title from "cmd/go: TestScript/mod_concurrent flake on aix-ppc64 builder" to "cmd/go/internal/lockedfile: spurious EDEADLK failures on AIX and Solaris" Nov 15, 2019
@bcmills bcmills added the OS-Solaris label Nov 15, 2019
@bcmills bcmills commented Nov 15, 2019

Illumos, at least, seems to have an flock implementation now, and the other platforms that provide flock do not seem to exhibit EDEADLK flakes.

So probably the ideal solution on that platform is to implement syscall.Flock (filed as #35618) and adjust the tags for lockedfile/internal/filelock to use the flock variant on illumos.

@bcmills bcmills added the OS-illumos label Nov 15, 2019