Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: get hangs without error output from unknown revisions #53905

Open
Justin-W opened this issue Jul 15, 2022 · 6 comments
Open

cmd/go: get hangs without error output from unknown revisions #53905

Justin-W opened this issue Jul 15, 2022 · 6 comments
Labels
GoCommand modules NeedsInvestigation
Milestone

Comments

@Justin-W
Copy link

@Justin-W Justin-W commented Jul 15, 2022

What version of Go are you using (go version)?

$ go version
go version go1.18.4 darwin/arm64

$ git version
git version 2.37.1

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

% go env
GO111MODULE="auto"
GOARCH="arm64"
GOBIN=""
GOCACHE="REDACTED"
GOENV="REDACTED"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="REDACTED"
GONOPROXY="REDACTED"
GONOSUMDB="REDACTED"
GOOS="darwin"
GOPATH="REDACTED"
GOPRIVATE="REDACTED"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.18.4"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/vy/ryp4d95n75q5j6kf58hc551m0000gn/T/go-build559884269=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

  1. Create/identify 3 go modules/repos: A depended on by B depended on by C (so C indirectly depends on A)
  2. make B's go.mod reference to A invalid
    (for us, A's git history was rewritten; but you could probably manually edit B's go.mod with a invalid pseudo-version to get the same effect)
  3. run go get -u ./... in either B or C (or any module that depends on either of them)
  4. wait for the go process to hang with high CPU, then terminate it
  5. run go list -m -u -json all | jq --raw-output 'select((.Indirect or .Main)|not) | .Path' | xargs -t -I % go get -u -t %@latest, to find any "error:" or "panic:" output
  6. Revert any changes to go.* files resulting from step 5.

What did you expect to see?

Either some output indicating the dependency reference problem(s), or else normal operation of the go get ... command in step 3.

What did you see instead?

Step 3 starts running normally (there may or may not be some output, downloads, etc., with typical levels of CPU by the process), but at some point all output ceases and CPU spikes to 350%-450% and stays basically flat at that level until the go process is terminated.

Step 5 did not hang, and output contained several sections with errors like these:

go: downloading github.com/private-org/module-two v0.0.0-20220713230938-ae346380ff64
go: github.com/private-org/module-one@v0.0.0-20220203083605-df6cd891c11a: invalid version: unknown revision df6cd891c11a
go get -u -t github.com/private-org/module-three@latest
go: github.com/private-org/module-one@v0.0.0-20220203083605-df6cd891c11a: invalid version: unknown revision df6cd891c11a

Workaround

  1. Add a replace directive for each of the "invalid version"s identified in step 5 to the go.mod files of each dependent repo (including indirectly-dependent ones).
  2. Repeat steps 3 through 7 until step 3 finishes normally (i.e. without hanging)

Notes:

  • The hanging behavior happened consistently (100% repeatable; always hanged) on only some of the M1 macs tested. Others with identical hardware and similar go, git, etc. versions, even when running exactly the same commands on the same repos. Not sure why the issue only affected some envs and not others.
  • The root cause for us was that 3 repos had at least 1 pseudo-version reference (by other repos) to a commit which no longer existed in the referenced repo. Once the workaround above was completed for each missing revision, the unexpected go get behavior ceased.
    • I only identified a current go.mod reference to 1 of the 4 "missing revision"s which were in the error output. The other 3 seemed to be referenced exclusively from historical versions of go.mod and go.sum files (e.g. the git history of no-longer-referenced versions of certain modules), but not by any versions which were still directly referenced. IDK how un-referenced versions' go.* contents could still be affecting go get behavior, but it seemed to be the case. Very weird. If true, more details would be appreciated, since that might indicate that the replace directives need to remain in place forever (which would be unfortunate).
  • All repos involved are private repos, but IDK whether that matters or not. I suspect not. (All of the usual "private repo"-related config issues are not an issue for us, since all commands work fine on some machines, and work fine for repos with un-corrupted go.* files even on affected envs.)
  • The hanging happened with latest git and go versions (installed yesterday morning), and also with older versions of go (v1.18) and git. (IDK whether the git version matters or not, since upgrading seemed to have no effect.)
  • All of the repos involved had go 1.17 or go 1.18 directives in their go.mod files. We weren't attempting to preserve pre-v1.17 compatibility.
  • Adding git tags with names equal to the missing pseudoversions (e.g. v0.0.0-20220203083605-df6cd891c11a) did not work as a workaround.
    • It would be nice to have a tag-based way to retroactively "fix" go.mod refs which get broken as a result of commits which get irrevocably deleted. That would be much preferred vs the replace directives workaround.
  • Adding retract directives to the repos which were the source of the "missing" versions did not resolve the problem. (Would've been much nicer if it had, so that the same replace directives didn't need to be replicated in dozens of dependent repos.)
    • I know that retract isn't intended to prevent a dependent repo from referencing a retracted version once it already is, but since I was actually trying to upgrade all the deps, it would've been nice if the go get -u ./... command simply updated the "bad" refs with good ones (and didn't hang), which would have allowed me to iteratively upgrade all dependencies to stop using the bad revisions (without the need to manually identify and replace the "missing" revisions).
@Justin-W
Copy link
Author

@Justin-W Justin-W commented Jul 15, 2022

FYI: Step 3 of the steps to reproduce produced no output related to the hanging behavior or to the underlying invalid versions being referenced, even with the -v flag and/or the -t flag specified. Having the command output some kind of useful error message would have significantly lessened the effort involved in troubleshooting this issue.

@mknyszek mknyszek added NeedsInvestigation GoCommand labels Jul 15, 2022
@mknyszek mknyszek added this to the Backlog milestone Jul 15, 2022
@mknyszek
Copy link
Contributor

@mknyszek mknyszek commented Jul 15, 2022

CC @bcmills @matloob

@bcmills
Copy link
Member

@bcmills bcmills commented Jul 18, 2022

Without a self-contained example that (at least sometimes) reproduces the issue it's hard to know where to start.

@Justin-W, is it possible that one or more of the attempts had the modules within a workspace (defined by a go.work file)?
If so, this would match #53874 (fixed at HEAD).

@bcmills bcmills added the WaitingForInfo label Jul 18, 2022
@Justin-W
Copy link
Author

@Justin-W Justin-W commented Jul 19, 2022

@Justin-W, is it possible that one or more of the attempts had the modules within a workspace (defined by a go.work file)? If so, this would match #53874 (fixed at HEAD).

Nope. No use of workspaces and no go.work files. Also, the modules did not contain vendored deps.

@bcmills bcmills removed the WaitingForInfo label Jul 19, 2022
@bcmills
Copy link
Member

@bcmills bcmills commented Jul 19, 2022

@Justin-W, what happens if you send SIGQUIT or SIGSEGV to the go command once it hangs? (If you get a goroutine dump, please attach it here.)

@bcmills bcmills added the WaitingForInfo label Jul 19, 2022
@Justin-W
Copy link
Author

@Justin-W Justin-W commented Jul 19, 2022

@Justin-W, what happens if you send SIGQUIT or SIGSEGV to the go command once it hangs? (If you get a goroutine dump, please attach it here.)

@bcmills This is the SIGQUIT dump when the process for go get -u ./... was "stable" at ~400% CPU:

% go get -u ./...
^\SIGQUIT: quit
PC=0x1aca41628 m=0 sigcode=0

goroutine 0 [idle]:
syscall.syscall(0x140000c4540?, 0x14000000000?, 0x57?, 0x57?)
        /usr/local/go/src/runtime/sys_darwin.go:22 +0x38 fp=0x14001136ba0 sp=0x14001136b40 pc=0x100dcca88
syscall.Close(0x400112d620?)
        /usr/local/go/src/syscall/zsyscall_darwin_arm64.go:509 +0x38 fp=0x14001136bd0 sp=0x14001136ba0 pc=0x100de2038
internal/poll.(*FD).destroy(0x14001419f20)
        /usr/local/go/src/internal/poll/fd_unix.go:84 +0x58 fp=0x14001136c00 sp=0x14001136bd0 pc=0x100e2f848
internal/poll.(*FD).decref(0x14001419f20?)
        /usr/local/go/src/internal/poll/fd_mutex.go:213 +0x8c fp=0x14001136c20 sp=0x14001136c00 pc=0x100e2e55c
internal/poll.(*FD).Close(0x14001419f20)
        /usr/local/go/src/internal/poll/fd_unix.go:107 +0x4c fp=0x14001136c50 sp=0x14001136c20 pc=0x100e2f8dc
os.(*file).close(0x14001419f20)
        /usr/local/go/src/os/file_unix.go:252 +0x78 fp=0x14001136cb0 sp=0x14001136c50 pc=0x100e3c008
os.(*File).Close(0x1013cf648?)
        /usr/local/go/src/os/file_posix.go:25 +0x28 fp=0x14001136cd0 sp=0x14001136cb0 pc=0x100e3a608
go/build.(*Context).matchFile(0x1016ada20, {0x14000ca2640, 0x44}, {0x1400112d365, 0x12}, 0x14000ca2656?, 0x140019804e0, 0x14000020dc0)
        /usr/local/go/src/go/build/build.go:1437 +0x7c0 fp=0x14001136d90 sp=0x14001136cd0 pc=0x100ee5390
go/build.(*Context).Import(0x1016ada20, {0x1011f1ce6, 0x1}, {0x14000ca25f0, 0x44}, 0x0)
        /usr/local/go/src/go/build/build.go:848 +0x1858 fp=0x14001137480 sp=0x14001136d90 pc=0x100edf688
go/build.(*Context).ImportDir(...)
        /usr/local/go/src/go/build/build.go:485
cmd/go/internal/modload.resolveLocalPackage({0x1013d3860, 0x14000122000}, {0x14001bada50?, 0xd?}, 0x14000e61738?)
        /usr/local/go/src/cmd/go/internal/modload/load.go:503 +0xdc fp=0x14001137650 sp=0x14001137480 pc=0x101104a6c
cmd/go/internal/modload.LoadPackages.func1(0x0?, 0x14001a028f0)
        /usr/local/go/src/cmd/go/internal/modload/load.go:277 +0x7e8 fp=0x140011377e0 sp=0x14001137650 pc=0x101104098
cmd/go/internal/modload.LoadPackages({0x1013d3860?, 0x14000122000}, {{0x0, 0x0}, 0x140002da0f0, 0x0, {0x0, 0x0}, 0x1, 0x0, ...}, ...)
        /usr/local/go/src/cmd/go/internal/modload/load.go:359 +0x27c fp=0x14001137aa0 sp=0x140011377e0 pc=0x101102cac
cmd/go/internal/modget.(*resolver).loadPackages(0x140002c0000, {0x1013d3860?, 0x14000122000}, {0x14000fa4cf0, 0x1, 0x1}, 0x140013cf480)
        /usr/local/go/src/cmd/go/internal/modget/get.go:1138 +0xfc fp=0x14001137c00 sp=0x14001137aa0 pc=0x1011cedbc
cmd/go/internal/modget.(*resolver).findAndUpgradeImports(0x140002c0000, {0x1013d3860, 0x14000122000}, {0x14000138048, 0x1, 0x1013b9440?})
        /usr/local/go/src/cmd/go/internal/modget/get.go:1093 +0x228 fp=0x14001137cb0 sp=0x14001137c00 pc=0x1011ce688
cmd/go/internal/modget.runGet({0x1013d3860, 0x14000122000}, 0x140001504b0?, {0x1400013e030, 0x1, 0x1})
        /usr/local/go/src/cmd/go/internal/modget/get.go:350 +0x3ac fp=0x14001137d80 sp=0x14001137cb0 pc=0x1011c902c
main.invoke(0x1016a1b60, {0x1400013e010, 0x3, 0x3})
        /usr/local/go/src/cmd/go/main.go:218 +0x2b0 fp=0x14001137e50 sp=0x14001137d80 pc=0x1011f11f0
main.main()
        /usr/local/go/src/cmd/go/main.go:175 +0x770 fp=0x14001137f70 sp=0x14001137e50 pc=0x1011f0c60
runtime.main()
        /usr/local/go/src/runtime/proc.go:250 +0x250 fp=0x14001137fd0 sp=0x14001137f70 pc=0x100da2490
runtime.goexit()
        /usr/local/go/src/runtime/asm_arm64.s:1263 +0x4 fp=0x14001137fd0 sp=0x14001137fd0 pc=0x100dd0404

goroutine 1 [syscall]:
syscall.syscall(0x140000c4540?, 0x14000000000?, 0x57?, 0x57?)
        /usr/local/go/src/runtime/sys_darwin.go:22 +0x38 fp=0x14001136ba0 sp=0x14001136b40 pc=0x100dcca88
syscall.Close(0x400112d620?)
        /usr/local/go/src/syscall/zsyscall_darwin_arm64.go:509 +0x38 fp=0x14001136bd0 sp=0x14001136ba0 pc=0x100de2038
internal/poll.(*FD).destroy(0x14001419f20)
        /usr/local/go/src/internal/poll/fd_unix.go:84 +0x58 fp=0x14001136c00 sp=0x14001136bd0 pc=0x100e2f848
internal/poll.(*FD).decref(0x14001419f20?)
        /usr/local/go/src/internal/poll/fd_mutex.go:213 +0x8c fp=0x14001136c20 sp=0x14001136c00 pc=0x100e2e55c
internal/poll.(*FD).Close(0x14001419f20)
        /usr/local/go/src/internal/poll/fd_unix.go:107 +0x4c fp=0x14001136c50 sp=0x14001136c20 pc=0x100e2f8dc
os.(*file).close(0x14001419f20)
        /usr/local/go/src/os/file_unix.go:252 +0x78 fp=0x14001136cb0 sp=0x14001136c50 pc=0x100e3c008
os.(*File).Close(0x1013cf648?)
        /usr/local/go/src/os/file_posix.go:25 +0x28 fp=0x14001136cd0 sp=0x14001136cb0 pc=0x100e3a608
go/build.(*Context).matchFile(0x1016ada20, {0x14000ca2640, 0x44}, {0x1400112d365, 0x12}, 0x14000ca2656?, 0x140019804e0, 0x14000020dc0)
        /usr/local/go/src/go/build/build.go:1437 +0x7c0 fp=0x14001136d90 sp=0x14001136cd0 pc=0x100ee5390
go/build.(*Context).Import(0x1016ada20, {0x1011f1ce6, 0x1}, {0x14000ca25f0, 0x44}, 0x0)
        /usr/local/go/src/go/build/build.go:848 +0x1858 fp=0x14001137480 sp=0x14001136d90 pc=0x100edf688
go/build.(*Context).ImportDir(...)
        /usr/local/go/src/go/build/build.go:485
cmd/go/internal/modload.resolveLocalPackage({0x1013d3860, 0x14000122000}, {0x14001bada50?, 0xd?}, 0x14000e61738?)
        /usr/local/go/src/cmd/go/internal/modload/load.go:503 +0xdc fp=0x14001137650 sp=0x14001137480 pc=0x101104a6c
cmd/go/internal/modload.LoadPackages.func1(0x0?, 0x14001a028f0)
        /usr/local/go/src/cmd/go/internal/modload/load.go:277 +0x7e8 fp=0x140011377e0 sp=0x14001137650 pc=0x101104098
cmd/go/internal/modload.LoadPackages({0x1013d3860?, 0x14000122000}, {{0x0, 0x0}, 0x140002da0f0, 0x0, {0x0, 0x0}, 0x1, 0x0, ...}, ...)
        /usr/local/go/src/cmd/go/internal/modload/load.go:359 +0x27c fp=0x14001137aa0 sp=0x140011377e0 pc=0x101102cac
cmd/go/internal/modget.(*resolver).loadPackages(0x140002c0000, {0x1013d3860?, 0x14000122000}, {0x14000fa4cf0, 0x1, 0x1}, 0x140013cf480)
        /usr/local/go/src/cmd/go/internal/modget/get.go:1138 +0xfc fp=0x14001137c00 sp=0x14001137aa0 pc=0x1011cedbc
cmd/go/internal/modget.(*resolver).findAndUpgradeImports(0x140002c0000, {0x1013d3860, 0x14000122000}, {0x14000138048, 0x1, 0x1013b9440?})
        /usr/local/go/src/cmd/go/internal/modget/get.go:1093 +0x228 fp=0x14001137cb0 sp=0x14001137c00 pc=0x1011ce688
cmd/go/internal/modget.runGet({0x1013d3860, 0x14000122000}, 0x140001504b0?, {0x1400013e030, 0x1, 0x1})
        /usr/local/go/src/cmd/go/internal/modget/get.go:350 +0x3ac fp=0x14001137d80 sp=0x14001137cb0 pc=0x1011c902c
main.invoke(0x1016a1b60, {0x1400013e010, 0x3, 0x3})
        /usr/local/go/src/cmd/go/main.go:218 +0x2b0 fp=0x14001137e50 sp=0x14001137d80 pc=0x1011f11f0
main.main()
        /usr/local/go/src/cmd/go/main.go:175 +0x770 fp=0x14001137f70 sp=0x14001137e50 pc=0x1011f0c60
runtime.main()
        /usr/local/go/src/runtime/proc.go:250 +0x250 fp=0x14001137fd0 sp=0x14001137f70 pc=0x100da2490
runtime.goexit()
        /usr/local/go/src/runtime/asm_arm64.s:1263 +0x4 fp=0x14001137fd0 sp=0x14001137fd0 pc=0x100dd0404

r0      0x0
r1      0x0
r2      0x0
r3      0x1016adda0
r4      0x1500
r5      0x14000002230
r6      0x1016de37e
r7      0x1
r8      0x1016aeba0
r9      0x14001136b60
r10     0x3
r11     0x0
r12     0x100de5000
r13     0x16f093120
r14     0x0
r15     0x140001a607f
r16     0x6
r17     0x1aca41620
r18     0x0
r19     0x14
r20     0x20
r21     0x4
r22     0x14000120930
r23     0x14000146ce0
r24     0x22
r25     0x14000120930
r26     0x1013cc6c0
r27     0x840
r28     0x1016adda0
r29     0x16f093108
lr      0x100dd186c
sp      0x16f093100
pc      0x1aca41628
fault   0x1aca41628

@bcmills bcmills removed the WaitingForInfo label Jul 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GoCommand modules NeedsInvestigation
Projects
None yet
Development

No branches or pull requests

3 participants