Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build: time out on LUCI Linux/PPC64x builder #65171

Open
cherrymui opened this issue Jan 19, 2024 · 20 comments
Open

x/build: time out on LUCI Linux/PPC64x builder #65171

cherrymui opened this issue Jan 19, 2024 · 20 comments
Assignees
Labels
arch-ppc64x Builders x/build issues (builders, bots, dashboards) FixPending Issues that have a fix which has not yet been reviewed or submitted. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@cherrymui
Copy link
Member

cherrymui commented Jan 19, 2024

#!watchflakes
post <- builder ~  `(gotip|go1\.\d\d)-linux-ppc64` && (`test timed out` || `SIGQUIT` || `context deadline exceeded` || `running too slowly` || status == "ABORT")
@cherrymui cherrymui changed the title cmd/cgo/internal/test*: time out on LUCI gotip-linux-ppc64le builder all: time out on LUCI gotip-linux-ppc64le builder Jan 19, 2024
@cherrymui cherrymui changed the title all: time out on LUCI gotip-linux-ppc64le builder x/build: time out on LUCI gotip-linux-ppc64le builder Jan 19, 2024
@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label Jan 19, 2024
@gopherbot gopherbot added this to the Unreleased milestone Jan 19, 2024
@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- builder ==  "gotip-linux-ppc64le" && (`test timed out` || `SIGQUIT` || `context deadline exceeded`)
2023-12-06 21:50 gotip-linux-ppc64le go@e914671f cmd/cgo/internal/testshared.TestIssue62277 (log)
=== RUN   TestIssue62277
panic: test timed out after 9m0s
running tests:
	TestIssue62277 (1s)

goroutine 142 gp=0xc0001d61c0 m=2 mp=0xc000044808 [running]:
panic({0x1c37c0?, 0xc0000b20c0?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:779 +0x174 fp=0xc000182ee0 sp=0xc000182e20 pc=0x4d1e4
testing.(*M).startAlarm.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/testing/testing.go:2366 +0x338 fp=0xc000182fc0 sp=0xc000182ee0 pc=0x104058
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc000185f20 sp=0xc000185eb0 pc=0x13390c
os/exec.(*Cmd).Start.func2(0x0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc000185f98 sp=0xc000185f20 pc=0x1344e8
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc000185fc0 sp=0xc000185f98 pc=0x134488
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc000185fc0 sp=0xc000185fc0 pc=0x8bb54
created by os/exec.(*Cmd).Start in goroutine 139
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2023-12-12 20:05 gotip-linux-ppc64le go@962dade4 cmd/cgo/internal/testshared.TestIssue44031 (log)
=== RUN   TestIssue44031
SIGQUIT: quit
PC=0x8d22c m=0 sigcode=0

goroutine 0 gp=0x360f40 m=0 mp=0x361bc0 [idle]:
runtime.futex(0x361d00, 0x80, 0x0, 0x0, 0x0, 0x0)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/sys_linux_ppc64x.s:644 +0x1c fp=0x7fffe1a6a2c8 sp=0x7fffe1a6a2c8 pc=0x8d22c
runtime.futexsleep(0x1?, 0x55d30?, 0x361bc0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/os_linux.go:69 +0x30 fp=0x7fffe1a6a328 sp=0x7fffe1a6a2c8 pc=0x4a5d0
runtime.notesleep(0x361d00)
...
r20  0x4	r21  0x2
r22  0x7fffe1a5aa00	r23  0x1
r24  0x0	r25  0x0
r26  0x0	r27  0xe09d261a
r28  0xd617e65a	r29  0xc00002f7a8
r30  0x360f40	r31  0x1cda0
pc   0x8d22c	ctr  0x0
link 0x4a5d0	xer  0x0
ccr  0x34420884	trap 0xc00
*** Test killed with quit: ran too long (10m0s).
2023-12-15 18:30 gotip-linux-ppc64le go@5e939b3a cmd/cgo/internal/testcarchive.TestPIE (log)
=== RUN   TestPIE
    carchive_test.go:956: [go tool cgo -objdir /home/swarming/.swarming/w/ir/x/t/carchive_test2687818152/_obj664230977 -exportheader p.h p/p.go]
panic: test timed out after 9m0s
running tests:
	TestPIE (25s)

goroutine 94 gp=0xc0002fe1c0 m=3 mp=0xc000045008 [running]:
panic({0x1803a0?, 0xc000182730?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:779 +0x174 fp=0xc000089ee0 sp=0xc000089e20 pc=0x4ceb4
testing.(*M).startAlarm.func1()
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc00028ef20 sp=0xc00028eeb0 pc=0x10814c
os/exec.(*Cmd).Start.func2(0x0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc00028ef98 sp=0xc00028ef20 pc=0x108d28
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc00028efc0 sp=0xc00028ef98 pc=0x108cc8
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc00028efc0 sp=0xc00028efc0 pc=0x8b874
created by os/exec.(*Cmd).Start in goroutine 91
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2023-12-15 18:30 gotip-linux-ppc64le go@5e939b3a cmd/cgo/internal/testshared.TestGlobal (log)
=== RUN   TestGlobal
SIGQUIT: quit
PC=0x8d22c m=0 sigcode=0

goroutine 0 gp=0x360f40 m=0 mp=0x361bc0 [idle]:
runtime.futex(0x361d00, 0x80, 0x0, 0x0, 0x0, 0x0)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/sys_linux_ppc64x.s:644 +0x1c fp=0x7fffe0703828 sp=0x7fffe0703828 pc=0x8d22c
runtime.futexsleep(0x7d870?, 0x55d30?, 0x361bc0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/os_linux.go:69 +0x30 fp=0x7fffe0703888 sp=0x7fffe0703828 pc=0x4a5d0
runtime.notesleep(0x361d00)
...
r20  0x4	r21  0x1
r22  0x7fffe06f3f60	r23  0x1
r24  0x0	r25  0x0
r26  0x0	r27  0xdf0a0c9
r28  0x9e3a2147	r29  0xdca58061
r30  0x360f40	r31  0x1cda0
pc   0x8d22c	ctr  0x0
link 0x4a5d0	xer  0x0
ccr  0x34420888	trap 0xc00
*** Test killed with quit: ran too long (10m0s).
2023-12-15 20:51 gotip-linux-ppc64le go@f8170cc0 cmd/go.TestScript (log)
=== RUN   TestScript/cgo_path
=== PAUSE TestScript/cgo_path
=== CONT  TestScript/cgo_path
    script_test.go:132: 2023-12-15T22:29:34Z
    script_test.go:134: $WORK=/home/swarming/.swarming/w/ir/x/t/cmd-go-test-1289633616/tmpdir1500409099/cgo_path280284189
    script_test.go:156: 
        PATH=/home/swarming/.swarming/w/ir/x/t/cmd-go-test-1289633616/tmpdir1500409099/testbin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/cache/tools/bin:/home/swarming/.swarming/w/ir/bbagent_utility_packages:/home/swarming/.swarming/w/ir/bbagent_utility_packages/bin:/home/swarming/.swarming/w/ir/cipd_bin_packages:/home/swarming/.swarming/w/ir/cipd_bin_packages/bin:/home/swarming/.swarming/w/ir/cache/cipd_client:/home/swarming/.swarming/w/ir/cache/cipd_client/bin:/home/swarming/.swarming/cipd_cache/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
        HOME=/no-home
        CCACHE_DISABLE=1
        GOARCH=ppc64le
...
        r24  0x0	r25  0x0
        r26  0x0	r27  0x110
        r28  0x81e9bc6b	r29  0xc0000581e0
        r30  0x10cf6040	r31  0x100418f4
        pc   0x100082f4	ctr  0x0
        link 0x1000828c	xer  0x0
        ccr  0x34428888	trap 0xc00
        [exit status 2]
    script_test.go:156: FAIL: testdata/script/cgo_path.txt:14: context deadline exceeded
--- FAIL: TestScript/cgo_path (7.88s)

watchflakes

@pmur
Copy link
Contributor

pmur commented Jan 22, 2024

I wonder if LUCI needs to set GO_TEST_TIMEOUT_SCALE=2 for ppc64{,le}? I don't think I see that set in the logs. The POWER8 builders can be particularly slow. I think the VMs are also subject to sharing CPU time with whatever else is running at OSU.

@cherrymui
Copy link
Member Author

Yeah, GO_TEST_TIMEOUT_SCALE=2 doesn't seem to be set. The old ppc64le builders set GO_TEST_TIMEOUT_SCALE=2, so we probably should do that for LCUI builders as well.

cc @mknyszek

@cherrymui cherrymui added arch-ppc64x NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jan 23, 2024
@mknyszek
Copy link
Contributor

I'll send a CL for this today. Thanks.

@gopherbot
Copy link

Change https://go.dev/cl/557857 mentions this issue: main.star: add a host-based test timeout scale

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jan 23, 2024
@dmitshur dmitshur added the FixPending Issues that have a fix which has not yet been reviewed or submitted. label Jan 23, 2024
gopherbot pushed a commit to golang/build that referenced this issue Jan 24, 2024
This change adds a baseline test time timeout scale that's derived from
the host to better align with the old infrastructure. Such hosts are
listed in SLOW_HOSTS along with their timeout scale.

This CL also changes the way timeout scaling factors compose. Instead of
taking the highest one, this change multiplies them. Although the only
case where this matters seems to mostly work fine with taking the
maximum (-longtest-race) it seems easier to reason about the composition
of scaling factors this way.

Fixes golang/go#65171.

Change-Id: If95fc8a036c2a735396c854cbcf09af7cba0c9f3
Reviewed-on: https://go-review.googlesource.com/c/build/+/557857
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Bypass: Michael Knyszek <mknyszek@google.com>
@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- builder ==  "gotip-linux-ppc64le" && (`test timed out` || `SIGQUIT` || `context deadline exceeded` || status == "ABORT")
2023-12-15 20:30 gotip-linux-ppc64le go@3313bbb4 runtime.TestCgoPprofPIE (log)
=== RUN   TestCgoPprofPIE
=== PAUSE TestCgoPprofPIE
=== CONT  TestCgoPprofPIE
    crash_test.go:152: running /home/swarming/.swarming/w/ir/x/w/goroot/bin/go build -o /home/swarming/.swarming/w/ir/x/t/go-build49346469/testprogcgo_-buildmode=pie.exe -buildmode=pie

watchflakes

@cherrymui
Copy link
Member Author

cherrymui commented Jan 25, 2024

Closing. Should be fixed by the CL above. (Apparently gopherbot doesn't do it because it is checked in to a branch.)

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- builder ~  `(gotip|go1\.\d\d)-linux-ppc64le` && (`test timed out` || `SIGQUIT` || `context deadline exceeded` || status == "ABORT")
2023-12-19 15:57 go1.22-linux-ppc64le release-branch.go1.22@fb23428a cmd/cgo/internal/testshared.TestIssue44031 [ABORT] (log)
=== RUN   TestIssue44031
    shared_test.go:89: go install -buildmode=shared -linkshared ./issue44031/a
    shared_test.go:89: go install -buildmode=shared -linkshared ./issue44031/b
panic: test timed out after 9m0s
running tests:
	TestIssue44031 (51s)

goroutine 167 gp=0xc000123880 m=4 mp=0xc000080008 [running]:
panic({0x1c37c0?, 0xc000020240?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:779 +0x174 fp=0xc0000d2ee0 sp=0xc0000d2e20 pc=0x4d1c4
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc0000d1f20 sp=0xc0000d1eb0 pc=0x133a6c
os/exec.(*Cmd).Start.func2(0x0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc0000d1f98 sp=0xc0000d1f20 pc=0x134648
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc0000d1fc0 sp=0xc0000d1f98 pc=0x1345e8
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc0000d1fc0 sp=0xc0000d1fc0 pc=0x8bbb4
created by os/exec.(*Cmd).Start in goroutine 141
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec

watchflakes

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- builder ~  `(gotip|go1\.\d\d)-linux-ppc64le` && (`test timed out` || `SIGQUIT` || `context deadline exceeded` || status == "ABORT")
2024-01-31 21:44 x_tools-go1.20-linux-ppc64le tools@5f906919 release-branch.go1.20@746a0727 x/tools/gopls/internal/test/integration/workspace.TestReloadOnlyOnce/default [ABORT] (log)
=== RUN   TestReloadOnlyOnce/default
DETECTED A HANGING GO COMMAND

The gopls test runner has detected a hanging go command. In order to debug
this, the output of ps and lsof/fstat is printed below.

See golang/go#54461 for more details.

ps axo ppid,pid,command:
-------------------------
...
panic: running lsof: exec: "lsof": executable file not found in $PATH

goroutine 14940 [running]:
golang.org/x/tools/internal/gocommand.HandleHangingGoCommand(0xc001a27d10)
	/home/swarming/.swarming/w/ir/x/w/targetrepo1846607283/internal/gocommand/invoke.go:439 +0x390
golang.org/x/tools/internal/gocommand.runCmdContext({0xe60a08, 0xc001f13f20}, 0xc00245f760)
	/home/swarming/.swarming/w/ir/x/w/targetrepo1846607283/internal/gocommand/invoke.go:375 +0x3d4
golang.org/x/tools/internal/gocommand.(*Invocation).run(0xc000d6b450, {0xe60a08, 0xc001f13f20}, {0xe57c80?, 0xc000208870}, {0xe57c80?, 0xc0002088d0})
	/home/swarming/.swarming/w/ir/x/w/targetrepo1846607283/internal/gocommand/invoke.go:270 +0xebc
golang.org/x/tools/internal/gocommand.(*Invocation).runWithFriendlyError(0xc000d6b408?, {0xe60a08, 0xc001f13f20}, {0xe57c80?, 0xc000208870?}, {0xe57c80?, 0xc0002088d0?})
...
golang.org/x/tools/gopls/internal/cache.(*Snapshot).WorkspaceMetadata(0xc003026a20, {0xe60a08?, 0xc001f12c30?})
	/home/swarming/.swarming/w/ir/x/w/targetrepo1846607283/gopls/internal/cache/snapshot.go:1083 +0x68
golang.org/x/tools/gopls/internal/server.(*server).diagnose(0xc001c0f200, {0xe60a08, 0xc002a28d20}, 0xc003026a20)
	/home/swarming/.swarming/w/ir/x/w/targetrepo1846607283/gopls/internal/server/diagnostics.go:366 +0x4cc
golang.org/x/tools/gopls/internal/server.(*server).diagnoseSnapshot(0xc001c0f200, 0xc003026a20, {0xc00184bb70, 0x1, 0x1}, 0x989680)
	/home/swarming/.swarming/w/ir/x/w/targetrepo1846607283/gopls/internal/server/diagnostics.go:223 +0x574
golang.org/x/tools/gopls/internal/server.(*server).diagnoseChangedViews.func1(0xc003026a20, {0xc00184bb70?, 0xc00464b860?, 0xc000e21f88?})
	/home/swarming/.swarming/w/ir/x/w/targetrepo1846607283/gopls/internal/server/diagnostics.go:135 +0xe8
created by golang.org/x/tools/gopls/internal/server.(*server).diagnoseChangedViews
	/home/swarming/.swarming/w/ir/x/w/targetrepo1846607283/gopls/internal/server/diagnostics.go:132 +0x518
2024-01-31 21:44 x_tools-go1.21-linux-ppc64le tools@5f906919 release-branch.go1.21@916e6cdd x/tools/gopls/internal/test/integration/misc.TestHoverBrokenImport_Issue60592/default [ABORT] (log)
=== RUN   TestHoverBrokenImport_Issue60592/default
DETECTED A HANGING GO COMMAND

The gopls test runner has detected a hanging go command. In order to debug
this, the output of ps and lsof/fstat is printed below.

See golang/go#54461 for more details.

ps axo ppid,pid,command:
-------------------------
...
panic: running lsof: exec: "lsof": executable file not found in $PATH [recovered]
	panic: /home/swarming/.swarming/w/ir/x/w/targetrepo4225489592/gopls/internal/protocol/tsserver.go:100: server panic in textDocument/hover request

goroutine 26274 [running]:
golang.org/x/tools/gopls/internal/util/bug.report({0xc00ca10c00, 0x2a})
	/home/swarming/.swarming/w/ir/x/w/targetrepo4225489592/gopls/internal/util/bug/bug.go:80 +0x4f0
golang.org/x/tools/gopls/internal/util/bug.Reportf({0xc57ce5?, 0x12?}, {0xc0014b62b0?, 0x51b0c?, 0x0?})
	/home/swarming/.swarming/w/ir/x/w/targetrepo4225489592/gopls/internal/util/bug/bug.go:51 +0x58
golang.org/x/tools/gopls/internal/protocol.serverDispatch.func1()
	/home/swarming/.swarming/w/ir/x/w/targetrepo4225489592/gopls/internal/protocol/tsserver.go:100 +0xa8
...
      1 4174727 [link] <defunct>
      1 4184384 [compile] <defunct>
      1 4184389 [compile] <defunct>
      1 4184390 [compile] <defunct>
      1 4184421 [asm] <defunct>
      1 4185715 [compile] <defunct>
      1 4190463 [compile] <defunct>
      1 4190464 [compile] <defunct>
      1 4190465 [compile] <defunct>
      1 4190466 [compile] <defunct>
2024-02-02 15:22 gotip-linux-ppc64le go@149db960 cmd/cgo/internal/testcarchive.TestPreemption [ABORT] (log)
=== RUN   TestPreemption
=== PAUSE TestPreemption
=== CONT  TestPreemption
panic: test timed out after 9m0s
running tests:
	TestDeepStack (1s)
	TestPreemption (1s)
	TestSIGPROF (1s)
	TestSharedObject (1s)

...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc000107f20 sp=0xc000107eb0 pc=0x10879c
os/exec.(*Cmd).Start.func2(0x0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc000107f98 sp=0xc000107f20 pc=0x109378
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc000107fc0 sp=0xc000107f98 pc=0x109318
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc000107fc0 sp=0xc000107fc0 pc=0x8bec4
created by os/exec.(*Cmd).Start in goroutine 99
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2024-02-02 15:22 gotip-linux-ppc64le go@149db960 cmd/cgo/internal/testshared.TestTestInstalledShared [ABORT] (log)
=== RUN   TestTestInstalledShared
panic: test timed out after 9m0s
running tests:
	TestTestInstalledShared (14s)

goroutine 147 gp=0xc0005028c0 m=2 mp=0xc000044808 [running]:
panic({0x1c3740?, 0xc00011e8d0?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:768 +0x174 fp=0xc0002a2ee0 sp=0xc0002a2e20 pc=0x4d374
testing.(*M).startAlarm.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/testing/testing.go:2366 +0x338 fp=0xc0002a2fc0 sp=0xc0002a2ee0 pc=0x104758
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc0002a0f20 sp=0xc0002a0eb0 pc=0x13409c
os/exec.(*Cmd).Start.func2(0x2107a8?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc0002a0f98 sp=0xc0002a0f20 pc=0x134c78
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc0002a0fc0 sp=0xc0002a0f98 pc=0x134c18
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc0002a0fc0 sp=0xc0002a0fc0 pc=0x8c184
created by os/exec.(*Cmd).Start in goroutine 96
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2024-02-02 15:22 gotip-linux-ppc64le go@149db960 cmd/go.TestScript (log)
=== RUN   TestScript/test_ppc64le_cgo_inline_plt
=== PAUSE TestScript/test_ppc64le_cgo_inline_plt
=== CONT  TestScript/test_ppc64le_cgo_inline_plt
    script_test.go:132: 2024-02-02T16:42:50Z
    script_test.go:134: $WORK=/home/swarming/.swarming/w/ir/x/t/cmd-go-test-815285317/tmpdir2819220712/test_ppc64le_cgo_inline_plt696775735
    script_test.go:156: 
        PATH=/home/swarming/.swarming/w/ir/x/t/cmd-go-test-815285317/tmpdir2819220712/testbin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/cache/tools/bin:/home/swarming/.swarming/w/ir/bbagent_utility_packages:/home/swarming/.swarming/w/ir/bbagent_utility_packages/bin:/home/swarming/.swarming/w/ir/cipd_bin_packages:/home/swarming/.swarming/w/ir/cipd_bin_packages/bin:/home/swarming/.swarming/w/ir/cache/cipd_client:/home/swarming/.swarming/w/ir/cache/cipd_client/bin:/home/swarming/.swarming/cipd_cache/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
        HOME=/no-home
        CCACHE_DISABLE=1
        GOARCH=ppc64le
...
        r22  0x7fffd77d03a0	r23  0x1
        r24  0x0	r25  0xf0
        r26  0x100	r27  0x110
        r28  0x15de4464	r29  0xc0000557c0
        r30  0x10d062a0	r31  0x10012cf0
        pc   0x10087cfc	ctr  0x0
        link 0x10041d70	xer  0x0
        ccr  0x34428888	trap 0xc00
    script_test.go:156: FAIL: testdata/script/test_ppc64le_cgo_inline_plt.txt:17: go build -ldflags=-linkmode=internal: exit status 2
--- FAIL: TestScript/test_ppc64le_cgo_inline_plt (78.92s)

watchflakes

@gopherbot gopherbot reopened this Feb 2, 2024
@pmur
Copy link
Contributor

pmur commented Feb 2, 2024

Is there more context to be found about what test is suspected of hanging, and what the timeout is?

I am rebuilding the container images to include lsof.

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- builder ~  `(gotip|go1\.\d\d)-linux-ppc64le` && (`test timed out` || `SIGQUIT` || `context deadline exceeded` || status == "ABORT")
2024-02-07 20:08 gotip-linux-ppc64le go@6abeffb1 cmd/go.TestScript (log)
=== RUN   TestScript/work_sum
=== PAUSE TestScript/work_sum
=== CONT  TestScript/work_sum
    script_test.go:132: 2024-02-07T21:01:31Z
    script_test.go:134: $WORK=/home/swarming/.swarming/w/ir/x/t/cmd-go-test-2449912314/tmpdir353651227/work_sum4267116283
    script_test.go:156: 
        PATH=/home/swarming/.swarming/w/ir/x/t/cmd-go-test-2449912314/tmpdir353651227/testbin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/cache/tools/bin:/home/swarming/.swarming/w/ir/bbagent_utility_packages:/home/swarming/.swarming/w/ir/bbagent_utility_packages/bin:/home/swarming/.swarming/w/ir/cipd_bin_packages:/home/swarming/.swarming/w/ir/cipd_bin_packages/bin:/home/swarming/.swarming/w/ir/cache/cipd_client:/home/swarming/.swarming/w/ir/cache/cipd_client/bin:/home/swarming/.swarming/cipd_cache/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
        HOME=/no-home
        CCACHE_DISABLE=1
        GOARCH=ppc64le
...
        r22  0x7fffef8d03a0	r23  0x1
        r24  0x0	r25  0x0
        r26  0x0	r27  0x2e
        r28  0x696168636c6f6f74	r29  0x0
        r30  0x10d062a0	r31  0x10012cf0
        pc   0x10087d1c	ctr  0x0
        link 0x10041d70	xer  0x0
        ccr  0x34428888	trap 0xc00
    script_test.go:156: FAIL: testdata/script/work_sum.txt:3: go run .: exit status 2
--- FAIL: TestScript/work_sum (17.94s)
2024-02-07 20:54 gotip-linux-ppc64le go@c0984005 cmd/cgo/internal/testshared.TestIssue44031 [ABORT] (log)
=== RUN   TestIssue44031
    shared_test.go:89: go install -buildmode=shared -linkshared ./issue44031/a
panic: test timed out after 9m0s
running tests:
	TestIssue44031 (42s)

goroutine 174 gp=0xc0000a0540 m=5 mp=0xc000200008 [running]:
panic({0x1c3740?, 0xc000090360?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:768 +0x174 fp=0xc0002e0ee0 sp=0xc0002e0e20 pc=0x4d374
testing.(*M).startAlarm.func1()
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc0002dbf20 sp=0xc0002dbeb0 pc=0x1340bc
os/exec.(*Cmd).Start.func2(0x0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc0002dbf98 sp=0xc0002dbf20 pc=0x134c98
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc0002dbfc0 sp=0xc0002dbf98 pc=0x134c38
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc0002dbfc0 sp=0xc0002dbfc0 pc=0x8c1a4
created by os/exec.(*Cmd).Start in goroutine 169
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2024-02-07 21:04 gotip-linux-ppc64le go@1ffc1104 cmd/cgo/internal/testshared.TestGeneratedHash [ABORT] (log)
=== RUN   TestGeneratedHash
panic: test timed out after 9m0s
running tests:
	TestGeneratedHash (14s)

goroutine 151 gp=0xc000003880 m=6 mp=0xc000042808 [running]:
panic({0x1c3740?, 0xc0000904a0?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:768 +0x174 fp=0xc0002b1ee0 sp=0xc0002b1e20 pc=0x4d374
testing.(*M).startAlarm.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/testing/testing.go:2366 +0x338 fp=0xc0002b1fc0 sp=0xc0002b1ee0 pc=0x104778
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc0002abf20 sp=0xc0002abeb0 pc=0x1340bc
os/exec.(*Cmd).Start.func2(0x0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc0002abf98 sp=0xc0002abf20 pc=0x134c98
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc0002abfc0 sp=0xc0002abf98 pc=0x134c38
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc0002abfc0 sp=0xc0002abfc0 pc=0x8c1a4
created by os/exec.(*Cmd).Start in goroutine 148
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2024-02-07 21:04 gotip-linux-ppc64le go@1ffc1104 cmd/go.TestScript (log)
=== RUN   TestScript/mod_get_main
=== PAUSE TestScript/mod_get_main
=== CONT  TestScript/mod_get_main
    script_test.go:132: 2024-02-08T00:09:25Z
    script_test.go:134: $WORK=/home/swarming/.swarming/w/ir/x/t/cmd-go-test-3231826461/tmpdir3382976003/mod_get_main3987484501
    script_test.go:156: 
        PATH=/home/swarming/.swarming/w/ir/x/t/cmd-go-test-3231826461/tmpdir3382976003/testbin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/cache/tools/bin:/home/swarming/.swarming/w/ir/bbagent_utility_packages:/home/swarming/.swarming/w/ir/bbagent_utility_packages/bin:/home/swarming/.swarming/w/ir/cipd_bin_packages:/home/swarming/.swarming/w/ir/cipd_bin_packages/bin:/home/swarming/.swarming/w/ir/cache/cipd_client:/home/swarming/.swarming/w/ir/cache/cipd_client/bin:/home/swarming/.swarming/cipd_cache/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
        HOME=/no-home
        CCACHE_DISABLE=1
        GOARCH=ppc64le
...
        r22  0xc0002ca3a0	r23  0xc0002cc6b8
        r24  0xe0	r25  0xf0
        r26  0x100	r27  0x110
        r28  0xd86fc7b50b1d5124	r29  0x15a526a63c
        r30  0xc0001fc1c0	r31  0x100a0e8c
        pc   0x100082f4	ctr  0x0
        link 0x100082bc	xer  0x0
        ccr  0x24448824	trap 0xc00
    script_test.go:156: FAIL: testdata/script/mod_get_main.txt:16: go get rsc.io/x: exit status 2
--- FAIL: TestScript/mod_get_main (11.33s)
2024-02-07 21:50 gotip-linux-ppc64le go@adbf71eb cmd/cgo/internal/testcarchive.TestCachedInstall [ABORT] (log)
=== RUN   TestCachedInstall
    carchive_test.go:1163: [go install -buildmode=c-archive ./libgo]
panic: test timed out after 9m0s
running tests:
	TestCachedInstall (12s)

goroutine 114 gp=0xc000122380 m=3 mp=0xc000045008 [running]:
panic({0x180320?, 0xc0001125c0?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:768 +0x174 fp=0xc000108ee0 sp=0xc000108e20 pc=0x4d064
testing.(*M).startAlarm.func1()
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc000209f20 sp=0xc000209eb0 pc=0x1087bc
os/exec.(*Cmd).Start.func2(0x1c0028?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc000209f98 sp=0xc000209f20 pc=0x109398
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc000209fc0 sp=0xc000209f98 pc=0x109338
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc000209fc0 sp=0xc000209fc0 pc=0x8bee4
created by os/exec.(*Cmd).Start in goroutine 80
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2024-02-07 21:50 gotip-linux-ppc64le go@adbf71eb cmd/cgo/internal/testplugin.TestIssue62430 [ABORT] (log)
=== RUN   TestIssue62430
( GOPATH=$TMPDIR PWD=$TMPDIR/src/testplugin LD_LIBRARY_PATH=$TMPDIR/src/testplugin /home/swarming/.swarming/w/ir/x/w/goroot/bin/go build -gcflags '' -buildmode=plugin -o issue62430.so ./issue62430/plugin.go )
( GOPATH=$TMPDIR PWD=$TMPDIR/src/testplugin LD_LIBRARY_PATH=$TMPDIR/src/testplugin /home/swarming/.swarming/w/ir/x/w/goroot/bin/go build -gcflags '' -o issue62430.exe ./issue62430/main.go )
SIGQUIT: quit
PC=0x137b0 m=0 sigcode=0

goroutine 229 gp=0xc000082a80 m=0 mp=0x2ae5c0 [syscall]:
syscall.Syscall6(0x110, 0x1, 0x21edfb, 0xc0000dda80, 0x1000004, 0x0, 0x0)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/syscall/syscall_linux.go:91 +0x28 fp=0xc0000dda28 sp=0xc0000dd9b8 pc=0x9a758
os.(*Process).blockUntilWaitable(0xc00019e180)
...
r20  0xc0000ddb00	r21  0x12
r22  0xc0000dc3a0	r23  0xc0000ddaa8
r24  0x0	r25  0x0
r26  0x0	r27  0xd7b6b5c2
r28  0x5720d75b	r29  0xfcb23ae3
r30  0xc000082a80	r31  0x9a778
pc   0x137b0	ctr  0x0
link 0x1377c	xer  0x0
ccr  0x34420804	trap 0xc00
*** Test killed with quit: ran too long (10m0s).
2024-02-07 21:50 gotip-linux-ppc64le go@adbf71eb cmd/cgo/internal/testshared.TestImplicitInclusion [ABORT] (log)
=== RUN   TestImplicitInclusion
panic: test timed out after 9m0s
running tests:
	TestImplicitInclusion (3s)

goroutine 107 gp=0xc000594e00 m=3 mp=0xc000045008 [running]:
panic({0x1c3740?, 0xc000206120?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:768 +0x174 fp=0xc00021fee0 sp=0xc00021fe20 pc=0x4d374
testing.(*M).startAlarm.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/testing/testing.go:2366 +0x338 fp=0xc00021ffc0 sp=0xc00021fee0 pc=0x104778
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc000081f20 sp=0xc000081eb0 pc=0x1340bc
os/exec.(*Cmd).Start.func2(0x210740?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc000081f98 sp=0xc000081f20 pc=0x134c98
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc000081fc0 sp=0xc000081f98 pc=0x134c38
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc000081fc0 sp=0xc000081fc0 pc=0x8c1a4
created by os/exec.(*Cmd).Start in goroutine 104
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2024-02-07 23:54 gotip-linux-ppc64le go@cde38c96 cmd/cgo/internal/testcarchive.TestSignalForwardingGo [ABORT] (log)
=== RUN   TestSignalForwardingGo
    carchive_test.go:676: go build -buildmode=c-archive -o libgo2.a ./libgo2
panic: test timed out after 9m0s
running tests:
	TestSignalForwardingGo (0s)

goroutine 100 gp=0xc000246700 m=2 mp=0xc000044808 [running]:
panic({0x180320?, 0xc00010a2b0?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:768 +0x174 fp=0xc00011bee0 sp=0xc00011be20 pc=0x4d064
testing.(*M).startAlarm.func1()
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc000106f20 sp=0xc000106eb0 pc=0x1087bc
os/exec.(*Cmd).Start.func2(0x1c00a0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc000106f98 sp=0xc000106f20 pc=0x109398
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc000106fc0 sp=0xc000106f98 pc=0x109338
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc000106fc0 sp=0xc000106fc0 pc=0x8bee4
created by os/exec.(*Cmd).Start in goroutine 98
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2024-02-07 23:54 gotip-linux-ppc64le go@cde38c96 cmd/cgo/internal/testerrors.TestMallocCrashesOnNil [ABORT] (log)
=== RUN   TestMallocCrashesOnNil
=== PAUSE TestMallocCrashesOnNil
=== CONT  TestMallocCrashesOnNil
panic: test timed out after 9m0s
running tests:
	TestMallocCrashesOnNil (2m52s)

goroutine 217 gp=0xc000427a40 m=6 mp=0xc0001d0008 [running]:
panic({0x180580?, 0xc000204040?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:768 +0x174 fp=0xc00006bee0 sp=0xc00006be20 pc=0x4cd94
...
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/proc.go:402 +0x114 fp=0xc000507ee0 sp=0xc000507eb0 pc=0x509f4
runtime.gcBgMarkWorker(0xc00001d140)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/mgc.go:1333 +0x104 fp=0xc000507f98 sp=0xc000507ee0 pc=0x2e924
runtime.gcBgMarkStartWorkers.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/mgc.go:1249 +0x4c fp=0xc000507fc0 sp=0xc000507f98 pc=0x2e80c
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc000507fc0 sp=0xc000507fc0 pc=0x8bac4
created by runtime.gcBgMarkStartWorkers in goroutine 143
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/mgc.go:1249 +0x170
2024-02-07 23:54 gotip-linux-ppc64le go@cde38c96 cmd/cgo/internal/testplugin.TestIssue25756 [ABORT] (log)
=== RUN   TestIssue25756
( GOPATH=$TMPDIR PWD=$TMPDIR/src/testplugin LD_LIBRARY_PATH=$TMPDIR/src/testplugin /home/swarming/.swarming/w/ir/x/w/goroot/bin/go build -gcflags '' -buildmode=plugin -o life.so ./issue25756/plugin )
SIGQUIT: quit
PC=0x8d54c m=0 sigcode=0

goroutine 0 gp=0x2add40 m=0 mp=0x2ae5c0 [idle]:
runtime.futex(0x2ae700, 0x80, 0x0, 0x0, 0x0, 0x0)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/sys_linux_ppc64x.s:644 +0x1c fp=0x7fffe653fd88 sp=0x7fffe653fd88 pc=0x8d54c
runtime.futexsleep(0x0?, 0x55b60?, 0x2ae5c0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/os_linux.go:69 +0x30 fp=0x7fffe653fde8 sp=0x7fffe653fd88 pc=0x4a440
...
r20  0x4	r21  0x3
r22  0x7fffe65304c0	r23  0x1
r24  0x0	r25  0x2917a0
r26  0x16	r27  0x9740b12b
r28  0x37935770	r29  0xc00002f7e0
r30  0x2add40	r31  0x1cce0
pc   0x8d54c	ctr  0x0
link 0x4a440	xer  0x0
ccr  0x34420888	trap 0xc00
*** Test killed with quit: ran too long (10m0s).
2024-02-08 03:02 gotip-linux-ppc64le go@1400b268 cmd/cgo/internal/testcarchive.TestCompileWithoutShared [ABORT] (log)
=== RUN   TestCompileWithoutShared
    carchive_test.go:1094: [go build -buildmode=c-archive -gcflags=-shared=false -o libgo2.a ./libgo2]
panic: test timed out after 9m0s
running tests:
	TestCompileWithoutShared (53s)

goroutine 130 gp=0xc000082380 m=3 mp=0xc000045008 [running]:
panic({0x180340?, 0xc00009e040?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:768 +0x174 fp=0xc000070ee0 sp=0xc000070e20 pc=0x4d0a4
testing.(*M).startAlarm.func1()
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc00006bf20 sp=0xc00006beb0 pc=0x1087ac
os/exec.(*Cmd).Start.func2(0x1c0000?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc00006bf98 sp=0xc00006bf20 pc=0x109388
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc00006bfc0 sp=0xc00006bf98 pc=0x109328
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc00006bfc0 sp=0xc00006bfc0 pc=0x8bf24
created by os/exec.(*Cmd).Start in goroutine 108
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2024-02-08 03:02 gotip-linux-ppc64le go@1400b268 cmd/cgo/internal/testshared.TestGlobal [ABORT] (log)
=== RUN   TestGlobal
    shared_test.go:89: go install -buildmode=shared -linkshared ./globallib
panic: test timed out after 9m0s
running tests:
	TestGlobal (7s)

goroutine 155 gp=0xc0001a8000 m=6 mp=0xc000080808 [running]:
panic({0x1c3760?, 0xc000096590?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:768 +0x174 fp=0xc000186ee0 sp=0xc000186e20 pc=0x4d3d4
testing.(*M).startAlarm.func1()
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc00010bf20 sp=0xc00010beb0 pc=0x1340cc
os/exec.(*Cmd).Start.func2(0x5074636f4d74?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc00010bf98 sp=0xc00010bf20 pc=0x134ca8
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc00010bfc0 sp=0xc00010bf98 pc=0x134c48
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc00010bfc0 sp=0xc00010bfc0 pc=0x8c204
created by os/exec.(*Cmd).Start in goroutine 99
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2024-02-08 03:02 gotip-linux-ppc64le go@1400b268 cmd/go.TestScript (log)
=== RUN   TestScript/test_ppc64le_cgo_inline_plt
=== PAUSE TestScript/test_ppc64le_cgo_inline_plt
=== CONT  TestScript/test_ppc64le_cgo_inline_plt
    script_test.go:132: 2024-02-08T04:38:20Z
    script_test.go:134: $WORK=/home/swarming/.swarming/w/ir/x/t/cmd-go-test-3634367634/tmpdir2719121339/test_ppc64le_cgo_inline_plt170024116
    script_test.go:156: 
        PATH=/home/swarming/.swarming/w/ir/x/t/cmd-go-test-3634367634/tmpdir2719121339/testbin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/x/w/goroot/bin:/home/swarming/.swarming/w/ir/cache/tools/bin:/home/swarming/.swarming/w/ir/bbagent_utility_packages:/home/swarming/.swarming/w/ir/bbagent_utility_packages/bin:/home/swarming/.swarming/w/ir/cipd_bin_packages:/home/swarming/.swarming/w/ir/cipd_bin_packages/bin:/home/swarming/.swarming/w/ir/cache/cipd_client:/home/swarming/.swarming/w/ir/cache/cipd_client/bin:/home/swarming/.swarming/cipd_cache/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
        HOME=/no-home
        CCACHE_DISABLE=1
        GOARCH=ppc64le
...
        r22  0xc0004123a0	r23  0xc0004145b0
        r24  0x0	r25  0x0
        r26  0x0	r27  0x0
        r28  0x0	r29  0x1
        r30  0xc00009aa80	r31  0x100a0ed8
        pc   0x10008310	ctr  0x0
        link 0x100082dc	xer  0x0
        ccr  0x34428884	trap 0xc00
    script_test.go:156: FAIL: testdata/script/test_ppc64le_cgo_inline_plt.txt:17: go build -ldflags=-linkmode=internal: exit status 2
--- FAIL: TestScript/test_ppc64le_cgo_inline_plt (65.75s)
2024-02-08 17:31 gotip-linux-ppc64le go@210f051d cmd/cgo/internal/testshared.TestIssue44031 [ABORT] (log)
=== RUN   TestIssue44031
    shared_test.go:89: go install -buildmode=shared -linkshared ./issue44031/a
    shared_test.go:89: go install -buildmode=shared -linkshared ./issue44031/b
panic: test timed out after 9m0s
running tests:
	TestIssue44031 (25s)

goroutine 182 gp=0xc0001d2a80 m=5 mp=0xc000080808 [running]:
panic({0x1c3760?, 0xc00020a2a0?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:768 +0x174 fp=0xc00033dee0 sp=0xc00033de20 pc=0x4d3d4
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc00006df20 sp=0xc00006deb0 pc=0x1340cc
os/exec.(*Cmd).Start.func2(0x0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc00006df98 sp=0xc00006df20 pc=0x134ca8
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc00006dfc0 sp=0xc00006df98 pc=0x134c48
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc00006dfc0 sp=0xc00006dfc0 pc=0x8c204
created by os/exec.(*Cmd).Start in goroutine 112
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2024-02-08 17:31 gotip-linux-ppc64le go@210f051d runtime.TestCgoPprofPIE [ABORT] (log)
=== RUN   TestCgoPprofPIE
=== PAUSE TestCgoPprofPIE
=== CONT  TestCgoPprofPIE
    crash_test.go:152: running /home/swarming/.swarming/w/ir/x/w/goroot/bin/go build -o /home/swarming/.swarming/w/ir/x/t/go-build2660985437/testprogcgo_-buildmode=pie.exe -buildmode=pie
2024-02-08 18:16 gotip-linux-ppc64le go@4e91c569 cmd/cgo/internal/testshared.TestIssue47873 [ABORT] (log)
=== RUN   TestIssue47873
    shared_test.go:89: go install -buildmode=shared -linkshared ./issue47837/a
panic: test timed out after 9m0s
running tests:
	TestIssue47873 (37s)

goroutine 128 gp=0xc0001bc540 m=3 mp=0xc000045008 [running]:
panic({0x1c3760?, 0xc000020080?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:768 +0x174 fp=0xc0000d3ee0 sp=0xc0000d3e20 pc=0x4d3d4
testing.(*M).startAlarm.func1()
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc000087f20 sp=0xc000087eb0 pc=0x1340cc
os/exec.(*Cmd).Start.func2(0x2106e0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc000087f98 sp=0xc000087f20 pc=0x134ca8
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc000087fc0 sp=0xc000087f98 pc=0x134c48
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc000087fc0 sp=0xc000087fc0 pc=0x8c204
created by os/exec.(*Cmd).Start in goroutine 170
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec

watchflakes

@pmur
Copy link
Contributor

pmur commented Feb 8, 2024

Hrm, I wonder if this is timing out more frequently because the jobs are not running on a tmpfs. The only tmpfs I have mounted to the containers is /workdir (as was used by the old CI), it doesn't look to be used.

Is there an option to move the work/caches to a tmpfs? My initial though is to mount /home/swarming as a tmpfs.

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- builder ~  `(gotip|go1\.\d\d)-linux-ppc64le` && (`test timed out` || `SIGQUIT` || `context deadline exceeded` || `running too slowly` || status == "ABORT")
2024-01-30 14:41 gotip-linux-ppc64le go@b89ad464 go/printer.TestFiles/statements.input (log)
=== RUN   TestFiles/statements.input
=== PAUSE TestFiles/statements.input
=== CONT  TestFiles/statements.input
    printer_test.go:160: testdata/statements.input: running too slowly
--- FAIL: TestFiles/statements.input (10.00s)

watchflakes

@gopherbot
Copy link

Change https://go.dev/cl/563396 mentions this issue: main.star: apply timeout scale to base builders, scale netbsd-arm64

gopherbot pushed a commit to golang/build that referenced this issue Feb 12, 2024
Factor the timeout_scale (int) → GO_TEST_TIMEOUT_SCALE env (string)
conversion out of make_run_mod, so that it has effect on slow hosts
whose builders don't have any run mods.

Also mark the netbsd-arm64 host as slow. It's timing out during some
tests, and the old dashboard had a timeout scale value of 10 set for
it. To confirm that high of a value is needed, start with 2 and bump
it further as needed.

For golang/go#65171.

Change-Id: Ib1206bf603502ba951cee025da0db24e4342214e
Reviewed-on: https://go-review.googlesource.com/c/build/+/563396
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@dmitshur
Copy link
Contributor

dmitshur commented Feb 12, 2024

Reports above seem to be because the GO_TEST_TIMEOUT_SCALE=2 env var wasn't fully propagated to the environment of one of the two builders. That should be fixed for future builds (CL 563396). Closing again optimistically, watchflakes will reopen if there turns out to be more to do after that.

@pmur
Copy link
Contributor

pmur commented Feb 12, 2024

I've also changed the docker configuration to mount /home/swarming as a tmpfs.

@cherrymui cherrymui changed the title x/build: time out on LUCI gotip-linux-ppc64le builder x/build: time out on LUCI Linux/PPC64x builder Feb 15, 2024
@gopherbot gopherbot reopened this Feb 15, 2024
@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- builder ~  `(gotip|go1\.\d\d)-linux-ppc64` && (`test timed out` || `SIGQUIT` || `context deadline exceeded` || `running too slowly` || status == "ABORT")
2024-02-13 04:23 x_tools-go1.22-linux-ppc64le tools@afd84280 release-branch.go1.22@20107e05 x/tools/gopls/internal/test/integration/misc.TestRunGovulncheckStd/default [ABORT] (log)
=== RUN   TestRunGovulncheckStd/default
signal: killed
2024-02-13 04:23 x_tools-go1.22-linux-ppc64le tools@afd84280 release-branch.go1.22@20107e05 x/tools/gopls/internal/util/safetoken.TestGoplsSourceDoesNotCallTokenFileMethods [ABORT] (log)
=== RUN   TestGoplsSourceDoesNotCallTokenFileMethods
signal: killed
2024-02-14 20:25 gotip-linux-ppc64-power10 go@d90a57ff cmd/go.TestBadCommandLines [ABORT] (log)
=== RUN   TestBadCommandLines
=== PAUSE TestBadCommandLines
2024-02-14 20:25 gotip-linux-ppc64-power10 go@d90a57ff cmd/go/internal/modfetch/codehost.TestLatest [ABORT] (log)
=== RUN   TestLatest
=== PAUSE TestLatest
2024-02-14 21:42 x_tools-gotip-linux-ppc64-power10 tools@fef8b627 go@d90a57ff x/tools/gopls/internal/test/integration/misc.TestRunGovulncheckStd [ABORT] (log)
=== RUN   TestRunGovulncheckStd
2024-02-14 21:42 x_tools-gotip-linux-ppc64-power10 tools@fef8b627 go@d90a57ff x/tools/gopls/internal/test/marker.Test [ABORT] (log)
=== RUN   Test
2024-02-14 21:42 x_tools-gotip-linux-ppc64-power10 tools@fef8b627 go@d90a57ff x/tools/gopls/internal/util/safetoken.TestGoplsSourceDoesNotCallTokenFileMethods [ABORT] (log)
=== RUN   TestGoplsSourceDoesNotCallTokenFileMethods
signal: killed

watchflakes

@pmur
Copy link
Contributor

pmur commented Feb 15, 2024

The latest batch of failures is the same issue as #65725. I only adjusted the RAM limits on the ppc64le LUCI containers. Looking through the logs, the ppc64-power10 LUCI containers also need more RAM.

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- builder ~  `(gotip|go1\.\d\d)-linux-ppc64` && (`test timed out` || `SIGQUIT` || `context deadline exceeded` || `running too slowly` || status == "ABORT")
2024-02-15 17:22 x_tools-go1.21-linux-ppc64-power10 tools@7240af8b release-branch.go1.21@b214108e x/tools/gopls/internal/test/integration/misc.TestInconsistentVendoring [ABORT] (log)
=== RUN   TestInconsistentVendoring
2024-02-15 17:22 x_tools-go1.21-linux-ppc64-power10 tools@7240af8b release-branch.go1.21@b214108e x/tools/gopls/internal/test/marker.Test [ABORT] (log)
=== RUN   Test
2024-02-15 17:22 x_tools-go1.21-linux-ppc64-power10 tools@7240af8b release-branch.go1.21@b214108e x/tools/gopls/internal/util/safetoken.TestGoplsSourceDoesNotCallTokenFileMethods [ABORT] (log)
=== RUN   TestGoplsSourceDoesNotCallTokenFileMethods
signal: killed

watchflakes

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- builder ~  `(gotip|go1\.\d\d)-linux-ppc64` && (`test timed out` || `SIGQUIT` || `context deadline exceeded` || `running too slowly` || status == "ABORT")
2024-02-08 16:18 go1.22-linux-ppc64le release-branch.go1.22@20107e05 cmd/cgo/internal/testcarchive.TestDeepStack [ABORT] (log)
=== RUN   TestDeepStack
=== PAUSE TestDeepStack
=== CONT  TestDeepStack
2024-02-08 16:18 go1.22-linux-ppc64le release-branch.go1.22@20107e05 cmd/cgo/internal/testshared.TestIssue62277 [ABORT] (log)
=== RUN   TestIssue62277
    shared_test.go:89: go install -buildmode=shared -linkshared ./issue62277/p
panic: test timed out after 9m0s
running tests:
	TestIssue62277 (7s)

goroutine 200 gp=0xc0000a0e00 m=0 mp=0x361c60 [running]:
panic({0x1c3740?, 0xc000204430?})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/panic.go:779 +0x174 fp=0xc000089ee0 sp=0xc000089e20 pc=0x4d164
testing.(*M).startAlarm.func1()
...
os/exec.(*Cmd).writerDescriptor.func1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:577 +0x4c fp=0xc000085f20 sp=0xc000085eb0 pc=0x133c6c
os/exec.(*Cmd).Start.func2(0x312e302e302e3732?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:724 +0x48 fp=0xc000085f98 sp=0xc000085f20 pc=0x134848
os/exec.(*Cmd).Start.gowrap1()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:736 +0x58 fp=0xc000085fc0 sp=0xc000085f98 pc=0x1347e8
runtime.goexit({})
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/asm_ppc64x.s:1018 +0x4 fp=0xc000085fc0 sp=0xc000085fc0 pc=0x8bdb4
created by os/exec.(*Cmd).Start in goroutine 195
	/home/swarming/.swarming/w/ir/x/w/goroot/src/os/exec/exec.go:723 +0x8ec
2024-02-16 15:53 go1.22-linux-ppc64-power10 release-branch.go1.22@fb86598c cmd/go.TestBadCommandLines [ABORT] (log)
=== RUN   TestBadCommandLines
=== PAUSE TestBadCommandLines
2024-02-16 15:53 go1.22-linux-ppc64-power10 release-branch.go1.22@fb86598c cmd/go/internal/modfetch/codehost.TestReadFile [ABORT] (log)
=== RUN   TestReadFile
=== PAUSE TestReadFile
=== CONT  TestReadFile
--- PASS: TestReadFile (0.00s)
2024-02-16 15:53 go1.22-linux-ppc64-power10 release-branch.go1.22@fb86598c cmd/go/internal/vcweb/vcstest.TestScripts [ABORT] (log)
=== RUN   TestScripts
2024-02-16 17:49 x_tools-go1.22-linux-ppc64le tools@c61f99f1 release-branch.go1.22@fb86598c x/tools/gopls/internal/test/marker.Test [ABORT] (log)
=== RUN   Test
2024-02-19 07:24 gotip-linux-ppc64le go@35fa852d cmd/go.TestBadCommandLines [ABORT] (log)
=== RUN   TestBadCommandLines
=== PAUSE TestBadCommandLines
2024-02-19 07:24 gotip-linux-ppc64le go@35fa852d cmd/link.TestCGOLTO [ABORT] (log)
=== RUN   TestCGOLTO
=== PAUSE TestCGOLTO
=== CONT  TestCGOLTO
    cgo_test.go:32: /home/swarming/.swarming/w/ir/x/w/goroot/bin/go env CC: "gcc"
    cgo_test.go:32: /home/swarming/.swarming/w/ir/x/w/goroot/bin/go env CGO_CFLAGS: "-O2 -g"

watchflakes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arch-ppc64x Builders x/build issues (builders, bots, dashboards) FixPending Issues that have a fix which has not yet been reviewed or submitted. NeedsFix The path to resolution is known, but the work has not been done.
Projects
Status: Done
Status: Done
Development

No branches or pull requests

5 participants