x/build: OOM on linux-ppc64le-power10osu builder #58261

Open
gopherbot opened this issue Feb 2, 2023 · 8 comments
Labels
Builders: x/build issues (builders, bots, dashboards)
NeedsInvestigation: Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@gopherbot
Contributor

gopherbot commented Feb 2, 2023

#!watchflakes
post <- builder == "linux-ppc64le-power10osu" && log ~ `signal: killed`

Issue created automatically to collect these failures.

Example (log):

# go run run.go -- rotate1.go
exit status 1
command-line-arguments: /workdir/go/pkg/tool/linux_ppc64le/compile: signal: killed
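The watchflakes rule at the top of this comment can be read as a predicate over one failure record: the builder name must match exactly, and the log must match the pattern `signal: killed`. The following is a minimal Go sketch of that reading, not the watchflakes implementation. The `Failure` type and the `matches` function are hypothetical names introduced for illustration, and treating `~` as a regular-expression match is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// Failure is a hypothetical stand-in for one dashboard failure record.
// It is not a type from the watchflakes source.
type Failure struct {
	Builder string
	Log     string
}

// killedRE mirrors the pattern in the rule. Reading `~` as a regular
// expression match is an assumption made for this sketch.
var killedRE = regexp.MustCompile(`signal: killed`)

// matches reports whether a failure would be collected by the rule
//   builder == "linux-ppc64le-power10osu" && log ~ `signal: killed`
func matches(f Failure) bool {
	return f.Builder == "linux-ppc64le-power10osu" && killedRE.MatchString(f.Log)
}

func main() {
	f := Failure{
		Builder: "linux-ppc64le-power10osu",
		Log:     "command-line-arguments: /workdir/go/pkg/tool/linux_ppc64le/compile: signal: killed",
	}
	fmt.Println(matches(f))
}
```

Note that the pattern only says a process was killed by a signal. It does not say why, so a match is a hint of an OOM kill rather than proof of one.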

watchflakes

@gopherbot gopherbot added the NeedsInvestigation label Feb 2, 2023
@gopherbot
Contributor Author

Found new dashboard test flakes for:

#!watchflakes
post <- pkg == "rotate1.go" && test == ""
2023-01-31 16:54 linux-ppc64le-power10osu go@55a33d88 rotate1.go (log)
# go run run.go -- rotate1.go
exit status 1
command-line-arguments: /workdir/go/pkg/tool/linux_ppc64le/compile: signal: killed

watchflakes

@cherrymui cherrymui changed the title from "rotate1.go: unrecognized failures" to "x/build: OOM on linux-ppc64le-power10osu builder" Feb 2, 2023
@gopherbot gopherbot added the Builders label Feb 2, 2023
@gopherbot gopherbot added this to the Unreleased milestone Feb 2, 2023
@gopherbot
Contributor Author

Found new dashboard test flakes for:

#!watchflakes
post <- builder == "linux-ppc64le-power10osu" && log ~ `signal: killed`
2023-01-31 16:53 linux-ppc64le-power10osu go@47e205c3 (log)
FAIL
2023/01/31 17:17:29 Failed: exit status 1
go tool dist: FAILED
2023-01-31 16:53 linux-ppc64le-power10osu go@43115ff0 (log)
FAIL
2023/01/31 17:17:50 Failed: exit status 1
go tool dist: FAILED
2023-02-01 21:30 linux-ppc64le-power10osu go@bd749504 chanlinear.go (log)
# go run run.go -- chanlinear.go
signal: killed
2023-02-01 21:30 linux-ppc64le-power10osu go@cda461bb index0.go (log)
# go run run.go -- index0.go
exit status 1
command-line-arguments: /workdir/go/pkg/tool/linux_ppc64le/compile: signal: killed

watchflakes

@bcmills
Contributor

bcmills commented Feb 2, 2023

(attn @golang/ppc64)

@pmur
Contributor

pmur commented Feb 2, 2023

Thanks. I suspect this was a VM configuration issue, which is hopefully resolved now.

Background: the VM was set up with too many vCPUs (160) and only 30GB of RAM. At the time, OSU had some issues configuring the vCPU count. It's set appropriately now (24, as it needed to be a multiple of 8). Those unused cores may have caused excess RAM usage on the host.
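To put the two configurations side by side, here is a rough Go sketch of the RAM available per vCPU if memory were divided evenly. The 160 and 24 vCPU counts and the 30GB figure come from the comment above. That the RAM stayed at 30GB after the vCPU count was fixed is an assumption, and `gbPerVCPU` is a hypothetical helper name.

```go
package main

import "fmt"

// gbPerVCPU returns the RAM each vCPU would get if memory were split evenly.
// This is a back-of-the-envelope figure, not a model of real memory use.
func gbPerVCPU(ramGB float64, vcpus int) float64 {
	return ramGB / float64(vcpus)
}

func main() {
	// 30GB is the figure from the comment; it is assumed to be unchanged
	// after the vCPU count was corrected.
	const ramGB = 30

	fmt.Printf("160 vCPUs: %.4f GB per vCPU\n", gbPerVCPU(ramGB, 160))
	fmt.Printf(" 24 vCPUs: %.4f GB per vCPU\n", gbPerVCPU(ramGB, 24))
}
```

Under these assumptions, the original setup leaves well under 1GB per vCPU, while the corrected one leaves more than 1GB.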

@cherrymui
Member

Thanks @pmur! Sounds like we can close this for now. We can reopen if this happens again.

@gopherbot
Contributor Author

Found new dashboard test flakes for:

#!watchflakes
post <- builder == "linux-ppc64le-power10osu" && log ~ `signal: killed`
2023-10-11 21:58 linux-ppc64le-power10osu text@6c97a165 go@31887586 (log)
FAIL
2023-10-11 21:58 linux-ppc64le-power10osu text@6c97a165 go@ea14b633 (log)
FAIL
2023-10-11 21:58 linux-ppc64le-power10osu text@6c97a165 go@3b303fa9 (log)
FAIL

watchflakes

@gopherbot gopherbot reopened this Nov 13, 2023
@pmur
Contributor

pmur commented Nov 13, 2023

I killed every compile process that had been running for more than 24 hours on the PPC64 builders because of #64067. The failures above are unrelated to this issue.

@pmur pmur closed this as completed Nov 13, 2023
@gopherbot gopherbot reopened this Feb 28, 2024
@gopherbot
Contributor Author

Found new dashboard test flakes for:

#!watchflakes
post <- builder == "linux-ppc64le-power10osu" && log ~ `signal: killed`
2024-02-28 16:44 linux-ppc64le-power10osu go@db8c6c8c runtime.TestGdbBacktrace (log)
--- FAIL: TestGdbBacktrace (274.70s)
    runtime-gdb_test.go:78: gdb version 12.1
    runtime-gdb_test.go:468: GDB command timed out after 4m34.325312969s: /usr/bin/gdb -nx -batch -iex add-auto-load-safe-path /workdir/go/src/runtime -ex set startup-with-shell off -ex break main.eee -ex run -ex backtrace -ex continue /workdir/tmp/TestGdbBacktrace3768967929/001/a.exe
    runtime-gdb_test.go:473: gdb output:
        Loading Go Runtime support.
        Breakpoint 1 at 0x780f4: file /workdir/tmp/TestGdbBacktrace3768967929/001/main.go, line 17.
        [New LWP 51798]
        [New LWP 51799]
        [New LWP 51800]
        [New LWP 51801]

        Thread 1 "a.exe" hit Breakpoint 1, 0x00000000000780f4 in main.eee (~r0=<optimized out>) at /workdir/tmp/TestGdbBacktrace3768967929/001/main.go:17
        17	func eee() bool { return true }
        #0  0x00000000000780f4 in main.eee (~r0=<optimized out>) at /workdir/tmp/TestGdbBacktrace3768967929/001/main.go:17
        #1  0x00000000000780d4 in main.ddd (~r0=<optimized out>) at /workdir/tmp/TestGdbBacktrace3768967929/001/main.go:14
        #2  0x0000000000078084 in main.ccc (~r0=<optimized out>) at /workdir/tmp/TestGdbBacktrace3768967929/001/main.go:11
        #3  0x0000000000078044 in main.bbb (~r0=<optimized out>) at /workdir/tmp/TestGdbBacktrace3768967929/001/main.go:8
        #4  0x0000000000078004 in main.aaa (~r0=<optimized out>) at /workdir/tmp/TestGdbBacktrace3768967929/001/main.go:5
        #5  0x0000000000078124 in main.main () at /workdir/tmp/TestGdbBacktrace3768967929/001/main.go:22
        [LWP 51801 exited]
        [LWP 51798 exited]
        [LWP 51796 exited]
        [LWP 51799 exited]
    runtime-gdb_test.go:492: gdb exited with error: signal: killed

watchflakes
