#!watchflakes
default <- goarch ~ `ppc64` && date != "" && date > "2023-05-22" && date < "2023-06-01"
https://go.dev/cl/495596 added a default.pgo profile for cmd/compile, enabling a PGO build of the compiler (as long as -pgo=none is not set explicitly).
This caused a variety of crashes in the compiler on ppc64{le} builders:
2023-05-18T16:55:07-88f89d8/linux-ppc64le-power9osu
2023-05-18T13:41:27-33a601b/aix-ppc64
2023-05-18T12:52:14-7b0835d/aix-ppc64
2023-05-18T10:23:17-75add1c/aix-ppc64
2023-05-18T09:16:07-774f602/linux-ppc64-sid-power10
2023-05-18T09:16:07-774f602/linux-ppc64le-buildlet
2023-05-18T09:15:25-27906bb/aix-ppc64
2023-05-18T09:15:25-27906bb/linux-ppc64-sid-power10
2023-05-18T01:40:37-6ed8474/linux-ppc64-sid-buildlet
2023-05-18T01:40:37-6ed8474/linux-ppc64-sid-power10
2023-05-18T00:35:53-956d31e/linux-ppc64le-buildlet
2023-05-17T22:11:31-0b86a04/linux-ppc64le-buildlet
2023-05-17T21:53:11-c426c87/linux-ppc64-sid-buildlet
2023-05-17T21:53:11-c426c87/linux-ppc64le-power10osu
2023-05-17T21:44:30-2693ade/linux-ppc64-sid-power10
That CL also caused #60263. Since it caused several issues, the CL was reverted in https://go.dev/cl/496185. #60263 has since been fixed.
Given that these failures are all on ppc64{le}, @dr2chase and I suspect that they are due to a bad ppc64-specific optimization (e.g., an SSA rule) that is tickled by the additional inlining caused by PGO.
I have had some success reproducing these crashes. Running all.bash in a loop on three linux-ppc64-sid-power10 builders concurrently with GOGC=5 usually gets me a failure in <30 minutes. Not stellar, but I think workable.
We should then be able to bisect down to a bad function by applying GOSSAHASH in inlineCostOK to enable/disable PGO-based inlining. (Also set -d=pgodevirtualize=0 to disable PGO-based devirtualization, which was submitted in https://go.dev/cl/492436 after the bad CL above was reverted.)
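
The idea behind hash-based bisection can be sketched as follows. This is an illustration only, not the compiler's implementation: the use of SHA-1, the helper names (`hashBits`, `enabled`, `bisect`), and the sample function names are all assumptions made for the example. Each function name is hashed to a bit string, an optimization is enabled only for functions whose hash ends with a given bit pattern, and the pattern is lengthened one bit at a time while the failure persists.

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"strings"
)

// hashBits renders the SHA-1 of name as a 160-character binary string.
func hashBits(name string) string {
	sum := sha1.Sum([]byte(name))
	var b strings.Builder
	for _, x := range sum {
		fmt.Fprintf(&b, "%08b", x)
	}
	return b.String()
}

// enabled reports whether the optimization applies to name, i.e. whether
// the binary hash of name ends with pattern. An empty pattern matches all.
func enabled(name, pattern string) bool {
	return strings.HasSuffix(hashBits(name), pattern)
}

// countMatching returns how many names are enabled under pattern.
func countMatching(names []string, pattern string) int {
	n := 0
	for _, name := range names {
		if enabled(name, pattern) {
			n++
		}
	}
	return n
}

// bisect extends pattern one bit at a time, keeping whichever half still
// fails, until at most one name matches. fails(pattern) stands in for
// "build and test with the optimization limited to pattern".
func bisect(names []string, fails func(pattern string) bool) string {
	pattern := ""
	for countMatching(names, pattern) > 1 && len(pattern) < 160 {
		switch {
		case fails("0" + pattern):
			pattern = "0" + pattern
		case fails("1" + pattern):
			pattern = "1" + pattern
		default:
			return pattern // failure needs more than one function
		}
	}
	return pattern
}

func main() {
	// Hypothetical function names; "pkg.bad" plays the miscompiled one.
	names := []string{"pkg.a", "pkg.b", "pkg.bad", "pkg.c"}
	fails := func(pattern string) bool { return enabled("pkg.bad", pattern) }

	pattern := bisect(names, fails)
	for _, name := range names {
		if enabled(name, pattern) {
			fmt.Printf("pattern %q isolates %s\n", pattern, name)
		}
	}
}
```

In the real workflow, each `fails` call would correspond to a full rebuild and test run with GOSSAHASH set to the candidate pattern, which is why a reasonably fast reproducer matters.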
Given that there is a path forward to debugging ppc64, and we'd like more soak time on the primary ports, I intend to resubmit https://go.dev/cl/495596, which will make ppc64 flaky until this issue is resolved. (GOARCH=ppc64{le} could also temporarily change the default of -pgo from auto to none if necessary.)
cc @golang/ppc64 @dr2chase @aclements @cherrymui