x/build: frequent failures on freebsd-arm-paulzhol builder with "signal: killed" since 2021-12-23 #50540
The most recent ones which look like:
Are due to out of memory during bootstrap. This also sometimes happened mid test like:
They were due to the loss of network connectivity to a block device used as swap; I've restored it. But it still seems 1.18 is a much heavier memory user.
There used to be a way to control the bots by canceling and triggering a run anew. I think I've been locked out of that for several years. And besides, it can never keep up with all the builds when it takes more than 3 hours per run.
That could be due to #44167 — including non-heap sources of GC work in the pacing decisions can cause less frequent collection when those sources are a significant fraction of memory usage.
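To make the pacing effect concrete, here is a rough sketch of the arithmetic, assuming the simplified heap-goal formula from the Go GC guide (goal = live heap + (live heap + GC roots) * GOGC / 100 in 1.18, versus live heap * (1 + GOGC / 100) before). The function names and the 64 MiB / 32 MiB figures are hypothetical, chosen only for illustration; they are not measurements from this builder.

```go
package main

import "fmt"

// heapGoalOld models the pre-1.18 pacer in simplified form:
// only the live heap contributes to the next GC trigger.
func heapGoalOld(liveHeap, gogc uint64) uint64 {
	return liveHeap + liveHeap*gogc/100
}

// heapGoalNew models the 1.18 pacer in simplified form:
// GC roots (goroutine stacks, globals) also count as GC work,
// so they push the heap goal up and make collections less frequent.
func heapGoalNew(liveHeap, roots, gogc uint64) uint64 {
	return liveHeap + (liveHeap+roots)*gogc/100
}

func main() {
	const mib = 1 << 20
	// Hypothetical sizes, for illustration only.
	live, roots, gogc := uint64(64*mib), uint64(32*mib), uint64(100)

	fmt.Printf("old goal: %d MiB\n", heapGoalOld(live, gogc)/mib) // 128 MiB
	fmt.Printf("new goal: %d MiB\n", heapGoalNew(live, roots, gogc)/mib) // 160 MiB
}
```

With these assumed numbers the heap is allowed to grow 32 MiB further before the next collection, which would matter on a memory-constrained ARM board. Lowering GOGC is the usual knob for trading CPU time for a smaller peak heap.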
The command that does that is
I think that's fine in general, as long as the builder makes regular forward progress? (It seems ok to be missing test runs in the middle of a burst of changes as long as we occasionally get an up-to-date run at the end of the burst.)
I understand the Soon label was added to look into the sudden increase in failing builds starting in late December, and it seems it was determined to be related to an increase in memory use in Go 1.18. (If that's not working as expected, it probably needs a separate issue.) I'll remove the Soon label, since there doesn't appear to be a clear time-sensitive action that must be taken on the order of days or hours here.
As Bryan mentioned,
It seems the current state of the builder is that it is missing. From https://farmer.golang.org/#pools:
2022-04-27T00:09 is the most recent. I'm not sure I can do anything about it.
The path forward for FreeBSD on ARMv7 is probably running it virtualized under ARM64 anyway.
I think I've found the root cause. I've managed to reproduce it by running
The iscsid daemon (iSCSI initiator control plane) and the dhclient daemon were swapped out due to high memory pressure. Later, the DHCP lease can't be renewed and iSCSI can't re-establish the connection to the block device used for swap.
I've switched the following sysctls:
The first one is meant to prevent whole-process swapping (separate from paging in FreeBSD) of runnable but inactive processes like iscsid.
Additionally I've moved