x/build: "fatal error: out of memory" on windows-arm64-11 #51019

Open
bcmills opened this issue Feb 4, 2022 · 21 comments
Labels
arch-arm64 Builders NeedsInvestigation OS-Windows
Milestone

Comments

@bcmills
Member

@bcmills bcmills commented Feb 4, 2022

greplogs --dashboard -md -l -e '(?ms)\Awindows-arm64.*^fatal error: out of memory' --since=2021-01-01

2022-02-04T14:02:15-25d2ab2-4afcc9f/windows-arm64-11
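For readers unfamiliar with the pattern passed to greplogs above, here is a minimal sketch of how it behaves under Go's `regexp` package. The log strings are hypothetical and shortened; only the pattern itself comes from the command above. `(?m)` lets `^` match at the start of any line, `(?s)` lets `.` cross newlines, and `\A` pins the builder name to the very start of the log.

```go
package main

import (
	"fmt"
	"regexp"
)

// oomPattern is the expression from the greplogs command in this issue.
var oomPattern = regexp.MustCompile(`(?ms)\Awindows-arm64.*^fatal error: out of memory`)

// isOOMLog reports whether a build log starts with a windows-arm64 builder
// name and later contains a line beginning with the runtime's OOM message.
func isOOMLog(log string) bool {
	return oomPattern.MatchString(log)
}

func main() {
	// Hypothetical log excerpts, for illustration only.
	failing := "windows-arm64-11 at 25d2ab2\n...\nfatal error: out of memory\n"
	passing := "windows-arm64-11 at 25d2ab2\n...\nok\n"
	otherBuilder := "linux-amd64 at 25d2ab2\nfatal error: out of memory\n"

	fmt.Println(isOOMLog(failing))      // true
	fmt.Println(isOOMLog(passing))      // false
	fmt.Println(isOOMLog(otherBuilder)) // false: \A requires the windows-arm64 prefix
}
```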

We may need to reconfigure the builder to either turn down the build/test parallelism or have more RAM available.

There is only one of these failures in the logs, but OTOH this builder has only ever run x/tools 12 times — so that's an 8% failure rate for this repo so far. 😅

(attn @golang/release)

@bcmills bcmills added arch-arm64 Builders OS-Windows NeedsInvestigation labels Feb 4, 2022
@gopherbot gopherbot added this to the Unreleased milestone Feb 4, 2022
@heschi heschi added the WaitingForInfo label Feb 10, 2022
@heschi
Contributor

@heschi heschi commented Feb 10, 2022

So far I'm not seeing any recurrences on what I assume is a much higher number of runs. We can keep an eye on it but right now I'm inclined to leave it alone.

@bcmills
Member Author

@bcmills bcmills commented Feb 11, 2022

> So far I'm not seeing any recurrences

Here's one running x/crypto:

greplogs --dashboard -md -l -e '(?ms)\Awindows-arm64.*^fatal error: out of memory' --since=2022-02-05

2022-02-10T15:16:21-f4118a5-656d3f4/windows-arm64-11

@bcmills bcmills changed the title from 'x/build: "fatal error: out of memory" building x/tools on windows-arm64-11' to 'x/build: "fatal error: out of memory" building tests on windows-arm64-11' Feb 11, 2022
@bcmills bcmills changed the title from 'x/build: "fatal error: out of memory" building tests on windows-arm64-11' to 'x/build: "fatal error: out of memory" building on windows-arm64-11' Feb 11, 2022
@bcmills bcmills changed the title from 'x/build: "fatal error: out of memory" building on windows-arm64-11' to 'x/build: "fatal error: out of memory" on windows-arm64-11' Feb 11, 2022
@bcmills bcmills removed the WaitingForInfo label Feb 11, 2022
@bcmills
Member Author

@bcmills bcmills commented Feb 11, 2022

From the sheer number of packages that failed in each of those logs, I suspect that the parallelism is being set too high. What's the CPU-to-RAM ratio for this builder? (Maybe we could scale down GOMAXPROCS?)

@heschi
Contributor

@heschi heschi commented Feb 11, 2022

About 12G RAM for 8 cores, which seems pretty plausible to me? There isn't much precedent for tweaking GOMAXPROCS but I guess it's worth a try.

...are the crypto tests really that memory hungry though? Smells weird.

@dmitshur
Contributor

@dmitshur dmitshur commented Feb 11, 2022

CL 381514 for #50084 is some recent precedent.

@gopherbot

@gopherbot gopherbot commented Feb 11, 2022

Change https://go.dev/cl/385182 mentions this issue: dashboard: reduce GOMAXPROCS on Windows 11 ARM64

@bcmills
Member Author

@bcmills bcmills commented Feb 11, 2022

> are the crypto tests really that memory hungry though? Smells weird.

I agree. It looks like the actual OOM happened while recompiling packages in std, so probably more about compiler memory usage than test memory usage per se — but it's not clear to me why those packages would have been stale in the first place. 🤔

gopherbot pushed a commit to golang/build that referenced this issue Feb 11, 2022
The Windows 11 ARM64 builder is experiencing occasional OOMs while
building tests. Reducing GOMAXPROCS will reduce the go command's
parallelism and hopefully prevent them.

For golang/go#51019.

Change-Id: Ia4bfdddaca178c130b9b57087a66a54cff903a05
Reviewed-on: https://go-review.googlesource.com/c/build/+/385182
Trust: Heschi Kreinick <heschi@google.com>
Run-TryBot: Heschi Kreinick <heschi@google.com>
Auto-Submit: Heschi Kreinick <heschi@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
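
As a rough back-of-the-envelope sketch of why lowering GOMAXPROCS might help: the go command's default build parallelism (`-p`) is GOMAXPROCS, so halving it roughly doubles the average RAM available to each concurrent compile. The figures below use the "about 12G RAM for 8 cores" quoted in this thread and assume an even split with no allowance for the OS or the emulator, which is an oversimplification. `memPerJobMiB` is a hypothetical helper name.

```go
package main

import "fmt"

// memPerJobMiB returns the average memory available to each parallel job
// when totalMiB of RAM is split evenly across jobs. It ignores memory used
// by the OS and other processes, so real headroom will be smaller.
func memPerJobMiB(totalMiB, jobs int) int {
	if jobs < 1 {
		jobs = 1
	}
	return totalMiB / jobs
}

func main() {
	const totalMiB = 12 * 1024 // "about 12G", per the thread

	for _, procs := range []int{8, 4} {
		fmt.Printf("GOMAXPROCS=%d: about %d MiB per job\n",
			procs, memPerJobMiB(totalMiB, procs))
	}
	// Output:
	// GOMAXPROCS=8: about 1536 MiB per job
	// GOMAXPROCS=4: about 3072 MiB per job
}
```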
@bcmills
Member Author

@bcmills bcmills commented Feb 15, 2022

Unfortunately still OOMing even with GOMAXPROCS=4.

greplogs --dashboard -md -l -e '(?ms)\Awindows-arm64.*^fatal error: out of memory' --since=2022-02-12

2022-02-15T14:54:27-76bd8ea/windows-arm64-11

It's not at all clear to me why this is happening for the -11 builder but not the -10 builder — are they running on different host configurations?

@bcmills
Member Author

@bcmills bcmills commented Feb 15, 2022

Oh, hrm. The failure condition in that last one is a bit different — it OOMed during bootstrapping. 🤔

@bcmills
Member Author

@bcmills bcmills commented Feb 22, 2022

Three more OOMs over the weekend: one during bootstrapping in the main repo, and two during x/tools builds.

greplogs --dashboard -md -l -e '(?ms)\Awindows-arm64.*^fatal error: .*(?:out of memory|cannot allocate memory)' --since=2022-02-16

2022-02-20T20:58:11-851ecea/windows-arm64-11
2022-02-17T17:37:24-1f3875c-eaf0405/windows-arm64-11
2022-02-17T17:37:03-fd59bdf-eaf0405/windows-arm64-11

@bcmills
Member Author

@bcmills bcmills commented Mar 17, 2022

Still ongoing:

greplogs --dashboard -md -l -e '(?ms)\Awindows-arm64.*^fatal error: .*(?:out of memory|cannot allocate memory)' --since=2022-02-23

2022-03-15T13:54:34-6799a7a-e475cf2/windows-arm64-11
2022-03-14T09:19:01-b769efc-ab0f761/windows-arm64-11
2022-03-12T23:32:36-842d37e/windows-arm64-11

@heschi
Contributor

@heschi heschi commented Mar 17, 2022

We also got bit twice during bootstrap while doing the release. But I have no idea what to do about it.

@bcmills
Member Author

@bcmills bcmills commented Apr 12, 2022

Still happening quite frequently, but only on the -11 builder. Does it have the same hardware configuration as the -10 builder?

greplogs --dashboard -md -l -e '(?ms)\Awindows-arm64.*^fatal error: .*(?:out of memory|cannot allocate memory)' --since=2022-03-17

2022-04-11T15:41:56-32de2b0/windows-arm64-11
2022-04-11T02:55:52-a6f6932/windows-arm64-11
2022-04-11T01:24:31-b6fb3af/windows-arm64-11
2022-04-02T14:28:33-8a816d5/windows-arm64-11
2022-03-21T13:26:21-86b02b3-7eaad60/windows-arm64-11
2022-03-21T13:26:21-86b02b3-4aa1efe/windows-arm64-11
2022-03-19T23:49:55-fa8efc1/windows-arm64-11

@heschi
Contributor

@heschi heschi commented Apr 12, 2022

Yep, same qemu script. My best guess is some kind of OS issue/conflict with the emulator, but I have no idea how to prove or disprove that belief.

@bcmills
Member Author

@bcmills bcmills commented Apr 18, 2022

I wonder if this is somehow related to #49564, in that they both involve unexpected OOM failures on Windows.

@bcmills
Member Author

@bcmills bcmills commented Apr 18, 2022

@golang/release, is there a way to get the runtime to dump the current heap size when it fails with "cannot allocate memory"? It would probably be useful to know whether these OOMs are occurring due to wildly oversized heaps like the one in #49564 (comment).

@bcmills
Member Author

@bcmills bcmills commented May 3, 2022

Here's a new (but likely related) failure mode:

windows-arm64-11 at a41e37f56a4fc2523ac88a76bf54ba3e45dcf533
…
Building Go cmd/dist using C:\workdir\go1.4
go tool compile: fork/exec C:\workdir\go1.4\pkg\tool\windows_arm64\compile.exe: The paging file is too small for this operation to complete.

greplogs -l -e 'The paging file is too small' --since=2022-01-01
2022-05-03T12:34:17-a41e37f/windows-arm64-11
2022-04-24T01:22:21-86c51ed-96c8cc7/windows-arm64-11
2022-04-18T12:04:50-8db23f8-91b9915/windows-arm64-11
2022-04-15T15:57:52-2c73f5f/windows-arm64-11
2022-03-21T13:26:21-86b02b3-7eaad60/windows-arm64-11
2022-03-21T13:26:21-86b02b3-4aa1efe/windows-arm64-11
2022-03-14T09:19:01-b769efc-ab0f761/windows-arm64-11
2022-02-17T17:37:03-fd59bdf-eaf0405/windows-arm64-11
2022-02-15T14:54:27-76bd8ea/windows-arm64-11
2022-02-10T15:16:21-f4118a5-656d3f4/windows-arm64-11
2022-02-04T14:02:15-25d2ab2-4afcc9f/windows-arm64-11
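
Several of the logs in this list also appear in the earlier "out of memory" lists (for example 2022-02-04T14:02:15 and 2022-02-10T15:16:21), so a single log can show both failure modes. A minimal sketch of tallying them separately follows; it uses plain substring checks rather than the greplogs regexps, and the labels `runtime-oom` and `paging-file` are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// failureModes returns a label for each memory-related failure message
// found in a build log. A log may contain more than one.
func failureModes(log string) []string {
	var modes []string
	if strings.Contains(log, "out of memory") ||
		strings.Contains(log, "cannot allocate memory") {
		modes = append(modes, "runtime-oom")
	}
	if strings.Contains(log, "The paging file is too small") {
		modes = append(modes, "paging-file")
	}
	return modes
}

func main() {
	// Hypothetical, shortened log excerpts for illustration.
	logs := []string{
		"windows-arm64-11\nfatal error: out of memory\n",
		"windows-arm64-11\ngo tool compile: fork/exec: The paging file is too small for this operation to complete.\n",
		"windows-arm64-11\nfatal error: out of memory\nThe paging file is too small for this operation to complete.\n",
	}

	counts := map[string]int{}
	for _, l := range logs {
		for _, mode := range failureModes(l) {
			counts[mode]++
		}
	}
	fmt.Println("runtime-oom:", counts["runtime-oom"])
	fmt.Println("paging-file:", counts["paging-file"])
}
```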

@bcmills
Member Author

@bcmills bcmills commented May 3, 2022

The above failure mode suggests that there is a problem with the builder itself, not (just) #52433, since that failure occurred during bootstrapping using the old and venerable go1.4 toolchain.

@bcmills
Member Author

@bcmills bcmills commented May 16, 2022

@gopherbot

@gopherbot gopherbot commented May 26, 2022

Change https://go.dev/cl/408702 mentions this issue: dashboard: add known issue for windows-arm64-11

gopherbot pushed a commit to golang/build that referenced this issue May 26, 2022
For golang/go#52653.
Updates golang/go#51019.

Change-Id: Ie57f7b2c2b6d4c3cc4b5f5f886773dff2a36a61e
Reviewed-on: https://go-review.googlesource.com/c/build/+/408702
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Alex Rakoczy <alex@golang.org>
4 participants