Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build/cmd/coordinator: mind combined memory usage when concurrently processing large payloads #51057

Open
dmitshur opened this issue Feb 7, 2022 · 3 comments
Labels
Builders x/build issues (builders, bots, dashboards) NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@dmitshur
Copy link
Contributor

dmitshur commented Feb 7, 2022

As part of WriteTgz/PutTar operations, the coordinator handles payloads that may be large (for example, a Go tree tarball is 23 MB, and the x/website tree is even larger). If the entire payload size is read into memory during the processing of these operations, then multiple operations made in parallel can lead to coordinator using up much more memory than it does under a more average workload, and exceeding its current 8 GB memory limit.

For now, we can double that limit to give it more room to handle workloads that may happen as part of normal Go development and debugging, without getting killed due to OOM.

Keeping this issue open to see what else we may need to do. There are various opportunities to investigate and consider:

  • see if these operations can be optimized to stream data rather than hold all bytes in memory during processing
  • limit concurrency of these requests
  • use disk when processing large tarballs
  • keep a larger memory allocation

CC @golang/release, @thanm, @aclements.

@dmitshur dmitshur added Builders x/build issues (builders, bots, dashboards) NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Feb 7, 2022
@dmitshur dmitshur added this to the Backlog milestone Feb 7, 2022
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/383874 mentions this issue: cmd/coordinator: double memory limit

@dmitshur dmitshur modified the milestones: Backlog, Unreleased Feb 7, 2022
gopherbot pushed a commit to golang/build that referenced this issue Feb 8, 2022
We're seeing times when the coordinator runs into the current memory
limit, especially when some of its more memory-hungry operations are
under active use. Increase the memory available to it to help it get
through those workloads without the OOMKilling.

For golang/go#51057.

Change-Id: I1e0a5b63cf1106dc52f4abc85752bcbe0d9dab10
Reviewed-on: https://go-review.googlesource.com/c/build/+/383874
Trust: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
@dmitshur
Copy link
Contributor Author

dmitshur commented May 31, 2022

We've recently found that the large number of builders and golang.org/x repos can sometimes lead to a large amount of parallel work, which in turn consumes more memory than coordinator currently has.

As next steps, I'll send a CL to give coordinator more memory and later we can collect pprof samples to better understand how to reduce the memory use under such workloads.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/409594 mentions this issue: cmd/coordinator: triple the memory allocation

gopherbot pushed a commit to golang/build that referenced this issue May 31, 2022
The coordinator often hovers at under 2 GB of memory use when not
in active use, but has been observed to reach and exceed its 12 GB
allocation during times of heavy parallel builds, especially when
there are concurrent changes to release branches (as sometimes
happens during the release process).

Since there's only one instance of the coordinator and it has a real
need today, it's relatively inexpensive to give it more memory until
we make progress on understanding and possibly reducing it. Having
the coordinator not restart while many builds are in progress will
help reduce computational costs and waiting times.

For golang/go#51057.

Change-Id: I530cefc8aa401843684eea48a6a5d34072622fd5
Reviewed-on: https://go-review.googlesource.com/c/build/+/409594
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Builders x/build issues (builders, bots, dashboards) NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Status: Planned
Development

No branches or pull requests

2 participants