Symlinked sandbox is slow #16711

lberki · 2022-11-09T09:02:30Z

Description of the bug:

The symlinked sandbox is slow when an action has a large number of input files (I have seen reports of actions with up to 300K input files).

There are a number of ways one could improve this:

1. Creating the input directories and symlinks on multiple threads (SandboxHelpers currently does this on one thread)
2. Traversing the Java -> C++ boundary less frequently
3. Using one symlink per large tree artifact instead of symlinking each file in it separately
4. Using io_uring on Linux for more efficient data transfer to the kernel
5. Keeping the file system created for an action around and re-using it if the same action (or a similar one) is executed again
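To make the first idea concrete, here is a minimal sketch of creating sandbox symlinks on a fixed-size thread pool. It is not Bazel's SandboxHelpers code: the class name `ParallelSymlinks`, the method `createSymlinks`, and the link-to-target map are all hypothetical, and it assumes every link path has a parent directory. Whether this actually helps would need measuring, since the kernel may serialize directory updates.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper for illustration; not part of Bazel.
final class ParallelSymlinks {
  /** Creates link -> target for every map entry, spreading the work over `threads` threads. */
  static void createSymlinks(Map<Path, Path> linkToTarget, int threads)
      throws IOException, InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> pending = new ArrayList<>();
      for (Map.Entry<Path, Path> entry : linkToTarget.entrySet()) {
        pending.add(
            pool.submit(
                () -> {
                  try {
                    // createDirectories tolerates another thread creating the same directory.
                    Files.createDirectories(entry.getKey().getParent());
                    Files.createSymbolicLink(entry.getKey(), entry.getValue());
                  } catch (IOException e) {
                    throw new UncheckedIOException(e);
                  }
                }));
      }
      // Wait for every task and surface the first I/O failure to the caller.
      for (Future<?> future : pending) {
        try {
          future.get();
        } catch (ExecutionException e) {
          if (e.getCause() instanceof UncheckedIOException) {
            throw ((UncheckedIOException) e.getCause()).getCause();
          }
          throw new IllegalStateException(e.getCause());
        }
      }
    } finally {
      pool.shutdown();
    }
  }
}
```

A caller would build the map from sandbox path to real input path and then call `ParallelSymlinks.createSymlinks(map, 8)`. The thread count is a tuning knob, not a recommendation.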

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

No response

What is the output of `bazel info release`?

No response

If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.

No response

What's the output of `git remote get-url origin; git rev-parse master; git rev-parse HEAD` ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response


matthewjh · 2022-11-09T13:27:33Z

Yes, please. It is known to be particularly slow on macOS:

#8230

In my (albeit limited) experience, the biggest barrier to integrating Bazel into local developer workflows (where quick feedback is paramount) is the sandboxing time, which in many cases far outstrips action exec time.

brentleyjones · 2022-11-09T14:25:49Z

> Using one symlink per large tree artifact instead of symlinking each file in it separately

This one would really help for Apple platform builds where they produce bundles of bundles.
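As a rough illustration of why this matters for deep trees such as nested bundles, the sketch below contrasts the two strategies. The class `TreeArtifactLinks` and both method names are hypothetical and are not Bazel APIs. One trade-off to keep in mind: with a single directory symlink, anything an action writes under that path lands in the real tree.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of Bazel.
final class TreeArtifactLinks {
  /** Mirrors every regular file under treeRoot as its own symlink; returns the symlink count. */
  static int linkEachFile(Path treeRoot, Path sandboxDir) throws IOException {
    List<Path> files;
    try (Stream<Path> walk = Files.walk(treeRoot)) {
      files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
    }
    for (Path file : files) {
      Path link = sandboxDir.resolve(treeRoot.relativize(file));
      Files.createDirectories(link.getParent());
      Files.createSymbolicLink(link, file);
    }
    return files.size();
  }

  /** Exposes the whole tree through a single directory symlink; always creates one symlink. */
  static int linkWholeTree(Path treeRoot, Path sandboxDir) throws IOException {
    Files.createDirectories(sandboxDir.getParent());
    Files.createSymbolicLink(sandboxDir, treeRoot);
    return 1;
  }
}
```

With per-file linking, the number of filesystem calls grows with the number of files and directories in the tree. With a single link, it stays constant regardless of tree size.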

meisterT · 2022-11-14T15:08:18Z

cc @larsrc-google

@lberki There is --experimental_reuse_sandbox_directories - have you tried that?

larsrc-google · 2022-11-16T22:23:53Z

--experimental_reuse_sandbox_directories is essentially your point #5, and it helped a lot. #3 (tree artifacts) does sound reasonable. The others I'd like to have some measurements for first, to see how much we can actually save.

lberki · 2022-11-17T00:36:34Z

Learned today: https://github.com/ikorennoy/jasyncfio , io_uring in Java (I'm not sure if it's useful and it'd be an extra dependency, but I don't want this nugget of data to get lost)

larsrc-google · 2022-11-17T14:14:08Z

We'll need some reproducible examples. I tried compiling Bazel itself with and without sandbox in various worker/non-worker configurations, and the difference was minimal.

lberki · 2022-11-17T14:15:38Z

Did you try synthetic loads? That would be a much easier avenue than testing on Chrome OS / Kleaf builds (AFAIU they have 300K/80K input files, but I don't know how many and how big TreeArtifacts there are in the former)
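A synthetic load along these lines could be as simple as the sketch below, which creates a configurable number of empty input files and times how long it takes to symlink them on one thread. Everything here is an assumption for illustration: the class `SyntheticSandboxLoad`, the file layout, and the default of 10,000 files are hypothetical. It measures raw symlink creation from Java, not Bazel's sandbox code path, so it only gives a lower bound on the cost.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical micro-benchmark for illustration; it does not exercise Bazel itself.
final class SyntheticSandboxLoad {
  /** Creates `count` empty input files under inputsDir, spread over `dirs` subdirectories. */
  static void createInputs(Path inputsDir, int count, int dirs) throws IOException {
    for (int i = 0; i < count; i++) {
      Path dir = inputsDir.resolve("pkg" + (i % dirs));
      Files.createDirectories(dir);
      Files.createFile(dir.resolve("in" + i + ".txt"));
    }
  }

  /** Symlinks every input into sandboxDir on the calling thread; returns elapsed nanoseconds. */
  static long timeSymlinking(Path inputsDir, Path sandboxDir, int count, int dirs)
      throws IOException {
    long start = System.nanoTime();
    for (int i = 0; i < count; i++) {
      String rel = "pkg" + (i % dirs) + "/in" + i + ".txt";
      Path link = sandboxDir.resolve(rel);
      Files.createDirectories(link.getParent());
      Files.createSymbolicLink(link, inputsDir.resolve(rel));
    }
    return System.nanoTime() - start;
  }

  public static void main(String[] args) throws IOException {
    int count = args.length > 0 ? Integer.parseInt(args[0]) : 10_000;
    // The temporary directory is left behind; clean it up manually after a run.
    Path root = Files.createTempDirectory("synthetic-sandbox");
    createInputs(root.resolve("inputs"), count, 100);
    long nanos = timeSymlinking(root.resolve("inputs"), root.resolve("sandbox"), count, 100);
    System.out.printf("%d symlinks in %.1f ms%n", count, nanos / 1e6);
  }
}
```

Running it with 80000 or 300000 as the argument would approximate the input counts mentioned above, although the results depend heavily on the file system and OS.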

meisterT · 2022-11-17T14:20:20Z

@lberki Can you share the build that actually triggers the slowness? From there we can work towards a more minimal repro.

lberki · 2022-11-17T16:00:48Z

Plussed @larsrc-google into the pertinent threads (unfortunately, they are Google internal communications even though they are about the interaction between two Google open source projects...)

jacky8hyf · 2023-09-06T03:05:29Z

Kleaf uses sandbox builds by default (though we also encourage developers to disable the costly sandboxes for local development). This feature will greatly improve the build time for Kleaf.

I can provide some metrics for the time spent on sandbox creation for Kleaf builds on build bots (ci.android.com) upon request (the data is public but the dashboard is internal only).

lberki · 2023-09-06T08:23:18Z

Ack. Numbers would be really helpful to aid in our prioritization decisions.

metti · 2024-02-29T16:35:14Z

I am currently looking into potential improvements for the SymlinkedSandbox. In particular, I am exploring batching more work into each JNI call and using io_uring for I/O.

lberki added the team-Local-Exec label ("Issues and PRs for the Execution (Local) team") Nov 9, 2022

sgowroji added the type: bug label Nov 9, 2022

meisterT added the P3 label ("We're not considering working on this, but happy to review a PR. (No assignee)") Aug 3, 2023

lberki assigned metti Feb 29, 2024
