executor: allow using overlayfs for action workspaces #4330

Merged on Mar 11, 2024 (4 commits)

bduffany (Member) commented Jul 17, 2023

Adds a new flag --executor.workspace.overlayfs_enabled which enables overlayfs for action workspaces and another flag --executor.workspace.overlayfs_anonymous_only to enable it only for anonymous users (which may be preferable when already using group-level filecache isolation for authenticated users). This provides an alternative to reflinking to get copy-on-write for action inputs.

The behavior of this flag is:

  • If overlayfs is enabled for the current task, then before running any actions in the workspace, "convert" the workspace to an overlayfs mount.
  • "Convert" does the following:
    • Rename the existing workspace path to {path}.lower - inputs will go in this dir and cannot be overwritten by actions
    • Create {path}.work (for overlayfs temp data) and {path}.upper (for changes)
    • Mount the overlayfs to the pre-existing {path}
  • This "conversion" approach allows the changes to the existing workspace package to be minimal. All we have to do is hardlink inputs to the {path}.lower dir. Everywhere else (e.g. action working directory, uploading action outputs), we can still use workspace.Path() to get the action working directory, which points to the overlayfs mount dir and is where the action should be reading and writing files.
  • After running an action, we perform an "apply" operation which applies the changes from the upper dir to the lower dir, without actually modifying any of the files in the lower dir. (This can also be thought of as "flattening" the layers.) This is done for 2 reasons:
    • Since we use dirHelper for both downloading inputs and uploading outputs, it simplifies the code: dirHelper can always work from lowerdir, instead of handling inputs in lowerdir and uploading outputs from the mount dir.
    • It ensures that when we hardlink new inputs into the lowerdir for the next action (in the case of runner recycling), the changes are guaranteed to be visible (i.e. not "masked" by something in the upperdir).
  • Note: It might seem a bit "unsafe" to be messing with files in the upper and lower dirs, but this sort of thing is not discouraged in the manual pages for overlayfs. The overlayfs specification is clear about the on-disk format (e.g. what file deletion markers look like), so as long as we follow the specification, we should be OK.

Benchmark results (from GCP VM matching current executor configuration)

goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) CPU @ 3.10GHz
BenchmarkOverlayfs/OverlayEnabled_ReadFile_1B-16                  132554              7824 ns/op
BenchmarkOverlayfs/OverlayDisabled_ReadFile_1B-16                 177919              6671 ns/op
BenchmarkOverlayfs/OverlayEnabled_ReadFile_1KiB-16                149977              7982 ns/op
BenchmarkOverlayfs/OverlayDisabled_ReadFile_1KiB-16               169365              6887 ns/op
BenchmarkOverlayfs/OverlayEnabled_ReadFile_4KiB-16                133058              8929 ns/op
BenchmarkOverlayfs/OverlayDisabled_ReadFile_4KiB-16               150868              7860 ns/op
BenchmarkOverlayfs/OverlayEnabled_ReadFile_8KiB-16                117721             10076 ns/op
BenchmarkOverlayfs/OverlayDisabled_ReadFile_8KiB-16               131685              9157 ns/op
BenchmarkOverlayfs/OverlayEnabled_ReadFile_50KiB-16                50624             23179 ns/op
BenchmarkOverlayfs/OverlayDisabled_ReadFile_50KiB-16               53535             22065 ns/op
BenchmarkOverlayfs/OverlayEnabled_ReadFile_100KiB-16               34107             35225 ns/op
BenchmarkOverlayfs/OverlayDisabled_ReadFile_100KiB-16              35740             33563 ns/op
BenchmarkOverlayfs/OverlayEnabled_WriteFile_1B-16                  38896             29462 ns/op
BenchmarkOverlayfs/OverlayDisabled_WriteFile_1B-16                 53720             21583 ns/op
BenchmarkOverlayfs/OverlayEnabled_WriteFile_1KiB-16                39532             29479 ns/op
BenchmarkOverlayfs/OverlayDisabled_WriteFile_1KiB-16               52353             21704 ns/op
BenchmarkOverlayfs/OverlayEnabled_WriteFile_4KiB-16                39133             29495 ns/op
BenchmarkOverlayfs/OverlayDisabled_WriteFile_4KiB-16               53686             21603 ns/op
BenchmarkOverlayfs/OverlayEnabled_WriteFile_8KiB-16                36180             32434 ns/op
BenchmarkOverlayfs/OverlayDisabled_WriteFile_8KiB-16               46580             24317 ns/op
BenchmarkOverlayfs/OverlayEnabled_WriteFile_50KiB-16               20928             57340 ns/op
BenchmarkOverlayfs/OverlayDisabled_WriteFile_50KiB-16              24051             49096 ns/op
BenchmarkOverlayfs/OverlayEnabled_WriteFile_100KiB-16              14228             84356 ns/op
BenchmarkOverlayfs/OverlayDisabled_WriteFile_100KiB-16             15501             74746 ns/op
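
To make the numbers above easier to compare, here is a small helper that computes the absolute and relative overlayfs overhead per operation. The ns/op values are copied from the output above; the row type and the overhead function are only illustrative and are not part of this PR.

```go
package main

import "fmt"

// row holds the ns/op figures for one benchmark, copied from the output above.
type row struct {
	name              string
	enabled, disabled float64
}

// overhead returns the extra cost of overlayfs in ns/op and in percent.
func overhead(r row) (absNs, pct float64) {
	absNs = r.enabled - r.disabled
	return absNs, absNs / r.disabled * 100
}

var rows = []row{
	{"ReadFile_1B", 7824, 6671},
	{"ReadFile_1KiB", 7982, 6887},
	{"ReadFile_4KiB", 8929, 7860},
	{"ReadFile_8KiB", 10076, 9157},
	{"ReadFile_50KiB", 23179, 22065},
	{"ReadFile_100KiB", 35225, 33563},
	{"WriteFile_1B", 29462, 21583},
	{"WriteFile_1KiB", 29479, 21704},
	{"WriteFile_4KiB", 29495, 21603},
	{"WriteFile_8KiB", 32434, 24317},
	{"WriteFile_50KiB", 57340, 49096},
	{"WriteFile_100KiB", 84356, 74746},
}

func main() {
	for _, r := range rows {
		absNs, pct := overhead(r)
		fmt.Printf("%-18s +%5.0f ns/op (+%4.1f%%)\n", r.name, absNs, pct)
	}
}
```

By this arithmetic the absolute overhead stays in a fairly narrow band (about 0.9 to 1.7 µs per read and 7.8 to 9.6 µs per write), so the relative overhead shrinks as files get larger: from about 17.3% to 5.0% for reads and from about 36.5% to 12.9% for writes.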

Results from a simple build (overlayfs is roughly 23% slower, averaging the two builds in each configuration):

build //server/util/...
1 local executor
---
anonymous user, overlayfs enabled
warmup build: done
build 1: 104.5s
build 2: 106.3s
---
authenticated user, overlayfs disabled
warmup build: done
build 1: 84.5s
build 2: 87.1s

I checked to see if there are any knobs available that would help improve performance - I found the volatile mount option but it didn't measurably affect performance (it seems to be more applicable to some niche workloads which do a lot of fsync operations - the article mentions "Certain tools like RPM call for a sync after every file is written to disk").
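
For reference, volatile is passed as one more comma-separated entry in the overlay mount options. A minimal sketch of building that option string is below; the function name and signature are hypothetical and not taken from this PR, and to my knowledge the option needs a reasonably recent kernel (Linux 5.10 or later).

```go
package main

import (
	"fmt"
	"strings"
)

// overlayMountData builds the data string for an overlay mount(2) call.
// The name and signature are hypothetical, for illustration only.
func overlayMountData(lower, upper, work string, volatile bool) string {
	opts := []string{"lowerdir=" + lower, "upperdir=" + upper, "workdir=" + work}
	if volatile {
		// "volatile" asks overlayfs to skip syncing the upper layer to disk.
		opts = append(opts, "volatile")
	}
	return strings.Join(opts, ",")
}

func main() {
	fmt.Println(overlayMountData("/ws.lower", "/ws.upper", "/ws.work", true))
}
```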

Related issues: https://github.com/buildbuddy-io/buildbuddy-internal/issues/2441

bduffany merged commit 8a3fff5 into master on Mar 11, 2024
11 of 17 checks passed
bduffany deleted the ws-overlay branch on March 11, 2024 19:16