Important race condition when using remote cache without remote build #3360
Comments
That's awesome, thank you for reporting it!
Wait, but this is a race condition in local bazel caching as well. If you compile :race without any remote_cache, change foo.in mid-flight, then after the build change it back and try to recompile :race again, it will say that the target is up-to-date, but give you incorrect results.
I see you're right, it's a local bazel caching issue as well; I hadn't noticed that, but it makes sense. I agree with your suggestion of checking timestamps. I can't think of a better way to handle it than that.

When I was thinking this was only a problem with the remote cache, I thought bazel could just silently decide not to upload the incorrect files to the remote cache, but continue the local build. But since, as you say, the local cache is wrong too, maybe the build has to be aborted. Ugh. That feels undesirable, but I'm not sure what else to do. I suppose it could requeue the action and run it again, with a limit of 3 tries or something? That might be nice, but it's getting pretty fancy.

I'm also trying to figure out if there are more complicated scenarios that need to be considered. For example, suppose you have Package A, which compiles Input A, and Package B, which compiles Input B. A build is going at the same time that a …
As a longer-term solution, it might be interesting to explore copy-on-write filesystems: If sandboxed builds made copy-on-write copies of the input files instead of using symlinks, then the process of creating the copies should be fast and shouldn't use much additional disk space, but I think it would protect us against this (as long as the hashes or timestamps are double-checked after the copies are made). But this is sort of exotic, and only some newer filesystems support this, such as macOS APFS, Windows ReFS, and Linux ZFS (I'm saying this based on a quick Google search, I may be wrong about support in those filesystems).
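As a side note, user-level tools can already request such copies where the filesystem supports them; a rough shell illustration (the `sandbox/` directory is made up for the example):

```sh
# Copy-on-write copy of an input file into a hypothetical sandbox directory.
# On Linux filesystems with reflink support (e.g. Btrfs, XFS):
cp --reflink=always foo.in sandbox/foo.in
# On macOS APFS, -c asks cp to clone the file instead of copying its bytes:
cp -c foo.in sandbox/foo.in
```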
cc: @ulfjack
Thanks for the report. Well spotted. Technically, yes, there is the possibility of a race in the local case. However, it's sufficiently unlikely to happen. We can't make it completely reliable in all cases since it runs on your local machine and most people have root access to their local machines, so can do whatever they want.

The remote cache is a bigger concern, because the bad result from one machine can poison the results for other users. Note that any user can already poison the results for any other user with a shared cache. You should only use a shared cache if you completely, fully trust everyone in your team; anyone with write access to the shared cache can own everyone else's machine. As such, using a user-writable shared cache is something that we'll warn people against. I apparently didn't add that to the documentation yet. The compromise is to have a cache that's written by an automated process, e.g., a CI system, which very few people have access to, and configure the clients for read-only access. That is, if you don't do remote execution on machines that are locked down.

That said, we might want to double-check the checksums of the files before we upload an entry to the remote cache, as modifying source files during a build is a common-enough operation that this can easily happen by accident. If we enable --watchfs by default, we could check for relevant events for the relevant files between starting an action and writing the cache entry. The watcher APIs don't guarantee timely delivery, so an event that happens before writing a cache entry could still be delayed until after (so there's still a race), but it would be much less likely to happen. It also won't work for network file systems, because the relevant APIs usually don't. I think this is going to require some infrastructure to make it possible to check for file change events in the right places. We could maybe double-check the checksum before uploading the cache entry, but that could also be prohibitively expensive. Here's what isn't going to work: …
Checking the checksums doesn't eliminate the race condition in general (see provided example, where the file changes from 1 to 2 and back to 1). We need to check the timestamps, which might actually be cheaper. The local cache problem means that the remote cache problem will be reproducible for a particular user -- their local cache will be hit first, return an incorrect result, and the build will continue wrongly from there. So users who run Bazel in a loop will need to remember to do a bazel clean often...
Sure, I think using the watcher is actually a better plan. Absent that, we'd actually check the ctime, which we already record in some cases (but maybe not all?).

Uhm, what? There's no problem with the local cache. The case that you describe is something we've never observed in practice, and anyone running bazel clean is doing it wrong.
I think before uploading we should read the whole file into memory, then compute the checksum, then do the upload. This at least guarantees that the hash and the contents match.
I agree and disagree. It seems to me that anyone would want to run with a content-addressable storage cache that computes the hash server-side and doesn't trust the client's hash. So unless a malicious user manages to create a hash-collision, things should be fine. The remote cache can use the hash as a hint and return an error if its own computation and the client provided hash mismatch, but should never just accept the client's hash.
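A minimal sketch of that server-side check, in shell for illustration (the `cas/` layout and the variable names are assumptions, not Bazel's actual remote-cache code):

```sh
#!/bin/sh
# Recompute the hash of the uploaded blob on the server and reject the
# request if it doesn't match the hash claimed by the client.
claimed_hash="$1"   # hash supplied by the client
blob="$2"           # temporary file holding the uploaded bytes
actual_hash=$(sha256sum "$blob" | awk '{print $1}')
if [ "$actual_hash" != "$claimed_hash" ]; then
  echo "hash mismatch: refusing to store blob" >&2
  exit 1
fi
mv "$blob" "cas/$actual_hash"   # store under the server-computed hash
```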
We aren't uploading the input files to the remote cache. We do upload the output files, but the remote worker has code to double-check that the checksum matches. The action cache is not a pure CAS. The action results are entries from action hash to action result. You're right that it's impossible to poison a CAS (if the remote server checks the checksums), but that's not the problem.

I think it's not ideal that we store CAS and non-CAS entries in exactly the same way - implementations might need to distinguish between the two and use different storage mechanisms. It's on my TODO list to change the SimpleBlobStore interface (or introduce a new one) that allows implementations to store these separately, and then also update the Ondisk implementation to put the CAS and non-CAS entries into different directories.

Also, I apologize for my last comment. Thanks to Ola for pointing out the flaws in my not-well-thought-through previous post, and I agree that we should track file modification time. I'm still worried about costs, since it requires one additional stat per input file per action, but it's probably ok.
The action hash is a hash of the action description, not of the action result. If we knew the action result already, we wouldn't need to ask the cache.
I'm convincing myself that we shouldn't make a release without a bug fix or workaround for this issue.
Workaround for bazelbuild#3360. After an action finishes executing, we then re-check the timestamps of all of the input files to that action. If any of them have changed, we don't upload the action to the remote cache. This does not address the problem that even the local cache is incorrect. Fortunately, that problem is more rare, and also it can be worked around with "bazel clean".
The 0.5.3 release will contain a bug fix by @ola-rozenfeld that should lower the chances of hitting the remote cache race by checking the input files.
Lower the chances of triggering #3360. Immediately before uploading output artifacts, check if any of the inputs ctime changed, and if so don't upload the outputs to the remote cache. I profiled the runs on //src:bazel before vs. after the change; as expected, the number of VFS_STAT increases by a factor of 180% (!), but, according to the profiler, it adds less than 1ms to the overall build time of 130s. PiperOrigin-RevId: 162179308
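In shell terms, the guard described in that commit amounts to something like the following (a conceptual sketch, not the actual Java change; `run_action.sh` is a stand-in for executing the action locally):

```sh
# Snapshot each input's ctime just before the action, re-stat just after,
# and skip the remote-cache upload if anything changed in between.
ctime() { stat -c %Z "$1" 2>/dev/null || stat -f %c "$1"; }  # GNU stat, with BSD/macOS fallback
before=$(ctime foo.in)
./run_action.sh   # hypothetical: runs the action locally
after=$(ctime foo.in)
if [ "$before" != "$after" ]; then
  echo "input changed during the action; not uploading its outputs to the remote cache"
fi
```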
Thanks for your work on this. @ola-rozenfeld's fix addresses the example I gave in my original description, but it's still not difficult to run into this problem with only slightly more complicated scenarios. This is real, and I think it's likely that my team would hit this if we were using the current version of Bazel. (We are currently using a fork of Bazel with my own fix, similar to what I describe below.) Luckily, I think a fix similar to Ola's is not too difficult (although unfortunately it wouldn't be as localized as her fix was).

The problem is that the current fix gets the ctime immediately before the action runs, and then immediately after the action runs; but often, the actual digests of the input files were computed much earlier during Bazel's execution. The real-world scenario is: the user starts a build (with local compilation but with remote cache) which will take "a little while" (let's say, at least 30-60 seconds). While the build is going, the user saves a file in their editor — a file that Bazel has not yet used in any action (but Bazel has already computed its digest, because that happened fairly early in Bazel's execution). The result is cache corruption.

One way to address this would be: immediately before any file's digest is computed, get its ctime, and save that somewhere (probably alongside the digest somehow). Then, much later, immediately after an action has executed, check if the ctimes of any of its input files have changed.

Here's a repro case for this scenario, structured quite similarly to the one above, except with two actions instead of one:

- Make this BUILD file: …
- Make the files …
- Start a build of …
- While the build is happening — during the first …
- The build will complete, and …
- That's not a bug. But then: …
- At this point, …
The best way I can think of to address this is to change … How are filesystems exposed internally in Bazel? If they are located through some sort of registry, then perhaps another way to do it would be …
Avoid triggering bazelbuild#3360. Immediately before a file's digest is computed for the first time, cache its last-modified time (mtime). Later, before uploading output artifacts, check if any of the inputs' mtimes have changed since their digest was computed, and if so, don't upload the outputs to the remote cache. We need to use mtime, not ctime (last-change time), because some files (e.g. genrule-setup.sh) have their metadata changed after their digest is computed. Also, the cache needs to resolve symbolic links, because some files are accessed from multiple paths that all resolve to the same file.
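Continuing the shell analogy (again just a sketch, not the real implementation), the difference is when the timestamp is captured: at digest time, possibly long before the action runs.

```sh
mtime() { stat -c %Y "$1" 2>/dev/null || stat -f %m "$1"; }  # GNU stat, with BSD/macOS fallback
# Record the mtime at the moment the digest is computed...
mtime_at_digest=$(mtime foo.in)
digest=$(sha256sum foo.in | awk '{print $1}')
# ...and much later, immediately after an action that uses foo.in has run,
# compare again before uploading that action's outputs.
if [ "$mtime_at_digest" != "$(mtime foo.in)" ]; then
  echo "foo.in changed since its digest was computed; skipping remote cache upload"
fi
```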
What do you think about recomputing the action hash (incl. source file hashes, affected merkle tree paths) before uploading to the remote cache? The code changes would be limited to the remote package. It's not clear to me that this would be noticeable in performance. Furthermore, the cost of hashing could be hidden by concurrently uploading output artifacts and only uploading the action hash if it did not change.
@buchgr, I'm not sure what you're proposing here, but rehashing all the input files is likely prohibitively expensive. Just think of rehashing the JDK for every Javac action - not an option. The right thing is to use the ctime to check for file changes, and conservatively not cache if the ctime changed (or re-execute). Note that it's unsafe to upload a cache entry if you aren't sure that the inputs match what the compiler saw during the action.
Maybe I'm missing something about the race condition here, but the way we avoid corrupting the cache is to recompute the hash on the cache service for every upload and throw an error if there is a mismatch. This way, in the case the file changes locally between the computation of the hash and the call to the remote cache service, the build just fails and has to be started again.
@buchgr Are you able to reproduce with experimental_guard_against_concurrent_changes enabled?
What's the remaining issue with this flag? I'm considering running a cache locally on developer laptops instead of using the dir cache as discussed in #5139.
@uri-canva on our CI we have as of last week seen actions being marked as modified even though no files have been modified on macOS. I did not have time to look into the issue yet. I am not aware of any problems on Linux.
What's the status of this?
Where do you store this info?
In FileContentsProxy via FileArtifactValue. The check is already implemented, though. Just enable --experimental_guard_against_concurrent_changes.
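For example (the flag is the one discussed in this thread; `//...` is just a placeholder target pattern):

```sh
# Opt in to the concurrent-change guard for this invocation; the same option
# can also be added to .bazelrc as a "build" line.
bazel build --experimental_guard_against_concurrent_changes //...
```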
Is there something blocking this from being enabled by default? Seems like a pretty clear, serious correctness issue, solved by @ulfjack's great work.
I agree, the default should be safe. For people who can guarantee the source files don't change, giving them a speed-up knob can be useful. But it should be safe by default.
According to the comments above the issue happens on macOS too. We disabled it regardless of that because the overhead is very high, but I assume since it's not enabled by default there haven't been many eyes on that anyway.
If this context helps, I can tell you that I don’t know how we would use Bazel if this flag wasn’t available. (We are a Mac shop.) Before this flag existed, corruption of our remote cache was somewhat rare, but when it did happen — perhaps once a month — we were pretty mystified and would have to spend a long time trying to figure out what happened and what to do about it.
This page (https://docs.bazel.build/versions/master/remote-caching.html#known-issues) says: "When an input file is modified during a build, Bazel might upload invalid results to the remote cache. You can enable change detection with the --experimental_guard_against_concurrent_changes flag. There are no known issues and it will be enabled by default in a future release. See issue #3360 for updates. Generally, avoid modifying source files during a build." Can it be enabled by default now, as the quoted text says it will be? People who do not want it can disable it.
I think that the docs aren't accurate anymore: there are known issues, and that's why it's not enabled by default.
It feels like there's a "{Fast, Correct} - Choose two" joke in here somewhere :)
Do we have a handle on where the drawbacks are? I don't remember noticing meaningful slowdown when I enabled it on macOS.
See the comments above.
@buchgr can you open a new issue specifically for the false positive when using --experimental_guard_against_concurrent_changes?
No, I get it, and have read. But you're seeing it manifest as a serious performance issue for you, @uri-canva, worse than the cache poisoning? Any more details you can provide (here or in the other issue)? What I was seeing on my repo/machines was that it's trading off a minor performance issue for a serious correctness/cache-poisoning issue -- and human time is much more valuable than CPU time. Seems like some others maybe feel that way... but it's totally possible we're not hitting the issue the same way you are and are missing the severity of the trade off.
We can force which Bazel version is used. At this point, I think someone would need to demonstrate a big speed regression to justify tolerating a cache corruption problem. The cache poisoning is a big, real issue that affected many people.
Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (…).
I don’t understand why the flag still exists instead of the fix being fully integrated, rather than hidden behind a command-line flag. With local builds, Bazel caching is perfect with the flag; without the flag, it would not be usable, due to random, extremely difficult-to-debug cache corruption.
Description of the problem / feature request / question:
Our org has about 50 engineers using Bazel, with a remote cache but without remote build. About once every 3-4 weeks, I would hear of a case of someone's build that was failing in strange ways when the remote cache was turned on, but would succeed when the cache was off.
I finally got a repro case, and was able to figure out what was happening. Bazel's remote spawn strategy does its work in this order:

1. Compute the digests (hashes) of the action's input files, and from them the action's cache key.
2. Look that key up in the remote cache; on a hit, download the cached outputs and skip the remaining steps.
3. On a miss, run the build action locally.
4. Upload the resulting outputs to the remote cache, keyed by the hashes computed in step 1.
But if the user makes any changes to the sources (e.g. saves a file, or uses git to check out a different branch of their code) after step 1, but before the build in step 3, then you will end up with an incorrect (corrupt) entry in the cache: Its hash matches the input files when step 1 executed, but its contents do not match what would have been built from those input files — its contents match what was built from the modified input files.
From that point on, all other developers on the team who try to build that same code will download the incorrect build results.
Interestingly, I think this cannot cause cache problems if you are using remote cache WITH remote build. In that scenario, Bazel's SimpleBlobStoreActionCache.uploadFileContents() will first read the file into memory, then compute its hash, and upload the already in-memory blob. So the file contents in the cache are guaranteed to be accurate (the source file's hash matches the source file's contents). I haven't tried it, but I imagine that the worst that could happen — and I'm not even sure if this could happen — is that the remote build might fail because it can't find the input files it wants. No serious harm done.

If possible, provide a minimal example to reproduce the problem:
This `BUILD` file has a rule called `:race` which, when built, will first sleep for a few seconds, and then will simply cat `foo.in` to `foo.out`:
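A hypothetical reconstruction of that rule (the original snippet isn't reproduced above, so this is an illustration matching the description rather than the author's exact file):

```sh
# Write a BUILD file whose :race genrule sleeps, then copies foo.in to
# foo.out, leaving a window during which foo.in can be modified.
cat > BUILD <<'EOF'
genrule(
    name = "race",
    srcs = ["foo.in"],
    outs = ["foo.out"],
    cmd = "sleep 10; cat $(location foo.in) > $@",
)
EOF
```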
1. Make a file `foo.in`: `echo 1 > foo.in`
2. Start a build with a command line similar to this one (for this test, I have nginx running on localhost acting as a remote cache; you will need to set up some sort of similar remote cache): …
3. While the build is happening — during the `sleep 10` — in another shell, change the contents of `foo.in`: `echo 2 > foo.in`
4. The build will complete, and `bazel-genfiles/foo.out` will, not surprisingly, contain 2. That's not a bug. But then:
5. `bazel --host_jvm_args=-Dbazel.DigestFunction=SHA1 clean`
6. Change `foo.in` to its old contents: `echo 1 > foo.in`
7. Run `bazel build`, with the same command line as above.

At this point, `bazel-genfiles/foo.out` will contain 2, because that's what Bazel found in the remote cache. This is definitely wrong.

Environment info
Operating system: OSX; probably on Linux too, I haven't tested it
Bazel version (output of `bazel info release`): 0.5.2