[Feature request] Support custom caching for non-reproducible actions? #573

Open
silvergasp opened this issue Feb 21, 2024 · 4 comments

@silvergasp
Contributor

I've been experimenting with non-reproducible actions in build systems like Bazel and buck2. Buck2 and many modern build systems assume that all build actions are reproducible. However, I'd really like to use buck2 with some non-deterministic build actions. I'm fully aware that an enormous amount of work has gone into making buck2 reproducible and hermetic, and that what I'm suggesting here runs somewhat counter to that effort.

Would the buck2 team be against having an optional/experimental configuration that allows for more direct control over cache-artifact lifetimes? I could see the following caching strategies being useful at a "rule" level:

  1. Never cache
  2. Cache expires after N seconds
  3. Cache expires on cron schedule
  4. Lazy cache evaluation i.e. immediately return cached artifact and then update it next time it's used.

This might look something like:

# BUCK
cxx_binary(
    name = "fuzz_foo",
    srcs = ["fuzz.cpp"],
    link_style = "static",
    # Default to never
    cache_eviction_strategy = "never",
)

timed_cache(
  name = "weekly",
  expire_after = "1w",
)

fuzz_report(
    name = "fuzz",
    target = ":fuzz_foo",
    cache_eviction_strategy = ":weekly",
    runs = 10000,
)

buck2 build //:fuzz
# ... building / running fuzzer and creating report

# 5 min later
buck2 build //:fuzz
# ... return cached result

# One week later
buck2 build //:fuzz
# ... building / running fuzzer and creating report
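The first two strategies from the list above can be sketched as a freshness check. This is only an illustration in Python, not a buck2 API: `CachePolicy`, `is_cache_hit`, and the policy names `never_cache` and `expire_after` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CachePolicy:
    """Hypothetical per-rule cache policy (not part of buck2)."""
    kind: str                           # "never_cache" or "expire_after"
    ttl_seconds: Optional[float] = None


def is_cache_hit(policy: CachePolicy, cached_at: Optional[float], now: float) -> bool:
    """Return True if an artifact produced at `cached_at` may be reused at `now`."""
    if cached_at is None:
        return False                    # nothing has been cached yet
    if policy.kind == "never_cache":
        return False                    # always rerun the action
    if policy.kind == "expire_after":
        return (now - cached_at) < policy.ttl_seconds
    raise ValueError(f"unknown policy kind: {policy.kind}")


WEEK = 7 * 24 * 3600
weekly = CachePolicy(kind="expire_after", ttl_seconds=WEEK)
```

With `weekly`, a lookup five minutes after the artifact was produced is a hit, and a lookup more than one week later is a miss, which matches the three `buck2 build //:fuzz` invocations above.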

Use cases

I'm a cyber-security researcher and I regularly conduct scans, do local fuzzing, etc. Something that I find excellent about buck2 is the ability to create complex repeatable execution graphs. Currently buck2 works great as a declarative build tool where each execution in the execution graph is repeatable, e.g. clang. However, it would be great to combine some of this with non-reproducible execution graphs as well. This would be useful for:

  • Fuzzing
  • Testing LLMs
  • Scanning a website
  • etc.
@cbarrete
Contributor

Is this actually a build problem? Shouldn't your implementation in fuzz.cpp e.g. take the current time and use that as a seed (or even better, take it via a command line flag or environment variable so that it's actually reproducible)? You could quantize that time if you need reproducible execution over a given period of time.
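A minimal sketch of that suggestion, in Python for brevity rather than in `fuzz.cpp` itself. The environment variable name `FUZZ_SEED` and the function `fuzz_seed` are assumptions made for illustration: an explicit seed wins, otherwise the current time is quantized into fixed-length buckets so that every run within the same period uses the same seed.

```python
import os
import time
from typing import Optional


def fuzz_seed(period_seconds: int = 7 * 24 * 3600, now: Optional[float] = None) -> int:
    """Pick a seed: an explicit FUZZ_SEED wins, else the current time bucket."""
    explicit = os.environ.get("FUZZ_SEED")   # hypothetical variable name
    if explicit is not None:
        return int(explicit)
    if now is None:
        now = time.time()
    # Integer division quantizes the clock: the seed only changes
    # once per `period_seconds`.
    return int(now) // period_seconds
```

Logging the chosen seed alongside the fuzzing report would let a later run reproduce it by setting the variable explicitly.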

It seems to me that the part that you really care about is not the buck2 build part, but rather the buck2 run one, which isn't cached at all.

@silvergasp
Contributor Author

> Is this actually a build problem? Shouldn't your implementation in fuzz.cpp e.g. take the current time and use that as a seed (or even better, take it via a command line flag or environment variable so that it's actually reproducible)? You could quantize that time if you need reproducible execution over a given period of time.

It may not be a build problem so much as a more general automation problem, though that might come down to semantics. I'd be quite happy for this to look something like:

buck2 run_pipeline //:some_pipeline

It's true that you could use a consistent seed for fuzzing and just fuzz for n iterations, and it's possible to get reproducible outputs in that case. However, there are still use cases where it's nice to have a full execution graph (which buck2 provides via DICE) where each node in the execution graph is not necessarily reproducible. A more concrete (though still toy) example of a non-reproducible execution graph might include:

  • Use 5 different scanners to detect misconfigurations in a website, each outputting its own JSON file.
  • Take said JSON files and convert them into a markdown file.
  • Take the markdown file and convert it to a static HTML page for developers to view.

But let's say that the scanners take 40 minutes to run and don't need to be run all that often. Having some caching involved would be great, but you wouldn't want the results to be cached indefinitely (which would be the case with buck2). This execution graph is also non-reproducible by definition, because the website being scanned is outside of your control.

Buck2 solves 90% of this automation problem by handling execution graphs, remote execution and caching etc. I'm aware that this doesn't necessarily fit with the primary goal of buck2 being a build system. But it has enough overlap for me to find it interesting as a generalised declarative automation framework. Does this sound too far out of left field for buck2? I'm aware that this is kind of build-system adjacent.

> It seems to me that the part that you really care about is not the buck2 build part, but rather the buck2 run one, which isn't cached at all.

This is sort of true, although I think what I'm hoping for is something like buck2 run but re-using some of the execution graph semantics in the runtime space.

@JakobDegen
Contributor

JakobDegen commented Feb 22, 2024

>   1. Never cache

Yeah, so we've talked about adding support for this kind of a thing before, primarily under the name "volatile actions." I think the hypothetical API is that when you call ctx.actions.run, you can specify volatile = True and then your action will get rerun on every command. Obviously this would need to be used with care.

The use-case that we had in mind at the time is better integration with system toolchains; for example, maybe you want to invalidate all your rust library builds when you upgrade your rustc version. You could define a volatile action that prints the rustc version into a file, and then add that as a never-read input to every rustc action.

I think the vibe on volatile actions is basically positive. Just needs someone to go and write some code I think.

>   2. Cache expires after N seconds
>   3. Cache expires on cron schedule

These two seem like they could be implemented on top of the first one. You can have a volatile action that prints the current timestamp / 3600 to a file, and then depend on that file from every other action - at the top of the hour, the contents of that file will change and your actions get invalidated.

I suppose that's not exactly the same as "expire after 1 hour," but it's pretty close. If you don't care about RE, then you can actually modify this scheme to use incremental actions and get exactly those semantics: have an action that writes the current timestamp to its output only if it's been more than 1 hour since the timestamp currently written there.
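Both variants can be sketched as the body of such an action. This is a plain Python illustration under stated assumptions, not buck2 code: the function names and the stamp-file layout are hypothetical, and the real thing would be a script run by a volatile (or incremental) action.

```python
import time
from pathlib import Path
from typing import Optional


def write_hour_bucket(stamp: Path, now: Optional[float] = None) -> str:
    """Bucket variant: the content changes exactly at the top of each hour."""
    now = time.time() if now is None else now
    content = str(int(now) // 3600)
    stamp.write_text(content)
    return content


def refresh_if_expired(stamp: Path, ttl: float = 3600.0, now: Optional[float] = None) -> str:
    """Incremental variant: rewrite the stamp only once the stored timestamp is older than `ttl`."""
    now = time.time() if now is None else now
    if stamp.exists():
        previous = stamp.read_text()
        if now - float(previous) < ttl:
            return previous             # output unchanged, dependents stay valid
    content = repr(now)
    stamp.write_text(content)
    return content
```

The second function depends on reading its own previous output, which is why it only fits the incremental-action scheme described above.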

>   4. Lazy cache evaluation i.e. immediately return cached artifact and then update it next time it's used.

This one I'm a bit more hesitant on. My concern though isn't around the caching, but rather around the action execution management. Action executions currently are clearly tied to the lifetime of a single command, i.e. they are executed as part of that command, need to finish before the command can finish, and are cancelled if the command is cancelled. What you're suggesting seems like it would be a deviation from that, which I think is probably hard to do correctly, both in principle and in practice.

@thoughtpolice
Contributor

I've thought about the fuzzing thing a number of times, and I sort of came to the conclusion that you probably want to fix the seeds in your fuzzing tests and try to have a reasonable number of them if you expect them to run under buck2 test or whatnot. Actual major-scale runs of fuzzing, e.g. with ClusterFuzz, should probably be done by deploying some other kind of artifact (e.g. an OCI image to be deployed and probed).

But "Volatile actions" are also really useful for a lot of other random things where a program may need to invoke some kind of ambient side effect on the system, which can actually be used to improve the precision of dependency tracking. When combined with early cut-off, a lot of the time they aren't so bad, like this example:

> You could define a volatile action that prints the rustc version into a file, and then add that as a never-read input to every rustc action.

This is actually a great example that I used to do all the time when using Shake (through a feature called "Oracles.") I think it's really important for some cases. For example, let's say a user builds a project with CC=gcc as the compiler, it's just picked up off $PATH. Then they do a global system upgrade to their whole system, getting a new C compiler. If the user then enters the project and tries to build, it won't rebuild anything, because nothing seems to have changed; the build system can't track anything more than the fact it invokes "$CC" to compile objects, and as far as it can tell that command still exists just fine (probably /usr/bin/gcc, so even the path doesn't tell you anything), so there's nothing left to do.

In C or C++, this kind of mistake isn't so bad, because they have de-facto stabilized ABIs. This exact case can happen today in Buck2 with system_rust_toolchain and system_cxx_toolchain; but in the case of Rust, this error could cause catastrophic and hard-to-understand build failures. You can upgrade rustc, add a new library, run buck2 build, and now you might end up with rlibs that were cached and compiled previously with an old compiler, and rlibs that were compiled freshly with the upgraded compiler, and you will be lucky if the linker just explodes on them.
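As a rough sketch of what such a version-recording action could do, here is a Python illustration. `record_tool_version` is a hypothetical helper, not a buck2 or Shake API: it runs a command such as `["rustc", "--version"]`, stores the output in a stamp file, and reports whether the stamp changed, which is the signal early cut-off relies on.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def record_tool_version(command: Sequence[str], stamp: Path) -> bool:
    """Run `command` and store its output in `stamp`.

    Returns True if the stamp changed (dependents must rebuild) and
    False if the output is identical to the stored one (early cut-off).
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    version = result.stdout + result.stderr   # some tools print versions to stderr
    if stamp.exists() and stamp.read_text() == version:
        return False
    stamp.write_text(version)
    return True
```

Because the action itself is cheap, rerunning it on every command costs little, and the expensive compile actions only rerun when the recorded version actually differs.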


4 participants