Full rebuild after killing daemon #547

zjturner · 2024-01-24T21:54:13Z

If I do buck2 build //... then buck2 kill then buck2 build //... it does a full rebuild every single time. I have the following in my buckconfig.

[buck2]
materializations = deferred
sqlite_materializer_state = true
defer_write_actions = true
hash_all_commands = true
restarter = true

This seems like a bug. Is this expected behavior?


JakobDegen · 2024-02-08T02:09:16Z

Yes, this is expected behavior. The only caching that is available across daemon instances is whatever is provided by the RE action cache.

We've considered improvements here, typically under names like "dice rehydration", which would preserve daemon state across restarts. Nothing that's on any near-future roadmap though, unfortunately.

zjturner · 2024-02-08T19:19:17Z

to be clear, is there no concept of a local action cache?

JakobDegen · 2024-02-09T04:35:27Z

There is not, no, although this is an interesting alternative to dice rehydration that I hadn't considered. I think @scottcao knows some things about this, maybe he can weigh in

zjturner · 2024-02-09T04:56:52Z

I was under the impression that hash_all_commands basically enabled the action cache and that it would have benefits even in local build scenarios. Did I misunderstand?

JakobDegen · 2024-02-10T02:52:12Z

> I was under the impression that hash_all_commands basically enabled the action cache and that it would have benefits even in local build scenarios.

It does, but not across daemons.

Buck2 always has at least two strategies by which it avoids unnecessarily rerunning actions. The first is a remote action cache, which isn't relevant here. The second is DICE. DICE backs the implementation of most of buck2 core and is the reason that running the same command twice doesn't cause it to do all the work twice - it prevents not just unnecessarily rerunning actions, but also unnecessarily reevaluating buck files and things like that.

DICE, however, is not a cache, but rather an incrementality engine. It's primarily concerned with detecting what did or didn't change since the last build. As a result, if an action was already run five commands ago but has been invalidated since then, DICE won't detect that it can skip recomputing it. Similarly, if two actions "swap places" in the build graph, DICE won't detect that it can skip rerunning them either.
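To make the distinction concrete, here is a toy sketch of the two behaviors. This is not how DICE or buck2 are implemented; `LastRunTracker`, `ContentCache`, and `run_action` are hypothetical names made up for illustration. The tracker only remembers the most recent command seen at each node of the graph, while the cache is keyed by a hash of the command itself.

```python
import hashlib


def run_action(command):
    # Stand-in for actually executing a build action.
    return command.upper()


class LastRunTracker:
    """Toy incrementality engine: remembers only the latest command per node."""

    def __init__(self):
        self.last_command = {}
        self.last_output = {}
        self.runs = 0

    def build(self, node, command):
        # Skip work only if this node saw the same command last time.
        if self.last_command.get(node) == command:
            return self.last_output[node]
        self.runs += 1
        output = run_action(command)
        self.last_command[node] = command
        self.last_output[node] = output
        return output


class ContentCache:
    """Toy action cache: keyed by a hash of the command, not by graph position."""

    def __init__(self):
        self.entries = {}
        self.runs = 0

    def build(self, node, command):
        key = hashlib.sha256(command.encode()).hexdigest()
        if key not in self.entries:
            self.runs += 1
            self.entries[key] = run_action(command)
        return self.entries[key]
```

With this sketch, building node `n` with commands `a`, `b`, `a` in sequence runs three actions under the tracker but only two under the cache, and swapping the commands of two nodes reruns both under the tracker but neither under the cache.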

My understanding of hash_all_commands is that it activates a local action cache that can additionally avoid redoing work in the cases that DICE misses because of the above restrictions. However, that cache is still limited to the lifetime of a single daemon.

zjturner · 2024-02-10T19:21:10Z

Just curious, what does DICE stand for? I keep seeing it and wondering.

In any case, I know buck1 had a dir cache, and the topic of dir cache has come up in the context of buck2 several times in the past. What's the current status of that? It seems like, if dir cache were implemented in buck1, would that also include action cache? And have there been any updates about implementing it on Meta's side?

Hypothetically speaking, let's say someone outside of Meta wanted to implement dir cache and upstream the work. Is it feasible without having access to internal Meta systems?

JakobDegen · 2024-02-12T23:41:21Z

> Just curious, what does DICE stand for? I keep seeing it and wondering.

Distributed incremental computation engine. The "distributed" is ehhh... "future work"

> What's the current status of that? It seems like, if dir cache were implemented in buck1, would that also include action cache?

I don't actually know what the dir cache in buck1 did, so I may have to get back to you on that.

Internally, I don't think we currently have any concrete plans for additional local caches - remote caches are working well for us.

> Hypothetically speaking, let's say someone outside of Meta wanted to implement dir cache and upstream the work. Is it feasible without having access to internal Meta systems?

Yes, certainly; most of the internal-only code is around RE and this should basically work separately. We'd also be very happy to generally review design proposals or anything like that.

But even if that weren't the case - if there is some change that generally makes sense for OSS but we have no use for internally, feel free to send a PR anyway, I have no problem #[cfg]ing it out in our builds

scottcao · 2024-02-14T22:04:43Z

> Hypothetically speaking, let's say someone outside of Meta wanted to implement dir cache and upstream the work. Is it feasible without having access to internal Meta systems?

This is possible by putting the dep files used for hash all commands on disk similar to how we store materializer state on disk, and that work can be done externally. However, there is some risk in that we currently have remote dep files as a separate feature that's implemented but not tested widely or rolled out internally, and making remote dep files work everywhere may require changing how existing dep file logic works.
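As a rough illustration of the general idea of keeping cache state on disk so that it outlives a daemon, here is a minimal sketch using SQLite. It is a simplified stand-in, not buck2's dep file logic or its materializer state format; `DiskActionCache`, `build`, `demo`, the table layout, and the digest scheme are all assumptions made up for this example.

```python
import hashlib
import os
import sqlite3
import tempfile


class DiskActionCache:
    """Hypothetical on-disk cache keyed by a digest of command plus inputs."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS actions "
            "(digest TEXT PRIMARY KEY, output TEXT NOT NULL)"
        )
        self.db.commit()

    @staticmethod
    def digest(command, inputs):
        # inputs maps file name -> file contents (bytes).
        h = hashlib.sha256()
        h.update(command.encode())
        for name in sorted(inputs):
            h.update(b"\0" + name.encode() + b"\0" + inputs[name])
        return h.hexdigest()

    def get(self, digest):
        row = self.db.execute(
            "SELECT output FROM actions WHERE digest = ?", (digest,)
        ).fetchone()
        return row[0] if row else None

    def put(self, digest, output):
        self.db.execute(
            "INSERT OR REPLACE INTO actions VALUES (?, ?)", (digest, output)
        )
        self.db.commit()

    def close(self):
        self.db.close()


def build(cache, command, inputs, counter):
    key = DiskActionCache.digest(command, inputs)
    cached = cache.get(key)
    if cached is not None:
        return cached
    counter["runs"] += 1  # the action really executes only on a miss
    output = f"ran {command}"
    cache.put(key, output)
    return output


def demo():
    counter = {"runs": 0}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "action_cache.db")
        inputs = {"main.c": b"int main() { return 0; }"}
        first = DiskActionCache(path)
        build(first, "cc main.c", inputs, counter)
        first.close()  # stands in for killing the daemon
        second = DiskActionCache(path)  # a fresh "daemon" reopens the same file
        build(second, "cc main.c", inputs, counter)
        second.close()
    return counter["runs"]
```

In this sketch `demo()` executes the action once, because the second instance finds the entry written by the first. A real implementation would also need to handle output materialization, invalidation, and the dep file concerns mentioned above.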
