Full rebuild after killing daemon #547

Open
zjturner opened this issue Jan 24, 2024 · 8 comments

@zjturner
Contributor

zjturner commented Jan 24, 2024

If I run buck2 build //..., then buck2 kill, then buck2 build //... again, it does a full rebuild every single time. I have the following in my buckconfig:

[buck2]
materializations = deferred
sqlite_materializer_state = true
defer_write_actions = true
hash_all_commands = true
restarter = true

This seems like a bug. Is this expected behavior?

@JakobDegen
Contributor

Yes, this is expected behavior. The only caching that is available across daemon instances is whatever is provided by the RE action cache.

We've considered improvements here, typically under names like "dice rehydration", which would preserve daemon state across restarts. Nothing is on any near-future roadmap, though, unfortunately.

@zjturner
Contributor Author

zjturner commented Feb 8, 2024

To be clear, is there no concept of a local action cache?

@JakobDegen
Contributor

There is not, no, although this is an interesting alternative to dice rehydration that I hadn't considered. I think @scottcao knows some things about this; maybe he can weigh in.

@zjturner
Contributor Author

zjturner commented Feb 9, 2024

I was under the impression that hash_all_commands basically enabled the action cache and that it would have benefits even in local build scenarios. Did I misunderstand?

@JakobDegen
Contributor

> I was under the impression that hash_all_commands basically enabled the action cache and that it would have benefits even in local build scenarios.

It does, but not across daemons.

Buck2 always has at least two strategies by which it avoids unnecessarily rerunning actions. The first is a remote action cache, which isn't relevant here. The second is DICE. DICE backs the implementation of most of buck2 core and is the reason that running the same command twice doesn't cause it to do all the work twice - it prevents not just unnecessarily rerunning actions, but also unnecessarily reevaluating buck files and things like that.

DICE, however, is not a cache, but rather an incrementality engine. It's primarily concerned with detecting what did or didn't change since the last build. As a result, if an action was already run five commands ago but has been invalidated since then, DICE won't detect that it can skip recomputing it. Similarly, if two actions "swap places" in the build graph, DICE won't detect that it can avoid rerunning them either.

My understanding of hash_all_commands is that it activates a local action cache that can additionally avoid redoing work in the cases DICE misses because of the above restrictions. However, that cache is still limited to the lifetime of a single daemon.
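
To make the distinction above concrete, here is a toy sketch in Python. It is not Buck2 or DICE code, and every name in it (IncrementalEngine, ActionCache, compile_action, the //:a and //:b targets) is hypothetical. It only illustrates why an engine that compares against the previous build reruns work in the "edit then revert" and "swap places" cases, while a cache keyed by a digest of the inputs does not.

```python
import hashlib


class IncrementalEngine:
    """Toy incrementality engine: per node, remembers only the latest inputs and result."""

    def __init__(self):
        self.last = {}  # node name -> (inputs, result) from the most recent build
        self.runs = 0   # number of times an action was actually executed

    def build(self, node, inputs, action):
        prev = self.last.get(node)
        if prev is not None and prev[0] == inputs:
            return prev[1]  # unchanged since the last build, so skip
        self.runs += 1
        result = action(inputs)
        self.last[node] = (inputs, result)  # older results for this node are forgotten
        return result


class ActionCache:
    """Toy action cache: keyed by a digest of the inputs, not by node or build order."""

    def __init__(self):
        self.entries = {}  # digest -> result
        self.runs = 0

    def build(self, node, inputs, action):
        key = hashlib.sha256(repr(inputs).encode()).hexdigest()
        if key in self.entries:
            return self.entries[key]  # seen these exact inputs at some point before
        self.runs += 1
        result = action(inputs)
        self.entries[key] = result
        return result


def compile_action(inputs):
    # Stand-in for running a real command.
    return "obj(" + ",".join(inputs) + ")"


def demo_revert():
    """Edit a file and then revert it: v1 -> v2 -> v1."""
    inc, cache = IncrementalEngine(), ActionCache()
    for inputs in [("a.c@v1",), ("a.c@v2",), ("a.c@v1",)]:
        inc.build("//:a", inputs, compile_action)
        cache.build("//:a", inputs, compile_action)
    return inc.runs, cache.runs


def demo_swap():
    """Two nodes exchange their inputs between builds."""
    inc, cache = IncrementalEngine(), ActionCache()
    builds = [
        {"//:a": ("x.c",), "//:b": ("y.c",)},
        {"//:a": ("y.c",), "//:b": ("x.c",)},
    ]
    for graph in builds:
        for node, inputs in graph.items():
            inc.build(node, inputs, compile_action)
            cache.build(node, inputs, compile_action)
    return inc.runs, cache.runs
```

In this sketch the incremental engine executes the action on every step of the revert scenario, while the digest-keyed cache recognizes the reverted inputs. Both objects live in memory only, which mirrors the point that such a cache does not outlive the daemon.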

@zjturner
Contributor Author

zjturner commented Feb 10, 2024

Just curious, what does DICE stand for? I keep seeing it and wondering.

In any case, I know buck1 had a dir cache, and the topic of dir cache has come up in the context of buck2 several times in the past. What's the current status of that? It seems like, if dir cache were implemented in buck1, would that also include action cache? And have there been any updates about implementing it on Meta's side?

Hypothetically speaking, let's say someone outside of Meta wanted to implement dir cache and upstream the work. Is it feasible without having access to internal Meta systems?

@JakobDegen
Contributor

> Just curious, what does DICE stand for? I keep seeing it and wondering.

Distributed incremental computation engine. The "distributed" is ehhh... "future work"

> What's the current status of that? It seems like, if dir cache were implemented in buck1, would that also include action cache?

I don't actually know what the dir cache in buck1 did, so I may have to get back to you on that.

Internally, I don't think we currently have any concrete plans for additional local caches - remote caches are working well for us.

> Hypothetically speaking, let's say someone outside of Meta wanted to implement dir cache and upstream the work. Is it feasible without having access to internal Meta systems?

Yes, certainly; most of the internal-only code is around RE and this should basically work separately. We'd also be very happy to generally review design proposals or anything like that.

But even if that weren't the case - if there is some change that generally makes sense for OSS but that we have no use for internally, feel free to send a PR anyway. I have no problem #[cfg]ing it out in our builds.

@scottcao
Contributor

> Hypothetically speaking, let's say someone outside of Meta wanted to implement dir cache and upstream the work. Is it feasible without having access to internal Meta systems?

This is possible by putting the dep files used for hash_all_commands on disk, similar to how we store materializer state on disk, and that work can be done externally. However, there is some risk: we currently have remote dep files as a separate feature that's implemented but not widely tested or rolled out internally, and making remote dep files work everywhere may require changing how the existing dep file logic works.
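
As a rough illustration of the idea of keeping this state on disk, here is a minimal sketch in Python, assuming a simple digest-to-output table in SQLite. It does not reflect Buck2's dep file format or its materializer schema; DiskActionCache, the actions table, and the "cc" command string are all hypothetical names chosen for the example.

```python
import hashlib
import os
import sqlite3
import tempfile


class DiskActionCache:
    """Hypothetical action cache persisted in SQLite so it survives a process restart."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS actions (digest TEXT PRIMARY KEY, output TEXT)"
        )
        self.db.commit()
        self.runs = 0  # actions executed by this instance (cache misses)

    def run(self, command, inputs, action):
        # Key on a digest of the command and its inputs.
        digest = hashlib.sha256(repr((command, inputs)).encode()).hexdigest()
        row = self.db.execute(
            "SELECT output FROM actions WHERE digest = ?", (digest,)
        ).fetchone()
        if row is not None:
            return row[0]  # hit: result was recorded by this or an earlier instance
        self.runs += 1
        output = action(inputs)
        self.db.execute("INSERT INTO actions VALUES (?, ?)", (digest, output))
        self.db.commit()
        return output

    def close(self):
        self.db.close()


def simulate_restart():
    """Build, drop the in-memory instance (analogous to killing the daemon), build again."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "action_cache.db")

        first = DiskActionCache(path)
        first.run("cc", ("a.c@v1",), lambda inputs: "obj")
        first_runs = first.runs
        first.close()

        second = DiskActionCache(path)  # fresh instance, same on-disk state
        output = second.run("cc", ("a.c@v1",), lambda inputs: "obj")
        second_runs = second.runs
        second.close()

        return first_runs, second_runs, output
```

The second instance finds the recorded digest and executes nothing. A real implementation would also need to handle output materialization and invalidation, which this sketch leaves out.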
