Non-deterministic multi-method route-guard codegen breaks build caching on content-addressed builders #4091
Unanswered
pradeepc017
asked this question in
Ideas
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
@JohnTitor @robjtede
PR #4090 tried to fix this but didn't explain why it matters, which is on me. Before opening another PR, would a small change that sorts the methods before emitting the guard chain be welcome? It's behavior-neutral (the chain is a logical OR), no API change, and makes the output deterministic.
When a route is registered with more than one method, the codegen builds the method-guard chain by iterating a HashSet. HashSet iteration order is randomized per process, so the generated code is emitted in a different order on every compile. Nothing breaks and compilation succeeds, but the produced artifact is not byte-stable from one build to the next.
This is invisible under a normal cargo workflow. Cargo does incremental and partial compilation, and its caching is not keyed on a content hash of the final artifact, so a cosmetic reordering inside a single crate carries no real cost. That is almost certainly why #4090 read as unnecessary.
The cost shows up on content-addressed build systems: nix with crate2nix and Buck2. These compile each crate as a single unit and cache it by hashing the output. When a proc-macro emits different bytes on each build, the crate's content hash changes, the cache misses, and that crate together with everything downstream rebuilds from source. On a large workspace a single non-deterministic crate deep in the dependency graph can turn a few-second cache hit into a multi-hour
from-source rebuild, on every CI run and every contributor's machine, because the miss cascades. Since actix-web is a widely used dependency, one non-deterministic expansion in it poisons the cache for many crates downstream.
I have made an example repo here https://github.com/pchikku/actix-determinism-repro. Readme has instructions to run (check.sh)
The change is small. Collect the methods into a Vec and sort them before emitting the guard chain, rather than iterating the HashSet directly. Sorting by the HTTP method name also makes the expanded code easier to read. The guard chain is a logical OR over the methods, so order has never affected behaviour, there is no public API change, and sorting a handful of items at compile time has no measurable cost. Only routes registered with more than one method are affected, since single-method routes already emit a single guard.
if you prefer, a regression test asserting stable expansion, a different sort key, or splitting out the Ord implementation can each be added on request.
Beta Was this translation helpful? Give feedback.
All reactions