stage2: introduce CacheMode #10451

andrewrk · 2021-12-30T03:57:53Z

The two CacheMode values are whole and incremental.
incremental is what we had before; whole is new.
Whole cache mode uses everything as inputs to the cache hash;
and when a hit occurs it skips everything including linking.
This is ideal for when source files change rarely and for backends that
do not have good incremental compilation support, for example
compiler-rt or libc compiled with LLVM with optimizations on.
This is the main motivation for the additional mode, so that we can have
LLVM-optimized compiler-rt/libc builds, without waiting for the LLVM
backend every single time Zig is invoked.

Incremental cache mode hashes only the input file path and a few target
options, intentionally relying on collisions to locate already-existing
build artifacts which can then be incrementally updated.

The bespoke logic for caching stage1 backend build artifacts
is removed since we now have a global caching mechanism for
when we want to cache the entire compilation, including linking.
Previously we had to get "creative" with libs.txt and a special
byte in the hash id to communicate flags, so that when the cached
artifacts were re-linked, we had this information from stage1
even though we didn't actually run it. Now that CacheMode.whole
includes linking, this extra information does not need to be
preserved for cache hits. So although this changeset introduces
complexity, it also removes complexity.

The main trickiness here comes from the inherent differences between the
two modes: incremental wants a directory immediately to operate on,
while whole doesn't know the output directory until the compilation is
complete. This commit deals with this problem mostly inside update(),
where, on a cache miss, it replaces zig_cache_artifact_directory with a
temporary directory, and then renames it into place once the compilation is
complete.

Items remaining before this branch can be merged:

Some preliminary performance measurements of wall clock time and
peak RSS used:

stage1 behavior (1077 tests), llvm backend, release build:

cold global cache: 4.6s, 1.1 GiB
warm global cache: 3.4s, 980 MiB

stage2 master branch behavior (575 tests), llvm backend, release build:

cold global cache: 0.62s, 191 MiB
warm global cache: 0.40s, 128 MiB

stage2 this branch behavior (575 tests), llvm backend, release build:

cold global cache: 0.62s, 179 MiB
warm global cache: 0.27s, 90 MiB

andrewrk · 2021-12-31T23:01:39Z

The macOS failure looks like this:

Test 1/13 stack-trace return (Debug)...
========= Expected this output: =========
error: TheSkyIsFalling
source.zig:2:5: [address] in main (test)
    return error.TheSkyIsFalling;
    ^

================================================
error: TheSkyIsFalling
???:?:?: 0x107d80082 in _main.0 (???)

error: TestFailed

It can be reproduced with this test.zig file:

pub fn main() !void {
    return error.TheSkyIsFalling;
}

and running this command:

./zig build-exe test.zig --enable-cache
# then run the test binary from the printed directory

The problem is this code here:

zig/lib/std/debug.zig

Line 1424 in 7c1e17a

const o_file_path = mem.sliceTo(self.strings[symbol.ofile..], 0);

The o_file_path that it gets from the binary contains something like zig-cache/tmp/[random]/test.o but with these new CacheMode changes, in whole mode, that directory gets renamed to zig-cache/o/[digest] after linking. So the o_file_path has a wrong absolute path.

I'm not quite sure how the object file paths get set in the binary. Some different ideas to address this problem:

After renaming the tmp dir to the final cache dir, tell the linker to update the object file paths in the binary.
We can know the final cache dir before calling flush(); perhaps the file paths can be delayed to be written until flush().
Adjust the object file paths to be relative to the binary rather than absolute.
Leave the compilation/linking alone, and introduce logic in our stack trace dumping code to notice when a binary has been moved, and use the new file system destination as a "patch" to adjust object file paths. The downside here is that this only works for zig stack traces, but we want gdb/lldb and other system tools to work as well. So arguably we should explore the other ideas first.

One thing that would help unblock me here would be to have a chat with @kubkon about where exactly is the logic that decides how those object file paths get computed and populated. But that will have to wait until holiday festivities are over 🙂 Happy New Year! 🥳

@embedfile

The two CacheMode values are `whole` and `incremental`. `incremental` is what we had before; `whole` is new. Whole cache mode uses everything as inputs to the cache hash; and when a hit occurs it skips everything including linking. This is ideal for when source files change rarely and for backends that do not have good incremental compilation support, for example compiler-rt or libc compiled with LLVM with optimizations on. This is the main motivation for the additional mode, so that we can have LLVM-optimized compiler-rt/libc builds, without waiting for the LLVM backend every single time Zig is invoked. Incremental cache mode hashes only the input file path and a few target options, intentionally relying on collisions to locate already-existing build artifacts which can then be incrementally updated. The bespoke logic for caching stage1 backend build artifacts is removed since we now have a global caching mechanism for when we want to cache the entire compilation, *including* linking. Previously we had to get "creative" with libs.txt and a special byte in the hash id to communicate flags, so that when the cached artifacts were re-linked, we had this information from stage1 even though we didn't actually run it. Now that `CacheMode.whole` includes linking, this extra information does not need to be preserved for cache hits. So although this changeset introduces complexity, it also removes complexity. The main trickiness here comes from the inherent differences between the two modes: `incremental` wants a directory immediately to operate on, while `whole` doesn't know the output directory until the compilation is complete. This commit deals with this problem mostly inside `update()`, where, on a cache miss, it replaces `zig_cache_artifact_directory` with a temporary directory, and then renames it into place once the compilation is complete. Items remaining before this branch can be merged: * [ ] make sure these things make it into the cache manifest: - @import files - @embedfile files - we already add dep files from c but make sure the main .c files make it in there too, not just the included files * [ ] double check that the emit paths of other things besides the binary are working correctly. * [ ] test `-fno-emit-bin` + `-fstage1` * [ ] test `-femit-bin=foo` + `-fstage1` * [ ] implib emit directory copies bin_file_emit directory in create() and needs to be adjusted to be overridden as well. * [ ] make sure emit-h is handled correctly in the cache hash * [ ] Cache: detect duplicate files added to the manifest Some preliminary performance measurements of wall clock time and peak RSS used: stage1 behavior (1077 tests), llvm backend, release build: * cold global cache: 4.6s, 1.1 GiB * warm global cache: 3.4s, 980 MiB stage2 master branch behavior (575 tests), llvm backend, release build: * cold global cache: 0.62s, 191 MiB * warm global cache: 0.40s, 128 MiB stage2 this branch behavior (575 tests), llvm backend, release build: * cold global cache: 0.62s, 179 MiB * warm global cache: 0.27s, 90 MiB

* Logic to check whether a bin file is not emitted is more complicated in between `Compilation.create` and `Compilation.update`. Fixed the logic that decides whether to build compiler-rt and other support artifacts. * Basically, one cannot inspect the value of `comp.bin_file.emit` until after update() is called - fixed another instance of this happening in the CLI. * In the CLI, `runOrTest` is updated to properly use the result value of `comp.bin_file.options.emit` rather than guessing whether the output binary is. * Don't assume that the emit output has no directory components in sub_path. In other words, don't assume that the emit directory is the final directory; there may be sub-directories.

when using `CacheMode.whole`. Also, I verified that `addDepFilePost` is in fact including the original C source file in addition to the files it depends on.

This fixes a regression in this branch that can be reproduced with the following steps: 1. `zig build-exe hello.zig` 2. delete the "hello" binary 3. `zig build-exe hello.zig` 4. observe that the "hello" binary is missing This happened because it was a cache hit, but nothing got copied to the output directory. This commit sets CacheMode to incremental - even for stage1 - when the CLI requests `disable_lld_caching` (this option should be renamed), resulting in the main Compilation to be repeated (uncached) for stage1, populating the binary into the cwd as expected. For stage2 the result is even better: the incremental compilation system will look for build artifacts to incrementally compile, and start fresh if not found.

libtsan, libunwind, and libcxxabi.

Previously the code asserted source files were already loaded, but this is not the case when cached ZIR is loaded. Now it will trigger .zig source code to be loaded for the purposes of hashing the source for `CacheMode.whole`. This additionally refactors stat_size, stat_inode, and stat_mtime fields into using the `Cache.File.Stat` struct.

Need to mark it as "needing cleanup" a bit earlier in the function.

Instead of juggling GPA-allocated sub_path (and ultimately dropping the ball, in this analogy), `Compilation.create` allocates an already-exactly-correct size `sub_path` that has the digest unpopulated. This is then overwritten in place as necessary and used as the `emit_bin.sub_path` value, and no allocations/frees are performed for this file path.

I accidentally used errdefer instead of defer for the function-local arena.

Broken by the introduction of `CacheMode.whole`, they are now working again by using the same mechanism as emitted binary files.

This improves readability as well as compatibility with stage2. Most of compiler-rt is now enabled for stage2 with just a few functions disabled (until stage2 passes more behavior tests).

This allows stage2 to build more of compiler-rt. I also changed `-%` to `-` for comptime ints in the div and mul implementations of compiler-rt. This is clearer code and also happens to work around a bug in stage2.

The semantics of this function are that it moves both files and directories. Previously we had this `is_dir` boolean field of `std.os.windows.OpenFile` which required the API user to choose: are we opening a file or directory? And the other kind would either cause error.IsDir or error.NotDir. But that is not a limitation of the Windows file system API; it was self-imposed. On Windows, rename is implemented internally with `NtCreateFile` so we need to allow it to open either files or directories. This is now done by `std.os.windows.OpenFile` accepting enum{file_only,dir_only,any} instead of a boolean.

Without this, Zig would re-use a compiler-rt built with stage2 when one built by stage1 was needed.

Doc comments reproduced here: This function is called by the frontend before flush(). It communicates that `options.bin_file.emit` directory needs to be renamed from `[zig-cache]/tmp/[random]` to `[zig-cache]/o/[digest]`. The frontend would like to simply perform a file system rename, however, some linker backends care about the file paths of the objects they are linking. So this function call tells linker backends to rename the paths of object files to observe the new directory path. Linker backends which do not have this requirement can fall back to the simple implementation at the bottom of this function. This function is only called when CacheMode is `whole`. This solves stack trace regressions on Windows and macOS because the linker backends do not observe object file paths until flush().

andrewrk force-pushed the cache-mode branch 4 times, most recently from 10eb5f4 to 556906d Compare December 31, 2021 06:19

andrewrk force-pushed the cache-mode branch from 263d1d7 to 398ce37 Compare January 2, 2022 02:50

andrewrk added 12 commits January 2, 2022 13:16

stage2: add @import and @embedFile to CacheHash

e367181

when using `CacheMode.whole`. Also, I verified that `addDepFilePost` is in fact including the original C source file in addition to the files it depends on.

stage2: fix path to cache artifacts in libcxx,

dbd0a2c

libtsan, libunwind, and libcxxabi.

stage2: fix not calling deinit() on whole_cache_manifest

208a6c7

Need to mark it as "needing cleanup" a bit earlier in the function.

stage2: fix memory leak in addNonIncrementalStuffToCacheManifest

013c286

I accidentally used errdefer instead of defer for the function-local arena.

stage2: fix implibs

55243cf

Broken by the introduction of `CacheMode.whole`, they are now working again by using the same mechanism as emitted binary files.

stage2: fix UAF of system_libs

9dc25dd

compiler-rt: simplify implementations

6b14c58

This improves readability as well as compatibility with stage2. Most of compiler-rt is now enabled for stage2 with just a few functions disabled (until stage2 passes more behavior tests).

andrewrk force-pushed the cache-mode branch from 398ce37 to 6b14c58 Compare January 2, 2022 20:16

andrewrk added 4 commits January 2, 2022 14:11

Sema: implement peer type resolution of signed and unsigned ints

b4d6e85

This allows stage2 to build more of compiler-rt. I also changed `-%` to `-` for comptime ints in the div and mul implementations of compiler-rt. This is clearer code and also happens to work around a bug in stage2.

compiler-rt: small refactor in atomics

5e086b2

stage2: fix missing items from whole cache mode hash

663ffa5

Without this, Zig would re-use a compiler-rt built with stage2 when one built by stage1 was needed.

andrewrk force-pushed the cache-mode branch from 2a5c28f to c4a981d Compare January 3, 2022 04:21

andrewrk force-pushed the cache-mode branch from c4a981d to d94303b Compare January 3, 2022 21:50

andrewrk merged commit 81fa31c into master Jan 3, 2022

andrewrk deleted the cache-mode branch January 3, 2022 21:50

andrewrk mentioned this pull request Mar 8, 2022

cache is not invalidated for #include'd files #11063

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

stage2: introduce CacheMode #10451

stage2: introduce CacheMode #10451

Uh oh!

andrewrk commented Dec 30, 2021 •

edited

Loading

Uh oh!

andrewrk commented Dec 31, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

stage2: introduce CacheMode #10451

stage2: introduce CacheMode #10451

Uh oh!

Conversation

andrewrk commented Dec 30, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andrewrk commented Dec 31, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

andrewrk commented Dec 30, 2021 •

edited

Loading