Skip to content

Conversation

@mlugg
Copy link
Member

@mlugg mlugg commented Jan 4, 2025

Some minor bugfixes, and one very big fix: changing the panic handler no longer makes the compiler explode!

To do this, some new AnalUnits are introduced which "wrap" the analysis of multiple things in e.g. std.builtin. That way, dependencies on things like the panic handler are represented with a single edge, avoiding dependency bloat and allowing memoization to continue to work.

The improved memoization here actually slightly improves the performance of semantic analysis even without -fincremental:

Analyze std Tests

Benchmark 1 (3 runs): /home/mlugg/zig/true-master/stage4/bin/zig test lib/std/std.zig --zig-lib-dir lib -fno-emit-bin
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          10.3s  ± 13.4ms    10.2s  … 10.3s           0 ( 0%)        0%
  peak_rss            594MB ± 86.0KB     594MB …  594MB          0 ( 0%)        0%
  cpu_cycles         44.5G  ± 78.2M     44.4G  … 44.6G           0 ( 0%)        0%
  instructions       87.8G  ± 7.05K     87.8G  … 87.8G           0 ( 0%)        0%
  cache_references   3.34G  ± 4.65M     3.34G  … 3.34G           0 ( 0%)        0%
  cache_misses       90.4M  ± 1.25M     89.5M  … 91.8M           0 ( 0%)        0%
  branch_misses      65.0M  ±  712K     64.2M  … 65.6M           0 ( 0%)        0%
Benchmark 2 (3 runs): /home/mlugg/zig/incremental/stage4/bin/zig test lib/std/std.zig --zig-lib-dir lib -fno-emit-bin
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          10.0s  ± 24.0ms    10.00s  … 10.0s          0 ( 0%)        ⚡-  2.4% ±  0.4%
  peak_rss            595MB ± 90.7KB     594MB …  595MB          0 ( 0%)          +  0.1% ±  0.0%
  cpu_cycles         43.4G  ± 87.0M     43.3G  … 43.5G           0 ( 0%)        ⚡-  2.5% ±  0.4%
  instructions       84.6G  ± 27.8K     84.6G  … 84.6G           0 ( 0%)        ⚡-  3.6% ±  0.0%
  cache_references   3.20G  ± 6.93M     3.19G  … 3.21G           0 ( 0%)        ⚡-  4.1% ±  0.4%
  cache_misses       82.9M  ± 1.15M     81.7M  … 83.9M           0 ( 0%)        ⚡-  8.2% ±  3.0%
  branch_misses      66.9M  ±  119K     66.8M  … 67.0M           0 ( 0%)        💩+  2.9% ±  1.8%

Analyze Compiler

Benchmark 1 (3 runs): /home/mlugg/zig/true-master/stage4/bin/zig build-exe -ODebug --dep aro --dep aro_translate_c --dep build_options -Mroot=src/main.zig -Maro=lib/compiler/aro/aro.zig --dep aro -Maro_translate_c=lib/compiler/aro_translate_c.zig -Mbuild_options=.zig-cache/c/68182ff639098a2e8e189aefff4644bc/options.zig -fno-emit-bin
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          7.69s  ± 52.0ms    7.64s  … 7.74s           0 ( 0%)        0%
  peak_rss            388MB ±  140KB     388MB …  388MB          0 ( 0%)        0%
  cpu_cycles         34.2G  ±  147M     34.0G  … 34.3G           0 ( 0%)        0%
  instructions       60.6G  ± 19.5K     60.6G  … 60.6G           0 ( 0%)        0%
  cache_references   2.84G  ± 16.1M     2.83G  … 2.86G           0 ( 0%)        0%
  cache_misses        140M  ± 1.76M      139M  …  142M           0 ( 0%)        0%
  branch_misses      91.0M  ±  429K     90.5M  … 91.3M           0 ( 0%)        0%
Benchmark 2 (3 runs): /home/mlugg/zig/incremental/stage4/bin/zig build-exe -ODebug --dep aro --dep aro_translate_c --dep build_options -Mroot=src/main.zig -Maro=lib/compiler/aro/aro.zig --dep aro -Maro_translate_c=lib/compiler/aro_translate_c.zig -Mbuild_options=.zig-cache/c/68182ff639098a2e8e189aefff4644bc/options.zig -fno-emit-bin
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          7.31s  ± 74.5ms    7.22s  … 7.35s           0 ( 0%)        ⚡-  5.0% ±  1.9%
  peak_rss            388MB ±  136KB     388MB …  388MB          0 ( 0%)          -  0.1% ±  0.1%
  cpu_cycles         32.2G  ±  225M     31.9G  … 32.3G           0 ( 0%)        ⚡-  5.8% ±  1.3%
  instructions       57.5G  ± 1.09M     57.5G  … 57.5G           0 ( 0%)        ⚡-  5.2% ±  0.0%
  cache_references   2.68G  ± 5.50M     2.67G  … 2.68G           0 ( 0%)        ⚡-  5.7% ±  1.0%
  cache_misses        126M  ± 1.32M      125M  …  127M           0 ( 0%)        ⚡- 10.5% ±  2.5%
  branch_misses      88.5M  ±  476K     87.9M  … 88.8M           0 ( 0%)        ⚡-  2.8% ±  1.1%

mlugg added 3 commits January 4, 2025 05:44
…e changes

`Zcu.PerThead.ensureTypeUpToDate` is set up in such a way that it only
returns the updated type the first time it is called. In general, that's
okay; however, the exception is that we want the function to continue
returning `error.AnalysisFail` when the type has been lost, or its
number of captures changed.

Therefore, the check for this case now happens before the up-to-date
success return.

For simplicity, the number of captures is now handled by intentionally
losing the instruction in `Zcu.mapOldZirToNew`, since there is nothing
to gain from tracking a type when old instances of it can never be
reused.
@mlugg mlugg force-pushed the incremental branch 2 times, most recently from 97d3ec5 to db3226a Compare January 4, 2025 07:39
mlugg added 2 commits January 4, 2025 07:51
This commit reworks how values like the panic handler function are
memoized during a compiler invocation. Previously, the value was
resolved by whichever analysis requested it first, and cached on `Zcu`.
This is problematic for incremental compilation, as after the initial
resolution, no dependencies are marked by users of this memoized state.
This is arguably acceptable for `std.builtin`, but it's definitely not
acceptable for the panic handler/messages, because those can be set by
the user (`std.builtin.Panic` checks `@import("root").Panic`).

So, here we introduce a new kind of `AnalUnit`, called `memoized_state`.
There are 3 such units:
* `.{ .memoized_state = .va_list }` resolves the type `std.builtin.VaList`
* `.{ .memoized_state = .panic }` resolves `std.Panic`
* `.{ .memoized_state = .main }` resolves everything else we want

These units essentially "bundle" the resolution of their corresponding
declarations, storing the results into fields on `Zcu`. This way, when,
for instance, a function wants to call the panic handler, it simply runs
`ensureMemoizedStateResolved`, registering one dependency, and pulls the
values from the `Zcu`. This "bundling" minimizes dependency edges. The 3
units are separated to allow them to act independently: for instance,
the panic handler can use `std.builtin.Type` without triggering a
dependency loop.
These cover the fixes from the last few commits.
Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@mlugg mlugg merged commit 136c5a9 into ziglang:master Jan 5, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants