
Investigate assess performance regression with MIR linker #1828

Open
tedinski opened this issue Nov 1, 2022 · 7 comments
Labels
[E] Performance: Track performance improvement (Time / Memory / CPU)

Comments

tedinski (Contributor) commented Nov 1, 2022

From #1816:

  1. Using the MIR linker significantly increases the "problem sizes" and verification times for my goto test crate (rand). We go from usually under 1k properties and ~10s verification times to 50k properties and 60s verification times for each test harness. I have not yet investigated why.
  2. Two tests from that crate go from unwind failures to unsupported_construct failures (try and inline asm). It's curious that this happened as a result (presumably) of linking the standard library. I think this should also be investigated.

This is a tracking issue for investigating these behavior changes.

@tedinski tedinski self-assigned this Nov 1, 2022
@tedinski tedinski added this to the Create Kani Assess tool milestone Nov 1, 2022
fzaiser (Contributor) commented Nov 1, 2022

May be related to #1818

tedinski (Contributor, Author) commented Nov 3, 2022

So far I believe I've found two issues:

  1. When using --reachability=pub_fns and --profile test, we end up with a single "root" symbol: e.g. _RNvCscWIzLuMkfbr_12assess_works4main (which I believe is the generated main for the test binary, and which presumably brings in the whole parallel test runner machinery)
  2. Even accounting for this, there still appear to be symbols included in the goto binary that are unreachable from this root. I haven't found a culprit yet, and am working out how to find them...

So:

  1. I'm surprised by main being the sole root of the reachability, since I thought we previously saw the test harnesses themselves get marked "public" so that each would be a root. I'm wondering if there's a tweak we can make here to restore that behavior for "pub_fns".
  2. Otherwise, I suspect "pub_fns" isn't useful here and we'll have to add a different reachability mode for "all harness types" (or something like that, to get both test and proof harnesses as roots)
  3. More investigation needed on that extra stuff that seems to be getting pulled in.
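One way to find that "extra stuff" is to do a plain reachability walk over the call graph and diff it against everything in the binary. The sketch below is a minimal, hypothetical illustration of that idea (the call-graph shape and symbol names are invented, not Kani's actual data structures):

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Given a call graph (caller -> callees), return the symbols that appear
/// somewhere in the graph but are not reachable from `root`.
fn unreachable_from(root: &str, calls: &HashMap<&str, Vec<&str>>) -> HashSet<String> {
    // Breadth-first walk from the root.
    let mut seen: HashSet<&str> = HashSet::new();
    let mut queue = VecDeque::from([root]);
    while let Some(sym) = queue.pop_front() {
        if seen.insert(sym) {
            if let Some(callees) = calls.get(sym) {
                queue.extend(callees.iter().copied());
            }
        }
    }
    // Every symbol mentioned anywhere in the graph.
    let all: HashSet<&str> = calls
        .iter()
        .flat_map(|(caller, callees)| std::iter::once(*caller).chain(callees.iter().copied()))
        .collect();
    all.difference(&seen).map(|s| s.to_string()).collect()
}

fn main() {
    // Toy graph: `orphan` is in the binary but has no path from main.
    let calls = HashMap::from([
        ("main", vec!["harness_a"]),
        ("harness_a", vec!["helper"]),
        ("orphan", vec!["helper"]),
    ]);
    println!("{:?}", unreachable_from("main", &calls)); // only "orphan"
}
```

Running something like this over the symbol table would surface candidates for the unreachable symbols mentioned above.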

celinval (Contributor) commented Nov 3, 2022

Interesting, here are a few thoughts:

  1. We explicitly bring in the main function as an entry point when using pub_fns. This was done to fix bookrunner, since we use --function main, and main might not be marked public (Fix MIR Linker handling of --function main #1775). I would also expect the legacy linker to bring in main, though. Do you know if that's the case?
  2. The pub_fns mode is still required for --function, and it might be useful for testing code generation of a crate, but it's possible that what you want is another option that pulls in the test harnesses instead. It would definitely be more efficient.
  3. I'm curious what is being brought in that is not reachable from the root. One possibility is the implementation of some traits: whenever we encounter a DST coercion of a concrete type into a trait object, we add to the reachability analysis all methods needed to generate the virtual table, even if those methods are never invoked.
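The vtable point in thought 3 can be seen in a small standalone example. Coercing a concrete type to `dyn Trait` requires a vtable entry for every trait method, so reachability must include methods that no caller ever invokes (the trait and type names below are illustrative):

```rust
trait Report {
    fn summary(&self) -> String;
    // Never called anywhere, but the unsizing coercion below still forces
    // this method into the vtable, so codegen must include it.
    fn never_called(&self) -> String {
        "unused, but in the vtable".to_string()
    }
}

struct Concrete;

impl Report for Concrete {
    fn summary(&self) -> String {
        "ok".to_string()
    }
}

fn main() {
    // The DST coercion happens here: &Concrete -> &dyn Report.
    // From this point on, any method of Report could be dispatched
    // dynamically, so all of them look reachable.
    let r: &dyn Report = &Concrete;
    println!("{}", r.summary()); // prints "ok"
}
```

This is why a single coercion in a dependency can pull in far more code than the harness appears to use.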

@tedinski tedinski added the [E] Performance Track performance improvement (Time / Memory / CPU) label Nov 14, 2022
tedinski (Contributor, Author) commented Jan 4, 2023

Current status: The addition of the test reachability mode back in November cut the performance problem by 50%, but we still see many goto binaries "blow up" in size.

Probable course of action:

  1. We can investigate to see if we can find a specific set of functions that cause this ballooning, and perhaps stub them out (even by default). We have a hypothesis that some of this size problem is caused by code that triggers reachability for the entire backtrace machinery in Rust. If we can prevent that, this would largely solve the problem where tiny, trivial harnesses balloon into 200+MB of code (and GBs of memory).
  2. Introduce project and stop merging all files generated by the compiler #1956 made cargo build all targets serially instead of in parallel. We could fix this, but in practice we're blocked on first reducing the memory consumption and ballooning problem above: if all we did was restore parallelism, we'd just run machines out of memory as they try to build 16 symtabs that each consume 10+GB of RAM.
  3. Longer term, we can generate goto binaries directly, instead of generating symtab json and translating them with symtab2gb.

There are a lot of alternative options here: for instance, we could try to optimize the memory consumption of symtab2gb. Or break up symtab generation into multiple files. Or we could try having a pre-built goto binary for the already monomorphic parts of std. But the above seems like the most compelling course of action.

celinval (Contributor) commented Jan 5, 2023

I think we stopped building in parallel before that. We did that so we could use cargo rustc.

tedinski (Contributor, Author) commented Jan 6, 2023

My mistake; I checked the history this time.

tedinski (Contributor, Author) commented Feb 2, 2023

This trivial harness:

use anyhow::*;

#[kani::proof]
fn proof_using_anyhow() {
    let _ = Ok(()).context("words");
}

Produces 286MB of symtab.json. With a few different hacky experiments in visit_fn in the reachability code:

  40M with: if pretty_name.contains("std::backtrace") { return }
  67M with: if pretty_name.contains("backtrace_rs::") { return }

So there's a pretty significant reduction in bloat there (220+MB eliminated). Unfortunately, I have not yet been able to identify any single function, or small set of functions, that would accomplish the same effect if stubbed. I think that because the backtrace gets passed around and potentially printed, a lot of this code looks reachable.
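For concreteness, the hacky experiments above amount to a name-based cutoff early in the reachability visitor. A minimal sketch of that idea, with invented names (this is not Kani's actual visit_fn API, just the shape of the filter):

```rust
/// Hypothetical deny-list of module prefixes known to drag in large
/// amounts of code; the two entries match the experiments above.
const DENY_LIST: &[&str] = &["std::backtrace", "backtrace_rs::"];

/// Returns true if reachability should stop at this function instead of
/// descending into its body. In the experiment this was an early `return`
/// at the top of the visitor.
fn should_skip(pretty_name: &str) -> bool {
    DENY_LIST.iter().any(|prefix| pretty_name.contains(prefix))
}

fn main() {
    assert!(should_skip("backtrace_rs::symbolize::resolve"));
    assert!(!should_skip("anyhow::Context::context"));
    println!("filter behaves as expected");
}
```

The drawback, as noted above, is that a substring filter is a blunt instrument: it can't distinguish backtrace code that is genuinely needed from code that only looks reachable.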

I tried applying the latter hack to an assess run, but I still saw pretty significant symtab2gb memory usage in a few crates, so that's probably where to investigate next (not an exhaustive list):

anyhow 12G
textwrap 13G
unicode-normalization 10G
regex 16G

@tedinski tedinski removed their assignment Feb 10, 2023