Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve HoarePO.equal and leq performance #803

Closed
jerhard opened this issue Jul 28, 2022 · 8 comments · Fixed by #809
Closed

Improve HoarePO.equal and leq performance #803

jerhard opened this issue Jul 28, 2022 · 8 comments · Fixed by #809
Assignees
Labels
performance time, memory
Milestone

Comments

@jerhard
Copy link
Member

jerhard commented Jul 28, 2022

Findings

@michael-schwarz and I did some runs of Goblint (master) on the SQLite amalgamation, and investigated performance using perf and the firefox-profiler. We looked at the high run time of (on my system) BaseDomain.fun_6129, which we conjectured to be BaseDomain.equal_basecomponents_t.

Further investigation with Stats.time on a ~1 minute run of SQLite showed that a significant amount of time is used in HoarePO.equal:

TOTAL                          55.185 s
  parse                           0.463 s
  convert to CIL                  0.568 s
  mergeCIL                        0.136 s
  analysis                       54.019 s
    createCFG                       0.836 s
      handle                          0.293 s
      iter_connect                    0.479 s
        computeSCCs                     0.217 s
    global_inits                    0.403 s
      HoareDomain.equal               0.001 s
    solving                        52.741 s
      S.Dom.equal                     0.199 s
      HoareDomain.equal              18.238 s
Timing used
Command line used for running Goblint on sqlite
/goblint ../sqlite-amalgamation-3370200/sqlite3.c ../sqlite-amalgamation-3370200/sqlite3.h ../sqlite-amalgamation-3370200/sqlite3ext.h ../sqlite-amalgamation-3370200/shell.c -v --set pre.cppflags[+] -DSQLITE_DEBUG --disable ana.base.context.non-ptr --disable sem.unknown_function.spawn --set ana.thread.domain plain --enable exp.earlyglobs --set ana.base.privatization none --set pre.cppflags[+] -DGOBLINT_NO_BSEARCH --set pre.cppflags[+] -DGOBLINT_NO_ASSERT --set result json-messages
We tried this a few times, and in the first minute of analyzing SQLite 30-50% of the runtime are spent in `HoarePO.equal`.

Problems With Current Implementation

The current implementation of HoarePO.equal relies on checking HoarePO.leq in both directions.
The implementation of leq in turn uses

  • for_all, which transforms the argument Hoare-map to a list,
  • and mem, which also transforms its Hoare-map argument to a list. The function mem may be called for each element in a Hoare-map.

It thus seems like the current implementation could be optimized considerably.

Goals

The implementation of HoarePO.equal and HoarePO.leq should perform a bucket-wise comparison, as all comparable elements have the same hash and are contained in the same bucket. One may also consider implementing equal directly, instead of relying on leq

Edit: Changed HoareDomain to HoarePO, as that's what I was talking about.

@jerhard jerhard added the performance time, memory label Jul 28, 2022
@jerhard
Copy link
Member Author

jerhard commented Jul 28, 2022

@michael-schwarz suggested adding a short circuiting physical equality check to the HoarePO.equal, i.e.

let equal x y = x == y || leq x y && leq y x

and that already increases the number of evals after about minute from ~110000 to ~190000 on SQLite.

@sim642
Copy link
Member

sim642 commented Jul 28, 2022

There's quite an extensive description of all our Hoare-(like) domains in #101, so maybe I'll have to think about this Hoare and partitioning stuff again.

The implementation of HoareDomain.equal and HoardDomain.leq should perform a bucket-wise comparison, as all comparable elements have the same hash and are contained in the same bucket.

This is a nasty hack used by AddressDomain, which only works due to the particular way hash is implemented for individual addresses. Ideally this should be a separate function, not implicitly tied to hash, but that's actually not the biggest problem. Rather, path-sensitivity is defined by the analyses' should_join pairwise operator, which isn't amenable to such bucketing.

The current implementation of HoareDomain.equal relies on checking HoareDomain.leq in both directions.

I don't think that's true. It's how HoarePO does it, but HoarePO is only used for AddressDomain due to the aforementioned reason and it implements a very particular kind of Hoare domain with a hand-rolled functional hashmap that doesn't have a should_join that can keep elements in buckets separate by force.

Actually, now I'm confused: which of the two did you time? Because one's exclusively for addresses and the other is for paths (and some privatizations).

The normal HoareDomain.Set just uses the default Set equality and probably just assumes that such sets are kept and compared only in their normal form. I guess HoarePO doesn't make that assumption, but I think it very well could, which would take all the complexity out of its equal as well and allow using the default Map equality.

@sim642
Copy link
Member

sim642 commented Jul 28, 2022

And another thing to look at (depending on which of the two is the culprit): how many elements do the Hoare sets in question actually have? It could be that for some reason they collect a massive number of unjoinable elements and that's the reason for slowdown.

@michael-schwarz
Copy link
Member

Actually, now I'm confused: which of the two did you time?

HoarePO, because our suspicion was that the slowdown is due to addresses of some sort.

@sim642
Copy link
Member

sim642 commented Jul 28, 2022

The normal HoareDomain.Set just uses the default Set equality and probably just assumes that such sets are kept and compared only in their normal form. I guess HoarePO doesn't make that assumption, but I think it very well could, which would take all the complexity out of its equal as well and allow using the default Map equality.

Here's another reason that it could easily work: HoarePO's compare and hash just operate on the Map and list structures directly, without performing any leq checks. Since those two must be compatible with equal, it seems like equal could simply do the same. Otherwise we have invalid hash and compare implementations, which themselves may be causing major issues.

@michael-schwarz
Copy link
Member

It could be that for some reason they collect a massive number of unjoinable elements and that's the reason for slowdown.

Spoken aside, We are also locally experimenting with not tracking strings (or in this case as debugging hack, fixing all strings to be equal. That has kind of promising results on sqlite (after #802). Every unknown seems to have been uncovered now, and the majority of things are stable:

runtime: 00:38:03.064
vars: 62969, evals: 3666907

|rho|=62969
|called|=415
|stable|=59688
|infl|=62717
|wpoint|=562
Found 1806 contexts for 1806 functions. Top 5 functions:

@jerhard jerhard changed the title Improve HoareDomain.equal and leq performance Improve HoarePO.equal and leq performance Jul 29, 2022
@sim642 sim642 self-assigned this Aug 1, 2022
@sim642
Copy link
Member

sim642 commented Aug 1, 2022

I suspect there's a reason why these operations have been implemented so inefficiently, address offsets don't actually have well-defined partitioning structure: #101 (comment).

@sim642
Copy link
Member

sim642 commented Sep 20, 2022

Completed by #809.

@sim642 sim642 closed this as completed Sep 20, 2022
@sim642 sim642 linked a pull request Sep 20, 2022 that will close this issue
6 tasks
@sim642 sim642 added this to the v2.1.0 milestone Sep 22, 2022
sim642 added a commit to sim642/opam-repository that referenced this issue Nov 25, 2022
CHANGES:

Functionally equivalent to Goblint in SV-COMP 2023.

* Add automatic configuration tuning (goblint/analyzer#772).
* Add many library function specifications (goblint/analyzer#865, goblint/analyzer#868, goblint/analyzer#878, goblint/analyzer#884, goblint/analyzer#886).
* Reorganize library stubs (goblint/analyzer#814, goblint/analyzer#845).
* Add Trace Event Format output to timing (goblint/analyzer#844).
* Optimize domains for address and path sets (goblint/analyzer#803, goblint/analyzer#809).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance time, memory
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants