Slot no more: overhauled internal algorithm #296

Merged
merged 16 commits into salsa-rs:master from the slot-no-more branch on Feb 7, 2022

Conversation

nikomatsakis
Member

This is the overhauled implementation that avoids slots, is more parallel friendly, and paves the way to fixed point and more expressive cycle handling.

We just spent 90 minutes going over it. Some rough notes are available here, and a video will be posted soon.

You may find the flowgraph useful.

@netlify

netlify bot commented Jan 28, 2022

❌ Deploy Preview for salsa-rs failed.

🔨 Explore the source changes: d4a6b24

🔍 Inspect the deploy log: https://app.netlify.com/sites/salsa-rs/deploys/61f428116eb8450008d21f49

@vlthr

vlthr commented Jan 29, 2022

I finally got around to running our benchmarks. I ran our integration test benchmarks on salsa 0.16.0 (our current version in production), the master branch, and the slot-no-more branch from this PR. Each integration test runs a fresh (non-incremental) analysis of some contract text, either completely single-threaded or "preheated", where we spin up a bunch of threads to query common data.

Here are the initial results:

  • going from 0.16.0 to master left timings essentially unchanged across the board, with only one document about 10% slower, so no meaningful change
  • going from master to slot-no-more was roughly equal in the single-threaded case (possibly a <5% average slowdown, but no significant pattern)
  • in the preheated tests, however, slot-no-more caused regressions of between 7% and 64%. Looking at the flamegraph, the culprit is a single query spending a lot of time contending on the SyncMap via QueryStorageOps::fetch -> SyncMap::claim -> DashMap::entry -> dashmap::lock::compare_exchange

In our case, I suspect part of the problem is that the majority of the preheating is routed through the same query (which looks up and executes dynamically registered entity finders by ID). Could this possibly be fixed by dynamically generating separate queries for each entity finder?

@nikomatsakis
Member Author

@vlthr interesting. I don't think you should have to refactor your tests -- it's surprising that the problem is contention on the sync map. Do you have any way to measure the overall hit rate?

Are your tests available on GitHub?

@vlthr

vlthr commented Jan 31, 2022

@nikomatsakis the code is unfortunately not publicly accessible, but if it helps I can DM more info and/or profiling/benchmark results.

I wasn't able to measure the hit rate, but I had another look at the preheating code to figure out why it's causing so much contention. The implementation was pretty naive: we have ~500 dynamically registered entity labellers accessible via a common query endpoint, db.labels_from(labeller_id), but only partial static knowledge of their dependencies (i.e. certain labeller implementations may call salsa queries which aggregate some other group of labellers). Since the preheater only has access to a partial dependency graph, it naively spun up enough threads to run an entire cluster of possibly dependent queries, each on its own thread, relying on salsa to park the dependent ones.
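
For concreteness, here is a rough sketch of that shape, with purely hypothetical names and a salsa 0.16-style query group (the real code isn't public, so this is only meant to illustrate the pattern):

```rust
use std::sync::Arc;

// All ~500 labellers funnel through the single keyed query `labels_from`.
#[salsa::query_group(LabelStorage)]
trait LabelDatabase: salsa::Database {
    fn labels_from(&self, labeller_id: u32) -> Arc<Vec<String>>;
}

fn labels_from(db: &dyn LabelDatabase, labeller_id: u32) -> Arc<Vec<String>> {
    // Look up the dynamically registered labeller for `labeller_id` and run it.
    // Some labellers call other salsa queries that aggregate further labellers,
    // which is why the static dependency graph is only partial.
    let _ = db;
    Arc::new(vec![format!("labels for labeller {labeller_id}")])
}

// The naive preheater: one thread per labeller, every thread immediately
// hitting the same `labels_from` query and relying on salsa to park the
// threads that end up waiting on a value another thread is computing.
fn preheat_naive<DB>(db: &DB, labeller_ids: Vec<u32>)
where
    DB: LabelDatabase + salsa::ParallelDatabase,
    salsa::Snapshot<DB>: Send + 'static,
{
    let handles: Vec<_> = labeller_ids
        .into_iter()
        .map(|id| {
            let snapshot = db.snapshot();
            std::thread::spawn(move || {
                snapshot.labels_from(id);
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("preheat thread panicked");
    }
}
```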

I'm guessing that having 50+ (or possibly more) threads hammering the same query might be close to the worst case for the compute_value loop. After adding some heuristics to the preheater I was able to slim down the number of threads, and that seems to have fixed it. With the new preheater all of the benchmark time deltas between master and slot-no-more are within measurement noise, and there's no contention visible anywhere on the flamegraph.
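
A minimal sketch of what such a thread cap could look like (illustrative only, not the actual heuristics): partition the labeller IDs over a bounded number of workers instead of spawning one thread per labeller.

```rust
use std::thread;

// `run_query` stands in for something like `|id| snapshot.labels_from(id)`.
fn preheat_capped(ids: &[u32], max_threads: usize, run_query: impl Fn(u32) + Sync) {
    // Split the IDs into at most `max_threads` chunks, one worker per chunk,
    // so far fewer threads ever race to claim the same query at once.
    let chunk_size = ids.len().div_ceil(max_threads.max(1)).max(1);
    let run_query = &run_query;
    thread::scope(|s| {
        for chunk in ids.chunks(chunk_size) {
            s.spawn(move || {
                for &id in chunk {
                    run_query(id);
                }
            });
        }
    });
}
```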

@nikomatsakis
Member Author

@vlthr ok, so let me see here. The idea is that you have a lot of threads all trying to do the same query at once. Under this system, that probably does require an extra lock compared to the old one. Each thread has to:

  • Check for a completed result (fetch-hot) -- finds nothing
  • Acquire sync-map -- finds it is already in use
  • Acquire a lock on the runtime to acquire condvar

I believe the first lock is not used in the old system. Hard to imagine that's such a source of contention, but then, it might be! (Makes me want to experiment with alternatives to dashmap.)
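
A toy model of that three-step sequence, just to make the contention point concrete -- plain mutexes and a condvar standing in for salsa's lock-free reads, dashmap-backed sync map, and runtime parking, so this is not the actual implementation:

```rust
use std::collections::{HashMap, HashSet};
use std::sync::{Condvar, Mutex};

struct Store {
    results: Mutex<HashMap<u32, String>>, // completed values: the "fetch-hot" probe
    claims: Mutex<HashSet<u32>>,          // keys some thread is currently computing
    ready: Condvar,                       // signalled when a new value is published
}

impl Store {
    fn fetch(&self, key: u32, compute: impl Fn(u32) -> String) -> String {
        // Step 1: check for a completed result.
        if let Some(value) = self.results.lock().unwrap().get(&key) {
            return value.clone();
        }
        // Step 2: try to claim the key in the "sync map".
        if self.claims.lock().unwrap().insert(key) {
            let value = compute(key);
            self.results.lock().unwrap().insert(key, value.clone());
            self.claims.lock().unwrap().remove(&key);
            self.ready.notify_all();
            return value;
        }
        // Step 3: another thread holds the claim; block on the condvar until
        // the value shows up -- roughly what parking a dependent query means.
        let mut results = self.results.lock().unwrap();
        loop {
            if let Some(value) = results.get(&key) {
                return value.clone();
            }
            results = self.ready.wait(results).unwrap();
        }
    }
}
```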

Still, seems like we should consider landing this.

@nikomatsakis
Member Author

bors r+

@bors
Contributor

bors bot commented Feb 7, 2022

Build succeeded:

@bors bors bot merged commit 0f9971a into salsa-rs:master Feb 7, 2022