Parallel feed lookups #1332

jpeletier · 2019-04-10T13:53:23Z

Abstract

This PR introduces a concurrent algorithm that significantly increases Swarm Feeds lookup speed and reliability by launching simultaneous routines that explore the portion of keyspace which has the most probability of containing an update at any given time.

Lookup algorithm parameters have also been tweaked to avoid useless lookups all the way back to Unix epoch time=0. As a result, a lookup of an empty feed resolves in around 1 second.

This fixes Issue 1184, in which nodes could potentially return different results when looking up a feed.

Additionally, tests have been refactored to simulate time and compute benchmarks. This makes it much easier to measure algorithm performance without having to spend the simulated amount of time during actual testing.

With ❤ from Epic Labs

Background

Our last PR replaced Swarm Feed's lookup process with a new adaptive-frequency lookup algorithm library. This algorithm, named FluzCapacitor, is described in PR 17559.

This first version of the algorithm works in series: it explores the search space by deciding which epoch address to look up, attempting a retrieval, and then taking one path or another depending on whether that retrieval succeeded or failed (see PR 17559 for details on how this works). The algorithm must therefore wait for each lookup, which can take seconds, to complete before taking action. To alleviate this problem and launch Adaptive Feeds as soon as possible, we decided to set an unrealistic timeout of 100ms per lookup, so that a full query would run within a few seconds.

This works well for testing within your own node, where you have published the feed yourself and therefore all your feed chunks are stored locally. However, when the feed was published on another node, some lookups could fail if the node the user was connected to was not able to retrieve a chunk within those 100ms. Issue 1184 was opened to address this.

A quick solution would be to increase that timeout to, say, 1 second. But if we did that, a lookup would take 10 times longer to complete, which could quickly get into the 10-20s range for first lookups and easily 5-6 seconds afterwards.

Parallel lookups

This PR introduces and activates a new algorithm named LongEarth (named after Terry Pratchett's book series The Long Earth, in which multiple parallel universes exist).

This algorithm can work with large timeouts per lookup by initiating further, more specific lookups without waiting for the previous ones to finish. Since the search process needs to know the result of a previous lookup in order to know which path to follow, LongEarth forks on each lookup after a head start time, taking both paths recursively (as if exploring all interesting parallel universes at once). Once the result of a previous lookup is known, the losing branch is immediately pruned, leaving active only the branch where the update will eventually be found. This results in lookup times that are an order of magnitude faster, at the expense of more exploratory chunk lookups.
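The head start idea can be illustrated with a minimal sketch. This is not the code in this PR: lookupWithHeadStart, fastRetrieve and slowRetrieve are hypothetical names, and the retrievals are simulated. It only shows the decision of whether to start speculative branches while still waiting for the original lookup.

```go
package main

import (
	"fmt"
	"time"
)

// lookupWithHeadStart runs retrieve in the background and waits up to
// headStart for it to finish. If the head start expires first, it records
// that speculative child lookups would be launched, then keeps waiting for
// the original result. Hypothetical helper, for illustration only.
func lookupWithHeadStart(retrieve func() bool, headStart time.Duration) (found, speculated bool) {
	result := make(chan bool, 1) // buffered so the goroutine never blocks
	go func() { result <- retrieve() }()

	timer := time.NewTimer(headStart)
	defer timer.Stop()

	select {
	case found = <-result:
		// Resolved within the head start: no speculation was needed.
		return found, false
	case <-timer.C:
		// Head start elapsed: this is where the lookahead and lookback
		// branches would both be started in parallel.
		speculated = true
	}
	return <-result, speculated
}

// fastRetrieve simulates a chunk that is found immediately.
func fastRetrieve() bool { return true }

// slowRetrieve simulates a retrieval that takes a while and finds nothing.
func slowRetrieve() bool {
	time.Sleep(200 * time.Millisecond)
	return false
}

func main() {
	fmt.Println(lookupWithHeadStart(fastRetrieve, time.Hour))
	fmt.Println(lookupWithHeadStart(slowRetrieve, time.Millisecond))
}
```

With a very long head start the sketch never speculates, which mirrors the serial behaviour; with a short one it speculates on every slow lookup.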

Thus, you can think of LongEarth as an algorithm that aims to find the parallel universe where all of our choices were correct, while destroying all the others 😉.

Learning by example

Note: to understand the naming conventions below and how the seeding algorithm lays out updates and why, please refer to PR 17559.

Let's consider the below epoch grid with 9 updates, labeled U1 to U9 (last one).
Marked in yellow we have known updates (U1). Light orange is the hint we are giving the algorithm (U2), which for now we assume contains an update too. Updates U3-U9 are unknown. Our clock reads t=14 and we want to find the last update (U9).

The first step is to determine where to look to have the highest probability of finding an update. This is determined by the lookup.GetNextEpoch(hint, now) function, which in this case returns (8, 3), the location of U6. GetNextEpoch() is computed as a simple operation involving an XOR, and is explained in the Walking the grid section of PR 17559.
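As a rough illustration of the XOR step, here is a simplified sketch. It is not the actual GetNextEpoch code: nextEpoch and its signature are hypothetical, and it assumes an epoch (base, level) spans 2^level time units starting at base, as in the grid used in this example.

```go
package main

import (
	"fmt"
	"math/bits"
)

// highestLevel mirrors the HighestLevel value described in this PR (31).
const highestLevel = 31

// nextEpoch is a simplified, hypothetical sketch of the XOR idea: given the
// hint's epoch (base time and level) and the current time, it returns the
// epoch where the next update is most likely to be found.
func nextEpoch(hintBase uint64, hintLevel uint8, now uint64) (base uint64, level uint8) {
	// XOR clears the most significant bits that the hint and now share.
	mix := hintBase ^ now
	// Never drop more than one level below the hint's level.
	if hintLevel > 0 {
		mix |= uint64(1) << (hintLevel - 1)
	}
	// The highest remaining set bit gives the level to look at.
	if mix != 0 {
		level = uint8(bits.Len64(mix) - 1)
	}
	if level > highestLevel {
		level = highestLevel
	}
	// The epoch base is now with its lowest `level` bits cleared.
	base = now &^ (uint64(1)<<level - 1)
	return base, level
}

func main() {
	// Hint U2 at (0, 3), clock at t=14.
	fmt.Println(nextEpoch(0, 3, 14)) // prints "8 3"
	// With U6 at (8, 3) as the last known update, clock at t=14.
	fmt.Println(nextEpoch(8, 3, 14)) // prints "12 2"
}
```

Under these assumptions the sketch reproduces (8, 3) for the hint in this example, and (12, 2) once U6 is known.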

Therefore, the algorithm starts looking up (8, 3) (R1 below) to see if there is an update there. It also determines a lookahead area (LA1), marked in light blue, and a lookback area (LB1), marked in red. Active lookups are marked in purple. The lookahead area is the area that will continue to be explored if R1 succeeds, while the lookback area is the area that will continue to be explored if R1 fails to find an update in (8, 3).

After a short interval (head start) waiting for R1 to resolve, the algorithm sets out to explore both lookahead and lookback areas, headed by (12, 2) and (4, 2) respectively. These two simultaneous lookups are labeled R2 in the figure below, and are active together with R1. Also, recursively, 4 more lookup areas (labeled LA2 and LB2) are defined that depend on the result of each instance of R2:

Once R1 resolves we can prune one area or the other. This is implemented by recursively cancelling a context, which aborts all chunk retrievals associated with it.
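The pruning mechanism relies on standard Go context behaviour: cancelling a context also cancels every context derived from it. The sketch below only demonstrates that property. The prune function and the branch layout are hypothetical and are not taken from the code in this PR.

```go
package main

import (
	"context"
	"fmt"
)

// prune models one fork: a lookahead and a lookback branch, each with a
// nested descendant lookup. Once the parent lookup resolves (found), the
// losing branch is cancelled. It returns the pruned branch's name and
// whether only that branch's descendant ended up cancelled.
func prune(found bool) (pruned string, onlyLoserCancelled bool) {
	root := context.Background()

	aheadCtx, cancelAhead := context.WithCancel(root)
	defer cancelAhead()
	backCtx, cancelBack := context.WithCancel(root)
	defer cancelBack()

	// Descendants stand in for the lookups each branch forks in turn.
	aheadChild, stopAheadChild := context.WithCancel(aheadCtx)
	defer stopAheadChild()
	backChild, stopBackChild := context.WithCancel(backCtx)
	defer stopBackChild()

	// If the parent lookup found an update, the lookback area is useless;
	// otherwise the lookahead area is.
	loser, loserChild, winnerChild := cancelBack, backChild, aheadChild
	pruned = "lookback"
	if !found {
		loser, loserChild, winnerChild = cancelAhead, aheadChild, backChild
		pruned = "lookahead"
	}

	loser()             // cancel the losing branch...
	<-loserChild.Done() // ...which also cancels its descendants
	return pruned, loserChild.Err() == context.Canceled && winnerChild.Err() == nil
}

func main() {
	fmt.Println(prune(true))
	fmt.Println(prune(false))
}
```

A single cancel call at the fork point is enough to abort a whole family of speculative lookups, however deep the recursion has gone.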

In our example, R1 returns update U6, therefore our status is as below. Note that U6 is now marked as "known" in yellow, thus we're certain that the area in dark red, while it could contain other updates, does not contain the latest one, which is the one we care about:

Again, after a short head start, the algorithm proceeds to look up the lookback and lookahead headers (8, 2) and (14, 1), marked as active (purple), with the label R3 below. These come, recursively, with their own lookahead and lookback areas, labeled LA3 and LB3:

In our example, R2 finally resolves, finding update U8. This means we can cancel lookback area LB2 as we are now sure the last update can't be there:

To shorten the example, if R3 resolved immediately with no update found ((14, 1) did not contain an update), that would cancel the LA3 family of lookups, while LB3 continues (now marked as R4):

Although not drawn above, it is easy to see the pattern: LA4 and LB5 would then be scanned as R5 and R6 respectively and, in this case, fail, thus leaving (12, 1) (U9) as the found update:

As with FluzCapacitor, the found update U9 can then be used as a hint for future lookups.

This example is simplified—with the current algorithm parameters, there can potentially be around 30 lookups taking place concurrently, exploring the search space for updates and pruning entire branches recursively once a better path is found. This can be configured by adjusting the lookup timeout and headstart times.

If the algorithm failed to find an update (i.e., if U3-U9 did not actually exist), then the hint in (0, 3) (U2) would be challenged for validity. If the hint actually contains an update, that update is returned as the last update and we're done. If, however, the hint is proven false, the algorithm restarts without a hint at that point.

Benchmarks

The benchmark below assumes the following:

Timeout when no update is found: 1s. (Prior to this PR, this was set to 0.1s, which was unrealistic and prone to errors)
Time a retrieval takes when successful: 0.5s

Scenario	FluzCapacitor time	FluzCapacitor lookups	LongEarth time	LongEarth lookups
Empty Feed	1s	1	1s	10
Monthly update 3 years ago, without hint	10.25s	15	4.55s	78
Monthly update 3 years ago, with hint	4.75s	5	2.35s	51
One update	2.65s	3	1.55s	21
Bad Hint	12.1s	12	5.4s	127
High Frequency updates, without hint	17.6s	32	8.3s	75
High Frequency updates, with hint	2.65s	3	1.3s	13
Sparse updates, without hint	15.6s	21	6s	127
Sparse updates, with hint	9s	9	3.35s	102

Note how LongEarth drastically improves lookup times at the expense of more lookup attempts.

Other changes

Epoch HighestLevel change

This PR also comes with a feed algorithm configuration change. Namely, the HighestLevel parameter has been changed from 25 to 31. This reduces the time it takes to find out whether a feed is empty to the lookup timeout, which is at most 1 second, as opposed to having to go "back in time" to Unix time=0 (Jan 1st 1970), which does not make sense in 99.9% of cases.

This means existing published feeds won't be found. To fix this for your feed, simply republish the last update 6 times in quick succession to fill levels 31-26.
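To get a feel for the numbers, here is a small sketch. It assumes a level-n epoch spans 2^n seconds, consistent with the grid in the example above; epochBase and prTime are hypothetical helpers. It prints how long an epoch lasts at levels 25 and 31, and where the epoch containing this PR's creation time starts.

```go
package main

import (
	"fmt"
	"time"
)

// epochBase returns the start time of the epoch of the given level that
// contains t, assuming a level-n epoch spans 2^n seconds.
func epochBase(t uint64, level uint8) uint64 {
	return t &^ (uint64(1)<<level - 1)
}

// prTime returns the Unix time at which this PR was opened.
func prTime() uint64 {
	return uint64(time.Date(2019, time.April, 10, 13, 53, 23, 0, time.UTC).Unix())
}

func main() {
	now := prTime()
	for _, level := range []uint8{25, 31} {
		span := time.Duration(uint64(1)<<level) * time.Second
		fmt.Printf("level %d: epoch spans %.1f days, current epoch starts at t=%d\n",
			level, span.Hours()/24, epochBase(now, level))
	}
}
```

Under that assumption a level-25 epoch lasts a little over a year, while a single level-31 epoch lasts about 68 years and starts at t=0, so one top-level epoch covers everything from 1970 into 2038.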

Testing

Multi-algorithm test suite

Tests have been modified to test both FluzCapacitor and LongEarth side by side. This allows for comparison, benchmarking and stronger tests, since the two algorithms differ only in how they reach the result and in the resources they consume, while the results themselves must be the same.

The new tooling measures the time it takes for each algorithm to reach a conclusion, as well as the amount of reads it performs, among other things.

Time simulation

The PR also comes with test refactoring to simulate time so that the tests don't have to last the same amount of time as the different lookup timeouts, which could cause problems when running tests in certain shared CI environments.
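The general idea of simulating time can be sketched as follows. This is not the simulator in timesim_test.go: simClock and simulatedLookups are hypothetical, and the model is deliberately serial. The timeout and retrieval costs are the ones assumed in the Benchmarks section.

```go
package main

import "fmt"

// simClock is a hypothetical simulated clock: time only moves when the test
// says so, so a 1s lookup timeout costs no wall-clock time.
type simClock struct {
	now uint64 // simulated time in milliseconds
}

// Sleep advances simulated time instead of blocking.
func (c *simClock) Sleep(ms uint64) { c.now += ms }

// simulatedLookups "performs" a series of sequential lookups against the
// clock, charging timeoutMs for each miss and retrievalMs for each hit, and
// returns the total simulated time in milliseconds.
func simulatedLookups(c *simClock, hits []bool, timeoutMs, retrievalMs uint64) uint64 {
	start := c.now
	for _, hit := range hits {
		if hit {
			c.Sleep(retrievalMs)
		} else {
			c.Sleep(timeoutMs)
		}
	}
	return c.now - start
}

func main() {
	clock := &simClock{}
	// One miss (1s timeout) followed by two hits (0.5s each).
	elapsed := simulatedLookups(clock, []bool{false, true, true}, 1000, 500)
	fmt.Printf("simulated %dms without waiting\n", elapsed)
}
```

Because the clock is just a counter, a test can account for seconds of lookup time while finishing almost instantly, which is what makes benchmarking practical in shared CI environments.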

Final notes and further work

We have described a parallelization of FluzCapacitor providing a concurrent implementation and extensive test suite to measure and validate results.

Further work will focus on optimizing the algorithm parameters to get the best of both. In particular, the *headstart parameters of LongEarth determine how aggressively it spends lookups to try to save time: setting these to infinite (no lookahead/lookback) effectively makes LongEarth equivalent to FluzCapacitor. On the other hand, setting these to zero would force the algorithm to look up all possible outcomes simultaneously 😅, which would exhaust stack/memory and be impossible to execute!

Please let me know your feedback, questions and test issues. I hope you like this feature. I am available on Gitter (@jpeletier) in #orange-lounge channel. Enjoy!!

nolash

Very nice code @jpeletier

My review includes the commits since 6de585008a6cbbaacf182cec2a5ff4ef97cce6a1 - I hope that's correct.

swarm/storage/feed/lookup/algorithm_longearth.go

swarm/storage/feed/lookup/store_test.go

swarm/storage/feed/lookup/lookup_test.go

swarm/storage/feed/lookup/timesim_test.go

jpeletier mentioned this pull request Apr 10, 2019

Swarm/feeds: Parallel feed lookups ethereum/go-ethereum#19414

Merged

jpeletier self-assigned this Apr 10, 2019

jpeletier added the feeds label Apr 10, 2019

jpeletier force-pushed the parallel-feeds branch from b533639 to d73bd4f Compare April 16, 2019 09:12

jpeletier requested a review from zsfelfoldi as a code owner April 16, 2019 09:12

jpeletier requested review from nolash and zelig and removed request for zsfelfoldi April 16, 2019 09:20

nolash reviewed Apr 18, 2019

nolash approved these changes Apr 25, 2019

jpeletier added 13 commits May 13, 2019 12:53

swarm/storage/feed/lookup: First LE that works

9f31234

swarm/storage/feed/lookup: multi-algorithm test suite

daf5126

swarm/storage/feed/lookup: Add comments to tests

34bf9c4

swarm/storage/feed/lookup: Refactor mock storage, add perf counters

7301f2b

swarm/storage/feeds/lookup: Commented tests and LongEarthAlgorithm

8624668

swarm/storage/feed: Increase the lookup timeout to 1 second.

4981a9a

swarm/storage/feed: make handler read count atomic

f87cc00

swarm/storage/feed/lookup: Add Base to Epoch.String()

78bf518

swarm/storage/feed/lookup: LE hint check if main lookup fails

d378227

swarm/storage/feed/lookup: change lookup HL to 31 for efficiency

2096c44

swarm/storage/feed/lookup: more comments

709cc6f

swarm/storage/feed/lookup: fix typos

1109b73

swarm/storage/feed/lookup: renamed step hint to 'last'

f23b88a

jpeletier force-pushed the parallel-feeds branch from 621f654 to f23b88a Compare May 13, 2019 10:55

nonsense closed this May 17, 2019
