
Use bloom filters in collator protocol #6071

Closed
eskimor opened this issue Sep 28, 2022 · 8 comments
Labels
T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes.

Comments

@eskimor
Member

eskimor commented Sep 28, 2022

This can be seen as an intermediate step to the alternate proposal by @rphmeier here. It improves things on its own, but with candidate providers != candidate creators this becomes even more solid.

The idea is that validators maintain two bloom filters, with collator PeerIds getting added every time they provide a valid collation. Those bloom filters will have a lifetime of, let's say, a day or two. One bloom filter is the active one that new PeerIds are added to; the other is inactive and stays at the state of the previous period. Both are used for prioritizing incoming connections: if the PeerId passes one of the bloom filters it is accepted - regardless of the free slots in the peerset.
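A minimal sketch of the rotating filter pair (std-only; all names and constants are illustrative, the real PeerId type lives in the networking stack):

```rust
use std::collections::hash_map::RandomState;
use std::hash::BuildHasher;

// Assumed parameters, see "Sizes" below for the back-of-the-envelope numbers.
const BITS: usize = 16 * 1024; // ~2 KiB per filter
const HASHES: u64 = 10;

/// Minimal keyed bloom filter; the two `RandomState`s provide the
/// non-guessable random seed, and double hashing derives the k indices.
struct Bloom {
    bits: Vec<u64>,
    h1: RandomState,
    h2: RandomState,
}

impl Bloom {
    fn new() -> Self {
        Bloom { bits: vec![0; BITS / 64], h1: RandomState::new(), h2: RandomState::new() }
    }

    fn indices(&self, peer_id: &[u8]) -> Vec<usize> {
        let (a, b) = (self.h1.hash_one(peer_id), self.h2.hash_one(peer_id));
        (0..HASHES)
            .map(|i| (a.wrapping_add(i.wrapping_mul(b)) as usize) % BITS)
            .collect()
    }

    fn insert(&mut self, peer_id: &[u8]) {
        for i in self.indices(peer_id) {
            self.bits[i / 64] |= 1 << (i % 64);
        }
    }

    fn contains(&self, peer_id: &[u8]) -> bool {
        self.indices(peer_id).iter().all(|i| (self.bits[i / 64] & (1 << (i % 64))) != 0)
    }
}

/// The active filter collects collators that provided valid collations in the
/// current period; the previous one is kept read-only for one more period.
struct GoodCollators {
    active: Bloom,
    previous: Bloom,
}

impl GoodCollators {
    fn new() -> Self {
        GoodCollators { active: Bloom::new(), previous: Bloom::new() }
    }

    fn note_valid_collation(&mut self, peer_id: &[u8]) {
        self.active.insert(peer_id);
    }

    /// Called once per lifetime period (a day or two).
    fn rotate(&mut self) {
        self.previous = std::mem::replace(&mut self.active, Bloom::new());
    }

    /// A peer passing either filter gets prioritised on incoming connections.
    fn is_known_good(&self, peer_id: &[u8]) -> bool {
        self.active.contains(peer_id) || self.previous.contains(peer_id)
    }
}
```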

The filtering should ideally be performed in substrate at the peerset level, similar to how the reserved set works. There are two differences:

  1. It concerns only incoming connections.
  2. Instead of a list of peers we pass in a filter function (which makes use of the bloom filter).
    -> That filter function should likely take the PeerId of the connecting peer and the number of already opened connections, so it can enforce a hard upper limit.

Incoming connections will then be admitted if we either have free incoming slots or the PeerId passes the filter.
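As a rough sketch of the shape this could take (the callback signature is an assumption for illustration, not an existing sc-peerset API):

```rust
/// Hypothetical filter callback handed to the peerset: it gets the connecting
/// PeerId (raw bytes here) and the number of connections already admitted via
/// the filter, so it can enforce a hard upper limit itself.
type IncomingFilter = Box<dyn Fn(&[u8], usize) -> bool + Send + Sync>;

struct IncomingPolicy {
    /// Regular incoming slots, as today.
    max_incoming: usize,
    filter: IncomingFilter,
}

impl IncomingPolicy {
    /// Admit an incoming connection if a regular slot is free, or if the peer
    /// passes the filter (which internally caps filter-admitted connections).
    fn admit(&self, peer_id: &[u8], open_regular: usize, open_filtered: usize) -> bool {
        open_regular < self.max_incoming || (self.filter)(peer_id, open_filtered)
    }
}

// On the collator-protocol side the callback could wrap the rotating filters
// from the previous sketch, e.g. (pseudocode):
//   let filter = Box::new(move |peer_id: &[u8], open_filtered: usize| {
//       open_filtered < hard_limit && good_collators.lock().is_known_good(peer_id)
//   });
```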

Seeding initial filters

Validators should persist their filters to disk. In addition, when coming up for the first time (or after a longer downtime) they can request bloom filters from random other validators. This gives us a 2/3 probability of starting with a good set, which is better than an empty set if the limit enforced by the above filter function is relatively low; otherwise it would probably be better to start with an empty filter. We should probably limit the number of bloom-filter-admitted connections quite aggressively if the bloom filter comes from an untrusted source, but have a rather large limit if it was generated locally.
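Purely for illustration, the connection cap could simply depend on the filter's provenance (the concrete numbers are placeholders):

```rust
/// Where the seed filter came from at startup.
enum FilterOrigin {
    /// Restored from our own persisted state on disk.
    Local,
    /// Requested from a random validator (2/3 chance of being honest).
    Fetched,
}

/// Hard limit on connections admitted purely because they pass the filter.
fn filter_admitted_limit(origin: &FilterOrigin) -> usize {
    match origin {
        FilterOrigin::Local => 200,  // generous: we built this filter ourselves
        FilterOrigin::Fetched => 20, // aggressive cap for an untrusted source
    }
}
```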

Sizes

Some back-of-the-envelope calculations (to be done properly) suggest that a bloom filter with a false positive rate of <0.1% for maintaining ~1000 good nodes would be around 2 KiB; for 100 parachains this would be around 200 KiB (400 KiB for the 2 filters).
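For reference, the standard bloom filter sizing formula reproduces these numbers:

```rust
fn main() {
    let n = 1000.0_f64; // good collators to remember per parachain
    let p = 0.001_f64;  // target false positive rate (< 0.1%)

    // Optimal filter size in bits: m = -n * ln(p) / (ln 2)^2,
    // optimal number of hash functions: k = (m / n) * ln 2.
    let m_bits = -n * p.ln() / 2.0_f64.ln().powi(2);
    let k = (m_bits / n) * 2.0_f64.ln();

    // Prints roughly: 14378 bits (~1.8 KiB), k ~ 10 hash functions.
    // 100 parachains x 2 filters of ~2 KiB each gives the ~400 KiB quoted above.
    println!("{m_bits:.0} bits (~{:.1} KiB), k ~ {k:.0}", m_bits / 8192.0);
}
```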

Security Considerations

With a false positive rate of 0.1%, an attacker needs on the order of 1000 connection attempts to get a single PeerId accepted alongside honest collators, which makes attacks considerably harder. The hashing should definitely be based on some non-guessable random seed, so the attacker actually has to attempt the connections instead of grinding PeerIds offline.

Implementation

This has to be done in Substrate in the peerset implementation; otherwise we would need to replicate the whole implementation - which would be slower and easier to attack. The issue in particular: keeping some number of free incoming slots won't work anymore, as those slots would be shared between bloom-filtered connections and new connections.

@eskimor eskimor added the T5-parachains_protocol This PR/Issue is related to Parachains features and protocol changes. label Sep 28, 2022
@eskimor
Member Author

eskimor commented Sep 28, 2022

Very much related to: #2888

@eskimor eskimor mentioned this issue Sep 28, 2022
@burdges
Contributor

burdges commented Sep 28, 2022

You can only share filters by sharing the key, which increases false positive risks; that in turn increases risk if you also record reputation or misbehavior in the filter. You'd need a cuckoo filter, though, if you want to record reputation in the filter.

@burdges
Contributor

burdges commented Sep 28, 2022

Aren't parachains mostly running like 15ish nodes though? Any idea how big the largest is? We do not necessarily need many collators per parachain.

We're thinking parathread blocks come accompanied by a transaction, perhaps even when opening the connection. We're also thinking deterministic runtimes require that parachains or their collators have some bond. If the collators of all or most parachains wind up with relay chain accounts, then maybe we already know them all anyway?

We might need large parachains for gav's special low-security recursive stuff, but then we could add special support later, while most parachains still run under some simple capped collator count mode.

@eskimor
Member Author

eskimor commented Sep 28, 2022

There are concerns about keeping the collator set as open as possible. I personally like the idea of nodes having accounts - that is actually a favorite of mine. Having a list/filter of previously known good connections is already a step in that direction, and that list/filter could equally well become a requirement that nodes have an account holding more than some DOT amount at some later point.

@rphmeier
Contributor

I like this bloom filter idea, but we should separate that discussion from any discussion about having collators be registered on the relay-chain (it's come up multiple times in the past and it's not a degree of freedom I feel comfortable removing at this stage).

@burdges
Contributor

burdges commented Sep 28, 2022

I really asked: Do we expect this many collators?

In fact, there exists a smooth-ish transition plan which morally goes via cuckoo filters, although not necessarily in practice:

We could plan to make parachains provide a collator upper bound, and implement this using cuckoo filters, sized using this upper bound, which gives some advantages:

  • HashMap works right now even before implementing the cuckoo filter.
  • cuckoo filters are more space efficient than bloom filters, if tuned for this purpose.
  • cuckoo filters could instead hold actual data, if tuned to do so, which gives HashMap-like memory usage.
  • actual data could be sent to other validators without exposing the map/filter keys.

We do not need the upper bound while the cuckoo filter holds actual data, as filters could then be rebuilt the way a HashMap is, but it matters if we remove the actual data later to save memory. We always say cuckoo filters wind up more space efficient than bloom filters and hash maps, but both claims assume everyone targets the right fullness, so the chain's size bounds always matter once memory size matters. I think cuckoo filters also wind up more vulnerable to key exposure, so it is maybe worth using bloom filters if sharing the whole filter.
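A loose sketch of the "plain map now, cuckoo filter later" idea, assuming parachains declare a collator upper bound (all names here are illustrative):

```rust
use std::collections::HashSet;

/// Collator set bounded by the cap the parachain declared. Today this can be
/// a plain set holding the actual PeerIds; later the same interface could be
/// backed by a cuckoo filter sized for `cap`.
struct BoundedCollatorSet {
    cap: usize,
    known: HashSet<Vec<u8>>,
}

impl BoundedCollatorSet {
    fn new(cap: usize) -> Self {
        BoundedCollatorSet { cap, known: HashSet::new() }
    }

    /// Refuse new entries once the declared bound is reached.
    fn insert(&mut self, peer_id: Vec<u8>) -> bool {
        if self.known.len() >= self.cap && !self.known.contains(&peer_id) {
            return false;
        }
        self.known.insert(peer_id);
        true
    }

    fn contains(&self, peer_id: &[u8]) -> bool {
        self.known.contains(peer_id)
    }
}
```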

@eskimor
Member Author

eskimor commented Oct 13, 2022

Note: If any node is able to provide a collation (having received it from actual collators), even if it is not a collator (or even if it is a collator, in case the set is large/unlimited), it might still be possible to fill up all available slots slowly over time by simply providing a good collation once and then never again.

Countermeasures:

  1. We would still disconnect a peer that does not provide something useful within a certain amount of time - this gives other nodes an opportunity to connect. Given that most nodes will not be able to provide anything useful, we will actually disconnect a lot, hence we will have free slots and only stay connected to the fastest peers (who win the race). A rough sketch of this follows below.
  2. Per parachain the number of collations to provide is limited (full announce + request + validate only one per relay chain block on average), so by simply allowing enough connections we can make it impossible for bad actors to fill up all available slots before the bloom filter is rotated.
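A rough sketch of countermeasure 1 (all names and the 10 second window are invented and would need tuning):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// How long a connected peer may go without providing a useful collation.
const USEFULNESS_WINDOW: Duration = Duration::from_secs(10);

#[derive(Default)]
struct ConnectedCollators {
    /// PeerId (raw bytes) -> last time it provided a valid collation.
    last_useful: HashMap<Vec<u8>, Instant>,
}

impl ConnectedCollators {
    fn on_connected(&mut self, peer_id: Vec<u8>) {
        self.last_useful.insert(peer_id, Instant::now());
    }

    fn on_valid_collation(&mut self, peer_id: &[u8]) {
        if let Some(t) = self.last_useful.get_mut(peer_id) {
            *t = Instant::now();
        }
    }

    /// Called periodically: peers that did not provide anything useful in time
    /// get disconnected, freeing their slot for the next peer to race for.
    fn peers_to_evict(&self) -> Vec<Vec<u8>> {
        let now = Instant::now();
        self.last_useful
            .iter()
            .filter(|(_, last)| now.duration_since(**last) > USEFULNESS_WINDOW)
            .map(|(peer_id, _)| peer_id.clone())
            .collect()
    }
}
```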

@eskimor
Member Author

eskimor commented Oct 28, 2022

I am currently pondering an alternate design that solves more of these issues, and solves them better.

@eskimor eskimor closed this as completed Oct 28, 2022