core/bloombits, eth/filter: transformed bloom bitmap based log search #3749

Closed
wants to merge 11 commits into base: master
Conversation

@zsfelfoldi
Contributor

zsfelfoldi commented Mar 6, 2017

This PR is based on #14431
Further parts of the code may be moved into separate PRs to make the review process easier (suggestions are welcome).

This PR optimizes log searching by creating a data structure (BloomBits) that makes it cheaper to retrieve the bloom filter data relevant to a specific filter. When searching a long section of the block history, we check three specific bits of each bloom filter per address/topic. To do that, we currently read/retrieve an approx. 500-byte block header for each block. The new structure optimizes this with a "bitwise 90 degree rotation" of the bloom filters. Blocks are grouped into sections (SectionSize is 4096 blocks at the moment), and BloomBits[bitIdx][sectionIdx] is a 4096-bit (512-byte) bit vector that contains a single bit of each bloom filter from the block range [sectionIdx*SectionSize ... (sectionIdx+1)*SectionSize-1]. (Since bloom filters are usually sparse, simple data compression makes this structure even more efficient, especially for ODR retrieval.) By reading and binary AND-ing three BloomBits sections, we can filter for an address/topic in 4096 blocks at once ("1" bits in the binary AND result mean bloom matches).
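
A minimal Go sketch of the rotation and matching described above (the names, constants and in-byte bit ordering are illustrative assumptions, not the PR's actual code):

    package bloombits

    const (
        SectionSize = 4096 // blocks per section, as described above
        BloomLength = 2048 // bits per bloom filter
    )

    // rotateSection performs the "bitwise 90 degree rotation": it turns the
    // SectionSize bloom filters of one section into BloomLength bit vectors of
    // SectionSize bits each, so that bit `block` of out[bit] equals bit `bit`
    // of blooms[block].
    func rotateSection(blooms [SectionSize][BloomLength / 8]byte) [BloomLength][]byte {
        var out [BloomLength][]byte
        for bit := 0; bit < BloomLength; bit++ {
            vec := make([]byte, SectionSize/8)
            for block := 0; block < SectionSize; block++ {
                if blooms[block][bit/8]&(1<<uint(7-bit%8)) != 0 {
                    vec[block/8] |= 1 << uint(7-block%8)
                }
            }
            out[bit] = vec
        }
        return out
    }

    // matchSection ANDs the three bit vectors belonging to the three bloom bits
    // of one address/topic; "1" bits in the result are potential matches within
    // the 4096-block section.
    func matchSection(v0, v1, v2 []byte) []byte {
        res := make([]byte, len(v0))
        for i := range res {
            res[i] = v0[i] & v1[i] & v2[i]
        }
        return res
    }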

Implementation and design rationale of the matcher logic

  • Pipelined structure

The matcher was designed with the needs of both full and light nodes in mind. A simpler architecture would probably be satisfactory for full nodes (where the bit vectors are available in the local database), but the network retrieval bottleneck of light clients justifies a more sophisticated algorithm that tries to minimize the amount of retrieved data and return results as soon as possible. The current implementation is a pipelined structure based on input and output channels (receiving section indexes and sending potential matches). The matcher is built from sub-matchers, one for the addresses and one for each topic group. Since we are only interested in matches that every sub-matcher signals as positive, they are daisy-chained so that each subsequent sub-matcher only retrieves and matches the bit vectors of sections where the previous matchers have found a potential match. The "1" bits of the output of the last sub-matcher are returned as bloom filter matches.
Sub-matchers use a set of fetchers to retrieve the streams of bit vectors belonging to certain bloom filter bit indexes. Fetchers are also pipelined, receiving a stream of section indexes and sending a stream of bit vectors. As soon as every fetcher has returned the bit vector belonging to a certain section index, the sub-matcher performs the necessary binary AND and OR operations and outputs the resulting vector together with its section index, provided the vector is not all zeroes.
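
For illustration only, a hedged Go sketch of one such daisy-chained sub-matcher; the type and function names are assumptions rather than the PR's API, and fetching is shown synchronously for brevity (the actual design streams bit vectors through pipelined fetchers):

    // partialMatch carries a section index and the AND of all bit vectors
    // matched so far by the preceding sub-matchers.
    type partialMatch struct {
        section uint64
        vector  []byte // 512 bytes for a 4096-block section
    }

    // subMatch handles one filter position (the address list or one topic
    // position). bits holds the three bloom bit indexes of every acceptable
    // value at this position; alternatives are OR-ed, the result is AND-ed
    // with the incoming vector, and sections are forwarded only if any bit
    // survived.
    func subMatch(in <-chan partialMatch, bits [][3]uint, fetch func(bit uint, section uint64) []byte) <-chan partialMatch {
        out := make(chan partialMatch)
        go func() {
            defer close(out)
            for pm := range in {
                orVec := make([]byte, len(pm.vector))
                for _, triple := range bits {
                    v0 := fetch(triple[0], pm.section)
                    v1 := fetch(triple[1], pm.section)
                    v2 := fetch(triple[2], pm.section)
                    for i := range orVec {
                        orVec[i] |= v0[i] & v1[i] & v2[i]
                    }
                }
                res := make([]byte, len(pm.vector))
                nonZero := false
                for i := range res {
                    res[i] = pm.vector[i] & orVec[i]
                    if res[i] != 0 {
                        nonZero = true
                    }
                }
                if nonZero {
                    out <- partialMatch{pm.section, res}
                }
            }
        }()
        return out
    }

Chaining subMatch stages, with the address group first and each topic group after it, reproduces the daisy-chaining described above: later stages only ever fetch vectors for sections that survived the earlier ones.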

  • Prioritizing and batching requests

Light clients retrieve the bit vectors with merkle proofs, which makes it much more efficient to retrieve batches of vectors (whose merkle proofs share most of their trie nodes) in a single request. It is also preferable to prioritize requests based on their section index (regardless of bit index) so that matches are found and returned as soon as possible (and in sequential order). Prioritizing and batching are realized by a common request distributor that receives individual bit index/section index requests from the fetchers and keeps an ordered list of section indexes to be requested, grouped by bit index. It does not call any retrieval backend function itself; instead it is driven by a "server" process (see serveMatcher in filter.go): NextRequest returns the next batch to be requested, and retrieved vectors are handed back through the Deliver function. This way the bloombits package only has to care about implementing the matching logic, while the caller retains full control over the resources (CPU/disk/network bandwidth) assigned to the task.
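
A minimal sketch of such a server loop on the caller side; the NextRequest/Deliver names follow the description above, but their exact signatures and the matcherAPI interface are assumptions made for this sketch:

    // matcherAPI captures the two distributor calls described above; the exact
    // signatures are assumed, not taken from the PR.
    type matcherAPI interface {
        NextRequest() (bit uint, sections []uint64) // next prioritized batch for one bit index
        Deliver(bit uint, sections []uint64, vectors [][]byte)
    }

    // serveMatcher drives the matcher with a caller-supplied retrieval backend:
    // database reads for a full node, batched merkle-proven requests for a
    // light client. The matcher itself never touches the backend directly.
    func serveMatcher(m matcherAPI, fetchVectors func(bit uint, sections []uint64) ([][]byte, error), stop <-chan struct{}) error {
        for {
            select {
            case <-stop:
                return nil
            default:
            }
            bit, sections := m.NextRequest()
            if len(sections) == 0 {
                return nil // nothing left to retrieve
            }
            vectors, err := fetchVectors(bit, sections)
            if err != nil {
                return err
            }
            m.Deliver(bit, sections, vectors)
        }
    }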

@mention-bot

mention-bot commented Mar 6, 2017

@zsfelfoldi, thanks for your PR! By analyzing the history of the files in this pull request, we identified @obscuren, @Arachnid and @fjl to be potential reviewers.

@zsfelfoldi zsfelfoldi requested review from fjl and karalabe Mar 6, 2017

@zsfelfoldi zsfelfoldi added this to the 1.6.0 milestone Mar 6, 2017

@karalabe
Member

karalabe commented Apr 5, 2017

I took a quick peek at this PR, but I have some issues with it in general.

It adds a lot of stuff all over core, which it then disables as "experimental". I honestly don't like this approach. This would introduce a ton of tech debt that someone will have to clean up later in the name of experimentalness.

If this approach is superior to the current mip-map blooms that we have, we should simply swap it out as is with the one proposed in the current PR, ready for production. We already have an experimental discovery v5 that kind of works, but isn't actually usable outside of the light client; we have the light client itself, which kind of works, but isn't really production ready; and now this PR would mix in another layer of experimental stuff that kind of works, but doesn't actually.

My proposal for moving forward with this PR is to:

  • Rebase on top of current master, since the code has diverged.
  • Dump the old mipmap bloom stuff and fully replace it with the proposal here.
  • Do an actual import benchmark to see how the database behaves with the live chain.

If performance is good, we can polish the code to be production ready and enable it for everyone. There's no point in adding further experimental stuff into geth, we have way too much already.

@zsfelfoldi zsfelfoldi removed this from the 1.6.0 milestone Apr 6, 2017

@zsfelfoldi zsfelfoldi requested a review from Arachnid Apr 25, 2017

@zsfelfoldi
Contributor

zsfelfoldi commented Apr 27, 2017

I ran some benchmarks testing different section sizes and compression methods:
https://github.com/zsfelfoldi/go-ethereum/commits/bloombits-bench
https://gist.github.com/zsfelfoldi/ed2991f90614b7f9f6e5368bc6d8faa8

Notes:

  • compression method 0 means no compression at all, 1 is my own algorithm, 2 is Google Snappy (see the sketch after this list)
  • the additional running costs of the pipelined matcher start to become noticeable at or below a 1k section size
  • even after we choose a section size, the execution speed of both the matcher and my compression could be further optimized if necessary
  • searching the last few blocks (fewer than sectionSize+512 blocks) with regular bloom filtering takes a considerable amount of time compared to the fast filtering, so the ideal section size is definitely not larger than 4k (I'd probably choose 2k or 4k)
  • for light client mode I will probably want to use a higher section size and my own compression because it is approx. 3x more efficient here than snappy. For full clients we could consider using snappy (it is more standard and also a bit faster than the current naive implementation of my algorithm), but I think it might be easier to use the same algorithm everywhere.
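
As a rough illustration of how the snappy side of that comparison could be reproduced, a sketch that compresses a single 512-byte section bit vector with github.com/golang/snappy; the bit density is an arbitrary assumption, not a value from the benchmarks above:

    package main

    import (
        "fmt"
        "math/rand"

        "github.com/golang/snappy"
    )

    func main() {
        // One 4096-bit (512-byte) section vector; bloom bits are sparse, so only
        // a small fraction of the bits are set here (assumed density).
        vector := make([]byte, 512)
        for i := 0; i < 100; i++ {
            bit := rand.Intn(4096)
            vector[bit/8] |= 1 << uint(7-bit%8)
        }
        compressed := snappy.Encode(nil, vector)
        fmt.Printf("snappy: %d -> %d bytes (%.2fx)\n",
            len(vector), len(compressed), float64(len(vector))/float64(len(compressed)))
    }
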
@karalabe
Member

karalabe commented Apr 27, 2017

Regarding snappy vs. your own, I'm leaning towards snappy tbh, for the simple practical reason that we don't have to maintain it. How large is the difference between yours and snappy? And why do you think yours is better for the light client? (I mean, we can do yours too, it just needs to be worth the extra effort.)

@zsfelfoldi
Contributor

zsfelfoldi commented Apr 27, 2017

Mine produces a 3x better compression ratio, which is really significant for the light client use case. If they produced similar results, I'd definitely go with snappy too. Luckily, mine is extremely simple and easy to specify.

@zsfelfoldi
Contributor

zsfelfoldi commented Apr 27, 2017

See the compression ratios in the benchmark output.

@Arachnid

Review of just the first commit

"time"
)
// ChainProcessor is an external process that creates auxiliary data structures

@Arachnid

Arachnid May 2, 2017
Contributor

You should probably add a note that chain processors can't call any methods on Blockchain that use this lock, or we'll get a deadlock.

@zsfelfoldi

zsfelfoldi May 5, 2017
Contributor

Added in #14431

        childProcessors []ChainProcessor
    }
    // ChainSectionProcessorBackend does the actual post-processing job.

@Arachnid

Arachnid May 2, 2017
Contributor

Can you document here what the methods are expected to do, and the meaning of the arguments and return values?

@zsfelfoldi

zsfelfoldi May 5, 2017
Contributor

Added in #14431
