Conversation
A couple of questions.
I wrote about this here: Adaptive Enhanced Bloom Filters.
ethcore/src/blockchain/blockchain.rs
Outdated
@@ -1459,8 +1459,7 @@ mod tests {
 use std::sync::Arc;
 use rustc_hex::FromHex;
 use hash::keccak;
-use kvdb::{KeyValueDB, DBTransaction};
-use kvdb_memorydb;
+use kvdb::{DBTransaction};
Remove the braces around a single import
@@ -33,7 +33,7 @@ use snapshot::io::{SnapshotReader, PackedWriter, PackedReader};
 use tempdir::TempDir;
 use rand::Rng;

-use kvdb::{KeyValueDB, DBValue};
+use kvdb::{DBValue};
Remove the braces around a single import
@@ -28,8 +27,7 @@ use snapshot::io::{PackedReader, PackedWriter, SnapshotReader, SnapshotWriter};

 use parking_lot::Mutex;
 use snappy;
-use kvdb::{KeyValueDB, DBTransaction};
-use kvdb_memorydb;
+use kvdb::{DBTransaction};
Remove the braces around a single import
@@ -24,11 +24,11 @@ use ids::BlockId;
 use snapshot::service::{Service, ServiceParams};
 use snapshot::{self, ManifestData, SnapshotService};
 use spec::Spec;
-use test_helpers::{self, generate_dummy_client_with_spec_and_data};
+use test_helpers::{generate_dummy_client_with_spec_and_data};
Remove the braces around a single import
 use test_helpers_internal::restoration_db_handler;

 use io::IoChannel;
-use kvdb_rocksdb::{Database, DatabaseConfig};
+use kvdb_rocksdb::{DatabaseConfig};
Remove the braces around a single import
ethcore/src/test_helpers_internal.rs
Outdated
@@ -19,7 +19,7 @@
 use std::path::Path;
 use std::sync::Arc;
 use parking_lot::RwLock;
-use kvdb::{KeyValueDB, KeyValueDBHandler};
+use kvdb::{KeyValueDB};
Remove the braces around a single import
ethcore/src/trace/db.rs
Outdated
@@ -108,18 +104,15 @@ impl<T> TraceDB<T> where T: DatabaseExtras {
let current_size = self.cache_size();

let mut traces = self.traces.write();
//let mut blooms = self.blooms.write();
let mut cache_manager = self.cache_manager.write();

cache_manager.collect_garbage(current_size, |ids| {
	for id in &ids {
		match *id {
Use `if let` instead of `match` on a single pattern?
@niklasad1 this PR is still in progress and quite far from being ready for review :)
They will, especially the top-level blooms. But the cost of checking them is so low that it doesn't matter. Currently we use exactly the same bloom filters, with three layers, for filtering events. The bottleneck is RocksDB IO, and this PR is trying to address only that problem. :)
Yes, that's a possible improvement.
You can use transaction traces exactly for that. They already utilize the same bloom filters.
The reason I'm suggesting adding every address to the bloom is to avoid having to request traces. If one is trying to do a full accounting of a particular address (or group of addresses), one needs to visit every trace, but if the blooms included all addresses involved in a block, one could use them to avoid a huge number of trace requests. Requesting traces (especially across the 2016 DDoS transactions) is brutal. Did you have a chance to glance at the paper I mentioned above? With that method, I can collect full lists of transactions per account about 100 times faster than by reading every trace. The current blooms don't include every address, so they miss a lot of blocks if one wants a full list of transactions.

Also, one thing I found was that because the block-level blooms "roll up" the transaction-level blooms, two unfortunate things happen: (1) the block-level blooms are becoming saturated, so they report more false positives than you might want (this is exacerbated in the hierarchical scheme you suggest if the mid- and upper-level blooms are not widened), and (2) the receipt-level blooms are way undersaturated and therefore take up a ton of space that isn't well utilized.
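For reference on the saturation point, the standard Bloom-filter false-positive estimate p = (1 - e^(-k*n/m))^k makes the effect easy to quantify. A quick sketch (m and k follow Ethereum's 2048-bit, 3-index log blooms; the per-block entry count is an assumed figure for illustration, not measured data):

```rust
// Standard Bloom-filter false-positive estimate: p = (1 - e^(-k*n/m))^k.
// Used here only to illustrate the saturation concern: rolling 16 blocks'
// worth of entries into one bloom of the same width sharply increases the
// false-positive rate.
fn false_positive_rate(m: f64, k: f64, n: f64) -> f64 {
    (1.0 - (-k * n / m).exp()).powf(k)
}

fn main() {
    // m and k follow Ethereum's 2048-bit, 3-index log blooms;
    // per_block is an assumption for illustration.
    let (m, k) = (2048.0, 3.0);
    let per_block = 100.0;
    let one_block = false_positive_rate(m, k, per_block);
    let rolled_up = false_positive_rate(m, k, 16.0 * per_block);
    assert!(rolled_up > one_block);
    println!("1 block: {:.4}, 16 blocks rolled up: {:.4}", one_block, rolled_up);
}
```

With these assumed numbers the rolled-up bloom's false-positive rate is orders of magnitude higher than a single block's, which is the saturation effect described above.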
…, cause fs::File is just a shared file handle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's ready for the review
fn open(&self, db_path: &Path) -> Result<Arc<BlockChainDB>, Error> {
	let key_value = Arc::new(kvdb_rocksdb::Database::open(&self.config, &db_path.to_string_lossy())?);
	let blooms_path = db_path.join("blooms");
	let trace_blooms_path = db_path.join("trace_blooms");
I'm not sure if this is a good location for the blooms db. I'm open to any other suggestions
@@ -1293,10 +1294,11 @@ impl snapshot::DatabaseRestore for Client {
 let mut tracedb = self.tracedb.write();
 self.importer.miner.clear();
 let db = self.db.write();
-db.restore(new_db)?;
+db.key_value().restore(new_db)?;
because blooms-db is placed in the same directory, it also moves it. Maybe we should move this function out of the KeyValueDB interface to make it more descriptive
ethcore/src/client/client.rs
Outdated
// TODO: restore blooms properly |
ah, I forgot to reopen the blooms db, will fix it
@debris Fixed? If so remove the comment 👋
not yet, that's the only remaining thing :)
done!
// Some(3) -> COL_EXTRA
// 3u8 -> ExtrasIndex::BlocksBlooms
// 0u8 -> level 0
let blooms_iterator = db.key_value()
from ethcore/src/blockchain/extras.rs
// Some(4) -> COL_TRACE
// 1u8 -> TraceDBIndex::BloomGroups
// 0u8 -> level 0
let trace_blooms_iterator = db.key_value()
from ethcore/src/trace/db.rs
LGTM overall. Just not sure about some aspects regarding the database consistency issues.
});

for (number, blooms) in blooms_iterator {
	db.blooms().insert_blooms(number, blooms.iter())?;
`insert_blooms` is not atomic. If any of the `top`, `mid`, or `bot` file writes fails, the blooms db basically enters a corrupt state -- we will have top-level blooms saying something exists that cannot be found in `mid` or `bot` (a true negative). So I think it may be better to just panic with `expect` here, like we do in blockchain.rs?
I don't think it matters, even if it ends up in a corrupt state. On the next parity launch the migration will start from scratch; corrupted data will be overwritten and the database state will be consistent.
// constant forks may lead to an increased ratio of false positives in bloom filters
// since we do not rebuild the top or mid levels, but we should not be worried about that:
// most of the time, events at block n(a) occur also at block n(b) or n+1(b)
self.top.accrue_bloom::<ethbloom::BloomRef>(pos.top, bloom)?;
Regarding the possible blooms-db true-negative corruption: given that we only insert one bloom at a time, I think a simple solution may be to create a metadata file storing the current bloom write position before attempting the insertion. Then, if we hit any error case below, the next time the program starts we would only need to rewrite that particular bloom position to fix the corruption.

This doesn't solve the disk consistency issue, though; solving that would require a `flush` within the for loop, which may be a bad performance penalty. So I'm not sure whether we would want this.
> Then, if we find any error case below, next time the program starts, we would only need to rewrite that particular bloom position to fix the corruption.
That's what I did initially. Then I thought about the problem again and realised that none of this matters. The existence of blooms is not consensus critical. If one of the writes fails (or partially fails), the particular block will be reimported on the next launch of parity and the database will overwrite the corrupted data.
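To make the argument concrete, here is a toy sketch of the layered lookup (u64 bitmasks stand in for real 256-byte blooms; all names are illustrative, not the actual blooms-db code). The `bot` check is authoritative, so a stale or over-saturated `top`/`mid` bloom can only cost extra reads one level down, never wrong results:

```rust
// Toy "bloom": a query matches if all of its bits are set in the stored value.
fn contains(stored: u64, query: u64) -> bool {
    stored & query == query
}

/// Block numbers in [from, to] whose bottom bloom matches `query`.
/// A false positive in `top` or `mid` just means we scan one level lower.
fn filter(bot: &[u64], mid: &[u64], top: &[u64], query: u64, from: usize, to: usize) -> Vec<usize> {
    let mut hits = Vec::new();
    let mut n = from;
    while n <= to {
        if !contains(top[n / 256], query) {
            n = (n / 256 + 1) * 256; // skip the whole 256-block group
        } else if !contains(mid[n / 16], query) {
            n = (n / 16 + 1) * 16; // skip the 16-block group
        } else {
            if contains(bot[n], query) {
                hits.push(n); // authoritative bottom-level check
            }
            n += 1;
        }
    }
    hits
}

fn main() {
    // 32 blocks, with mid/top built by OR-ing ("accruing") 16 lower blooms.
    let mut bot = vec![0u64; 32];
    bot[5] = 0b1011;
    bot[20] = 0b0011;
    let mid: Vec<u64> = bot.chunks(16).map(|c| c.iter().fold(0u64, |a, b| a | b)).collect();
    let top = vec![mid.iter().fold(0u64, |a, b| a | b)];
    assert_eq!(filter(&bot, &mid, &top, 0b0011, 0, 31), vec![5, 20]);
}
```

Even if `top` or `mid` were never rebuilt after a fork, the worst case is a few wasted lower-level reads, which matches the reasoning above.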
///
/// This database does not guarantee atomic writes.
pub struct Database {
	database: Mutex<db::Database>,
Any reason we use `Mutex` and not `RwLock`? I think it would be safe for multiple parties to read blooms at the same time?
Great question! There is a reason, https://users.rust-lang.org/t/how-to-handle-match-with-irrelevant-ok--/6291/15
tl;dr: Reading a file mutates the object that represents the file stream, namely by altering its position. The data being mutated is stored in kernel space, and simultaneous mutation won't cause the kernel any trouble because it is smart enough to synchronize accesses. Still, two threads simultaneously reading the same file stream will break internal logic. (I guess even a single read() call on a BufReader could yield a non-contiguous slice.)
That's why I made the iterator function of `db::Database` `mut`.
Ah got it!
Please add a ✔️ tick if you think this looks good.
ethcore/src/blockchain/blockchain.rs
Outdated
@@ -152,7 +169,8 @@ pub trait BlockProvider {
 }

 /// Returns numbers of blocks containing given bloom.
-fn blocks_with_bloom(&self, bloom: &Bloom, from_block: BlockNumber, to_block: BlockNumber) -> Vec<BlockNumber>;
+fn blocks_with_bloom<'a, B, I, II>(&self, blooms: II, from_block: BlockNumber, to_block: BlockNumber) -> Vec<BlockNumber>
+	where BloomRef<'a>: From<B>, II: IntoIterator<Item = B, IntoIter = I> + Copy, I: Iterator<Item = B>, Self: Sized;
I think that each trait bound deserves its own line (to make it easier to read), e.g.:
where
BloomRef<'a>: From<B>,
II: IntoIterator<Item = B, IntoIter = I> + Copy,
I: Iterator<Item = B>,
Self: Sized;
ethcore/src/blockchain/blockchain.rs
Outdated
.map(|b| b as BlockNumber)
.collect()
fn blocks_with_bloom<'a, B, I, II>(&self, blooms: II, from_block: BlockNumber, to_block: BlockNumber) -> Vec<BlockNumber>
	where BloomRef<'a>: From<B>, II: IntoIterator<Item = B, IntoIter = I> + Copy, I: Iterator<Item = B> {
I think that each trait bound deserves its own line (to make it easier to read), e.g.:
where
BloomRef<'a>: From<B>,
II: IntoIterator<Item = B, IntoIter = I> + Copy,
I: Iterator<Item = B>,
Self: Sized;
ethcore/src/snapshot/service.rs
Outdated
@@ -622,7 +621,8 @@ impl Service {

match is_done {
	true => {
		db.flush().map_err(UtilError::from)?;
		// TODO: flush also blooms?
As I understand it, the API doesn't provide flushing, but it does flush every time new blooms are inserted?! If so, clarify the comment; alternatively, change the API and actually flush!
that's an obsolete comment ;)
Yes, this db doesn't provide flushing, because writes are non-atomic and happen immediately when someone calls `insert`
ethcore/src/trace/db.rs
Outdated
// config,
bloom_config: BloomConfig,
// tracing enabled
cache_manager: RwLock<CacheManager<H256>>,
Missing docs for cache_manager
ahh... it's not a public field and no change has been made here :p but I'll add a description ;p
ethcore/src/trace/db.rs
Outdated
for key in blooms_keys {
	self.note_used(CacheId::Bloom(key));
}
// TODO: replace it with database for trace blooms
Is this supposed to be fixed in this PR?
oh, yes, obsolete comment 😅
ethcore/src/trace/db.rs
Outdated
let numbers = chain.filter(filter);
let possibilities = filter.bloom_possibilities();
let numbers = self.db.trace_blooms()
	.filter(filter.range.start as u64, filter.range.end as u64, &possibilities).expect("TODO: blooms pr");
If this is fixed, change the `expect` message!
let blooms = rlp::decode_list::<Bloom>(&group);
(number, blooms)
});
This code is identical, except for the variables, to #L38-#L52; consider moving it to a function instead?
it's not exactly the same; notice that the endianness is completely different. Also, the function would take 4 parameters and imo would be less readable than what we have here. As this is a migration, this code will never be modified again
Ok, makes sense then NVM!
util/blooms-db/Cargo.toml
Outdated
[package]
name = "blooms-db"
version = "0.1.0"
authors = ["debris <marek.kotewicz@gmail.com>"]
This should be admin@parity.io?
😅
Forgot one thing: also add a license. I guess it should be: license = "GPL-3.0"
I believe it doesn't matter for subcrates :)
actually, you are right. We usually have a `license` field so I'll also add it :)
Overall, a really high-quality PR but some minor things to fix!
@niklasad1 @sorpaas I addressed grumbles. Can you review the PR again and mark it as accepted (if it's good) :)
also, I merged
👍
* 'master' of https://github.com/paritytech/parity:
  - new blooms database (openethereum#8712)
  - ethstore: retry deduplication of wallet file names until success (openethereum#8910)
  - Update ropsten.json (openethereum#8926)
* master: new blooms database (#8712)
This PR introduces a new database for header blooms.

Why?

The only time we access blooms is when we are filtering logs. Filtering logs requires iterating over all blooms in the database, and this is very inefficient with RocksDB. As the blockchain got bigger and bigger, it took more and more time to filter all blooms starting from the genesis block.

How?

I built an append-only, sequential database for storing blooms. It consists of three layers: `top`, `mid`, `bot`.

- the `bot` layer contains all header blooms
- the `mid` layer contains blooms created from 16 consecutive blooms of the `bot` layer
- the `top` layer contains blooms created from 16 consecutive blooms of the `mid` layer

Blooms at each layer are placed next to each other on disk, so it is very efficient, O(1)~, to find and iterate over them. It's a huge improvement compared to the previous database implementation, where every lookup and step of iteration was probably O(log n).

Size of the database (for 1,000,000 blocks):

- `bot` layer: 256 MB (1,000,000 * size_of_bloom (256 bytes))
- `mid` layer: 16 MB (`bot` / 16)
- `top` layer: 1 MB (`mid` / 16)

Other issues addressed

This PR also addresses "Blooms recomputation in case of a fork is very slow" #8552 by simply not recomputing blooms at higher levels. Why? Because in case of a fork, it's likely that the same transaction on a different chain will be included in a block of the same number or in one of the following blocks. As long as the distance between the block on chain a and the block on chain b is on average significantly lower than 128, the false-positive ratio of the bloom filters should not increase. And even if the `top` and `mid` blooms return a false positive, the underlying bottom bloom is replaced with a new one, so this will not lead to incorrect results.

https://github.com/paritytech/parity/blob/343b29866c4cc592afa9038995217b6ba99e70fa/util/blooms-db/src/db.rs#L96-L98

This PR also addresses all issues with bloom recomputation on forks: recomputation does not happen and there is no in-memory cache, so blooms are always valid.
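Because each layer is a flat file of fixed-size blooms, lookups reduce to offset arithmetic. A minimal sketch of how the byte positions for a given block number could be derived under the layout described above (256-byte blooms, 16-fold grouping; function and constant names are illustrative, not the actual blooms-db API):

```rust
const BLOOM_SIZE: u64 = 256; // bytes per bloom on disk
const GROUP: u64 = 16;       // lower-level blooms accrued per upper-level bloom

/// Byte offsets into the bot/mid/top files of the blooms covering block `n`.
fn positions(n: u64) -> (u64, u64, u64) {
    let bot = n;                   // one bot bloom per block
    let mid = n / GROUP;           // one mid bloom per 16 blocks
    let top = n / (GROUP * GROUP); // one top bloom per 256 blocks
    (bot * BLOOM_SIZE, mid * BLOOM_SIZE, top * BLOOM_SIZE)
}

fn main() {
    // block 1000: its mid bloom covers blocks 992..1008, its top bloom 768..1024
    assert_eq!(positions(1000), (256_000, 15_872, 768));
    assert_eq!(positions(0), (0, 0, 0));
}
```

Constant-time offsets like these, instead of RocksDB key lookups, are what the O(1)~ claim above refers to.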
todo list (in this pr)

- `BlockChainDB`)
- `BlockChainDB` should not expose `&RwLock` in the interface
- `bloomchain` crate
- benchmarks
So far I have run only a single benchmark for blooms-db. The results are promising, but I believe they do not fully show how much everything has improved.

benchmark from old stackoverflow thread

old parity:

this branch:

This PR should also reduce disk writes by roughly 70 GB when doing a full sync.