Check for incorrect hash value on snapshot ingest (v2) #7559

Merged
merged 3 commits into from Dec 20, 2019

Conversation

@ryoqun
Member

ryoqun commented Dec 19, 2019

(This PR is based on @sakridge's #7427)

Problem

On snapshot ingestion, nothing checks that the hashed account data matches the hash value stored in the append vec.

Summary of Changes

  • Preparatory refactorings (done by @ryoqun):
    • The verification function now returns an enum instead of a bool, which makes it easy to unit-test each bad case individually. I think this prudence is warranted here: these are corner cases that are rarely exercised in production, so bit rot and silent false positives or false negatives are quite possible.
    • Because of the new enum, I also clarified some names so that each member of the enum has a logically clear name. Otherwise, either the new names would look foreign to the rest of the code, or names based on the status quo would be too confusing.
  • Re-hash the account data and check for differences (from @sakridge's original PR):
    • Additionally, this PR modifies the original logic: it makes the added check actually work, and it bails out early when the new check fails, which avoids wasted follow-up calculation and thereby fixes low security risks.
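The changes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `VerifyError`, `StoredAccount`, `hash_data`, and `verify` are hypothetical names, std's `DefaultHasher` stands in for the real account hash, and XOR stands in for however the per-account hashes are combined.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical error enum: one variant per bad case, so each can be unit-tested.
#[derive(Debug, PartialEq)]
pub enum VerifyError {
    MismatchedAccountHash,
    MismatchedBankHash,
    MissingBankHash,
}

/// Toy stored account: the data plus the hash recorded when it was stored.
pub struct StoredAccount {
    pub data: Vec<u8>,
    pub hash: u64,
}

/// Stand-in hash function (an assumption, not the real account hash).
pub fn hash_data(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Re-hash every account, bail out on the first mismatch, then compare the
/// combined hash against the expected one.
pub fn verify(
    accounts: &[StoredAccount],
    expected_bank_hash: Option<u64>,
) -> Result<(), VerifyError> {
    let mut combined = 0u64;
    for account in accounts {
        let rehashed = hash_data(&account.data);
        if rehashed != account.hash {
            // Early return: no further hashing once corruption is detected.
            return Err(VerifyError::MismatchedAccountHash);
        }
        combined ^= rehashed;
    }
    match expected_bank_hash {
        None => Err(VerifyError::MissingBankHash),
        Some(expected) if expected == combined => Ok(()),
        Some(_) => Err(VerifyError::MismatchedBankHash),
    }
}
```

With a bool return, all three failure cases would collapse into `false`, and a test could pass for the wrong reason; with the enum, a test can assert on the exact variant.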

Naming changes

1. internal_state => bank_hash at the AccountsDB layer:

The word internal_state originates from the bank module's Bank::{hash,verify_hash}_internal_state and is used only here. For Bank the names are fine as they are, because Bank calculates (= hash-ing, verb) its own state, which is internal from the viewpoint of the other layers, and verifies the result of that calculation (= hash, noun) itself.

But internal_state isn't a clear-cut fit for AccountsDB. The same word is used for different things: AccountsDB calls the BankHash it returns its "internal_state", while Bank calls the Hash it returns, which includes AccountsDB's BankHash, its "internal_state". This is confusing, and AccountsDB should just use the name bank_hash. For the reasons given in the following section, I chose it over the other candidate (= slot_hash).

In my opinion, using exactly the same names in different layers is a bit confusing unless the layers purely delegate work and forward args and concepts, or the names denote concepts that are well defined across layers and components. Neither is true for internal_state.

Finally, introducing bank_hash aligns cleanly with account.hash (= account_hash) in the new enum.

2. slot_hashes => bank_hashes at the AccountsDB API

Sadly, SlotHashes/slot_hashes is already taken and established as a sysvar. Both components use the same words "slot hashes", yet the definitions differ: Bank/AccountsDB's slot_hash is one of the inputs to the sysvar's "slot hash". This is confusing at best. So, just use the straightforward name derived from the type name BankHash here.

I avoided renaming the AccountsDB::slot_hashes field to AccountsDB::bank_hashes, as it would introduce too much noise into this PR, which is already a bit sizable. I'll do a quick dumb-replace PR after this one.

Part of #7167

sakridge and others added 2 commits Nov 22, 2019
@@ -7,9 +7,10 @@ use std::fmt;
use std::mem;
use std::str::FromStr;

pub const HASH_BYTES: usize = 32;

@ryoqun

ryoqun Dec 19, 2019

Author Member

A tiny nicety. This aligns with BANK_HASH_BYTES in bank_hash.rs.


fn store_with_hashes(&self, slot_id: Slot, accounts: &[(&Pubkey, &Account)], hashes: &[Hash]) {

@ryoqun

ryoqun Dec 19, 2019

Author Member

Extracted into new fn purely for the purpose of unit-testing.
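A rough sketch of why such an extraction helps testing, under stated assumptions: `Store`, `toy_hash`, and `all_hashes_match` are hypothetical, and only the `store` / `store_with_hashes` split mirrors the shape of the change. The normal path computes the hashes itself, while the extracted function lets a test inject deliberately wrong ones.

```rust
/// Toy account store; each entry is (account data, recorded hash).
#[derive(Default)]
pub struct Store {
    entries: Vec<(Vec<u8>, u64)>,
}

/// Stand-in hash (an assumption, not the real account hash).
pub fn toy_hash(data: &[u8]) -> u64 {
    data.iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(u64::from(b)))
}

impl Store {
    /// Normal path: compute the hashes, then delegate.
    pub fn store(&mut self, accounts: &[Vec<u8>]) {
        let hashes: Vec<u64> = accounts.iter().map(|a| toy_hash(a)).collect();
        self.store_with_hashes(accounts, &hashes);
    }

    /// Extracted path: the caller supplies the hashes, so a test can pass bad ones.
    pub fn store_with_hashes(&mut self, accounts: &[Vec<u8>], hashes: &[u64]) {
        for (account, hash) in accounts.iter().zip(hashes) {
            self.entries.push((account.clone(), *hash));
        }
    }

    /// Re-hash every entry and compare with the recorded hash.
    pub fn all_hashes_match(&self) -> bool {
        self.entries.iter().all(|(data, hash)| toy_hash(data) == *hash)
    }
}
```

A test can then call `store_with_hashes` with a wrong hash and assert that verification fails, something the `store` path alone can never produce.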

@@ -379,7 +384,6 @@ pub struct AccountsDB {
/// the accounts
min_num_stores: usize,

/// slot to BankHash and a status flag to indicate if the hash has been initialized or not

@ryoqun

ryoqun Dec 19, 2019

Author Member

A bit aggressive, but I deleted this comment as it's outdated and no longer adds any information.

pub storage: RwLock<AccountStorage>,

/// distribute the accounts across storage lists
pub next_id: AtomicUsize,

/// write version

@ryoqun

ryoqun Dec 19, 2019

Author Member

A bit aggressive here, but I removed this comment as it adds no information beyond the commented code itself. Let's use fewer comments and remove the ones that add nothing, as it's all too common for engineers (including me!) to forget to keep them up to date, as happened here.

@@ -354,13 +361,11 @@ pub struct AccountsDB {
/// Keeps tracks of index into AppendVec on a per slot basis
pub accounts_index: RwLock<AccountsIndex<AccountInfo>>,

/// Account storage

@ryoqun

ryoqun Dec 19, 2019

Author Member

Removed for the same reason as this.

@@ -175,6 +175,13 @@ pub enum AccountStorageStatus {
Candidate = 2,
}

#[derive(Debug)]
pub enum BankHashVerificatonError {

@ryoqun

ryoqun Dec 19, 2019

Author Member

In my opinion, the naming here is very clear after all the hassle of the naming changes in this PR. :)

let hash = BankHash::from_hash(&Self::hash_account(slot, &account, pubkey));
let hash = Self::hash_account(slot, &account, pubkey);
if hash != account.hash {
*mismatch_found = true;

@ryoqun

ryoqun Dec 19, 2019

Author Member

Originally this assigned false, but that didn't work because the Default of bool is false... So I inverted the boolean value.
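The pitfall can be reproduced in isolation. The sketch below assumes the flag is an accumulator that a scan helper creates via `Default`; `scan`, `all_match_broken`, and `mismatch_found` are hypothetical names for illustration only.

```rust
/// Hypothetical scan helper: folds items into an accumulator created via `Default`.
pub fn scan<A: Default>(items: &[u32], mut visit: impl FnMut(u32, &mut A)) -> A {
    let mut acc = A::default();
    for &item in items {
        visit(item, &mut acc);
    }
    acc
}

/// Broken: the "ok" flag starts as `false` (bool's default), so assigning
/// `false` on a bad item changes nothing and the result is always `false`.
pub fn all_match_broken(items: &[u32], bad: u32) -> bool {
    scan(items, |item, ok: &mut bool| {
        if item == bad {
            *ok = false;
        }
    })
}

/// Fixed: invert the meaning so that the default `false` means "nothing found yet".
pub fn mismatch_found(items: &[u32], bad: u32) -> bool {
    scan(items, |item, found: &mut bool| {
        if item == bad {
            *found = true;
        }
    })
}
```

`all_match_broken` returns `false` whether or not a bad item is present, so it carries no signal; `mismatch_found` returns `true` only when one is.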

*mismatch_found = true;
}
if *mismatch_found {
return;

@ryoqun

ryoqun Dec 19, 2019

Author Member

Added by me: bail out early, as the PR description explains.

@@ -969,28 +973,49 @@ impl AccountsDB {
datapoint_info!("accounts_db-stores", ("total_count", total_count, i64));
}

pub fn verify_hash_internal_state(&self, slot: Slot, ancestors: &HashMap<Slot, usize>) -> bool {
let mut hash_state = BankHash::default();

@ryoqun

ryoqun Dec 19, 2019

Author Member

Moved to its use site and renamed.

@@ -2195,4 +2225,78 @@ pub mod tests {
"Account-based hashing must be consistent with StoredAccount-based one."
);
}

#[test]
fn test_verify_bank_hash() {

@ryoqun

ryoqun Dec 19, 2019

Author Member

As the coverage report from the CI will reveal, these tests achieve 100% coverage. :p

let ancestors = vec![(some_slot, 0)].into_iter().collect();

let accounts = &[(&key, &account)];
// update AccountsDB's hash state but discard real account hashes

@ryoqun

ryoqun Dec 19, 2019

Author Member

Oops, "hash state" should be written as "bank hash" to reflect the new naming...

@@ -476,7 +476,7 @@ impl Accounts {
}
}

pub fn hash_internal_state(&self, slot_id: Slot) -> BankHash {

@ryoqun

ryoqun Dec 19, 2019

Author Member

Changed from verb_object to noun_preposition style here, as it's really a getter fn that does no actual calculation.

pub fn verify_hash_internal_state(&self, slot: Slot, ancestors: &HashMap<Slot, usize>) -> bool {
self.accounts_db.verify_hash_internal_state(slot, ancestors)
pub fn verify_bank_hash(&self, slot: Slot, ancestors: &HashMap<Slot, usize>) -> bool {
self.accounts_db.verify_bank_hash(slot, ancestors).is_ok()

@ryoqun

ryoqun Dec 19, 2019

Author Member

At the very layer boundary of "Bank<=>AccountsDB", we discard the detailed error and encapsulate the result into a bool.
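In isolation, the pattern looks roughly like the following sketch. `Db`, `Facade`, and `VerifyError` are hypothetical stand-ins for the lower layer, the wrapping layer, and the error enum; only the use of `Result::is_ok` at the boundary mirrors the diff above.

```rust
/// Hypothetical detailed error, visible only inside the lower layer and its tests.
#[derive(Debug, PartialEq)]
pub enum VerifyError {
    Missing,
    Mismatch,
}

/// Stand-in for the lower layer, which reports exactly what went wrong.
pub struct Db {
    pub recorded: Option<u64>,
}

impl Db {
    pub fn verify(&self, expected: u64) -> Result<(), VerifyError> {
        match self.recorded {
            None => Err(VerifyError::Missing),
            Some(hash) if hash != expected => Err(VerifyError::Mismatch),
            Some(_) => Ok(()),
        }
    }
}

/// Stand-in for the wrapping layer, whose callers only need pass/fail.
pub struct Facade {
    pub db: Db,
}

impl Facade {
    pub fn verify(&self, expected: u64) -> bool {
        // The detailed error is dropped here, at the layer boundary.
        self.db.verify(expected).is_ok()
    }
}
```

The lower layer's unit tests can still assert on the exact variant, while the bool-returning signature that callers already use stays unchanged.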

@@ -1611,7 +1611,7 @@ impl Bank {
pub fn verify_hash_internal_state(&self) -> bool {
self.rc
.accounts
.verify_hash_internal_state(self.slot(), &self.ancestors)
.verify_bank_hash(self.slot(), &self.ancestors)

@ryoqun

ryoqun Dec 19, 2019

Author Member

At this place, we semantically translate verifying Bank's internal_state into verifying AccountsDB's bank_hash. So the outward rename propagation from AccountsDB ends here. :)

@ryoqun ryoqun requested a review from sakridge Dec 19, 2019
@codecov


codecov bot commented Dec 19, 2019

Codecov Report

Merging #7559 into master will decrease coverage by 12.2%.
The diff coverage is 82.6%.

@@            Coverage Diff            @@
##           master   #7559      +/-   ##
=========================================
- Coverage    77.7%   65.5%   -12.3%     
=========================================
  Files         244     245       +1     
  Lines       52151   61966    +9815     
=========================================
+ Hits        40556   40601      +45     
- Misses      11595   21365    +9770
Member

sakridge left a comment

lgtm; thanks @ryoqun

@mergify mergify bot dismissed sakridge’s stale review Dec 19, 2019

Pull request has been modified.

@ryoqun

Member Author

ryoqun commented Dec 20, 2019

lgtm; thanks @ryoqun

Thanks for the quick review! I'm merging this with a passing CI build that includes this really tiny commit made after the LGTM: 2eb146f.

@ryoqun ryoqun merged commit 3c361eb into solana-labs:master Dec 20, 2019
12 checks passed
Summary 1 rule matches and 3 potential rules
buildkite/solana Build #16928 passed (35 minutes, 41 seconds)
buildkite/solana/bench Passed (13 minutes, 20 seconds)
buildkite/solana/checks Passed (1 minute, 42 seconds)
buildkite/solana/coverage Passed (14 minutes, 27 seconds)
buildkite/solana/local-cluster Passed (19 minutes, 4 seconds)
buildkite/solana/move Passed (5 minutes, 12 seconds)
buildkite/solana/pipeline-upload Passed (2 seconds)
buildkite/solana/shellcheck Passed (29 seconds)
buildkite/solana/stable Passed (33 minutes, 51 seconds)
buildkite/solana/stable-perf Passed (13 minutes, 8 seconds)
ci-gate Pull Request accepted for CI pipeline
@ryoqun ryoqun changed the title Check account hashes in snapshot Check for incorrect hash value on snapshot ingest (v2) Dec 20, 2019