Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include rent_epoch and executable into account hash #7415

Merged
merged 3 commits into from Dec 12, 2019

Conversation

ryoqun
Copy link
Member

@ryoqun ryoqun commented Dec 11, 2019

Problem

rent_epoch and executable isn't included into the account hash calculation, thereby not into the bank hash too. This means those fields are falsifiable by malicious validator.

From git history, it seems that the two weren't included just due to oversight.

Summary of Changes

Include the two into the account hash calculation. Also, add rather a peculiar test for any future account's modification to be paired with account hash calculation update...

Part of #7167

*byte = (INPUT_LEN - i) as u8;
}

//UNSAFE: forcibly cast the special byte pattern to actual account fields.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first UNSAFE here. :) Is there any better way to archive similar test result?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat idea. An unsafe in a test is less concerning to me

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An unsafe in a test is less concerning to me

Thx! This will be a precedent for the follow-up PRs, in which we need more of these crafted binaries to protect validator from crashing from external data. :)

Self::hash_account_data(slot, account.lamports, &account.data, pubkey)
Self::hash_account_data(
slot,
account.lamports,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lamports and rent_epoch are both u64. So interchanging them caused no compile error. Spent some time to figure out why hashes are different. ;) So I added another assertion covering that kind of bug.

hasher.hash(&data);

if executable {
Copy link
Member Author

@ryoqun ryoqun Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to myself: when sanitizing executable in follow up PR, check the upper 7-bit are cleared.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reply to myself: yeah, I've done this correctly at #7464

};
let account = stored_account.clone_account();
let expected_account_hash =
Hash::from_str("GGTsxvxwnMsNfN6yYbBVQaRgvbVLfxeWnGXNyB8iXDyE").unwrap();
Copy link
Member Author

@ryoqun ryoqun Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to this and that, this will be stable really well justifying the magic hash.

Copy link
Member Author

@ryoqun ryoqun Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of my pure laziness, I've skipped to calculate this test vector by hand. :p

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meh! #9917

StoredMeta,
AccountMeta,
[u8; ACCOUNT_DATA_LEN],
usize, // for StoredAccount::offset
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to write this like typeof(StoredAccount::offset) but it seems I can't...

@codecov
Copy link

codecov bot commented Dec 11, 2019

Codecov Report

Merging #7415 into master will decrease coverage by 2.7%.
The diff coverage is 76.1%.

@@           Coverage Diff            @@
##           master   #7415     +/-   ##
========================================
- Coverage    70.7%   67.9%   -2.8%     
========================================
  Files         245     244      -1     
  Lines       55227   56868   +1641     
========================================
- Hits        39054   38640    -414     
- Misses      16173   18228   +2055

@ryoqun
Copy link
Member Author

ryoqun commented Dec 11, 2019

@danpaul000 Really FYI, this breaks an ABI, #6123. Can ignore this mention because I'm pretty sure there are plenty of other PRs in the master with imcompatible ABI.

@ryoqun
Copy link
Member Author

ryoqun commented Dec 11, 2019

CI passed, proving there were no existing unit test for the code changed in this PR because I didn't need to touch any existing tests... ;)

@ryoqun ryoqun changed the title Hash epoch rent executable Include rent_epoch and executable into account hash Dec 11, 2019
@mvines
Copy link
Member

mvines commented Dec 11, 2019

@danpaul000 Really FYI, this breaks an ABI, #6123. Can ignore this mention because I'm pretty sure there are plenty of other PRs in the master with imcompatible ABI.

Breaking ABI between 0.21 and 0.22 is ok, that's gonna happen elsewhere too. But after 0.22 ships we're going to try to be much more careful about this

@@ -31,6 +33,8 @@ pub struct StoredMeta {
pub data_len: u64,
}

/// This struct will be backed by mmaped and snapshotted data files.
/// So the data layout must be stable and consistent across the entire cluster!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than comments, I wonder if we should define an attribute to start clearly structs that are part of the Solana ABI, like #[solana_abi]. One day this'll then allow us to programmatically find all these structs and compare their values between release to flag ABI changes.

This is totally out of scope of this PR but if that sounds good, @ryoqun do you want to think about this more and make a design proposal for how we might start to track ABI changes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, let's talk about ABI compat! That task looks interesting to me! I'll give it a try!

This reminds of my rookie days when I found an ABI-related bug: #6388 Always, these are pesky bugs.. Hopefully, I can find good solution!

@@ -191,6 +191,8 @@ pub struct AccountStorageEntry {
/// any accounts in it
/// status corresponding to the storage, lets us know that
/// the append_vec, once maxed out, then emptied, can be reclaimed
/// Currently serialized count isn't used at all. Rather, we recalculate it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand this comment

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I've forgot to comment about this. You're right. No wonder to be confused.

First, this comment isn't related to this PR at all.
I added this comment to properly document the current implementation reality, which have changed over the course of prior snapshot stability PRs. I'm sure @sakridge can understand this comment, who shares the needed context.

Specifically this comment is referring to this change

@sakridge what do you think fixing this before strict ABI audit thing will be under the effect? Or, at the least, we should create an issue. After all, it seems we can reconstruct this information when ingesting snapshots without problems.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, we can create an issue and we can move this comment out of this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created! #7442

*byte = (INPUT_LEN - i) as u8;
}

//UNSAFE: forcibly cast the special byte pattern to actual account fields.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat idea. An unsafe in a test is less concerning to me

}

pub fn hash_account_data(slot: Slot, lamports: u64, data: &[u8], pubkey: &Pubkey) -> Hash {
pub fn hash_account_data(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about passing the entire account: &Account into this function rather than the 4 account fields?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought about the same, too! But there is a reason for the status quo: This function should be accessible from the other code path (no Accounts there): https://github.com/solana-labs/solana/pull/7415/files#diff-2099c5256db4eb5975c8834af38f6456R810

Of course, we can always normalize to Account by stored_account.clone_account(), but performance might be affected or rustc is clever enough?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok. I think the current code is fine then, probably not worth getting fancy yet. Maybe when we add the 5th or 6th account field later :)

@ryoqun
Copy link
Member Author

ryoqun commented Dec 11, 2019

@sakridge @rob-solana

From this old comment:

@sakridge once this is in place, we'll have to decide whether you hash() accounts before or after I deduct rent...

Is there any remaining concern about this comment from the old PR, still applicable to the current situation?

I naively thought that rent_epoch can be treated equally like other fields. So, considering that comment's concern, we're hash()-ing after we deduct rent and when storing the updated account into the AccountsDB. Is there any problem for this?

rent_epoch: Epoch,
data: &[u8],
pubkey: &Pubkey,
) -> Hash {
if lamports == 0 {
Copy link
Member Author

@ryoqun ryoqun Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this hash algorithm seems littered with many modifications... #6315 #5573 #3801

@rob-solana
Copy link
Contributor

@sakridge @rob-solana

From this old comment:

@sakridge once this is in place, we'll have to decide whether you hash() accounts before or after I deduct rent...

Is there any remaining concern about this comment from the old PR, still applicable to the current situation?

I naively thought that rent_epoch can be treated equally like other fields. So, considering that comment's concern, we're hash()-ing after we deduct rent and when storing the updated account into the AccountsDB. Is there any problem for this?

hash should be the last thing, after all account updates, including deduction and distribution of rent

rob-solana
rob-solana previously approved these changes Dec 11, 2019
Copy link
Contributor

@rob-solana rob-solana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work

sakridge
sakridge previously approved these changes Dec 11, 2019
Copy link
Member

@sakridge sakridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mergify mergify bot dismissed stale reviews from rob-solana and sakridge December 12, 2019 03:08

Pull request has been modified.

@ryoqun
Copy link
Member Author

ryoqun commented Dec 12, 2019

Merging this because I got approvals (They got removed by this tiny commit, though)

@@ -2103,4 +2130,56 @@ pub mod tests {

Ok(())
}

#[test]
fn test_hash_stored_account() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants