Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include rent_epoch and executable into account hash #7415

Merged
merged 3 commits into from Dec 12, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
85 changes: 82 additions & 3 deletions runtime/src/accounts_db.rs
Expand Up @@ -34,7 +34,7 @@ use solana_measure::measure::Measure;
use solana_rayon_threadlimit::get_thread_count;
use solana_sdk::account::Account;
use solana_sdk::bank_hash::BankHash;
use solana_sdk::clock::Slot;
use solana_sdk::clock::{Epoch, Slot};
use solana_sdk::hash::{Hash, Hasher};
use solana_sdk::pubkey::Pubkey;
use solana_sdk::sysvar;
Expand Down Expand Up @@ -808,16 +808,32 @@ impl AccountsDB {
Self::hash_account_data(
slot,
account.account_meta.lamports,
account.account_meta.executable,
account.account_meta.rent_epoch,
account.data,
&account.meta.pubkey,
)
}

pub fn hash_account(slot: Slot, account: &Account, pubkey: &Pubkey) -> Hash {
Self::hash_account_data(slot, account.lamports, &account.data, pubkey)
Self::hash_account_data(
slot,
account.lamports,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lamports and rent_epoch are both u64. So interchanging them caused no compile error. Spent some time to figure out why hashes are different. ;) So I added another assertion covering that kind of bug.

account.executable,
account.rent_epoch,
&account.data,
pubkey,
)
}

pub fn hash_account_data(slot: Slot, lamports: u64, data: &[u8], pubkey: &Pubkey) -> Hash {
pub fn hash_account_data(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about passing the entire account: &Account into this function rather than the 4 account fields?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought about the same, too! But there is a reason for the status quo: This function should be accessible from the other code path (no Accounts there): https://github.com/solana-labs/solana/pull/7415/files#diff-2099c5256db4eb5975c8834af38f6456R810

Of course, we can always normalize to Account by stored_account.clone_account(), but performance might be affected or rustc is clever enough?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok. I think the current code is fine then, probably not worth getting fancy yet. Maybe when we add the 5th or 6th account field later :)

slot: Slot,
lamports: u64,
executable: bool,
rent_epoch: Epoch,
data: &[u8],
pubkey: &Pubkey,
) -> Hash {
if lamports == 0 {
Copy link
Member Author

@ryoqun ryoqun Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this hash algorithm seems littered with many modifications... #6315 #5573 #3801

return Hash::default();
}
Expand All @@ -831,8 +847,17 @@ impl AccountsDB {
LittleEndian::write_u64(&mut buf[..], slot);
hasher.hash(&buf);

LittleEndian::write_u64(&mut buf[..], rent_epoch);
hasher.hash(&buf);

hasher.hash(&data);

if executable {
Copy link
Member Author

@ryoqun ryoqun Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to myself: when sanitizing executable in follow up PR, check the upper 7-bit are cleared.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reply to myself: yeah, I've done this correctly at #7464

hasher.hash(&[1u8; 1]);
} else {
hasher.hash(&[0u8; 1]);
}

hasher.hash(&pubkey.as_ref());

hasher.result()
Expand Down Expand Up @@ -1189,10 +1214,12 @@ impl AccountsDB {
pub mod tests {
// TODO: all the bank tests are bank specific, issue: 2194
use super::*;
use crate::append_vec::AccountMeta;
use bincode::serialize_into;
use rand::{thread_rng, Rng};
use solana_sdk::account::Account;
use std::fs;
use std::str::FromStr;
use tempfile::TempDir;

#[test]
Expand Down Expand Up @@ -2103,4 +2130,56 @@ pub mod tests {

Ok(())
}

#[test]
fn test_hash_stored_account() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here

// This test uses some UNSAFE trick to detect most of account's field
// addition and deletion without changing the hash code

const ACCOUNT_DATA_LEN: usize = 3;
// the type of InputTuple elements must not contain references;
// they should be simple scalars or data blobs
type InputTuple = (
Slot,
StoredMeta,
AccountMeta,
[u8; ACCOUNT_DATA_LEN],
usize, // for StoredAccount::offset
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to write this like typeof(StoredAccount::offset) but it seems I can't...

Hash,
);
const INPUT_LEN: usize = std::mem::size_of::<InputTuple>();
type InputBlob = [u8; INPUT_LEN];
let mut blob: InputBlob = [0u8; INPUT_LEN];

// spray memory with decreasing counts so that, data layout can be detected.
for (i, byte) in blob.iter_mut().enumerate() {
*byte = (INPUT_LEN - i) as u8;
}

//UNSAFE: forcibly cast the special byte pattern to actual account fields.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first UNSAFE here. :) Is there any better way to archive similar test result?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat idea. An unsafe in a test is less concerning to me

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An unsafe in a test is less concerning to me

Thx! This will be a precedent for the follow-up PRs, in which we need more of these crafted binaries to protect validator from crashing from external data. :)

let (slot, meta, account_meta, data, offset, hash): InputTuple =
unsafe { std::mem::transmute::<InputBlob, InputTuple>(blob) };

let stored_account = StoredAccount {
meta: &meta,
account_meta: &account_meta,
data: &data,
offset,
hash: &hash,
};
let account = stored_account.clone_account();
let expected_account_hash =
Hash::from_str("GGTsxvxwnMsNfN6yYbBVQaRgvbVLfxeWnGXNyB8iXDyE").unwrap();
Copy link
Member Author

@ryoqun ryoqun Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to this and that, this will be stable really well justifying the magic hash.

Copy link
Member Author

@ryoqun ryoqun Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of my pure laziness, I've skipped to calculate this test vector by hand. :p

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meh! #9917


assert_eq!(
AccountsDB::hash_stored_account(slot, &stored_account),
expected_account_hash,
"StoredAccount's data layout might be changed; update hashing if needed."
);
assert_eq!(
AccountsDB::hash_account(slot, &account, &stored_account.meta.pubkey),
expected_account_hash,
"Account-based hashing must be consistent with StoredAccount-based one."
);
}
}
8 changes: 5 additions & 3 deletions runtime/src/append_vec.rs
Expand Up @@ -22,6 +22,8 @@ macro_rules! align_up {
}

/// Meta contains enough context to recover the index from storage itself
/// This struct will be backed by mmaped and snapshotted data files.
/// So the data layout must be stable and consistent across the entire cluster!
#[derive(Clone, PartialEq, Debug)]
pub struct StoredMeta {
/// global write version
Expand All @@ -31,6 +33,8 @@ pub struct StoredMeta {
pub data_len: u64,
}

/// This struct will be backed by mmaped and snapshotted data files.
/// So the data layout must be stable and consistent across the entire cluster!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than comments, I wonder if we should define an attribute to start clearly structs that are part of the Solana ABI, like #[solana_abi]. One day this'll then allow us to programmatically find all these structs and compare their values between release to flag ABI changes.

This is totally out of scope of this PR but if that sounds good, @ryoqun do you want to think about this more and make a design proposal for how we might start to track ABI changes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, let's talk about ABI compat! That task looks interesting to me! I'll give it a try!

This reminds of my rookie days when I found an ABI-related bug: #6388 Always, these are pesky bugs.. Hopefully, I can find good solution!

#[derive(Serialize, Deserialize, Clone, Debug, Default, Eq, PartialEq)]
pub struct AccountMeta {
/// lamports in the account
Expand Down Expand Up @@ -187,9 +191,7 @@ impl AppendVec {
}

fn get_slice(&self, offset: usize, size: usize) -> Option<(&[u8], usize)> {
let len = self.len();

if len < offset + size {
if offset + size > self.len() {
return None;
}
let data = &self.map[offset..offset + size];
Expand Down