Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adds a new shred variant embedding merkle tree hashes of the erasure batch #25237

Merged
merged 1 commit into from
Jun 7, 2022

Conversation

behzadnouri
Copy link
Contributor

@behzadnouri behzadnouri commented May 15, 2022

Problem

Coding shreds can only be signed once erasure codings are already
generated. Therefore coding shreds recovered from erasure codings lack
slot leader's signature and so cannot be retransmitted to the rest of
the cluster.

Summary of Changes

shred/merkle.rs implements a new shred variant where we generate merkle
tree for each erasure encoded batch and each shred includes:

  • full 32-byte Hash root of the merkle tree.
  • slot leader's signature of the root of the merkle tree.
  • merkle tree nodes along the branch the shred belongs to, where hashes
    are trimmed to 20 bytes during tree construction.

This schema results in the same signature for all shreds within an
erasure batch.

When recovering shreds from erasure codes, we can reconstruct merkle
tree for the batch and for each recovered shred also recover respective
merkle tree branch; then snap the slot leader's signature from any of
the shreds received from turbine and retransmit all recovered code or
data shreds.

Backward compatibility is achieved by encoding shred variant at byte 65
of payload (previously shred-type at this position):

  • 0b0101_1010 indicates a legacy coding shred, which is also equal to
    ShredType::Code for backward compatibility.
  • 0b1010_0101 indicates a legacy data shred, which is also equal to
    ShredType::Data for backward compatibility.
  • 0b0100_???? indicates a merkle coding shred with merkle branch size
    indicated by the last 4 bits.
  • 0b1000_???? indicates a merkle data shreds with merkle branch size
    indicated by the last 4 bits.

Merkle root and branch are encoded at the end of the shred payload.

@behzadnouri behzadnouri force-pushed the shred-trait branch 3 times, most recently from 3e172cc to d525399 Compare May 17, 2022 15:10
@behzadnouri behzadnouri added the noCI Suppress CI on this Pull Request label May 17, 2022
@behzadnouri behzadnouri force-pushed the shred-trait branch 14 times, most recently from c08e60d to 98bb8cf Compare May 22, 2022 18:31
@behzadnouri behzadnouri changed the title adds new shred variant containing merkle tree hashes of the erasure batch adds new shred variant embedding merkle tree hashes of the erasure batch May 22, 2022
@behzadnouri behzadnouri removed the noCI Suppress CI on this Pull Request label May 22, 2022
@behzadnouri behzadnouri force-pushed the shred-trait branch 2 times, most recently from ce57ee6 to 3d58537 Compare May 23, 2022 12:38
behzadnouri added a commit to behzadnouri/solana that referenced this pull request May 24, 2022
sigverify_shres relies on wire layout of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation of
solana-labs#25237
which adds a new shred variant with different layout and signed message,
this commit removes shred layout specification from sigverify and
instead encapsulate that in shred module.
behzadnouri added a commit to behzadnouri/solana that referenced this pull request May 24, 2022
sigverify_shres relies on wire layout of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation of
solana-labs#25237
which adds a new shred variant with different layout and signed message,
this commit removes shred layout specification from sigverify and
instead encapsulate that in shred module.
behzadnouri added a commit to behzadnouri/solana that referenced this pull request May 24, 2022
sigverify_shres relies on wire layout of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation of
solana-labs#25237
which adds a new shred variant with different layout and signed message,
this commit removes shred layout specification from sigverify and
instead encapsulate that in shred module.
behzadnouri added a commit to behzadnouri/solana that referenced this pull request May 24, 2022
sigverify_shres relies on wire layout specs of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation of
solana-labs#25237
which adds a new shred variant with different layout and signed message,
this commit removes shred layout specification from sigverify and
instead encapsulate that in shred module.
behzadnouri added a commit to behzadnouri/solana that referenced this pull request May 24, 2022
sigverify_shres relies on wire layout specs of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation of
solana-labs#25237
which adds a new shred variant with different layout and signed message,
this commit removes shred layout specification from sigverify and
instead encapsulate that in shred module.
behzadnouri added a commit to behzadnouri/solana that referenced this pull request May 24, 2022
sigverify_shres relies on wire layout specs of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation of
solana-labs#25237
which adds a new shred variant with different layout and signed message,
this commit removes shred layout specification from sigverify and
instead encapsulate that in shred module.
behzadnouri added a commit to behzadnouri/solana that referenced this pull request May 24, 2022
sigverify_shres relies on wire layout specs of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation of
solana-labs#25237
which adds a new shred variant with different layout and signed message,
this commit removes shred layout specification from sigverify and
instead encapsulate that in shred module.
behzadnouri added a commit to behzadnouri/solana that referenced this pull request May 30, 2022
In preparation of
solana-labs#25237
which adds a new shred variant with merkle tree branches, the commit
embeds versioning into shred binary by encoding a new ShredVariant type
at byte 65 of payload replacing previously ShredType at this offset.

* 0b0101_1010 indicates a legacy coding shred, which is also equal to
  ShredType::Code for backward compatibility.
* 0b1010_0101 indicates a legacy data shred, which is also equal to
  ShredType::Data for backward compatibility.

Following commits will add merkle variants to this type.
behzadnouri added a commit to behzadnouri/solana that referenced this pull request May 30, 2022
In preparation of
solana-labs#25237
which adds a new shred variant with merkle tree branches, the commit
embeds versioning into shred binary by encoding a new ShredVariant type
at byte 65 of payload replacing previously ShredType at this offset.

* 0b0101_1010 indicates a legacy coding shred, which is also equal to
  ShredType::Code for backward compatibility.
* 0b1010_0101 indicates a legacy data shred, which is also equal to
  ShredType::Data for backward compatibility.

Following commits will add merkle variants to this type.
behzadnouri added a commit to behzadnouri/solana that referenced this pull request May 30, 2022
The commit implements two new types:
    pub enum ShredCode {
        Legacy(legacy::ShredCode),
    }
    pub enum ShredData {
        Legacy(legacy::ShredData),
    }

solana-labs#25237
will extend these two types by adding variants for
    Merkle(merkle::ShredCode)
and
    Merkle(merkle::ShredData)
behzadnouri added a commit to behzadnouri/solana that referenced this pull request May 30, 2022
In preparation of
solana-labs#25237
which adds a new shred variant with merkle tree branches, the commit
embeds versioning into shred binary by encoding a new ShredVariant type
at byte 65 of payload replacing previously ShredType at this offset.
    enum ShredVariant {
        LegacyCode, // 0b0101_1010
        LegacyData, // 0b0101_1010
    }

* 0b0101_1010 indicates a legacy coding shred, which is also equal to
  ShredType::Code for backward compatibility.
* 0b1010_0101 indicates a legacy data shred, which is also equal to
  ShredType::Data for backward compatibility.

Following commits will add merkle variants to this type:
    enum ShredVariant {
        LegacyCode, // 0b0101_1010
        LegacyData, // 0b1010_0101
        MerkleCode(/*proof_size:*/ u8), // 0b0100_????
        MerkleData(/*proof_size:*/ u8), // 0b1000_????
    }
behzadnouri added a commit to behzadnouri/solana that referenced this pull request May 30, 2022
In preparation of
solana-labs#25237
which adds a new shred variant with merkle tree branches, the commit
embeds versioning into shred binary by encoding a new ShredVariant type
at byte 65 of payload replacing previously ShredType at this offset.
    enum ShredVariant {
        LegacyCode, // 0b0101_1010
        LegacyData, // 0b0101_1010
    }

* 0b0101_1010 indicates a legacy coding shred, which is also equal to
  ShredType::Code for backward compatibility.
* 0b1010_0101 indicates a legacy data shred, which is also equal to
  ShredType::Data for backward compatibility.

Following commits will add merkle variants to this type:
    enum ShredVariant {
        LegacyCode, // 0b0101_1010
        LegacyData, // 0b1010_0101
        MerkleCode(/*proof_size:*/ u8), // 0b0100_????
        MerkleData(/*proof_size:*/ u8), // 0b1000_????
    }
@behzadnouri behzadnouri force-pushed the shred-trait branch 2 times, most recently from 73548e8 to 2150fab Compare May 30, 2022 21:47
behzadnouri added a commit to behzadnouri/solana that referenced this pull request Jun 1, 2022
In preparation of
solana-labs#25237
which adds a new shred variant with merkle tree branches, the commit
embeds versioning into shred binary by encoding a new ShredVariant type
at byte 65 of payload replacing previously ShredType at this offset.
    enum ShredVariant {
        LegacyCode, // 0b0101_1010
        LegacyData, // 0b0101_1010
    }

* 0b0101_1010 indicates a legacy coding shred, which is also equal to
  ShredType::Code for backward compatibility.
* 0b1010_0101 indicates a legacy data shred, which is also equal to
  ShredType::Data for backward compatibility.

Following commits will add merkle variants to this type:
    enum ShredVariant {
        LegacyCode, // 0b0101_1010
        LegacyData, // 0b1010_0101
        MerkleCode(/*proof_size:*/ u8), // 0b0100_????
        MerkleData(/*proof_size:*/ u8), // 0b1000_????
    }
behzadnouri added a commit that referenced this pull request Jun 2, 2022
In preparation of
#25237
which adds a new shred variant with merkle tree branches, the commit
embeds versioning into shred binary by encoding a new ShredVariant type
at byte 65 of payload replacing previously ShredType at this offset.
    enum ShredVariant {
        LegacyCode, // 0b0101_1010
        LegacyData, // 0b0101_1010
    }

* 0b0101_1010 indicates a legacy coding shred, which is also equal to
  ShredType::Code for backward compatibility.
* 0b1010_0101 indicates a legacy data shred, which is also equal to
  ShredType::Data for backward compatibility.

Following commits will add merkle variants to this type:
    enum ShredVariant {
        LegacyCode, // 0b0101_1010
        LegacyData, // 0b1010_0101
        MerkleCode(/*proof_size:*/ u8), // 0b0100_????
        MerkleData(/*proof_size:*/ u8), // 0b1000_????
    }
@behzadnouri behzadnouri force-pushed the shred-trait branch 3 times, most recently from 8732a2c to 804da55 Compare June 3, 2022 12:50
Comment on lines +389 to +394
#[cfg(test)]
fn make_merkle_tree(mut nodes: Vec<Hash>) -> Vec<Hash> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temporarily marked as #[cfg(test)] to avoid clippy dead-code errors;
but this code will be used in shreds construction.

Comment on lines +405 to +414
#[cfg(test)]
fn make_merkle_branch(
mut index: usize, // leaf index ~ shred's erasure shard index.
mut size: usize, // number of leaves ~ erasure batch size.
tree: &[Hash],
) -> Option<MerkleBranch> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here, #[cfg(test)] is temporary.

@behzadnouri
Copy link
Contributor Author

Computationally it shouldn't make any difference. The benefit is that you save 12 bytes (~1%) of the packet payload.

I wouldn't say 12 bytes is a significant cost to truncate the only piece of payload which is signed.
e.g. if you are looking at the metrics on chronograph would you notice if some metrics has gone up or down by 1%?

I'm not sure about the extent of security benefits to use 32 bytes for the root hash when it is generated from 20 byte intermediate hashes.

Two 20 bytes hashes are combined to make the root. So the root is generated from 40 bytes not 20 bytes.
The root is the only piece that is signed in the shred payload. So might make sense to be more cautious with it compared to intermediate hashes. It is also used to check if the shreds belong to the same erasure batch.

At the end of the day, I acknowledge that this is a bit paranoid, but what makes me paranoid about this is the fact that just because we cannot think of an attack vector here it does not mean that there is no attack possible; and if there is a chance of security compromise, saving just 12 bytes does not sound to me a good trade-off to take that risk.

@jbiseda
Copy link
Contributor

jbiseda commented Jun 3, 2022

Computationally it shouldn't make any difference. The benefit is that you save 12 bytes (~1%) of the packet payload.

I wouldn't say 12 bytes is a significant cost to truncate the only piece of payload which is signed. e.g. if you are looking at the metrics on chronograph would you notice if some metrics has gone up or down by 1%?

I'm not sure about the extent of security benefits to use 32 bytes for the root hash when it is generated from 20 byte intermediate hashes.

Two 20 bytes hashes are combined to make the root. So the root is generated from 40 bytes not 20 bytes. The root is the only piece that is signed in the shred payload. So might make sense to be more cautious with it compared to intermediate hashes. It is also used to check if the shreds belong to the same erasure batch.

At the end of the day, I acknowledge that this is a bit paranoid, but what makes me paranoid about this is the fact that just because we cannot think of an attack vector here it does not mean that there is no attack possible; and if there is a chance of security compromise, saving just 12 bytes does not sound to me a good trade-off to take that risk.

From #network-protocols:
https://discord.com/channels/428295358100013066/478692221441409024/981340421605167105

jbiseda — 05/31/2022
Is there any benefit to using a 32-byte hash for the merkle root hash if all intermediate hashes are truncated to 20-bytes? It seems an attack would just focus on one of the intermediate 20-byte hashes.

anatoly "don't trust anyone" — 05/31/2022
root should be 20
[5:03 PM]
yea
[5:03 PM]
there is no point

fn proof_size(&self) -> Result<u8, Error> {
match self.common_header.shred_variant {
ShredVariant::MerkleData(proof_size) => Ok(proof_size),
_ => Err(Error::InvalidShredVariant),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this check here? Or should this be an assert?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to check


#[derive(Clone, Debug, Eq, PartialEq)]
struct MerkleBranch {
root: Hash,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Hash/MerkleProofEntry/ and remove SIZE_OF_MERKLE_ROOT

ledger/src/shred/merkle.rs Outdated Show resolved Hide resolved
fn proof_size(&self) -> Result<u8, Error> {
match self.common_header.shred_variant {
ShredVariant::MerkleCode(proof_size) => Ok(proof_size),
_ => Err(Error::InvalidShredVariant),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to check

ledger/src/shred/merkle.rs Show resolved Hide resolved
let chunk = self
.payload
.get(SIZE_OF_SIGNATURE..SIZE_OF_CODING_SHRED_HEADERS + shard_size)
.ok_or(Error::InvalidPayloadSize(self.payload.len()))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this checked by size_of_erasure_encoded_slice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now using the safe api until the code is in steady state

ledger/src/shred/merkle.rs Show resolved Hide resolved
let shard_size = Self::size_of_erasure_encoded_slice(proof_size)?;
self.payload
.get(SIZE_OF_CODING_SHRED_HEADERS..SIZE_OF_CODING_SHRED_HEADERS + shard_size)
.ok_or(Error::InvalidPayloadSize(self.payload.len()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checked by size_of_erasure_encoded_slice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now using the safe api until the code is in steady state

ledger/src/shred/merkle.rs Show resolved Hide resolved
…batch

Coding shreds can only be signed once erasure codings are already
generated. Therefore coding shreds recovered from erasure codings lack
slot leader's signature and so cannot be retransmitted to the rest of
the cluster.

shred/merkle.rs implements a new shred variant where we generate merkle
tree for each erasure encoded batch and each shred includes:
* root of the merkle tree (which is 32-byte Hash).
* slot leader's signature of the root of the merkle tree.
* merkle tree nodes along the branch the shred belongs to, where hashes
  are trimmed to 20 bytes during tree construction.

This schema results in the same signature for all shreds within an
erasure batch.

When recovering shreds from erasure codes, we can reconstruct merkle
tree for the batch and for each recovered shred also recover respective
merkle tree branch; then snap the slot leader's signature from any of
the shreds received from turbine and retransmit all recovered code or
data shreds.

Backward compatibility is achieved by encoding shred variant at byte 65
of payload (previously shred-type at this position):
* 0b0101_1010 indicates a legacy coding shred, which is also equal to
  ShredType::Code for backward compatibility.
* 0b1010_0101 indicates a legacy data shred, which is also equal to
  ShredType::Data for backward compatibility.
* 0b0100_???? indicates a merkle coding shred with merkle branch size
  indicated by the last 4 bits.
* 0b1000_???? indicates a merkle data shred with merkle branch size
  indicated by the last 4 bits.

Merkle root and branch are encoded at the end of the shred payload.
@behzadnouri
Copy link
Contributor Author

Computationally it shouldn't make any difference. The benefit is that you save 12 bytes (~1%) of the packet payload.

I wouldn't say 12 bytes is a significant cost to truncate the only piece of payload which is signed. e.g. if you are looking at the metrics on chronograph would you notice if some metrics has gone up or down by 1%?

I'm not sure about the extent of security benefits to use 32 bytes for the root hash when it is generated from 20 byte intermediate hashes.

Two 20 bytes hashes are combined to make the root. So the root is generated from 40 bytes not 20 bytes. The root is the only piece that is signed in the shred payload. So might make sense to be more cautious with it compared to intermediate hashes. It is also used to check if the shreds belong to the same erasure batch.
At the end of the day, I acknowledge that this is a bit paranoid, but what makes me paranoid about this is the fact that just because we cannot think of an attack vector here it does not mean that there is no attack possible; and if there is a chance of security compromise, saving just 12 bytes does not sound to me a good trade-off to take that risk.

From #network-protocols: https://discord.com/channels/428295358100013066/478692221441409024/981340421605167105

jbiseda — 05/31/2022 Is there any benefit to using a 32-byte hash for the merkle root hash if all intermediate hashes are truncated to 20-bytes? It seems an attack would just focus on one of the intermediate 20-byte hashes.

anatoly "don't trust anyone" — 05/31/2022 root should be 20 [5:03 PM] yea [5:03 PM] there is no point

I had seen that conversation before writing the earlier comment: #25237 (comment)
I still don't think saving 12 bytes is going to optimize anything.
Nonetheless, given the new shred versioning schema, it should be easy to switch back to 32 bytes if needed, so I changed the root size to 20 bytes.

Copy link
Contributor

@jbiseda jbiseda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

fn proof_size(&self) -> Result<u8, Error> {
match self.common_header.shred_variant {
ShredVariant::MerkleData(proof_size) => Ok(proof_size),
_ => Err(Error::InvalidShredVariant),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this check here? Or should this be an assert?

@@ -3779,7 +3779,7 @@ pub(crate) mod tests {
fn test_dead_fork_entry_deserialize_failure() {
// Insert entry that causes deserialization failure
let res = check_dead_fork(|_, bank| {
let gibberish = [0xa5u8; SIZE_OF_DATA_SHRED_PAYLOAD];
let gibberish = [0xa5u8; /*legacy data-shred capacity:*/1051];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, thanks

@behzadnouri behzadnouri merged commit 5f04512 into solana-labs:master Jun 7, 2022
@behzadnouri behzadnouri deleted the shred-trait branch June 7, 2022 22:41
jeffwashington pushed a commit to jeffwashington/solana that referenced this pull request Jun 29, 2022
sigverify_shreds relies on wire layout specs of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation of
solana-labs#25237
which adds a new shred variant with different layout and signed message,
this commit removes shred layout specification from sigverify and
instead encapsulate that in shred module.
jeffwashington pushed a commit to jeffwashington/solana that referenced this pull request Jun 29, 2022
sigverify_shreds relies on wire layout specs of shreds:
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L39-L46
https://github.com/solana-labs/solana/blob/0376ab41a/ledger/src/sigverify_shreds.rs#L298-L305

In preparation of
solana-labs#25237
which adds a new shred variant with different layout and signed message,
this commit removes shred layout specification from sigverify and
instead encapsulate that in shred module.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants