Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LIP draft for "Use SHA3-256 hash of block header as blockID" #4

Merged
merged 6 commits into from Mar 7, 2019

Conversation

@AndreasKendziorra
Copy link
Contributor

commented Jan 28, 2019

This LIP proposes to take the SHA3-256 hash of the block header as the blockID of a block. This implies an increase of the blockID length and the usage of a different hash function. The motivation of the proposal is to increase the security of the Lisk blockchain.

@janhack
Copy link
Contributor

left a comment

The reference implementation section is missing.

Reference Implementation section added as requested

@AndreasKendziorra AndreasKendziorra requested a review from janhack Feb 11, 2019

@webmaster128
Copy link

left a comment

Good description of an interesting attack vector. However, to avoid the descriped attack I propose a different solution as follows. Right now the block header is composed as:

bytes_t BlockHeader::serialize() const
{

    std::size_t size = 4 // version
            + 4 // timestamp
            + 8 // previous block
            + 4 // number of transactions
            + 8 // total amount
            + 8 // total fee
            + 8 // reward
            + 4 // payload length
            + 32 // payload hash
            + 32 // generator pubkey
            ;

where previous block denotes the 8 bytes ID of the previous block. If instead of the block ID you put the full 32 bytes block header hash in there, the problem is solved. Minimal change and no user visible disruption through the whole stack of Lisk and Lisk-related software.

Furthermore I don't see a convincing argument to replace SHA2 with SHA3. I'm not against using SHA3 at all but this sounds like

  1. we change block IDs for good reason,
  2. when we're doing a hardfork anyway, let's use the new fancy shiny stuff,
  3. now let's find arguments to justify the change of 2.

## Motivation

In the current Lisk protocol, a blockID consists of 8 bytes of the SHA-256 digest of the block header. This small length comes with some advantages, such as less stored data or better visualisation on small displays. However, it comes at the price of providing low resistance to collisions and pre-images. More precisely, the low resistance could be exploited in the following scenario: An attacker could try to find another block with the same blockID for a given block in the blockchain. In the successful case, other community members may be convinced by this altered history, as the altered chain might appear to be valid. We will see that this kind of attack is limited to delegates and currently economically unattractive. However, the length of the blockID shall be increased to prevent this kind of attack also in the future. Moreover, the hash function used to compute the blockID shall be replaced by its successor, namely SHA3-256 (see [below](#Why-SHA-3?) for details).

This comment has been minimized.

Copy link
@webmaster128

webmaster128 Feb 19, 2019

SHA3 should be considered an alternative to SHA2 rather than a successor based on the idea to have a standardized alternative just in case SHA2 becomes insecure. SHA3 may very well be broken before SHA2.

This comment has been minimized.

Copy link
@AndreasKendziorra

AndreasKendziorra Feb 20, 2019

Author Contributor

If you don't like the wording, I can change it.


##### SHA-256

SHA-256 is the currently used hash function for generating blockIDs. Although there are no known weaknesses of this hash function that could be exploited with regard to blockIDs, is has weaknesses due to its construction (these can be exploited in length extension attacks). Therefore, we give a preference to KECCAK, a hash function without known weaknesses.

This comment has been minimized.

Copy link
@webmaster128

webmaster128 Feb 19, 2019

The so called length extension weakness is a result of misusing the algorithm in cases where a message authentication code should be used. I don't see how this property is relevant in our case.

This comment has been minimized.

Copy link
@AndreasKendziorra

AndreasKendziorra Feb 20, 2019

Author Contributor

See my other answer.

@AndreasKendziorra

This comment has been minimized.

Copy link
Contributor Author

commented Feb 20, 2019

Good description of an interesting attack vector. However, to avoid the descriped attack I propose a different solution as follows. Right now the block header is composed as:

bytes_t BlockHeader::serialize() const
{

    std::size_t size = 4 // version
            + 4 // timestamp
            + 8 // previous block
            + 4 // number of transactions
            + 8 // total amount
            + 8 // total fee
            + 8 // reward
            + 4 // payload length
            + 32 // payload hash
            + 32 // generator pubkey
            ;

where previous block denotes the 8 bytes ID of the previous block. If instead of the block ID you put the full 32 bytes block header hash in there, the problem is solved. Minimal change and no user visible disruption through the whole stack of Lisk and Lisk-related software.

As far as I understand, you want to change the previousBlock property of the block header, but want to keep the current blockID system in every other part of the Lisk ecosystem (Lisk explorer, RPCs, API of Lisk Core, Lisk Hub, etc). Is this correct? If yes, then I don't like this idea. In my opinion, there should be a unique way to identify a block in Lisk. Using several ways to identify blocks leads only to confusion. Especially if there is a collision one way, but not in the other. Therefore, if we change the length of the hash used for the previousBlock property in the block header, then we should do it everywhere else too.

Furthermore I don't see a convincing argument to replace SHA2 with SHA3. I'm not against using SHA3 at all but this sounds like

1. we change block IDs for good reason,

2. when we're doing a hardfork anyway, let's use the new fancy shiny stuff,

3. now let's find arguments to justify the change of 2.

I more or less agree with step 1 and step 2, although I would put it slightly different:

  1. we change block IDs for good reason,
  2. when we're doing a hardfork anyway, let's use the hash function that seems to be most suitable one.

As you stated in your other comment, one cannot clearly say which is the more secure hash function family. And you are right that SHA3 may be broken before SHA2. I also agree that there is no obvious impact of the length extension weakness (or its construction in general) for the blockID system. But the protocol will change in several places in the next month, including some hashing steps of the protocol. Ideally, we use everywhere the same hash function family within the protocol including all future uses cases. To avoid any possibility that the underlying Merkle-Damagard construction of SHA2 can pose a risk for a future use case, I would prefer to switch to SHA3 sooner than later.

@webmaster128

This comment has been minimized.

Copy link

commented Feb 20, 2019

As far as I understand, you want to change the previousBlock property of the block header, but want to keep the current blockID system in every other part of the Lisk ecosystem (Lisk explorer, RPCs, API of Lisk Core, Lisk Hub, etc).

Correct

If yes, then I don't like this idea. In my opinion, there should be a unique way to identify a block in Lisk. Using several ways to identify blocks leads only to confusion.

I agree with you when it comes to designing a new system. However, in this case we have an ecosystem with thousends of users and hundreds of third party software that needs to be adapted.

The block ID derivation is a three step process anyway

------> block header  --------->  block (header) hash  ------------------>  block ID
compose                SHA256                          int(first 8 bytes)

Working with block hash and block ID adds a little bit of vocabulary, but is not too hard to follow.

Especially if there is a collision one way, but not in the other.

The worst thing that can happen here is a different block being rejected if the block ID already exists.

To compare the two options

change block ID change previous block reference
new block ID 11735675832314690113 6b3488dc6f9469e90018946f3dc130fdf78bda3f1551601f0f105be432cddf07
protection against chain manipulations as described in motivation yes yes
collision resistent at 1 block/10s yes yes
affected parts of the stack block header generation, block database table block header generation, block database table, p2p layer, API, explorer, wallets, UI
affected software lisk core, snapshot validator lisk core, snapshot validator, lisk-elements, Lisk Hub, exchanges, 3rd party wallets, every other 3rd party software
time to deploy weeks months to years

Regarding SHA: glad to hear we're on the same page in most aspects. To simplify the discussion and review process, I suggest extracting the change SHA2 -> SHA3 into a different LIP as those are two independent things and that should have their own motiviation. Of course it makes sense to apply both LIPs in a single fork.

@shuse2

shuse2 approved these changes Feb 21, 2019

@AndreasKendziorra

This comment has been minimized.

Copy link
Contributor Author

commented Feb 21, 2019

About changing the previousBlock property vs. changing the whole blockID system: When it comes to maintaining 3rd party software, your approach is obviously better. When it comes to user-friendliness, the proposed one is better. Both approaches have valid arguments. If you strongly believe that Lisk should follow your idea, I'm encouraging you to create a separate LIP for it. It's not set in stone that my LIP will be implemented.

About splitting the LIP: this LIP proposes a blockID/block-hash system that should be used in the entire Lisk project. And I'm proposing the system that is the most suitable one from my point of view. I don't think it makes sense to split it into different LIPs (for example to increase the length in LIP1 and changing from SHA2 to SHA3 in LIP2). But again, if you strongly believe there should be LIPs that propose only parts of my proposal, feel free create a separate LIP.

@AndreasKendziorra AndreasKendziorra requested a review from karmacoma Mar 4, 2019

karmacoma added some commits Mar 7, 2019

@karmacoma karmacoma merged commit f1fc8ac into master Mar 7, 2019

@karmacoma karmacoma deleted the lip-andreaskend-use_SHA3-256_for_blockID branch Mar 7, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.