Replay Attack Protection: Include Blocklimit and Blockhash in each Transaction #134

Open
aakilfernandes opened this Issue Jul 13, 2016 · 49 comments


Specification

In each transaction (prior to signing), include the following:

  1. The blockhash of a recent block
  2. A single byte blocklimit

In order for a transaction to be valid, it must be included in the blockchain within blocklimit blocks of the block with hash blockhash. Transactions with a blocklimit and blockhash of 0 are always valid, regardless of the chain history.
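The validity rule above can be sketched as follows. This is an illustrative model only, not client code; the function and parameter names are hypothetical.

```python
# Hypothetical sketch of the proposed validity rule. A transaction carries a
# blockhash and a one-byte blocklimit; it is valid only if it is included
# within `blocklimit` blocks of the block with that hash.

def is_valid(tx_blockhash, tx_blocklimit, chain, inclusion_height):
    """chain: list of block hashes indexed by height."""
    # The escape hatch: zeroed fields mean "always valid".
    if tx_blockhash == b"\x00" * 32 and tx_blocklimit == 0:
        return True
    # The referenced block must exist on this chain...
    try:
        ref_height = chain.index(tx_blockhash)
    except ValueError:
        return False
    # ...and inclusion must happen within blocklimit blocks of it.
    return 0 <= inclusion_height - ref_height <= tx_blocklimit

# Example with a chain of three fake hashes:
chain = [b"\x11" * 32, b"\x22" * 32, b"\x33" * 32]
assert is_valid(b"\x22" * 32, 1, chain, 2)       # within the limit
assert not is_valid(b"\x22" * 32, 1, chain, 3)   # one block too late
assert is_valid(b"\x00" * 32, 0, chain, 99)      # 0/0 escape hatch
```

A transaction referencing a hash that never appears on a given chain is simply invalid there, which is what gives the cross-chain replay protection.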

Reasoning

This would offer a number of improvements:

  1. It would prevent transactions on one chain from being processed on another chain. This would mean the same private keys could be used on both chains.
  2. It would provide definitive proof that a transaction will not be included in the chain (once blocklimit blocks have passed without inclusion, the transaction can never be included, so you can safely ignore it).
aakilfernandes commented Jul 13, 2016

PS: thanks to @pipermerriam and @PeterBorah for informing this.

charlieknoll commented Jul 18, 2016

This would create a problem with offline transactions signed long before they are submitted to the network.

Member

pipermerriam commented Jul 18, 2016

@charlieknoll maybe you missed this part of the proposal.

Transactions with a blocklimit and blockhash of 0 are always valid, regardless of the chain history.

I believe that addresses your concern.

charlieknoll commented Jul 18, 2016

@pipermerriam yes I missed that, thanks. Looks like a good idea to me.

@aakilfernandes aakilfernandes changed the title from Include Blocklimit and Blockhash in each Transaction to Replay Attack Protection: Include Blocklimit and Blockhash in each Transaction Jul 31, 2016

koeppelmann commented Jul 31, 2016

Good proposal - would like to see it implemented!

tscs37 commented Jul 31, 2016

Only using a blockhash would probably be enough, as only a few transactions during a fork would be affected. The proposal is fine as is; clients should, however, default to a blocklimit of 0.

Otherwise users might run into problems when signing transactions offline or when there is a long delay between signing and publishing a transaction, e.g. multisig.

Collaborator

Arachnid commented Jul 31, 2016

We've been discussing something similar in the Go-ethereum channel. A couple of slight changes make it significantly more general:

  • As others have suggested, 'blocklimit' is redundant.
  • Include the genesis hash in the list of hashes that will be checked; this allows distinguishing between different Ethereum networks (testnet vs mainnet, Ethereum clones).

A more significant change is to specify either that every nth (say, 10,000th) block hash gets added to the list, allowing distinctions based on forks after a short period, or to specify that the node should look up the hash in a contract at a given address. That contract would be implemented such that it allows anyone to submit a valid recent blockhash to be stored indefinitely, allowing anyone to verify that the current chain had that blockhash in the past, and therefore allowing anyone to create permanent 'fork points' at any time.

tscs37 commented Jul 31, 2016

Using the genesis hash is probably a good idea.

For the "every n-th block" scheme, either a merkle tree root (or some subleaf) of these blocks could be used.

Transactions should still be able to include a nonce to make sure that if an account is doing many transactions, they can't be replayed within the span of the n blocks of the merkle tree.

For offline transactions, the tx could use a scheme based on TxNonce = H(Account_Address + LastTxNonce), so the offline client need not know the chain and can still safely transact.
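The hash-chained nonce scheme above can be sketched as follows. This is an assumption-laden illustration: the names are made up, and it uses SHA3-256 for simplicity where Ethereum would actually use Keccak-256 (the two differ in padding).

```python
# Illustrative sketch of TxNonce = H(account_address + last_tx_nonce):
# each nonce is derived from the previous one, so an offline signer can
# compute the next nonce deterministically with no chain access.
import hashlib

def next_tx_nonce(account_address: bytes, last_tx_nonce: bytes) -> bytes:
    # NOTE: SHA3-256 stands in for Keccak-256 here, purely for illustration.
    return hashlib.sha3_256(account_address + last_tx_nonce).digest()

addr = b"\xab" * 20                    # fake 20-byte account address
n0 = b"\x00" * 32                      # assumed initial nonce for a fresh account
n1 = next_tx_nonce(addr, n0)
n2 = next_tx_nonce(addr, n1)
assert n1 != n2                        # each step yields a fresh nonce
assert next_tx_nonce(addr, n0) == n1   # deterministic: offline signer agrees
```

The chain only needs to check that a transaction's nonce hashes correctly from the account's last recorded nonce, so the signer and the chain never need to coordinate.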

aakilfernandes commented Jul 31, 2016

That contract would be implemented such that it allows anyone to submit a valid recent blockhash to be stored indefinitely, allowing anyone to verify that the current chain had that blockhash in the past, and therefore allowing anyone to create permanent 'fork points' at any time.

This is interesting. However, it wouldn't offer replay protection to private chains or forks where the contract wasn't deployed. Also, there's no incentive to submit blocks.

As others have suggested, 'blocklimit' is redundant

Agreed. I think setting some sort of pool of blocks to check (last 256 + every 10k since genesis + genesis) is sufficient.

@Arachnid is the go team working on replay attack protection? I'm wondering if I should turn this into a formal EIP or let those with better knowledge of the problem take the lead.
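The suggested pool of checkable blocks (last 256 + every 10k since genesis + genesis) could be built roughly like this. The numbers and function name are taken from the comment above and are illustrative, not a spec.

```python
# Illustrative sketch of the suggested hash pool: the last 256 block
# hashes, every 10,000th block since genesis, plus genesis itself.

def build_pool(chain):
    """chain: list of block hashes indexed by height; returns the set of
    hashes a transaction may reference."""
    head = len(chain) - 1
    pool = set(chain[max(0, head - 255):])  # last 256 blocks
    pool.update(chain[::10_000])            # every 10,000th block
    pool.add(chain[0])                      # genesis, explicitly
    return pool

# Example with fake hashes for a 30,000-block chain:
chain = [i.to_bytes(32, "big") for i in range(30_000)]
pool = build_pool(chain)
assert chain[0] in pool            # genesis
assert chain[20_000] in pool       # a 10,000-multiple checkpoint
assert chain[-1] in pool           # recent block
assert chain[5_000] not in pool    # old, non-checkpoint block
```

The pool stays small (256 recent hashes plus one checkpoint per 10k blocks), so membership checks remain cheap for validators.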

Collaborator

Arachnid commented Jul 31, 2016

This is interesting. However it wouldn't offer replay protection to private chains or forks where the contract wasn't deployed.

Deploying that contract would be part of the process of deploying a new chain, just like the built-in contracts.

Also there's no incentive to submit blocks.

The cost is minimal, and once you've submitted a block, you can use it in future TXes for replay prevention. I think tscs37's point that it would allow someone to bypass the "expiry" feature by submitting its blockhash is a good one.

My goal was to prevent the expansion of state required by fast and light clients beyond what's under the state hash and the 256 most recent block hashes, which checking in every nth block would do - clients would need to explicitly fetch those block headers when fast syncing in order to be able to validate TXes.

@Arachnid is the go team working on replay attack protection? I'm wondering if I should turn this into a formal EIP or let those with better knowledge of the problem take the lead.

We've been discussing it, and I believe someone's working on an EIP; you might ask about it in the go-ethereum channel.

Smithgift commented Jul 31, 2016

👍. This has more uses than merely large scale forks a la the current situation. If you wish for transaction A to work and only work in a given microfork (for example, a transaction that only makes sense if another transaction was or wasn't sent) then this is quite useful. Or if you wish to embargo a 51% attacker's chain, then this is a very simple way to do it.

tscs37 commented Jul 31, 2016

I think the intention should be less about preventing replays across forks; that can be a neat side-effect.

But it can be used to make light and offline clients much easier to write, and if done the right way, it could allow single addresses to issue concurrent transactions.

With the current nonce system it's not easily possible to distribute the generation of transactions, e.g. for a large exchange or bank.

Contributor

holiman commented Jul 31, 2016

I also like this idea. Mostly, I like the contract-route. So I would suggest the following.

  1. A transaction must contain a "valid-after-hash" field. Let's call it VAH. This makes it a hard fork.
  2. The VAH must contain a hash from pool P in order to be valid.
  3. The pool P consists of the following hashes:
    • The hashes of all odd-numbered blocks from the most recent 256 blocks.
    • The genesis hash.
    • All hashes in the contract C.

The reason for using hashes only from odd-numbered blocks is that we can ensure that C only contains hashes from even-numbered blocks. This makes it possible to obtain the expiration feature on transactions.

I sketched up an example of how contract C could look:

contract HashSave {

    mapping (bytes32 => bool) hashes;

    function save(uint blockNumber) returns (bytes32)
    {
        if (blockNumber % 2 != 0) throw;
        // Could instead add more safeguards;
        // the check below only allows saving every 128th block,
        // around once every 30 minutes with 14s blocktime
        //if (blockNumber % 128 != 0) throw;

        bytes32 hash = block.blockhash(blockNumber);
        if (hash != 0) {
            hashes[hash] = true;
            // Todo: add event
            return hash;
        }
        return 0;
    }
}

As you can see, anyone can submit a hash, a "savepoint", to the contract. If we are afraid that this will become a DoS vector, where an attacker submits every (other) hash, making verification of tx:s more difficult as the pool cache grows (if this verification fails, there's no gas cost, remember), we may want to add further restrictions so that only every N:th block hash can be saved.

Pros:

  • This makes it easy to handle forks safely, both for regular wallets and offline wallets.
    • Regular wallets can use a recent odd-numbered blockhash (presumably from six blocks back or so, since the chain can be reorganized).
    • Offline wallets can use genesis, or a 'savepoint'/'forkpoint' on the desired chain.
  • Light clients need only recent hashes (they already have those) and the current state.
  • It is possible to make tx:s that expire after a while (most of them probably will). This will also make it easier to clean out the transaction-pool caches of pending (but stale) transactions.

Cons:

  • It is a hard fork.
  • It adds a verification step to transactions. This validation is not paid for by the sender.
tscs37 commented Jul 31, 2016

I like your mockup, this could certainly work.

My only concern is the savepoint numbers. I'd prefer every 512th block to be saved, maybe even every 1024th block. A savepoint every couple of hours gives enough safety margin to operate safely across a fork with a good majority of hashpower.

This could be included into the weak subjectivity of Serenity, so that these savepoints also act as beacons for clients to safely join the network.

To make this more useful I'd recommend the following configuration: instead of using a bytes32 => bool mapping, I'd use a mapping of bytes32 => bytes32, where the key is the blockhash of the beacon and the value is a merkle-root hash of all beaconed blocks, which should be easily calculable.

The save function should simply attempt to save the last block that could be considered a beacon. If none of the last 256 blocks is a beacon candidate, it throws. (This requires the beacon interval to be above 256, or else you get double calls on the same beacon.) It should largely operate automatically, with no external input parameters.

This could allow full nodes like geth to use this contract for an even faster sync, where they sync to the state of the last beacon and then sync backwards through the network, becoming operational within minutes. Since a forked chain will, within the beacon interval, eventually end up with a different beacon merkle root, the fast sync will very quickly detect it and can consider other chain sources.

On creation, the contract precompile should include the genesis block and, in a second operation, block number 1920010.
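The bytes32 => bytes32 beacon mapping described above can be sketched like this. Everything here is illustrative (names, hash function, tree shape); the point is only that a diverging fork's beacon set yields a different root.

```python
# Rough sketch: key is the beacon block's hash, value is a merkle root over
# all beacon hashes seen so far. SHA3-256 stands in for the real hash.
import hashlib

def merkle_root(leaves):
    """Simple pairwise merkle root over a list of 32-byte hashes."""
    if not leaves:
        return b"\x00" * 32
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:                 # duplicate last leaf on odd count
            level.append(level[-1])
        level = [hashlib.sha3_256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

beacons = {}        # models the bytes32 => bytes32 mapping
seen = []
for h in [b"\x01" * 32, b"\x02" * 32, b"\x03" * 32]:   # fake beacon hashes
    seen.append(h)
    beacons[h] = merkle_root(seen)         # root over all beacons so far

# A fork with a different third beacon produces a different root:
forked = merkle_root([b"\x01" * 32, b"\x02" * 32, b"\xff" * 32])
assert beacons[b"\x03" * 32] != forked
```

A syncing client holding one trusted root can thus reject any chain whose beacon history hashes to something else.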

Collaborator

Arachnid commented Aug 1, 2016

It's worth noting that using odd/even numbers means that a reorg could change the meaning of your transaction; we need to define what happens to a transaction whose VAH is an uncle; does it get ignored?

I don't see a problem with making the save-interval on the contract small, because any resource consumption attack requires a lot of resources from the attacker, for very low return (lookups are O(log n), and it would take a lot of ether to bloat the contract past what can be easily cached).

@tscs37 Can you elaborate on why you'd store a merkle root against the hash, and how it would work?

Also, it's unclear to me that having such a contract would improve sync; the contract wouldn't by nature be any more reliable than the information it can get from the nodes it's trying to sync with.

tscs37 commented Aug 1, 2016

Storing a merkle root allows clients to very quickly sync up with the correct network if multiple competing chains are present, allowing forks even against the GHOST block-weight protocol if the client implements it.

This means that if, for example, miners execute a softfork, even a majority hashrate has no meaning, as users can opt to use the non-softforked chain.

It also enables weak subjectivity; a single merkle root hash is enough for a client to instantly sync to the correct chain by inspecting the block data/state.

I'd still try to keep the save interval somewhat big; a short interval is not that necessary, and at 512 blocks the maximum wait time is 58.9 minutes, which also means the last savepoint is at most 58.9 minutes old.

Re: uncles; as uncles are valid parts of the chain, they should also be allowed as VAH values.

Contributor

holiman commented Aug 1, 2016

My thoughts:

Re reorgs and odd/even: a reorg never changes the block number.
Re uncles: they should be ignored, since they are not accessible from within the EVM.

tscs37 commented Aug 1, 2016

The problem with using blocknumbers is that they are not replay safe.

If a user fears that a reorg might drop the transaction, they can use a block that is farther in the past. Using a block hash from a couple of beacon intervals back is enough safety against reorgs.

Uncles are not accessible to the EVM, but they are accessible to the validating nodes. This is equally well solved by waiting for blocks to mature a bit as it is by using uncles.

This might be an interesting extension to the EVM: being able to access uncle data, like their blocknumber and hash, as part of future consensus algorithms.

Contributor

holiman commented Aug 1, 2016

Nice discussion going!

So a few thoughts...

Uncles

I have two objections to accepting uncles as VAH. Firstly, from a practical perspective, uncles are not accessible from within the blockchain and thus can never be saved by contract C. Also, I believe it would add a lot of complexity, since uncles (and grand-uncles) are complex as it is.

Secondly, from a more theoretic perspective, uncles represent "roads not taken". By stating "this transaction is only valid after hash X", we want the transaction to only happen if block X has happened. So if X is an uncle, it means that the chain we aimed for with the VAH has explicitly been abandoned.

Blocknumbers

The problem with using blocknumbers is that they are not replay safe

@tscs37 Could you elaborate? Replay safe in what fashion? Using blocknumbers how?

I touched on the problem with reorganisations above: "Regular wallets can use a recent odd-numbered blockhash (presumably from six blocks back or so, since chain can be reorganized)."

I don't understand any other potential problems with chain reorganisations. Any mined block (both mainchain and uncle) has a hash and a blocknumber. A block can never be reorganized from being number 34535 into being number 34536.

More clever use of key->value

I don't understand the syncing problem that we're solving with merkle trees either, actually. And I'm not saying that as in "you're wrong", but genuinely I do not understand :)

This could allow full nodes like geth to utilize this contract for an even faster sync to the network where they sync to the state of the last beacon and then sync backwards

If they sync up until the last hour, aren't they already synced?

Having said that, I agree that if we can put something more clever and usable instead of true, I'm all for it. We don't need to worry about gas consumption; almost the opposite.

Beacon interval

I'm torn on the intervals. I definitely feel that 8 hours is unnecessarily long. If, somehow, for some reason, an unplanned HF occurs, there could be a lot of replayed transactions occurring in that timespan. 1 hour sounds better to me.

it would take a lot of ether to bloat the contract past what can be easily cached

Would it really? BTCRelay has 110742 entries, all bitcoin block headers since a while back. I'm sure it wasn't cheap, but doable. And it's pay once, live with it forever.

tscs37 commented Aug 2, 2016

I don't understand any other potential problems with chain reorganisations. Any mined block (both mainchain and uncles) has a hash and a blocknumber. A block can never be reorganized from being number 34535 into being 34536.

If a fork were to occur at block 34534, then both chains now have differing blocks 34535 and 34536, so using blocknumbers will not be replay safe across those chains.


I'm in favor of slightly longer beacon intervals because it allows forks to assert themselves.

Until the winning chain is determined some time may pass and until then it might be beneficial to replay as much as possible on both chains. After one or a few hours the network should have mostly stabilized on one fork.

Contributor

holiman commented Aug 2, 2016

Ah, I think there has been a misunderstanding. I never meant to use blocknumbers, but hashes from a pool.
So what I meant was that the pool contains hashes from odd-numbered recent blocks (and contract C hashes, and genesis).
And the hashes from odd-numbered recent blocks have expiry built in.

Collaborator

Arachnid commented Aug 2, 2016

Would it really? BTCRelay has 110742 entries, all bitcoin block headers since a while back. I'm sure it wasn't cheap, but doable. And it's pay once, live with it forever.

So, thinking about this more, submitting lots of hashes doesn't seem like a viable attack strategy. In order to force a client to keep an otherwise-irrelevant hash in memory, you have to submit transactions that reference it, which means they're valid, and thus cost gas. The only other attack is to ensure that the contract's list is too large to be entirely cached in memory - which clients will likely assume anyway for simplicity - and thus force the client to do one trie lookup for each transaction you submit with an invalid hash.

We could specify that transactions from accounts with a balance but an invalid hash are valid transactions, but have no effect on the state besides charging them a modest amount of gas for the lookup, but I personally don't think the effort of doing a second trie lookup (the account balance being the first) is a big enough threat to warrant that.

tscs37 commented Aug 2, 2016

We could specify that transactions from accounts with a balance but an invalid hash are valid transactions, but have no effect on the state besides charging them a modest amount of gas for the lookup, ...

If the client operates correctly then invalid hashes should not appear, so invalid hashes are a sign the transaction is from another network or a malfunctioning client.

In either case, deducting gas from the account could cause issues again, where transactions on one network are replayed to incur gas costs on the other.


After some thinking I believe that a 128-block Beacon Interval would be far enough apart to not store excessive data on the chain, and also not so far that a fork will cause havoc with replays for very long (about 30 seconds)

Beacons should expire after some time so clients, even light clients, can keep the entirety of the beacon list in memory.

By default this can be placed at 1024 * 1024 blocks; this means transactions are generally valid for roughly 120 days before they'll be dropped, and there are only about 8192 active beacons. If the blocktime decreases by more than a factor of 10, it might be viable to fork so that this value can be increased accordingly.

If a transaction is expected to hang around in the mempool for a while it can use the Genesis Hash, which is basically beacon #0 and virtually never expires.

In short that means a client needs to hold the following memory contents:

1x 32 bytes for the Genesis hash
8192x 32 bytes for Beacons + 32 bytes for Expiry Block Number
1x 32 bytes for incoming Beacons as buffer + 32 bytes for incoming Beacons Expiry Numbers
= 524384 bytes = 512.1 kBytes

The Beacon Contract should also store the block in which the beacon was stored, i.e. the beacon for block 1920111 was stored at block 1920122.
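The memory budget above can be checked with a quick calculation (a sketch; the 8192 figure is simply the 1024 * 1024-block expiry divided by the 128-block beacon interval, and all constant names are illustrative):

```python
# Sketch of the client-side memory budget for the beacon list described above.
BEACON_INTERVAL = 128            # blocks between beacons
EXPIRY_BLOCKS = 1024 * 1024      # beacons expire after ~120 days at 10 s blocks

active_beacons = EXPIRY_BLOCKS // BEACON_INTERVAL   # 8192 beacons kept in memory

genesis = 32                                  # 1x 32-byte genesis hash
beacons = active_beacons * (32 + 32)          # hash + expiry block number each
incoming = 32 + 32                            # buffered incoming beacon + expiry

total = genesis + beacons + incoming
print(total, round(total / 1024, 1))          # 524384 bytes, ~512.1 kBytes
```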

Collaborator

Arachnid commented Aug 2, 2016

If the client operates correctly then invalid hashes should not appear, so invalid hashes are a sign the transaction is from another network or malfunctioning client.

Or it's an attack. I don't believe costing a little gas on an alternate fork is a significant concern, but as I said, I also don't think it's really necessary to do this to protect against DoS attacks.

Beacons should expire after some time so clients, even light clients, can keep the entirety of the beacon list in memory.

This would remove one of the main points of having beacons, which is to support offline clients that can't fetch recent blocks from the network. I don't believe that fetching a trie entry to validate a transaction is an undue burden.

tscs37 commented Aug 2, 2016

Offline Wallets can utilize the Genesis Hash to construct a transaction even if no network connection is present, if that is not enough security, the user can simply transport a single and recent beacon hash to the offline wallet.

I think for 99.99% of all offline wallets we can safely assume the user is capable of transporting a beacon hash to the signing software that is no older than 120 days, which is a low estimate on a 10 second blocktime.

To make this simpler, a transaction may be allowed to only use 8 hexadecimal digits of a blockhash, similar to a git commit hash.
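The abbreviated-hash idea can be sketched as a prefix match against the client's known beacon hashes, as git does with commit hashes (function name and sample hashes are illustrative, not from any client):

```python
# Sketch: accept an abbreviated beacon reference by matching it as a hex
# prefix against known beacon hashes; ambiguity or no match yields None.
def match_beacon(abbrev, known_hashes):
    matches = [h for h in known_hashes if h.startswith(abbrev.lower())]
    return matches[0] if len(matches) == 1 else None

beacons = [
    "d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3",
    "b495a1d7e6663152ae92708da4843337b958146015a2802f4193a410044698c9",
]
print(match_beacon("d4e56740", beacons))  # resolves to the full first hash
print(match_beacon("ff00ff00", beacons))  # None: no known beacon matches
```

The shorter the prefix, the higher the chance of an ambiguous match, so a client would need a rule for collisions; returning no match, as above, is the conservative choice.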

Collaborator

Arachnid commented Aug 2, 2016

Offline Wallets can utilize the Genesis Hash to construct a transaction even if no network connection is present, if that is not enough security, the user can simply transport a single and recent beacon hash to the offline wallet.

The genesis hash won't protect against forks, and "simply transport a single and recent beacon hash" is papering over a lot of complexity.

I honestly don't see the point in expiring beacons. It adds complexity both to the contract and to the implementations, for something that I think is a nonissue.

Contributor

holiman commented Aug 2, 2016

Agree w @Arachnid.

tscs37 commented Aug 2, 2016

Right, expiring beacons makes it complicated.

If the blockhash is valid indefinitely, then wallet owners can use any hash as long as it is not from before a recent fork.

tscs37 commented Aug 2, 2016

I've been playing a bit with solidity and I think I came up with a simple contract that conveys the beacon thing well enough, IMO.

https://gist.github.com/tscs37/dccee713495dd1502f8dfede1740dda8

deployed on testnet at: 0x8c7Aec1F480d8DAc5F045DA62e091AFAd71311FC

To validate, a client would simply do this:

if (beacons.checkBeacon(tx.beacon) < block.number)
    return valid
else
    return invalid

the arithmetic works in 32-bit just as it works in 256-bit, so light clients won't need to perform the EVM's full 256-bit arithmetic.

it stores the blocknumber with a 1 offset so that the zero value of the mapping wraps around and becomes bigger than the blocknumber; additionally, the genesis block is not a special case in the contract, which is advantageous if someone wants to start a new chain.

I also noticed that a nonce will still be needed, even if the Beacon Interval goes down to 2 or even 1: even then, transactions cannot be replayed on other networks, but on the same network they can.

For Offline Wallets there are three options now:

  • Use the Genesis Hash, not secure against HF-replays
  • Use some hash after a recent hardfork, secure against HF-replays
  • Use legacy format and rely on nonces as before
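The +1-offset trick described above can be sketched outside the EVM; here unsigned wrap-around is emulated with a modulus, and all names (`save_beacon`, `check_beacon`) are illustrative rather than taken from the gist:

```python
# Sketch of the checkBeacon logic: block numbers are stored with a +1 offset,
# so the mapping's zero value (no such beacon) wraps to the maximum uint when
# 1 is subtracted, which is always greater than the current block -> invalid.
UINT_MAX = 2**256

saved = {}  # beacon hash -> block number + 1

def save_beacon(beacon_hash, block_number):
    saved[beacon_hash] = block_number + 1

def check_beacon(beacon_hash):
    # dict.get defaults to 0; (0 - 1) wraps to 2**256 - 1 in unsigned math
    return (saved.get(beacon_hash, 0) - 1) % UINT_MAX

def is_valid(tx_beacon, current_block):
    return check_beacon(tx_beacon) < current_block

save_beacon(0xabc, 1920000)
print(is_valid(0xabc, 1920122))  # True: beacon recorded before current block
print(is_valid(0xdef, 1920122))  # False: unknown beacon wraps to max uint
```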
Contributor

holiman commented Aug 2, 2016

I looked through it briefly, some comments:

function getLastBeaconHash() constant public returns (bytes32) {
    return block.blockhash(block.number - (block.number % BeaconInterval));
}

This will only work if saveBeacon is actually invoked every beacon interval. In my mind, I'm thinking that this will only be called maybe once or twice a year (barring attacks). So a small interval means higher resolution when we legitimately need to call it - at hard forks. But I don't expect benign users to actually call it at regular intervals.

And yeah, I agree that some const methods are needed to get the last couple of beacons or so, so maybe that will have to be added also.

I didn't consider the genesis hash to be part of the contract C. I think from a test-perspective, it may be easier if genesis is left out (then they can use the same code and prepopulate it into the testnet/privatenet, instead of actually deploying a contract with the proper genesis).

So the contract's only 'mission', is to provide checkpoints after forks.

And yes, nonce is still needed.

And lastly, the "legacy format" will have to go. This must be a mandatory field; RLP-encoding does not allow optional fields. We could add another flag, or give special meaning to the hash 0x000...0, but I don't see any reason to make it optional, really.

tscs37 commented Aug 2, 2016

I noticed that flaw too, but if a client is using the contract to fetch the last beacon it could also check if the beacon is valid.

The contract could simply store the last saved beacon and use that for getLastBeaconHash() and getLastBeaconNumber()

Some genesis is needed for the deployment because when the contract goes live and starts validating transactions, it has no beacons and therefore all transactions would be invalid or not replay safe.

So it needs an initial genesis beacon to cover the deployment on the chain.

Personally, I'm hoping people will call the saveBeacon() function more often than a couple times a week. The gas cost of doing so is minimal and it provides network security.

Since legacy is a no-no, the options for fallback are either the 0x0 hash or using the genesis hash.

0x0 would provide zero cross-network security; genesis would provide some.

As the contract already includes genesis, making the hash 0x0 a special case that is always valid would be sufficient.

Contributor

holiman commented Aug 2, 2016

Some genesis is needed for the deployment because when the contract goes live and starts validating transactions, it has no beacons and therefore all transactions would be invalid or not replay safe.

Not at all. See my proposal above:

A transaction must contain a "valid-after-hash" field.

  • The VAH must contain a hash from pool P, in order to be valid.
  • The pool P consists of the following hashes:
    • The hashes of all odd-numbered blocks from the most recent 256 blocks.
    • The genesis hash
    • All hashes in the contract C

So the genesis does not need to be in C. The C contract is one of three ways to 'anchor' a transaction to a hash.

I'd rather not have an opt-out 0x00..0 hash, but if someone makes a strong case for it, I could probably be convinced otherwise.
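The pool-P rule above can be sketched as a simple set-membership check; the VAH is valid if it is the genesis hash, an odd-numbered hash among the most recent 256 blocks, or a hash recorded in contract C (all names here are illustrative):

```python
# Sketch of the "valid-after-hash" (VAH) rule: build the pool P from the
# three sources in the proposal, then check membership.
def build_pool(genesis, recent_blocks, contract_hashes):
    # recent_blocks: list of (block_number, block_hash) for the last 256 blocks
    pool = {h for number, h in recent_blocks if number % 2 == 1}  # odd-numbered
    pool.add(genesis)                 # the genesis hash
    pool |= set(contract_hashes)      # all hashes stored in contract C
    return pool

def vah_valid(vah, pool):
    return vah in pool

recent = [(100, "h100"), (101, "h101"), (102, "h102")]
pool = build_pool("genesis", recent, {"fork1"})
print(vah_valid("h101", pool))   # True: odd-numbered recent block
print(vah_valid("h102", pool))   # False: even-numbered, not genesis, not in C
```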

tscs37 commented Aug 2, 2016

I updated the gist accordingly and redeployed to 0xa950fa6eF54afeb3B6EA5BFdF3fe15A50862274D on the testnet.

The Contract should sufficiently act as a Beacon Pool for Clients to pick up.

Otherwise I don't see any further problem with this.

The 0x0 hash is pretty much an option that might be useful for Offline Wallets, but it offers no advantages over using the genesis hash.

Contributor

holiman commented Aug 3, 2016

The 0x0 hash is pretty much an option that might be useful for Offline Wallets, but it offers no advantages over using the genesis hash.

Yes, the problem I see is that there is an asymmetry regarding who decides and who takes the risk. So, if there is a hardware wallet developer, they could choose to use 0x0 for their hw wallet. They know that this is risky, but only a very small risk, and only in special circumstances.

Later on, a non-technical user who is using this wallet may not know about the potential once-on-a-blue-moon vulnerability, and lose funds because of it. So the person who decided to take the risk is a very technical person, but the person losing funds because of it knew nothing about it.

I'd rather not provide an 'insecure replay-prone mode', but instead force wallet-developers to actually make the wallets configurable with a blockhash.

Collaborator

Arachnid commented Aug 3, 2016

So, Vlad and I had an interesting conversation last night about this; he had a simpler proposal that I haven't been able to find any flaws in.

Effectively, his suggestion is just that the block hash should be required to be the hash of the most recent fork block. Clients only need a list of current and planned fork block numbers in order to be able to compute the correct value at a given block height, and as soon as there's a fork, all transactions published before the fork become invalid on the hardfork side.

This seems significantly simpler than our proposal, and I can't think of any significant downsides. What do others think?
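Vlad's rule reduces to a lookup over a small, client-known table: the required hash is that of the most recent fork block at or below the current height. A sketch (the fork-block numbers are illustrative, with genesis counting as the first "fork"):

```python
import bisect

# Sketch: given a sorted list of fork block numbers known to the client,
# find the most recent fork block at or below the current chain height.
FORK_BLOCKS = [0, 1150000, 1920000]   # genesis counts as the first fork

def latest_fork_block(height):
    i = bisect.bisect_right(FORK_BLOCKS, height) - 1
    return FORK_BLOCKS[i]

print(latest_fork_block(1900000))  # 1150000: last fork at or below this height
print(latest_fork_block(1920000))  # 1920000: a fork block references itself
```

A client would then require the transaction's hash field to equal the hash of that block, so every transaction signed before a fork becomes invalid on the forked chain.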

tscs37 commented Aug 3, 2016

This could work but it only protects against planned hardforks.

If there was an unplanned fork of any kind that created a network split, it would not protect against replays at all.

Using a pool of hashes from uneven blocks and the beacons allows clients to safely tether to their specific chain with very little chance of being vulnerable to replay attacks in the long term.

Smithgift commented Aug 3, 2016

@tscs37 👍. I don't think we can plan on planned forks being the only event.

That said, I think it would be worthwhile even if it is only for hardforks, considering that any future hardfork may have two survivors. (Or more?)

I believe this will be a temporary (if quite useful) patch, as once accounts are contracts too, this can be done at the EVM level, not the protocol level. A standard would still be desirable for that day.


Collaborator

Arachnid commented Aug 3, 2016

An unplanned HF is almost certainly due to a bug in one or more client implementations - and I'd expect the bug to be fixed and the network to be merged again as soon as possible; in that situation replays across the forks are desirable, since they reduce the odds of a rollback when that happens.

Contributor

holiman commented Aug 3, 2016

Effectively, his suggestion is just that the block hash should be required to be the hash of the most recent fork block. Clients only need a list of current and planned fork block numbers in order to be able to compute the correct value at a given block height, and as soon as there's a fork, all transactions published before the fork become invalid on the hardfork side.

Isn't that more or less what we're doing with the C contract? When I wrote "will only be called maybe once or twice a year", I was referring to only the forks. And the knowledge about the most recent fork, where would that come from, if it's not from a contract?

So it sounds like the proposal above, but without genesis and without most-recent hashes. And it sounds like that would miss a couple of things:

  • Less useful on private networks, unless genesis is considered the most recent fork block before any fork occurs.
  • Does not make stale transactions time out, meaning they can slosh around in transaction-pools waiting for balance to become large enough. From purely anecdotal evidence, what I've read on various channels (gitter / reddit), this seems to have caused problems from time to time.
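The extended rule above ("most recent fork block hash or 256 most recent block hashes", plus an expiry timestamp) can be sketched as a simple validity predicate. This is a rough illustration only, not client code; the field and parameter names, and the 24-hour maximum expiry, are assumptions for the sketch.

```python
# Illustrative sketch of the discussed validity rule: a transaction is
# acceptable if it is bound either to the latest fork block hash (the
# "effectively-infinite" option) or to one of the 256 most recent block
# hashes, and its optional expiry timestamp has not passed. All names
# and the expiry window are hypothetical.

MAX_EXPIRY_PERIOD = 24 * 60 * 60  # assumed max allowable expiry, in seconds

def is_acceptable(tx_blockhash, tx_expiry, fork_hash, recent_hashes, now):
    """tx_blockhash: hash the transaction is bound to.
    fork_hash: hash of the latest fork block on this chain.
    recent_hashes: hashes of recent blocks, newest last."""
    if tx_expiry is not None:
        # Reject already-expired transactions and expiry periods beyond
        # the maximum allowed window.
        if tx_expiry < now or tx_expiry > now + MAX_EXPIRY_PERIOD:
            return False
    return tx_blockhash == fork_hash or tx_blockhash in recent_hashes[-256:]
```

As the comment notes, a sender wanting to flood the pool can always pick the fork-hash option, which is why the expiry timestamp is the stronger mitigation.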

@tscs37


tscs37 Aug 3, 2016

If we really wanted to mitigate this for good, I'd suggest adding an expiry timestamp field to the TX, with a max allowable expiry period.

I think this would be great to have on transactions, expiring transactions would be a sane default for some clients.

Yes, but without the contract it'd have fewer moving parts.

It requires the maintainer of any client to forever include all hashes of forks, possibly hardcoded or as a configuration file that is delivered with the client.

Putting it on the chain means the developers only need to know the address of that contract and then they can validate it on any network, they don't have to care about any past forks, about the hashes of those blocks, etc., the EVM handles validation of transactions, the beacons are maintained by the community as a whole.


@Arachnid


Arachnid Aug 4, 2016

Collaborator

It requires the maintainer of any client to forever include all hashes of forks, possibly hardcoded or as a configuration file that is delivered with the client.

No, just the block numbers - and they have to be included in the client anyway, so it knows when to change rulesets.


@tscs37


tscs37 Aug 4, 2016

Personally, I think it would in general be a good move to put some or all of transaction validation onto the EVM, so this could be a first step toward that.

Less code on the client side, and a globally shared "client".


@holiman


holiman Aug 21, 2016

Contributor

@Arachnid I have thought a bit about the scheme with only a set of blocknumbers, delivered with the client (possibly hardcoded).

Let's say there's another planned (controversial) HF. Before HF, all clients run software S.
In preparation for the HF, we release S', which has block X added to its list of fork block numbers.

Now, let's say there's some controversy. And some people prefer not to apply the HF. And some just don't update. Then we have

  • HF on S' : Chose to update, and can use hash of X (on forked chain) in transactions.
  • Non-HF on S': Chose to update, but not fork, and can use hash of X (on non-forked chain) in transactions
  • Non-HF on S : Did not update, and must use a previous hash, since they do not recognize X as a valid fork-block. These transactions can be replayed by the two above.

So, unless I'm mistaken in my reasoning above, I believe an on-chain fork-block-oracle which both updating and non-updating clients can agree about, is better.

On a side-note, should we branch this EIP off into a separate EIP, since what we're discussing here has deviated quite a lot from the original suggestion?

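The three client populations in this scenario can be captured in a toy model. This is purely illustrative (the hash values and group labels are placeholders) and assumes a rule where a client accepts any fork-block hash it recognizes, which is what makes the S transactions replayable.

```python
# Toy model of the scenario above. Under a rule that accepts any
# fork-block hash the client recognizes, transactions from non-updated
# S clients must reference a pre-X hash, which every group still
# recognizes -- so those transactions replay on both chains.

PRE_X = "hash_of_fork_before_X"   # recognized by everyone
X_HF = "hash_of_X_on_forked_chain"
X_NOHF = "hash_of_X_on_nonforked_chain"

recognized = {
    "HF on S'":     {PRE_X, X_HF},
    "non-HF on S'": {PRE_X, X_NOHF},
    "non-HF on S":  {PRE_X},       # S does not know X is a fork block
}

def accepts(group, tx_fork_hash):
    return tx_fork_hash in recognized[group]

# Transactions bound to PRE_X are accepted by all three groups.
s_tx_replayable = all(accepts(g, PRE_X) for g in recognized)
```

Requiring the *latest* recognized fork hash only (as proposed in the next comment) removes PRE_X from the two S' groups' accepted sets, leaving exactly two non-overlapping networks.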

@Arachnid


Arachnid Aug 21, 2016

Collaborator

Now, let's say there's some controversy. And some people prefer not to apply the HF. And some just don't update. Then we have

If we specify that a transaction is only valid if it specifies the hash of the latest fork block, then this is solved straightforwardly: upgraded clients that decline the fork should continue to treat the previous hash as the only acceptable one, as will non-upgraded clients. Only upgraded clients choosing the fork will accept the new fork block hash, ensuring that there are exactly two networks, and transactions can't be replayed between them.

I'd very much like to hear @vladzamfir chime in on this, though, since it was originally his suggestion.

On a side-note, should we branch this EIP off into a separate EIP, since what we're discussing here has deviated quite a lot from the original suggestion?

I don't think so; the goal is to find a solution to replay protection, and if we created a new EIP, at most one of the two would be implemented anyway.


@tscs37


tscs37 Aug 21, 2016

The problem the on-chain verification solves is that of manipulating a client to relay transactions anyway.

Some software S' could include transactions with invalid fork hashes anyway, accepting old transactions and thus creating a one-way bridge for replaying transactions. If we had a controversial fork now, developers could decide to include transactions from the original chain but not let transactions from their chain onto the original chain.

If transaction verification moved on-chain, creating this one-way bridge would most likely incur slightly higher transaction costs (due to the additional logic in the contract) and would also require rewriting a contract on-chain, so it would not be as simple as changing the network ID or applying a simple consensus-breaking patch (e.g. reducing difficulty).


@holiman


holiman Aug 21, 2016

Contributor

Ah, I see, I thought the idea was a list of fork-blocks, but it's just the one.


@Arachnid


Arachnid Aug 21, 2016

Collaborator

Some software S' could include transactions with invalid fork hashes anyways, basically accepting old transactions and thusly create a one-way bridge to replay transactions. If we had a controversial fork now, developers could decide to include transactions from the original chain but not let transactions from their chain onto the original chain.

But why would they want to do that in the first place? And why should we try and stop them if they do? This seems like an odd reason to prefer one technical implementation over another.

Ah, I see, I thought the idea was a list of fork-blocks, but it's just the one.

Right. A list is required in order to permit syncing the blockchain from the genesis block, but at any one time there is only one valid hash.

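The "list needed for syncing, one valid hash at a time" point can be sketched as follows. The fork list here is entirely made up for illustration; the rule assumed is that a transaction included at height n must reference the hash of the most recent fork block at or below n.

```python
import bisect

# Sketch of how a client with a hardcoded fork list could validate
# historical transactions while syncing from genesis. FORK_BLOCKS is a
# hypothetical ordered list of (fork block number, fork block hash).

FORK_BLOCKS = [(0, "genesis_hash"), (100, "fork_a_hash"), (200, "fork_b_hash")]

def required_hash(height):
    """Hash of the latest fork block whose number is <= height."""
    numbers = [num for num, _ in FORK_BLOCKS]
    return FORK_BLOCKS[bisect.bisect_right(numbers, height) - 1][1]

def tx_valid_at(height, tx_fork_hash):
    return tx_fork_hash == required_hash(height)
```

At the chain head only the last entry ever matters, which is why at any one time there is a single valid hash; the earlier entries exist solely so that historical blocks can be re-validated.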

@konradkonrad


konradkonrad Oct 28, 2016

Note, I had a similar proposal/RFC in #127. I closed it, since I liked the more far-reaching applications of this proposal initially.

My assumption about generalized replay protection is this:

the goal is to tie signed transactions to a specific blockchain

Now that this issue has somewhat moved from the initial proposal, I have a couple of comments & questions:

BeaconBlocks

I'm not 100% sure I support the idea of somewhat fixed BeaconBlocks (every N blocks, every user interaction with a contract, ...). It does solve the issue of planned hardforks just fine. However, a more general "base block" approach allows for more flexible applications. Consider:

  • A sends TX1 and sees it mined in block B
  • also A sees a different transaction TX2 with strategic value for A mined in block B
  • based on this intel, A might consider sending TX3, bound to a chain in which TX2 happened in block B, hence using blockhash B as the binding blockhash

Why were BeaconBlocks instead of arbitrary binding blocks considered?

No blocklimit

For replay protection, blocklimit is not strictly necessary. It does however allow for a time-to-live for transactions, which may help with "strategic transactions", where a user is only willing to transact if the transaction happens in a certain timeframe.

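The time-to-live behaviour of the original proposal's blocklimit can be sketched as below. This is an illustration of the semantics stated at the top of the issue, with hypothetical names; a real implementation would work on block headers rather than a dict.

```python
# Sketch of the original proposal's inclusion rule: a transaction bound
# to (blockhash, blocklimit) may only be included within blocklimit
# blocks of the referenced block, and (0, 0) opts out of the check.

def may_include(tx_blockhash, tx_blocklimit, chain_heights, current_height):
    """chain_heights: hypothetical mapping from block hash to height
    on this chain."""
    if tx_blockhash == 0 and tx_blocklimit == 0:
        return True   # always valid, regardless of chain history
    if tx_blockhash not in chain_heights:
        return False  # bound to a block this chain has never seen
    return current_height <= chain_heights[tx_blockhash] + tx_blocklimit
```

The second check is what provides the replay protection (the other chain never contains the referenced block), while the final check is the TTL that lets "strategic transactions" expire.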
