Consensus upgrade to avoid transaction ID collision of deferred transactions #6115

Closed
arhag opened this issue Oct 23, 2018 · 4 comments

@arhag
commented Oct 23, 2018

Background

A contract-generated transaction (a deferred transaction), which is provided by the contract using the send_deferred intrinsic, does not need to have any particular value set for its expiration and TaPoS header fields. TaPoS and expiration validation does not apply for contract-generated transactions (they do still apply for delayed input transactions which are also considered deferred transactions).

Nevertheless, in the current implementation of apply_context::schedule_deferred_transaction these fields are replaced (overriding whatever arbitrary values were set by the contract) with particular values that would be considered valid if they were to be validated. The values chosen have the following side effects:

  • Because the TaPoS field is replaced with data that cannot be known to the contract, the contract is not able to determine what the transaction ID of the deferred transaction will actually be when it is scheduled.
  • Since the TaPoS information is set based on the head block ID, there is a low likelihood of transaction ID collision with any prior scheduled deferred transaction (excluding the issues with the replace deferred transaction bug described in #6103). This conclusion of low likelihood of collision, however, assumes that block producers continue to follow the default policy (i.e. one not enforced by consensus rules) of retiring scheduled deferred transactions no earlier than the block after the one in which they were scheduled.
  • Even though it is possible that a contract-generated transaction may conflict by ID with an input transaction, it is not very likely because of the protections provided by setting the TaPoS field to the head block ID and setting the expiration to the current block timestamp rounded up to the nearest second.

While the likelihood of transaction ID collision is low (given certain assumptions about producer behavior), it would be desirable for transaction ID collision to be virtually impossible regardless of the policy set by producers about how to retire deferred transactions. In the context of this document, "virtually impossible" means no more likely than getting a hash collision by taking a SHA256 cryptographic hash of two distinct bit streams.

Another potentially desirable property for scheduling contract-generated transactions is to make it possible for the contract to determine what the ID of the scheduled deferred transaction will actually be. However, while that is a desirable property, it is not a property we require: there should be no need for a contract to know that information, since contracts already have a sender_id to refer to sent deferred transactions.

Candidate solutions

Of the various candidate solutions considered, two of them stand out.

The first candidate (referred to as "Global deferred sequence number as TaPoS field") is to add a new global sequence number which tracks the new deferred transactions that were successfully scheduled as of the end of an action. The sequence number would be used for the TaPoS fields of the deferred transaction to provide collision resistance. This method makes it virtually impossible for collisions to occur. However, it does not make it possible for a contract to determine what the transaction ID will be (it is not acceptable to add an intrinsic to access that sequence number).

The second candidate (referred to as "Required transaction extension for contract-generated transactions only") is to use the transaction extensions feature to provide the necessary bits to make each potentially-colliding deferred transaction unique. This method makes it virtually impossible for collisions to occur, and it also makes it possible for a contract to determine what the transaction ID will be.

The details of these two candidate solutions are discussed in the subsections below.

Global deferred sequence number as TaPoS field

Under this approach, a new uint64_t field global_deferred_sequence would be added to dynamic_global_property_object (this would require a replay from genesis).

The send_deferred intrinsic called without replacement would add the sender_id to a set local to the apply_context (reset on each exec_one call) and increment global_deferred_sequence. If the send_deferred intrinsic was called with replacement, it would do the same thing, except it would only increment global_deferred_sequence if the sender_id was not already in the set. If the cancel_deferred intrinsic was called with a sender_id that was already in the set, it would decrement global_deferred_sequence.
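The bookkeeping described above can be sketched as a small standalone model. This is only an illustration under stated assumptions: the type deferred_sequence_tracker and its member names are hypothetical, the sender_id is narrowed to 64 bits for brevity (the real one is a uint128_t), and removing the sender_id from the set on cancel is an assumption that the text does not spell out.

```cpp
#include <cstdint>
#include <set>

// Hypothetical model of the per-action bookkeeping; not actual nodeos code.
struct deferred_sequence_tracker {
    // Would live in dynamic_global_property_object under this candidate.
    std::uint64_t global_deferred_sequence = 0;
    // Set local to the apply_context; sender_id narrowed to 64 bits here.
    std::set<std::uint64_t> scheduled_in_action;

    // Called at the start of each exec_one call.
    void begin_action() { scheduled_in_action.clear(); }

    void send_deferred(std::uint64_t sender_id, bool replace_existing) {
        const bool newly_added = scheduled_in_action.insert(sender_id).second;
        // Without replacement: always increment.
        // With replacement: increment only if the sender_id was not yet in the set.
        if (!replace_existing || newly_added) {
            ++global_deferred_sequence;
        }
    }

    void cancel_deferred(std::uint64_t sender_id) {
        // Decrement only if this action scheduled the sender_id earlier.
        // Erasing the entry is an assumption of this sketch.
        if (scheduled_in_action.erase(sender_id) > 0) {
            --global_deferred_sequence;
        }
    }
};
```

With this model, the sequence number at the end of an action reflects only the deferred transactions that are still scheduled as a result of that action.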

In apply_context::schedule_deferred_transaction, the following would be done instead after activation of the consensus upgrade feature:

  • The expiration field would be set to 0. This ensures that it is virtually impossible for the ID of the deferred transaction to collide with the ID of any input transaction (since an expiration of 0 is not valid for an input transaction).
  • The two TaPoS fields (collectively holding a 48-bit number) would store the lower 48 bits of global_deferred_sequence. As long as fewer than 2^48 deferred transactions were scheduled in the history of the blockchain, this ensures that it is virtually impossible for the ID of the deferred transaction to collide with the ID of any other deferred transaction.
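As a rough illustration of the two bullets above, the following sketch derives the header fields from the sequence number. The struct and function names are hypothetical, and the way the 48 bits are split between the two TaPoS fields (low 32 bits in ref_block_prefix, next 16 bits in ref_block_num) is an assumption; the proposal does not fix a layout.

```cpp
#include <cstdint>

// Hypothetical stand-in for the relevant transaction header fields.
struct deferred_header_fields {
    std::uint32_t expiration_sec;   // seconds since epoch
    std::uint16_t ref_block_num;    // TaPoS field (16 bits)
    std::uint32_t ref_block_prefix; // TaPoS field (32 bits)
};

// Stores the lower 48 bits of the sequence number in the TaPoS fields and
// sets the expiration to 0, which is never valid for an input transaction.
// The bit split below is an assumption of this sketch.
deferred_header_fields header_from_sequence(std::uint64_t global_deferred_sequence) {
    deferred_header_fields h{};
    h.expiration_sec   = 0;
    h.ref_block_prefix = static_cast<std::uint32_t>(global_deferred_sequence & 0xFFFFFFFFull);
    h.ref_block_num    = static_cast<std::uint16_t>((global_deferred_sequence >> 32) & 0xFFFFull);
    return h;
}
```

Any bits above the 48th are dropped, which is why uniqueness only holds while fewer than 2^48 deferred transactions have been scheduled.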

Since the global_deferred_sequence number would not be accessible to the contract, it would not be possible for the contract to know what the deferred transaction ID will be. It is important to not introduce an intrinsic to access this global sequence number since that may severely restrict potential future modes of nodeos operation that may only wish to run a subset of contracts (it would be required to run all other actions just to determine how global_deferred_sequence may change). It is possible to avoid actually executing the other actions in this hypothetical future limited mode of operation if the global_deferred_sequence was committed into every action_receipt, but it is still not recommended since it complicates the processing required.

Required transaction extension for contract-generated transactions only

Under this approach, the expiration and TaPoS fields are all reset to zero, and uniqueness of the deferred transaction is provided through a new transaction extension with a uint16_t type of 0 which is only meant for contract-generated transactions.

This means input transactions (whether delayed or not) would still not be allowed to include transaction extensions even after the consensus upgrade feature was activated (and also would never be allowed to use an extension with a type of 0).

Contract-generated transactions would be expected to have exactly one transaction extension with type 0 in the transaction_extensions field of the transaction header by the time scheduling completes. However, to maintain backwards compatibility, apply_context::schedule_deferred_transaction could inject this extension with the appropriate payload data if it was not already included in the transaction provided by the contract via the send_deferred intrinsic.

The payload data for this transaction extension would include in order: the ID of the transaction within which the contract that called send_deferred was executing; the sender_id chosen for the send_deferred call; and the receiver account of the contract that called send_deferred.

This payload data provides the uniqueness needed to ensure that it is virtually impossible for the ID of a contract-generated transaction to collide with any other transaction ID. A contract that sends an identical transaction with the same sender_id it sent before while executing due to an earlier transaction will end up having a different payload. A contract cannot send an identical transaction with the same sender_id twice within the same transaction without canceling or replacing the original because uniqueness is enforced on the IDs of pending deferred transactions, and there is no possible way for a deferred transaction to be retired while another transaction is still executing.

Finally, it is possible for the sending contract to compute the transaction ID of the deferred transaction that will ultimately be scheduled. The contract would know that the expiration and TaPoS fields must all be 0. To calculate the payload data of the transaction extension it would need to know the receiver and sender_id (two things it should already know), and the transaction ID of the transaction it is currently executing within. While there is no intrinsic to provide this information (though one could be added for convenience with this consensus upgrade feature), it is already possible for the contract to compute this by retrieving the entire transaction with the read_transaction intrinsic and computing the hash to get the transaction ID.

Consensus upgrade feature

The main goal of the consensus upgrade feature described in this document is to ensure that it is virtually impossible for deferred transactions to end up with a transaction ID that collides with the ID of any other transaction that is scheduled/retired in the blockchain. A secondary goal is to make this transaction ID predictable by the contract that sent the deferred transaction.

A new consensus protocol upgrade feature will be added to trigger the changes described in this consensus upgrade proposal. The actual digest for the feature understood at the blockchain level is to be determined. For the purposes of this proposal the codename NO_DUPLICATE_DEFERRED_ID will be used to stand in for whatever the feature identifier will actually end up being.

To ease the code changes and upgrade process required, this proposal avoids the first candidate solution ("Global deferred sequence number as TaPoS field"). That approach would require a replay from genesis and adds a new field to an existing index (which requires a change to the chain_snapshot_header version as well). Furthermore, it doesn't even satisfy the secondary goal.

That leaves the second candidate solution ("Required transaction extension for contract-generated transactions only") as the method used to satisfy the goals of this consensus upgrade feature.

A transaction extension with a uint16_t type of 0 is called a "generation context for a deferred transaction" for the purposes of this document. For a "generation context for a deferred transaction" extension to be well-formed, it must have payload data that consists precisely of the following byte stream (in order):

  1. A 32-byte transaction ID (called the sender transaction ID of the extension payload).
  2. A 16-byte serialization of the uint128_t sender ID (called the sender ID of the extension payload).
  3. An 8-byte serialization of an eosio::name representing an account name (called the sending account of the extension payload).

An input transaction (whether delayed or not) is never allowed to have an extension with a type of 0.
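The 56-byte payload layout listed above can be sketched as follows. This is an illustrative model, not the nodeos implementation: the struct generation_context and the helper names are hypothetical, the 128-bit sender ID is represented as two 64-bit halves for portability, and little-endian integer encoding is an assumption of this sketch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory form of the extension payload.
struct generation_context {
    std::array<std::uint8_t, 32> sender_trx_id{}; // 32-byte transaction ID
    std::uint64_t sender_id_low  = 0;             // low half of the uint128_t sender ID
    std::uint64_t sender_id_high = 0;             // high half of the uint128_t sender ID
    std::uint64_t sender         = 0;             // eosio::name value of the sending account
};

// Appends a 64-bit integer in little-endian byte order (assumed encoding).
void append_le64(std::vector<std::uint8_t>& out, std::uint64_t v) {
    for (int i = 0; i < 8; ++i) {
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
    }
}

// Serializes the fields in the order given in the list: 32 + 16 + 8 bytes.
std::vector<std::uint8_t> pack_generation_context(const generation_context& ctx) {
    std::vector<std::uint8_t> out;
    out.reserve(56);
    out.insert(out.end(), ctx.sender_trx_id.begin(), ctx.sender_trx_id.end());
    append_le64(out, ctx.sender_id_low);
    append_le64(out, ctx.sender_id_high);
    append_le64(out, ctx.sender);
    return out;
}

// A payload of any other length cannot be well-formed.
bool has_well_formed_length(const std::vector<std::uint8_t>& payload) {
    return payload.size() == 56;
}
```

Because every field has a fixed width, a length check plus a byte-for-byte comparison against the expected values is enough to validate a payload supplied by a contract.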

Changes to transaction_context

  • There should be no assertion that trx.transaction_extensions.size() == 0 in the constructor of transaction_context or in transaction_context::init.
  • An assertion that trx.transaction_extensions.size() == 0 should be added to transaction_context::init_for_input_trx (ideally prior to the call to init). This assumes that no other consensus upgrade feature has added new transaction extensions that are valid for input transactions to include.
  • An assertion that (trx.expiration.sec_since_epoch() == 0) || (trx.transaction_extensions.size() == 0) should be added to transaction_context::init_for_deferred_trx (ideally prior to the call to init). This should likely be the assertion to check even if other consensus upgrade features add new transaction extensions that are valid for contract-generated transactions to include (those checks should occur in apply_context::schedule_deferred_transaction).
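The two new assertions can be pictured with a minimal model. The type trx_model and the predicate names below are hypothetical and exist only for this sketch; in nodeos the checks would be assertions inside transaction_context::init_for_input_trx and transaction_context::init_for_deferred_trx.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, reduced view of a transaction for this sketch.
struct trx_model {
    std::uint32_t expiration_sec = 0; // trx.expiration.sec_since_epoch()
    std::vector<std::pair<std::uint16_t, std::vector<char>>> transaction_extensions;
};

// Input transactions (delayed or not) may not carry any extensions.
bool input_trx_extensions_ok(const trx_model& trx) {
    return trx.transaction_extensions.empty();
}

// Deferred transactions may carry extensions only when the expiration is 0,
// which is how contract-generated transactions are marked after activation.
bool deferred_trx_extensions_ok(const trx_model& trx) {
    return trx.expiration_sec == 0 || trx.transaction_extensions.empty();
}
```

A delayed input transaction keeps its real, non-zero expiration, so the second predicate still rejects extensions on it.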

Changes to apply_context::schedule_deferred_transaction

If NO_DUPLICATE_DEFERRED_ID has not been activated:

  • Replace the expiration field of the provided transaction with the pending block time rounded up to the nearest second.
  • Replace the TaPoS fields with the data from the head block ID.

If NO_DUPLICATE_DEFERRED_ID has been activated:

  • If the provided transaction trx has non-empty transaction_extensions then check the following:
    • Assert that trx.transaction_extensions does not have more than one extension with a type of 0.
    • Assert that either trx.transaction_extensions does not have an extension with a type of 0, or if it does, assert the following:
      • trx.expiration is 0.
      • trx.ref_block_num and trx.ref_block_prefix (the TaPoS fields) are both 0.
      • trx.transaction_extensions.front() is a well-formed "generation context for a deferred transaction" extension that also satisfies the following conditions:
        • the sender transaction ID of the extension payload is identical to the trx_context.id (which is the ID of the transaction under which the contract that called the send_deferred intrinsic is executing);
        • the sender ID of the extension payload is identical to sender_id (which is the sender ID passed into send_deferred);
        • the sending account of the extension payload is identical to receiver (which is the receiver of the action from which the send_deferred intrinsic was called).
    • For now (unless other consensus upgrade features change this) continue to assert that no other transaction extension types are included in trx.transaction_extensions.
  • If the provided transaction trx does not have a "generation context for a deferred transaction" extension included in trx.transaction_extensions then do the following:
    1. Replace trx.expiration with 0.
    2. Replace both trx.ref_block_num and trx.ref_block_prefix (the TaPoS fields) with 0.
    3. Prepend to trx.transaction_extensions the well-formed "generation context for a deferred transaction" extension with the following values in its payload:
      • the sender transaction ID of the extension payload is trx_context.id;
      • the sender ID of the extension payload is sender_id;
      • the sending account of the extension payload is receiver.
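The post-activation rules above can be sketched as a single validation and injection step. This is a simplified model under stated assumptions: trx_model, require, and apply_generation_context are hypothetical names, and the expected payload is passed in as an already serialized byte vector built from trx_context.id, sender_id, and receiver, so that checking the three payload conditions reduces to a byte comparison.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

using extension = std::pair<std::uint16_t, std::vector<std::uint8_t>>;
constexpr std::uint16_t generation_context_type = 0;

// Hypothetical, reduced view of the transaction handed to send_deferred.
struct trx_model {
    std::uint32_t expiration_sec   = 0;
    std::uint16_t ref_block_num    = 0;
    std::uint32_t ref_block_prefix = 0;
    std::vector<extension> transaction_extensions;
};

// Stand-in for a consensus assertion.
void require(bool condition, const char* message) {
    if (!condition) throw std::runtime_error(message);
}

// Validates a contract-supplied extension, or injects one if none is present.
void apply_generation_context(trx_model& trx,
                              const std::vector<std::uint8_t>& expected_payload) {
    auto& exts = trx.transaction_extensions;
    const auto type0_count = std::count_if(exts.begin(), exts.end(),
        [](const extension& e) { return e.first == generation_context_type; });
    require(type0_count <= 1, "more than one generation context extension");

    if (type0_count == 1) {
        require(trx.expiration_sec == 0, "expiration must be 0");
        require(trx.ref_block_num == 0 && trx.ref_block_prefix == 0,
                "TaPoS fields must be 0");
        require(exts.front().first == generation_context_type &&
                exts.front().second == expected_payload,
                "generation context must come first and match the scheduling context");
    }
    // For now no other extension types are allowed.
    require(exts.size() == static_cast<std::size_t>(type0_count),
            "unsupported transaction extension type");

    if (type0_count == 0) {
        trx.expiration_sec   = 0;
        trx.ref_block_num    = 0;
        trx.ref_block_prefix = 0;
        exts.insert(exts.begin(), extension(generation_context_type, expected_payload));
    }
}
```

In this model an older contract that sets arbitrary header values and no extension is normalized silently, while a contract that supplies its own extension must get every field exactly right.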

Changes to controller_impl::apply_onerror

If NO_DUPLICATE_DEFERRED_ID has not been activated:

  • Continue to set the expiration of the onerror transaction to the pending block time rounded up to the nearest second.
  • Continue to set the TaPoS fields of the onerror transaction with the data from the head block ID.

If NO_DUPLICATE_DEFERRED_ID has been activated:

  • Set the expiration of the onerror transaction to 0.
  • Set the TaPoS fields of the onerror transaction to 0. (Optional. Old behavior is also fine without changing the guarantees below.)
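The switch between the old and new header values for an implicit transaction can be sketched as below. The names are hypothetical, the block time is modeled as microseconds since epoch, the head block's TaPoS values are passed in rather than derived from a block ID, and zeroing the TaPoS fields after activation is the optional choice mentioned above.

```cpp
#include <cstdint>

// Hypothetical stand-in for the header fields of an implicit transaction.
struct implicit_header {
    std::uint32_t expiration_sec;
    std::uint16_t ref_block_num;
    std::uint32_t ref_block_prefix;
};

// Rounds a microsecond timestamp up to the nearest whole second.
std::uint32_t round_up_to_seconds(std::uint64_t time_us) {
    return static_cast<std::uint32_t>((time_us + 999999) / 1000000);
}

// Before activation: expiration from the pending block time, TaPoS from the head block.
// After activation: everything zeroed (zeroing TaPoS is optional per the proposal).
implicit_header make_implicit_header(bool feature_activated,
                                     std::uint64_t pending_block_time_us,
                                     std::uint16_t head_ref_block_num,
                                     std::uint32_t head_ref_block_prefix) {
    if (feature_activated) {
        return implicit_header{0, 0, 0};
    }
    return implicit_header{round_up_to_seconds(pending_block_time_us),
                           head_ref_block_num, head_ref_block_prefix};
}
```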

For a contract-generated transaction that was scheduled after NO_DUPLICATE_DEFERRED_ID activation, we can guarantee that its onerror transaction cannot be identical to the onerror transaction of any other deferred transaction, since the unique failed deferred transaction is included in the action data payload of the eosio::onerror action. However, deferred transactions that were scheduled before NO_DUPLICATE_DEFERRED_ID activation have no such guarantee.

If two deferred transactions were scheduled before NO_DUPLICATE_DEFERRED_ID activation but one or both retire after NO_DUPLICATE_DEFERRED_ID activation, it is virtually impossible for them to have the same ID unless they were both scheduled and retired in the same block. That case should not be possible if producer policy disallows retiring deferred transactions in the same block in which they were scheduled, at least up until NO_DUPLICATE_DEFERRED_ID activation. Even if producer policy allows this, it should not be possible if NO_DUPLICATE_DEFERRED_ID activation can only occur at the beginning of a block prior to applying any transactions.

The remaining ID conflicts regarding onerror transactions to consider are the conflicts between the ID of an onerror transaction and the ID of a non-onerror transaction. By changing the expiration field (which, along with the TaPoS fields of the implicit onerror transaction, is not validated) to 0, it becomes virtually impossible for the onerror transaction ID to collide with that of any input transaction immediately after NO_DUPLICATE_DEFERRED_ID activation, whether it is an onerror transaction for a deferred transaction that was scheduled before or after NO_DUPLICATE_DEFERRED_ID activation. It would also be virtually impossible for an onerror transaction that retires after NO_DUPLICATE_DEFERRED_ID activation to have an ID collision with a deferred transaction that retires without failure and that was scheduled before or after NO_DUPLICATE_DEFERRED_ID activation.

Changes to controller_impl::get_on_block_transaction

If NO_DUPLICATE_DEFERRED_ID has not been activated:

  • Continue to set the expiration of the onblock transaction to the pending block time rounded up to the nearest second.
  • Continue to set the TaPoS fields of the onblock transaction with the data from the head block ID.

If NO_DUPLICATE_DEFERRED_ID has been activated:

  • Set the expiration of the onblock transaction to 0.
  • Set the TaPoS fields of the onblock transaction to 0. (Optional. Old behavior is also fine without changing the guarantees below.)

For the same reasons as described in the previous sub-sections, it should be virtually impossible for a deferred transaction, whether scheduled before or after NO_DUPLICATE_DEFERRED_ID activation, to have the same ID as an onblock transaction generated after NO_DUPLICATE_DEFERRED_ID activation, assuming NO_DUPLICATE_DEFERRED_ID activation can only occur at the beginning of a block prior to applying any transactions. However, due to the nature of the onblock transaction (which includes the head block header in its payload), it should unconditionally be virtually impossible to get an ID collision with a deferred transaction.

The remaining ID conflicts regarding onblock transactions to consider are the conflicts between the ID of an onblock transaction and the ID of an input transaction. By changing the expiration field (which, along with the TaPoS fields of the implicit onblock transaction, is not validated) to 0, it becomes virtually impossible for the onblock transaction ID to collide with that of any input transaction immediately after NO_DUPLICATE_DEFERRED_ID activation.

@arhag arhag added the HARDFORK label Oct 23, 2018

@arhag

commented Dec 7, 2018

This issue depends on #6429. It will also be set up by default to be a protocol feature requiring pre-activation, thus it also depends on #6431.

@arhag

commented Apr 4, 2019

Note that for deferred transactions scheduled after this feature is activated, the transaction that can be read (with read_transaction) from the execution of an action of that deferred transaction, as well as the transaction returned to the onerror handler, will include the transaction extension.

If the onerror handler tries to just resend that transaction without modifications (e.g. an auto retry), it will fail since the contents of the extension will not match the new context in which the scheduling is happening. Contracts would need to modify the extension of that transaction prior to sending it. The easiest way to do this is to just clear the transaction_extensions field.

This is relevant information to any BPs considering when/if to activate this protocol feature on a live blockchain, since it is a (subtle and hopefully rare) backwards incompatibility for contracts. However, deserializing the packed transaction with the extension into an eosio::transaction type within a contract should not be a problem (it will just now have a non-empty transaction_extensions field). Furthermore, there is no guarantee that the onerror handler would have enough time to execute anyway, so no contracts should be relying on an assumption that the side-effects of an onerror handler will be applied if the original deferred transaction objectively fails to execute.

arhag added a commit that referenced this issue Apr 4, 2019

better way of rejecting disallowed transaction extensions #6115
Allows the `num_failed` tracker and blacklist of producer_plugin to work 
as intended.
Preserves the current pattern of not retiring (except for case with 
expired status) deferred transaction with invalid extensions even after 
NO_DUPLICATE_DEFERRED_ID activation.

arhag added a commit that referenced this issue Apr 5, 2019

arhag added a commit that referenced this issue Apr 5, 2019

revert previous commit (the issue was not in the bash script) and fix the getAllBuiltinFeatureDigestsToPreactivate function in Node.py #6115

arhag added a commit that referenced this issue Apr 8, 2019

Added protocol_feature_tests/no_duplicate_deferred_id_test unit test to test the NO_DUPLICATE_DEFERRED_ID protocol feature. #6115

Updated the deferred_test test contract to support testing requirements 
of new test.
@arhag

commented Apr 11, 2019

Resolved by #7072.

@arhag arhag closed this Apr 11, 2019

@arhag

commented Apr 22, 2019

This protocol feature depends on the REPLACE_DEFERRED protocol feature (#6103). This means that the REPLACE_DEFERRED protocol feature should be activated prior to activating the NO_DUPLICATE_DEFERRED_ID protocol feature. It is also possible to activate both in the same block as long as REPLACE_DEFERRED comes first.
