Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EIP-2583: Penalty for account trie misses #2583

Merged
merged 7 commits into from
Aug 29, 2020
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
261 changes: 261 additions & 0 deletions EIPS/eip-draft-trie-penalty.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,261 @@
---
eip: <to be assigned>
holiman marked this conversation as resolved.
Show resolved Hide resolved
title: Penalty for account trie misses
author: Martin Holst Swende (@holiman)
discussions-to: TBD
status: Draft
type: Standards Track
category: Core
created: 2020-02-21
---


## Simple Summary

This EIP propose to introduce a gas penalty for opcodes which access the account for trie non-existant accounts.

## Abstract

This EIP proposes to add a gas penalty for accesses to the account trie, where the address being looked up does not exist. Non-existing accounts can be used in
DoS attacks, since they bypass cache mechanisms, thus creating a large discrepancy between 'normal' mode of execution and 'worst-case' execution of an opcode.

## Motivation

As the ethereum trie becomes more and more saturated, the number of disk lookups that a node is required to do in order to access a piece of state increases too. This means that checking e.g. `EXTCODEHASH` of an account at block `5` was _inherently_ a cheaper operation that it is at, say `8.5M`.

From an implementation perspective, a node can (and does) use various caching mechanisms to cope with the problem, but there's an inherent problem with caches: when they yield a 'hit', they're great, but when they 'miss', they're useless.

This is attackable. By forcing a node to lookup non-existant keys, an attacker can maximize the number of disk lookups.
Sidenote: even if the 'non-existence' is cached, it's trivial to use a new non-existent key the next time, and never hit the same non-existent key again. Thus, caching 'non-existence' might be dangerous, since it will evict 'good' entries.

So far, the attempts to handle this problem has been in raising the gas cost, e.g. [EIP-150](https://eips.ethereum.org/EIPS/eip-150), [EIP-1884](https://eips.ethereum.org/EIPS/eip-1884).

However, when determining gas-costs, a secondary problem that arises due to the large discrepancy between 'happy-path' and 'notorious path' -- how do we determine the pricing?

- The 'happy-path', assuming all items are cached?
- Doing so would that would underprice all trie-accesses, and could be DoS-attacked.
- The 'normal' usage, based on benchmarks of actual usage?
- This is basically what we do now, but that means that intentionally notorious executions are underpriced -- which constitutes a DoS vulnerability.
- The 'paranoid' case: price everything as if caching did not exist?
- This would severely harm basically every contract due to the gas-cost increase. Also, if the gas limits were raised in order to allow the same amount of computation as before, the notorious case could again be used for DoS attacks.

From an engineering point of view, a node implementor is left with few options:

- Implement bloom filters for existence. This is difficult, not least because of the problems of reorgs, and the fact that it's difficult to undo bloom filter modifications.
- Implement flattened account databases. This is also difficult, both because of reorgs and also because it needs to be an _additional_ data structure aside from the `trie` -- we need the `trie` for consensus. So it's an extra data structure of around `15G` that needs to be kept in check. This is currently being pursued by the Geth-team.

This EIP proposes a mechanism to alleviate the situation.

## Specification

We define the constant `penalty` as `TBD` (suggested `2000` gas).

For opcodes which access the account trie, whenever the operation is invoked targeting an `address` which does not exist in the trie, then `penalty` gas is deducted from the available `gas`.

### Detailed specification

These are the opcodes which triggers lookup into the main account trie:

| Opcode | Affected | Comment |
| ----- | ---------| ----------|
| BALANCE| Yes | `balance(inexistant_addr)` would incur `penalty`|
| EXTCODEHASH| Yes | `extcodehash(inexistant_addr)` would incur `penalty`|
| EXTCODECOPY| Yes | `extcodecopy(inexistant_addr)` would incur `penalty`|
| EXTCODESIZE| Yes | `extcodesize(inexistant_addr)` would incur `penalty`|
| CALL | Yes| See details below about call variants|
| CALLCODE | Yes| See details below about call variants|
| DELEGATECALL | Yes| See details below about call variants|
| STATICCALL | Yes| See details below about call variants|
| SELFDESTRUCT | No| See details below. |
| CREATE | No | Create destination not explicitly settable, and assumed to be inexistant already.|
| CREATE2 | No | Create destination not explicitly settable, and assumed to be inexistant already.|


### Notes on Call-derivatives


A `CALL` triggers a lookup of the `CALL` destination address. The base cost for `CALL` is at `700` gas. A few other characteristics determine the actual gas cost of a call:

1. If the `CALL` (or `CALLCODE`) transfers value, an additional `9K` is added as cost.
1.1 If the `CALL` destination did not previously exist, an additional `25K` gas is added to the cost.

This EIP adds a second rule in the following way:

2. If the call does _not_ transfer value and the callee does _not_ exist, then `penalty` gas is added to the cost.

In the table below,
- `value` means non-zero value transfer,
- `!value` means zero value transfer,
- `dest` means destination already exists, or is a `precompile`
- `!dest` means destination does not exist and is not a `precompile`

| Op | value,dest| value, !dest |!value, dest| !value, !dest|
| -- | --------- | -- | --| -- |
|CALL | no change | no change| no change| `penalty`|
|CALLCODE | no change | no change| no change| `penalty`|
|DELEGATECALL | N/A | N/A| no change| `penalty` |
|STATICCALL | N/A | N/A| no change| `penalty` |

Whether the rules of this EIP is to be applied for regular ether-sends in `transactions` is TBD. See the 'Backwards Compatibility'-section for some more discussion on that topic.

### Note on `SELFDESTRUCT`

The `SELFDESTRUCT` opcode also triggers an account trie lookup of the `beneficiary`. However, due to the following reasons, it has been omitted from having a `penalty` since it already costs `5K` gas.

### Clarifications:

- The `base` costs of any opcodes are not modified by the EIP.
- The opcode `SELFBALANCE` is not modified by this EIP, regardless of whether the `self` address exists or not.

### Determining the `penalty`

A transaction with `10M` gas can today cause ~`14K` trie lookups.

- A `penalty` of `1000`would lower the number to ~`5800` lookups, `41%` of the original.
- A `penalty` of `2000`would lower the number to ~`3700` lookups, `26%` of the original.
- A `penalty` of `3000`would lower the number to ~`2700` lookups, `20%` of the original.
- A `penalty` of `4000`would lower the number to ~`2100` lookups, `15%` of the original.

There exists a roofing function for the `penalty`. Since the `penalty` is deducted from `gas`, that means that a malicious contract can always invoke a malicious relay to perform the trie lookup.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is worth considering the idea which was brought up as part of "Stateless Summit". Have a separate counter, this time for the entire transaction frame, for the penalty counting. This same counter could be used for witness size counting.

Wouldn't it eliminate this attack vector?


In such a scenario, the `malicious` would spend `~750` gas each call to `relay`, and would need to provide the `relay` with at least `700` gas to do a trie access.

Thus, the effective `cost` would be on the order of `1500`. It can thus be argued that `penalty` above `~800` would not achieve better protection against state-inexistence attacks.


## Rationale

With this scheme, we could continue to price these operations based on the 'normal' usage, but gain protection from attacks that try to maximize disk lookups/cache misses.
This EIP does not modify anything regarding storage trie accesses, which might be relevant for a future EIP. However, there are a few crucial differences.


1. Storage tries are typically small, and there's a high cost to populate a storage trie with sufficient density for it to be in the same league as the account trie.
2. If an attacker wants to use an existing large storage trie, e.g. some popular token, he would typically have to make a `CALL` to cause a lookup in that token -- something like `token.balanceOf(<inexistant-address>)`.
That adds quite a lot of extra gas-impediments, as each `CALL` is another `700` gas, plus gas for arguments to the `CALL`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this dependent on the assumption that there are no popular contracts (i.e. ones with an existing large storage trie) with a method that reads a large number of (caller-determined) storage entries? If so, it would be nice to state this assumption explicitly here. (e.g. a theoretical popular token contract with a sumBalanceOf(address[]) method or something like that)

Also if I understood you correctly, I would remove "plus gas for arguments to the CALL", since the CALL opcode doesn't charge gas for input data and presumably the DoS-attacker would generate the cache-miss arguments on-chain rather than passing them as tx data. The point about the minimum of 700 gas per cache-miss is still valid though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I guess it could be several different ways to do this. However, it's much eaiser to get a 'random' number as a single stack-op, (e.g. gas), than if you have to pre-fill a memory segment with 'random' data and pass it to another method. Every argument you send is another 32 byte memory expanded on the caller-side.

One could experiment more in-depth about exactly what the costs would be, but I don't think it would be marginal.

Also if I understood you correctly, I would remove "plus gas for arguments to the CALL", since the CALL opcode doesn't charge gas for input data

You are correct that it doesn't, however, I was referring to the fact that you need to get your input first to stack, then load into memory. If you do it over and over (as opposed to a one-off multi-parameter thing), then you need to flip some bits between calls, to not hit the same thing over and over again.



## Backwards Compatibility

This EIP requires a hard-fork.

### Ether transfers

A regular `transaction` from one EOA to another, with value, is not affected.

A `transaction` with `0` value, to a destination which does not exist, would be. This scenario is highly unlikely to matter, since such a `transaction` is useless -- even during success, all it would accomplish would be to spend some `gas`. With this EIP, it would potentially spend some more gas.

### Layer 2

Regarding layer-2 backward compatibility, this EIP is a lot less disruptive than EIPs which modify the `base` cost of an opcode. For state accesses, there are
seldom legitimate scenarios where

1. A contract checks `BALANCE`/`EXTCODEHASH`/`EXTCODECOPY`/`EXTCODESIZE` of another contract `b`, _and_,
2. If such `b` does not exist, continues the execution

#### Solidity remote calls
Example: When a remote call is made in Solidity:
```
recipient.invokeMethod(1)
```

- Solidity does a pre-flight `EXTCODESIZE` on `recipient`.
- If the pre-flight check returns `0`, then `revert(0,0)` is executed, to stop the execution.
- If the pre-flight check returns non-zero, then the execution continues and the `CALL` is made.

With this EIP in place, the 'happy-path' would work as previously, and the 'notorious'-path where `recipient` does not exist would cost an extra `penalty` gas, but the actual execution-flow would be unchanged.

#### ERC223

[ERC223 Token Standard](https://github.com/ethereum/EIPs/issues/223) is, at the time of writing, marked as 'Draft', but is deployed and in use on mainnet today.

The ERC specifies that when a token `transfer(_to,...)` method is invoked, then:

> This function must transfer tokens and invoke the function `tokenFallback (address, uint256, bytes)` in `_to`, if `_to` is a contract.
> ...
> NOTE: The recommended way to check whether the `_to` is a contract or an address is to assemble the code of `_to`. If there is no code in `_to`, then this is an externally owned address, otherwise it's a contract.

The reference implementations from [Dexaran](https://github.com/Dexaran/ERC223-token-standard/tree/development/token/ERC223) and [OpenZeppelin](https://github.com/OpenZeppelin/openzeppelin-contracts/blob/1bc923b6a222e79a90f20305a459b0ee779eb918/contracts/token/ERC721/ERC721.sol#L499) both implement the `isContract` check using an `EXTCODESIZE` invocation.

This scenario _could_ be affected, but in practice should not be. Let's consider the possibilities:

1. The `_to` is a contract: Then `ERC223` specifies that the function `tokenFallback(...)` is invoked.
- The gas expenditure for that call is at least`700` gas.
- In order for the `callee` to be able to perform any action, best practice it to ensure that it has at least `2300` gas along with the call.
- In summary: this path requires there to be least `3000` extra gas available (which is not due to any `penalty`)
2. The `_to` exists, but is no contract. The flow exits here, and is not affected by this EIP
2. The `_to` does not exist: A `penalty` is deducted.

In summary, it would seem that `ERC223` should not be affected, as long as the `penalty` does not go above around `3000` gas.


### Other

The contract [`Dentacoin`](https://etherscan.io/address/0x08d32b0da63e2c3bcf8019c9c5d849d7a9d791e6#code) would be affected.

```
function transfer(address _to, uint256 _value) returns (bool success) {
... // omitted for brevity
if (balances[msg.sender] >= _value && balances[_to] + _value > balances[_to]) { // Check if sender has enough and for overflows
balances[msg.sender] = safeSub(balances[msg.sender], _value); // Subtract DCN from the sender

if (msg.sender.balance >= minBalanceForAccounts && _to.balance >= minBalanceForAccounts) { // Check if sender can pay gas and if recipient could
balances[_to] = safeAdd(balances[_to], _value); // Add the same amount of DCN to the recipient
Transfer(msg.sender, _to, _value); // Notify anyone listening that this transfer took place
return true;
} else {
balances[this] = safeAdd(balances[this], DCNForGas); // Pay DCNForGas to the contract
balances[_to] = safeAdd(balances[_to], safeSub(_value, DCNForGas)); // Recipient balance -DCNForGas
Transfer(msg.sender, _to, safeSub(_value, DCNForGas)); // Notify anyone listening that this transfer took place

if(msg.sender.balance < minBalanceForAccounts) {
if(!msg.sender.send(gasForDCN)) throw; // Send eth to sender
}
if(_to.balance < minBalanceForAccounts) {
if(!_to.send(gasForDCN)) throw; // Send eth to recipient
}
}
} else { throw; }
}
```

The contract checks `_to.balance >= minBalanceForAccounts`, and if the `balance` is too low, some `DCN` is converted to `ether` and sent to the `_to`. This is a mechanism to ease on-boarding, whereby a new user who has received some `DCN` can immediately create a transaction.

Before this EIP:

- When sending `DCN` to a non-existing address, the additional `gas` expenditure would be:
- `9000` for an ether-transfer
- `25000` for a new account-creation
- (`2300` would be refunded to the caller later)
- A total runtime `gas`-cost of `34K` gas would be required to handle this case.

After this EIP:

- In addition to the `34K` an additional `penalty` would be added.
- Possibly two, since the reference implementation does the balance-check twice, but it's unclear whether the compiled code would indeed perform the check twice.
- A total runtime `gas`-cost of `34K+penalty` (or `34K + 2 * penalty`) would be required to handle this case.

It can be argued that the extra penalty of `2-3K` gas can be considered marginal in relation to the other `34K` gas already required to handle this.

## Test Cases

The following cases need to be considered and tested:

- That during creation of a brand new contract, within the constructor, the `penalty` should not be applied for calls concerning the self-address.
- TBD: How the `penalty` is applied in the case of a contract which has performed a `selfdestruct`
- a) previously in the same call-context,
- b) previously in the same transaction,
- c) previously in the same block,
For any variant of `EXTCODEHASH(destructed)`, `CALL(destructed)`, `CALLCODE(destructed)` etc.
- The effects on a `transaction` with `0` value going to a non-existant account.

## Security Considerations

See 'Backwards Compatibility'

## Implementation

Not yet available.

## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).