Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EIP-2583: Penalty for account trie misses #2583

Merged
merged 7 commits into from
Aug 29, 2020
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
295 changes: 295 additions & 0 deletions EIPS/eip-2583.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,295 @@
---
eip: 2583
title: Penalty for account trie misses
author: Martin Holst Swende (@holiman)
discussions-to: https://ethereum-magicians.org/t/eip-2583-penalties-for-trie-misses/4190
status: Draft
type: Standards Track
category: Core
created: 2020-02-21
---


## Simple Summary

This EIP propose to introduce a gas penalty for opcodes which access the account for trie non-existent accounts.
Copy link
Contributor

@MicahZoltu MicahZoltu Aug 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EIPs don't propose, they assert. One may propose an EIP be adopted (in some other venue than the EIP document), but the EIP itself is an assertion of some change.


## Abstract

This EIP proposes to add a gas penalty for accesses to the account trie, where the address being looked up does not exist. Non-existing accounts can be used in
DoS attacks, since they bypass cache mechanisms, thus creating a large discrepancy between 'normal' mode of execution and 'worst-case' execution of an opcode.
Comment on lines +19 to +20
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of this feels like motivation, not abstract. Abstract should be a terse human-friendly version of the specification section.


## Motivation

As the ethereum trie becomes more and more saturated, the number of disk lookups that a node is required to do in order to access a piece of state increases too. This means that checking e.g. `EXTCODEHASH` of an account at block `5` was _inherently_ a cheaper operation that it is at, say `8.5M`.

From an implementation perspective, a node can (and does) use various caching mechanisms to cope with the problem, but there's an inherent problem with caches: when they yield a 'hit', they're great, but when they 'miss', they're useless.

This is attackable. By forcing a node to lookup non-existent keys, an attacker can maximize the number of disk lookups.
Sidenote: even if the 'non-existence' is cached, it's trivial to use a new non-existent key the next time, and never hit the same non-existent key again. Thus, caching 'non-existence' might be dangerous, since it will evict 'good' entries.

So far, the attempts to handle this problem has been in raising the gas cost, e.g. [EIP-150](https://eips.ethereum.org/EIPS/eip-150), [EIP-1884](https://eips.ethereum.org/EIPS/eip-1884).

However, when determining gas-costs, a secondary problem that arises due to the large discrepancy between 'happy-path' and 'notorious path' -- how do we determine the pricing?

- The 'happy-path', assuming all items are cached?
- Doing so would that would underprice all trie-accesses, and could be DoS-attacked.
- The 'normal' usage, based on benchmarks of actual usage?
- This is basically what we do now, but that means that intentionally notorious executions are underpriced -- which constitutes a DoS vulnerability.
- The 'paranoid' case: price everything as if caching did not exist?
- This would severely harm basically every contract due to the gas-cost increase. Also, if the gas limits were raised in order to allow the same amount of computation as before, the notorious case could again be used for DoS attacks.

From an engineering point of view, a node implementor is left with few options:

- Implement bloom filters for existence. This is difficult, not least because of the problems of reorgs, and the fact that it's difficult to undo bloom filter modifications.
- Implement flattened account databases. This is also difficult, both because of reorgs and also because it needs to be an _additional_ data structure aside from the `trie` -- we need the `trie` for consensus. So it's an extra data structure of around `15G` that needs to be kept in check. This is currently being pursued by the Geth-team.

This EIP proposes a mechanism to alleviate the situation.

## Specification

We define the constant `penalty` as `TBD` (suggested `2000` gas).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
We define the constant `penalty` as `TBD` (suggested `2000` gas).
We define the constant `penalty` as `2000`.

Recommended during Draft (not required until Last Call): EIPs are assertions, put the number you want in and you can always change it later (while it is in draft). Alternatively, if you don't want a sticky default just put in TBD and discuss 2000 in the discussion-to link.


For opcodes which access the account trie, whenever the operation is invoked targeting an `address` which does not exist in the trie, then `penalty` gas is deducted from the available `gas`.

### Detailed specification

These are the opcodes which triggers lookup into the main account trie:

| Opcode | Affected | Comment |
| ----- | ---------| ----------|
| BALANCE| Yes | `balance(inexistent_addr)` would incur `penalty`|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is inexistent a word? I thought only nonexistent was? Spell checker says no, Bing thinks I want to translate from French when I type it in, but there are a couple online dictionaries that have an entry for inexistent.

Recommend switching to nonexistent unless there is a strong belief that this is in fact a word and means something different than nonexistent.

| EXTCODEHASH| Yes | `extcodehash(inexistent_addr)` would incur `penalty`|
| EXTCODECOPY| Yes | `extcodecopy(inexistent_addr)` would incur `penalty`|
| EXTCODESIZE| Yes | `extcodesize(inexistent_addr)` would incur `penalty`|
| CALL | Yes| See details below about call variants|
| CALLCODE | Yes| See details below about call variants|
| DELEGATECALL | Yes| See details below about call variants|
| STATICCALL | Yes| See details below about call variants|
| SELFDESTRUCT | No| See details below. |
| CREATE | No | Create destination not explicitly settable, and assumed to be inexistent already.|
| CREATE2 | No | Create destination not explicitly settable, and assumed to be inexistent already.|


### Notes on Call-derivatives


Comment on lines +75 to +76
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: should be 1 newline or 0 (depending on personal style, have seen both used in this repository).

A `CALL` triggers a lookup of the `CALL` destination address. The base cost for `CALL` is at `700` gas. A few other characteristics determine the actual gas cost of a call:

1. If the `CALL` (or `CALLCODE`) transfers value, an additional `9K` is added as cost.
1.1 If the `CALL` destination did not previously exist, an additional `25K` gas is added to the cost.

This EIP adds a second rule in the following way:

2. If the call does _not_ transfer value and the callee does _not_ exist, then `penalty` gas is added to the cost.

In the table below,
- `value` means non-zero value transfer,
- `!value` means zero value transfer,
- `dest` means destination already exists, or is a `precompile`
- `!dest` means destination does not exist and is not a `precompile`

| Op | value,dest| value, !dest |!value, dest| !value, !dest|
| -- | --------- | -- | --| -- |
|CALL | no change | no change| no change| `penalty`|
|CALLCODE | no change | no change| no change| `penalty`|
|DELEGATECALL | N/A | N/A| no change| `penalty` |
|STATICCALL | N/A | N/A| no change| `penalty` |

Whether the rules of this EIP is to be applied for regular ether-sends in `transactions` is TBD. See the 'Backwards Compatibility'-section for some more discussion on that topic.

### Note on `SELFDESTRUCT`

The `SELFDESTRUCT` opcode also triggers an account trie lookup of the `beneficiary`. However, due to the following reasons, it has been omitted from having a `penalty` since it already costs `5K` gas.

### Clarifications:

- The `base` costs of any opcodes are not modified by the EIP.
- The opcode `SELFBALANCE` is not modified by this EIP, regardless of whether the `self` address exists or not.

### Determining the `penalty`

A transaction with `10M` gas can today cause ~`14K` trie lookups.

- A `penalty` of `1000`would lower the number to ~`5800` lookups, `41%` of the original.
- A `penalty` of `2000`would lower the number to ~`3700` lookups, `26%` of the original.
- A `penalty` of `3000`would lower the number to ~`2700` lookups, `20%` of the original.
- A `penalty` of `4000`would lower the number to ~`2100` lookups, `15%` of the original.

There exists a roofing function for the `penalty`. Since the `penalty` is deducted from `gas`, that means that a malicious contract can always invoke a malicious relay to perform the trie lookup. Let's refer to this as the 'shielded relay' attack.

In such a scenario, the `malicious` would spend `~750` gas each call to `relay`, and would need to provide the `relay` with at least `700` gas to do a trie access.

Thus, the effective `cost` would be on the order of `1500`. It can thus be argued that `penalty` above `~800` would not achieve better protection against state-inexistence attacks.
Comment on lines +112 to +123
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in the Rationale section. The rationale is where you discuss why you chose the constants you chose. The specification section should merely assert that the constant is X and it is applied here, here, and here.



## Rationale

With this scheme, we could continue to price these operations based on the 'normal' usage, but gain protection from attacks that try to maximize disk lookups/cache misses.
This EIP does not modify anything regarding storage trie accesses, which might be relevant for a future EIP. However, there are a few crucial differences.


1. Storage tries are typically small, and there's a high cost to populate a storage trie with sufficient density for it to be in the same league as the account trie.
2. If an attacker wants to use an existing large storage trie, e.g. some popular token, he would typically have to make a `CALL` to cause a lookup in that token -- something like `token.balanceOf(<inexistent-address>)`.
That adds quite a lot of extra gas-impediments, as each `CALL` is another `700` gas, plus gas for arguments to the `CALL`.


## Backwards Compatibility

This EIP requires a hard-fork.

### Ether transfers

A regular `transaction` from one EOA to another, with value, is not affected.

A `transaction` with `0` value, to a destination which does not exist, would be. This scenario is highly unlikely to matter, since such a `transaction` is useless -- even during success, all it would accomplish would be to spend some `gas`. With this EIP, it would potentially spend some more gas.

### Layer 2

Regarding layer-2 backward compatibility, this EIP is a lot less disruptive than EIPs which modify the `base` cost of an opcode. For state accesses, there are
seldom legitimate scenarios where

1. A contract checks `BALANCE`/`EXTCODEHASH`/`EXTCODECOPY`/`EXTCODESIZE` of another contract `b`, _and_,
2. If such `b` does not exist, continues the execution

#### Solidity remote calls
Example: When a remote call is made in Solidity:
```
recipient.invokeMethod(1)
```

- Solidity does a pre-flight `EXTCODESIZE` on `recipient`.
- If the pre-flight check returns `0`, then `revert(0,0)` is executed, to stop the execution.
- If the pre-flight check returns non-zero, then the execution continues and the `CALL` is made.

With this EIP in place, the 'happy-path' would work as previously, and the 'notorious'-path where `recipient` does not exist would cost an extra `penalty` gas, but the actual execution-flow would be unchanged.

#### ERC223

[ERC223 Token Standard](https://github.com/ethereum/EIPs/issues/223) is, at the time of writing, marked as 'Draft', but is deployed and in use on mainnet today.

The ERC specifies that when a token `transfer(_to,...)` method is invoked, then:

> This function must transfer tokens and invoke the function `tokenFallback (address, uint256, bytes)` in `_to`, if `_to` is a contract.
> ...
> NOTE: The recommended way to check whether the `_to` is a contract or an address is to assemble the code of `_to`. If there is no code in `_to`, then this is an externally owned address, otherwise it's a contract.

The reference implementations from [Dexaran](https://github.com/Dexaran/ERC223-token-standard/tree/development/token/ERC223) and [OpenZeppelin](https://github.com/OpenZeppelin/openzeppelin-contracts/blob/1bc923b6a222e79a90f20305a459b0ee779eb918/contracts/token/ERC721/ERC721.sol#L499) both implement the `isContract` check using an `EXTCODESIZE` invocation.

This scenario _could_ be affected, but in practice should not be. Let's consider the possibilities:

1. The `_to` is a contract: Then `ERC223` specifies that the function `tokenFallback(...)` is invoked.
- The gas expenditure for that call is at least`700` gas.
- In order for the `callee` to be able to perform any action, best practice it to ensure that it has at least `2300` gas along with the call.
- In summary: this path requires there to be least `3000` extra gas available (which is not due to any `penalty`)
2. The `_to` exists, but is no contract. The flow exits here, and is not affected by this EIP
2. The `_to` does not exist: A `penalty` is deducted.

In summary, it would seem that `ERC223` should not be affected, as long as the `penalty` does not go above around `3000` gas.


### Other

The contract [`Dentacoin`](https://etherscan.io/address/0x08d32b0da63e2c3bcf8019c9c5d849d7a9d791e6#code) would be affected.

```
function transfer(address _to, uint256 _value) returns (bool success) {
... // omitted for brevity
if (balances[msg.sender] >= _value && balances[_to] + _value > balances[_to]) { // Check if sender has enough and for overflows
balances[msg.sender] = safeSub(balances[msg.sender], _value); // Subtract DCN from the sender

if (msg.sender.balance >= minBalanceForAccounts && _to.balance >= minBalanceForAccounts) { // Check if sender can pay gas and if recipient could
balances[_to] = safeAdd(balances[_to], _value); // Add the same amount of DCN to the recipient
Transfer(msg.sender, _to, _value); // Notify anyone listening that this transfer took place
return true;
} else {
balances[this] = safeAdd(balances[this], DCNForGas); // Pay DCNForGas to the contract
balances[_to] = safeAdd(balances[_to], safeSub(_value, DCNForGas)); // Recipient balance -DCNForGas
Transfer(msg.sender, _to, safeSub(_value, DCNForGas)); // Notify anyone listening that this transfer took place

if(msg.sender.balance < minBalanceForAccounts) {
if(!msg.sender.send(gasForDCN)) throw; // Send eth to sender
}
if(_to.balance < minBalanceForAccounts) {
if(!_to.send(gasForDCN)) throw; // Send eth to recipient
}
}
} else { throw; }
}
```

The contract checks `_to.balance >= minBalanceForAccounts`, and if the `balance` is too low, some `DCN` is converted to `ether` and sent to the `_to`. This is a mechanism to ease on-boarding, whereby a new user who has received some `DCN` can immediately create a transaction.

Before this EIP:

- When sending `DCN` to a non-existing address, the additional `gas` expenditure would be:
- `9000` for an ether-transfer
- `25000` for a new account-creation
- (`2300` would be refunded to the caller later)
- A total runtime `gas`-cost of `34K` gas would be required to handle this case.

After this EIP:

- In addition to the `34K` an additional `penalty` would be added.
- Possibly two, since the reference implementation does the balance-check twice, but it's unclear whether the compiled code would indeed perform the check twice.
- A total runtime `gas`-cost of `34K+penalty` (or `34K + 2 * penalty`) would be required to handle this case.

It can be argued that the extra penalty of `2-3K` gas can be considered marginal in relation to the other `34K` gas already required to handle this.

## Test Cases

The following cases need to be considered and tested:

- That during creation of a brand new contract, within the constructor, the `penalty` should not be applied for calls concerning the self-address.
- TBD: How the `penalty` is applied in the case of a contract which has performed a `selfdestruct`
- a) previously in the same call-context,
- b) previously in the same transaction,
- c) previously in the same block,
For any variant of `EXTCODEHASH(destructed)`, `CALL(destructed)`, `CALLCODE(destructed)` etc.
- The effects on a `transaction` with `0` value going to a non-existent account.

## Security Considerations

See 'Backwards Compatibility'

## Implementation

Not yet available.

## Alternative variants

### Alt 1: Insta-refunds

Bump all trie accesses with `penalty`. `EXTCODEHASH` becomes `2700` instead of `700`.
- If a trie access hit an existing item, immediately refund penalty (`2K` )

Upside:

- This eliminates the 'shielded relay' attack

Downside:

- This increases the up-front cost of many ops (CALL/EXTCODEHASH/EXTCODESIZE/STATICCALL/EXTCODESIZE etc)
- Which may break many contracts.

### Alt 2: Parent bail

Use `penalty` as described, but if a child context goes OOG on the `penalty`, then the remainder is subtracted from the
parent context (recursively).

Upside:

- This eliminates the 'shielded relay' attack

Downside:

- This breaks the current invariant that a child context is limited by whatever `gas` was allocated for it.
- However, the invariant is not _totally_ thrown out, the new invariant becomes that it is limited to `gas + penalty`.
- This can be seen as 'messy' -- since only _some_ types of OOG (penalties) becomes passed up the call chain, but not others, e.g. OOG due to trying
to allocate too much memory. There is a distinction, however:
- Gas-costs which arise due to not-yet-consumed resources do not get passed to parent. For example: a huge allocation is not actually performed if there is insufficient gas.
- Whereas gas-costs which arise due to already-consumed resources _do_ get passed to parent; in this case the penalty is paid post-facto for a trie iteration.
Comment on lines +259 to +291
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this should either move to discussion-to link or be restructured to fit into the Rationale section (the section for "why did we do X instead of Y"). EIPs are technical documents that define a specification, they aren't meant to be discussion platforms or a log of discussions on a topic.



## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).