Skinny CREATE2 #1014

Merged
merged 9 commits into from Jun 18, 2018

Collaborator

vbuterin commented Apr 20, 2018

Adds a new opcode at 0xf5, which takes 4 stack arguments: endowment, memory_start, memory_length, salt. Behaves identically to CREATE, except using sha3(msg.sender ++ salt ++ init_code)[12:] instead of the usual sender-and-nonce-hash as the address at which the contract is created.

Update 2018.09.02: Version 2: use sha3(msg.sender ++ salt ++ sha3(init_code))[12:]
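
For illustration only, here is a minimal Python sketch of the Version 2 derivation above. Python's standard library has no original-Keccak hash (`hashlib.sha3_256` uses different padding), so the sketch carries its own small Keccak-256. The function names and the optional `prefix` argument are assumptions made for this example; `prefix=b"\xff"` corresponds to the 0xff-prefixed variant proposed later in this thread.

```python
def _rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & 0xFFFFFFFFFFFFFFFF


def _keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 lane array."""
    lfsr = 1
    for _ in range(24):
        # theta: mix column parities into every lane
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4] for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to its new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: round constant generated by an 8-bit LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Original Keccak-256 (the EVM's sha3), not NIST SHA3-256."""
    rate = 136  # bytes absorbed per block (1088-bit rate)
    padded = bytearray(data)
    pad_len = rate - (len(data) % rate)
    padded += b"\x01" + b"\x00" * (pad_len - 1)  # Keccak domain byte is 0x01
    padded[-1] |= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), rate):
        for i in range(rate // 8):
            word = int.from_bytes(padded[offset + 8 * i: offset + 8 * i + 8], "little")
            lanes[i % 5][i // 5] ^= word
        lanes = _keccak_f1600(lanes)
    return b"".join(lanes[x][0].to_bytes(8, "little") for x in range(4))


def create2_address(sender: bytes, salt: bytes, init_code: bytes, prefix: bytes = b"") -> bytes:
    """sha3(prefix ++ sender ++ salt ++ sha3(init_code))[12:] (hypothetical helper)."""
    if len(sender) != 20 or len(salt) != 32:
        raise ValueError("sender must be 20 bytes and salt 32 bytes")
    return keccak256(prefix + sender + salt + keccak256(init_code))[12:]


if __name__ == "__main__":
    address = create2_address(bytes(20), bytes(32), b"\x00", prefix=b"\xff")
    print("0x" + address.hex())
```

Because nothing in the preimage depends on the sender's nonce, the same (sender, salt, init_code) triple always yields the same address.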

vbuterin added some commits Apr 20, 2018

emansipater commented Apr 20, 2018

I am extremely in support of this. We NEED deterministic, immutable addresses for so many applications (state channels plus offline multisig plus a ton more).

vbuterin changed the title from "Create Skinny_CREATE2.md" to "Skinny CREATE2" Apr 20, 2018

snario commented Apr 20, 2018

This EIP allows for a significant performance increase in state channels by removing the need for an additional contract to allow for counterfactual addressing. I'm highly in favour of accepting it as soon as possible. :)

SilentCicero commented Apr 20, 2018

I am in complete support of this, there are countless governance use-cases which need this to be efficient and successful. Hurray!


### Specification

Adds a new opcode at 0xf5, which takes 4 stack arguments: endowment, memory_start, memory_length, salt. Behaves identically to CREATE, except using `sha3(msg.sender ++ salt ++ init_code)[12:]` instead of the usual sender-and-nonce-hash as the address where the contract is initialized.

LefterisJP Apr 20, 2018

Contributor

@vbuterin The salt can have any arbitrary length?

holiman Apr 20, 2018

Contributor

Existing scheme looks like this for low nonces (nonce 1 below):

sha3(0xd6 ++ 0x94 ++ sender ++ 0x01)[12:]

So if you mine an address starting with d694, it seems possible to create destination collisions. Using larger nonces gives more room for collisions.

This means that you could say "look, this contract can only be create2:ed with the initcode x." But in fact, you can create arbitrary contracts there using old-style create.

I may be wrong, thinking while writing here... A trivial way to get around this would be to prefix the entire thing with something that is invalid rlp.

And @LefterisJP @vbuterin I assume the salt is fixed-size, and the size of that would naturally affect the ability to do the attack described here.

vbuterin Apr 21, 2018

Author Collaborator

> @vbuterin The salt can have any arbitrary length?

It's a stack argument, hence 32 bytes.

> A trivial way to get around this would be to prefix the entire thing with something that is invalid rlp

If we want, we can prefix with 0xff; the only valid RLP that starts with 0xff would be petabytes long.
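
To make the prefix argument concrete, the following Python sketch RLP-encodes the `[sender, nonce]` pair hashed by the existing CREATE scheme and shows where the `0xd6 0x94` bytes come from. It also computes how long a canonically encoded RLP list must be for its first byte to be 0xff. The helper names are hypothetical, and only the short-string and short-list RLP cases needed here are implemented.

```python
from typing import List


def rlp_encode_bytes(item: bytes) -> bytes:
    """RLP-encode a byte string shorter than 56 bytes."""
    if len(item) == 1 and item[0] < 0x80:
        return item  # a single low byte encodes as itself
    if len(item) < 56:
        return bytes([0x80 + len(item)]) + item
    raise ValueError("long strings are outside the scope of this sketch")


def rlp_encode_short_list(items: List[bytes]) -> bytes:
    """RLP-encode a list whose total payload is shorter than 56 bytes."""
    payload = b"".join(rlp_encode_bytes(item) for item in items)
    if len(payload) >= 56:
        raise ValueError("long lists are outside the scope of this sketch")
    return bytes([0xC0 + len(payload)]) + payload


def create_preimage(sender: bytes, nonce: int) -> bytes:
    """Preimage hashed by old-style CREATE: rlp([sender, nonce])."""
    nonce_bytes = nonce.to_bytes((nonce.bit_length() + 7) // 8, "big")
    return rlp_encode_short_list([sender, nonce_bytes])


def min_payload_len(first_byte: int) -> int:
    """Smallest payload of a canonical long RLP list starting with first_byte (0xf8..0xff)."""
    length_of_length = first_byte - 0xF7
    return 56 if length_of_length == 1 else 256 ** (length_of_length - 1)


if __name__ == "__main__":
    sender = bytes.fromhex("aa" * 20)
    print(create_preimage(sender, 1).hex())  # begins with d694, ends with 01
    print(min_payload_len(0xFF))             # bytes needed before 0xff is a valid first byte
```

A 20-byte sender and a one-byte nonce give a 22-byte payload, hence the list header 0xc0 + 22 = 0xd6 followed by the string header 0x80 + 20 = 0x94. A first byte of 0xff means the list length itself takes 8 bytes, so a canonical encoding needs at least 2^56 bytes of payload.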

```
@@ -0,0 +1,16 @@
```

Arachnid Apr 20, 2018

Collaborator

Please update this to the new format: frontmatter should start and end with --- on its own line, and keys are lower-case.

```
@@ -0,0 +1,16 @@
EIP: <to be assigned>
```

Arachnid Apr 20, 2018

Collaborator

Please number this 1014 and rename the file to eip-1014.md.

holiman Apr 21, 2018

Contributor

0xff would work, but with 0x01 a collision would not even be theoretically possible (on the preimage side).

EIP: <to be assigned>
Title: Skinny CREATE2
Author: Vitalik Buterin
Category: Core

Arachnid Apr 20, 2018

Collaborator

This should specify Type: Standards Track, as well.

```
EIP: <to be assigned>
Title: Skinny CREATE2
Author: Vitalik Buterin
```

Arachnid Apr 20, 2018

Collaborator

Please include a username or email address (parentheses for GitHub username).

SilentCicero Apr 20, 2018

lol. @vbuterin please follow protocol around here... just because you created Ethereum doesn't mean you get to break EIP specification.

vbuterin added some commits Apr 21, 2018

---
eip: 1014
title: Skinny CREATE2
author: Vitalik Buterin (vbuterin)

nicksavers Apr 21, 2018

Collaborator

GitHub username is recognized when adding the @ in front of it. Like '@vbuterin'

CoinHodl commented Apr 22, 2018

Make state channels great again👍🏽

Collaborator

Arachnid commented Apr 22, 2018

Needs a discussions-to URL, but otherwise good to go.

SilentCicero commented Apr 22, 2018

Contributor

chriseth commented Jun 15, 2018

What happens on address collisions? How exactly is an address collision defined (existing code, existing balance, pre-existing code, pre-existing balance, etc.)?

Also, in which way is this different from the earlier proposal about create2?

Contributor

AlexeyAkhunov commented Jun 15, 2018

@karalabe has a good point here: ethereum/pm#44

@emansipater @snario Will state channel mechanisms be able to detect whether the code of a counterfactual contract will behave differently depending on the environment? Because currently contracts can do anything in their constructor, which is the code that runs during instantiation.

(Copying his comment below).

> Regarding CREATE2, do we have any restriction on the execution context, or does it remain the same? The reason I'm asking is because the EIP states:
>
> > Allows interactions to be made with addresses that do not exist yet on-chain but can be relied on to only possibly eventually contain code that has been created by a particular piece of init code.
>
> Even though this statement is true, the final deployed contract can behave arbitrarily differently depending on who deploys it and when (since the execution environment changes). This is in itself fine, but I think it's an important limitation of the opcode to explicitly highlight, otherwise we'll see many many abuses around this.
>
> Alternatively we could enforce CREATE2 to not have access to environmentals, but that might be an ugly complication.
>
> Just food for thought.

Contributor

AlexeyAkhunov commented Jun 15, 2018

@chriseth As far as I understand, now "msg.sender" is kind of in full control of the contract address. And I assume that CREATE2 will only create a contract if it does not exist (so as not to break the invariant of contract code not changing after initial deployment). In the state channel setting, the possibility of doing CREATE2 is leverage that participants use against each other. So they will not compromise their leverage by producing an address collision.

Arachnid added some commits Jun 18, 2018

@Arachnid Arachnid merged commit b582c65 into master Jun 18, 2018

@Arachnid Arachnid deleted the vbuterin-patch-2 branch Jun 18, 2018

SergioDemianLerner commented Aug 19, 2018

From the point of view of a hardware wallet, a factory contract is not better than a contract with unknown code, because the problem is only moved one level deeper and is now worse: how can it validate that the code of the factory contract is correct? It would need to read the codehash of the account state of the factory in a given blockchain, but it can't decide whether the given blockchain is the correct one.

I would propose that CREATE2 adds a gas cost per byte hashed in init_code (e.g. 6 gas for every 32 bytes, plus 30) [cost edited]

Contributor

holiman commented Aug 20, 2018

> From the point of view of a hardware wallet, a factory contract is not better than a contract with unknown code because the problem is only moved one level deeper

By this, do you mean that the CREATE2 opcode should not use address/msg.sender at all? If it did not, then the contract could be created by anyone -- e.g. using a transaction with an empty to (invoking CREATE2 within the initcode of a 'failed' CREATE). Or do you mean something else?

Because as the EIP is written now, there is no getting around using a factory for the counterfactual usecase, afaict.

> I would propose that CREATE2 adds a gas cost per byte hashed in init_code (e.g. 12 gas for every 32 bytes, plus 601)

Makes sense, although the gas cost per 32 bytes for sha3 is only 6, not 12, so why not just use the same?
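
As a rough sketch of the pricing being discussed, the surcharge could be computed as below. The function name and the round-up of a partial 32-byte word (mirroring how the SHA3 opcode is priced) are assumptions for illustration; the defaults are the "6 gas for every 32 bytes, plus 30" figures from the edited proposal, and the originally quoted figures can be passed explicitly.

```python
def init_code_hash_surcharge(init_code_len: int, gas_per_word: int = 6, flat_gas: int = 30) -> int:
    """Extra gas for hashing init_code: flat fee plus a per-32-byte-word fee.

    Assumption: a trailing partial word is charged as a full word.
    """
    if init_code_len < 0:
        raise ValueError("init_code_len must be non-negative")
    words = (init_code_len + 31) // 32  # ceil(init_code_len / 32)
    return flat_gas + gas_per_word * words


if __name__ == "__main__":
    for length in (0, 1, 32, 33, 24576):
        print(length, init_code_hash_surcharge(length))
    # The figures quoted above (12 gas per word, plus 601) for comparison:
    print(init_code_hash_surcharge(24576, gas_per_word=12, flat_gas=601))
```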

ameensol commented Aug 20, 2018

Chiming in because @lrettig let me know this discussion was taking place. I'm not sure how L4 plan on doing "counterfactual" in practice, but we've found that deploying a new contract for each individual dispute would be pretty expensive, especially if the contracts are complex (e.g. @funfair-tech casino games). Instead, we think it makes more sense to reuse contracts already deployed onchain for disputes.

Maybe I'm missing something, but this seems like a bit of a distraction. @emansipater @snario @SilentCicero could you please expand on why this is valuable—are you assuming contract deployment on every dispute? What other use cases does this enable/optimize?

cc @ConnextProject @nginnever

ArjunBhuptani commented Aug 20, 2018

Interested to learn more about tradeoffs here too!

FYI In our construction, we're working towards a dispute registry with predeployed dispute contracts. When opening a "thread" (virtual channel), participants can sign a hash associated with the specific dispute they want to reference.

At the very least this means that participants around a specific use-case don't need to deploy duplicate contracts for byzantine cases and ideally this also means that we can have community-sourced standard disputes for various use-cases. This also removes the need for deterministic addressing I think?

IIIIllllIIIIllllIIIIllllIIIIllllIIIIll commented Aug 22, 2018

Few points to talk about here:

  1. There is no advantage (AFAICT) of the current CREATE address computation scheme, in which the address depends on account nonce, over schemes where the address does not depend on account nonce.

  2. Using commitments to deploy contracts ("deployment commitment") does not imply no opportunity for code reuse - the deployed contract can still share code with other instances and marginally be quite small.

  3. In the case that one wants to enter ad-hoc/custom/private contracts in a channel, the only efficient way to do so is deployment commitment.

  4. Our specific metachannel constructions rely on deployment commitments; I haven't seen any constructions that achieve time-based lockup as well as constant locktime without deployment commitment, although this might well be just from people not trying.

  5. To be clear, we're not blocked by this, since we can emulate deterministic addresses at an application level.

  6. A system of multiple contracts which share code but have separate storage can be restructured into a single contract with an "indexed" storage scheme, and under the current gas schedule this is often more efficient, however I think this is a non-permanent accidental feature of how storage is charged (e.g. the different costs charged for SSTORE vs contract data, and the 32k upfront cost of creating a contract). Certainly research discussions like https://ethresear.ch/t/cross-shard-contract-yanking/1450 and the various rent discussions suggest to me that we eventually want to not have a penalty for per-user contracts over giant contracts shared by many people.

  7. There are applications that will benefit from deterministic addressing, but haven't realised it yet. An example is airgapped cold storage and hardware wallets; say you're on a cold storage machine and want to deploy a new policy. You have literally no idea what state the blockchain is in, because you're airgapped. You need deterministic contracts to know what you are authorising sending money to, etc.

  8. Jeff Coleman thinks existing teams like Gnosis and Plasma implementers will find this useful. Personally I think we should get their feedback in order to work out the details and make sure we didn't miss anything.

kaibakker commented Aug 25, 2018

Am I correct in comparing this to p2sh (Pay to script hash) like functionality? Where value can be allocated to a specific script hash instead of the full script?

SergioDemianLerner commented Aug 25, 2018

Yes, I think it's a good comparison.

Collaborator

Arachnid commented Aug 28, 2018

> From the point of view of a hardware wallet, a factory contract is not better than a contract with unknown code, because the problem is only moved one level deeper and is now worse: how can it validate that the code of the factory contract is correct? It would need to read the codehash of the account state of the factory in a given blockchain, but it can't decide whether the given blockchain is the correct one.

Either way the code of the deployed contract has to be verified, which is out of scope for the hardware wallet. I don't see how using a factory contract makes this any worse.

SergioDemianLerner commented Aug 28, 2018

@Arachnid Let's say the hardware wallet is built to work with certain known wallets, such as Gnosis (the same works if you register the code hash into the hardware wallet later, and the HW shows this hash in the display for you to check).
When a hardware wallet needs to deploy a new Gnosis-wallet contract, it can use CREATE2 and pass the sha3 of the EVM code. It can have this hash pre-stored. The user can also validate with other off-chain tools that the created wallet is in fact a Gnosis wallet (by recomputing the created contract address).
Now you say the same would work if the hardware wallet knows the factory contract address, let's say it's hard-coded. So the hardware wallet can create a message that requests the factory contract to create the final Gnosis wallet. But how does the hardware wallet know the address? It must receive it from the outside world. So all verification rests on humans to get information on-chain. The hardware wallet cannot help. This is very bad for the creation of institutional multi-sig wallets, where you want the whole process to be auditable and repeatable, and not require a step where you must download and sync a full node, or where you must peek into Etherscan.

SergioDemianLerner commented Sep 1, 2018

As @vbuterin still hasn't reviewed my proposed change to this proposal, I will try to emphasize more the benefits of using sha3(init_code) instead of just init_code.

In the future Ethereum might want to do parallel transaction processing. Check for example https://github.com/rsksmart/RSKIPs/blob/master/IPs/RSKIP04.md
Even if you don't like this RSKIP04 proposal, it's clear that if you implement parallel transaction processing, many contracts will need to create child contracts to distribute the load without becoming a bottleneck in transaction parallelization. In those cases contracts would need to dynamically compute a child-contract address without accessing a local mapping (which would make them a parallelization bottleneck). CREATE2 seems ideal to solve this problem IF we make the amount of data to dynamically hash fixed-length.

For example, our RSK's RSKIP59 proposal won't be needed anymore:
https://github.com/rsksmart/RSKIPs/blob/master/IPs/RSKIP59.md

Think of an ERC20 contract that, depending on the source and destination address of a transfer(), calls two per-user child contracts (where the EIP-1014 nonce, i.e. the salt, is the src/dst address) that subtract and increment child contract storage cells where per-user balances are stored.

Even without taking into account parallelization, I'm sure there are many more examples where dynamically computing the destination address is much more efficient than getting it from a mapping.

(It could be even better to hash sha3(sha3(msg.sender ++ sha3(init_code)) ++ salt), so that it requires even less gas to dynamically compute the child contract address. But this is obviously too complicated for so little benefit. And there is also the problem of size collision with other types of addresses.)
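
The fixed-length argument can be illustrated with a small Python sketch that only builds the preimages (no hashing), comparing the raw-init_code form with the sha3(init_code) form. The helper names, the leading 0xff byte (taken from the prefix discussed earlier in this thread) and the left-padding of a user address into a 32-byte salt are all assumptions for this example.

```python
def preimage_raw_init_code(sender: bytes, salt: bytes, init_code: bytes) -> bytes:
    """0xff ++ sender ++ salt ++ init_code: length grows with the init code."""
    return b"\xff" + sender + salt + init_code


def preimage_hashed_init_code(sender: bytes, salt: bytes, init_code_hash: bytes) -> bytes:
    """0xff ++ sender ++ salt ++ sha3(init_code): always 1 + 20 + 32 + 32 bytes."""
    if len(init_code_hash) != 32:
        raise ValueError("init_code_hash must be 32 bytes")
    return b"\xff" + sender + salt + init_code_hash


def salt_for_user(user: bytes) -> bytes:
    """Hypothetical per-user salt: the 20-byte address left-padded to 32 bytes."""
    return user.rjust(32, b"\x00")


if __name__ == "__main__":
    parent = bytes.fromhex("11" * 20)   # placeholder parent contract address
    user = bytes.fromhex("22" * 20)     # placeholder token holder
    stored_hash = bytes(32)             # placeholder for a precomputed sha3(init_code)
    big_init_code = bytes(5000)         # placeholder init code

    print(len(preimage_raw_init_code(parent, salt_for_user(user), big_init_code)))
    print(len(preimage_hashed_init_code(parent, salt_for_user(user), stored_hash)))
```

With the hashed form, a parent contract only needs to keep one 32-byte constant and hash 85 bytes per lookup, regardless of how large the child's init code is.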

Contributor

holiman commented Sep 1, 2018

@SergioDemianLerner oh we've already decided to use the sha3 of the initcode, the EIP hasn't been updated yet. (decided on a coredev-call a few weeks ago, and I posted a comment here to that effect)

Contributor

holiman commented Sep 1, 2018

I'll submit a PR next week to update this accordingly

Collaborator Author

vbuterin commented Sep 2, 2018

I'm ok with sha3(msg.sender ++ sha3(init_code) ++ salt); can add to the EIP.

Contributor

holiman commented Sep 2, 2018

@vbuterin please see #1014 (comment) . Thanks for updating, but you forgot the 0xFF

MoonMissionControl commented Sep 14, 2018

@vbuterin @holiman

Since msg.sender, init_code, salt are all stack items, what happens if I try to deploy a contract twice? What if I selfdestruct a contract and then recreate it? Will it be possible to use the same address, since it can be regenerated?

EDIT: What happens to the constructor if I recreate an already deployed contract? Does it re-run twice? (If it actually gets deployed I assume the constructor indeed runs twice)

SergioDemianLerner commented Sep 23, 2018

@MoonMissionControl I suppose that without any other change in the EVM, a SELFDESTRUCT will take precedence and the contract will be destructed at the end of the processing of the transaction.
E.g. after CREATE2 SELFDESTRUCT CREATE2 the contract will be destructed.
It's a bit weird, but it's OK as long as the semantics are clear.
A better semantic would be that you can't CREATE2 a contract that has been destructed in the same transaction.

Regarding multiple creations, I think that is a real problem, similar to the problem where there is a previous balance on an account which then turns into a contract.
But the case of double-creation is worse, because the constructor may assume some fields are zero when they are not, and then the newly created contract would be in an invalid state.
I would suggest that double-creation be prevented: it should make CREATE2 fail and return 0.

If this is not enforced by the EVM, then a way to prevent this at the application level would be to set a storage cell "initialized" to true just after the constructor finishes and check for this in the first line of the constructor { if (initialized) revert(); }.

Contributor

holiman commented Sep 23, 2018

Collisions cause deployment to fail. A collision occurs if the nonce or code is nonzero.

Since nonce is set to 1 at creation (nowadays), even empty create can't be overwritten later.

Since selfdestruct takes effect post-tx, there can be no double-create during one tx.
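
The three rules above can be modelled with a toy in-memory state, sketched here in Python. This is not client code: `Account`, `ToyState` and their methods are hypothetical names, and the model only captures the collision check, the nonce being set to 1 at creation, and selfdestruct being applied after the transaction ends.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Account:
    nonce: int = 0
    code: bytes = b""
    balance: int = 0


@dataclass
class ToyState:
    accounts: Dict[bytes, Account] = field(default_factory=dict)
    pending_destructs: Set[bytes] = field(default_factory=set)

    def collides(self, address: bytes) -> bool:
        """A collision occurs if the target has a nonzero nonce or non-empty code."""
        account = self.accounts.get(address)
        return account is not None and (account.nonce != 0 or account.code != b"")

    def create(self, address: bytes, runtime_code: bytes) -> bool:
        """Return False (deployment fails) on collision, otherwise deploy."""
        if self.collides(address):
            return False
        account = self.accounts.setdefault(address, Account())
        account.nonce = 1  # set at creation, so even empty code blocks a later create
        account.code = runtime_code
        return True

    def selfdestruct(self, address: bytes) -> None:
        """Only mark the account; removal happens after the transaction."""
        self.pending_destructs.add(address)

    def end_transaction(self) -> None:
        for address in self.pending_destructs:
            self.accounts.pop(address, None)
        self.pending_destructs.clear()


if __name__ == "__main__":
    state = ToyState()
    target = b"\x11" * 20
    state.accounts[target] = Account(balance=5)  # pre-funded, but no nonce or code
    print(state.create(target, b""))       # a balance alone is not a collision
    print(state.create(target, b"\x00"))   # nonce is now 1, so this collides
    state.selfdestruct(target)
    print(state.create(target, b"\x00"))   # still collides within the same transaction
```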

Contributor

holiman commented Sep 23, 2018

> A better semantic would be that you can't CREATE2 a contract that has been destructed in the same transaction.

So basically, that is already the case

SergioDemianLerner commented Sep 23, 2018

It would be very informative if the EIP stated it.

Contributor

holiman commented Sep 23, 2018

Yes, it should have a reference to eip 684 (iirc) which defines the collision behavior.

holgerd77 commented Sep 24, 2018

Could we have 4-5 (or at least one) example cases for the hash creation in the EIP?


#### Option 2

Use `sha3(0xff ++ msg.sender ++ salt ++ init_code)[12:]`

holgerd77 Sep 24, 2018

Reading from the comments it's:

sha3(0xff ++ msg.sender ++ salt ++ sha3(init_code))[12:]

instead of

sha3(0xff ++ msg.sender ++ salt ++ init_code)[12:]

so with the hash value sha3(init_code) and not the code itself. Am I correct on this?

holiman Sep 24, 2018

Contributor

Please see #1375 . But yes, the init_code, not the code itself

tersec commented Oct 5, 2018

This is looking good for Nimbus use cases, with reasonable design tradeoffs.

Contributor

bmann commented Dec 4, 2018

This is still listed as Draft, and should move through Last Call and Accepted if it's supposed to be in Constantinople.

jochem-brouwer commented Jan 3, 2019

Is there any reason why endowment is not hashed when the contract address is computed?

@vbuterin @holiman
