Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing Contract Size Limit #1662

Open
maxsam4 opened this issue Dec 18, 2018 · 23 comments

Comments

@maxsam4
Copy link

commented Dec 18, 2018

Abstract
A contract size limit of 24kb was introduced by EIP #170 . To solve the following problem problem: "when a contract is called, even though the call takes a constant amount of gas, the call can trigger O(n) cost in terms of reading the code from disk, preprocessing the code for VM execution, and also adding O(n) data to the Merkle proof for the block's proof-of-validity". I think we can solve this problem while still allowing infinite contract size.

Motivation
Complex dApps require complex smart contracts. I have seen a lot of dApp developers struggle with this and a lot of alternative solutions are being used like using delegate calls. Delegate calls reduce code readability while developing and while verifying source code on tools like etherscan. They also introduce another attack surface and added complexity.

Specification

  1. The ethereum account array in state trie saves another element codeSize apart from existing 4: [nonce,balance,storageRoot,codeHash]
  2. Whenever a new contract is deployed, its size is stored in codeSize of the account object.
  3. If a contract is destructed, the codeSize should also be reset.
  4. opcodes like CALL, DELEGATECALL, CALLCODE etc should charge additional X(3?) gas per extra word if the contract code size is greater than 24kb.

Rationale
The only reason why contract size was limited was to prevent people from exploiting the fixed costs. We can overcome this by making the costs variable. The codeSize element will help in calculating call cost before reading the whole contract and eventually throwing OOG. Merkle proofs will also be generated at fixed cost as we won't have to load the huge contracts from disk first. The codeSize should be enough.

Backwards Compatibility
We don't necessarily need to refactor the existing account arrays as all the existing accounts have less than 24kb of code so no extra cost has to be charged from calls being sent to them. We can assume X = 0 if it's not available.
This will mean that there are no backward compatibility issues.

This is just an early discussion Issue. I will create a properly formatted draft with specifications after getting some more feedback from the community.

References
#170
#659
https://github.com/ethereum/wiki/wiki/Patricia-Tree#tries-in-ethereum

@postables

This comment has been minimized.

Copy link

commented Dec 19, 2018

If your contract is 24KB or larger, your contract architecture needs a seriously big redesign

@maxsam4

This comment has been minimized.

Copy link
Author

commented Dec 19, 2018

If your contract is 24KB or larger, your contract architecture needs a seriously big redesign

Thank you for your feedback. That's a debatable point but I get why you think so.

Contract size increases due to multiple factors that don't really add to the complexity of the code. For example,

  1. Reason strings take a hell lot of space.
  2. Getter functions for ease of use of front-end devs or normal users take a lot of space.
  3. Solidity 0.5.0 produces larger bytecode than solidity 0.4.24 for the same contract with required explicit definitions.
    etc

Obviously, the contract size also increases when you add functionality.

@nervehammer

This comment has been minimized.

Copy link

commented Dec 19, 2018

i agree with @maxsam4, we at NexusMutual have complex logic that are split into multiple smaller contract. Have a look here: https://github.com/somish/NexusMutual/tree/token/contracts

@0xKiwi

This comment has been minimized.

Copy link

commented Dec 28, 2018

I would support at least a minor increase in the smart contract max size, sure you can usually split logic into several contracts but with reason strings and developers learning more about options for Smart Contract design, I believe an increase (even if minor) is due.

@mudgen

This comment has been minimized.

Copy link
Contributor

commented Dec 28, 2018

As a note, the contract size limit can be removed by implementing ERC1538.

@maxsam4

This comment has been minimized.

Copy link
Author

commented Jan 3, 2019

As a note, the contract size limit can be removed by implementing ERC1538.

Yep, that is one way to do it but it's not optimal because of limitations like:

  1. It adds extra unnecessary code.
  2. Adds another attack vector.
  3. Makes intra contract calls more expensive.
  4. Makes it harder for non tech-savvy users to interact with it. Like it or not but most people can barely use the read/write feature in etherscan, they can't load a custom ABI and make web3 calls.
  5. Makes it harder for beginners to verify the code and hence reduces trust.

I am still collecting feedback on this. I think there are 3 basic ways that this proposal can go:

  1. Ethereum is not meant for complex ecosystems, smart contracts must be less than 24kb always. Alternative solutions that currently work like ERC1538 can still be used.
  2. A variant of this proposal is accepted and when combined with the scalability of serenity, ethereum will be able to support really complex dApps.
  3. Changes are made into EVM to support pagination or maybe keeping the EVM as it is and adding pagination to high-level languages like solidity. i.e. whenever solidity will see the contract size go above ~20kb, it will split it into two contracts and connect them through proxies. the end user will never know the difference.
@amiromayer

This comment has been minimized.

Copy link

commented Jan 5, 2019

👍🏻 Agree with @maxsam4. Would also support the increase in contract size limit.

As a note, the contract size limit can be removed by implementing ERC1538.

Yep, that is one way to do it but it's not optimal because of limitations like:

  1. It adds extra unnecessary code.
  2. Adds another attack vector.
  3. Makes intra contract calls more expensive.
  4. Makes it harder for non tech-savvy users to interact with it. Like it or not but most people can barely use the read/write feature in etherscan, they can't load a custom ABI and make web3 calls.
  5. Makes it harder for beginners to verify the code and hence reduces trust.

I am still collecting feedback on this. I think there are 3 basic ways that this proposal can go:

  1. Ethereum is not meant for complex ecosystems, smart contracts must be less than 24kb always. Alternative solutions that currently work like ERC1538 can still be used.
  2. A variant of this proposal is accepted and when combined with the scalability of serenity, ethereum will be able to support really complex dApps.
  3. Changes are made into EVM to support pagination or maybe keeping the EVM as it is and adding pagination to high-level languages like solidity. i.e. whenever solidity will see the contract size go above ~20kb, it will split it into two contracts and connect them through proxies. the end user will never know the difference.
@ayushgupta0610

This comment has been minimized.

Copy link

commented Feb 11, 2019

I agree with @maxsam4 , even after making use of library contracts, proxy patterns, breaking huge contracts to multiple smaller contracts, there still comes a point which requires the smart contracts' byte code to be extended beyond a certain limit. It's just not developer-friendly to restrict the developers to develop smart contracts to limited complexity.

@jpmonettas

This comment has been minimized.

Copy link

commented Feb 16, 2019

I also agree with the importance of fixing the contract size limit issue.
Auditability and trust are very important for this systems and a big part of that comes from the languages letting you express your logic as clean and straight forward as possible. It is much harder
to do it if you also need to consider this limitation.
Don't know if the limitation should be fixed at Ethereum level or compilers one.

@chris-shyft

This comment has been minimized.

Copy link

commented Mar 29, 2019

I'm hitting the same limit. Libraries aren't enough.. I don't care about high(er) deploy costs at whatever level, for some larger contracts that need to deal with asset transfers, you need the transaction costs to be low and deploy cost means very little at scale.

Meanwhile, cross-links (contracts with state to member contracts via interfaces) are useful as well. But they still run into the same issue as mentioned above.

Increasing it to one of the earlier suggestions of 2^15 would be almost a substantial amount more headroom than currently.

@maxsam4

This comment has been minimized.

Copy link
Author

commented Mar 29, 2019

Thanks everyone for the feedback! I think I have got enough feedback to justify moving forward with this Idea. I will try to draft a proper proposal/document this weekend and will collect feedback on it via different mediums.

@maxsam4

This comment has been minimized.

Copy link
Author

commented Mar 31, 2019

I have summed up most of my thoughts regarding this in a small write up available at https://ethereum-magicians.org/t/removing-or-increasing-the-contract-size-limit/3045

Please have a look. All feedback is welcome.

@0zAND1z

This comment has been minimized.

Copy link

commented Apr 1, 2019

@gcolvin & @chriseth , could you please take a look into this & guide this thread ahead?

After multiple audits & optimizations, a lot of contract developers seems to be sharing a common observation. We would love to have your inputs on this.

Cheers.

@zmitton

This comment has been minimized.

Copy link

commented Jun 3, 2019

@maxsam4 over on Ethereum Classic we don't yet have a size limit and are debating adding EIP-170. This is a viable alternative, I have a question. Why do we need to cache the size data into the account tuple. Can't we do it without changing the account format? Cant we just have the EVM load a contract's bytecode and then determine the gas cost and charge it at that point?

@maxsam4

This comment has been minimized.

Copy link
Author

commented Jun 6, 2019

Cant we just have the EVM load a contract's bytecode and then determine the gas cost and charge it at that point?

That introduces another attack vector where the attacker forces the EVM to load a huge contract even when there is not enough gas remaining to load it. If the size is not cached, EVM will be forced to fetch the large contract only to realize that there isn't enough gas left to load that contract in the first place.

@zmitton

This comment has been minimized.

Copy link

commented Jun 7, 2019

@maxsam4 I have a couple of arguments against that, if at least one of them is right, my amendment to your proposal should be fine.

  1. Isn't the bytecode stored in some kind of Array anyway? doesn't that array have some size data at the first element (as per how arrays are generally implemented). If not, can't the levelDB implementations be slightly amended to store that?
  2. There is still a larger contract size limit based on the block-gas-limit. So even if the client had to fetch the whole contract to determine size, it still would only have to fetch one contract, (the attack that was used required hundreds or even thousands of contracts to be loaded). The "slightly quadratic" resources I think assumes the loading of many many contracts like this. If the client must only load 1, and that one already has a moderately limited size this may no longer be an issue. However, I might be wrong about that because, as I understand it, an out-of-gas transaction only applies to the current contract scope, and therefore a single transaction might be able to CALL multiple contracts, even after the first one fails with an out-of-gas error. But at least, when the CALL fails, the client can release the memory used on that contract, BEFORE calling the next one, so I think again, that our problem is solved.

Thanks

@maxsam4

This comment has been minimized.

Copy link
Author

commented Jun 7, 2019

@zmitton Thanks for your feedback! I am afraid that neither of those solutions will work. Here are some of my thoughts

  1. It doesn't matter how the bytecode is actually stored. AFAIK, the EVM needs to generate proper proofs and hence the EVM needs access to either the size of the smart contract code or to the code itself. We can not standardize how the data must be stored under the hood (levelDB). We can just standardize what data we want. The data we want is contract size. If the implementations can modify their storages slightly to provide us with the required data without recalculating anything, that's great but we can't force it. The requirement of saving contract size available is not blocking this PR. That change CAN be done. Please refer to https://ethereum-magicians.org/t/removing-or-increasing-the-contract-size-limit/3045 to see what's blocking this. That stuff may not be applicable to ETC though.

  2. One contract can load multiple large sized contracts by creating external calls and passing low gas to them (using .call() in solidity). Memory is not the main concern here. Random disk read is. Disks are super slow and finding and reading data from them is expensive.

To be honest, on the ETC side, if you are not planning on increasing block gas limit by much then it might be best to increase the price of all calls, delegate calls etc by a fixed amount and remove the contract size limit. With 8m block gas limit, there is a natural limit of ~40 KB. On the ETH front, there are talks to increase the gas limit by as much as 300% so that is not an as viable solution.

@zmitton

This comment has been minimized.

Copy link

commented Jun 7, 2019

If the implementations can modify their storages slightly to provide us with the required data without recalculating anything, that's great but we can't force it.

But we don't need to force it. The consensus rule is that payment is dependent on bytecode length, and how that is implemented is up to the client. Caching the length data somewhere is simply an optimization, that can and (I would argue) should be done by any production client software.

@zmitton

This comment has been minimized.

Copy link

commented Jun 7, 2019

The point is that if there are reasonable ways to do it, then there isn't a problem implementing the rule.

@zmitton

This comment has been minimized.

Copy link

commented Jun 7, 2019

By the way, skimming through the link you posted, I don't think any of that stuff applies. It's mostly about Ethereum-specific design plans

@maxsam4

This comment has been minimized.

Copy link
Author

commented Jun 11, 2019

But we don't need to force it. The consensus rule is that payment is dependent on bytecode length, and how that is implemented is up to the client. Caching the length data somewhere is simply an optimization, that can and (I would argue) should be done by any production client software.

That's what the original proposal wants. However, you need to see the practical aspect as well. If the clients don't implement the optimization then it is ddos vector. Therefore, this proposal mandates the clients to implement this optimization. As with the gas mechanics, you set the price of any action to anything but the price is actually determined by practical costs. For some clients, the cost may differ but we try to pick a generic cost. Feel free to implement this however you wish in ETC but as ETH has so many clients, I think it is necessary to make this "optimization" mandatory so that all the current and future clients end up implementing it.

By the way, skimming through the link you posted, I don't think any of that stuff applies. It's mostly about Ethereum-specific design plans

Yeah, I felt that too.

@zmitton

This comment has been minimized.

Copy link

commented Jun 16, 2019

ok thanks @maxsam4 I think I have a good understanding now. So I now realize that the difference mostly a backwards compatibility issue I was likely trying to avoid. i.e. having the extra field affects the tree hashing and therefore affects the root and therefore the consensus and is a hard fork. But of course changing the gas costs is a hard fork anyway, so it's all pretty similar. My way would affect a little less related software, but would make client implementation easier to screw up. Thanks for the ideas!

@BelfordZ

This comment has been minimized.

Copy link

commented Jul 17, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.