EIP 105 (Serenity): Binary sharding plus contract calling semantics #53

Open
vbuterin opened this Issue Jan 13, 2016 · 26 comments

Projects

None yet
@vbuterin
Collaborator

This EIP is a solidification of a proposal to implement a simple binary sharding scheme in Ethereum, in the spirit of what was shown at Devcon 1 here: https://docs.google.com/presentation/d/1CjD0W4l4-CwHKUvfF5Vlps76fKLEC6pIwu1a_kC_YRQ

Design goals include:

  • Minimal initial implementation complexity, while at the same time serving as an effective "on-ramp" easing the transition toward a fully scalable Ethereum 2.0
  • Backward-compatibility with the existing Ethereum 1.0 contracts/state/applications (though the economics of ethereum 2.0 may eventually render some of the contract interactions that are currently happening prohibitively expensive even if they remain theoretically possible)
  • Flexibility, including the ability to expand and decrease the de-facto number of "shards" as needed
  • The ability to offer something equivalent to sharding at a "sub-contract" level, or some similar mechanism in order to allow individual contract developers to easily create applications that are themselves sharded and scalable.
  • The ability to offer a flexible tradeoff to developers between transaction cost and being able to make synchronous calls across a large state

We can describe the scheme as follows:

  1. There is now a new type of object, called a transaction group. A transaction group specifies a list of transactions, a gas limit, and an activity range of size 2**k between 2**k * m and 2**k * (m+1), for some values k, m where 144 <= k <= 160,m >= 0 and 2**k * (m+1) <= 2**160. The intuition is that any valid activity range constitutes a "shard", with each block of 2**144 being a "bottom level shard". Every shard that is not bottom level can also be decomposed into two "child shards", one for the left half of its range and one for the right half.
  2. Transactions now also specify activity ranges. Transaction groups cannot include transactions whose activity range is outside their own.
  3. Instead of containing a tree of transactions, a block now contains a list of transaction groups, which MUST have disjoint activity ranges.
  4. When executing a transaction, any CALL or other operation that attempts to access an address which is outside of the transaction's containing group's activity range immediately triggers an out-of-range exception. New RANGEMIN and RANGEMAX opcodes are added to make it easier for developers to deal with this.
  5. When creating a new account, a transaction will now automatically set the first 160-k bits of the address so that the account that it creates will fit into the containing transaction group's range. CREATE operations work similarly, except that they always set the first 16 bits of the target address to fit into the same bottom-level shard.
  6. New specialized SSTORE and SLOAD opcodes are introduced along the existing ones, such that these opcodes take an additional "shard ID" as an argument, where the shard ID should be in the range 0 < k <= 65535. A contract at address A can use these opcodes to read and write the storage of addressk * 2**144 + A % 2**144, provided that this address is within the current activity range. Use of these opcodes will also fill in the target address's code to equal the sender's, if it is not yet set.
  7. Contracts now store a code hash at key '' in the tree, not the actual code (this reduces data duplication arising from (5)).
  8. The receipts for a transaction are now saved in the leftmost shard allowed by the transaction group (ie. at MINRANGE + (d % 2**144) where d is the current address of the receipt storing contract).
  9. If GL is the global gas limit, the transaction group gas limits must satisfy the formula sum([max(L[i]**1.5, L[i] *GL / 4) for L in transaction_group_gas_limits]) < GL**1.5. This allows blocks to include more gas, up to a maximum 4x increase, if they include transactions that can be executed in parallel (this limit can later on be removed or relaxed once proper sharding is implemented).

Philosophically speaking, this introduces the following considerations:

  • There are now three types of exceptional conditions, rather than two as before: (i) out-of-gas, (ii) preventable errors arising from badly written code, and (iii, newly) out-of-range. Out of range should be viewed as being philosophically similar to out of gas, and similar kinds of guards in the code should be used to prevent attacks in both cases.
  • The preferred paradigm for making scalable applications will now be to create a receipt in one shard representing half of a completed operation, and then consume the receipt in another shard, verifying it using a merkle branch plus the STATEROOT opcode. A precompile contract for doing this will likely be added. Applications designed to be asynchronous will thus always benefit from the highest gas discounts as each transaction will only need to touch one contract.
  • High-level languages will likely include a shardedMap primitive, perhaps allowing a user-specified sharding schema (eg. shard = address // 2**144, shard = sha3(name) % 65536, etc), allowing contracts to store state across multiple shards.
  • Contract code can now be safely processed in parallel, introducing an immediate ~2-8x scalability benefit for the public ethereum blockchain assuming that miners have multicore processors, and a much larger scalability benefit for ethereum private chains; in a private chain context, the problems in http://www.multichain.com/blog/2015/11/smart-contracts-slow-blockchains/ would be completely solved.
  • It may be possible to make Ethereum massively scalable (defined either as "can safely process 10000+ tx/sec" or "transaction processing capacity quadratic in the processing power of a single node") with no further changes, if almost all validators under this scheme become comfortable mining/staking without running full validating nodes, instead employing collaborative validation strategies where they randomly poll other nodes on the validity of blocks on some shards and individually validate other shards. Hence, in the event that the Ethereum developers get blown up by [insert crazies here] or the foundation goes bankrupt, Ethereum will be much more well-suited to scale over time with only minimal further work on the core protocol if need be.

From an economics perspective, the following observations can be made:

  • Contract creators now have the ability to make a choice between being in shards where the contracts that they care to interact with are highly concentrated (and thereby benefit from network effects) and being in shards where contracts are rarefied (and thereby benefit from cost savings). This is very similar to the tradeoff faced by humans deciding whether to live in a big city or in a smaller city or the countryside, and so many insights from urban economics may be transferable.
  • Existing contracts do NOT have the ability to move between shards if they are unhappy with a change in the congestion/cost tradeoff of a given shard that arises from changes in the ecosystem; addresses are static. However, they do have an alternative escape valve: de-facto splitting and merging the shards that they are in. If a given shard becomes too congested, its gas price will go up, and so users will increasingly prefer to make operations that are limited to one of its two sub-shards; if enough people do this, then transaction groups at the higher shard level will be rare and transaction groups at the sub-shard level will be more frequent, thereby increasing the "distance" between the shards. Instead of (or rather, alongside) the citizens moving, the city itself shrinks or grows.

In order to process all transaction execution in parallel between transaction groups in a completely disjoint fashion, there are two possible routes from a data structure standpoint. The first is to modify the tree structure so as to have a depth-16 binary tree at the top level. The second is to implement Patricia tree splitting and merging (see python implemented code here: ethereum/pyethereum@81d37fc ), and do it all with the existing hexary Patricia tree: when processing every block, the process becomes split -> process in parallel -> merge. The former route may be slightly more efficient in the long term, the latter seems more efficient in the short term and may require writing slightly less code (though the code that does need to be written for split/merge is somewhat involved).

Note that if the 144-bit address plus 16-bit sharding scheme is undesirable because it unreasonably splits up existing contracts, we can also take the route of 160-bit addresses plus 16-bit shards for a total of 176 bit addresses.

@janx
Member
janx commented Jan 13, 2016

Very cool.

New specialized SSTORE and SLOAD opcodes are introduced along the existing ones, such that these opcodes take an additional "shard ID" as an argument, where the shard ID should be in the range 0 < k <= 65535.

Shard id can't be zero or typo?

176-bits address sounds more compatible with existing tools.

@chriseth
Contributor

Thinking about the "prime use case" of a sharded currency contract, a transfer would always touch at most two shards, but it could be hard to get the activity ranges of multiple transactions disjoint even though each shard is only touched by at most one transaction. What would be the drawbacks of allowing a list of activity ranges for each transaction group?

@janx
Member
janx commented Jan 13, 2016

How does this work with account abstraction? Does transactions from address 0 require special treatment? Will this make the shard 0 a special one because of pre-deployed contracts?

@Smithgift

I see a issue with Out-Of-Range being philosophically treated like Out-of-Gas. If I run out out of gas, well, the miner did the work, so I still should pay them. If my code is bad, that's my fault, and the miner should still get paid.

But activity groups, and therefore whether a transaction goes out of range, is totally under the miner's control. Suppose they use introspection to find if my high-gas high-price transaction crosses at least one shard boundary, and then manipulate the transaction group to glean as much possible gas from me without actually executing the transaction. Or perhaps the miner made a legitimate bad guess about the activity range, and only after spending over a million gas in computation did they discover the problem. Should the miner still get paid?

Something makes me uneasy about having two types of errors, one of which is the user's fault and the miner gets paid, and another which is the miner's fault, and there's not an obvious answer whether the miner should be paid or not. Sure, the programmer via the silent assistance of the compiler can deal with it, but there's a significant philosophical difference.

On a similar topic, what algorithm can miners use to determine transaction groups? Seems to be a kind of packing problem, except the bins themselves have to be packed into bigger bins and objects can grow and shrink depending on arbitrary bounded computation. There's probably a reasonably good algorithm to get an answer, in the same way that current miners deal with packing transactions into blocks.

Off the top of my head, a naive system might be:

  1. Sort the transactions by effective gas price and divide them into queues based on the shard of the contract they'll start with (or hit directly after the origin address.)
  2. Start processing the queues into their own proto-transaction groups, and when a transaction crosses an activity range delete the queue that pays less and dump those transactions on the "winning" queue.
  3. When all transactions have been processed, pick the set of proto-transaction groups that pays the most and make those the final transaction groups.

There's clearly issues, though (How much gas should each proto-transaction group be allowed?)

@jpritikin

"Applications designed to be asynchronous will this always benefit from the highest gas" -- Typo here but the meaning seems clear enough

@vbuterin
Collaborator

Ok, I added an important point: transactions now also specify their own activity ranges, and a transaction group's activity range must be a superset of that of all transactions that are contained in it. This should resolve most of the issues around OOR being something that miners can "force" to happen to any transaction.

@vbuterin
Collaborator

Regarding special addresses, the zero address is just a virtual sender account, so I don't see any problems there wrt sharding; if really desired I suppose we can make the sender 0 + m * 2**k to be in the same shard, but not sure that's necessary. Precompiles can probably be copied to every shard. Casper, we'll see; there should be some way to architect it so that it benefits from parallelization but that may have to wait for v2.

@simondlr

Very awesome. 2 questions:

I'm unsure precisely how this choice works/looks?

Contract creators now have the ability to make a choice between being in shards where the contracts that they care to interact with are highly concentrated (and thereby benefit from network effects) and being in shards where contracts are rarefied (and thereby benefit from cost savings).

Would you at a contract level choose which shard to store your contract's state?

What if you don't have knowledge on how far the activity range could be? It could be a tx with long cascading hops between contracts, but you are not sure what will be exeuted as a result of it. I assume, one would be able to check, like with current web3 tools to see what happens with a potential state change before submitting it. ie, run this transaction to find the activity range, then submit this along with the transaction? Is this a correct assumption?

@vbuterin
Collaborator

Would you at a contract level choose which shard to store your contract's state?

This happens at two levels. First, when creating a contract, you can choose the prefix at which you create it. Second, the contract itself can choose which shard IDs it tries to SSTORE data at.

What if you don't have knowledge on how far the activity range could be? It could be a tx with long cascading hops between contracts, but you are not sure what will be exeuted as a result of it.

Then, you would have to set the activity range to be very large, and thus wait a long time and pay high gas costs for it to be accepted (or else set a smaller activity range and risk some of the child messages OOR'ing). Creating long and unpredictable cascading hops between contracts is precisely the thing that is deliberately economically discouraged by this scheme.

@sillytuna

Taking the Alarm contract as an example, does that mean it's best to put my 'callee' contract on the same shard as the alarm contract, and also that (in principle) the alarm contract could live on multiple shard (via multiple contracts) if that provided a good cost saving?

@simondlr

Creating long and unpredictable cascading hops between contracts is precisely the thing that is deliberately economically discouraged by this scheme.

It does to some extent feel like one is disincentivizing potentially emergent behaviour due to stifling costs? ie.

A thought could occur: "We could automatically build a prediction market on information being produced automatically from contract A. But, argh. The prediction market platform is on a shard too far away. If only the 2 developers saw there could be potential synergy! Contract B should perhaps have chosen to deploy the contract in a closer group to the prediction market platform. Now we won't know what the interesting outcomes would be. Too expensive and will take too long to confirm. :("

It's a concern, but a trade-off for scalability's sake. Scalability > emergence at this point in time. I guess it should just be mitigated as much as possible to ensure it keeps most of the benefits. I guess, now your mention of urban economics makes sense.

Keep up the good work!

@vbuterin
Collaborator

And then a lightbulb goes off in your head and you realize that you could implement your application and get the best of both worlds by making the information grabbing from contract A asynchronous :)

@vbuterin
Collaborator

You probably want an instance of the alarm contract in every shard.

@simondlr

And then a lightbulb goes off in your head and you realize that you could implement your application and get the best of both worlds by making the information grabbing from contract A asynchronous :)

Ah, yes! Using log proofs, right?

@Smithgift

Suggestion: define CALLCODE and DELEGATECALL to use the code of the target contract as of last block and make those two opcodes always synchronous no matter the activity range (and work regardless of it).

Reasoning: Libraries can work cross-shard without further effort. Keeping the static code of popular libraries (possibly JIT compiled as well) in memory is little burden to miners, and libraries can't alter the state of their own shard anyway. It also helps forward-compatibility: no need to make sure your libraries will be on the same shard as your contract in the current blockchain so your dapp won't become expensive in the future.

Libraries can be created and selfdestruct, true. But if we just use the last block's code I can't conceive of any normal situation where it would matter.

@chriseth
Contributor

The semantics of self destruct do not make too much sense anyway without blockchain rent. Perhaps we could also consider to add some DELEGATECALL or CALLCODE variant that does not use an address but a code hash.

@VoR0220
VoR0220 commented Jan 14, 2016

Why wouldn't the semantics of self destruct not make sense here? Would not a shard be able to shorten the amount of "block" it puts in with the self destruct mechanism? I'm very curious and want to better understand all of this.

@vbuterin
Collaborator

Ah, yes! Using log proofs, right?

Yep :)

Suggestion: define CALLCODE and DELEGATECALL to use the code of the target contract as of last block and make those two opcodes always synchronous no matter the activity range (and work regardless of it).

This feels like it's going into the "protocol-level proofs" area, as it violates the assumption that every shard knows about nothing but itself and previous state roots (an assumption that's not true in Serenity but will be in the further future). If we go that route, then this seems completely sensible, though note that it does mean that transactions will need to include proofs of the code they are calling. Otherwise, the most realistic route may be to simply copy the library code to each shard (note that this can be done just-in-time, and since we store the hash of the code in the tree and not the actual code the duplication costs are greatly reduced). You could argue that this is close to equivalent to a delegate call that uses a code hash.

@vladzamfir

The main problem that I see with this proposal is the following:

By setting m = 0 and k = 160, an adversary can create a block with a single transaction group that touches as much of the state as possible, being limited only by the global gas limit GL. The adversary can thereby force validators who don't have enough storage capacity to SPV validate their block. In the worst imaginable case, an adversary may be the only node with enough capacity to validate their block, and can thereby introduce an invalid state transition.

This begs the following questions:

  • Is the global gas limit the only handle that exists under EIP 105 to mitigate this attack?
  • Will we require that the global state is small or well-behaved enough that any validator with retail hardware can validate a block containing a transaction group with m = 0, k = 160?
@vbuterin
Collaborator

The solution there is to make sure that the gas rules impose a sane upper bound on the portion of the state that a single block can read, and thereby indirectly impose a sane upper bound on the size of the required merkle proof for a block. Read operations probably need to be made more expensive, perhaps to the point where they are a substantial portion of the cost of write operations.

@VoR0220
VoR0220 commented Jan 24, 2016

How would you accomplish that? When you say read operations, you mean inputs into the chain correct?

@vbuterin
Collaborator

I mean SLOAD (and less-known ones like BALANCE).

@Smithgift

I realized a bizarre but potentially quite useful feature of sharding. Suppose the network is now huge and every validator has to trust the validators of other shards. Then suppose out in shard 3f41 a group of validators agree to change consensus rules to add the FOO_BAR opcode. In shard 3f41, the FOO_BAR opcode now works. In all other shards, it doesn't. But as long as the outside validators trust the FOO_BARites of 3f41, there remains only one blockchain. We essentially have a partial hardfork (or a hardfork and a soft-fork at the same time.)

In reality, the amount of drama that accompanies any change would mean this could never happen in secret. On the other hand, some gargantuan drama-inducing change could have a compromise: everyone on side A sticks to side A shards, everyone on side B sticks to side B shards, and there's just a few validators which overlap to make sure that no one on the other side is cheating.

@mcelrath

This proposal effectively outsources the problem, via an incentive, without actually solving it. I don't think it's reasonable that a contract or person would know what "activity range" it needs, and I anticipate a ton of out-of-gas conditions as a consequence, and general breakage. Absent a good solution to sharding, incentivizing a solution is an ugly, if not terrible idea, but will most definitely result in unexpected breakage.

@zack-bitcoin

Potential attack:
If one of the rarified shards is filled with money from only a handful of people, and the handful colludes to introduce an invalid state transition that gives themselves tons more money.
Is there a way to recover from this?

@VoR0220
VoR0220 commented Mar 11, 2016

@zack-bitcoin not really doable I think....they still have to eventually verify with the other shards. That's when this would fall through.

@AnhNDT AnhNDT referenced this issue in prdlab/blockchain-guide Feb 15, 2017
Open

"Scalability/sharding" và PoS trong Ethereum #2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment