EVM 2/wasm 105 #48
This EIP proposes upgrading the VM by converting it to use a subset of WebAssembly (Wasm). Wasm is a new assembly standard being built for the web. The main advantage of using Wasm is performance (both speed and size). The main disadvantage would be the need to port our existing infrastructure to a new ISA. Most of this EIP is taken from Wasm's design docs, which should be referenced for further details.
To truly distinguish Ethereum as the World Computer we need a performant VM. The current architecture of the VM is one of the greatest blockers to raw performance. Its stack-based design and 256-bit word size make translation from EVM opcodes to hardware instructions more difficult than necessary.
With an architecture that maps more closely to hardware, the VM will perform considerably better, which will open the door to a much wider array of uses that require higher performance and throughput. Also, by choosing a common and more standardized architecture, anyone will be able to compile C/C++, Solidity, etc. once, and the compiled code will run in multiple environments. Using the new assembly standard will make running a program directly on Ethereum, on a cloud hosting environment, or on one's local machine a frictionless process.
Wasm is akin to LLVM IR and low-level Instruction Set Architectures (ISAs). It is defined as an Abstract Syntax Tree (AST) which has a textual representation using s-expressions. As an AST it has some higher-level semantics than pure hardware ISAs, including functions and basic flow control. Wasm is not yet a finished specification and is still in flux. This EIP suggests using a subset of Wasm to allow us to tailor the semantics to Ethereum. This spec is designed so that a Wasm VM needs the minimum number of changes to implement it. This should make existing implementations of Wasm easy to integrate and provide a way to run Ethereum-flavored Wasm on an unmodified Wasm VM.
A Wasm module expressed in s-expressions looks like
Eth Flavored wasm
For Ethereum we can consider each contract as a module; therefore we can have an intrinsic module and do not need to specify the module's closure (we leave it in for now for clarity). Furthermore, if we restrict the types to 64 bit, we can eliminate the type information from the locals definition.
Difference from WASM
Metering can initially be accomplished by injecting the counting code into the AST and then passing the modified AST to a wasm VM. Modifying the AST is done by traversing it and adding a gas check immediately after each branch condition and at the start of functions and loops. For a more performant version, gas counting could possibly be done in the VM directly, but initial trials suggest that injecting gas checks at the AST level does not greatly affect performance. Since the gas counting code must be protected, it would have to be implemented in a separate module.
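The injection pass described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the AST is modelled as nested lists whose first element is the node kind, the node names (`func`, `loop`, `then`, `else`) and the `$use_gas` hook are hypothetical rather than taken from the Wasm design docs or this EIP, and every check charges a flat cost of 1.

```python
# Hypothetical AST shape: nested lists, first element is the node kind, e.g.
#   ["func", "$name", *body], ["loop", *body],
#   ["if", condition, ["then", *body], ["else", *body]]

# Index at which the body of each metered node kind begins.
BODY_START = {"func": 2, "loop": 1, "then": 1, "else": 1}


def gas_check(cost=1):
    """Build a call to a hypothetical metering hook (the name is an assumption)."""
    return ["call", "$use_gas", ["i64.const", cost]]


def inject_gas(node):
    """Return a copy of `node` with a gas check at the start of every
    function, loop and branch arm (i.e. right after the branch condition)."""
    if not isinstance(node, list) or not node:
        return node  # leaves (names, constants) are returned unchanged
    children = [inject_gas(child) for child in node]
    kind = node[0] if isinstance(node[0], str) else None
    start = BODY_START.get(kind)
    if start is None:
        return children
    return children[:start] + [gas_check()] + children[start:]


def count_gas_checks(node):
    """Count injected metering calls, useful for sanity-checking the pass."""
    if not isinstance(node, list):
        return 0
    own = 1 if node[:2] == ["call", "$use_gas"] else 0
    return own + sum(count_gas_checks(child) for child in node)


program = ["func", "$main",
           ["loop",
            ["if", ["i64.eqz", ["get_local", 0]],
             ["then", ["br", 1]],
             ["else", ["nop"]]]]]
metered = inject_gas(program)
```

The pass builds a new tree instead of mutating its input, so the unmetered AST stays available for comparison. A real implementation would also have to keep the metering hook out of reach of contract code, which is the reason the text gives for placing it in a separate module.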
Ethereum Specific Opcodes
The above opcodes should be used with WASM's call.
Whether the module system could be used for contracts in general is an open question. For now we will not use it in favor of using the
Modified Ethereum operations
note on CALLs
The module loading system can’t yet support
Abstract Syntax Tree Semantics
For the current full definition see here. I'm only going to post the core semantics that we are interested in. For more details see the accompanying wiki. A formal spec for each opcode will be worked out if this EIP gains momentum. Currently this EIP suggests limiting the functions to only 64-bit integer operations.
Internal Function Calls
Each function has a signature, which consists of two parts: return types, which are a sequence of value types, and argument types, which are a sequence of value types.
WebAssembly doesn't support variable-length argument lists (aka varargs). Compilers targeting WebAssembly can instead support them through explicit accesses to linear memory. The length of the return types sequence may only be 0 or 1.
Direct calls to a function specify the callee by index into a main function table.
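As a rough illustration of signatures and direct calls by table index, here is a minimal Python sketch. The class and function names (`Signature`, `Function`, `call_direct`) are hypothetical, the table is modelled as a plain list, and results are wrapped to 64 bits on the assumption that only 64-bit integer operations are allowed, as this EIP suggests.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

MASK64 = (1 << 64) - 1


@dataclass(frozen=True)
class Signature:
    params: Tuple[str, ...]   # argument value types
    results: Tuple[str, ...]  # return value types, at most one entry

    def __post_init__(self):
        if len(self.results) > 1:
            raise ValueError("a function may return at most one value")


@dataclass(frozen=True)
class Function:
    signature: Signature
    body: Callable[..., int]


def call_direct(table: List[Function], index: int, args: List[int]):
    """Call the function at `index` in the table, checking the argument count."""
    func = table[index]
    if len(args) != len(func.signature.params):
        raise TypeError("argument count does not match the signature")
    result = func.body(*args)
    # Functions with no return type yield nothing; otherwise wrap to 64 bits.
    return None if not func.signature.results else result & MASK64


# A two-entry function table: index 0 adds two i64 values, index 1 returns nothing.
table = [
    Function(Signature(("i64", "i64"), ("i64",)), lambda a, b: a + b),
    Function(Signature((), ()), lambda: 0),
]
```

Because the callee is a fixed index, a validator could check every direct call against the callee's signature before execution; the sketch performs that check at call time only to stay short.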
The main storage of a WebAssembly instance, called the linear memory, is a contiguous, byte-addressable range of memory spanning from offset
Linear Memory Accesses
Linear memory access is accomplished with explicit load and store operators. Integer loads can specify a storage size which is smaller than the result type as well as a signedness which determines whether the bytes are sign- or zero-extended into the result type.
Stores have an additional input operand which is the value to store to memory. Like loads, integer stores may specify a smaller storage size than the operand size in which case integer wrapping is implied.
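The load and store behaviour described above can be sketched in Python as follows. This is an illustrative model, not the Wasm specification: the `LinearMemory` class and its method names are hypothetical, bytes are assumed to be laid out little-endian, results are assumed to be 64-bit, and an out-of-bounds access simply raises `IndexError`.

```python
MASK64 = (1 << 64) - 1


class LinearMemory:
    """A contiguous, byte-addressable memory backed by a bytearray."""

    def __init__(self, size):
        self.data = bytearray(size)  # zero-initialized

    def _check(self, addr, nbytes):
        if addr < 0 or addr + nbytes > len(self.data):
            raise IndexError("out-of-bounds linear memory access")

    def store(self, addr, value, nbytes=8):
        """Store the low `nbytes` bytes of `value`; wider values wrap."""
        self._check(addr, nbytes)
        wrapped = value & ((1 << (8 * nbytes)) - 1)
        self.data[addr:addr + nbytes] = wrapped.to_bytes(nbytes, "little")

    def load(self, addr, nbytes=8, signed=False):
        """Load `nbytes` bytes and sign- or zero-extend them to 64 bits.
        The result is returned as an unsigned 64-bit bit pattern."""
        self._check(addr, nbytes)
        raw = int.from_bytes(self.data[addr:addr + nbytes], "little", signed=signed)
        return raw & MASK64


mem = LinearMemory(16)
mem.store(0, 0x1234, nbytes=1)   # wraps: only 0x34 is written
mem.store(8, 0x80, nbytes=1)     # high bit set, so extension mode matters
zero_extended = mem.load(8, nbytes=1, signed=False)
sign_extended = mem.load(8, nbytes=1, signed=True)
```

Reading the same byte with both signedness settings shows the difference: zero extension yields `0x80`, while sign extension fills the upper 56 bits with ones.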
Each function has a fixed, pre-declared number of local variables which occupy a single index space local to the function. Parameters are addressed as local variables. Local variables do not have addresses and are not aliased by linear memory. Local variables have value types and are initialized to the appropriate zero value for their type at the beginning of the function, except parameters which are initialized to the values of the arguments passed to the function. Local variables can be thought of as registers.
If the module has a start node defined, the function it refers to should be called
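A small Python sketch of this local-variable model follows. The `Frame` class is hypothetical, and placing parameters at the lowest indices ahead of the declared locals is an assumption made for illustration; values are wrapped to 64 bits to match the integer-only subset suggested by this EIP.

```python
MASK64 = (1 << 64) - 1


class Frame:
    """Locals of one function activation: parameters first, then declared locals."""

    def __init__(self, args, num_locals):
        # Parameters take the argument values; declared locals start at zero.
        self.slots = [arg & MASK64 for arg in args] + [0] * num_locals

    def get_local(self, index):
        return self.slots[index]

    def set_local(self, index, value):
        self.slots[index] = value & MASK64


# Two parameters (indices 0 and 1) and two declared locals (indices 2 and 3).
frame = Frame(args=[7, 9], num_locals=2)
frame.set_local(2, frame.get_local(0) + frame.get_local(1))
```

The slots live in a private list, not in linear memory, which mirrors the point that locals have no addresses and behave like registers.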
For example, a start node in a module will be:
In the first example, the environment is expected to call the function $start_function
A module can:
Further work on the binary encoding can be tracked here. Currently wasm only has an early binary encoding spec.
Rationale for a Register-Based ISA
64-bit vs 256-bit
A 64-bit word size would map to real-world hardware. Most of the time we don't need 256-bit arithmetic. 256 bits was chosen to work with 256-bit hashes, but we can easily store hash results in memory. Some opcodes such as CALL would have to load the "to" address from memory.
While the complete EIP is not implemented, many pieces of it are.
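To illustrate how a 256-bit value such as a hash could be handled with a 64-bit word size, here is a Python sketch that splits the value into four 64-bit words and adds two such values with carry propagation. The helper names and the least-significant-word-first order are assumptions for illustration; this EIP does not prescribe a layout.

```python
MASK64 = (1 << 64) - 1


def to_limbs(value):
    """Split a 256-bit integer into four 64-bit words, least significant first."""
    return [(value >> (64 * i)) & MASK64 for i in range(4)]


def from_limbs(limbs):
    """Reassemble an integer from 64-bit words, least significant first."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))


def add256(a, b):
    """Add two four-word values using only 64-bit words and a carry.
    The result wraps modulo 2**256, like EVM ADD."""
    out, carry = [], 0
    for x, y in zip(a, b):
        total = x + y + carry
        out.append(total & MASK64)
        carry = total >> 64
    return out


hash_value = 0x1234567890ABCDEF << 192 | 0xFFFFFFFFFFFFFFFF
limbs = to_limbs(hash_value)
incremented = from_limbs(add256(limbs, to_limbs(1)))
```

Each loop iteration corresponds to one 64-bit add plus a carry, so a full 256-bit addition costs four word operations. Contracts that only need 64-bit arithmetic avoid that cost entirely.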
There are several Wasm implementations; here are a few.
@CJentzsch You would have to transcompile all the existing EVM code to WASM. Something like this has been done before with the EVM JIT, except that it targeted LLVM IR. There are a couple of opcodes that are not backwards compatible.
Although this idea does have its merits, I honestly don't think this will ever happen. We've just launched Ethereum and the EVM, and even getting a single opcode added to it is a horribly painful and error-prone procedure. By replacing the entire EVM we might as well just launch a new network from scratch. I don't think we could rationalize any speed increases with the enormous costs associated with implementing the proposed WASM-based VM in all official languages, battle testing that they work and also potentially having them audited.
@karalabe Yep, that is the key question. C++ wouldn't have much to implement since they would just have to wrap Intel's JIT. But it might be a lot of work for Go and Python. Luckily, WASM is taking a similar approach to testing as us: they have a standalone test repo that all implementations must pass. We wouldn't be doing everything from scratch.
@wanderer Is there something specific to WASM which makes it a better fit than just using LLVM IR, which is already defined and is becoming very close to being a universal AST standard?
If it was feasible to use LLVM IR as the target then all the numerous existing LLVM backends could be used off-the-shelf (C++, C#, JS, and many more) right now, and you would still be able to target WASM as-and-when that settles down.
@bobsummerwill I actually compiled some notes on this subject. Here they are. Let me know if you have any more questions!
From LLVM Language Reference:
Response from Derek Schuff (one of the engineers for PNaCl) from Google on WASM vs LLVM
Further research on LLVM
None of these problems are insurmountable. For example PNaCl defines a small portable subset of the IR with reduced undefined behavior, and a stable version of the bitcode encoding. It also employs several techniques to improve startup performance. However, each customization, workaround, and special solution means less benefit from the common infrastructure.
Update: the binary encoding has now been specified: https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md