contracts: Support RISC-V bytecode #115

athei · 2023-03-19T16:27:11Z

⚠️ RISC-V support will only be added alongside the Wasm support. Wasm is not going anywhere. It is also a non-breaking change: it does not matter which bytecode a contract uses, you call it in the same way. That is true no matter how a contract is called (by another contract vs. by an extrinsic).

Here is a write up by @koute with more information: https://forum.polkadot.network/t/exploring-alternatives-to-wasm-for-smart-contracts/2434

Why we need a new bytecode

The idea of supporting an alternative to WebAssembly (wasm) in pallet-contracts has developed over the last couple of months. It started with discussions between various engineers. We came to the conclusion that wasm is not the optimal bytecode to formulate contracts in. It comes down to a few key insights from which a lot of consequences arise:

A stack machine does not work well for performance. A compiler that needs to transform it for a real-world machine (a register machine) either has non-linear compilation time (wasmtime) or produces slow code (wasmer). Both of them are severely behind in startup time when compared to even a non-in-place interpreter (wasmi). Since startup time is as important as execution speed, we have stuck with an interpreter so far.
Wasm is complex. Due to its high-level structure, validation of the code is required before it can be compiled and run. This validation can of course contain bugs that lead to catastrophic events. Compare that to a simple ISA (like RISC-V), which does not need global validation. Any invalid operation just traps deterministically.
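To illustrate the second point, here is a minimal sketch of per-instruction decoding where anything unrecognised becomes a trap, with no pass over the whole program. The `Decoded` type and `decode` function are hypothetical names made up for this example, and only the R-type `or` instruction is handled. The field layout follows the standard 32-bit RISC-V encoding, and the register check assumes the 16-register RV32E variant.

```rust
/// Outcome of decoding one 32-bit instruction word (hypothetical type).
#[derive(Debug, PartialEq)]
enum Decoded {
    /// `or rd, rs1, rs2`
    Or { rd: u8, rs1: u8, rs2: u8 },
    /// Anything this sketch does not recognise: execution would stop here.
    Trap,
}

/// Number of general purpose registers in the RV32E variant.
const RV32E_REGS: u8 = 16;

/// Decode a single instruction word without looking at the rest of the program.
fn decode(word: u32) -> Decoded {
    let opcode = word & 0x7f;
    let rd = ((word >> 7) & 0x1f) as u8;
    let funct3 = (word >> 12) & 0x7;
    let rs1 = ((word >> 15) & 0x1f) as u8;
    let rs2 = ((word >> 20) & 0x1f) as u8;
    let funct7 = word >> 25;

    match (opcode, funct3, funct7) {
        // R-type OR: opcode 0b0110011, funct3 0b110, funct7 0.
        (0x33, 0x6, 0x00)
            if rd < RV32E_REGS && rs1 < RV32E_REGS && rs2 < RV32E_REGS =>
        {
            Decoded::Or { rd, rs1, rs2 }
        }
        // Unknown opcodes and out-of-range registers trap the same way every time.
        _ => Decoded::Trap,
    }
}

fn main() {
    // 0x00C5E533 encodes `or x10, x11, x12`.
    println!("{:?}", decode(0x00C5_E533));
    // Same instruction with rd = x16, which does not exist on RV32E.
    println!("{:?}", decode(0x00C5_E833));
}
```

Each word is judged on its own, so the result for a given word never depends on surrounding code.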

The bottom line is that we want a different bytecode that does not have these properties. It was unclear so far whether we want to use an existing architecture or write our own (based on an existing design, of course).

First trial: BPF

As a first attempt to check whether supporting a new bytecode is viable, we hacked together a node which supports eBPF in pallet-contracts. I strongly recommend reading this report: https://forum.polkadot.network/t/ebpf-contracts-hackathon/1084

eBPF is an interesting target because it is designed to be trivially compilable to the architectures the Linux kernel supports: essentially just a mapping between instructions. The key insight is that it needs to be RISC and have a low general-purpose register count at the same time. However, BPF has its problems. It is not designed for performance, and the upstream LLVM backend doesn't compile all code, because it is designed for in-kernel use. Hence we can't use the stock Rust compiler.

While this was an interesting experiment it is probably not the bytecode we want to settle for.

Better: RISC-V

This is another bytecode that was floated as a candidate for a while. It is a logical choice: a modern clean-sheet design that is modular instead of incremental, just as wasm is. That is quite exceptional for a real-world architecture, and it allows us to support only the instructions we need for contracts. The only downside when compared to BPF is that it has 32 general-purpose registers instead of 11. This is a problem as our main host architecture, amd64, has only 16 registers, which prevents us from mapping RISC-V registers 1:1 to native host registers. But being able to do that is what enables us to have the best of both worlds: compile as fast as wasmi while emitting code that performs in the same order of magnitude as native code.

However, after I reached out to @koute for help, he came to the conclusion that RISC-V is still a viable target and that we have the following options, which all come with their caveats:

Use the riscv32e target for contracts which has a reduced register set (16 regs). This is the preferred solution. However, the LLVM backend for this target is not merged and hence it is not yet supported by upstream Rust.
While JITing, we just spill the high registers to the stack. The impact on execution performance seems to be minimal, as there are diminishing returns with more registers. However, we are interested in the worst case, and this might even be attackable by a malicious contract. Additionally, it adds complexity to the consensus-critical JIT.
Add an offline post processing step that transforms a riscv32i (32 regs) program to a riscv32e program. This would be added to cargo-contract. Since it happens offline it can do non linear optimizations and register allocations. However, writing and maintaining this would probably be more work than just spilling the registers in JIT. It might still be worth it to reduce complexity of consensus critical code.
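To make the spilling option more concrete, here is a rough sketch of how a JIT could decide where each of the 32 guest registers lives. Everything in it is an assumption for illustration only: the `Location` type, the `locate` function, and the choice of pinning 12 guest registers to host registers (amd64 has 16, but some have to stay free for the stack pointer and scratch use). It is not how the prototype JIT works.

```rust
/// Where a guest (RISC-V) register lives while JIT-compiled code runs.
/// Hypothetical type for this sketch.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Location {
    /// x0 is hardwired to zero and needs no storage.
    Zero,
    /// Pinned to a host register, identified by an index into a
    /// (hypothetical) table of host registers reserved for the guest.
    HostReg(u8),
    /// Spilled to memory at this byte offset from a spill area base.
    Spill(u32),
}

/// Assumed number of host registers reserved for guest registers.
const PINNED: u8 = 12;

/// Map a guest register number to its location, or `None` if it does not exist.
fn locate(guest_reg: u8) -> Option<Location> {
    match guest_reg {
        0 => Some(Location::Zero),
        // x1..=x12 get a host register each.
        r if r <= PINNED => Some(Location::HostReg(r - 1)),
        // x13..=x31 get a 4-byte slot each (RV32 registers are 32 bits wide).
        r if r < 32 => Some(Location::Spill(u32::from(r - PINNED - 1) * 4)),
        _ => None,
    }
}

fn main() {
    for reg in [0u8, 1, 12, 13, 31] {
        println!("x{} -> {:?}", reg, locate(reg));
    }
}
```

Every access to a spilled register costs an extra memory operation, which is why the worst case (a contract that deliberately uses only high registers) matters more here than the average case.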

I cannot stress enough how instrumental @koute was for the research into RISC-V. He wrote a RISC-V to amd64 JIT in a day to prove that the plan to have a trivial JIT is viable. This is why we can be somewhat confident that RISC-V is the way forward.

This is the execution performance of that JIT (lower is better):

wasmi: 108ms
wasmer singlepass: 10.8ms
wasmer cranelift: 4.8ms
wasmtime: 5.3ms
koute JIT: 25ms

Keep in mind that zero optimization went into the JIT. It is a completely naive implementation, just to prove that it works. It is reasonable to expect that we will eventually perform better than wasmer singlepass while having interpreter-style startup speeds.
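For a quick sense of scale, the numbers above can be turned into ratios against the prototype JIT. The snippet below only does that arithmetic on the figures already listed; `jit_slowdown_vs` is a made-up helper name. Values below 1 mean the JIT is faster than the other engine, values above 1 mean it is slower.

```rust
/// Execution times in milliseconds, copied from the list above.
const RESULTS: [(&str, f64); 4] = [
    ("wasmi", 108.0),
    ("wasmer singlepass", 10.8),
    ("wasmer cranelift", 4.8),
    ("wasmtime", 5.3),
];

/// Execution time of the prototype JIT in milliseconds.
const JIT_MS: f64 = 25.0;

/// Ratio of the JIT's time to another engine's time.
fn jit_slowdown_vs(other_ms: f64) -> f64 {
    JIT_MS / other_ms
}

fn main() {
    for &(name, ms) in RESULTS.iter() {
        println!(
            "{:>18}: {:>6.1} ms, JIT takes {:.2}x as long",
            name,
            ms,
            jit_slowdown_vs(ms)
        );
    }
}
```

So the naive JIT takes roughly a quarter of wasmi's time and a bit more than twice the time of wasmer singlepass.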

cc @pepyakin

Next Steps

Grab the rv32e patch for LLVM, apply it and compile rustc that can emit rv32e code, and see how this affects performance, the size and JIT complexity. If this turns out to be very valuable we might want to fund the completion of the patch.
Rig ink so that it can emit RISC-V: Initial RISC-V support use-ink/ink#1718
Add a host function in substrate to execute this bytecode: Add virtualization host functions #3520
Make contracts pallet support this, and just see how a more real world use of it goes.
Write a spec for everything and further discuss the details, most likely while implementing a production-ready prototype (and there's a bit of stuff to decide here; e.g. the container to hold the bytecode [we probably don't want to use ELF], versioning, runtime memory layout, syscall interface, metering, etc.).


koute · 2023-03-19T16:36:55Z

Just a quick FYI to anyone reading - regarding my RISC-V experiment, I will soon be publishing a more detailed writeup of my research into this and what I've learned (I just need to finish dealing with some higher priority issues first). The TLDR version is that the RISC-V target is very promising, we should explore it further, and in my opinion it is definitely a better target than eBPF in almost every aspect.

koute · 2023-03-27T15:45:58Z

Here's a link to the full writeup of my experience: https://forum.polkadot.network/t/exploring-alternatives-to-wasm-for-smart-contracts/2434

vivekvpandya · 2023-04-04T06:27:51Z

Can we make it directly support LLVM ByteCode?

koute · 2023-04-04T07:23:57Z

> Can we make it directly support LLVM ByteCode?

I don't think that's a good idea. LLVM bytecode is significantly more complex, will arbitrarily change in the future (it's not a stable target like RISC-V is), and requires a costly register allocation step to JIT it into native code (since it's in SSA form). It essentially has all of the downsides of WASM, and more.

Lohann · 2023-04-10T04:54:14Z

Question: if the goal is to optimize smart-contract execution, I think you must also consider updating the SEAL interface to something that better supports parallel smart-contract execution.

Solana has a solution called Sealevel, which basically allows them to deterministically execute smart-contracts in parallel:

> The reason why Solana is able to process transactions in parallel is that Solana transactions describe all the states a transaction will read or write while executing. This not only allows for non-overlapping transactions to execute concurrently, but also for transactions that are only reading the same state to execute concurrently as well.

Support for parallelism must be thought about from the beginning, and honestly must be considered for the Runtime too; the support for sp-tasks was removed: paritytech/substrate#12639
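The rule in the quote above can be sketched as a conflict check over declared read and write sets. This is only an illustration of the idea, not Solana's actual API: the `Tx` type, the `tx` helper, and the string keys are all made up for the example.

```rust
use std::collections::HashSet;

/// A transaction that declares up front which state it touches
/// (hypothetical type; keys are plain strings for simplicity).
struct Tx {
    reads: HashSet<&'static str>,
    writes: HashSet<&'static str>,
}

/// Convenience constructor for the example.
fn tx(reads: &[&'static str], writes: &[&'static str]) -> Tx {
    Tx {
        reads: reads.iter().copied().collect(),
        writes: writes.iter().copied().collect(),
    }
}

/// Two transactions conflict if one writes state the other reads or writes.
/// Transactions that only read the same state do not conflict.
fn conflicts(a: &Tx, b: &Tx) -> bool {
    !a.writes.is_disjoint(&b.writes)
        || !a.writes.is_disjoint(&b.reads)
        || !a.reads.is_disjoint(&b.writes)
}

fn main() {
    let reader_1 = tx(&["balance:alice"], &[]);
    let reader_2 = tx(&["balance:alice"], &["log:2"]);
    let writer = tx(&[], &["balance:alice"]);

    println!("reader_1 vs reader_2: {}", conflicts(&reader_1, &reader_2));
    println!("reader_1 vs writer:   {}", conflicts(&reader_1, &writer));
}
```

A scheduler built on this check could run any set of pairwise non-conflicting transactions concurrently and still get a deterministic result.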

Lohann · 2023-04-10T05:02:10Z

~~You guys have probably already explored this, but did you take a look at the Wasm3 M3: Massey Meta Machine architecture? Can't pallet-contracts use a similar approach?~~

Nvm, it was already explored by wasmi: wasmi-labs/wasmi#314 (comment)

koute · 2023-04-10T06:29:51Z

> Question: if the goal is to optimize smart-contract execution, I think you must also consider updating the SEAL interface to something that better supports parallel smart-contract execution.

I'd say the main goal is simplification; extra performance's just a bonus.

> but did you take a look at the Wasm3 M3: Massey Meta Machine architecture?

It's a neat idea for speeding up interpreters, but judging from the example generated machine code in the README it would most likely be significantly slower. (For that particular example my RISC-V recompiler can directly map a RISC-V `or` operation to the AMD64 `or` operation, generating a single machine instruction, while theirs generates 5, and one of them is a jump.)
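A rough sketch of what such a direct mapping could look like is below. It is an illustration under several assumptions, not the actual recompiler: guest registers are assumed to be already mapped 1:1 to amd64 register numbers (0 = rax, 1 = rcx, 2 = rdx, ...), the 64-bit `REX.W` forms are used for simplicity, and `encode_rr` and `emit_or` are made-up names. Since amd64 `or` has two operands while RISC-V `or` has three, the single-instruction case only applies when the destination is also one of the sources.

```rust
const OP_MOV: u8 = 0x89; // MOV r/m64, r64
const OP_OR: u8 = 0x09; // OR r/m64, r64

/// Encode `op dst, src` for two 64-bit registers as `REX.W opcode ModRM`.
fn encode_rr(opcode: u8, dst: u8, src: u8) -> [u8; 3] {
    assert!(dst < 16 && src < 16);
    // REX prefix 0100WRXB: W selects 64-bit operands,
    // R is the high bit of `src`, B is the high bit of `dst`.
    let rex = 0x48 | ((src >> 3) << 2) | (dst >> 3);
    // ModRM: mod = 11 (register direct), reg = src, rm = dst.
    let modrm = 0xC0 | ((src & 7) << 3) | (dst & 7);
    [rex, opcode, modrm]
}

/// Emit host code for the guest instruction `or rd, rs1, rs2`.
fn emit_or(rd: u8, rs1: u8, rs2: u8) -> Vec<u8> {
    let mut code = Vec::new();
    if rd == rs1 {
        // rd |= rs2: one host instruction.
        code.extend_from_slice(&encode_rr(OP_OR, rd, rs2));
    } else if rd == rs2 {
        // `or` is commutative, so rd |= rs1 is still one instruction.
        code.extend_from_slice(&encode_rr(OP_OR, rd, rs1));
    } else {
        // General case: copy rs1 into rd first, then or in rs2.
        code.extend_from_slice(&encode_rr(OP_MOV, rd, rs1));
        code.extend_from_slice(&encode_rr(OP_OR, rd, rs2));
    }
    code
}

fn main() {
    // Guest `or r0, r0, r1` becomes host `or rax, rcx`.
    println!("{:02x?}", emit_or(0, 0, 1));
    // Guest `or r2, r0, r1` becomes `mov rdx, rax` followed by `or rdx, rcx`.
    println!("{:02x?}", emit_or(2, 0, 1));
}
```

Because each guest instruction expands to a fixed, short sequence, compilation stays linear in the size of the program.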

Lohann · 2023-04-10T20:01:07Z

> For that particular example my RISC-V recompiler can directly map a RISC-V `or` operation to the AMD64 `or` operation, generating a single machine instruction

Ok, now I'm confused: to be able to generate 1:1 machine code, you need to compile the RISC-V code into the host's machine code, but pallet-contracts itself is compiled using Wasmtime. Because of that it is not possible to use inline assembly in any pallet; you can't even know what the host's architecture is from inside the runtime.
I may be wrong, but in my understanding the only way to achieve near-native performance using RISC-V for dynamic code is creating a new host function to compile it into host machine code (which may be safe if the compilation time is deterministic, with a 1:1 instruction mapping).

koute · 2023-04-11T05:07:46Z

> is creating a new host function to compile it into host machine code

That's the plan, which is why it's important to keep it simple.

burdges · 2023-04-11T05:13:35Z

I believe state parallelism remains a general parachain/parathread issue, not smart contract specific. We'll simply cram more parachain blocks into the window between relay chain blocks, so those blocks would not themselves be parallel, aka they have a sequence, but they'll all have different approval checkers, so their workload becomes parallel to the relay chain and its validators. We'll hold inclusion and finality upon them all being included or approved.

athei · 2023-04-12T15:50:09Z

Please do not discuss parallelism here. It is completely orthogonal and not at all related to pallet-contracts. If we ever decide that we want parallel runtimes (AFAIK as of right now we don't) we can look into this. I know about Sealevel and the changes to our interface that would be required if that day comes. But it has nothing to do with RISC-V.

vivekvpandya · 2023-04-13T04:57:53Z

Is it possible to do the first step with gccrs, as GCC already has RVE support upstream?
Also, I read that RVE changes the stack alignment to 32 bits. I hope this is fine.

koute · 2023-04-13T06:02:33Z

> Is it possible to do the first step with gccrs, as GCC already has RVE support upstream?

No. Right now gccrs can't even compile core. Maybe in the future, but definitely not in the near future. rustc_codegen_gcc could work, although it might require some extra work (I don't know whether anyone has tried to use it to generate RISC-V code, nor whether it'd support RV32E out of the box or need extra patches).

> Also, I read that RVE changes the stack alignment to 32 bits. I hope this is fine.

Doesn't matter. In the current experiment we're not even using the native stack at all. (Which could be worse for performance, but it's nice due to its simplicity.)

athei · 2023-04-14T08:46:19Z

Getting the patch into LLVM seems much closer than getting gcc to work.

Polkadot-Forum · 2023-06-01T04:34:25Z

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/ebpf-contracts-hackathon/1084/13

Polkadot-Forum · 2023-12-20T05:58:38Z

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/polkadot-release-analysis-v1-5-0/5358/1

athei · 2024-04-08T16:01:45Z

Some updates:

The PR to make PolkaVM available to pallet-contracts is open here: Add virtualization host functions #3520
The rv32e changes were finally merged into LLVM. So we can compile for PolkaVM with a stock toolchain soon: [RISCV] CodeGen of RVE and ilp32e/lp64e ABIs llvm/llvm-project#76777
The tests for pallet-contracts were converted from wat (wasm assembly) to Rust and are already compiled (but not run) for PolkaVM: Contracts: Translate .wat fixtures to rust #2654
Some runtimes can already be compiled to PolkaVM and the CI does just that: Initial support for building RISC-V runtimes targeting PolkaVM #3179 Build more runtimes targeting PolkaVM #3209
An executor for PolkaVM was merged that runs PolkaVM runtimes: Add a PolkaVM-based executor #3458

athei added J0-enhancement labels Mar 19, 2023

athei self-assigned this Mar 19, 2023

xermicus mentioned this issue Apr 12, 2023

Solang: Compiling solidity contracts to RISC-V paritytech/roadmap#15

Open

burdges mentioned this issue Aug 24, 2023

Non-laziness hashes #639

Closed

athei transferred this issue from paritytech/substrate Aug 24, 2023

the-right-joyce added I5-enhancement An additional feature request. I6-meta A specific issue for grouping tasks or bugs of a specific category. and removed J0-enhancement labels Aug 25, 2023

athei mentioned this issue Nov 7, 2023

contracts: Rewrite test fixtures in Rust #2189

Closed

lexnv pushed a commit that referenced this issue Apr 3, 2024

Change the error code returned by chainHead_follow (#115)

10d44a1

athei mentioned this issue Apr 17, 2024

contracts: Make host function benchmarks architecture independent #4163

Open
