This repository has been archived by the owner on Sep 7, 2023. It is now read-only.

GSoC 2020: Webassembly backend for the Ergo compiler #777

Closed
wants to merge 30 commits into from


Collaborator

@pkel pkel commented Aug 29, 2020

GSoC 2020: Webassembly Backend for the Ergo compiler

I worked on translating Ergo smart contracts to Webassembly as part of Google Summer of Code (GSoC) 2020. This PR documents

  • the goals of the project,
  • technical aspects of the implementation,
  • the achievements made, and
  • future work.

I will submit a link to this PR for final evaluation of my GSoC project. We probably want to freeze this PR on Monday 8/31/2020 and keep it around for future reference. Future additions should go to Jerome's PR #773.

Goals

Webassembly (Wasm) is a novel binary instruction format designed as a portable compilation target for high-level programming languages. It was primarily designed for efficient, fast, and safe execution on the web stack. Unlike previous attempts at bringing portable executables to the web (Flash, Java Applets, ActiveX), Wasm succeeded in being adopted by all major browsers. Additionally, Wasm is increasingly used outside the browser, e.g.:

  • Cloud providers investigate how Wasm could enable safe execution of functions on Edge hardware (e.g. Cloudflare, Fastly).
  • Various blockchains adopt Wasm as the executable format for their distributed state machines (e.g. Ethereum, EOS, Substrate).
  • Various projects work on interoperability between different language platforms based on Wasm (e.g. Bytecode Alliance, Wasmer).

Wasm as a compilation target is interesting for the Accord Project because it provides safe execution and access to a wider range of platforms (relative to the currently used JavaScript backend). I provide a more detailed argument in this Tech-WG call recording.

The goal of this GSoC project was to investigate how Ergo smart contracts can be translated to and executed as Wasm. A complete compiler backend and execution engine seemed unrealistic from the beginning. We thus settled on working towards a proof-of-concept implementation that can compile and run basic Ergo contracts.

Technical Overview

Ergo is based on the verified query compiler Q*cert, which is implemented in Coq. Q*cert translates various input query languages, such as SQL, to Spark, Java, and JavaScript. The Accord Project adds a frontend for its smart contract language Ergo. Naturally, we implement the Wasm backend for Ergo by adding a compiler backend to Q*cert. Consequently, most of my work happened in the Q*cert repository. I document the technical details over there.

In a nutshell: Q*cert translates Ergo to an imperative intermediate representation (Imp) which has only basic control flow constructs and few operators, and which operates on JSON-like values. Q*cert gives strong correctness guarantees about this part of the compilation: evaluation-based equality of the Ergo input and the Imp output is proven in the proof assistant Coq. The translation step between Imp and JavaScript is relatively small. This last part of Ergo's existing compilation pipeline is not verified and is implemented in OCaml.

For the Wasm backend, I start from the same Imp. I translate the Imp control flow to Wasm using OCaml. I directly target the abstract syntax tree (AST) provided with the Wasm reference implementation (also written in OCaml). This allows me to reuse its serialization of the AST into the Wasm binary and text formats. It also enables testing against the reference Wasm interpreter.
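To illustrate the kind of translation described above, here is a toy sketch only. It is written in JavaScript rather than OCaml, emits Wasm text format strings instead of building the reference implementation's AST, and works on plain `i32` values rather than Imp's JSON-like values. The statement and expression shapes (`assign`, `seq`, `if`, `while`, `const`, `var`, `lt`, `add`) and the function names are hypothetical and do not mirror Q*cert's actual Imp definition.

```javascript
// Toy translator from a hypothetical Imp-like AST to Wasm text format.
// Assumption: all values are i32; the real backend works on runtime values.
let labelCounter = 0; // used to generate unique block/loop labels

function compileExpr(e) {
  switch (e.kind) {
    case "const": return `(i32.const ${e.value})`;
    case "var":   return `(local.get $${e.name})`;
    case "lt":    return `(i32.lt_s ${compileExpr(e.left)} ${compileExpr(e.right)})`;
    case "add":   return `(i32.add ${compileExpr(e.left)} ${compileExpr(e.right)})`;
    default: throw new Error(`unsupported expression: ${e.kind}`);
  }
}

function compileStmt(s) {
  switch (s.kind) {
    case "assign":
      return `(local.set $${s.name} ${compileExpr(s.expr)})`;
    case "seq":
      return s.stmts.map(compileStmt).join(" ");
    case "if":
      return `(if ${compileExpr(s.cond)} ` +
             `(then ${compileStmt(s.then)}) (else ${compileStmt(s.else)}))`;
    case "while": {
      // A while loop becomes a block wrapping a loop: leave the block when
      // the condition is false, otherwise run the body and jump back.
      const id = labelCounter++;
      return `(block $exit${id} (loop $next${id} ` +
             `(br_if $exit${id} (i32.eqz ${compileExpr(s.cond)})) ` +
             `${compileStmt(s.body)} (br $next${id})))`;
    }
    default: throw new Error(`unsupported statement: ${s.kind}`);
  }
}

// Example: while (i < 10) { i = i + 1 }
const loopExample = {
  kind: "while",
  cond: { kind: "lt", left: { kind: "var", name: "i" }, right: { kind: "const", value: 10 } },
  body: {
    kind: "assign",
    name: "i",
    expr: { kind: "add", left: { kind: "var", name: "i" }, right: { kind: "const", value: 1 } },
  },
};
console.log(compileStmt(loopExample));
```

Targeting an AST instead of strings, as the PR does, avoids this kind of string assembly and lets the reference implementation handle serialization and validation.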

The generated Wasm binaries rely on a Wasm runtime for execution. This runtime supplies a runtime encoding for Imp's JSON-like values and implements Imp's operators. The runtime is implemented in AssemblyScript. This setup allows us to reuse existing implementations for memory allocation, garbage collection, strings, arrays, and dictionaries. Unfortunately, we also lose control over the exact memory representation of Imp values. In order to supply arguments to and read results from the Wasm contract, I developed a binary encoding of Imp's JSON values. This encoding is heavily inspired by MessagePack, but I omit some features and optimizations for easier implementation.
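The following is a minimal sketch of what a tag-prefixed binary encoding of JSON values can look like. It is not the encoding used in this project: the tag numbers, the fixed-width little-endian lengths, and the `encode`/`decode` names are assumptions chosen for brevity, loosely in the spirit of MessagePack.

```javascript
// Hypothetical type tags; the project's actual encoding is documented in Q*cert.
const TAG = { NULL: 0, FALSE: 1, TRUE: 2, NUMBER: 3, STRING: 4, ARRAY: 5, OBJECT: 6 };

function encode(value) {
  const bytes = [];
  const pushU32 = (n) =>
    bytes.push(n & 0xff, (n >>> 8) & 0xff, (n >>> 16) & 0xff, (n >>> 24) & 0xff);
  const pushString = (s) => {
    const utf8 = new TextEncoder().encode(s);
    pushU32(utf8.length);               // byte length, then the UTF-8 payload
    for (const b of utf8) bytes.push(b);
  };
  const go = (v) => {
    if (v === null) bytes.push(TAG.NULL);
    else if (v === false) bytes.push(TAG.FALSE);
    else if (v === true) bytes.push(TAG.TRUE);
    else if (typeof v === "number") {
      bytes.push(TAG.NUMBER);           // every number is stored as a 64-bit float
      const buf = new DataView(new ArrayBuffer(8));
      buf.setFloat64(0, v, true);
      for (let i = 0; i < 8; i++) bytes.push(buf.getUint8(i));
    } else if (typeof v === "string") {
      bytes.push(TAG.STRING);
      pushString(v);
    } else if (Array.isArray(v)) {
      bytes.push(TAG.ARRAY);
      pushU32(v.length);
      v.forEach(go);
    } else if (typeof v === "object") {
      const entries = Object.entries(v);
      bytes.push(TAG.OBJECT);
      pushU32(entries.length);
      for (const [key, item] of entries) { pushString(key); go(item); }
    } else {
      throw new Error(`cannot encode value of type ${typeof v}`);
    }
  };
  go(value);
  return Uint8Array.from(bytes);
}

function decode(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let pos = 0;
  const readU32 = () => { const n = view.getUint32(pos, true); pos += 4; return n; };
  const readString = () => {
    const len = readU32();
    const s = new TextDecoder().decode(bytes.subarray(pos, pos + len));
    pos += len;
    return s;
  };
  const go = () => {
    const tag = view.getUint8(pos++);
    switch (tag) {
      case TAG.NULL: return null;
      case TAG.FALSE: return false;
      case TAG.TRUE: return true;
      case TAG.NUMBER: { const n = view.getFloat64(pos, true); pos += 8; return n; }
      case TAG.STRING: return readString();
      case TAG.ARRAY: {
        const len = readU32();
        const out = [];
        for (let i = 0; i < len; i++) out.push(go());
        return out;
      }
      case TAG.OBJECT: {
        const len = readU32();
        const out = {};
        for (let i = 0; i < len; i++) { const key = readString(); out[key] = go(); }
        return out;
      }
      default: throw new Error(`unknown tag ${tag} at offset ${pos - 1}`);
    }
  };
  return go();
}
```

A flat, self-describing byte sequence like this can be copied into linear memory at a single address, which is what makes it convenient as an exchange format between the engine and the Wasm side.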

Execution of Ergo smart contracts on the new Wasm backend proceeds as follows.

  1. The compiler generates a logic.wasm that describes the clauses of a smart contract as Wasm functions.
  2. The engine instantiates and links logic.wasm and runtime.wasm.
  3. The engine encodes the arguments into the binary encoding and writes them to the runtime's linear memory.
  4. The engine calls the function corresponding to the to-be-evaluated contract clause with the address of the binary encoded values as argument.
  5. The Wasm execution environment (NodeJS) executes the program given in logic.wasm, which includes the following steps.
    • decoding the binary value into a runtime value
    • calling operators in runtime.wasm
    • encoding the resulting runtime value into the binary format
    • returning the address of the resulting binary value
  6. The engine reads the binary return value from the runtime's linear memory and translates it into the output format.
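Steps 3, 4, and 6 above can be sketched on NodeJS as follows. This is a simplified illustration, not the engine from this PR: the hand-assembled `STUB_WASM` module stands in for logic.wasm and merely returns the address it is given, linking against runtime.wasm is omitted, the length-prefixed JSON text stands in for the real binary encoding, and the fixed address `16` replaces a call to the runtime's allocator. All names are hypothetical.

```javascript
// Hypothetical stand-in for logic.wasm. It exports `memory` (one page) and
// `clause`, a function (i32) -> i32 that returns its argument unchanged.
const STUB_WASM = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic and version
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f,       // type 0: (i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of type 0
  0x05, 0x03, 0x01, 0x00, 0x01,                         // one memory, minimum 1 page
  0x07, 0x13, 0x02,                                     // export section, two entries
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, // "memory" -> memory 0
  0x06, 0x63, 0x6c, 0x61, 0x75, 0x73, 0x65, 0x00, 0x00, // "clause" -> function 0
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x20, 0x00, 0x0b,       // body: local.get 0, end
]);

// Step 3: write a value into linear memory as [u32 length][UTF-8 JSON text].
// Assumption: JSON text is used here instead of the project's binary encoding.
function writeValue(memory, addr, value) {
  const payload = new TextEncoder().encode(JSON.stringify(value));
  new DataView(memory.buffer).setUint32(addr, payload.length, true);
  new Uint8Array(memory.buffer, addr + 4, payload.length).set(payload);
}

// Step 6: read a value back from linear memory at the returned address.
function readValue(memory, addr) {
  const len = new DataView(memory.buffer).getUint32(addr, true);
  const payload = new Uint8Array(memory.buffer, addr + 4, len);
  return JSON.parse(new TextDecoder().decode(payload));
}

function invokeClause(value) {
  const instance = new WebAssembly.Instance(new WebAssembly.Module(STUB_WASM), {});
  const { memory, clause } = instance.exports;
  const argAddr = 16;                 // a real engine would allocate via the runtime
  writeValue(memory, argAddr, value);
  const resultAddr = clause(argAddr); // step 4: pass the address, get an address back
  return readValue(memory, resultAddr);
}

console.log(invokeClause({ request: "hello" }));
```

Passing a single address in each direction keeps the exported clause signature trivial, at the cost of an encode and decode step on both sides of the call.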

I provide execution engines for OCaml and NodeJS. The former is used for testing in the Q*cert repository. The latter enables debugging with Chrome DevTools. This PR integrates the NodeJS engine with the Ergo command line utility (mostly the work of my supervisor @jeromesimeon, since he is familiar with the code base).

Achievements

We can compile basic Ergo smart contracts to Wasm and execute them using the Ergo command line utility. You can watch a recorded demo in my final presentation of the project on the TechWG call (21m25s into the video). If you want to try it yourself, you can follow these steps.

# setup
git clone -b wasm git@github.com:accordproject/ergo.git
cd ergo
git clone -b wasm https://github.com/querycert/qcert
cd qcert
opam install .
cd ..
make configure && make all
cd tests
# compile to and run on javascript backend
../packages/ergo-cli/index.js compile --target es6 helloworld/model/model.cto helloworld/logic/logic.ergo
../packages/ergo-cli/index.js trigger --template helloworld --data helloworld/data.json --request helloworld/request.json --state helloworld/state.json --target es6
# compile to and run on wasm backend
../packages/ergo-cli/index.js compile --target wasm helloworld/model/model.cto helloworld/logic/logic.ergo
../packages/ergo-cli/index.js trigger --template helloworld --data helloworld/data.json --request helloworld/request.json --state helloworld/state.json --target wasm

This also works on non-trivial tests like volumediscount, helloworldenforce, and integertest. However, not all Ergo concepts are supported.

Future Work

In order to turn this PoC into a usable piece of software, some steps are missing.

Essentials:

  • support all Imp operators (qcert #133)
  • advanced types (e.g. dates)

Improvements:

  • clean up Ergo integration (#769)
  • verify parts of the translation (qcert #146)

Integration:

  • into the AP platform (i.e. cicero archive --target wasm)
  • potentially: Ergo on GraalVM (replacing Java backend?)
  • potentially: Ergo in a database (e.g. Postgres) or on a Blockchain (e.g. Ethereum, Substrate)
  • potentially: Wasm FFI (e.g. call Rust from Ergo via Wasm) (Wasm Interoperability #727)

jeromesimeon and others added 30 commits August 24, 2020 17:33
Signed-off-by: Jerome Simeon <jeromesimeon@me.com>
Signed-off-by: Patrik Keller <patrik@keller-re.de>
… execution
… target
Enables true ergo -> wasm -> invoke/trigger pipeline.
… how it is being distributed
@coveralls

Coverage Status

Coverage decreased (-1.5%) to 94.069% when pulling 1ee3619 on gsoc2020-wasm into 1a11e14 on release-1.0.

@pkel pkel changed the title GSoC 2020: Webassembly Backend for the Ergo compiler GSoC 2020: Webassembly backend for the Ergo compiler Aug 30, 2020
@jeromesimeon
Member

GSoC 2020: Webassembly Backend for the Ergo compiler

I worked on translating Ergo smart contracts to Webassembly as part of Google Summer of Code (GSoC) 2020. This PR
...

@pkel thanks for a great contribution to Accord Project! (and for the great time this summer). We are planning to review this work / merge and test it further and release it in the next major iteration of the Ergo compiler and Accord Project templating system. 🥇

Some notes on this PR:

  • Please make it "ready for review" (I think it is ready to be reviewed and merged soon into the upcoming release-1.0 branch of Ergo).
  • The main part of the new compiler code is actually in GSoC 2020: Webassembly backend querycert/qcert#147
  • CircleCI tests are currently not passing because the PR requires the Q*cert changes to be merged and published first. I have tested this locally, though, and it works with a local build.
  • Merge plan:
    1. I have a discussion with the Q*cert team scheduled for Wednesday where we will do a final review of the work on the compiler backend and (hopefully) merge that PR.
    2. Republish Q*cert on the OCaml opam package manager with the new WASM support and use that version in this Ergo PR (this will make the CircleCI tests pass again)
    3. Merge this PR.

@pkel pkel marked this pull request as ready for review August 31, 2020 16:26
@jeromesimeon
Member

jeromesimeon commented Mar 14, 2021

Closing this PR (now that GSoC 2020 is far behind us) in favour of #773. Will keep the branch around as documentation of the GSoC work.
