From 8c2118386a01268f3bd66a902eea71e1410164ad Mon Sep 17 00:00:00 2001 From: Sylvain Henry Date: Thu, 22 Sep 2022 11:58:37 +0200 Subject: [PATCH] JS backend status update --- blog/2022-09-22-ghc-js.md | 218 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 218 insertions(+) create mode 100644 blog/2022-09-22-ghc-js.md diff --git a/blog/2022-09-22-ghc-js.md b/blog/2022-09-22-ghc-js.md new file mode 100644 index 00000000..8ee2280b --- /dev/null +++ b/blog/2022-09-22-ghc-js.md @@ -0,0 +1,218 @@ +--- +slug: 2022-09-22-ghc-js +title: Status of GHC's JavaScript backend (September 2022) +authors: [sylvain] +tags: [ghc] +--- + +For a few months, the GHC DevX team at IOG has been working on implementing a JavaScript +backend for GHC. Luckily we didn't have to start from scratch because it had +been worked on for a decade in the GHCJS project led by our teammate Luite +Stegeman. The first release of this work is planned for the upcoming GHC 9.6. +This post contains details about the current status and about our roadmap. + +## From GHC 8.10 to GHC 9.6 + +The latest version of GHCJS is based on a fork of GHC 8.10.7. One of our tasks +has been to adapt it to GHC head. In practice this meant adding support for new +primops; adapting to ghc-bignum, to internal GHC API changes, and to various +other changes (e.g. function prototypes). + +The JS backend is developed in a branch that we rebase on GHC head every week. + +## Unified build system + +GHCJS is known to be complex to build, relying on custom build scripts to deal +with the GHC fork it uses, etc. +The JS backend however is as easy to build as any other GHC. It doesn't require +any wrapper script, only Emscripten's "emconfigure" tool. +You can build the JS backend by following these instructions: + +TODO instructions + +The Hadrian build system has been adapted to support Cabal's `js-sources` +stanzas, which are used to support user-provided `.js` files. The RTS and the `base` +package both require this feature. 
+ + +## Testsuite + +Hadrian can also be used to run GHC's testsuite. +We fixed it so that it can now be used to test cross-compilers. +Testing the JS backend can now be done as follows: + +TODO instructions + +At the time of writing there are still many failing tests. +If you want to help fix them, be sure to get in touch with us so that we +can coordinate. +Our aim is to remove all the failures before the merge into the main branch. +In some cases (e.g. tests for compact regions) we may have to disable some +broken tests (only for the JS backend of course) and to open issues in the bug +tracker. + + +## Removal of external dependencies + +GHCJS made use of non-boot libraries (text, lens, megaparsec, aeson, etc.) that +GHC's JS backend can't use. +We've modified the code to avoid the use of these libraries. + +GHCJS provided a few libraries (ghcjs-base, ghcjs-prim, etc.). +Instead of adding them as new boot libraries, we merged them into the existing +boot libraries (base, ghc-prim, etc.). + + +## General cleanup and documentation + +GHC provides some utilities (pretty-printer, binary serialization, string +interning, etc.) that weren't used before. +We adapted the code to use them to make the JS backend similar to other +backends and for performance reasons. + +Three of us (out of four) were totally new to GHCJS's code base. +We strove to understand the code and to make it easier to understand by adding +a lot of comments and by refactoring it. + +We have a few blog posts explaining some technical details about GHCJS's +internals. +Most of the content of the blog posts should have made its way into comments in +the code itself. + +TODO list blog posts + +Some modules of the JS backend still need to be cleaned up before the merge +though. + + +## FFI + +Usual FFI imports can be used to call a foreign function. +However GHCJS also supports some fancy syntax that indicates that a foreign +import is in fact a template of code to inline at call sites. 
+The JS backend only supports FFI calls for now. +Converting code that makes use of the fancy syntax is usually straightforward, for +example with fat arrows: + +TODO example of code change + +We did this for the FFI imports in base. +Performance may not be as good as with the fancy syntax, depending on whether the +JS engine optimises anonymous function applications like this. + +Reintroducing the fancy syntax could be done later. +It could use a ghc-proposal to motivate it with performance data and +bikeshedding of the syntax. +The JS backend could also automatically figure out how to perform the FFI +function application statically. + + +## Template Haskell + +GHC supports external interpreters for Template Haskell that are especially +useful for cross-compilation (the internal interpreter requires the target code to be +linked with the compiler, which is impossible with a cross-compiler). + +Incidentally the design of the external interpreter (also called "Iserv") +originated in GHCJS's TH server, but there are quite a few differences between +the two. +We are currently investigating how to retrofit GHCJS's TH runner into GHC's +Iserv but we don't have results to present yet. + + +## Plugins + +GHC doesn't support plugins when it is built as a cross-compiler (cf #14335). +This is because it isn't modular enough to support two environments at once: one +for the target code (JS code here) and one for the host (native x86 or ARM code +for the plugin). +We've spent a lot of time making it more modular (see our white paper and HIW +lightning talk TODO links) but there is a lot more to do to achieve this. + +GHCJS used a fragile hack to support plugins: at plugin loading time it would +substitute the plugin package with another corresponding one from another +package database. +It was fragile because it could interfere with GHC's single environment +assumptions. + +We didn't port GHCJS's hack. 
+Nevertheless we have implemented a new way for GHC +to load plugins directly from libraries instead of packages. +This method doesn't require GHC to load module interfaces for the plugin and its +dependencies. +You can load a plugin with the following flags: + +TODO plugin load example + + +## Performance and code size + +Performance and code size are expected to be worse with the JS backend in its current +state than with GHCJS! + +We have been following roughly this roadmap: +- make it work: JS backend able to build a runnable HelloWorld program +- make it correct: fixing bugs found by the testsuite (current stage) +- make it fast: not yet started! + +In particular we haven't yet ported the following GHCJS passes: +- JS code optimiser +- link-time optimizations ("compactor") + +We do plan to reimplement them in some form in the future. +We may use this refactoring opportunity to introduce an intermediate AST between +STG and JS that would make them more elegant (e.g. no need to parse JS +identifiers generated by GHCJS). + +The JS backend performance itself may be suboptimal as we haven't done any +serious profiling of it yet. +Any help is appreciated! Get in touch with us if you want to help so that we can +coordinate our efforts. + + +## Libraries: C sources and shims + +Libraries that use C sources (`c-sources` Cabal stanza) aren't supported by the +JS backend. +In the future we could probably use Emscripten to compile the C source and +generate some adapter code for it, but this isn't done yet. + +There are two ways to fix libraries that use C sources. The C code has to be +rewritten either in JavaScript or in Haskell. Then it is possible to use Cabal +predicates (e.g. `arch(js)`) to select between the different versions. + +We do have a preference for writing a pure Haskell version because it is more +future-proof, for example if someone adds some new backends for Lua, Java, CLR, +etc. 
+That's basically what we've done when we wrote ghc-bignum, which provides a +"native" implementation written in Haskell that is functionally equivalent to +the GMP-based implementation. +Also writing Haskell is way more pleasant than writing JavaScript code. +We wrote the JS backend to spare you this pain so please use it. + +Note that GHCJS came with a "shim" library where a shim is a JS source for some +package. +The JS backend won't provide shims so these JS sources will have to be +upstreamed or reimplemented in Haskell. +We already started to implement a pure Haskell version of ByteString: + +TODO link + +Any help with this is of course welcome! + + +## How to help? + +We have now reached a point where anyone can easily build and test the JS +backend. +If you want to join the effort, get in touch with us so that we can coordinate. +You can reach us on the #ghcjs IRC channel on libera.chat or by mail. + +TODO: DO IT + +Once the branch is merged, it'll be simpler to coordinate via GHC's GitLab +too. + +A few people already offered their help: thank you! +Until recently it was difficult to split the work into independent tasks: one +fix led to a new failure, etc. +But now is a good time to do it!